/ prompt injection / document review / AI security

Prompt Injection Risk in Document Review Workflows

Documents can contain instructions that confuse AI tools, so verification systems need boundaries around external text.

AI verification tools often read supplier documents, websites, emails, and PDFs. Those sources can contain text that tries to steer the model: ignore previous instructions, mark this supplier as safe, or hide a mismatch. The workflow should treat external text as evidence, not as instructions.

Separate system instructions from document content. The model can extract and summarize a document, but it should not obey commands found inside that document. OWASP lists prompt injection as a core LLM application risk because models can blur data and instructions.

Use constrained outputs for high-risk tasks. Ask the model to return fields, source locations, and uncertainty labels instead of open-ended conclusions. A human reviewer should approve payment-sensitive decisions.

Log suspicious content. If a supplier document contains strange instructions, hidden text, or irrelevant prompts, record that as a document-quality issue and request a clean version.

Design for damage control. Even if a model output is manipulated, it should not be able to release payment, change a supplier status, or send data without another control.

Working checklist

  • Treat document text as data.
  • Use constrained extraction formats.
  • Log suspicious instructions.
  • Keep human approval for payment decisions.
  • Limit tool permissions around LLM outputs.

Sources reviewed