Prompt Engineering and Structured Output Patterns

Effective prompts use explicit criteria, targeted examples, and structured output enforcement to achieve reliable, consistent results.

Explicit Criteria Beat Vague Instructions

"Be conservative" and "only report high-confidence findings" do not improve precision. Instead, define specifically what to report and what to skip: "Report: bugs, security issues. Skip: minor style, local patterns." Define severity levels with concrete code examples.

High false positive rates in one category undermine developer trust across all categories. Temporarily disable noisy categories while improving their prompts.

Few-Shot Prompting

Few-shot examples are the most effective technique for consistent, formatted output. Use 2-4 targeted examples that:
- Show reasoning for why one action was chosen over plausible alternatives
- Demonstrate the desired output format
- Handle ambiguous scenarios that instructions alone produce inconsistent results for
- Show extraction from varied document structures

Examples enable the model to generalize to novel patterns, not just match pre-specified cases.

Structured Output via tool_use

Define extraction tools with JSON schemas and use the tool_use response to get guaranteed schema-compliant output. This eliminates JSON syntax errors entirely.

tool_choice controls behavior: "auto" lets the model choose, "any" forces a tool call (any tool), forced selection ({"type": "tool", "name": "..."}) forces a specific tool.

Design nullable fields for information that may not exist in source documents - this prevents the model from fabricating values to satisfy required fields. Add "unclear" enum values and "other" + detail string patterns for extensible categorization.

Validation-Retry Loops

When extraction validation fails, retry with the original document, the failed extraction, and specific validation errors. But recognize when retries won't help: if the information simply isn't in the source document, no amount of retrying will extract it.

Message Batches API

50% cost savings with up to 24-hour processing and no latency SLA. Use for overnight reports, weekly audits, nightly test generation. Never use for blocking pre-merge checks. Correlate request/response pairs with custom_id.