SchemaGuidedLLM 1.0.0¶
Overview¶
v1.0.0 Native
Description¶
Guides an LLM to respond according to a specific JSON schema.
Configuration Options¶
| Name | Data Type | Description | Default Value |
|---|---|---|---|
| use_thread_history | bool | Whether to include previous conversation history when generating the response | false |
Inputs¶
| Name | Data Type | Description |
|---|---|---|
| message | list[ContentItem] or ContentItem or str | Input message for the LLM to process: plain text, content items, or mixed media |
| response_schema | dict[str, Any] | JSON schema defining the structure and format of the expected LLM response |
Outputs¶
| Name | Data Type | Description |
|---|---|---|
| response | Any | LLM-generated response conforming to the specified JSON schema structure |
Examples¶
```yaml
# Basic structured data extraction
steps:
  - name: extract_contact_info
    type: SchemaGuidedLLM
    config:
      llm_config:
        model: "gpt-4o"
        api_key: "sk-proj-abc123..."
        temperature: 0.3
        max_tokens: 300
      pre_prompt: "Extract contact information from the provided text in the specified format"
      use_thread_history: false
    inputs:
      message: "Please contact John Smith at john.smith@company.com or call him at (555) 123-4567. He works at Tech Corp Inc."
      response_schema:
        type: "object"
        properties:
          name:
            type: "string"
            description: "Full name of the person"
          email:
            type: "string"
            format: "email"
          phone:
            type: "string"
          company:
            type: "string"
        required: ["name"]
```
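Given the message in this example, a conforming `response` output might look like the following (illustrative only; actual values depend on the model):

```json
{
  "name": "John Smith",
  "email": "john.smith@company.com",
  "phone": "(555) 123-4567",
  "company": "Tech Corp Inc."
}
```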
Error Handling¶
Schema Validation Error
Error: LLM tried to call a tool but returned empty arguments
Cause: The LLM failed to generate a response matching the provided JSON schema
Solution: Simplify your schema, ensure all required fields are clearly defined, and check that the schema is valid JSON. Consider providing examples in your pre_prompt
Tool Call Processing Error
Error: Invalid JSON response from LLM tool call
Cause: The LLM generated malformed JSON that doesn't parse correctly
Solution: Lower the temperature (0.1-0.3) for more consistent formatting, add explicit JSON formatting instructions in your pre_prompt, or simplify complex nested schemas
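Beyond prompt fixes, malformed JSON can often be recovered in post-processing. The sketch below (a hypothetical helper, not part of SchemaGuidedLLM) tries the raw string first, then strips markdown code fences the model may have wrapped around its answer, then falls back to the first brace-delimited span:

```python
import json
import re

FENCE = "`" * 3  # literal markdown code-fence marker


def parse_llm_json(raw: str) -> dict:
    """Parse an LLM's JSON output, recovering from common formatting
    slips such as surrounding code fences or stray prose."""
    # First, try the raw string as-is.
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        pass
    # Next, look for a fenced block like ```json ... ```.
    fenced = re.search(FENCE + r"(?:json)?\s*(\{.*\})\s*" + FENCE, raw, re.DOTALL)
    if fenced:
        return json.loads(fenced.group(1))
    # Finally, fall back to the first {...} span in the text.
    brace = re.search(r"\{.*\}", raw, re.DOTALL)
    if brace:
        return json.loads(brace.group(0))
    raise ValueError("No parseable JSON object found in LLM output")
```

A recovery step like this handles the common failure modes cheaply; anything it cannot fix is a genuine schema-compliance failure worth retrying with a lower temperature.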
Schema Type Mismatch
Error: Response structure doesn't match expected schema type
Cause: The schema definition doesn't align with how the LLM interprets the task
Solution: Review your schema for clarity, ensure property types match expected data, and validate that enum values are appropriate for your use case
FAQ¶
How do I create effective JSON schemas for LLM responses?
Design clear, well-structured schemas for consistent results:
- Start simple: Use basic types (string, number, boolean) before complex nested objects
- Be specific: Use enums for categorical data and format constraints for strings
- Add descriptions: Help the LLM understand field purposes and expected values
- Mark required fields: Specify which properties are mandatory vs optional
- Test incrementally: Start with simple schemas and gradually add complexity
Following the JSON Schema specification closely improves LLM compliance.
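The tips above can be combined in a single schema. The sketch below (field names are hypothetical) uses simple types, an enum for categorical data, per-field descriptions, and explicit required fields, plus a minimal spot-check for missing required keys (not a full JSON Schema validator):

```python
# A sample schema applying the tips above: simple types, an enum for
# categorical data, per-field descriptions, and explicit required fields.
ticket_schema = {
    "type": "object",
    "properties": {
        "summary": {
            "type": "string",
            "description": "One-line issue summary",
        },
        "priority": {
            "type": "string",
            "enum": ["low", "medium", "high"],
            "description": "Triage priority for the issue",
        },
        "reproducible": {
            "type": "boolean",
            "description": "Whether the issue can be reproduced",
        },
    },
    "required": ["summary", "priority"],
}


def check_required(schema: dict, instance: dict) -> list:
    """Minimal spot-check: report required keys missing from a response."""
    return [key for key in schema.get("required", []) if key not in instance]
```

For production use, a full validator library can replace the spot-check; the point here is that descriptions and enums give the LLM concrete guidance, while `required` makes the mandatory/optional split explicit.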
Which LLM models perform best with structured outputs?
Model capabilities vary for structured response generation:
- GPT-4o: Excellent for complex schemas with nested objects and arrays
- Claude-3.5-Sonnet: Strong performance with detailed business documents and financial data
- GPT-3.5-Turbo: Reliable for simple schemas but may struggle with complex nesting
For critical applications requiring perfect schema compliance, use GPT-4o or Claude-3.5-Sonnet.
How do I handle schema validation failures and improve compliance?
Implement strategies to improve schema adherence:
- Temperature tuning: Use lower values (0.1-0.3) for strict schema compliance
- Pre_prompt engineering: Include schema examples and explicit formatting instructions
- Schema simplification: Break complex schemas into smaller, manageable parts
- Validation logic: Implement post-processing to validate and fix minor issues
- Fallback schemas: Design simpler backup schemas for error recovery
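The fallback-schema strategy can be sketched as a small wrapper (hypothetical; `call_llm` stands in for whatever function invokes the SchemaGuidedLLM step and raises `ValueError` when the response fails validation):

```python
def extract_with_fallback(call_llm, message, schema, fallback_schema):
    """Try the full schema first; on a validation failure, retry with a
    simpler fallback schema so the pipeline degrades gracefully instead
    of failing outright. `call_llm(message, schema)` is a hypothetical
    hook that raises ValueError when the response does not validate."""
    try:
        return call_llm(message, schema)
    except ValueError:
        return call_llm(message, fallback_schema)
```

The fallback schema typically keeps only the required fields, so a partial extraction still flows downstream while the failure is logged for review.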
When should I use thread history vs disable it for schema-guided responses?
Thread history usage depends on your application requirements:
- Enable for: Progressive data extraction, multi-step analysis, context-dependent schemas
- Disable for: Independent document processing, batch operations, consistent formatting
- Performance impact: History increases token usage and may affect schema compliance
- Consistency needs: Disable for applications requiring identical output formats
For most structured extraction tasks, disabling history provides more predictable results.
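As an illustration, a follow-up step in a progressive extraction might enable history (the step name and message below are hypothetical; the structure mirrors the example above):

```yaml
steps:
  - name: refine_extraction
    type: SchemaGuidedLLM
    config:
      use_thread_history: true   # carry earlier turns into this extraction
    inputs:
      message: "Now also extract the person's job title from our conversation."
      response_schema:
        type: "object"
        properties:
          job_title:
            type: "string"
```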
How do I integrate schema-guided responses with downstream systems?
SchemaGuidedLLM enables seamless integration patterns:
- API integration: Use structured output directly as API payloads for external systems
- Database insertion: Map schema properties to database columns for automated data entry
- Workflow routing: Use extracted categories or priorities to trigger conditional logic
- Data validation: Implement schema-based validation before processing downstream
- Format transformation: Convert LLM output to other formats (CSV, XML) using schema metadata
The consistent structure ensures reliable data flow throughout your applications.
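As one example of format transformation, the schema's property list can drive a CSV export directly, since every response shares the same top-level keys. This is a hypothetical helper, assuming top-level scalar fields only:

```python
import csv
import io


def responses_to_csv(schema: dict, responses: list) -> str:
    """Flatten schema-conforming response dicts into CSV text, using the
    schema's property names as the column order. Extra keys are ignored;
    missing keys become empty cells."""
    columns = list(schema["properties"].keys())
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    for row in responses:
        writer.writerow(row)
    return buf.getvalue()
```

Because the column order comes from the schema itself, the export stays in sync with schema changes without touching the transformation code.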