LLM 1.0.0¶
Overview¶
v1.0.0 Native
Available Versions: 1.0.1 | 1.0.0 (current)
Description¶
Processes chat messages using an LLM with function-calling capabilities. The block runs in two steps: chat, which processes messages, and handle_tool_result, which manages tool execution results.
Configuration Options¶
| Name | Data Type | Description | Default Value |
|---|---|---|---|
| use_thread_history | bool | Whether to include previous conversation history in the context for maintaining conversational flow across multiple interactions | false |
Inputs¶
| Name | Data Type | Description |
|---|---|---|
| message | list[ContentItem] or ContentItem or str | Input message that can be a string, single content item (text/image), or list of content items for processing by the LLM |
Outputs¶
| Name | Data Type | Description |
|---|---|---|
| response | ResponseSchemaT | Output of the LLM, which will be a string if the response schema is a string, or a dictionary if the response schema is an object |
| sources | list[Source] | List of sources used in the response when citations are enabled through dataset items |
| reasoning | str | Reasoning content (chain of thought) from the LLM if available and reasoning_effort is configured |
Version History¶
- 1.0.1 - Native implementation
- 1.0.0 (Current) - Native implementation
Examples¶
```yaml
# Simple text generation with GPT-4o
- name: generate_response
  block: LLM_1_0_0
  config:
    use_thread_history: false
    llm_config:
      model: "gpt-4o"
      api_key: "sk-proj-abc123..."
      temperature: 0.7
      max_tokens: 500
      pre_prompt: "You are a helpful assistant that provides clear and concise answers."
  input:
    message: "Explain the benefits of cloud computing for small businesses."
```
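A second sketch showing the reasoning output described above. The placement of reasoning_effort under llm_config is an assumption inferred from the Outputs table; check the llm_config reference for the exact field, and note that only some models support it:

```yaml
# Reasoning example (reasoning_effort placement is an assumption;
# supported only on some OpenAI models)
- name: analyze_problem
  block: LLM_1_0_0
  config:
    llm_config:
      model: "gpt-4o"
      api_key: "sk-proj-abc123..."
      reasoning_effort: "medium"   # enables the `reasoning` output when supported
  input:
    message: "Why might a distributed cache return stale values after a failover?"
```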
Error Handling¶
LLM API Authentication Errors
Cause: Invalid or expired API key, incorrect model name, or insufficient API quota/permissions for the specified LLM provider.
Solution: Verify your API credentials and model access:
```yaml
config:
  llm_config:
    model: "gpt-4o"                # Ensure model name is correct
    api_key: "sk-proj-abc123..."   # Valid API key with sufficient quota
# Check your provider's documentation for correct model names
# OpenAI: gpt-4o, gpt-3.5-turbo
# Anthropic: claude-3.5-sonnet, claude-3-haiku
```
Token Limit Exceeded
Cause: The combination of input message, conversation history, and system prompts exceeds the model's context window limit.
Solution: Optimize token usage by adjusting configuration:
```yaml
config:
  use_thread_history: false         # Disable history if not needed
  llm_config:
    max_tokens: 1024                # Set an appropriate output limit
    pre_prompt: "Brief response."   # Keep system prompts concise
# For large inputs, consider chunking or summarizing content first
```
Tool Call Processing Failures
Cause: LLM generates invalid tool calls, tool execution fails, or tool response format is incompatible with expected schema.
Solution: Ensure proper tool configuration and error handling:
```yaml
config:
  tools:
    my_tool:
      description: "Clear, specific tool description"   # Helps the LLM understand when to use it
      parameters:
        required_param: "string"    # Define clear parameter types
  llm_config:
    temperature: 0.3   # Lower temperature for more predictable tool calls
```
FAQ¶
What's the difference between LLM v1.0.0 and v1.0.1?
The key differences between versions:
- Base Class: v1.0.0 extends WorkSpaceBlock, while v1.0.1 extends Block with DATASET scope
- Configuration: v1.0.0 includes use_thread_history config option, v1.0.1 does not
- Scope: v1.0.1 is designed specifically for dataset operations
- Thread History: v1.0.0 allows configurable conversation history, v1.0.1 uses fixed behavior
- Use Case: Choose v1.0.0 for general chat applications, v1.0.1 for dataset-focused workflows
How do I configure different LLM providers?
The block supports multiple LLM providers through the llm_config parameter:
- OpenAI: Use models like "gpt-4o", "gpt-3.5-turbo" with API key format "sk-proj-..."
- Anthropic: Use models like "claude-3.5-sonnet", "claude-3-haiku" with API key format "sk-ant-api03-..."
- Parameters: All providers support temperature (0.0-2.0), max_tokens, and pre_prompt
- Reasoning: Set reasoning_effort to "low", "medium", or "high" for supported models (currently some OpenAI models)
Always check your provider's documentation for the latest model names and authentication requirements.
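For example, switching providers typically only requires changing model and api_key inside llm_config. A minimal sketch for Anthropic, with placeholder model name and key:

```yaml
# Anthropic configuration sketch; verify the model name against Anthropic's docs
- name: generate_response
  block: LLM_1_0_0
  config:
    llm_config:
      model: "claude-3.5-sonnet"
      api_key: "sk-ant-api03-..."
      temperature: 0.5
      max_tokens: 1024
  input:
    message: "Summarize the quarterly report in three bullet points."
```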
How does conversation history work with use_thread_history?
When use_thread_history is enabled in v1.0.0:
- Context Preservation: Previous messages and responses are included in subsequent calls
- Memory Management: The block automatically manages token limits by truncating old messages when needed
- Tool Interactions: Tool calls and responses are preserved in the conversation context
- Performance Impact: Longer histories consume more tokens and may slow responses
- Best Practice: Enable for multi-turn conversations, disable for independent single queries
Note: v1.0.1 does not have this configuration option and uses default history behavior.
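A minimal multi-turn sketch for v1.0.0: with use_thread_history enabled, a follow-up question that references earlier turns can be resolved against the preserved context.

```yaml
# Multi-turn assistant: enable history so follow-ups keep context
- name: chat_turn
  block: LLM_1_0_0
  config:
    use_thread_history: true
    llm_config:
      model: "gpt-4o"
      api_key: "sk-proj-abc123..."
      max_tokens: 512
  input:
    message: "And how does that compare to last year?"  # resolved against prior turns
```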
What are the best practices for prompt engineering with this block?
Effective prompt engineering techniques:
- Clear Instructions: Use specific, actionable language in pre_prompt and input messages
- Context Setting: Define the LLM's role and expected behavior clearly
- Output Format: Use response_schema to enforce structured outputs when needed
- Temperature Control: Use low values (0.1-0.3) for factual tasks, higher (0.7-0.9) for creative tasks
- Token Management: Keep prompts concise while providing necessary context
- Examples: Include few-shot examples in your pre_prompt for consistent formatting
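Several of these techniques can be combined in one configuration. A sketch of a classification task (the task and prompt content are illustrative, not from the block's reference):

```yaml
# Few-shot examples in pre_prompt plus low temperature for a factual task
- name: classify_ticket
  block: LLM_1_0_0
  config:
    llm_config:
      model: "gpt-4o"
      api_key: "sk-proj-abc123..."
      temperature: 0.2   # low temperature for consistent classification
      pre_prompt: |
        You are a support-ticket classifier. Respond with a single category.
        Example: "My invoice is wrong" -> billing
        Example: "The app crashes on login" -> technical
  input:
    message: "I was charged twice this month."
```

For machine-readable output, response_schema can additionally enforce structure; consult the block's reference for the exact schema format.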
How do I optimize performance and manage costs?
Performance and cost optimization strategies:
- Model Selection: Use smaller models (gpt-3.5-turbo, claude-3-haiku) for simple tasks
- Token Limits: Set appropriate max_tokens to avoid unnecessary generation
- History Management: Disable use_thread_history when conversation context isn't needed
- Batch Processing: Process multiple similar requests together when possible
- Caching: Implement caching layers for repeated queries with identical inputs
- Monitoring: Track token usage and API costs to identify optimization opportunities
Remember that reasoning_effort increases token consumption but may improve response quality for complex tasks.
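The model-selection and token-limit strategies above can be sketched as a single cost-conscious configuration:

```yaml
# Cost-conscious configuration for a simple, independent query
- name: quick_answer
  block: LLM_1_0_0
  config:
    use_thread_history: false    # no history -> fewer input tokens
    llm_config:
      model: "gpt-3.5-turbo"     # smaller model for a simple task
      api_key: "sk-proj-abc123..."
      max_tokens: 256            # cap output generation
  input:
    message: "What is the capital of France?"
```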