RowChunk 2.0.0

Overview

Function Beginner

Version Source

Available Versions: 2.0.0 (current) | 1.0.0

Description

(v2) Splits a table document into one JSON chunk per row, computes embeddings, and emits a single ChunkGroup containing all rows.

Configuration Options

No configuration options available.

Inputs

Name | Data Type | Description
data | File or str | Table data source. Supports File objects for CSV/XLSX/documents, or markdown table strings. Documents are converted using document intelligence with enhanced table extraction.

Outputs

Name | Data Type | Description
chunks | Chunks | Structured Chunks object containing a ChunkGroup with all table rows as individual chunks. Includes computed embeddings and parent-child relationships.

Version History

  • 2.0.0 (Current) - Native implementation
  • 1.0.0 - Native implementation

Examples

blocks:
  - name: process_inventory_table
    type: RowChunk_2_0_0
    input:
      data: "inventory_report.csv"

# Input CSV:
# product_id,name,category,stock,price,status
# INV001,Laptop Computer,Electronics,45,899.99,active
# INV002,Wireless Mouse,Electronics,120,29.99,active
# INV003,Office Chair,Furniture,23,159.99,active

# Output: Chunks object with ChunkGroup containing:
# - Parent metadata (file info, content summary)
# - 3 child chunks with JSON content and computed embeddings
# - Structured relationships for downstream processing
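
To make the "one JSON chunk per row" idea concrete, here is a minimal Python sketch that turns the sample CSV above into one JSON string per row. It uses only the standard library and is not the block's actual implementation. The function name `rows_to_json_chunks` is hypothetical, and the real block's JSON layout and value typing may differ (here every value stays a string, as read by `csv.DictReader`).

```python
import csv
import io
import json

# Sample data copied from the example above.
CSV_TEXT = """product_id,name,category,stock,price,status
INV001,Laptop Computer,Electronics,45,899.99,active
INV002,Wireless Mouse,Electronics,120,29.99,active
INV003,Office Chair,Furniture,23,159.99,active
"""


def rows_to_json_chunks(csv_text: str) -> list[str]:
    """Return one JSON string per data row, keyed by the header names."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [json.dumps(row) for row in reader]


chunks = rows_to_json_chunks(CSV_TEXT)
for chunk in chunks:
    print(chunk)
```

Each printed line corresponds to one child chunk's content. Embeddings and parent metadata are left out of this sketch.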

Error Handling

No Markdown Table Found

Error Code: BlockError
Common Cause: The converted document or string input does not contain a recognizable markdown table.
Solution: Ensure the document contains table structures, or that markdown input follows table syntax with pipe separators.
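
One way to catch this error early is to check string input for a pipe-separated table before passing it to the block. The Python sketch below is a simplified heuristic, not the block's own detection logic: it looks for a header line followed by a separator row such as `| --- | --- |`. The helper name `has_markdown_table` is hypothetical.

```python
import re

# A separator row such as |---|:---:|---| marks the header/body boundary.
_SEPARATOR = re.compile(r"^\s*\|?\s*:?-+:?\s*(\|\s*:?-+:?\s*)*\|?\s*$")


def has_markdown_table(text: str) -> bool:
    """Return True if a header line is directly followed by a separator row."""
    lines = text.splitlines()
    for header, separator in zip(lines, lines[1:]):
        if "|" in header and "|" in separator and _SEPARATOR.match(separator):
            return True
    return False


print(has_markdown_table("| a | b |\n| --- | --- |\n| 1 | 2 |"))
print(has_markdown_table("a,b\n1,2"))
```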

Unsupported File Type

Error Code: BlockError
Common Cause: The file extension is not supported for table processing or document intelligence conversion.
Solution: Use a supported format: CSV, XLSX, or a document format (PDF, DOCX, PPTX, images) that document intelligence can convert.
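
A caller could screen file names before submission with a check along these lines. This Python sketch only covers the formats named above. The exact image extensions are an assumption, since the documentation says only "images", and the helper name `is_supported` is hypothetical.

```python
from pathlib import Path

# Formats named in the error description above.
TABLE_EXTENSIONS = {".csv", ".xlsx"}
DOCUMENT_EXTENSIONS = {".pdf", ".docx", ".pptx"}
# Assumption: the source only says "images"; these extensions are a guess.
ASSUMED_IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}

SUPPORTED = TABLE_EXTENSIONS | DOCUMENT_EXTENSIONS | ASSUMED_IMAGE_EXTENSIONS


def is_supported(filename: str) -> bool:
    """Case-insensitive check of the file extension against the known set."""
    return Path(filename).suffix.lower() in SUPPORTED


print(is_supported("inventory_report.csv"))
print(is_supported("notes.txt"))
```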

Chunk Building Error

Error Code: ProcessingError
Common Cause: The block failed to build the chunks or the ChunkGroup structure during processing.
Solution: Check data integrity and ensure every table row contains valid data that can be JSON serialized.
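
If rows are assembled in code before reaching the block, a serialization check like the following can point to the offending rows. This is a Python sketch under the assumption that rows are plain dictionaries. The helper name `find_unserializable_rows` is hypothetical, and treating NaN as invalid is a choice made here, not documented behavior of the block.

```python
import json


def find_unserializable_rows(rows: list[dict]) -> list[int]:
    """Return the indexes of rows that json.dumps cannot serialize."""
    bad = []
    for index, row in enumerate(rows):
        try:
            # allow_nan=False makes NaN/Infinity raise ValueError,
            # since they are not valid strict JSON.
            json.dumps(row, allow_nan=False)
        except (TypeError, ValueError):
            bad.append(index)
    return bad


rows = [
    {"product_id": "INV001", "stock": 45},
    {"product_id": "INV002", "stock": float("nan")},
    {"product_id": "INV003", "tags": {"a", "b"}},  # sets are not JSON types
]
print(find_unserializable_rows(rows))
```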

FAQ

What are the key improvements in RowChunk v2.0.0?

Version 2.0.0 adds structured Chunks output with ChunkGroup organization, automatic embedding computation, enhanced markdown table extraction, and improved parent-child relationships for better data lineage and searchability.

How does the ChunkGroup structure work?

All table rows are grouped under a single parent object containing file metadata. Each row becomes a child chunk with computed embeddings, maintaining relationships that enable context-aware processing and semantic search capabilities.
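
The parent-child layout described here can be pictured with two small data classes. This Python sketch is purely illustrative: the class and field names (`ChunkGroupSketch`, `RowChunkSketch`, `parent_id`, and so on) are hypothetical and do not reflect the real Chunks model.

```python
from dataclasses import dataclass, field


@dataclass
class RowChunkSketch:
    content: str            # JSON text for one table row
    embedding: list[float]  # vector computed from the content
    parent_id: str          # link back to the owning group


@dataclass
class ChunkGroupSketch:
    group_id: str
    metadata: dict          # file-level information shared by all rows
    children: list[RowChunkSketch] = field(default_factory=list)

    def add_row(self, content: str, embedding: list[float]) -> RowChunkSketch:
        """Attach a row chunk and record which group it belongs to."""
        chunk = RowChunkSketch(content, embedding, parent_id=self.group_id)
        self.children.append(chunk)
        return chunk


group = ChunkGroupSketch("group-1", {"filename": "inventory_report.csv"})
child = group.add_row('{"product_id": "INV001"}', [0.1, 0.2])
print(child.parent_id, len(group.children))
```

Keeping the parent identifier on each child lets downstream code get from any row back to the file metadata.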

Does v2.0.0 handle multiple tables in documents better?

Yes, v2.0.0 includes enhanced table extraction methods that can identify and process multiple markdown tables within a single document, combining all rows into one cohesive ChunkGroup while preserving table boundaries.
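
As a rough illustration of combining several tables while keeping their boundaries, the Python sketch below parses markdown text and tags each row with the index of the table it came from. It is a simplified stand-in, not the block's extraction method. The `table_index` key and the function name are hypothetical, and the parser assumes every table line starts with a pipe.

```python
def parse_markdown_tables(text: str) -> list[dict]:
    """Return one dict per data row, tagged with the index of its table."""
    rows = []
    table_index = -1
    header = None
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped.startswith("|"):
            header = None  # a non-table line ends the current table
            continue
        cells = [cell.strip() for cell in stripped.strip("|").split("|")]
        if header is None:
            header = cells  # first line of a new table is its header
            table_index += 1
        elif all(cell and set(cell) <= set("-:") for cell in cells):
            continue  # separator row such as |---|---|
        else:
            rows.append({"table_index": table_index, **dict(zip(header, cells))})
    return rows


TEXT = "| a | b |\n|---|---|\n| 1 | 2 |\n\nSome text\n\n| x |\n|---|\n| 9 |\n| 8 |"
parsed = parse_markdown_tables(TEXT)
for row in parsed:
    print(row)
```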

What embedding capabilities are included?

Each chunk automatically receives computed embeddings based on its JSON content, enabling vector similarity search, semantic clustering, and AI-powered data analysis workflows that weren't available in v1.0.0.
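
Vector similarity search over row embeddings usually comes down to ranking chunks by cosine similarity against a query vector. The Python sketch below shows that ranking step with made-up two-dimensional vectors. Real embeddings have many more dimensions, and how the block computes or stores them is not specified here.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_chunks(query_vec, chunks):
    """chunks: list of (content, embedding) pairs; best match first."""
    return sorted(
        chunks,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )


# Placeholder vectors for illustration only, not real embeddings.
toy_chunks = [
    ("row A", [1.0, 0.0]),
    ("row B", [0.0, 1.0]),
    ("row C", [0.7, 0.7]),
]
ranked = rank_chunks([1.0, 0.1], toy_chunks)
print([content for content, _ in ranked])
```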

Can I migrate from RowChunk v1.0.0 to v2.0.0?

Yes, but migration requires updating downstream blocks to handle the Chunks model instead of individual Chunk objects. The core table processing logic remains compatible, but the output is now organized as a ChunkGroup with embeddings.