RowChunk 2.0.0
Overview
Available Versions: 2.0.0 (current) | 1.0.0
Description
(v2) Splits a table document into one JSON chunk per row, computes embeddings, and emits a single ChunkGroup containing all rows.
Configuration Options
No configuration options available.
Inputs
| Name | Data Type | Description |
|---|---|---|
| data | File or str | Table data source. Accepts File objects (CSV, XLSX, or documents) or markdown table strings. Documents are converted using document intelligence with enhanced table extraction. |
Outputs
| Name | Data Type | Description |
|---|---|---|
| chunks | Chunks | Structured Chunks object containing a ChunkGroup with each table row as an individual chunk. Includes computed embeddings and parent-child relationships. |
Version History
- 2.0.0 (Current) - Native implementation
- 1.0.0 - Native implementation
Examples
blocks:
  - name: process_inventory_table
    type: RowChunk_2_0_0
    input:
      data: "inventory_report.csv"
    # Input CSV:
    # product_id,name,category,stock,price,status
    # INV001,Laptop Computer,Electronics,45,899.99,active
    # INV002,Wireless Mouse,Electronics,120,29.99,active
    # INV003,Office Chair,Furniture,23,159.99,active
    # Output: Chunks object with ChunkGroup containing:
    # - Parent metadata (file info, content summary)
    # - 3 child chunks with JSON content and computed embeddings
    # - Structured relationships for downstream processing
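To make the "one JSON chunk per row" idea concrete, here is a minimal Python sketch using the sample CSV from the example. It is not the block's implementation: the function name `rows_to_json_chunks` is hypothetical, values are left as strings because the page does not say whether RowChunk casts types, and embedding computation is omitted.

```python
import csv
import io
import json

# Sample rows copied from the example on this page.
CSV_TEXT = """product_id,name,category,stock,price,status
INV001,Laptop Computer,Electronics,45,899.99,active
INV002,Wireless Mouse,Electronics,120,29.99,active
INV003,Office Chair,Furniture,23,159.99,active
"""


def rows_to_json_chunks(csv_text: str) -> list[str]:
    """Return one JSON string per data row, keyed by the header names.

    Hypothetical helper: it only illustrates the row-to-JSON step and
    keeps every value as a string.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [json.dumps(row) for row in reader]


chunks = rows_to_json_chunks(CSV_TEXT)
for chunk in chunks:
    print(chunk)
```

Each printed line corresponds to one child chunk's content; the real block additionally attaches embeddings and a parent ChunkGroup.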
Error Handling
No Markdown Table Found
- Error Code: BlockError
- Common Cause: Document conversion or string input doesn't contain recognizable markdown table format
- Solution: Ensure document contains table structures or markdown input follows proper table syntax with pipe separators
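As a rough illustration of what "proper table syntax with pipe separators" means, the following Python sketch parses the first pipe-delimited table in a string and fails when none is present. The helper names are hypothetical, `ValueError` merely stands in for the block's `BlockError`, and the sketch assumes every table row starts and ends with a pipe, which is stricter than some markdown dialects.

```python
import re


def _is_row(line: str) -> bool:
    # Assumption: a table row starts and ends with a pipe character.
    return len(line) > 1 and line.startswith("|") and line.endswith("|")


def _cells(line: str) -> list[str]:
    # Drop the outer pipes, then split on the inner ones.
    return [cell.strip() for cell in line[1:-1].split("|")]


def _is_separator(line: str) -> bool:
    # A separator row looks like |---|:---:|---| .
    return _is_row(line) and all(
        re.fullmatch(r":?-+:?", cell) for cell in _cells(line)
    )


def parse_markdown_table(text: str) -> list[dict[str, str]]:
    """Parse the first markdown table in text into a list of row dicts."""
    lines = [line.strip() for line in text.splitlines()]
    for i in range(len(lines) - 1):
        if _is_row(lines[i]) and _is_separator(lines[i + 1]):
            header = _cells(lines[i])
            rows = []
            for line in lines[i + 2:]:
                if not _is_row(line):
                    break
                rows.append(dict(zip(header, _cells(line))))
            return rows
    # Stand-in for the BlockError described above.
    raise ValueError("No markdown table found")
```

Running a check like this on string input before passing it to the block can surface formatting problems early.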
Unsupported File Type
- Error Code: BlockError
- Common Cause: File extension is not supported for table processing or document intelligence conversion
- Solution: Use supported formats: CSV, XLSX, or document formats (PDF, DOCX, PPTX, images) convertible by document intelligence
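A pre-flight check on the file extension can catch this error before the block runs. The sketch below is an assumption-laden illustration: `classify_input` is a hypothetical helper, the image extensions are guesses (this page only says "images"), and `ValueError` stands in for `BlockError`.

```python
from pathlib import Path

TABULAR_EXTENSIONS = {".csv", ".xlsx"}
DOCUMENT_EXTENSIONS = {".pdf", ".docx", ".pptx"}
# Assumption: the page lists "images" without naming extensions.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def classify_input(filename: str) -> str:
    """Return 'tabular' or 'document' for a supported file name."""
    suffix = Path(filename).suffix.lower()
    if suffix in TABULAR_EXTENSIONS:
        return "tabular"
    if suffix in DOCUMENT_EXTENSIONS or suffix in IMAGE_EXTENSIONS:
        return "document"
    # Stand-in for the BlockError described above.
    raise ValueError(f"Unsupported file type: {suffix or '(none)'}")
```

For example, `classify_input("inventory_report.csv")` returns `"tabular"`, while a `.txt` file raises.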
Chunk Building Error
- Error Code: ProcessingError
- Common Cause: Failed to build chunks or ChunkGroup structure during processing
- Solution: Check data integrity and ensure table rows contain valid data that can be JSON serialized
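One way to check that rows "can be JSON serialized" is to try serializing each row and collect the failures. This is a sketch with a hypothetical helper name, `find_unserializable_rows`; it does not reproduce how the block itself validates data.

```python
import datetime
import json


def find_unserializable_rows(rows: list[dict]) -> list[tuple[int, str]]:
    """Return (row index, error message) for rows json.dumps rejects."""
    problems = []
    for index, row in enumerate(rows):
        try:
            json.dumps(row)
        except (TypeError, ValueError) as exc:
            problems.append((index, str(exc)))
    return problems


rows = [
    {"product_id": "INV001", "stock": 45},
    # date objects are not JSON serializable by default
    {"product_id": "INV002", "restocked": datetime.date(2024, 1, 5)},
]
problems = find_unserializable_rows(rows)
```

Values such as dates, bytes, or sets typically need converting to strings or lists before they can be serialized.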
FAQ
What are the key improvements in RowChunk v2.0.0?
Version 2.0.0 adds structured Chunks output with ChunkGroup organization, automatic embedding computation, enhanced markdown table extraction, and improved parent-child relationships for better data lineage and searchability.
How does the ChunkGroup structure work?
All table rows are grouped under a single parent object containing file metadata. Each row becomes a child chunk with computed embeddings, maintaining relationships that enable context-aware processing and semantic search capabilities.
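The parent-child layout described here might be modelled roughly as follows. The class and field names (`Chunk`, `ChunkGroup`, `group_id`, `parent_id`, `children`) are assumptions for illustration, not the block's actual schema, and the embeddings are placeholder vectors.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    content: str            # JSON text of one table row
    embedding: list[float]  # placeholder vector in this sketch
    parent_id: str          # links the row back to its group


@dataclass
class ChunkGroup:
    group_id: str
    metadata: dict
    children: list[Chunk] = field(default_factory=list)

    def add_row(self, content: str, embedding: list[float]) -> Chunk:
        """Create a child chunk that points back at this group."""
        chunk = Chunk(content, embedding, parent_id=self.group_id)
        self.children.append(chunk)
        return chunk


group = ChunkGroup("group-1", {"filename": "inventory_report.csv"})
group.add_row('{"product_id": "INV001"}', [0.0, 0.0])
group.add_row('{"product_id": "INV002"}', [0.0, 0.0])
```

Keeping a `parent_id` on every child lets downstream code recover the file metadata for any row it retrieves.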
Does v2.0.0 handle multiple tables in documents better?
Yes, v2.0.0 includes enhanced table extraction methods that can identify and process multiple markdown tables within a single document, combining all rows into one cohesive ChunkGroup while preserving table boundaries.
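To illustrate combining rows from several tables while keeping track of which table each row came from, here is a small Python sketch. It assumes tables are runs of consecutive lines that start and end with a pipe, each with a header and a separator row; the `table_index` key is a hypothetical way to mark boundaries and is not taken from the block's output.

```python
def split_tables(text: str) -> list[list[str]]:
    """Group consecutive pipe-delimited lines into separate table blocks."""
    tables: list[list[str]] = []
    current: list[str] = []
    for line in text.splitlines():
        stripped = line.strip()
        if len(stripped) > 1 and stripped.startswith("|") and stripped.endswith("|"):
            current.append(stripped)
        elif current:
            tables.append(current)
            current = []
    if current:
        tables.append(current)
    return tables


def rows_with_table_index(text: str) -> list[dict]:
    """Flatten all tables into one row list, tagging each row's source table."""
    rows = []
    for table_index, block in enumerate(split_tables(text)):
        header = [cell.strip() for cell in block[0][1:-1].split("|")]
        for line in block[2:]:  # skip the header and separator rows
            values = [cell.strip() for cell in line[1:-1].split("|")]
            rows.append({"table_index": table_index, **dict(zip(header, values))})
    return rows
```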
What embedding capabilities are included?
Each chunk automatically receives computed embeddings based on its JSON content, enabling vector similarity search, semantic clustering, and AI-powered data analysis workflows that weren't available in v1.0.0.
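As a sketch of what vector similarity search over row embeddings can look like, the following ranks chunks by cosine similarity to a query vector. The two-dimensional vectors are toy values chosen for illustration; how RowChunk computes embeddings, and their dimensionality, is not specified on this page.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float],
          embedded_chunks: list[tuple[str, list[float]]],
          k: int = 1) -> list[str]:
    """Return the contents of the k chunks most similar to the query."""
    ranked = sorted(
        embedded_chunks,
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [content for content, _ in ranked[:k]]


# Toy embeddings, not real model output.
embedded_chunks = [
    ('{"name": "Laptop Computer"}', [1.0, 0.0]),
    ('{"name": "Office Chair"}', [0.0, 1.0]),
]
best = top_k([0.9, 0.1], embedded_chunks)
```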
Can I migrate from RowChunk v1.0.0 to v2.0.0?
Yes, but migration requires updating downstream blocks to handle the Chunks model instead of individual Chunk objects. The core table processing logic remains compatible, but the output is now organized as a ChunkGroup with embeddings.
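During migration, one option is a small adapter that flattens the grouped output back into a plain list so existing per-chunk code keeps working. The dictionary shape below (`groups`, `children`, `content`, `embedding`) is a hypothetical stand-in for the Chunks model, whose real fields are not documented on this page.

```python
def flatten_chunks(chunks: dict) -> list[dict]:
    """Return every child chunk from every group as one flat list."""
    return [
        child
        for group in chunks["groups"]
        for child in group["children"]
    ]


# Hypothetical v2-style output with one group and two row chunks.
v2_output = {
    "groups": [
        {
            "metadata": {"filename": "inventory_report.csv"},
            "children": [
                {"content": '{"product_id": "INV001"}', "embedding": [0.1, 0.2]},
                {"content": '{"product_id": "INV002"}', "embedding": [0.3, 0.4]},
            ],
        }
    ]
}

# Downstream code written for v1 can iterate over this list as before.
flat = flatten_chunks(v2_output)
contents = [chunk["content"] for chunk in flat]
```

An adapter like this discards the parent metadata, so it suits a transition period rather than a long-term design.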