RichDocument, Table, and related helpers backed by Docling. RichDocument wraps a DoclingDocument (e.g. produced by converting a PDF or Markdown file) and renders it as Markdown for a language model. Table represents a single table within a Docling document and provides transpose, to_markdown, and query/transform helpers. Use RichDocument.from_document_file to convert a PDF or other supported format, and get_tables() to extract structured table data for downstream LLM-driven Q&A or transformation tasks.

Classes

CLASS RichDocument

A RichDocument is a block of content backed by a DoclingDocument. Provides helper functions for working with the document and extracting parts such as tables. Use from_document_file to convert PDFs or other formats, and save/load for persistence. Args:
  • doc: The underlying Docling document to wrap.
Methods:

FUNC parts

parts(self) -> list[Component | CBlock]
Return the constituent parts of this document. Currently always returns an empty list. Future versions may support chunking the document into constituent parts. Returns:
  • list[Component | CBlock]: Always an empty list.

FUNC format_for_llm

format_for_llm(self) -> TemplateRepresentation | str
Return the document content as a Markdown string. No template is needed; the full document is exported to Markdown directly. Returns:
  • TemplateRepresentation | str: The full document rendered as Markdown.

FUNC docling

docling(self) -> DoclingDocument
Return the underlying DoclingDocument. Returns:
  • The wrapped Docling document instance.

FUNC to_markdown

to_markdown(self)
Return the full text of the document as Markdown.

FUNC get_tables

get_tables(self) -> list[Table]
Return all tables found in this document. Returns:
  • list[Table]: A list of Table objects extracted from the document.

FUNC save

save(self, filename: str | Path) -> None
Save the underlying DoclingDocument to a JSON file for later reuse. Args:
  • filename: Destination file path for the serialized document.

FUNC load

load(cls, filename: str | Path) -> RichDocument
Load a RichDocument from a previously saved DoclingDocument JSON file. Args:
  • filename: Path to a JSON file previously created by RichDocument.save.
Returns:
  • A new RichDocument wrapping the loaded document.

FUNC from_document_file

from_document_file(cls, source: str | Path | DocumentStream) -> RichDocument
Convert a document file to a RichDocument using Docling. Args:
  • source: Path or stream for the source document (e.g. a PDF or Markdown file).
Returns:
  • A new RichDocument wrapping the converted document.

CLASS TableQuery

A Query component specialised for Table objects. Formats the table as Markdown alongside the query string so the LLM receives both the structured table content and the natural-language question. Args:
  • obj: The table to query.
  • query: The natural-language question to ask about the table.
Methods:

FUNC parts

parts(self) -> list[Component | CBlock]
Return the constituent parts of this table query. Returns:
  • list[Component | CBlock]: A list containing the wrapped Table object.

FUNC format_for_llm

format_for_llm(self) -> TemplateRepresentation
Format this table query for the language model. Renders the table as Markdown alongside the query string, and forwards any tools and fields from the table’s own representation. Returns:
  • Template args containing the query string and the Markdown-rendered table.
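To make the idea of "template args" concrete, here is a rough sketch of how a question and a Markdown-rendered table could be bundled and then filled into a prompt. This is not the library's implementation: the function names, the key names "query" and "table", and the prompt layout are all assumptions chosen for illustration.

```python
def table_query_args(query: str, table_markdown: str) -> dict[str, str]:
    """Bundle the question and the rendered table as template arguments.

    The key names "query" and "table" are assumptions for illustration.
    """
    return {"query": query, "table": table_markdown}


def render_prompt(args: dict[str, str]) -> str:
    """Fill a hypothetical prompt layout: table first, then the question."""
    return f"{args['table']}\n\nQuestion: {args['query']}"


if __name__ == "__main__":
    args = table_query_args(
        "Which city is listed?",
        "| city |\n|---|\n| Oslo |",
    )
    print(render_prompt(args))
```

Keeping the table and the question as separate arguments lets the template decide how they are laid out for the model.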

CLASS TableTransform

A Transform component specialised for Table objects. Formats the table as Markdown alongside the transformation instruction so the LLM receives both the structured table content and the mutation description. Args:
  • obj: The table to transform.
  • transformation: Natural-language description of the desired mutation.
Methods:

FUNC parts

parts(self) -> list[Component | CBlock]
Return the constituent parts of this table transform. Returns:
  • list[Component | CBlock]: A list containing the wrapped Table object.

FUNC format_for_llm

format_for_llm(self) -> TemplateRepresentation
Format this table transform for the language model. Renders the table as Markdown alongside the transformation description, and forwards any tools and fields from the table’s own representation. Returns:
  • Template args containing the transformation description and the Markdown-rendered table.

CLASS Table

A Table represents a single table within a larger Docling Document. Args:
  • ti: The Docling TableItem extracted from the document.
  • doc: The parent DoclingDocument. Passing None may cause downstream Docling functions to fail.
Methods:

FUNC from_markdown

from_markdown(cls, md: str) -> Table | None
Create a Table from a Markdown string by round-tripping through Docling. Wraps the Markdown in a minimal document, converts it with Docling, and returns the first table found. Args:
  • md: A Markdown string containing at least one table.
Returns:
  • Table | None: The first Table extracted from the Markdown, or None if no table could be found.

FUNC parts

parts(self)
Return the constituent parts of this table component. The current implementation always returns an empty list because the table is rendered entirely through format_for_llm. Returns:
  • list[Component | CBlock]: Always an empty list.

FUNC to_markdown

to_markdown(self) -> str
Export this table as a Markdown string. Returns:
  • The Markdown representation of this table.

FUNC transpose

transpose(self) -> Table | None
Transpose this table and return the result as a new Table. Returns:
  • Table | None: A new transposed Table, or None if the transposed Markdown cannot be parsed back into a Table.
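To illustrate what transposing a table means at the Markdown level, the following pure-Python sketch swaps rows and columns of a simple pipe table and returns None when the input cannot be parsed. It is a stand-alone illustration, not the library's implementation: the function names are hypothetical, and treating the first column as the new header row is an assumption that may differ from how Table.transpose handles headers.

```python
from __future__ import annotations


def parse_markdown_table(md: str) -> list[list[str]]:
    """Parse a simple pipe table into rows of cells, header row first."""
    rows = []
    for i, line in enumerate(md.strip().splitlines()):
        if i == 1:
            # Line 1 is the header separator in a well-formed pipe table.
            continue
        rows.append([c.strip() for c in line.strip().strip("|").split("|")])
    return rows


def transpose_markdown_table(md: str) -> str | None:
    """Swap rows and columns; return None if the table is ragged or empty."""
    rows = parse_markdown_table(md)
    if not rows or any(len(r) != len(rows[0]) for r in rows):
        return None
    # The first original column becomes the new header row (an assumption).
    header, *body = zip(*rows)
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += ["| " + " | ".join(r) + " |" for r in body]
    return "\n".join(lines)


if __name__ == "__main__":
    source = "| name | age |\n| --- | --- |\n| Ann | 30 |\n| Bob | 25 |"
    print(transpose_markdown_table(source))
```

As with the documented method, callers should check for None before using the result.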

FUNC format_for_llm

format_for_llm(self) -> TemplateRepresentation | str
Return the table representation for the Formatter. Returns:
  • TemplateRepresentation | str: A [TemplateRepresentation](../../../core/base#class-templaterepresentation) that renders the table as its Markdown string using a {{table}} template.