#atom

A structured format for representing extracted document elements with positional information

Core Idea: DocTags is a markup format used by document understanding models like SmolDocling to represent both the content and the structural information of processed documents, including element types and spatial positioning.

Key Elements

<doc>
  <text loc="[x1, y1, x2, y2]">Extracted text content</text>
  <list>
    <item loc="[x1, y1, x2, y2]">List item 1</item>
    <item loc="[x1, y1, x2, y2]">List item 2</item>
  </list>
  <code loc="[x1, y1, x2, y2]">
    def example():
        return "code block"
  </code>
</doc>
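
A minimal sketch of how the layout above could be read back into typed elements with bounding boxes. It assumes the simplified, XML-like form shown in this note, with a `loc` attribute holding four numbers; the tokens a model such as SmolDocling actually emits may differ. The `Element` class, the `parse_doctags` function, and the sample coordinates are hypothetical names and values introduced for illustration only.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Element:
    kind: str    # tag name, e.g. "text", "item", "code"
    bbox: tuple  # (x1, y1, x2, y2), taken from the loc attribute
    text: str    # extracted content with surrounding whitespace removed


# Hypothetical sample: same shape as the example in this note,
# with made-up numeric coordinates in place of x1, y1, x2, y2.
SAMPLE = """<doc>
  <text loc="[10, 20, 200, 40]">Extracted text content</text>
  <list>
    <item loc="[10, 50, 200, 70]">List item 1</item>
    <item loc="[10, 80, 200, 100]">List item 2</item>
  </list>
  <code loc="[10, 110, 200, 150]">def example(): return "code block"</code>
</doc>"""


def parse_doctags(markup):
    """Return every element that carries a loc attribute, in document order."""
    root = ET.fromstring(markup)
    elements = []
    for node in root.iter():
        loc = node.get("loc")
        if loc is None:
            continue  # containers such as <doc> and <list> have no box here
        x1, y1, x2, y2 = json.loads(loc)
        elements.append(Element(node.tag, (x1, y1, x2, y2), (node.text or "").strip()))
    return elements


elements = parse_doctags(SAMPLE)
for el in elements:
    print(el.kind, el.bbox, el.text)
```

Flattening the tree into a list keeps each element's type and position together, which is usually enough for filtering by element type or sorting by position on the page. Nesting (for example, which items belong to which list) is dropped in this sketch.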

Applications

Advantages

Further Processing

Connections

#DocumentMarkup #StructuredData #DocumentAI #InformationExtraction #DocumentRepresentation

