#atom

A compact document-understanding model from Hugging Face and IBM

Core Idea: SmolDocling is a 256-million-parameter vision-language model for document understanding, OCR, and document conversion, small enough to run on GPUs with limited VRAM.
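To make the "limited VRAM" point concrete, here is a rough back-of-the-envelope sketch of how much memory the weights alone would occupy at common precisions. The parameter count comes from this note; the function name `weight_memory_mib` and the dtype table are illustrative assumptions, not part of any SmolDocling API. The estimate covers weights only and ignores activations, the KV cache, and framework overhead, so real usage will be higher.

```python
# Rough weight-only memory estimate for a 256M-parameter model.
# Assumption: sizes below are the standard byte widths of each dtype;
# activations, KV cache, and runtime overhead are NOT included.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}

PARAM_COUNT = 256_000_000  # from the note: 256 million parameters


def weight_memory_mib(param_count: int, dtype: str) -> float:
    """Return the memory needed to hold the weights alone, in MiB."""
    return param_count * BYTES_PER_PARAM[dtype] / (1024 ** 2)


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype:>8}: {weight_memory_mib(PARAM_COUNT, dtype):7.1f} MiB")
```

Under these assumptions the weights take roughly 1 GiB in float32 and about half that in 16-bit precision, which is why a model of this size can plausibly fit on small consumer GPUs.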

Key Elements

Use Cases

Implementation

Limitations

Connections

References

  1. Hugging Face SmolDocling repository
  2. SmolDocling research paper
  3. Hugging Face blog post on Smol VLMs

#DocumentAI #OCR #SmolModels #HuggingFace #VisionLanguageModels

