
Document Ingestion now supports XML, DOC, and Markdown files

Document ingestion now supports XML, legacy DOC, and Markdown files with the same parsing capabilities as existing formats.

Key Highlights

  • Native XML parsing for config files and structured data exports
  • Legacy DOC file support for older document repositories
  • Markdown processing for documentation and technical specs

What's new

Document ingestion now handles three additional file formats: XML documents, legacy DOC files (Microsoft Word 97-2003), and native Markdown files. These join our existing support for PDF, DOCX, images, and other formats in a unified parsing pipeline.

Why it matters

  • XML files are everywhere in enterprise workflows (config files, data exports, structured documents)
  • Legacy DOC files still appear in older systems and document repositories
  • Markdown files are standard for documentation, README files, and technical specs
  • Unified processing means fewer custom preprocessing steps in your pipeline

Highlights

  • Native parsing preserves document structure and metadata
  • Same extraction and classification capabilities as other formats
  • Automatic format detection: no manual format specification is required
  • Full compatibility with structured extraction and summarization features
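
Since format detection is automatic, the only client-side step that might remain is choosing which files to send. The sketch below shows one way to do that, assuming selection by file extension. `NEW_EXTENSIONS` and `find_new_format_files` are hypothetical names, and the extension set is an illustration rather than a list published by the service.

```python
from pathlib import Path
from typing import List

# Illustrative extension set for the three new formats. This is an
# assumption for the example, not a list published by the service.
NEW_EXTENSIONS = {".xml", ".doc", ".md"}


def find_new_format_files(root: str) -> List[Path]:
    """Return files under `root` whose suffix matches one of the new formats."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in NEW_EXTENSIONS
    )
```

The suffix comparison is case-insensitive, so `REPORT.DOC` is picked up, while `.docx` files are left to whatever handling they already have.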

How to use

Works automatically when you upload any of these file types. No configuration changes needed.

    doc_ai = DocumentAI()

    # All of these now work seamlessly
    xml_file_id = doc_ai.upload(path="/path/to/config.xml")
    xml_result = doc_ai.parse_and_wait(xml_file_id)

    doc_file_id = doc_ai.upload(path="/path/to/legacy_report.doc")
    doc_result = doc_ai.parse_and_wait(doc_file_id)

    md_file_id = doc_ai.upload(path="/path/to/README.md")
    md_result = doc_ai.parse_and_wait(md_file_id)
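
Because all three formats go through the same two calls, a mixed batch can be handled in one loop. This is a minimal sketch, assuming a client object that exposes `upload(path=...)` and `parse_and_wait(file_id)` as in the snippet above. `parse_all` is a hypothetical helper, not part of the SDK.

```python
from typing import Any, Dict, Iterable


def parse_all(client: Any, paths: Iterable[str]) -> Dict[str, Any]:
    """Upload and parse each path, returning results keyed by path.

    `client` is assumed to expose upload(path=...) and
    parse_and_wait(file_id). This helper is hypothetical.
    """
    results: Dict[str, Any] = {}
    for path in paths:
        # No format argument is passed; the same calls are used for every file.
        file_id = client.upload(path=path)
        results[path] = client.parse_and_wait(file_id)
    return results
```

With the `doc_ai` client from the example above, this could be called as `parse_all(doc_ai, ["/path/to/config.xml", "/path/to/legacy_report.doc", "/path/to/README.md"])`.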

Status

✅ Live now. All existing parsing features work across the new formats.
