What's New#
Tensorlake’s Document Ingestion API now detects and decodes barcodes as part of the standard parsing flow. Turn it on by setting a single flag in your parsing_options and get back:
- The barcode type (e.g., PDF417)
- The decoded barcode value
- The bounding box of each barcode on the page
No extra service, tooling, or post-processing required — barcodes are just another structured fragment in your DocumentAI output.
Why It Matters#
Barcodes are everywhere in operational documents:
- Shipping labels and packing slips
- Lab reports and sample labels
- Insurance documents and claim IDs
- Utility bills, tickets, and receipts
Until now, extracting barcode data usually meant bolting on a separate barcode library, wiring it into your ingestion pipeline, and stitching the results back to the original pages.
With Tensorlake, barcode extraction becomes a built-in capability:
- Less glue code – One API handles OCR, layout, and barcode reading
- Better context – Barcodes arrive alongside text, tables, and images for the same page
- Easier alignment – Bounding boxes let you link barcode values back to nearby text (e.g., “tracking number” labels)
The Problem#
Typical document parsing workflows treat barcodes as an afterthought:
- OCR engines often ignore them entirely
- Barcode SDKs usually operate on raw images, not documents with pages, chunks, and layout
- You have to manually keep track of where each barcode came from and which document region it belongs to
This leads to brittle pipelines and metadata drift: you can decode the barcode, but you don’t know which claim, shipment, or form section it was attached to.
How It Works#
When you enable barcode_detection in your parsing options and use the model03 OCR model, the pipeline:
- Parses each page into fragments (text, tables, barcodes, etc.)
- Runs a barcode detector/decoder over the page image
- Emits fragment_type: "barcode" entries right alongside normal text fragments
- Includes bounding boxes and page dimensions so you can position or highlight barcodes in a viewer
Barcode fragments show up alongside other page fragments in the response.
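To make the viewer use case concrete, here is a minimal sketch of mapping a barcode bounding box onto a rendered page. It assumes each bbox carries x1, y1, x2, y2 in the same units as the page dimensions, with the origin at the top-left corner. The function name scale_bbox, the output keys, and the page and viewer sizes are hypothetical and not part of the Tensorlake API.

```python
from typing import Dict


def scale_bbox(
    bbox: Dict[str, float],
    page_width: float,
    page_height: float,
    view_width: float,
    view_height: float,
) -> Dict[str, float]:
    """Map a bbox from page coordinates to viewer coordinates.

    Assumes a top-left origin and that bbox units match the page dimensions.
    """
    sx = view_width / page_width
    sy = view_height / page_height
    return {
        "left": bbox["x1"] * sx,
        "top": bbox["y1"] * sy,
        "width": (bbox["x2"] - bbox["x1"]) * sx,
        "height": (bbox["y2"] - bbox["y1"]) * sy,
    }


# Illustrative values only: a 612x792 page rendered at half size.
overlay = scale_bbox(
    {"x1": 2, "y1": 444, "x2": 207, "y2": 475},
    page_width=612,
    page_height=792,
    view_width=306,
    view_height=396,
)
```

The resulting left/top/width/height values can be fed to whatever rectangle primitive your viewer exposes.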
Example JSON fragment#
{
  "fragment_type": "barcode",
  "content": {
    "content": "PDF417: 4QGDkVjpF7nuGhQiOgLHwc",
    "html": null
  },
  "reading_order": 9,
  "bbox": {
    "y1": 444,
    "x2": 207,
    "x1": 2,
    "y2": 475
  }
}
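In the fragment above, the barcode type and decoded value arrive together in one string ("PDF417: ..."). A small helper can separate them. This is a sketch that assumes the "TYPE: value" layout seen in this one example holds for other barcodes; the Barcode tuple and parse_barcode_fragment are hypothetical names, not SDK types.

```python
from typing import Any, Dict, NamedTuple


class Barcode(NamedTuple):
    symbology: str  # e.g. "PDF417"; empty if no "TYPE: " prefix was found
    value: str
    bbox: Dict[str, int]


def parse_barcode_fragment(fragment: Dict[str, Any]) -> Barcode:
    """Split a barcode fragment's content into symbology and decoded value."""
    if fragment.get("fragment_type") != "barcode":
        raise ValueError("not a barcode fragment")
    raw = fragment["content"]["content"]
    # Assumption: the type and value are separated by the first ": ".
    symbology, sep, value = raw.partition(": ")
    if not sep:
        symbology, value = "", raw
    return Barcode(symbology, value, fragment["bbox"])


example = {
    "fragment_type": "barcode",
    "content": {"content": "PDF417: 4QGDkVjpF7nuGhQiOgLHwc", "html": None},
    "reading_order": 9,
    "bbox": {"y1": 444, "x2": 207, "x1": 2, "y2": 475},
}
barcode = parse_barcode_fragment(example)
```

Splitting only on the first separator keeps any later ": " inside the decoded value intact.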
Getting Started#
Barcode detection is available starting in the Tensorlake Python SDK version 0.2.91. Make sure you upgrade the SDK before running the examples.
pip install --upgrade tensorlake
1. Enable barcode detection in parsing_options
Below is a sample Python snippet using the Tensorlake SDK. The key change is adding barcode_detection="true" to ParsingOptions.
...

doc_ai = DocumentAI(
    api_key="YOUR_TENSORLAKE_CLOUD_API_KEY"
)

# Upload the document and keep its file ID
file_id = doc_ai.upload(path="barcode_file_008.pdf")

# Enable barcode detection; it is supported with the model03 OCR model
parsing_options = ParsingOptions(
    ocr_model="model03",
    barcode_detection="true",
)

# Start the parse job
parse_id = doc_ai.read(
    file_id=file_id,
    parsing_options=parsing_options,
)

# Wait for the job to finish and fetch the result
result = doc_ai.wait_for_completion(parse_id)
...
2. Use barcode output in your workflows#
Once you have the decoded barcodes and bounding boxes, you can:
- Match barcodes to internal IDs (shipment, claim, order, patient, etc.)
- Validate that the barcode value matches a printed text ID
- Flag documents where the barcode is missing or unreadable
- Visualize barcodes as overlays in your document viewer
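Two of these checks, flagging pages without a barcode and validating decoded values against printed text, can be sketched over plain dictionaries. The sketch assumes fragments have already been converted to dicts with fragment_type and content.content keys, that text fragments use the same content layout as barcode fragments, and that the decoded string may carry a "TYPE: " prefix. The names barcode_value and check_page are hypothetical.

```python
from typing import Any, Dict, List


def barcode_value(fragment: Dict[str, Any]) -> str:
    """Return the decoded value, dropping a leading 'TYPE: ' prefix if present."""
    raw = fragment["content"]["content"]
    _, sep, value = raw.partition(": ")
    return value if sep else raw


def check_page(fragments: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Report whether a page lacks barcodes and which values are not printed as text."""
    barcodes = [f for f in fragments if f.get("fragment_type") == "barcode"]
    # Assumption: text fragments expose their text under content.content too.
    page_text = " ".join(
        f["content"]["content"]
        for f in fragments
        if f.get("fragment_type") == "text"
    )
    values = [barcode_value(b) for b in barcodes]
    return {
        "missing": not barcodes,
        "unmatched": [v for v in values if v not in page_text],
    }


# Illustrative page: one text fragment and one barcode fragment.
page = [
    {"fragment_type": "text", "content": {"content": "Tracking number ABC123"}},
    {"fragment_type": "barcode", "content": {"content": "PDF417: ABC123"}},
]
report = check_page(page)
```

A substring match is the simplest possible comparison; real documents may need normalization (whitespace, case, check digits) before values are compared.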
Try it#
Sample Cookbook: Barcode Detection Demo
Documentation: Parsing Documents
Status#
✅ Live now in the Document Ingestion API and SDKs
✅ Supported in the model03 OCR model
