A PDF parser is a program or library that lets end users and organizations extract data from native PDF documents. Organizations often need to parse PDF documents for specific fields such as Account Number, Date, Address, Bill to/from information … Read More

Convert PDF to JSON – Convert PDF Documents to Structured JSON Objects
Organizations across many industries rely on PDF documents, since PDF is a common format for transferring business data. Purchase orders, invoices, agreements, and many other document types are exchanged in PDF format. … Read More

Extract handwritten text from scanned PDFs and images
Optical Character Recognition (OCR) engines focus primarily on machine-printed text and may produce low accuracy on handwritten text. Intelligent Character Recognition (ICR) is a more advanced recognition system designed to recognize handwritten text. This allows the automatic conversion … Read More

A guide on extracting tables from low quality scanned documents
Many companies deal with thousands of documents every month. Document workflow automation becomes vital for such companies as the number of documents increases. One of the most frequent and at the same time tedious operations when processing documents is reading … Read More

Extract tables from PDF and scanned documents
Extracting tables from PDF documents is a tedious task, especially when the documents are scanned PDFs. Even when the documents are computer-generated PDFs, it can still be complex and frustrating, since copying tabular text from … Read More