ML model training for detecting objects in documents using AlgoDocs

In this video, we will demonstrate AI-based document data extraction for detecting objects in documents. The AI features of AlgoDocs are based on machine learning models, which you can train on your own documents. We will walk through the steps for creating an AI-based extractor that detects objects in your documents and converts them into textual values. With AlgoDocs, you can easily extract even complicated data and tables from PDF or scanned documents.


Feel free to start a free subscription right now and begin parsing your PDF documents. You can use AlgoDocs for free forever, for up to 50 pages per month. If you need to process more pages, please see our affordable pricing plans.

If you have specific requirements and need a custom solution, please contact us.


Assume that we need to detect certain icons and output their corresponding values as text. For example, if this icon appears anywhere on the document, we want the “warranty” field to have a value of “5 years”; or if this icon appears on the document, then the “listings” field should be set to “ETL”. In this video, we will create the extractor and train the model to detect three icons only, but you can train your model on as many objects as you need.

We begin by creating the extractor, which we will name “Icons to Text”, and uploading a sample document. After our sample document passes the preprocessing operations, we click the “Manage” button. Keep in mind that you can create hybrid extractors that contain both extracting rules and ML models. For example, let us quickly create a rule-based field for capturing the revision information, which is followed by a “REV” keyword on the document. Here, we use the default filters, which are the “Specify Start” and “Specify End” positions. We name the field “Revision” and save it.

Now, we continue with training the AI model to detect our icons. The first step is to define labels and their values. Every label represents a unique icon on the document, so we need to define a label for each icon we want to detect. If we go back, we will see that the labels we have just created also appear here, because every label is actually a field. We can return to the model by clicking on any of these label fields.

Next, we prepare the dataset for our model. The dataset should contain the files that we will annotate with the labels we created. Click the “Add Files” button to add files to the dataset. As you can see, we have just one file, which contains three pages. This is definitely not enough, since our dataset must contain at least 100 files that contain the objects we need to detect. Therefore, we need to go to the File Manager and upload more files.
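Before moving on, the icon-to-text conversion we are training the model for can be pictured as a lookup from detected labels to their configured text values. The sketch below is purely illustrative and is not AlgoDocs code; the label names are hypothetical placeholders, while the field names and values come from the example above.

```python
# Illustrative sketch only -- not AlgoDocs code. The label names
# ("warranty_5y", "etl_listed") are hypothetical placeholders.
ICON_FIELDS = {
    "warranty_5y": ("warranty", "5 years"),  # warranty icon -> warranty = "5 years"
    "etl_listed": ("listings", "ETL"),       # ETL icon -> listings = "ETL"
}

def icons_to_text(detected_labels):
    """Map the icon labels detected on a document to textual field values."""
    fields = {}
    for label in detected_labels:
        if label in ICON_FIELDS:
            field, value = ICON_FIELDS[label]
            fields[field] = value
    return fields

print(icons_to_text(["warranty_5y", "etl_listed"]))
# {'warranty': '5 years', 'listings': 'ETL'}
```

In AlgoDocs itself, this mapping is what you configure when you define each label together with its value.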
After uploading files in the File Manager, we can select them for our model’s dataset. Before we continue with annotation, we need to remove the files that do not contain any of the icons we need to detect. We can remove them from the dataset table right here, or go to the annotation panel and remove them there. Removing unwanted files may take some time, so I will do this behind the scenes and come back when I am done.

Now our dataset contains only files with the icons we need to label, and we need to label all of them. When labeling a file, we draw boundaries around the icons we want the model to detect. After drawing a boundary, we select the label name for that icon and save it. We repeat these steps for any other icons on the page, and continue in the same manner with all remaining files. Again, I will pause the recording here and return when I am done with the entire dataset.

Now our dataset is ready, and we can start training our model. To do so, we go to the “Training Status” tab and click the “Start Training” button. That’s it; we are done! Please note that training may take some time depending on your dataset size and files; you will receive an email notification when training is completed. I will pause the video again and come back when training is finished.

The training is complete, and our model is ready to detect icons. We can go to the “Extracted Data” section to see the output produced by our extractor. As we can see, we have four fields in total: “Revision”, which is a rule-based field, and three icon fields, which are detected by our trained model.
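To summarize, the final output combines the rule-based “Revision” field with the three fields produced by the trained model. The sketch below shows one hypothetical shape such a record could take; the field names come from the walkthrough, but the values, the dictionary structure, and the third field’s name are assumptions, not AlgoDocs’ actual output format.

```python
# Hypothetical shape of one document's extracted data -- the field names
# follow the walkthrough, but the values, structure, and the name of the
# third icon field are assumptions, not AlgoDocs' actual output format.
extracted = {
    "Revision": "A",        # rule-based field: the value following the "REV" keyword
    "warranty": "5 years",  # set because the warranty icon was detected
    "listings": "ETL",      # set because the ETL icon was detected
    "third_icon": None,     # placeholder: third icon field, not present on this page
}

# One rule-based field plus three model-detected icon fields.
assert len(extracted) == 4
```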