Credit Application Form

How to extract data from an application form

This is an example of a Credit Application form.

This process extracts fields from the form, such as the First Name, Last Name, Date of Birth, Physical Address, E-mail Address, Employment Details, and Banking Information. The detailed steps are as follows:

Create a new Project or select an existing one.
Create a new Service.
Select MS Forms Recognizer from the Extract & Verify group.
The Configuration Wizard for the service will appear.
(Optional) Change the service name and service description.
Click on Next: Step 2 - Categories (or click on Step 2 Wizard Step).
Add a new Category to upload the training documents.
Click on Next: Step 3 - Layout Analysis (or click on Step 3 Wizard Step).
The Document Data Grid will appear. Select Upload Training Documents to open a dialogue for uploading the Credit Application training documents. Files can also be dragged and dropped onto the grid.
Select the Category of the training documents.
Once the files are loaded, note that they are in the Received state.
Select the documents you want to include in training the service (a minimum of 5 is required for MS Forms Recognizer), and select Analyze Documents.
Select the desired Execution Parameters for analyzing the documents. Normally, the default values will suffice. Click Process.
A progress dialogue is displayed while the request is processed. The Progress Logs can be expanded to view detailed logging of the progress. Once the operation is complete, the dialogue can be closed.
After the documents are analyzed, the layout of each document is available in a structured format. These documents are marked with the Analyzed status. By using Custom Labeling, the service can be trained to detect values relative to the document layout. Click Next: Step 4 - Custom Labeling (or click on Step 4 Wizard Step).
The Document Designer is presented in step 4. A preview of the document is shown with the analyzed layout data overlaid.
Details and metadata of the detected layout Parameters can be viewed by clicking on the bounding box of the text.
In the case of the Credit Application form, we are interested in extracting fields such as the First Name, Last Name and Date of Birth. To start labeling, click on the preview of the document in the Document Designer, and drag to select the region where a field can be found.
When a region is selected, a dialogue is presented to create a Parameter Definition for each of the fields. An existing field can be used, or a new one can be created.
Once all the fields that require extraction are labeled, the Parameters with the detected values are displayed in the Document Data view. The document can then be saved by clicking on Save Custom Labels.
Repeat the labeling steps (selecting a region, creating the Parameter Definitions, and saving the custom labels) for all the documents uploaded for training.
Once all the documents are labeled, the model of the service can be trained. Click on Next: Step 5 - Training (or click on Step 5 Wizard Step).
Now that the documents have been labeled, we can request the training operation. In step 5, click on Train Service. Again, the progress dialogue will show the progress of the training operation. Note that after training has completed successfully, the status will change to Trained.
At this stage, the service is trained, and documents can be loaded into the Inbox of the service for processing.
The OCR results can be viewed in the Outbox of the service.
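
The document statuses and the minimum document count described in the steps above can be summarized in a short Python sketch. This is not the product's API: the names TrainingDocument, Status, analyze and train are hypothetical, and attaching the Trained status to each document (rather than to the service) is an assumption made for illustration. The sketch only models the order of operations: documents start as Received, become Analyzed, are labeled, and can then be used for training.

```python
from dataclasses import dataclass, field
from enum import Enum

# Minimum number of training documents stated in the steps above.
MIN_TRAINING_DOCUMENTS = 5


class Status(Enum):
    """Statuses named in the walkthrough (modeled per document as an assumption)."""
    RECEIVED = "Received"
    ANALYZED = "Analyzed"
    TRAINED = "Trained"


@dataclass
class TrainingDocument:
    """Hypothetical stand-in for a document in the Document Data Grid."""
    name: str
    status: Status = Status.RECEIVED
    labels: dict = field(default_factory=dict)  # field name -> labeled value


def analyze(documents):
    """Mark documents as Analyzed, enforcing the minimum document count."""
    if len(documents) < MIN_TRAINING_DOCUMENTS:
        raise ValueError(
            f"need at least {MIN_TRAINING_DOCUMENTS} documents, got {len(documents)}"
        )
    for doc in documents:
        doc.status = Status.ANALYZED


def train(documents):
    """Mark documents as Trained once every one is analyzed and labeled."""
    for doc in documents:
        if doc.status is not Status.ANALYZED:
            raise ValueError(f"{doc.name} has not been analyzed")
        if not doc.labels:
            raise ValueError(f"{doc.name} has no custom labels")
    for doc in documents:
        doc.status = Status.TRAINED


if __name__ == "__main__":
    # Hypothetical file names and label values, for illustration only.
    docs = [TrainingDocument(f"credit_application_{i}.pdf") for i in range(1, 6)]
    analyze(docs)
    for doc in docs:
        doc.labels = {"First Name": "Jane", "Last Name": "Doe"}
    train(docs)
    print([doc.status.value for doc in docs])
```

The checks in train mirror the order of the wizard steps: a document that was never analyzed or never labeled stops the training request before any status changes.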