ABBYY Classification

The ABBYY Classification Service automatically categorizes files and images based on their contents. It uses machine learning and natural language processing to detect even subtle differences between files and images. The service is easy to train and provides a flexible, scalable classification process that can distinguish among many different User Defined Categories.

Possible use cases

Distinguish between different types of bank statements
Distinguish between different types of identification documents
Distinguish between different types of invoices
Distinguish between objects in images

Service Setup

Open the Project Detail View of the project you would like to add the service to.
Click on the Add Service button in the command bar.
Select ABBYY Classification Service from the available Service Types.
A new Service Configuration Wizard will open. When navigating the wizard, use the Next Step button in the command bar to save any changes you have made. The wizard has the following steps:

Step 1 - Allows configuration of various service settings, including the name and description. The default settings are sufficient for most use cases.
Step 2 - Allows adding User Defined Categories to train the service on.
Step 3 - Training
1. Click Upload Training Documents in the command bar
2. Select the User Defined Category you want to upload documents to.
  Demo training files are available here.
3. Upload files for each User Defined Category you wish to train the service on.
4. Once you have uploaded all your documents, click the Train Service button in the command bar to train your service.
5. Click Process on the dialog window that appears. Leave all settings as default.
6. A progress dialog will appear displaying the progress of the training.
  Training times can vary depending on the number of files that have been uploaded for training.
7. The progress dialog should automatically close once the training has completed.
Step 4 - The Definition Document should be created after the Service has been trained successfully.
Click on the Complete button in the command bar to validate your service configuration and close the wizard.
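
Before uploading in Step 3, it can help to check that every User Defined Category has training files ready. The following is a minimal Python sketch and is not part of the ABBYY Classification Service. It assumes a hypothetical local folder layout with one sub-folder per category; the function names count_training_files and categories_needing_files and the minimum parameter are illustrative only.

```python
from pathlib import Path


def count_training_files(root):
    """Return {category_name: file_count} for each sub-folder of root.

    Assumption: training files are kept locally in one sub-folder per
    User Defined Category (a hypothetical layout, not a service requirement).
    """
    counts = {}
    for folder in sorted(Path(root).iterdir()):
        if folder.is_dir():
            # Count only regular files directly inside the category folder.
            counts[folder.name] = sum(1 for p in folder.iterdir() if p.is_file())
    return counts


def categories_needing_files(counts, minimum=1):
    """Return the categories that have fewer than `minimum` files."""
    return [name for name, n in counts.items() if n < minimum]
```

Calling categories_needing_files(count_training_files("training")) on a hypothetical "training" folder lists the categories that still need files before you click Train Service.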

Service Configuration Settings

The ABBYY Classification Service can be configured to suit a range of use cases. The following settings are available:

Setting	Required Type	Description
ArchivingStrategy	Optional	Days before documents get deleted.
BatchSize	Hidden	Processing batch size.
DocumentProcessedStatus	Optional	Document status used to denote that a document has been processed.
Enabled	Hidden	Enable or disable the service.
ExecuteBeforeProcess		When set up as a child service, specify whether this service should be executed before the parent service gets executed.
ExecuteAfterProcess		When set up as a child service, specify whether this service should be executed after the parent service gets executed.
Password	Optional	Used for service authentication. Custom Code can be used to set the password. Can be set per document.
RemoveComments	Optional	Remove human comments from a document.

Add and Process Documents

In your Classification Service Card, click on the Inbox button.
Drag and drop files over the Inbox grid or click on the Upload button in the command bar.
The demo test files are available here.
When prompted to select a category, select the No Selection option in the bottom right of the dialog window.
To process single documents, click on the document’s Process action button.
To process multiple documents, select the documents you want to process and click on Process Documents in the command bar.

When testing a newly trained service, process only a few documents at a time and review the results.
If the results are not satisfactory, train the service with additional documents.
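
One way to review a small test batch is to compare the category you expected for each document with the category the service assigned. The following is a minimal Python sketch and is not part of the service. It assumes you have noted the expected and assigned categories by hand as pairs; the function names and the 0.8 threshold are illustrative assumptions, not recommended values.

```python
from collections import defaultdict


def per_category_accuracy(results):
    """Compute the share of correctly classified documents per category.

    results: iterable of (expected_category, assigned_category) pairs,
    collected by hand for a small test batch (an assumed workflow).
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for expected, assigned in results:
        totals[expected] += 1
        if expected == assigned:
            correct[expected] += 1
    return {category: correct[category] / totals[category] for category in totals}


def categories_to_retrain(results, threshold=0.8):
    """Return categories whose accuracy falls below the (assumed) threshold."""
    accuracy = per_category_accuracy(results)
    return sorted(cat for cat, acc in accuracy.items() if acc < threshold)
```

Categories returned by categories_to_retrain are candidates for uploading additional training documents before training the service again.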

View Processed Documents

In your Classification Service Card, click on the Outbox button.
Classification results will appear as new documents/files in the Outbox, with the Category column indicating the final document classification.