Recognize files Version 2 (Net)
Description
Extracts text from a file using the ROBIN OCR service with Soica.
Parameters
Input parameters
URL: Host address
Login: Login for authorization
Password: Password for authorization
Package class: Package class for recognition
File: Path to the file for recognition
Result type: Format of the result data
Recognition profile: Profile for recognition
Timeout: Timeout in milliseconds that limits the execution time of the action
Output parameters
Result: Recognition result in JSON/XML format
Status: Status of the package
Settings
Property | Description | Type | Filling example | Mandatory field |
---|---|---|---|---|
Parameters | ||||
Url | Host address | Robin.String | | Yes |
Login | Login for authorization | Robin.String | | Yes |
Password | Password for authorization | Robin.Password | | Yes |
Package class | Package class for recognition | Robin.String | | Yes |
File | Path to the file for recognition. Supported extensions: JPEG, PDF, TIFF, BMP, PNG, DOCX, GIF | Robin.FilePath | | Yes |
Result type | Format of the result data. Drop-down list with the items XML and JSON. Default value: XML | Robin.String | | No |
Recognition profile | Profile for recognition. Profiles are created in Soica itself, and the user knows in advance which one to select. The default value is set by the system when the package class is created and is named default | Robin.String | | No |
Timeout | Timeout in milliseconds that limits the execution time of the action | Robin.Numeric | | No |
Results | ||||
Status | Recognition status of the document | Robin.String | | |
Result | Collection of JSON objects or XML contexts containing the recognized data. If the specified timeout expires before the service finishes recognition, this parameter is returned empty. If recognition of the document is still in progress, the result is not filled in. | Robin.Collection | | |
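The settings above can be summarized as a small data structure. The following is a minimal sketch, not part of the ROBIN product: the class name `RecognizeParams`, its field names, and the `validate` helper are hypothetical, and only the mandatory flags, the defaults (XML, default) and the extension list come from the table. The table lists JPEG only, so whether variants such as .jpg are accepted is not stated and the check below follows the list literally.

```python
from dataclasses import dataclass
from typing import List, Optional

# Extensions listed in the "File" row of the settings table.
SUPPORTED_EXTENSIONS = {"jpeg", "pdf", "tiff", "bmp", "png", "docx", "gif"}
RESULT_TYPES = {"XML", "JSON"}


@dataclass
class RecognizeParams:
    """Hypothetical container for the action inputs (not a ROBIN API)."""
    url: str
    login: str
    password: str
    package_class: str
    file: str
    result_type: str = "XML"              # default value per the table
    recognition_profile: str = "default"  # default profile name per the table
    timeout_ms: Optional[int] = None      # optional, in milliseconds

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the inputs look fine."""
        problems = []
        for name in ("url", "login", "password", "package_class", "file"):
            if not getattr(self, name):
                problems.append(f"{name} is mandatory")
        ext = self.file.rsplit(".", 1)[-1].lower() if "." in self.file else ""
        if self.file and ext not in SUPPORTED_EXTENSIONS:
            problems.append(f"unsupported file extension: {ext or '(none)'}")
        if self.result_type not in RESULT_TYPES:
            problems.append(f"result_type must be one of {sorted(RESULT_TYPES)}")
        if self.timeout_ms is not None and self.timeout_ms < 0:
            problems.append("timeout_ms must not be negative")
        return problems
```

Calling `validate()` before starting the robot surfaces a missing mandatory field or an unexpected result type early, instead of as an error from the service.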
Special conditions of use
General principles of working with ROBIN OCR:
Sending a document for recognition takes at least two requests. The first request creates a package; it carries the only image of the package or, if there are several, the first one, and it returns the GUID of the package. If the package contains several images, each remaining image is added to the package in its own subsequent request. The final request launches the package for processing. The GUID of the created package is passed to the second and all subsequent requests.
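The request sequence described above can be sketched as a small planning function. This is an illustration only: the request labels `create_package`, `add_image` and `launch_package` are hypothetical names, not real ROBIN OCR endpoints, and no network call is made.

```python
from typing import List, Tuple


def plan_requests(images: List[str]) -> List[Tuple[str, str]]:
    """Return the ordered (request, image) pairs needed to submit one package.

    The request names are illustrative labels, not real endpoint names.
    """
    if not images:
        raise ValueError("a package needs at least one image")
    # First request: creates the package with the first (or only) image
    # and returns the package GUID.
    steps = [("create_package", images[0])]
    # One request per remaining image; each one carries the package GUID.
    steps += [("add_image", image) for image in images[1:]]
    # Final request: launches the package for processing; it carries no image.
    steps.append(("launch_package", ""))
    return steps
```

For a single image the plan contains exactly two requests, which matches the stated minimum; every additional image adds one more request.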
The result format is configured in advance, in the scenario.
The user receives the result as a collection of JSON objects or XML contexts. The results can be processed with studio actions.
The user must know the list of package classes before launching the action.
Package classes are configured in the system by the engineer; select a class suitable for image processing. The package class name is the name of the customized project. The package class name must be specified when creating a package (mandatory). The package name must be specified in the request.
If the robot completes with an error, the cause of the error is shown in the error text.
If the status of the document is not "export", the robot cannot get the result and skips the document. The user must then move the document to the "export" status on the server: validate the file manually and send it for export by making and accepting changes in it.
Statuses:
- import => pending status change;
- recognize => pending status change;
- validation => manually change the status in the Soica system;
- export => ready for unloading;
- deleted => the package was deleted;
- inaccessible => the package is unavailable;
- quality control => the user sent the document in the wrong scenario; manually change the status in the Soica system.
If the timeout expires before the recognized text is received, an empty result is returned and the action does not terminate with an error.
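The status and timeout behaviour above can be sketched as a polling loop. This is a minimal sketch under assumptions: `wait_for_export`, the `get_status` callback and the `poll_ms` interval are hypothetical and do not describe how the action is actually implemented. Only "import" and "recognize" are treated as statuses that change on their own; any other status stops the wait, and a timeout stops it without raising an error.

```python
import time
from typing import Callable

# Statuses that are expected to change without user intervention.
PENDING = {"import", "recognize"}


def wait_for_export(get_status: Callable[[], str], timeout_ms: int,
                    poll_ms: int = 500,
                    clock: Callable[[], float] = time.monotonic,
                    sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll the package status and return the last status observed.

    Stops when the status is no longer pending or when the timeout
    (in milliseconds) expires. A timeout does not raise an error.
    """
    deadline = clock() + timeout_ms / 1000.0
    while True:
        status = get_status()
        if status not in PENDING:
            return status   # export, validation, deleted, ...
        if clock() >= deadline:
            return status   # timed out while still pending
        sleep(poll_ms / 1000.0)
```

A caller would fetch the result only when the returned status is "export" and otherwise keep the result empty, which mirrors the rule that an expired timeout yields an empty result rather than an error.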
The action sends documents for recognition and retrieves the results right away:
- the export REST service is responsible for retrieving the result;
- whether the result is JSON or XML is configured inside the package processing scenario in Soica;
- the action must return a JSON object or an XML context rather than a string. An XML context must be closed; a JSON object does not need to be;
- the document must first be unloaded by the export module;
- the action must wait until the document status becomes "export" and only then start retrieving the result.
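The difference between returning a string and returning a JSON object or XML context can be illustrated outside the studio. The sketch below is an assumption-laden stand-in: `parse_results` is a hypothetical helper, Python's `xml.etree.ElementTree` elements are used in place of ROBIN's XML contexts (they are not the same thing and need no closing), and the sample payloads in the usage note are invented.

```python
import json
import xml.etree.ElementTree as ET
from typing import Any, List


def parse_results(raw_items: List[str], result_type: str = "XML") -> List[Any]:
    """Turn raw response strings into parsed JSON objects or XML elements."""
    kind = result_type.upper()
    if kind == "JSON":
        return [json.loads(item) for item in raw_items]
    if kind == "XML":
        return [ET.fromstring(item) for item in raw_items]
    raise ValueError(f"unsupported result type: {result_type}")
```

For example, `parse_results(['{"text": "hi"}'], "JSON")` yields a list with one dictionary, and an empty input list yields an empty collection, which corresponds to the empty result returned after a timeout.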
Example of use
Task
Recognize text in a document.
Solution
Use the "Recognize files" action.
Implementation
- Place the "Recognize files" action in the workspace.
- Set the action parameters with the correct data.
- Launch the robot using the "Start" button in the top panel.
Result
The robot returned the processed files. The result is a collection of JSON objects or XML contexts, and the package status is "export".