Action group: Text recognition
Description
This action recognizes text on the specified page of a PDF document and saves the recognized text to a variable.
PDF file path - Path to the PDF file to be recognized
Language of the text - Languages that the recognizer expects in the text
Page number - The number of the page of the file from which the text will be read
Result - Variable into which the recognized text will be saved
Property | Description | Type | Filling example | Mandatory field |
---|---|---|---|---|
Parameters | ||||
PDF file path | The path to the PDF file for recognition | Robin.FilePath | | Yes |
Language of the text | Expected languages of the text in the PDF file. Selected from a dropdown list. Default value: Russian | Robin.Collection | | No |
Page number | The page number of the file from which the text will be read | Robin.Numeric | | No |
Additional language | An additional language required for document recognition. Selected from a dropdown list. Default value: No. If the same option is selected in the "Language" and "Additional language" parameters, no error occurs; the duplicate is counted as one language | Robin.Collection | | No |
Trained model | Tesseract trained model file in .traineddata format. Allows you to load your own model trained on the required fonts. If the parameter is populated, it takes priority over the "Language" and "Additional language" parameters | |||
Results | ||||
Result | Text recognized on the specified page of the PDF. If the document does not contain the specified page, a blank value is stored. | Robin.Collection |
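The interplay of the "Language", "Additional language" and "Trained model" parameters can be illustrated with a short Python sketch. This is not the action's actual implementation. The function name `resolve_languages`, the Tesseract language codes `rus` and `eng`, and the model path are assumptions made for illustration; the sketch only mirrors the rules from the table (duplicates count once, a trained model takes priority) and joins languages with `+`, as Tesseract expects.

```python
from pathlib import Path
from typing import Optional


def resolve_languages(language: str = "rus",
                      additional: Optional[str] = None,
                      trained_model: Optional[str] = None) -> str:
    """Return a Tesseract-style language string such as "rus+eng" (hypothetical helper)."""
    # A custom trained model takes priority over both language parameters.
    if trained_model:
        # Tesseract refers to a model by its file name without ".traineddata".
        return Path(trained_model).stem
    languages = [language]
    # No additional language (None here) or a duplicate adds nothing.
    if additional and additional != language:
        languages.append(additional)
    return "+".join(languages)


print(resolve_languages("rus", "eng"))   # rus+eng
print(resolve_languages("rus", "rus"))   # rus
print(resolve_languages("rus", "eng", "models/invoice_font.traineddata"))  # invoice_font
```

Selecting the same language twice yields a single entry, which matches the "counted as 1 language" rule, and the language arguments are ignored as soon as a model file is supplied.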
None.
There is a document in PDF format, and the text from page 2 of the document needs to be retrieved.
Use the "Get text from PDF" action.
Result
The robot completed successfully. The text from page 2 of the document has been retrieved.
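For readers who want to reproduce the same logic outside the robot, the example can be approximated in Python. This is a minimal sketch under stated assumptions, not the action's internals: `get_text_from_page` and `fake_recognize` are hypothetical names, page numbers are assumed to be 1-based, and the recognizer is passed in as a function so that the sketch runs without an OCR engine. In a real script the pages might be images produced by `pdf2image.convert_from_path`, and the recognizer might be `pytesseract.image_to_string`.

```python
from typing import Callable, Sequence


def get_text_from_page(pages: Sequence[object],
                       page_number: int,
                       recognize: Callable[[object, str], str],
                       lang: str = "rus") -> str:
    """Recognize text on one page; return "" if the page does not exist."""
    # Assumption: page numbers are 1-based, as in "page 2 of the document".
    if not 1 <= page_number <= len(pages):
        # Mirrors the documented behavior: a missing page gives a blank value.
        return ""
    return recognize(pages[page_number - 1], lang)


def fake_recognize(page: object, lang: str) -> str:
    """Stand-in for a real OCR call, used only for this demonstration."""
    return f"text of {page}"


pages = ["page-1", "page-2"]
print(get_text_from_page(pages, 2, fake_recognize))        # text of page-2
print(repr(get_text_from_page(pages, 5, fake_recognize)))  # ''
```

Requesting page 5 of a two-page document returns an empty string rather than raising an error, which corresponds to the blank value described for the Result parameter.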