Action group: Text recognition
The action recognizes text in an image and returns the recognized text as the result.
Property | Description | Type | Example value | Required |
---|---|---|---|---|
Parameters | ||||
Image | Path to image file. Supported image formats: (jpeg, jpg, bmp, png, tif, tiff) | Robin.Image | C:\doc\img.png | Yes |
Expected languages of text in the image | Expected language of the text in the image. Selected from a dropdown list. Default value: Russian | Robin.String | | Yes |
Additional language | An additional language required for document recognition. Selected from a dropdown list. Default value: No. If the same option is selected in both the "Language" and "Additional language" parameters, no error occurs; the duplicate is counted as one language | Robin.Collection | | No |
Content format | Expected format of the text content. Available formats: Line, Block, Page | Robin.String | | Yes |
Trained model | Tesseract trained model file in .traineddata format. Allows you to load your own model trained on the required fonts. If this parameter is set, it takes priority over the "Language" and "Additional language" parameters | | | |
Options | Configuration options for OCR | Robin.String | | No |
Results | ||||
Result | Text (string) recognized from image | Robin.String |
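The rules in the table above can be sketched in Python as follows. This is a minimal illustration, not Robin's implementation: it assumes the action wraps the Tesseract command-line tool, the helper name `build_command` is hypothetical, and `rus`/`eng` are Tesseract language codes used here in place of the dropdown items.

```python
from pathlib import Path

# Image formats listed for the "Image" parameter.
SUPPORTED_FORMATS = {".jpeg", ".jpg", ".bmp", ".png", ".tif", ".tiff"}


def build_command(image, language="rus", additional=(), trained_model=None):
    """Build a Tesseract CLI argument list (hypothetical helper, not Robin's API)."""
    if Path(image).suffix.lower() not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported image format: {image}")

    command = ["tesseract", str(image), "stdout"]
    if trained_model:
        # A trained model takes priority over both language parameters.
        model = Path(trained_model)
        command += ["--tessdata-dir", str(model.parent), "-l", model.stem]
    else:
        # dict.fromkeys keeps order and drops duplicates, so a language
        # chosen in both parameters is counted once.
        languages = list(dict.fromkeys([language, *additional]))
        command += ["-l", "+".join(languages)]
    # --psm 3 is the documented default mode.
    return command + ["--psm", "3"]


print(build_command("img.png", "rus", ["rus", "eng"]))
# ['tesseract', 'img.png', 'stdout', '-l', 'rus+eng', '--psm', '3']
```

The function only assembles arguments and does not run Tesseract, so it can be checked without an OCR installation.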
The default mode in the "Options" field is --psm 3.
Parameters are separated by spaces, each written in the format --parameter value.
List of all parameters: https://muthu.co/all-tesseract-ocr-options/.
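As an illustration of the format described above, the sketch below joins name/value pairs into a single string. The helper `format_options` is hypothetical and only follows the `--parameter value` convention stated in this section; it does not check that a name is a valid OCR parameter.

```python
def format_options(options):
    """Join {name: value} pairs as '--name value', separated by spaces."""
    parts = []
    for name, value in options.items():
        text = str(value)
        # The format is space-delimited, so a value must not contain spaces.
        if not text or any(ch.isspace() for ch in text):
            raise ValueError(f"invalid value for {name}: {value!r}")
        parts.append(f"--{name} {text}")
    return " ".join(parts)


print(format_options({"psm": 6, "oem": 3}))
# --psm 6 --oem 3
```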
Parameter | Default value | Description |
---|---|---|
Main parameters | ||
oem | 3 | OCR engine mode |
psm | 3 | Page segmentation mode |
Additional parameters | ||
edges_min_nonhole | 14 | Minimum number of pixels in a box for it to be treated as a potential character |
textord_space_size_is_variable | 0 | If set to true (1), word delimiter spaces are assumed to be of variable width, even if the characters are of fixed pitch |
textord_tabfind_find_tables | 1 | Run table detection |
textord_force_make_prop_words | 0 | Apply proportional word segmentation to all text rows |
textord_width_limit | 8 | Maximum width of blocks for creating rows |
tessedit_pageseg_mode | 6 | Page segmentation mode |
textord_max_noise_size | 7 | Maximum noise size in pixels |
tessedit_dont_blkrej_good_wds | 0 | If set to true (1), the word segmentation quality score is used |
tessedit_char_blacklist | | Blacklist of characters that must not be recognized |
tessedit_char_whitelist | | Whitelist of characters to recognize |
tessedit_char_unblacklist | | List of characters that override tessedit_char_blacklist |
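To see which settings would be in effect for a given options string, the defaults from the table can be overlaid with the parsed string. The sketch below is an assumption-laden illustration: `parse_options` and `effective_settings` are hypothetical helpers, the `DEFAULTS` dictionary copies only a few rows from the table, and the parser accepts only the `--parameter value` format described in this section.

```python
# A few defaults copied from the table above (values kept as strings).
DEFAULTS = {
    "oem": "3",
    "psm": "3",
    "textord_tabfind_find_tables": "1",
    "textord_max_noise_size": "7",
}


def parse_options(text):
    """Parse '--name value --name value' into a {name: value} dict."""
    tokens = text.split()
    if len(tokens) % 2 != 0:
        raise ValueError("every parameter needs exactly one value")
    parsed = {}
    for name, value in zip(tokens[::2], tokens[1::2]):
        if not name.startswith("--") or len(name) == 2:
            raise ValueError(f"expected --parameter, got {name!r}")
        parsed[name[2:]] = value
    return parsed


def effective_settings(text):
    """Overlay user-supplied options on the documented defaults."""
    return {**DEFAULTS, **parse_options(text)}


settings = effective_settings("--psm 6 --tessedit_char_whitelist 0123456789")
print(settings["psm"], settings["oem"])
# 6 3
```

Parameters that are not listed in the options string keep their default values, which matches the behavior implied by the "Default value" column.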
Read the text in the image
1. Use the "Read text" action.
2. Click the "Start" button in the top panel.
Result
The software robot completed successfully. The text was read from the image.