Get text from PDF Version 11 (Python)
Action group: Text recognition
...
Description
...
Parameters
Input parameters:
PDF file path - Path to PDF file to be recognized
Language of the text - Languages that the recognizer expects in the text
Page number - The number of the page of the file from which the text will be read
Output parameters:
Result - Variable into which the recognized text will be saved
Settings
Property | Description | Type | Filling example | Mandatory field
---|---|---|---|---
Parameters | | | |
PDF file path | The path to the PDF file to be recognized | Robin.FilePath | | Yes
Language of the text | Expected languages of the text in the PDF file. Selected from a dropdown list. The default value is Russian | Robin.Collection | | No
Page number | The page number of the file from which the text will be read | Robin.Numeric | | No
Additional language | An additional language required for document recognition. Selected from a dropdown list. The default value is No. If the same option is selected in the "Language" and "Additional language" parameters, no error occurs; the duplicate is counted as one language | Robin.Collection | | No
Trained model | A Tesseract trained model file in .traineddata format. Allows you to load your own model trained on the required fonts. If this parameter is populated, it takes priority over the "Language" and "Additional language" parameters | | |
Results | | | |
Result | The text retrieved from the specified page of the PDF. If the document does not contain the specified page, a blank value is stored | Robin.Collection | |
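The interaction between the "Language", "Additional language" and "Trained model" parameters can be illustrated with a minimal Python sketch. It is not the action's actual implementation. The function name `resolve_recognition_languages`, the `LANGUAGE_CODES` mapping and the idea of passing the model's file name (without extension) to the recognizer are assumptions made for illustration. The only Tesseract behavior relied on is that several languages are joined with `+`.

```python
from pathlib import Path
from typing import Optional

# Hypothetical mapping from dropdown labels to Tesseract language codes.
LANGUAGE_CODES = {"Russian": "rus", "English": "eng"}


def resolve_recognition_languages(
    language: str = "Russian",
    additional_language: Optional[str] = None,
    trained_model: Optional[str] = None,
) -> str:
    """Return the language string a Tesseract-based recognizer would receive."""
    if trained_model:
        # A populated trained model takes priority over both language parameters.
        # Assumption: the model is addressed by its file name without extension.
        return Path(trained_model).stem

    codes = [LANGUAGE_CODES[language]]
    if additional_language is not None:
        code = LANGUAGE_CODES[additional_language]
        # The same option in both parameters is not an error:
        # the duplicate is counted as one language.
        if code not in codes:
            codes.append(code)

    # Tesseract accepts several languages joined with "+".
    return "+".join(codes)
```

With these assumptions, choosing Russian in both parameters yields a single language, while Russian plus English yields a combined language string.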
Special conditions of use
None.
Example of use
Task
There is a document in PDF format. The text from page 2 of the document needs to be retrieved.
Solution
Use the "Get text from PDF" action.
Implementation
- Move the "Get text from PDF" action to the workspace.
- Set the "Get text from PDF" action parameters.
- Click on the "Start" button in the top panel.
Result
The software robot completed successfully. The text from page 2 of the document has been retrieved.
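The page selection behavior used in this example, including the blank value stored when the page does not exist, can be sketched in Python as follows. This is an illustration under stated assumptions, not the action's code: the function `text_from_page` is hypothetical, pages are assumed to be numbered from 1, the recognized text of each page is assumed to be already available as a list of strings, and the result is modeled as a plain string.

```python
from typing import Sequence


def text_from_page(page_texts: Sequence[str], page_number: int) -> str:
    """Return the text of the requested page, or a blank value if it is absent.

    Assumption: page_number is 1-based, so page 2 is page_texts[1].
    """
    if 1 <= page_number <= len(page_texts):
        return page_texts[page_number - 1]
    # The document does not contain the specified page: store a blank value.
    return ""


# Hypothetical recognized text for a two-page document.
pages = ["Cover page", "Text of the second page"]
result = text_from_page(pages, 2)
missing = text_from_page(pages, 5)
```

Here `result` holds the text of page 2, and `missing` is an empty string because the document has no page 5.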