Find page in PDF Version 11 (Python)

Action group: Text recognition 


Description

This action searches a PDF file for pages that contain the specified text and returns their page numbers.
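The matching step can be pictured with a minimal Python sketch. It is not the action's actual implementation: it assumes the text of every page has already been recognized, that pages are numbered from 1, and that matching ignores case and line breaks. The names `find_page_numbers` and `_normalize` are hypothetical.

```python
from typing import List, Sequence


def _normalize(value: str) -> str:
    # Collapse runs of whitespace (including line breaks) and ignore case.
    # Assumption: the real action may compare text differently.
    return " ".join(value.split()).casefold()


def find_page_numbers(page_texts: Sequence[str], text: str) -> List[int]:
    """Return 1-based numbers of the pages whose text contains `text`."""
    needle = _normalize(text)
    if not needle:
        return []  # assumption: an empty search value matches nothing
    return [
        number
        for number, page_text in enumerate(page_texts, start=1)
        if needle in _normalize(page_text)
    ]


# Hypothetical recognized text of a three-page document.
pages = [
    "Introduction",
    "Purpose and\nconditions of use",
    "Appendix: purpose and conditions of use (summary)",
]
print(find_page_numbers(pages, "Purpose and conditions of use"))  # [2, 3]
```

The result is a list because the text may occur on several pages, which matches the collection type of the action's result.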

Action icon

Settings

Parameters

Property: PDF file path
Description: Path to the PDF file.
Type: Robin.FilePath
Mandatory field: Yes

Property: Text
Description: The text that the page being searched for must contain.
Type: Robin.String
Mandatory field: Yes

Property: Languages of text in the PDF file
Description: Expected languages of the text in the PDF file. A dropdown list of items:

  • Russian
  • English
  • Vietnamese
  • Arabic
  • Spanish
  • Portuguese
  • Persian
  • Turkish
  • Kazakh
  • Belarusian

Default value: Russian.
Type: Collection
Mandatory field: No

Property: Additional language
Description: An additional language required for document recognition. A dropdown list of items:

  • No
  • Russian
  • English
  • Vietnamese
  • Arabic
  • Spanish
  • Portuguese
  • Indonesian
  • Persian
  • Turkish
  • Kazakh
  • Belarusian

Default value: No.

If the same option is selected in the "Languages of text in the PDF file" and "Additional language" parameters, no error occurs; the duplicate is counted as one language.
Type: Robin.String
Mandatory field: No

Property: Trained model
Description: A file with a trained Tesseract model in the .tessdata format. Allows you to load your own model trained on the required fonts. If this parameter is filled in, it takes priority over the "Languages of text in the PDF file" and "Additional language" parameters.
Type: Robin.FilePath
Mandatory field: No

Results

Property: Page numbers
Description: Numbers of the pages where the search text was found.
Type: Robin.Collection
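
The two rules stated in the settings (a duplicated language is counted once, and a trained model takes priority over both language parameters) can be sketched in Python as follows. This is an illustration under assumptions, not the action's code: the function `resolve_ocr_source` and the returned tuple are hypothetical, and the mapping uses the standard Tesseract language codes joined with `+`, which is how Tesseract accepts several languages.

```python
from typing import Optional, Sequence, Tuple

# Tesseract language codes for the options offered in the dropdown lists.
LANGUAGE_CODES = {
    "Russian": "rus",
    "English": "eng",
    "Vietnamese": "vie",
    "Arabic": "ara",
    "Spanish": "spa",
    "Portuguese": "por",
    "Indonesian": "ind",
    "Persian": "fas",
    "Turkish": "tur",
    "Kazakh": "kaz",
    "Belarusian": "bel",
}


def resolve_ocr_source(
    languages: Sequence[str] = ("Russian",),
    additional: str = "No",
    trained_model: Optional[str] = None,
) -> Tuple[str, str]:
    """Decide what recognition should use: a custom model or a language set."""
    if trained_model:
        # A filled-in trained model overrides both language parameters.
        return ("model", trained_model)
    selected = list(languages)
    if additional != "No":
        selected.append(additional)
    # dict.fromkeys drops duplicates while keeping the original order.
    unique = list(dict.fromkeys(selected))
    return ("languages", "+".join(LANGUAGE_CODES[name] for name in unique))


print(resolve_ocr_source(["Russian"], "Russian"))                # ('languages', 'rus')
print(resolve_ocr_source(["Russian", "English"], "Indonesian"))  # ('languages', 'rus+eng+ind')
print(resolve_ocr_source(["Russian"], "English", "fonts.tessdata"))  # ('model', 'fonts.tessdata')
```

The file name `fonts.tessdata` is a made-up example of a trained model path.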

Special conditions of use

None.

Example of use

Task

Given a document in PDF format, find the pages that contain the text "Purpose and conditions of use".

Solution 

Use the "Find page in PDF" action.

Implementation

  1. Drag the "Find page in PDF" action onto the workspace.


  2. Set the parameters of the "Find page in PDF" action.

  3. Click on the "Start" button in the top panel. 

Result

The software robot completed successfully and found the pages of the document that contain this text.
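
For readers who want to reproduce the same task outside the studio, here is a rough Python sketch. It is an assumption-laden equivalent, not the action itself: it renders pages with the third-party `pdf2image` package and recognizes them with `pytesseract` (both must be installed, along with Poppler and Tesseract), and it uses a plain case-insensitive substring check. The function name `find_pages_in_pdf`, its injectable `render_pages` and `ocr_page` arguments, and the file name `manual.pdf` are hypothetical.

```python
from typing import Callable, Iterable, List, Optional


def find_pages_in_pdf(
    pdf_path: str,
    text: str,
    lang: str = "eng",
    render_pages: Optional[Callable[[str], Iterable[object]]] = None,
    ocr_page: Optional[Callable[[object, str], str]] = None,
) -> List[int]:
    """Return 1-based numbers of the pages whose recognized text contains `text`."""
    if render_pages is None:
        # Third-party: pdf2image (needs Poppler) renders each page to an image.
        from pdf2image import convert_from_path

        render_pages = convert_from_path
    if ocr_page is None:
        # Third-party: pytesseract (needs the Tesseract binary) recognizes text.
        import pytesseract

        def ocr_page(image: object, language: str) -> str:
            return pytesseract.image_to_string(image, lang=language)

    needle = text.casefold()
    found = []
    for number, image in enumerate(render_pages(pdf_path), start=1):
        if needle in ocr_page(image, lang).casefold():
            found.append(number)
    return found


# Example call for the task above (requires the packages mentioned earlier):
# find_pages_in_pdf("manual.pdf", "Purpose and conditions of use", lang="eng")
```

The two optional callables exist only so the rendering and recognition steps can be replaced, for example with stubs when checking the search logic without a real PDF file.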

