Find page in PDF Tesseract OCR

Find page in PDF Version 11 (Python)

Action group: Text recognition

Description

This action searches a PDF file for pages that contain the entered text and returns their page numbers.

Property	Description	Type	Mandatory field
Parameters
PDF file path	Path to PDF file	Robin.FilePath	Yes
Text	The text that the target page must contain	Robin.String	Yes
Languages of text in the PDF file	Expected languages of text in the PDF file. A dropdown list of items: Russian, English, Vietnamese, Arabic, Spanish, Portuguese, Persian, Turkish, Kazakh, Belarusian. Default value: Russian.	Collection	No
Additional language	An additional language required for document recognition. A dropdown list of items: No, Russian, English, Vietnamese, Arabic, Spanish, Portuguese, Indonesian, Persian, Turkish, Kazakh, Belarusian. Default value: No. Selecting the same option in both the "Languages of text in the PDF file" and "Additional language" parameters does not cause an error; the duplicate is counted as one language.	Robin.String	No
Trained model	A file with a trained Tesseract model in .tessdata format. Lets you load your own model trained on the required fonts. If this parameter is filled in, it takes priority over the “Languages of text in the PDF file” and “Additional language” parameters.	Robin.FilePath	No
Results
Page numbers	Page numbers where search text was found	Robin.Collection
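
The way the two language parameters combine can be illustrated with a small Python sketch. It is not the action's actual implementation: the function name `build_lang_string` and the mapping `TESSERACT_CODES` are hypothetical, and the sketch assumes that the selected languages are passed to Tesseract as a single `+`-joined language value, which is how Tesseract accepts several languages at once. The three-letter codes are the standard Tesseract language codes for the dropdown items.

```python
# Standard Tesseract language codes for the dropdown items in the table.
TESSERACT_CODES = {
    "Russian": "rus",
    "English": "eng",
    "Vietnamese": "vie",
    "Arabic": "ara",
    "Spanish": "spa",
    "Portuguese": "por",
    "Indonesian": "ind",
    "Persian": "fas",
    "Turkish": "tur",
    "Kazakh": "kaz",
    "Belarusian": "bel",
}


def build_lang_string(languages, additional="No"):
    """Join the selected languages into a Tesseract value such as 'rus+eng'.

    Hypothetical helper: a duplicate selection is counted once, as the
    parameter description states.
    """
    # Fall back to the documented default when nothing is selected.
    selected = list(languages) or ["Russian"]
    if additional != "No":
        selected.append(additional)

    codes = []
    for name in selected:
        code = TESSERACT_CODES[name]
        if code not in codes:  # skip duplicates, keep selection order
            codes.append(code)
    return "+".join(codes)
```

For example, selecting Russian and English as the main languages and English again as the additional language yields the same value as selecting Russian and English alone.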

None.

There is a PDF document, and you need to find the pages that contain the text "Purpose and conditions of use".

Use the "Find page in PDF" action.

The software robot completed successfully. The pages containing this text were found in the document.
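
For readers who want to see what such a search involves, the following Python sketch reproduces the idea outside the platform. It is an illustration under stated assumptions, not the action's source code: the function names `normalize`, `find_pages`, and `ocr_pdf_pages` are hypothetical, the matching rules (case-insensitive, whitespace collapsed) are an assumption, and the OCR step relies on the third-party `pdf2image` and `pytesseract` packages with Tesseract installed locally.

```python
import re


def normalize(text):
    """Collapse whitespace and ignore case so OCR line breaks do not block a match."""
    return re.sub(r"\s+", " ", text).strip().casefold()


def find_pages(page_texts, query):
    """Return 1-based numbers of the pages whose text contains the query."""
    needle = normalize(query)
    if not needle:
        return []
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if needle in normalize(text)
    ]


def ocr_pdf_pages(pdf_path, lang="rus"):
    """Recognize the text of every page of a PDF (assumed OCR pipeline)."""
    # Third-party imports are kept local so that the matching logic above
    # can be used without them.
    from pdf2image import convert_from_path
    import pytesseract

    images = convert_from_path(pdf_path)  # one image per page
    return [pytesseract.image_to_string(image, lang=lang) for image in images]
```

With the recognized page texts in hand, `find_pages(ocr_pdf_pages("document.pdf", lang="eng"), "Purpose and conditions of use")` would return the collection of matching page numbers; the file name here is a placeholder.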