Get text from PDF Tesseract OCR

Get text from PDF Version 9 (python)

Action group: Text recognition

Description

This action recognizes the text on a specified page of a PDF document and saves the recognized text to a variable.

PDF file path - Path to the PDF file to be recognized

Language of the text - Languages that the recognizer expects in the text

Page number - The number of the page from which the text will be read

Result - Variable into which the recognized text will be saved

Property	Description	Type	Mandatory field
Parameters
PDF file path	The path to the PDF file for recognition.	Robin.FilePath	Yes
Language of the text	Expected languages of the text in the PDF file	Robin.String	Yes
Page number	The page number of the file from which the text will be read.	Robin.Numeric	Yes
Results
Result	The text recognized on the specified page of the PDF. If the document does not contain the specified page, an empty value is stored.	Robin.String
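
For readers who want to see how the parameters above fit together, here is a minimal Python sketch of a comparable operation. It is not the action's actual implementation. It assumes the third-party packages pdf2image (which needs Poppler) and pytesseract (which needs a Tesseract installation); the function names get_text_from_pdf and build_lang_arg are hypothetical, and so is passing the languages as a list of Tesseract codes such as "eng" or "rus". The empty-string result for a missing page mirrors the Result row of the table; treating page numbers below 1 the same way is an assumption.

```python
from typing import Callable, Optional, Sequence


def build_lang_arg(languages: Sequence[str]) -> str:
    """Join language codes the way Tesseract expects them, e.g. 'eng+rus'."""
    codes = [code.strip() for code in languages if code.strip()]
    if not codes:
        raise ValueError("at least one language code is required")
    return "+".join(codes)


def get_text_from_pdf(
    pdf_path: str,
    languages: Sequence[str],
    page_number: int,
    render_page: Optional[Callable] = None,  # hypothetical hook: (path, number) -> list of images
    recognize: Optional[Callable] = None,    # hypothetical hook: (image, lang=...) -> str
) -> str:
    """Return the text recognized on one page, or '' if the page does not exist."""
    # Assumption: page numbers are 1-based, and anything below 1 counts as missing.
    if page_number < 1:
        return ""

    if render_page is None:
        # Imported lazily so the helper above works without pdf2image installed.
        from pdf2image import convert_from_path

        def render_page(path, number):
            # Rasterize only the requested page; an out-of-range page gives [].
            return convert_from_path(path, first_page=number, last_page=number)

    if recognize is None:
        import pytesseract

        recognize = pytesseract.image_to_string

    images = render_page(pdf_path, page_number)
    if not images:
        return ""
    return recognize(images[0], lang=build_lang_arg(languages))
```

With both packages installed, a call such as get_text_from_pdf("document.pdf", ["eng"], 2) would read page 2 of a hypothetical document.pdf. The two optional hooks exist only so the page-selection logic can be exercised without an OCR engine.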

None.

There is a document in PDF format, and the text from page 2 of the document needs to be retrieved.

Use the "Get text from PDF" action.

Result

The software robot completed successfully. The text from page 2 of the document has been retrieved.