Optical Character Recognition (OCR), or text recognition, allows for the translation of scanned PDF documents into searchable data.
OCR cannot be run on PDFs that have been certified or digitally signed.
Note: OCR is only available in Revu Complete.
- Go to Document > OCR or press CTRL+SHIFT+O. The OCR dialog box appears.
Alternatively, go to Batch > OCR.
- The OCR function will also be invoked when the Create PDF from Scanner or Camera function in Revu is used, opening the OCR dialog box automatically.
The active PDF, if any, is automatically added to the process. To add more PDFs, click Add and use one or more of the following methods:
- Files: Adds individual files from a network or local drive. Selecting this option will cause the Open dialog box to appear. Navigate to the appropriate location and select the desired files.
- Open Files: Adds all files currently open in Revu.
- Open Set: Adds all files contained in the current Set.
- Folder: Adds all files in a selected folder on a network or local drive, but not files contained in any of its subfolders. Selecting this option will cause the Select Folder dialog box to appear. Navigate to the desired folder and select it.
- Folder and Subfolders: Adds all files in a selected folder on a network or local drive as well as all files within any of its subfolders. Selecting this option will cause the Select Folder dialog box to appear. Navigate to the desired folder and select it.
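The difference between the Folder and Folder and Subfolders options can be sketched in a few lines of Python. This is only an illustration of the two selection behaviors, not how Revu gathers files internally; the function name `collect_pdfs` and its parameters are hypothetical.

```python
from pathlib import Path


def collect_pdfs(folder, include_subfolders=False):
    """Return the PDF files in `folder`, optionally including its subfolders."""
    root = Path(folder)
    # glob() inspects only the top level of the folder;
    # rglob() also descends into every subfolder.
    candidates = root.rglob("*") if include_subfolders else root.glob("*")
    # Keep regular files with a .pdf extension (case-insensitive).
    return sorted(p for p in candidates if p.is_file() and p.suffix.lower() == ".pdf")
```

Calling `collect_pdfs(path)` mirrors the Folder option, while `collect_pdfs(path, include_subfolders=True)` mirrors Folder and Subfolders.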
To run the process on specific pages only for one or more of the PDFs, select the desired PDF and choose one of the following from its Pages dropdown:
- All Pages: Sets the range to all pages.
- Current: Sets the range to the current page only. The current page number will appear in parentheses, for example, Current (2) if page 2 is the current page.
- Selected: Sets the range to the current selection. This option only appears if pages were selected prior to invoking the command.
- Custom: Sets the range to a custom value.
When this option is selected, the field acts like a text box. Delete any text left in the field and enter the page or pages to be processed directly. To enter a custom range:
- Use a dash between page numbers to define those two pages and all pages in between.
- Use a comma to define pages that are separated.
For example: 1-3, 5, 9 will include pages 1, 2, 3, 5 and 9.
- Even Pages: Limits the process to only even pages.
- Odd Pages: Limits the process to only odd pages.
- Landscape Pages: Limits the process to only landscape-oriented pages.
- Portrait Pages: Limits the process to only portrait-oriented pages.
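To make the Custom range syntax concrete (a dash for an inclusive run of pages, a comma between separate pages), here is a minimal Python sketch that expands such a string into page numbers. It is an illustration under assumptions, not Revu's own parser: the function name `parse_page_range` and the `page_count` argument are hypothetical, and rejecting out-of-range pages is a choice made for this example.

```python
def parse_page_range(spec, page_count):
    """Expand a range string such as "1-3, 5, 9" into a sorted list of page numbers."""
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            # A dash selects both endpoints and every page in between.
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            # A lone number selects a single page.
            start = end = int(part)
        if start < 1 or end > page_count or start > end:
            raise ValueError(f"invalid page range: {part!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)


print(parse_page_range("1-3, 5, 9", page_count=10))  # [1, 2, 3, 5, 9]
```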
Set the OCR configuration Options, as desired:
- Language: Select the languages used by the OCR process. Multiple languages can be used on the same PDF.
- Document Type: Use to optimize the OCR process for the selected document type. The CAD Drawing setting tends to ignore text formatting, for example, while the Text Document setting does not.
- Optimize For: Choose whether to optimize the OCR process for Accuracy or Speed.
- Correct Skew: Enable to correct angular deviations in scanned documents.
- Detect Orientation: Enable to detect the page orientation (90, 180 and 270 degrees) of each page and correct it if needed.
- Detect Vertical Text: Enable to detect text that is oriented vertically.
- Detect Text in Pictures and Drawings: Enable to detect text in graphics.
- Skip Vector Pages: Enable to skip processing of pages with vector content.
- Page Chunk Size: Use to determine the maximum number of pages sent to the OCR engine at one time. Increasing chunk size can increase speed, but will also consume more of the computer's resources.
Note: Enabling Page Chunk Size and setting it to 1 is recommended for OCR jobs performed on PDFs that have a large number of pages, are of substantial file size or contain large format drawings. If OCR is run on a PDF with no results, running it again with a Page Chunk Size of 1 can correct the problem.
- Max Vector Size: Use to set the maximum vector size that will be analyzed during the OCR process; any vectors larger than this setting will be discarded in pre-processing. Decreasing this value can increase speed, but might also cause larger text (for example, larger fonts) to be inadvertently ignored.
- Click OK to run OCR.
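The effect of Page Chunk Size can be pictured as splitting the selected pages into batches before they are handed to the OCR engine. The Python sketch below only illustrates that batching idea; the function name `page_chunks` is hypothetical and nothing here reflects how Revu is actually implemented.

```python
def page_chunks(pages, chunk_size):
    """Yield consecutive batches of at most `chunk_size` pages."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, len(pages), chunk_size):
        yield pages[start:start + chunk_size]


pages = list(range(1, 8))  # a 7-page document
print(list(page_chunks(pages, 3)))  # [[1, 2, 3], [4, 5, 6], [7]]
print(list(page_chunks(pages, 1)))  # one page per batch
```

With a chunk size of 1, each batch holds a single page, which corresponds to the setting recommended above for very large PDFs: less is held in memory at any one time, at the cost of more batches.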