OCR
Optical Character Recognition (OCR), or text recognition, converts scanned PDF documents into searchable text.
OCR cannot be run on PDFs that have been certified or digitally signed.
Note: OCR is only available in Bluebeam Revu eXtreme. The OCR feature, menu and toolbar items will not appear in Bluebeam Revu Standard or Bluebeam Revu CAD.
Running OCR on a single document
- Open the document on which OCR is to be run.
- Go to Document > OCR or press CTRL+SHIFT+O. The OCR dialog box appears.
- The languages that will be used by the OCR process are shown under Recognition Languages. The American English library is loaded by default. To add other libraries, click Add. To remove a library, select it and click Remove. Multiple libraries can be used on the same document.
- Set the OCR Configuration options, as desired:
- Correct Skew: Enable to correct angular deviations in scanned documents.
- Detect Orientation: Enable to detect the page orientation (90, 180 and 270 degrees) of each page and correct it if needed.
- Detect Text in Pictures and Drawings: Enable to detect text in graphics.
- Rotate Markups: If Correct Skew is enabled, use this option to also adjust existing markups so they line up with skew-corrected text or images.
- Skip Vector Pages: Enable to skip processing of pages with vector content.
- Page Chunk Size: Use to determine the maximum number of pages sent to the OCR engine at one time. Increasing chunk size can increase speed, but will also consume more of the computer's resources.
Note: Enabling Page Chunk Size and setting it to 1 is recommended for OCR jobs performed on PDFs that have a large number of pages, are of substantial file size or contain large format drawings. If OCR runs on a PDF but produces no results, running it again with a Page Chunk Size of 1 can correct the problem.
- Max Vector Size: Use to set the maximum vector size that will be analyzed during the OCR process; any vectors larger than this setting will be discarded in pre-processing. Decreasing this value can increase speed, but might also cause larger text (for example, larger fonts) to be inadvertently ignored.
- Optimize for: Use to optimize the OCR process for the selected document type. The CAD Drawing setting tends to ignore text formatting, for example, while the Text Document setting does not.
- To select a Page Range, click the Pages menu and select from the following:
- All Pages: Sets the range to all pages.
- Current: Sets the range to the current page only. The current page number will appear in parentheses, for example, Current (2) if page 2 is the current page.
- Selected: Sets the range to the current selection. This option only appears if pages were selected prior to invoking the command.
- Custom: Sets the range to a custom value. When this option is selected the list becomes a text box. To enter a custom range:
- Use a dash between page numbers to define those two pages and all pages in between.
- Use a comma to define pages that are separated.
For example: 1-3, 5, 9 will include pages 1, 2, 3, 5 and 9.
- Click OK to run OCR.
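The Custom page range syntax and the Page Chunk Size option described above can be illustrated with a short sketch. This is not Revu's implementation; the function names `parse_page_range` and `chunk_pages` are hypothetical, and the sketch only assumes that dashes define inclusive ranges, commas separate entries, and chunks are formed from consecutive selected pages.

```python
from typing import List


def parse_page_range(spec: str, page_count: int) -> List[int]:
    """Expand a range string such as "1-3, 5, 9" into sorted page numbers.

    Hypothetical helper: a dash joins two pages and everything between
    them; a comma separates independent entries.
    """
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        if start < 1 or end > page_count or start > end:
            raise ValueError(f"invalid page range: {part!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)


def chunk_pages(pages: List[int], chunk_size: int) -> List[List[int]]:
    """Split the selected pages into groups of at most chunk_size pages.

    Assumption for illustration only: each group stands for one batch
    handed to an OCR engine at a time.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [pages[i:i + chunk_size] for i in range(0, len(pages), chunk_size)]


selected = parse_page_range("1-3, 5, 9", page_count=10)
batches = chunk_pages(selected, chunk_size=2)
```

With a chunk size of 1, every page becomes its own batch, which mirrors the trade-off noted above: less memory per batch at the cost of more batches.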
Running OCR on multiple documents
- Go to File > Batch > OCR. The Batch: OCR dialog box appears.
- Add documents using one (or both) of the following methods:
- To add all PDFs that are currently open in Revu, click Add Open Files.
- To select files from a local or network drive, click Add.
- To select a Page Range, click the Pages menu and select from the following:
- All Pages: Sets the range to all pages.
- Custom: Sets the range to a custom value. When this option is selected the list becomes a text box. To enter a custom range:
- Use a dash between page numbers to define those two pages and all pages in between.
- Use a comma to define pages that are separated.
For example: 1-3, 5, 9 will include pages 1, 2, 3, 5 and 9.
- From the Apply To menus, select among Even Pages Only, Odd Pages Only or Odd and Even Pages and among Landscape Pages, Portrait Pages or Landscape and Portrait Pages. These selections work in conjunction to form the filter, so any pages to be processed must meet both criteria selected.
- Select the next PDF in the File List and repeat the Page Range and Apply To steps until Page Range and Page Filter options have been set for each PDF.
- Click OK. The OCR dialog box appears.
- The languages that will be used by the OCR process are shown under Recognition Languages. The American English library is loaded by default. To add other libraries, click Add. To remove a library, select it and click Remove. Multiple libraries can be used on the same document.
- Set the OCR Configuration options, as desired:
- Correct Skew: Enable to correct angular deviations in scanned documents.
- Detect Orientation: Enable to detect the page orientation (90, 180 and 270 degrees) of each page and correct it if needed.
- Detect Text in Pictures and Drawings: Enable to detect text in graphics.
- Rotate Markups: If Correct Skew is enabled, use this option to also adjust existing markups so they line up with skew-corrected text or images.
- Skip Vector Pages: Enable to skip processing of pages with vector content.
- Page Chunk Size: Use to determine the maximum number of pages sent to the OCR engine at one time. Increasing chunk size can increase speed, but will also consume more of the computer's resources.
Note: Enabling Page Chunk Size and setting it to 1 is recommended for OCR jobs performed on PDFs that have a large number of pages, are of substantial file size or contain large format drawings. If OCR runs on a PDF but produces no results, running it again with a Page Chunk Size of 1 can correct the problem.
- Max Vector Size: Use to set the maximum vector size that will be analyzed during the OCR process; any vectors larger than this setting will be discarded in pre-processing. Decreasing this value can increase speed, but might also cause larger text (for example, larger fonts) to be inadvertently ignored.
- Optimize for: Use to optimize the OCR process for the selected document type. The CAD Drawing setting tends to ignore text formatting, for example, while the Text Document setting does not.
- Click OK to run OCR.
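The way the two Apply To menus combine can be sketched as follows. This is an illustration only, not Revu's code: the `Page` class, the `select_pages` function and the string values for the filters are hypothetical, and treating a square page as portrait is an assumption made for the example. The point it shows is that a page is processed only when it satisfies both the even/odd choice and the orientation choice.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Page:
    number: int    # 1-based page number
    width: float   # page width in any consistent unit
    height: float  # page height in the same unit


def matches_filter(page: Page, parity: str = "both",
                   orientation: str = "both") -> bool:
    """Return True only if the page meets BOTH selected criteria.

    parity: "even", "odd" or "both"
    orientation: "landscape", "portrait" or "both"
    """
    if parity not in ("even", "odd", "both"):
        raise ValueError(f"unknown parity filter: {parity!r}")
    if orientation not in ("landscape", "portrait", "both"):
        raise ValueError(f"unknown orientation filter: {orientation!r}")

    is_even = page.number % 2 == 0
    # Assumption: a square page counts as portrait.
    is_landscape = page.width > page.height

    parity_ok = parity == "both" or (parity == "even") == is_even
    orientation_ok = (orientation == "both"
                      or (orientation == "landscape") == is_landscape)
    return parity_ok and orientation_ok


def select_pages(pages: List[Page], parity: str = "both",
                 orientation: str = "both") -> List[int]:
    """List the page numbers that pass the combined filter."""
    return [p.number for p in pages if matches_filter(p, parity, orientation)]


pages = [
    Page(1, 11.0, 8.5),   # odd, landscape
    Page(2, 8.5, 11.0),   # even, portrait
    Page(3, 8.5, 11.0),   # odd, portrait
    Page(4, 17.0, 11.0),  # even, landscape
]
even_landscape = select_pages(pages, parity="even", orientation="landscape")
```

In this sample, only page 4 is both even and landscape, so it is the only page selected by that combination.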
Related topics
Creating a PDF from Scanner or Camera