Chapter 1. Introduction

PDF OCR X is a simple utility for running optical character recognition (OCR) on your image and PDF files to extract text or make them searchable. It is currently available for Mac OS X and Windows and supports over 30 languages.

1.1. Features

  • Drag-and-drop: You can drag and drop image and PDF files onto the PDF OCR X application icon to initiate an OCR conversion.

  • Plain text output: An option to extract all text from the input images/PDFs and save it in a plain text file that can be opened and edited in any text editor, including TextEdit, MS Word, Notepad, and others.

  • Searchable PDF output: An option to convert scanned PDFs into searchable PDFs.

  • Support for over 30 languages. Install additional language packs with the click of a button.

  • Supports many common image formats (GIF, TIF, JPEG, PNG, PICT) as well as PDF files as inputs.

  • "No Prompt" mode (Enterprise Edition only): When "no prompt" mode is enabled, PDF OCR X uses predetermined settings for the file conversion. This allows you to use Automator or AppleScript to automate the OCR conversion without being interrupted by pop-up dialogs and prompts.

  • Auto-open: With "auto-open" selected, converted files are automatically opened in the appropriate application after the conversion is complete. On Mac, searchable PDFs open in Preview and text files open in TextEdit. On Windows, text files will likely open in Notepad and PDF files in Adobe Reader.

  • Batch conversion (Enterprise Edition only): Convert multiple files as a batch.

1.1.1. Community Edition vs Enterprise Edition

There are two versions of PDF OCR X:

  1. Community Edition: Supports only single-page PDFs and image files as inputs.

  2. Enterprise Edition: All of the same features as the Community Edition, plus support for multi-page PDF files, batch conversion, and no-prompt mode (which allows you to automate conversion using tools like Automator and AppleScript).
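
As a rough illustration of scripted use, the Python sketch below filters a list of files down to the input formats listed under Features and builds the macOS "open -a" command, which hands a file to an application in much the same way as dropping it on the application icon. The function names, the set of file extensions, and the application name "PDF OCR X" as passed to "open -a" are assumptions made for this example; they are not taken from the product itself. Running without prompts also assumes that no-prompt mode (Enterprise Edition) is enabled.

```python
import subprocess
from pathlib import Path

# Extensions for the input formats named in the feature list
# (GIF, TIF, JPEG, PNG, PICT, PDF). The exact set of extensions the
# application accepts is an assumption made for this sketch.
SUPPORTED_EXTENSIONS = {
    ".gif", ".tif", ".tiff", ".jpg", ".jpeg", ".png", ".pict", ".pct", ".pdf",
}


def select_inputs(paths):
    """Return the paths whose extension matches a supported input format."""
    return [Path(p) for p in paths if Path(p).suffix.lower() in SUPPORTED_EXTENSIONS]


def build_open_command(path, app_name="PDF OCR X"):
    """Build the macOS `open -a` command that hands one file to the application.

    The application name is an assumption; adjust it to match the installed app.
    """
    return ["open", "-a", app_name, str(path)]


def hand_off(paths):
    """Send each supported file to the application, one at a time (macOS only)."""
    for item in select_inputs(paths):
        subprocess.run(build_open_command(item), check=True)
```

Calling hand_off(["scan.pdf", "notes.txt"]) would pass only scan.pdf to the application, because .txt is not one of the listed input formats. Note that the Community Edition accepts only single-page PDFs and image files.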


