PDF OCR is a simple drag-and-drop utility that converts PDFs and images into text documents. It uses OCR (optical character recognition) to extract the text of a PDF or image, which is particularly useful for PDFs and images created with the scan-to-PDF function of a scanner or photocopier. OCR is performed by the Tesseract engine, and over 20 languages are currently supported.
| Tags | OCR, pdf to text, optical character recognition |
|---|---|
| Licenses | Commercial |
| Operating Systems | Mac OS X |
| Implementation | Java |
| Translations | English |
Recent releases


Release Notes: This release improves support for PDFs with special characters, PDFs that have been cropped, and PDFs with a 270-degree rotation.


Release Notes: This release fixes an issue with searchable PDFs with special characters not opening correctly in Adobe Reader. Language packs can now be installed by double-clicking the language-pack file.


Release Notes: This release improves handling of PDFs that have a built-in rotation and fixes minor bugs related to OS X Lion compatibility.


Release Notes: This release adds options for overwriting original PDFs with the searchable result, specifying a custom extension for searchable PDF output, and disabling the auto-opening of converted PDF files when the conversion is complete.


Release Notes: This release improves searchable PDF output so that it can handle text inside brackets.