How to Use iText pdfOCR to Recognize Text in Scanned Documents

André V. Lemos

9 Jul 2020 | CPOL | 8 min read

A tutorial for generating searchable, archivable PDFs for your workflow with iText pdfOCR

iText pdfOCR is a new open-source add-on for iText 7, the open-source PDF library for Java and .NET. It recognizes text in scanned documents, PDFs, and images, so you can extract text locked away in those files for processing and repurposing, or produce PDF/A-3u documents for long-term archiving.

This article is in the Product Showcase section for our sponsors at CodeProject. These articles are intended to provide you with information on products and services that we consider useful and of value to developers.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL).

Written By

André V. Lemos

Product Manager, iText Software

Portugal

André Lemos is the Global Lead of Product and Services at iText, a powerful PDF Toolkit for PDF generation, PDF programming, handling & manipulation.

He has worked in product management for 9 years in areas including health, physiotherapy, and biosignals research, and has a strong development background.
