OcrPy Documentation#

The Core objective of OcrPy is to let users OCR, Archive, Index and Search any documents with ease, with a simple and intuitive interface and a powerful Pipeline API.

ocrpy achieves this by wrapping around various OCR engines like Tesseract OCR, Aws Textract, Google Cloud Vision and Azure Computer Vision. It unifies the multitude of interfaces provided by a wide range of cloud tools & other open-source libraries and provides a simple, easy-to-use interface for the user.

Getting Started#

ocrpy is a Python-only package hosted on PyPI. The recommended installation method is pip

$ python -m pip install ocrpy

Day-to-Day Usage#

Ocrpy Provides various levels of abstraction for the user to perform OCR on various types of documents. The recommended and the best way to use Ocrpy is to use it through it’s pipelines API as shown below.

The Pipeline API can be invoked in two ways. The first method is to define the config for running the pipeline as a yaml file and and then run the pipeline by loading it as follows:

from ocrpy import TextOcrPipeline

ocr_pipeline = TextOcrPipeline.from_config("ocrpy_config.yaml")

alternatively you can also run a pipeline by directly instantiating the pipeline class as follows:

from ocrpy import TextOcrPipeline

pipeline = TextOcrPipeline(source_dir='s3://document_bucket/',
                           credentials_config={"AWS": "path/to/aws-credentials.env/file",
                                        "GCP": "path/to/gcp-credentials.json/file"})


For a more detailed explanation of the pipelines API, please refer to the ocrpy.pipelines section of the documentation & please check out the examples and tutorials for a more in-depth understanding of how to leverage ocrpy.

Full Table of Contents#


For a full index of all the functionalities, see the Index