Powered by TensorFlow 2 and PyTorch, docTR makes OCR seamless and accessible to anyone
Get the pre-trained model
End-to-end OCR is implemented in docTR using a two-stage approach: text detection (locating words), followed by text recognition (identifying the characters in each word). You can therefore choose one architecture for text detection and one for text recognition from the list of available implementations.
from doctr.models import ocr_predictor

model = ocr_predictor(det_arch='db_resnet50', reco_arch='crnn_vgg16_bn', pretrained=True)
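To make the two-stage idea concrete, here is a toy sketch in plain Python. It is not how docTR is implemented: `detect_words` and `recognize_word` are hypothetical stand-ins for the detection and recognition models, and the "page" is a list of text lines rather than an image.

```python
from typing import List, Tuple

# Toy geometry: (row, start column, end column). Real detectors return image boxes.
Box = Tuple[int, int, int]


def detect_words(page: List[str]) -> List[Box]:
    """Stage 1 (hypothetical stand-in): locate every word on the page."""
    boxes: List[Box] = []
    for row, line in enumerate(page):
        start = None
        # The trailing space closes a word that runs to the end of the line.
        for col, ch in enumerate(line + " "):
            if ch != " " and start is None:
                start = col
            elif ch == " " and start is not None:
                boxes.append((row, start, col))
                start = None
    return boxes


def recognize_word(page: List[str], box: Box) -> str:
    """Stage 2 (hypothetical stand-in): read the characters inside one box."""
    row, start, end = box
    return page[row][start:end]


def toy_ocr(page: List[str]) -> List[Tuple[Box, str]]:
    """Run detection first, then recognition on each detected box."""
    return [(box, recognize_word(page, box)) for box in detect_words(page)]


page = ["Hello  world", " docTR"]
words = toy_ocr(page)
```

The point of the split is that either stage can be swapped independently, which is what the `det_arch` and `reco_arch` arguments express.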
Read a file
Documents can be read from PDFs, images, or webpages:
from doctr.io import DocumentFile

# PDF
pdf_doc = DocumentFile.from_pdf("path/to/your/doc.pdf").as_images()
# Image
single_img_doc = DocumentFile.from_images("path/to/your/img.jpg")
# Webpage
webpage_doc = DocumentFile.from_url("https://www.yoursite.com").as_images()
# Multiple page images
multi_img_doc = DocumentFile.from_images(["path/to/page1.jpg", "path/to/page2.jpg"])
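If the kind of input is only known at runtime, the choice of loader can be wrapped in a small helper. The sketch below is an illustration, not part of docTR: `loader_name`, `load_document`, and the `IMAGE_SUFFIXES` set are hypothetical names, and the list of image extensions is an assumption. Only the three `DocumentFile` methods shown above are used.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to the formats you actually handle.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def loader_name(source: str) -> str:
    """Return the name of the DocumentFile method suited to this source."""
    if source.startswith(("http://", "https://")):
        return "from_url"
    suffix = Path(source).suffix.lower()
    if suffix == ".pdf":
        return "from_pdf"
    if suffix in IMAGE_SUFFIXES:
        return "from_images"
    raise ValueError(f"Unsupported source: {source!r}")


def load_document(source: str):
    """Load a document with docTR, following the usage shown in this article."""
    # Imported lazily so that loader_name can be used without docTR installed.
    from doctr.io import DocumentFile

    name = loader_name(source)
    doc = getattr(DocumentFile, name)(source)
    # In the article's snippets, PDF and webpage sources go through .as_images().
    return doc if name == "from_images" else doc.as_images()
```

A call such as `load_document("path/to/your/doc.pdf")` would then return page images ready to pass to the predictor.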
Take the default pre-trained model as an example:
from doctr.io import DocumentFile
from doctr.models import ocr_predictor

model = ocr_predictor(pretrained=True)
# PDF
doc = DocumentFile.from_pdf("path/to/your/doc.pdf").as_images()
# Analyze
result = model(doc)
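Once the analysis has run, the recognized words still need to be pulled out of the result. The sketch below assumes that `result.export()` returns a nested dictionary of pages, blocks, lines, and words, where each word carries a `value` and a `confidence`. That layout is an assumption to check against your installed version, and `iter_words` and the `sample` data are hypothetical, written so the example runs without docTR.

```python
from typing import Dict, Iterator, Tuple


def iter_words(exported: Dict) -> Iterator[Tuple[int, str, float]]:
    """Yield (page index, text, confidence) for every word in an exported result."""
    for page_idx, page in enumerate(exported.get("pages", [])):
        for block in page.get("blocks", []):
            for line in block.get("lines", []):
                for word in line.get("words", []):
                    yield page_idx, word["value"], word["confidence"]


# Hand-written sample mimicking the assumed export layout (not real OCR output).
sample = {
    "pages": [
        {
            "blocks": [
                {
                    "lines": [
                        {
                            "words": [
                                {"value": "Hello", "confidence": 0.98},
                                {"value": "w0rld", "confidence": 0.31},
                                {"value": "world", "confidence": 0.87},
                            ]
                        }
                    ]
                }
            ]
        }
    ]
}

# Keep only words the recognizer was reasonably sure about.
confident = [value for _, value, conf in iter_words(sample) if conf >= 0.5]
```

With docTR installed, the intended call would be `iter_words(result.export())`, provided the export layout matches the assumption above.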
Install
Installing docTR requires Python 3.6 (or higher) and pip.
Due to the use of weasyprint, additional dependencies will be required if not running on a Linux system.
On macOS, they can be installed with Homebrew as follows:
brew install cairo pango gdk-pixbuf libffi
For Windows users, these dependencies are included with GTK.
The latest version
The latest release of the package can be installed from PyPI using pip.