Best Method to OCR PDFs in Python with spaCy Layout
Learn the top way to OCR PDFs in Python using spaCy Layout for accurate text extraction.

Python Tutorials for Digital Humanities
14.2K views • Jan 14, 2025

About this video
In this video, I'm going to show you the best way to OCR a PDF in Python with the new spaCy Layout package. The best part about this package is that it gives you access to all the important metadata generated from a spaCy pipeline alongside layout detection and OCR. This means you will have bounding boxes for the labeled regions of text on a given image. You can also do table detection.
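As a rough illustration of the workflow described above, here is a minimal sketch of running spaCy Layout on a PDF and collecting the bounding box of each labeled region. It is not the code from the video. The attribute names (`doc.spans["layout"]`, `span._.layout` with `x`, `y`, `width`, `height`, `page_no`, and `doc._.tables` with `table._.data`) follow the spaCy Layout README as I understand it, so treat them as assumptions and check them against the version you install. The helper names `regions_from_doc` and `demo`, and any PDF path you pass in, are hypothetical.

```python
def regions_from_doc(doc):
    """Collect label, text, page, and bounding box for each layout region.

    Assumes `doc` was produced by spaCy Layout, so that doc.spans["layout"]
    holds one span per detected region and span._.layout carries its box.
    """
    records = []
    for span in doc.spans["layout"]:
        box = span._.layout  # assumed fields: x, y, width, height, page_no
        records.append(
            {
                "label": span.label_,
                "text": span.text,
                "page": box.page_no,
                "bbox": (box.x, box.y, box.width, box.height),
            }
        )
    return records


def demo(pdf_path):
    """Process one PDF and print its regions and tables (hypothetical helper)."""
    # Imports are deferred so this file can be loaded without the packages;
    # both must be installed before demo() is called.
    import spacy
    from spacy_layout import spaCyLayout

    nlp = spacy.blank("en")      # a blank English pipeline is enough for layout
    layout = spaCyLayout(nlp)    # wraps the pipeline with PDF layout processing
    doc = layout(pdf_path)       # returns a regular spaCy Doc

    for record in regions_from_doc(doc):
        print(record)

    # Detected tables; table._.data is assumed to be a pandas DataFrame
    for table in doc._.tables:
        print(table._.data)
```

To try it, install `spacy` and `spacy-layout`, then call `demo()` with the path to one of your own PDFs. Keeping the region extraction in its own function makes it easy to reuse on any Doc the layout step returns.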
spaCy Layout: https://github.com/explosion/spacy-layout
GitHub Repo: https://github.com/wjbmattingly/youtube-spacy-layout/tree/main
Join this channel to get access to perks:
https://www.youtube.com/channel/UC5vr5PwcXiKX_-6NTteAlXw/join
If you enjoy this video, please subscribe.
Be my Patron: https://www.patreon.com/WJBMattingly
PayPal: https://www.paypal.com/cgi-bin/webscr?cmd=_donations&business=AZ73QW52SUX8N&currency_code=USD&source=url
If there's a specific video you would like to see or a tutorial series, let me know in the comments and I will try and make it.
If you liked this video, check out www.PythonHumanities.com, where I have Coding Exercises, Lessons, on-site Python shells where you can experiment with code, and a text version of the material discussed here.
You can follow me at:
https://twitter.com/wjb_mattingly
Video Information
Views: 14.2K
Likes: 391
Duration: 15:21
Published: Jan 14, 2025