Best Method to OCR PDFs in Python with spaCy Layout
Learn the top way to OCR PDFs in Python using spaCy Layout for accurate text extraction.

Python Tutorials for Digital Humanities
14.2K views • Jan 14, 2025

About this video
In this video, I'm going to show you the best way to OCR a PDF in Python with the new spaCy Layout package. The best part about this package is that it gives you access to all the important metadata generated from a spaCy pipeline alongside layout detection and OCR. This means you will have bounding boxes for the labeled regions of text on a given image. You can also do table detection.
spaCy Layout: https://github.com/explosion/spacy-layout
GitHub Repo: https://github.com/wjbmattingly/youtube-spacy-layout/tree/main
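As a rough illustration of what the description above refers to, here is a minimal sketch that assumes the `spacy-layout` package is installed and follows the usage shown in its README (`spaCyLayout`, `doc.spans["layout"]`, `span._.layout`). The names `to_corners` and `extract_regions` and the file name in the usage comment are hypothetical, chosen for this example; they are not taken from the video or its repo.

```python
from typing import Dict, List, Tuple


def to_corners(x: float, y: float, width: float, height: float) -> Tuple[float, float, float, float]:
    """Convert an (x, y, width, height) box to (x0, y0, x1, y1) corners."""
    return (x, y, x + width, y + height)


def extract_regions(pdf_path: str) -> List[Dict[str, object]]:
    """Run spaCy Layout on a PDF and collect labeled regions with their boxes."""
    # Imported lazily so the helper above works without the packages installed.
    import spacy
    from spacy_layout import spaCyLayout

    nlp = spacy.blank("en")       # blank English pipeline (assumption: no trained model needed)
    layout = spaCyLayout(nlp)     # wraps the pipeline with layout detection
    doc = layout(pdf_path)        # returns a regular spaCy Doc

    regions = []
    for span in doc.spans["layout"]:
        box = span._.layout       # per-span layout info: x, y, width, height, page_no
        if box is None:           # defensive check; skip spans without layout data
            continue
        regions.append({
            "label": span.label_,  # region type assigned by layout detection
            "text": span.text,
            "page": box.page_no,
            "corners": to_corners(box.x, box.y, box.width, box.height),
        })
    return regions


# Hypothetical usage, with a placeholder file name:
# for region in extract_regions("example.pdf"):
#     print(region["page"], region["label"], region["corners"])
```

Because the result is an ordinary spaCy `Doc`, the labeled regions can be passed on to other pipeline components; per the package README, detected tables are also exposed on the `Doc`.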
Join this channel to get access to perks:
https://www.youtube.com/channel/UC5vr5PwcXiKX_-6NTteAlXw/join
If you enjoy this video, please subscribe.
Be my Patron: https://www.patreon.com/WJBMattingly
PayPal: https://www.paypal.com/cgi-bin/webscr?cmd=_donations&business=AZ73QW52SUX8N&currency_code=USD&source=url
If there's a specific video or tutorial series you would like to see, let me know in the comments and I will try to make it.
If you liked this video, check out www.PythonHumanities.com, where I have Coding Exercises, Lessons, on-site Python shells where you can experiment with code, and a text version of the material discussed here.
You can follow me at:
https://twitter.com/wjb_mattingly
Video Information
Views: 14.2K
Likes: 391
Duration: 15:21
Published: Jan 14, 2025