How to extract text from pdf using python | FinTechChef | OCR using python

AutomationTank
8.1K views • Nov 15, 2019

About this video

This video shows how to extract text from a PDF using Python. Several capable modules exist for the job, including tesseract, textract, Camelot, PyPDF2, and tabula.
Here we use the "textract" Python module because it supports OCR and is very easy to use.
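
As a rough illustration of what using "textract" can look like, here is a minimal sketch. It assumes textract is installed and that a file named "sample.pdf" exists in the working directory; the file name and the helper function extract_pdf_text are hypothetical and not taken from the video. textract.process returns bytes, so the result is decoded to a string. The "tesseract" method is the OCR route for scanned PDFs and additionally assumes Tesseract itself is installed.

```python
from pathlib import Path


def extract_pdf_text(pdf_path, use_ocr=False):
    """Return the text of a PDF as a string (hypothetical helper)."""
    # Imported here so the module loads even where textract is missing.
    import textract

    # "pdftotext" relies on Poppler; "tesseract" runs OCR on scanned pages.
    method = "tesseract" if use_ocr else "pdftotext"
    raw = textract.process(str(pdf_path), method=method)  # returns bytes
    return raw.decode("utf-8")


if __name__ == "__main__":
    sample = Path("sample.pdf")  # hypothetical file name
    if sample.exists():
        # Print only the first 500 characters as a quick check.
        print(extract_pdf_text(sample)[:500])
    else:
        print("Place a PDF named sample.pdf next to this script to try it.")
```

For a scanned document with no text layer, calling extract_pdf_text(sample, use_ocr=True) would be the variant to try.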

For more information, visit https://www.FinTechChef.com

Steps for installing "textract" (on Windows):
1. Press "Win + R", type "cmd", and press Enter.
2. Run this command (without the quotes): "pip install textract"
3. Download Poppler: http://blog.alivate.com.au/wp-content/uploads/2018/10/poppler-0.68.0_x86.7z
4. Extract the archive and copy the whole folder into "C:\Program Files".
5. Add "C:\Program Files\poppler-0.68.0\bin" to the system PATH variable.
6. The "textract" setup is now complete.
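
To check that the steps above took effect, a small script along these lines may help. It is only a sketch: it assumes that Poppler's "pdftotext" executable is what needs to be reachable on PATH, and the helper names missing_tools and textract_installed are made up for this example.

```python
import importlib.util
import shutil


def missing_tools(names):
    """Return the executables from `names` that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


def textract_installed():
    """Return True if the textract package can be imported."""
    return importlib.util.find_spec("textract") is not None


if __name__ == "__main__":
    # Assumption: pdftotext ships in the Poppler "bin" folder added in step 5.
    missing = missing_tools(["pdftotext"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("Poppler tools found on PATH.")
    print("textract installed:", textract_installed())
```

If "pdftotext" is reported as missing right after editing PATH, opening a new command prompt is usually needed before the change is visible.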

How to Install and Use Jupyter Notebook: https://youtu.be/hi7PMTyT_Xc


Thanks! Try it out and enjoy :)

Video Information

Views: 8.1K | Likes: 71 | Duration: 7:58 | Published: Nov 15, 2019
