How To Convert PDFs Into AudioBooks by @jitendraballa2015

Jitendra Singh

Learning Data Science

Isn’t it interesting? After reading this article, you won’t need to read through long PDFs of 10 pages or more yourself, because about 12 lines of Python code can read them aloud for you. I’ll explain every line of the code, so even readers with no knowledge of Python can follow along. So let’s start!

Here are the steps to follow for making an AudioBook:

First of all, you must have Python and an IDE for it installed on your PC. If you don’t have these installed, you can get them by clicking on Python and IDE. There are many IDEs, but I’m giving you the link to PyCharm because it is easy to use and open-source. If you don’t know what an IDE is, don’t worry: it’s just a code editor for Python.

After downloading both Python and PyCharm, you have to set up the environment. There are many videos on YouTube that can help with that.

Open the IDE, create a new file, and give it whatever name you want.

Click on the Terminal option, then install the Python text-to-speech library like this:

# make sure you are connected to the Internet
pip install pyttsx3

Also install the PyPDF2 library in the same way:

pip install PyPDF2
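
Before going further, it can help to confirm that both installs worked. Here is a minimal sketch of such a check. The helper name `missing_packages` is my own (hypothetical), not part of either library. It only asks Python whether each package can be found for import.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that Python cannot find for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Report which of the two libraries (if any) still need to be installed.
missing = missing_packages(["pyttsx3", "PyPDF2"])
print("Missing:", missing if missing else "none")
```

If a name shows up as missing, run the matching `pip install` command again in the same Terminal.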

Here is the full code:

    import pyttsx3
    import PyPDF2
    pdf_book = open('book.pdf','rb')
    pdf_book_reader = PyPDF2.PdfFileReader(pdf_book)
    pages_no = pdf_book_reader.numPages
    print(pages_no)
    speaker = pyttsx3.init()
    for number in range(8,pages_no):
      page_start = pdf_book_reader.getPage(number)
      text = page_start.extractText()
      speaker.say(text)
      speaker.runAndWait()

Explanation of code:

  1. Lines 1 & 2 import the required libraries.
  2. In line 3,
     pdf_book = open('book.pdf','rb')
     creates a variable called pdf_book, which opens the PDF file we want read aloud; 'rb' opens it for reading in binary mode.
  3. In line 4,
     pdf_book_reader = PyPDF2.PdfFileReader(pdf_book)
     reads the opened file using the PdfFileReader function of PyPDF2.
  4. In line 5,
     pages_no = pdf_book_reader.numPages
     counts the total number of pages in the file using numPages.
  5. In line 6,
     print(pages_no)
     displays the total number of pages in the file.
  6. In line 7,
     speaker = pyttsx3.init()
     initializes the pyttsx3 library so it can speak the file.
  7. In line 8,
     for number in range(8,pages_no):
     starts a for loop that uses the range function to go through the PDF from the starting page to the last page. Page numbers start at 0, so 8 means the ninth page; change it to the page where you want to start.
  8. In line 9,
     page_start = pdf_book_reader.getPage(number)
     gets the current page of the loop, beginning with the starting page from where we want Python to read.
  9. In line 10,
     text = page_start.extractText()
     extracts the text from that page using the extractText function.
  10. In line 11,
     speaker.say(text)
     speaks the text of the page using the say function.
  11. In line 12,
     speaker.runAndWait()
     makes sure the speech actually runs, so that every page of the PDF file is spoken.
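
One thing to watch: newer PyPDF2 releases renamed the calls used above (as far as I know, the old names PdfFileReader, numPages, getPage and extractText were deprecated and then removed in the 3.x series). Below is a sketch of the same program using the newer names PdfReader, pages and extract_text. The function names `read_pages_aloud` and `main` are my own, and the page list and speaker are passed in as arguments so the loop can be tried without a real PDF. Treat it as one possible way to write it, not the only one.

```python
def read_pages_aloud(pages, speaker, start=0):
    """Speak the text of every page from index `start` to the last page.

    `pages` is any indexable collection of objects with extract_text(),
    and `speaker` is any object with say() and runAndWait().
    Returns how many pages were actually spoken.
    """
    spoken = 0
    for number in range(start, len(pages)):
        text = pages[number].extract_text() or ""
        if text.strip():            # skip pages with no extractable text
            speaker.say(text)
            speaker.runAndWait()    # speak this page before moving on
            spoken += 1
    return spoken


def main(pdf_path="book.pdf", start=0):
    # Imported here so read_pages_aloud can be tried without these packages.
    import pyttsx3
    from PyPDF2 import PdfReader    # newer name for PdfFileReader

    with open(pdf_path, "rb") as pdf_book:
        reader = PdfReader(pdf_book)
        print(len(reader.pages))    # newer equivalent of numPages
        read_pages_aloud(reader.pages, pyttsx3.init(), start)
```

Calling `main("book.pdf", start=8)` would behave like the 12-line version, starting from the ninth page.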

So now you can lie on your bed and just listen to whatever PDF you want, anytime, anywhere, just by running this program.
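
If you would rather keep the audio than listen live, pyttsx3 also has a save_to_file function. Here is a small sketch of how it could be used. The helper name `save_as_audio` and the output file name are my own assumptions, and the audio format that actually gets written depends on the speech driver of your operating system.

```python
def save_as_audio(text, out_path, speaker):
    """Queue `text` to be written to `out_path`, then run the speech engine."""
    speaker.save_to_file(text, out_path)
    speaker.runAndWait()    # the file is written while the engine runs
    return out_path


# Typical use (needs pyttsx3 installed):
# import pyttsx3
# save_as_audio("Hello from my PDF", "book_audio.mp3", pyttsx3.init())
```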

Note: I suggest putting the PDF file in the same folder as your code; it will save you trouble.
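
If the program still cannot find the file, one option is to build the path from the location of the script itself. This is a minimal sketch; the helper name `pdf_next_to` is hypothetical and 'book.pdf' is just the file name used in this article.

```python
from pathlib import Path


def pdf_next_to(script_file, pdf_name="book.pdf"):
    """Return the path of `pdf_name` in the same folder as `script_file`."""
    return Path(script_file).resolve().parent / pdf_name


# Typical use inside the script itself:
# pdf_book = open(pdf_next_to(__file__), "rb")
```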

Thank you so much for reading! Follow the writer of the article for more stuff on Python and Data Science.

