Working with Python is a bliss in itself, most of the times. I have been involved with Python for more than two and a half years now, and for most of that time I hardly had any issues until I had to work with spreadsheets.
Python has easy to write syntax and dynamic style, which is why it is a great language for beginners to start with. But the benefits do not end there. Python can, and is used in many big products, such as the framework Django.
Django powers social media apps such as Instagram. Other examples are BitTorrent, Google App Engine, and Ubuntu Software Center. Even the core functionality of YouTube is written in Python.
All these examples testify to the robustness and usefulness of Python.
Being a great scripting language, developers prefer to use Python in scenarios where they have to play with and manipulate data. In my experience of working in a product based company, I have come across projects that require the data collected from products to be reported back to the clients.
This requires fetching data from the database and manipulating it into a presentable form, which is generally in a spreadsheet format. Having some experience with Python, I decided to apply this to making spreadsheets.
These are tips for those who work with an abundance of data and want to harness that using spreadsheets. I have come across a multitude of issues as a developer and would like to share some of those here, and how you can work your way around them.
There are many third-party packages available to work with spreadsheets using Python. Some of them are listed at:
Python Excel_A website for people who need to work with Excel files in Python_www.python-excel.org
You may know that Microsoft Excel provides two file extensions for spreadsheets, one is .xls
and other being .xlsx
. The former corresponds to files for Microsoft Excel 2003 or earlier, and the latter is for the newer versions.
The first problem that arises in working with spreadsheets using Python is because of these two different extensions. The package xlwt
supports the .xls
extension of Excel, and openpyxl
supports the .xlsx
extensions of Excel.
I will be demonstrating how to use the xlwt
package for working with spreadsheets. You can install this package using the following command on terminal or command prompt (Prerequisite: pip
must be installed on your system. For installation instructions of pip
, visit this link)
pip install xlwt
Note: This article uses xlwt==1.3.0
After the installation is complete, open a text editor to create a new file and give it any name you like; I am giving it the name excelScript.py
.
Inside the script, write the following code:
import xlwt
def main():book = xlwt.Workbook()sheet = book.add_sheet('Sheet 1')book.save('Sample.xls')
if __name__ == '__main__':main()
The code above creates a blank Excel file with the name “Sample.xls” , containing a single sheet named “Sheet 1". This was an example of simple .xls
file creation. You can run the script above using
python excelScript.py
in your command line.
To write data to the Excel sheet, change the code to the following:
import xlwt
def main():book= xlwt.Workbook()sheet = book.add_sheet('Sheet 1')
# sheet.write(r, c, <text>)
sheet.write(0, 0, 'sample')
book.save('Sample.xls')
if __name__ == '__main__':main()
This will write “sample” on zeroth row and zeroth column. Keep in mind that the xlwt
package reads row and column indices starting at zero. This way you can write into your Excel file and create as many sheets you like. Run this script using the command
python excelScript.py
in your command line. This is a common way to write to Excel using the xlwt
package where you can merge the columns or rows. You can give it styling also. The following code demonstrates how to do so:
import xlwt
def main():book= xlwt.Workbook()sheet = book.add_sheet('Sheet 1')
# sheet.write\_merge(r1, r2, c1, c2, <text>)
# For styling add
# xlwt.easyxf('align: horz center, vert center')
# add the above after <text> as 6th argument to write\_merge
sheet.write\_merge(0, 1, 0, 1, 'sample 1', xlwt.easyxf('align: horz center, vert center'))
book.save('Sample.xls')
if __name__ == '__main__':main()
This will merge columns (0, 1) and rows (0, 1) to make a single column and write “sample 1” into it which will be both horizontally and vertically center aligned. xlwt.easyxf
does the styling for you. If you open the xls
file, you will see that the previously written sample is gone.
For information on more methods provided by xlwt
you can read the documentation here.
Please note that if you use _xlwt_
and if a file with the same name exists then it will be replaced.
The procedure above is good if you want to create a new .xls
file, and if you are working with an older format of the Microsoft Excel spreadsheet.
However, this package has its limitations as you can only create 65536
rows and 256
columns in the older versions of Excel. Similarly, you can write only these many rows and columns using this package. If you have more data to write than this, then unfortunately this package cannot help you. You will have to use another Python package called openpyxl
.
openpyxl - A Python library to read/write Excel 2010 xlsx/xlsm files - openpyxl 2.4.2 documentation_Edit description_openpyxl.readthedocs.io
You can read more about how to use openpyxl
to work with Excel files in part 2 of this article.
Continuing with the xlwt
package, I found that it is very easy to create new .xls
files. But what if you already have an existing .xls
file and want to append new data to the existing file. In such a case the xlwt
package does not comply. Instead, you have to work around this. Note that if you have a .xls
file then you cannot use openpyxl
, which does support data appending into already existing files, because that is incompatible with older versions of Excel files.
For the fix I am going to explain, you need two more Python packages. Namely, the xlrd
package for reading the Excel file, and the xlutils
package to make a copy of the Excel object into the xlwt
object so that you can write into it.
To install these packages use the following commands:
pip install xlrdpip install xlutils
The following procedure can be applied to append data to an existing .xls
file using xlwt
. In the excelScript.py
file, change the code to the following:
from xlutils.copy import copyimport xlrdimport xlwt
def main():rb = xlrd.open_workbook('sample.xls')
wb = copy(rb)
sheet = wb.get\_sheet(0)
sheet.write(5, 5, 'sample 2')
sheet.write\_merge(3, 4, 3, 4, 'sample 3', xlwt.easyxf('align: horse center, vert center'))
wb.save('sample.xls')
if __name__ == '__main__';main()
When you run this script using the
python excelScript.py
command, and open the Excel file created, you will notice that “sample 1” was written to .xls
previously present there and “sample 2” and “sample 3” were also added to that file. Voila! This is the functionality we needed. But how did we achieve this?
Here is what happened. First you read “sample.xls” using the xlrd
package and saved it into a variable rb
in an object format. Then you copied that object into a variable wb
using the copy
method of xlutils
package. This means that at that moment, both variables rb
& wb
contain the same object data. Now, we access the first sheet of Excel using the index zero and store its object data into variable sheet
, and now you can use all the functional capabilities of the xlwt
package to finally save your Excel sheet with the same name as before, “sample.xls”.
If you have carefully followed then you may have figured out that we really didn’t append data to an existing file. Instead what we did was to copy the data of the existing file into memory using xlrd
, then writing it to a new Excel file using the copy
method of xlutils
package, and then writing the new data into that new Excel file, while saving it with same name as before so that it can be overwritten.
I am afraid there isn’t a more elegant way to achieve this if you are working with older .xls
files.
The story doesn’t end here. This was all about working with .xls
files but what if you wanted to do the same thing but this time you have got .xlsx
files to work with. In such a case, you are in luck because you can work with openpyxl
. You can read more about it in part 2 of this article.
Hope you found this useful! See you in part 2.
Thanks for reading!
If you enjoyed reading this and would like to read some pieces of life through my eyes, please visit here.