How to Export a Python Pandas Dataframe to Excel

Python Pandas

If you are familiar with the Python language, you may have heard of Pandas before. Pandas is a Python library used for data analysis. It has functions you can use to analyze, clean, explore and manipulate data sets. At the center of that work is the dataframe.

Data frames are two-dimensional data structures. Two-dimensional structures are popular because we can use them to create tables.

A Pandas dataframe is a two-dimensional table with labelled axes (rows and columns), whose columns can hold different data types, and which can be manipulated using Python. You can think of it as a programmable form of an Excel sheet. Not only is it fast, but it also lets you carry out data manipulation that would be tedious to do by hand.

A Pandas dataframe is also mutable, which means it can be changed after it is created: you can add, remove or update rows and columns. In other words, you can use it to carry out very complex operations with half the stress of using a conventional method.

Dataframes are central to the Pandas library, but it is often important to be able to export them as an Excel document. In this article, we cover three different ways to export a dataframe to Excel, as well as the advantages and disadvantages of each. Each method has code examples so you can follow along. We also compare xlsx and csv files to help you decide which one is more suitable for your export.

Why Export Pandas Dataframe to Excel?

For the End User

Excel offers a large amount of functionality that anyone with some knowledge of computers can use. The end user may be someone who does not know Pandas or Python, such as a customer, a boss, or a board with no time to look at information via code. Ultimately, the data has to be delivered in a format everyone is used to, and that format is often Excel.

For Presentations

It is often easier to grasp information quickly from an Excel spreadsheet than from a Pandas dataframe. Having to type in commands before seeing what you want to present can be unnerving for people with stage fright. Aside from that, moving between spreadsheets during a presentation is easier in Excel than in Python, which a lay audience may find complicated.

For Printing Purposes

As a programmer, you can print your work directly from Python. However, most Python beginners opt for Excel when printing because of its apparent simplicity.

How to Export a Pandas Dataframe to Excel

There are three common methods that you can use to export a pandas dataframe to an Excel format:

  1. The dataframe.to_excel() function
  2. The ExcelWriter class
  3. Exporting the dataframe to a csv file.

Before You Start

Before you can use any of the three methods, you must have the pandas package installed. Some Python distributions, such as Anaconda, come with pandas included. Others require you to download and install pandas yourself. In that case, you can install it with the following pip command:

pip install pandas

#1 Using the dataframe.to_excel() function

Functions are packs of code which can be used over and over again for a particular task. The dataframe.to_excel() function writes a Pandas dataframe object into the excel sheet. This method is pretty simple and is only a three-step process.

To start with, we need to install the openpyxl library, which will allow us to read and write Excel files using Python:

pip install openpyxl

  • Create a dataframe: this applies if you are starting from scratch. Specify each column label when creating the dataframe, as in the sample code below.

First, import the pandas library (as 'pd' here), then create the dataframe:

import pandas as pd

# DataFrame creation

name_of_dataframe = pd.DataFrame(dataArray)

# In the above code, dataArray is the data you are putting into your table, for example a dictionary that maps each column label to a list of values

  • Specify a name for the Excel file: choose the target file name, including the .xlsx extension, and store it as a string. The file does not need to exist yet, because pandas creates it during the export. Below is a sample code for specifying the name of the Excel file.

# Specifying a name for the Excel file

file_name = 'name_of_sheet.xlsx'

  • Use the .to_excel() function: this is the final part of this method, which saves the data into a sheet in the new file.

# Saving the data into the specified file

name_of_dataframe.to_excel(file_name)

print('Dataframe successfully exported to Excel spreadsheet')

Example:

In the following code, note that we specify the column names when we construct the original dataframe.

import pandas as pd

data = {'Groceries': ['Milk','Bananas','Bread','Oranges'],
        'Price': [3.00,.75,1.50,1.00]
        }

df = pd.DataFrame(data, columns= ['Groceries', 'Price'])

file_name = 'myGroceryList.xlsx'

df.to_excel(file_name)

print(df)

Note that we specify the column names, but the row labels come from the dataframe's index, which pandas writes as the first column of the sheet.
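As a minimal sketch of the same export without the index column, the example below passes index=False and a sheet name, then reads the file back to check the result. The temporary folder and the sheet name "Groceries" are assumptions made for illustration, and openpyxl is assumed to be installed as described above.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({
    "Groceries": ["Milk", "Bananas", "Bread", "Oranges"],
    "Price": [3.00, 0.75, 1.50, 1.00],
})

# Hypothetical output location: a fresh temporary folder
out_path = os.path.join(tempfile.mkdtemp(), "myGroceryList.xlsx")

# index=False leaves the row labels out of the sheet;
# sheet_name sets the tab name shown in Excel
df.to_excel(out_path, sheet_name="Groceries", index=False)

# Read the sheet back to confirm what was written
round_trip = pd.read_excel(out_path, sheet_name="Groceries")
print(round_trip)
```

Reading the file back is optional. It is included here only as a quick check that the columns arrive as expected.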

#2 Export data frame using the ExcelWriter class

ExcelWriter is a class used to save dataframes to an Excel file through a writer object. It is useful for saving multiple sheets at once and for appending data to existing files. The difference between this method and the previous one is that you create the writer for the Excel file first, and then export the data to it. The method uses these simple steps:

  • Create an Excel writer

The first step in this method is to create the writer object for the spreadsheet via Python. The xlsxwriter engine used here is a separate package that you may need to install with pip install xlsxwriter. The code you can use for this is given below.

import pandas as pd

writer = pd.ExcelWriter('Export.xlsx', engine='xlsxwriter') # creates the writer for the Excel file

  • Add the data to the Excel sheet directly from Python: you input the data as lists, one per column, which determine the shape of the table.

# Dataframe with X and Y columns

z = pd.DataFrame({'X': ['a', 'b', 'c'], 'Y': [1, 2, 3]})

# write dataframe (z) to the Excel file through the writer object

z.to_excel(writer, index=False)

writer.close() # close the Pandas Excel writer and output the Excel file
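The same kind of export can also be sketched with a with block, which closes the writer and saves the file automatically when the block ends. This sketch writes two sheets into one workbook. The second dataframe, the sheet names and the temporary folder are hypothetical, and the openpyxl engine is assumed here instead of xlsxwriter.

```python
import os
import tempfile

import pandas as pd

letters = pd.DataFrame({"X": ["a", "b", "c"], "Y": [1, 2, 3]})
totals = pd.DataFrame({"Total": [6]})  # hypothetical second table

# Hypothetical output location: a fresh temporary folder
out_path = os.path.join(tempfile.mkdtemp(), "Export.xlsx")

# The writer is closed (and the file saved) when the with block exits
with pd.ExcelWriter(out_path, engine="openpyxl") as writer:
    letters.to_excel(writer, sheet_name="Letters", index=False)
    totals.to_excel(writer, sheet_name="Totals", index=False)

# sheet_name=None reads every sheet into a dict keyed by sheet name
sheets = pd.read_excel(out_path, sheet_name=None)
print(list(sheets))
```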

#3 Saving a Pandas dataframe as a csv file

A comma-separated values (csv) file is another effective way to save tabular data. Each line of the file is one row of the table, and the fields within a row are separated by commas, which do the work of column lines. The file gets its name from its use of the comma.

Most data scientists prefer this file type over xlsx for saving dataframes, for reasons covered later in this post. The simplest way to write the dataframe to a csv file is with the Python code below:

z.to_csv('Export.csv')

# Export.csv is the file name

By default, the dataframe's index is written as the first column. To leave it out, set index to False like this:

z.to_csv('Export.csv', index=False)

You can open this csv file using excel for display and further manipulation.
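To see what the index setting changes, here is a small sketch that calls to_csv() without a file name, in which case pandas returns the csv text as a string instead of writing a file. The variable names are chosen for illustration.

```python
import pandas as pd

z = pd.DataFrame({"X": ["a", "b", "c"], "Y": [1, 2, 3]})

# With no path given, to_csv() returns the csv text as a string
with_index = z.to_csv()
without_index = z.to_csv(index=False)

# Split into lines so the header rows are easy to compare
print(with_index.splitlines())
print(without_index.splitlines())
```

In the first result every row starts with the index value and the header starts with an empty field. In the second, only the X and Y columns are written.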

In the following section we will cover the differences between the xlsx (excel output) and csv formats.

File Formats xlsx vs csv

The first two methods save the dataframe as an xlsx file, while the last method saves it with the csv file extension. Depending on who is using the file, one may be favored over the other. Let us explore their merits and demerits so you can decide which one you prefer.

xlsx file format

Advantages of Exporting to xlsx:

  • Xlsx is the primary file type of Microsoft Excel, so it opens directly in Excel. As most people are familiar with Microsoft Excel, they can open xlsx files without much confusion.
  • Xlsx files can easily be edited by anyone, since they open in an Excel workbook by default and Excel's interface is well known.
  • Since Microsoft Excel comes with a defined appearance, xlsx files (as they open directly into Excel) are great for quick presentations by the user(s) or a third party who is unfamiliar with file types and their uses.

Disadvantages of Exporting to xlsx:

  • Exporting to xlsx takes more time and memory than exporting to csv, because a whole workbook has to be built and compressed rather than written out as plain text.
  • Xlsx files cannot be opened in a plain text editor. They need a spreadsheet application such as Excel, LibreOffice Calc or Google Sheets.
  • Microsoft Excel itself is not available on every platform (there is no Linux desktop version, for instance), and other applications may not display every Excel feature in the same way.
  • Excel may change values when it opens or edits them. For instance, long numbers such as credit card numbers are kept to only 15 significant digits, so the trailing digits can be replaced with zeros.
  • Working with xlsx files in code is more complicated and slower than working with plain text, and it requires an extra library such as openpyxl.

csv file format

Advantages of exporting to csv:

  • It can be opened and edited by text editors such as notepad and textpad.
  • It can also be opened by many applications such as excel, google sheets, Mobi Systems OfficeSuite, LibreOffice, Fresh Books etc.
  • Parsing csv files is easy using any programming language.
  • It is a platform-independent file. You can create and handle csv files on Windows, Linux and Mac environments.
  • Because a csv file is plain text with no formatting layer, the values are stored exactly as they were written.
  • It usually takes little disk space and is fast to read in most applications.
  • Since they are simple text files, they are very easy to use for other programming purposes.

Disadvantages of exporting to csv:

  • Csv cannot store formatting such as fonts, colors, column widths or formulas. Any formatting applied in Excel is lost unless the file is saved as xlsx.
  • End users find reading large csv files cumbersome, especially when they are opened in text editors. Since the fields are separated only by commas, the columns do not line up, and the information gets hard to read as the file grows.
  • Csv cannot represent nested or hierarchical data, only flat rows and columns.
  • Special characters such as accented letters and math symbols depend on the text encoding, and a file opened with the wrong encoding shows them garbled.
  • Csv files have no types. Every entry is stored as a string.
  • There is no single enforced standard for csv files. The delimiter, quoting rules and encoding vary between applications.
  • There is no separation of numeric data from text data, so the program reading the file has to guess which is which.
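The lack of types can be illustrated with a short sketch. The zip code column below is hypothetical. It is written as text, but when the csv is read back pandas guesses that the column is numeric and the leading zero is lost, unless the reader is told to keep the column as a string.

```python
import io

import pandas as pd

# Hypothetical data: zip codes are text, and one has a leading zero
df = pd.DataFrame({
    "zip_code": ["02134", "90210"],
    "city": ["Boston", "Beverly Hills"],
})

csv_text = df.to_csv(index=False)

# Default read: pandas infers the column type from the text
guessed = pd.read_csv(io.StringIO(csv_text))

# Explicit read: keep the column as strings
as_text = pd.read_csv(io.StringIO(csv_text), dtype={"zip_code": str})

print(guessed["zip_code"].tolist())
print(as_text["zip_code"].tolist())
```

Passing dtype when reading is one way to work around the missing type information. The csv file itself still carries none.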

Exporting Pandas Dataframe to Other Formats

Depending on what you want to do with the information contained in your Pandas dataframe, there are other file formats that a Pandas dataframe can be converted to.

Dataframe to PDF

Exporting a pandas dataframe to PDF is useful because the result stays as it is. It cannot be edited by any application except those with the right tools, so it is best used when the information is not subject to change. Pandas has no built-in PDF writer, so the export is done from the Python program with an additional library or through an intermediate format such as HTML.

Dataframe to JSON

JSON (JavaScript Object Notation) files store data as text, using a syntax derived from JavaScript objects. Large JSON files can be very difficult for a person to read and understand, so the format suits small amounts of data or data meant to be read by other programs. A pandas dataframe can be exported to JSON directly with the .to_json() method.
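As a minimal sketch of a JSON export, the example below uses DataFrame.to_json() with orient="records", which produces one JSON object per row. The grocery data is reused from earlier for illustration, and the standard-library json module is used only to check the result.

```python
import json

import pandas as pd

df = pd.DataFrame({
    "Groceries": ["Milk", "Bananas", "Bread", "Oranges"],
    "Price": [3.00, 0.75, 1.50, 1.00],
})

# With no path given, to_json() returns the JSON text as a string
json_text = df.to_json(orient="records")

# Parse it back into Python objects to inspect the structure
records = json.loads(json_text)
print(records[0])
```

Passing a file name as the first argument, for example df.to_json("Export.json", orient="records"), writes the same text to a file instead.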

Dataframe to PNG

PNG is an image format, so the best way to export a pandas dataframe to a png file is via conversion from HTML. Hence, the dataframe has to be converted to HTML before conversion to PNG.

Dataframe to TXT

Exporting pandas dataframe to TXT requires the use of the .to_csv function which is the same function used for csv files. It is easy to read and use on Windows and Mac.
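A small sketch of this, assuming a tab-separated layout is wanted: the sep parameter of to_csv() replaces the comma with another separator. Here the text is returned as a string so it can be inspected. Passing a file name such as "Export.txt" (a hypothetical name) as the first argument would write it to disk instead.

```python
import pandas as pd

df = pd.DataFrame({
    "Groceries": ["Milk", "Bananas", "Bread", "Oranges"],
    "Price": [3.00, 0.75, 1.50, 1.00],
})

# sep="\t" writes tab-separated text instead of comma-separated text
txt_text = df.to_csv(sep="\t", index=False)

lines = txt_text.splitlines()
print(lines)
```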

Dataframe to SQLITE

One of the most popular database engines available, SQLite is frequently used for data manipulation. A pandas dataframe is written to a SQLite database table using the DataFrame.to_sql() method together with a database connection.
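A minimal sketch of a SQLite export, assuming DataFrame.to_sql() with a connection from Python's standard-library sqlite3 module. The table name "groceries" and the in-memory database are choices made for illustration. A file path in place of ":memory:" would store the database on disk.

```python
import sqlite3

import pandas as pd

df = pd.DataFrame({
    "Groceries": ["Milk", "Bananas", "Bread", "Oranges"],
    "Price": [3.00, 0.75, 1.50, 1.00],
})

# In-memory database for the sketch; use a file path to persist it
conn = sqlite3.connect(":memory:")

# Write the dataframe to a table, replacing the table if it already exists
df.to_sql("groceries", conn, index=False, if_exists="replace")

# Query the table back, cheapest item first
back = pd.read_sql_query("SELECT * FROM groceries ORDER BY Price", conn)
conn.close()

print(back)
```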

Dataframe to MYSQL

The easiest way to export a pandas dataframe to MySQL is to open a connection to the MySQL database from Python (for example through SQLAlchemy) and write the dataframe to a table with the DataFrame.to_sql() method. The rows of the dataframe are then inserted into the MySQL table.

Dataframe to HTML

Exporting data to HTML is quite simple. It even serves as a middle man for exporting data to some other formats like PNG. The dataframe when exported becomes a table on a web page.
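As a short sketch, DataFrame.to_html() returns the table markup as a string when no file name is given. The grocery data is reused here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Groceries": ["Milk", "Bananas", "Bread", "Oranges"],
    "Price": [3.00, 0.75, 1.50, 1.00],
})

# With no path given, to_html() returns the <table> markup as a string
html_text = df.to_html(index=False)

print(html_text)
```

The string can be written to an .html file or embedded in a larger web page.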

Summary – Exporting a Pandas Dataframe to Excel

Exporting a Pandas dataframe to Excel (xlsx) comes in handy for most office situations. However, for further work in data science, it is wise to export to the file type that will be easiest to work with later. To that end, csv steals the show with its flexibility and simplicity. For more information, we recommend starting with the official documentation for Pandas.