How to sort data by column in a .csv file with Python pandas

Sorting data by the values in a column is a very common task for data analysts who use Python pandas.

For this example, let's say you're trying to sort a .csv file that contains housing data. 🏠 In particular, you want to sort from highest to lowest, based on price.

You start with a .csv file that looks like this:

homes.csv

Address          Price    Bedrooms
123 Main St      99,000   1
4981 Anytown Rd  199,000  4
132 Walrus Ave   299,001  2
1506 Guido St    784,049  3
491 Python St    293,923  4
938 Zeal Rd      148,398  2
247 Fort St      299,238  3
992 Settled St   823,049  4

Sort data in a .csv file with Python pandas

import pandas as pd
# Read your .csv file into a DataFrame.
# df is a common name for a DataFrame. You can also use
# something more descriptive, which is helpful when you
# are dealing with multiple .csv files.
df = pd.read_csv("C:/Users/kennethcassel/homes.csv")
# The .sort_values method returns a new DataFrame, so make sure to
# assign the result to a new variable. The column name must match
# the header in the .csv file exactly, including capitalization.
sorted_df = df.sort_values(by=["Price"], ascending=False)
# index=False is a flag that tells pandas not to write
# the index of each row as an extra column. If you'd like
# your rows to be numbered explicitly, leave this as
# the default, True.
sorted_df.to_csv('homes_sorted.csv', index=False)
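One thing worth checking before you sort: if the prices in homes.csv are stored with thousands separators, the way they are displayed above (for example "99,000"), pandas reads the column as text and the sort compares strings rather than numbers. The sketch below shows one way to guard against that. It assumes the prices are quoted in the file, and it uses a few sample rows held in memory with io.StringIO instead of the real file path. The thousands="," argument to read_csv is what parses the prices as integers.

```python
import io

import pandas as pd

# In-memory stand-in for homes.csv. Assumption: prices are stored as
# quoted strings with thousands separators, e.g. "99,000".
csv_text = '''Address,Price,Bedrooms
123 Main St,"99,000",1
4981 Anytown Rd,"199,000",4
1506 Guido St,"784,049",3
'''

# Without thousands=",", the Price column is read as text.
raw_df = pd.read_csv(io.StringIO(csv_text))

# With thousands=",", pandas strips the separators and parses integers.
homes_df = pd.read_csv(io.StringIO(csv_text), thousands=",")

# Sorting the numeric column now orders the rows by actual price.
sorted_df = homes_df.sort_values(by=["Price"], ascending=False)
print(sorted_df["Price"].tolist())
```

If your file already stores plain numbers such as 99000, the extra argument is not needed and the code above works as written.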

homes_sorted.csv

Here is the resulting .csv file from just a few lines of pandas code!

Address          Price    Bedrooms
992 Settled St   823,049  4
1506 Guido St    784,049  3
247 Fort St      299,238  3
132 Walrus Ave   299,001  2
491 Python St    293,923  4
4981 Anytown Rd  199,000  4
938 Zeal Rd      148,398  2
123 Main St      99,000   1
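To see what the index=False flag changes in the output, here is a minimal sketch that writes a small DataFrame to CSV text both ways. The two-row DataFrame and its index values are made up for illustration. Calling to_csv without a file path returns the CSV as a string, which makes the difference easy to inspect.

```python
import pandas as pd

# Hypothetical two-row frame, already sorted by Price descending.
# The index keeps the original row positions from before the sort.
df = pd.DataFrame(
    {"Address": ["992 Settled St", "123 Main St"], "Price": [823049, 99000]},
    index=[7, 0],
)

# Default (index=True): the row index is written as an unnamed first column.
with_index = df.to_csv()

# index=False: only the data columns are written.
without_index = df.to_csv(index=False)

print(with_index)
print(without_index)
```

With the default setting, the index column reflects each row's position in the original file, not its rank after sorting, which is usually why index=False is the cleaner choice here.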