How to measure Kurtosis in Python pandas

Kurtosis! It's a neat statistical measure that describes the tails of a dataset's distribution. In particular, it measures whether the data are heavy-tailed or light-tailed compared to a normal distribution.

The lower the number is, the fewer extreme values (outliers) the data tend to contain. The higher it is, the more outliers the data tend to contain.
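As a rough illustration of that relationship, here is a minimal sketch using made-up numbers (not the housing data): the same evenly spread values, with and without one extreme value appended. The variable names are hypothetical.

```python
import pandas as pd

# Made-up, evenly spread values with no outliers.
evenly_spread = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# The same values plus a single extreme value.
with_outlier = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100])

low = evenly_spread.kurtosis()
high = with_outlier.kurtosis()

print(low)   # negative: light tails
print(high)  # positive: one point sits far out in the tail
```

A single extreme value is enough to move the measurement from negative to positive, which is why kurtosis is often read as an outlier indicator.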

Let's take a look at the kurtosis for the price column in the following .csv of housing data.

homes_sorted.csv

Address            Price      Bedrooms
992 Settled St     823,049    4
1506 Guido St      784,049    3
247 Fort St        299,238    3
132 Walrus Ave     299,001    2
491 Python St      293,923    4
4981 Anytown Rd    199,000    4
938 Zeal Rd        148,398    2
123 Main St        99,000     1
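If you don't have the file handy, the table can be reproduced in memory. This is a sketch under two assumptions that the original file may not satisfy: that the column headers are capitalized exactly as displayed above, and that prices are stored with thousands separators inside quotes. Passing `thousands=','` asks `read_csv` to parse such values as numbers rather than text.

```python
import io

import pandas as pd

# Assumed file contents, reconstructed from the table above.
csv_text = '''Address,Price,Bedrooms
992 Settled St,"823,049",4
1506 Guido St,"784,049",3
247 Fort St,"299,238",3
132 Walrus Ave,"299,001",2
491 Python St,"293,923",4
4981 Anytown Rd,"199,000",4
938 Zeal Rd,"148,398",2
123 Main St,"99,000",1
'''

homes = pd.read_csv(io.StringIO(csv_text), thousands=',')

print(homes.dtypes)               # Price should be numeric, not object
print(homes['Price'].kurtosis())  # excess kurtosis of the eight prices
```

If the real file uses a lowercase `price` header, use `homes['price']` instead. The printed value depends on the exact rows in the file, so it may not match a figure computed from a different copy of the data.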

How to measure kurtosis with Python pandas

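import pandas as pd

# Load the housing data (adjust the path to wherever your copy of the file lives)
df = pd.read_csv('/Users/kennethcassel/homes_sorted.csv')
# Series.kurtosis() returns the sample excess kurtosis (a normal distribution scores 0)
df['price'].kurtosis()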

Output: -0.29610470855022797
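To see what a number like that represents, the calculation can be written out by hand. The sketch below implements the bias-corrected sample excess kurtosis formula, which should match what pandas computes, and compares it with `Series.kurtosis()`. The sample values and the helper name `sample_excess_kurtosis` are made up for illustration.

```python
import pandas as pd


def sample_excess_kurtosis(values):
    """Bias-corrected sample excess kurtosis (needs at least 4 values)."""
    n = len(values)
    mean = sum(values) / n
    # Sums of squared and fourth-power deviations from the mean.
    s2 = sum((v - mean) ** 2 for v in values)
    s4 = sum((v - mean) ** 4 for v in values)
    scaled = n * (n + 1) * (n - 1) * s4 / ((n - 2) * (n - 3) * s2 ** 2)
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scaled - correction


sample = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0, 30.0]  # made-up data

by_hand = sample_excess_kurtosis(sample)
from_pandas = pd.Series(sample).kurtosis()

# The two values should agree to floating-point precision.
print(by_hand, from_pandas)
```

The subtracted correction term is what centers the scale so that normally distributed data score close to 0.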

Conclusion:

It's super easy to find the kurtosis of your data using Python pandas.

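Our dataset had a low kurtosis measurement. pandas reports excess kurtosis (Fisher's definition), where a normal distribution scores 0. On Pearson's scale, which is 3 higher, a normal distribution scores 3. With the pandas result, anything below 0 is considered a light-tailed set of data and anything above 0 is heavy-tailed. Our value of about -0.30 is slightly negative, so the prices are somewhat lighter-tailed than a normal distribution.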
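As a quick sanity check on the baseline, the sketch below draws a large sample from a normal distribution with NumPy and measures it with pandas. The seed and sample size are arbitrary choices. On the excess (Fisher) scale that pandas documents for `kurtosis()`, the result should land close to 0, and adding 3 converts it to the Pearson scale, where a normal distribution scores 3.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)  # arbitrary seed, for repeatability
normal_sample = pd.Series(rng.normal(loc=0.0, scale=1.0, size=100_000))

excess = normal_sample.kurtosis()  # Fisher definition: normal is about 0
pearson = excess + 3               # Pearson definition: normal is about 3

print(round(excess, 2), round(pearson, 2))
```

Keep the scale in mind when comparing numbers from different tools, since some report the Pearson value by default.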
