Handling Missing Data in Pandas

When I have to deal with missing data, I usually reach for pandas, simply because it makes things so much easier.

Tip:

  1. dropna accepts the arguments subset and how:
df2.dropna(subset=['three', 'four', 'five'], how='all')

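As the names suggest:

  • how='all' drops a row only if every column in subset is NaN for that row, as opposed to the default how='any', which drops the row if at least one of those columns is NaN.
  • subset lists the columns to inspect for NaNs; columns outside subset are ignored when deciding which rows to drop.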

 

Code:
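
Here is a minimal sketch of the tip above. The contents of df2 are made up for illustration (the original data is not shown here); only the column names three, four and five come from the snippet, and the column one is an assumption added to show that columns outside subset are not inspected.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the names 'three', 'four', 'five'
# come from the snippet above, the values are invented.
df2 = pd.DataFrame(
    {
        "one": [1.0, 2.0, 3.0, np.nan],       # not part of subset
        "three": [np.nan, 5.0, np.nan, 7.0],
        "four": [np.nan, np.nan, np.nan, 8.0],
        "five": [np.nan, 6.0, 9.0, 10.0],
    }
)

cols = ["three", "four", "five"]

# how='all': drop a row only when every column in subset is NaN.
dropped_all = df2.dropna(subset=cols, how="all")

# how='any' (the default): drop a row when at least one column in subset is NaN.
dropped_any = df2.dropna(subset=cols, how="any")

print(dropped_all.index.tolist())  # [1, 2, 3]
print(dropped_any.index.tolist())  # [3]
```

Row 0 is the only row where three, four and five are all NaN, so it is the only one removed with how='all'. Row 3 survives how='any' even though one is NaN there, because one is not listed in subset.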

 

References:

  1. http://stackoverflow.com/questions/13413590/how-to-drop-rows-of-pandas-dataframe-whose-value-of-certain-column-is-nan
  2. http://stackoverflow.com/questions/14991195/how-to-remove-rows-with-null-values-from-kth-column-onward-in-python
  3. http://pandas.pydata.org/pandas-docs/stable/missing_data.html#missing-data-basics

 
