Using Pandas to Work with Data - 🐼#
Why are we using Pandas?
Quickly load our comma-separated values (CSV) file
Fill in missing values (NAs)
Preview the data (.head())
Filter our data
import pandas as pd
url="https://gist.githubusercontent.com/dudaspm/e518430a731ac11f52de9217311c674d/raw/4c2f2bd6639582a420ef321493188deebc4a575e/StateCollege2000-2020.csv"
data = pd.read_csv(url)  # load the CSV from the URL into a DataFrame
data = data.fillna(0)  # replace all NAs with 0s
data.head()  # preview the first five rows
|  | DATE | DAY | MONTH | YEAR | PRCP | SNOW | TMAX | TMIN | WT_FOG | WT_THUNDER | WT_SLEET | WT_HAIL | WT_GLAZE | WT_HIGHWINDS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1/1/2000 | 1 | 1 | 2000 | 0.00 | 0.0 | 44.0 | 23 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1 | 1/2/2000 | 2 | 1 | 2000 | 0.00 | 0.0 | 52.0 | 23 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2 | 1/3/2000 | 3 | 1 | 2000 | 0.01 | 0.0 | 60.0 | 35 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 3 | 1/4/2000 | 4 | 1 | 2000 | 0.12 | 0.0 | 62.0 | 54 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 4 | 1/5/2000 | 5 | 1 | 2000 | 0.04 | 0.0 | 60.0 | 30 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
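The load-and-fill steps above can also be tried without a network connection. The following is a minimal sketch, assuming a tiny made-up CSV string (`csv_text`, hypothetical values) in place of the real file; it counts the missing cells before and after `fillna(0)`.

```python
import io
import pandas as pd

# Hypothetical three-row sample standing in for the real CSV (values are made up)
csv_text = """DATE,PRCP,SNOW,WT_FOG
1/1/2000,0.00,,
1/2/2000,,0.5,1
1/3/2000,0.01,,
"""

# Empty fields in the CSV are parsed as NaN (pandas' marker for missing data)
sample = pd.read_csv(io.StringIO(csv_text))
missing_before = int(sample.isna().sum().sum())  # total number of missing cells

# Same call as in the notebook: every NaN becomes 0
filled = sample.fillna(0)
missing_after = int(filled.isna().sum().sum())

print(missing_before, missing_after)
```

Filling with 0 is a reasonable fit for the WT_ flag columns, where a blank most likely means the event was not recorded. For measurement columns such as TMAX, it may be worth checking how many values were missing before treating them as zeros.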
Acknowledgement#
Cite as: Menne, Matthew J., Imke Durre, Bryant Korzeniewski, Shelley McNeal, Kristy Thomas, Xungang Yin, Steven Anthony, Ron Ray, Russell S. Vose, Byron E. Gleason, and Tamara G. Houston (2012): Global Historical Climatology Network - Daily (GHCN-Daily), Version 3. CITY:US420020. NOAA National Climatic Data Center. doi:10.7289/V5D21VHZ (accessed 02/22/2021).
Publications citing this dataset should also cite the following article: Matthew J. Menne, Imke Durre, Russell S. Vose, Byron E. Gleason, and Tamara G. Houston, 2012: An Overview of the Global Historical Climatology Network-Daily Database. J. Atmos. Oceanic Technol., 29, 897-910. doi:10.1175/JTECH-D-11-00103.1.
Use liability: NOAA and NCEI cannot provide any warranty as to the accuracy, reliability, or completeness of furnished data. Users assume responsibility to determine the usability of these data. The user is responsible for the results of any application of this data for other than its intended purpose.
Links: https://data.noaa.gov/onestop/
https://www.ncdc.noaa.gov/cdo-web/search
NumPy: the absolute basics for beginners - https://numpy.org/doc/2.2/user/absolute_beginners.html
Filtering#
Pandas and D3.js filter data in very similar ways. I will discuss the Pandas version.
Filter by year#
data[data.YEAR==2020].head()
|  | DATE | DAY | MONTH | YEAR | PRCP | SNOW | TMAX | TMIN | WT_FOG | WT_THUNDER | WT_SLEET | WT_HAIL | WT_GLAZE | WT_HIGHWINDS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 7274 | 1/1/2020 | 1 | 1 | 2020 | 0.10 | 0.3 | 40.0 | 28 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 7275 | 1/2/2020 | 2 | 1 | 2020 | 0.00 | 0.0 | 36.0 | 27 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 7276 | 1/3/2020 | 3 | 1 | 2020 | 0.05 | 0.0 | 46.0 | 29 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 7277 | 1/4/2020 | 4 | 1 | 2020 | 0.28 | 0.0 | 49.0 | 42 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 7278 | 1/5/2020 | 5 | 1 | 2020 | 0.00 | 0.0 | 49.0 | 31 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
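The year filter above relies on a boolean mask: the comparison `data.YEAR==2020` produces one True/False value per row, and indexing the DataFrame with that mask keeps only the rows marked True. A minimal sketch, assuming a small made-up `sample` DataFrame (hypothetical values, with column names borrowed from the dataset):

```python
import pandas as pd

# Hypothetical sample rows (made-up values, column names borrowed from the dataset)
sample = pd.DataFrame({
    "YEAR": [2019, 2020, 2020],
    "MONTH": [12, 1, 11],
    "TMAX": [35.0, 40.0, 46.0],
})

mask = sample["YEAR"] == 2020  # boolean Series: one True/False per row
subset = sample[mask]          # keeps only the rows where the mask is True

print(mask.tolist())  # [False, True, True]
print(subset)
```

Note that the filtered rows keep their original index labels, which is why the 2020 rows in the real data start at 7274 and not at 0.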
data[(data.YEAR==2020) & (data.MONTH==11)].head()
|  | DATE | DAY | MONTH | YEAR | PRCP | SNOW | TMAX | TMIN | WT_FOG | WT_THUNDER | WT_SLEET | WT_HAIL | WT_GLAZE | WT_HIGHWINDS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 7579 | 11/1/2020 | 1 | 11 | 2020 | 0.00 | 0.0 | 46.0 | 38 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 7580 | 11/2/2020 | 2 | 11 | 2020 | 0.19 | 0.0 | 50.0 | 32 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 7581 | 11/3/2020 | 3 | 11 | 2020 | 0.00 | 0.0 | 40.0 | 33 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 7582 | 11/4/2020 | 4 | 11 | 2020 | 0.00 | 0.0 | 55.0 | 33 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 7583 | 11/5/2020 | 5 | 11 | 2020 | 0.00 | 0.0 | 69.0 | 34 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
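When two conditions are combined as in the year-and-month filter above, each comparison needs its own parentheses, because Python's `&` operator binds more tightly than `==`. The sketch below uses a made-up `sample` DataFrame (hypothetical values) and also shows `DataFrame.query` as one alternative spelling of the same filter; it is offered as an option, not as the approach this notebook uses.

```python
import pandas as pd

# Hypothetical sample rows (made-up values, column names borrowed from the dataset)
sample = pd.DataFrame({
    "YEAR": [2019, 2020, 2020, 2020],
    "MONTH": [11, 1, 11, 11],
    "TMAX": [50.0, 40.0, 46.0, 55.0],
})

# Parenthesize each comparison, then combine the two masks element-wise with &
both = sample[(sample["YEAR"] == 2020) & (sample["MONTH"] == 11)]

# The same filter written as a query string
same = sample.query("YEAR == 2020 and MONTH == 11")

print(both)
print(same.equals(both))
```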
Filter by WT_HAIL or WT_HIGHWINDS#
data[(data.WT_HAIL==1) | (data.WT_HIGHWINDS==1)].head()
|  | DATE | DAY | MONTH | YEAR | PRCP | SNOW | TMAX | TMIN | WT_FOG | WT_THUNDER | WT_SLEET | WT_HAIL | WT_GLAZE | WT_HIGHWINDS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 96 | 4/6/2000 | 6 | 4 | 2000 | 0.02 | 0.0 | 47.0 | 30 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 |
| 135 | 6/15/2000 | 15 | 6 | 2000 | 0.14 | 0.0 | 72.0 | 63 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 1.0 |
| 182 | 8/1/2000 | 1 | 8 | 2000 | 0.00 | 0.0 | 85.0 | 69 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 |
| 315 | 12/12/2000 | 12 | 12 | 2000 | 0.13 | 0.0 | 42.0 | 29 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 375 | 2/10/2001 | 10 | 2 | 2001 | 0.02 | 0.0 | 59.0 | 31 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
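The `|` operator in the filter above keeps a row when at least one of the two conditions is true, so days with both hail and high winds are included as well. A minimal sketch on a made-up `sample` DataFrame (hypothetical values), with `.any(axis=1)` shown as one possible way to express the same idea when there are several flag columns:

```python
import pandas as pd

# Hypothetical sample rows (made-up flag values, column names borrowed from the dataset)
sample = pd.DataFrame({
    "WT_HAIL": [0.0, 1.0, 0.0, 1.0],
    "WT_HIGHWINDS": [0.0, 0.0, 1.0, 1.0],
})

# Element-wise OR: a row is kept if either flag equals 1
either = sample[(sample["WT_HAIL"] == 1) | (sample["WT_HIGHWINDS"] == 1)]

# Alternative: compare several flag columns at once, then ask whether
# any of them is True within each row (axis=1 means "across columns")
flags = ["WT_HAIL", "WT_HIGHWINDS"]
either_any = sample[(sample[flags] == 1).any(axis=1)]

print(either)
print(either_any.equals(either))
```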