Advanced StationPlots with Mesonet Data


Advanced Surface Observations: Working with Mesonet Data

Unidata Python Workshop



Questions

  1. How do I read in complicated mesonet data with Pandas?
  2. How do I merge multiple Pandas DataFrames?
  3. What's the best way to make a station plot of data?
  4. How can I make a time series of data from one station?

Objectives

  1. Read Mesonet data with Pandas
  2. Merge multiple Pandas DataFrames together
  3. Plot mesonet data with MetPy and CartoPy
  4. Create time series plots of station data

Reading Mesonet Data

In this notebook, we're going to use the Pandas library to read text-based data. Pandas is excellent at handling text, CSV, and other files, but sometimes you have to help it figure out how your data is formatted. Unluckily for you, mesonet data frequently comes in forms that are not the most user-friendly. Throughout this notebook, we'll see how these complicated datasets can be handled nicely by Pandas to create useful station plots for hand analysis or publication.
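Before diving into the real files, here's a minimal, self-contained sketch of the basic pattern — reading headerless, comma-delimited text with `pd.read_csv`. The data here is a toy stand-in, not the actual mesonet file:

```python
import io
import pandas as pd

# Toy stand-in for a headerless, comma-delimited mesonet file
text = io.StringIO("1,81,100,54\n1,81,105,54\n")

# header=None tells pandas the first row is data, not column names,
# so columns get default integer labels 0, 1, 2, ...
df = pd.read_csv(text, header=None)
print(df.shape)  # (2, 4)
```

The real files below need more help than this (date parsing, bad-line handling), but the same `read_csv` call is the starting point.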

In [1]:
# Import Pandas
import pandas as pd

West Texas Mesonet

The West Texas Mesonet is a wonderful data source for researchers and storm chasers alike! We have some 5-minute observations from the entire network on 22 March 2019 that we'll analyze in this notebook.

Pandas can parse time into a nice internal storage format as we read in the file. If the time in the file is specified in a reasonably standard form, pandas will even guess at the format if you tell it which column to use. In this case, however, the time is reported in an awkward format: between one and four characters that, when there are four characters, represent hours and minutes as HHMM. Let's take a character string, turn it into an integer, and then use integer string formatting to write out a four-character string.

In [2]:
for t in ['0', '05', '100', '1005']:
    print('{0:04d}'.format(int(t)))
0000
0005
0100
1005

Pandas can be told how to parse non-standard date formats by writing an arbitrary function that takes a string and returns a datetime. Here's what that function looks like in this case. We can use timedelta to convert hours and minutes, and then add them to the start date using date math.

In [3]:
def parse_tx_date(v, start_date=None):
    s = '{0:04d}'.format(int(v)) # regularize the data to a four character string
    hour = pd.to_timedelta(int(s[0:2]), 'hour') 
    minute = pd.to_timedelta(int(s[2:4]), 'minute')
    return start_date + hour + minute
In [4]:
# Read in the data and handle the lines that cause issues

# Get a nice date variable corresponding to the start time
from datetime import datetime
start_date = datetime.strptime('2019-03-22', '%Y-%m-%d')
print(start_date)

# Pre-apply the start date to our date parsing function, so that pandas only passes one value
from functools import partial
date_parser = partial(parse_tx_date, start_date=start_date)

filename = 'West_Texas_data/FIVEMIN_82.txt'
tx_data = pd.read_csv(filename, delimiter=',', header=None, error_bad_lines=False, warn_bad_lines=False,
                     parse_dates=[2], date_parser=date_parser
                     )
tx_data
2019-03-22 00:00:00
Out[4]:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
0 1 81 2019-03-22 00:00:00 54 2.337 2.334 84.90 2.864 0.199 2.842 11.810 12.950 12.030 55.080 350.80 0.00 2.855 1.195 0.0
1 2 81 2019-03-22 00:00:00 54 15.050 15.180 14.45 15.480 15.180 0.272 0.380 0.395 0.262 0.000 13.26 2812.00 NaN NaN NaN
2 1 81 2019-03-22 00:00:00 32 1.780 1.774 92.80 4.638 0.181 2.058 6.974 12.090 7.400 71.800 339.50 0.00 2.025 0.786 0.0
3 2 81 2019-03-22 00:00:00 32 11.840 12.540 12.23 11.520 12.860 0.239 0.233 0.165 -0.073 7.650 12.36 2570.00 NaN NaN NaN
4 2 81 2019-03-22 00:00:00 42 9.700 11.140 9.62 9.020 9.790 0.407 0.700 0.094 0.033 1.218 12.49 659.60 NaN NaN NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
22938 1 81 2019-03-22 23:55:00 23 9.340 9.280 269.20 6.259 1.323 13.950 13.550 14.300 13.540 68.380 302.40 0.01 7.450 7.050 0.0
22939 1 81 2019-03-22 23:56:00 60 5.947 5.898 267.00 7.330 0.949 8.560 13.300 13.650 13.460 13.880 229.60 0.00 -13.660 4.250 0.0
22940 1 81 2019-03-22 23:57:00 60 8.720 8.650 277.30 7.250 0.738 10.060 13.270 13.790 13.510 13.730 229.50 0.00 -13.810 6.325 0.0
22941 1 81 2019-03-22 23:58:00 60 9.530 9.440 275.10 8.220 1.509 12.670 13.270 13.730 13.420 14.020 229.60 0.00 -13.560 6.112 0.0
22942 1 81 2019-03-22 23:59:00 60 9.520 9.440 265.30 7.660 1.217 11.370 13.250 13.910 13.570 13.680 229.50 0.00 -13.870 6.675 0.0

22943 rows × 19 columns

In [5]:
# Rename columns to be understandable
tx_data.columns = ['Array_ID', 'QC_flag', 'Time', 'Station_ID', '10m_scalar_wind_speed',
                 '10m_vector_wind_speed', '10m_wind_direction',
                 '10m_wind_direction_std', '10m_wind_speed_std', 
                 '10m_gust_wind_speed', '1.5m_temperature', 
                 '9m_temperature', '2m_temperature', 
                 '1.5m_relative_humidity', 'station_pressure', 'rainfall', 
                 'dewpoint', '2m_wind_speed', 'solar_radiation']
tx_data
Out[5]:
Array_ID QC_flag Time Station_ID 10m_scalar_wind_speed 10m_vector_wind_speed 10m_wind_direction 10m_wind_direction_std 10m_wind_speed_std 10m_gust_wind_speed 1.5m_temperature 9m_temperature 2m_temperature 1.5m_relative_humidity station_pressure rainfall dewpoint 2m_wind_speed solar_radiation
0 1 81 2019-03-22 00:00:00 54 2.337 2.334 84.90 2.864 0.199 2.842 11.810 12.950 12.030 55.080 350.80 0.00 2.855 1.195 0.0
1 2 81 2019-03-22 00:00:00 54 15.050 15.180 14.45 15.480 15.180 0.272 0.380 0.395 0.262 0.000 13.26 2812.00 NaN NaN NaN
2 1 81 2019-03-22 00:00:00 32 1.780 1.774 92.80 4.638 0.181 2.058 6.974 12.090 7.400 71.800 339.50 0.00 2.025 0.786 0.0
3 2 81 2019-03-22 00:00:00 32 11.840 12.540 12.23 11.520 12.860 0.239 0.233 0.165 -0.073 7.650 12.36 2570.00 NaN NaN NaN
4 2 81 2019-03-22 00:00:00 42 9.700 11.140 9.62 9.020 9.790 0.407 0.700 0.094 0.033 1.218 12.49 659.60 NaN NaN NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
22938 1 81 2019-03-22 23:55:00 23 9.340 9.280 269.20 6.259 1.323 13.950 13.550 14.300 13.540 68.380 302.40 0.01 7.450 7.050 0.0
22939 1 81 2019-03-22 23:56:00 60 5.947 5.898 267.00 7.330 0.949 8.560 13.300 13.650 13.460 13.880 229.60 0.00 -13.660 4.250 0.0
22940 1 81 2019-03-22 23:57:00 60 8.720 8.650 277.30 7.250 0.738 10.060 13.270 13.790 13.510 13.730 229.50 0.00 -13.810 6.325 0.0
22941 1 81 2019-03-22 23:58:00 60 9.530 9.440 275.10 8.220 1.509 12.670 13.270 13.730 13.420 14.020 229.60 0.00 -13.560 6.112 0.0
22942 1 81 2019-03-22 23:59:00 60 9.520 9.440 265.30 7.660 1.217 11.370 13.250 13.910 13.570 13.680 229.50 0.00 -13.870 6.675 0.0

22943 rows × 19 columns

The West Texas Mesonet provides data on weather, agriculture, and radiation. These observation types are encoded as 1, 2, and 3, respectively, in the Array_ID column. Let's parse out only the meteorological data for this exercise.

In [6]:
# Remove non-meteorological rows (copy so later edits don't trigger SettingWithCopyWarning)
tx_data = tx_data[tx_data['Array_ID'] == 1].copy()
tx_data
Out[6]:
Array_ID QC_flag Time Station_ID 10m_scalar_wind_speed 10m_vector_wind_speed 10m_wind_direction 10m_wind_direction_std 10m_wind_speed_std 10m_gust_wind_speed 1.5m_temperature 9m_temperature 2m_temperature 1.5m_relative_humidity station_pressure rainfall dewpoint 2m_wind_speed solar_radiation
0 1 81 2019-03-22 00:00:00 54 2.337 2.334 84.9 2.864 0.199 2.842 11.810 12.95 12.03 55.08 350.8 0.00 2.855 1.195 0.000
2 1 81 2019-03-22 00:00:00 32 1.780 1.774 92.8 4.638 0.181 2.058 6.974 12.09 7.40 71.80 339.5 0.00 2.025 0.786 0.000
5 1 81 2019-03-22 00:00:00 7 3.103 3.030 189.6 12.440 0.150 3.365 10.680 12.65 11.10 55.47 328.5 0.00 1.927 2.000 0.000
7 1 81 2019-03-22 00:00:00 39 2.423 2.422 118.3 1.353 0.097 2.744 11.740 13.91 12.35 48.61 349.1 0.00 1.091 1.047 0.000
9 1 81 2019-03-22 00:00:00 56 3.464 3.451 124.2 4.929 0.292 4.279 13.950 14.99 14.06 49.66 363.2 0.00 3.354 2.530 0.013
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
22938 1 81 2019-03-22 23:55:00 23 9.340 9.280 269.2 6.259 1.323 13.950 13.550 14.30 13.54 68.38 302.4 0.01 7.450 7.050 0.000
22939 1 81 2019-03-22 23:56:00 60 5.947 5.898 267.0 7.330 0.949 8.560 13.300 13.65 13.46 13.88 229.6 0.00 -13.660 4.250 0.000
22940 1 81 2019-03-22 23:57:00 60 8.720 8.650 277.3 7.250 0.738 10.060 13.270 13.79 13.51 13.73 229.5 0.00 -13.810 6.325 0.000
22941 1 81 2019-03-22 23:58:00 60 9.530 9.440 275.1 8.220 1.509 12.670 13.270 13.73 13.42 14.02 229.6 0.00 -13.560 6.112 0.000
22942 1 81 2019-03-22 23:59:00 60 9.520 9.440 265.3 7.660 1.217 11.370 13.250 13.91 13.57 13.68 229.5 0.00 -13.870 6.675 0.000

17279 rows × 19 columns

Station pressure is 600 hPa lower than it should be, so let's correct that as well!

In [7]:
# Correct pressure
tx_data['station_pressure'] += 600
tx_data['station_pressure']
Out[7]:
0        950.8
2        939.5
5        928.5
7        949.1
9        963.2
         ...  
22938    902.4
22939    829.6
22940    829.5
22941    829.6
22942    829.5
Name: station_pressure, Length: 17279, dtype: float64

Finally, let's read in the station metadata file for the West Texas mesonet, so that we can have coordinates to plot data later on.

In [8]:
tx_stations = pd.read_csv('WestTexas_stations.csv')
tx_stations
Out[8]:
Location Area Lat Long Elevation ID 4 Letter ID LDM ID Node Logger ID SHEF ID
0 5ENE Abernathy Abernathy/Hale County 33.87538 -101.75718 3333 ft. ABER KARS XARS 1365 2 ARST2
1 6WSW Adrian Adrian/Oldham County 35.25347 -102.76158 4260 ft. ADRI KAD1 XAD1 1526 109 ADXT2
2 3WSW Aiken Aiken/Hale County 34.13305 -101.56952 3321 ft. AIKE KAI1 XAI1 1496 79 AIKT2
3 7SSE Amarillo Amarillo/Randall County 35.11270 -101.79964 3611 ft. AMAS KAM1 XAM1 1492 75 ASOT2
4 9NNE Amarillo Amarillo/Potter County 35.33597 -101.80627 3346 ft. AMAN KAM2 XAM2 1495 78 AMNT2
... ... ... ... ... ... ... ... ... ... ... ...
121 2E Wall Wall/Tom Green County 31.37882 -100.26628 1870 ft WALL2 KWA1 XWA1 1421 57 WGST2
122 2NNE Weinert Weinert/Haskell County 33.34468 -99.66590 1504 ft WEIN KWE1 XWE1 1490 73 EIST2
123 Welch Welch/Dawson County 32.92570 -102.13232 3121 ft WELC KWE2 XWE2 1512 96 WEHT2
124 6NW White River Lake White River Lake/Crosby County 33.52533 -101.16506 2704 ft. WHIT KWVS XWVS 1390 27 WLST2
125 6SSW Wolfforth Wolfforth/Lubbock County 33.42068 -102.04983 3307 ft. WOLF KWOS XWOS 1411 47 WOST2

126 rows × 11 columns

Oklahoma Data

Try reading in the Oklahoma Mesonet data located in the 201903222300.mdf file using Pandas. Check out the documentation on Pandas if you run into issues! Make sure to handle missing values as well. Also read in the Oklahoma station data from the Oklahoma_stations.csv file. Only read in the station ID, latitude, and longitude columns from that file.

In [9]:
# Your code here
In [10]:
def parse_ok_date(v, start_date=None):
    s = '{0:04d}'.format(int(v)) # regularize the data to a four character string
    minute = pd.to_timedelta(int(s), 'minute')
    return start_date + minute
In [11]:
# %load solutions/read_ok.py


# Cell content replaced by load magic replacement.
ok_data = pd.read_csv('201903222300.mdf', skiprows=2, delim_whitespace=True, na_values=-999,
                                          parse_dates=[2], date_parser=partial(parse_ok_date, start_date=start_date))
ok_stations = pd.read_csv('Oklahoma_stations.csv', usecols=[1,7,8])
print(ok_data.head())
print(ok_stations.head())
   STID  STNM                TIME  RELH  TAIR  WSPD  WVEC  WDIR  WDSD  WSSD  \
0  ACME   110 2019-03-22 23:00:00    31  21.3   7.3   7.2   131   8.7   1.0   
1  ADAX     1 2019-03-22 23:00:00    30  20.9   3.3   3.2   136  17.4   0.8   
2  ALTU     2 2019-03-22 23:00:00    54  20.9   7.9   7.9   151   6.1   1.2   
3  ALV2   116 2019-03-22 23:00:00    35  20.4   8.0   7.9   114   9.1   1.2   
4  ANT2   135 2019-03-22 23:00:00    27  22.0   2.6   2.5   137  19.5   0.7   

   ...  TA9M  WS2M  TS10  TB10  TS05  TS25   TS60  TR05  TR25    TR60  
0  ...  21.3   6.0  15.1  17.2  15.9  11.6   10.8  1.59  1.49    1.41  
1  ...  20.6   2.3  14.9  19.5  16.6  13.1 -998.0  1.51  1.46 -998.00  
2  ...  20.9   6.0  12.9  15.4  15.0  11.6 -998.0  2.25  2.21 -998.00  
3  ...  19.8   6.7  10.3  15.7  11.5   9.0 -998.0  1.50  1.39 -998.00  
4  ...  21.6   1.9  14.4  19.1  16.4  12.5   12.0  1.67  1.38    1.33  

[5 rows x 24 columns]
   stid      nlat      elon
0  ACME  34.80833 -98.02325
1  ADAX  34.79851 -96.66909
2  ALTU  34.58722 -99.33808
3  ALV2  36.70823 -98.70974
4  ALVA  36.77970 -98.67170

Merging DataFrames

We now have two data files per mesonet - one for the data itself and one for the metadata. It would be really nice to combine these DataFrames into one per mesonet. Pandas has some built-in methods to do this - see the pandas documentation on merging and joining DataFrames. For this example, we'll be using the merge method. First, let's rename columns in the Oklahoma station DataFrame to be more understandable.

In [12]:
# Rename columns so merging can occur
ok_stations.columns = ['STID', 'LAT', 'LON']

Conveniently, we have a STID column in both DataFrames. Let's base our merge on that and see what we get!
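The idea in miniature — a sketch with tiny, made-up versions of the data and metadata tables (the station values here are hypothetical):

```python
import pandas as pd

# Hypothetical miniature versions of the observation and metadata tables
data = pd.DataFrame({'STID': ['ACME', 'ADAX'], 'TAIR': [21.3, 20.9]})
stations = pd.DataFrame({'STID': ['ADAX', 'ACME'], 'LAT': [34.80, 34.79]})

# Rows are matched on the shared STID column, regardless of row order
merged = pd.merge(data, stations, on='STID')
print(merged.columns.tolist())  # ['STID', 'TAIR', 'LAT']
```

Each observation row picks up the latitude of its matching station, which is exactly what we need for plotting.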

In [13]:
# Merge the two data frames based on the Station ID
ok_data = pd.merge(ok_data, ok_stations, on='STID')
ok_data
Out[13]:
STID STNM TIME RELH TAIR WSPD WVEC WDIR WDSD WSSD ... TS10 TB10 TS05 TS25 TS60 TR05 TR25 TR60 LAT LON
0 ACME 110 2019-03-22 23:00:00 31 21.3 7.3 7.2 131 8.7 1.0 ... 15.1 17.2 15.9 11.6 10.8 1.59 1.49 1.41 34.80833 -98.02325
1 ADAX 1 2019-03-22 23:00:00 30 20.9 3.3 3.2 136 17.4 0.8 ... 14.9 19.5 16.6 13.1 -998.0 1.51 1.46 -998.00 34.79851 -96.66909
2 ALTU 2 2019-03-22 23:00:00 54 20.9 7.9 7.9 151 6.1 1.2 ... 12.9 15.4 15.0 11.6 -998.0 2.25 2.21 -998.00 34.58722 -99.33808
3 ALV2 116 2019-03-22 23:00:00 35 20.4 8.0 7.9 114 9.1 1.2 ... 10.3 15.7 11.5 9.0 -998.0 1.50 1.39 -998.00 36.70823 -98.70974
4 ANT2 135 2019-03-22 23:00:00 27 22.0 2.6 2.5 137 19.5 0.7 ... 14.4 19.1 16.4 12.5 12.0 1.67 1.38 1.33 34.24967 -95.66844
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
115 WILB 105 2019-03-22 23:00:00 26 20.8 3.3 3.3 69 7.0 0.5 ... 13.1 16.0 15.8 11.9 11.3 1.48 1.44 1.47 34.90092 -95.34805
116 WIST 106 2019-03-22 23:00:00 18 21.1 1.5 1.4 80 21.7 0.3 ... 11.8 17.0 13.0 10.4 10.0 1.99 1.51 1.41 34.98426 -94.68778
117 WOOD 107 2019-03-22 23:00:00 27 20.1 8.4 8.3 135 8.4 1.1 ... 11.3 15.0 12.3 10.1 9.5 1.54 1.56 1.79 36.42329 -99.41682
118 WYNO 108 2019-03-22 23:00:00 26 19.0 3.6 3.5 122 11.8 0.5 ... 13.7 17.0 14.9 9.9 -998.0 1.85 1.60 -998.00 36.51806 -96.34222
119 YUKO 142 2019-03-22 23:00:00 28 21.0 4.8 4.7 146 11.3 1.0 ... 12.9 NaN 12.8 11.6 9.9 1.55 1.41 1.40 35.55671 -97.75538

120 rows × 26 columns

That was nice! But what if our DataFrames don't have the same column name, and we want to avoid renaming columns? Check out the documentation for pd.merge and see how we can merge the West Texas DataFrames together. Also, subset the data to only be from 2300 UTC, which is when our Oklahoma data was taken. Call the new DataFrame tx_one_time.
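As a hint, here's the general pattern for merging on differently-named key columns, sketched with toy frames (the column names mirror the Texas tables, but the values are made up):

```python
import pandas as pd

# Toy frames whose key columns have different names
obs = pd.DataFrame({'Station_ID': [54, 32], 'temp': [12.0, 7.4]})
meta = pd.DataFrame({'Logger ID': [32, 54], 'Lat': [33.48, 31.54]})

# left_on/right_on pair up key columns that don't share a name
merged = pd.merge(obs, meta, left_on='Station_ID', right_on='Logger ID')
print(len(merged))  # 2
```

Note that both key columns survive in the result; you can drop the redundant one afterward if you like.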

In [14]:
# Your code here
In [15]:
# %load solutions/merge_texas.py


# Cell content replaced by load magic replacement.
# Find common time between TX and OK data
tx_data = pd.merge(tx_data, tx_stations, left_on='Station_ID', right_on='Logger ID')
tx_one_time = tx_data[tx_data['Time'] == '2019-3-22 23:00']
tx_one_time
Out[15]:
Array_ID QC_flag Time Station_ID 10m_scalar_wind_speed 10m_vector_wind_speed 10m_wind_direction 10m_wind_direction_std 10m_wind_speed_std 10m_gust_wind_speed ... Area Lat Long Elevation ID 4 Letter ID LDM ID Node Logger ID SHEF ID
276 1 81 2019-03-22 23:00:00 54 3.201 3.191 124.60 4.494 0.274 3.757 ... San Angelo/Tom Green 31.54263 -100.51328 1957 ft. SASU KASU XASU 1418 54 SJST2
564 1 81 2019-03-22 23:00:00 32 2.363 1.928 283.10 34.770 0.708 3.430 ... Spur/Dickens County 33.48085 -100.87636 2287 ft. SPUR KSP1 XSPR 1396 32 SPST2
852 1 81 2019-03-22 23:00:00 7 11.410 11.310 156.30 7.390 1.815 16.040 ... Roaring S./Motley County 33.93635 -100.84540 2615 ft. ROAR KRRS XRRS 1372 7 RRST2
1140 1 81 2019-03-22 23:00:00 39 2.422 2.392 164.40 9.020 0.535 4.279 ... Jayton (Kent Co. Airport) 33.23241 -100.56778 2010 ft. JAYT KJTS XJTS 1404 39 JTST2
1428 1 81 2019-03-22 23:00:00 56 4.857 4.793 165.20 9.300 0.821 6.664 ... Haskell/Haskell County 33.17473 -99.74420 1605 ft. HASK KHA1 XHA1 1420 56 HAXT2
1716 1 81 2019-03-22 23:00:00 59 6.825 6.811 167.30 3.654 0.781 8.720 ... Knox City/Knox County 33.44557 -99.86497 1470 ft. KNOX KKN1 XKN1 1423 59 KXST2
2004 1 81 2019-03-22 23:00:00 24 5.571 5.456 192.70 11.620 1.399 9.280 ... Post/Garza County 33.20033 -101.36804 2598 ft. POST2 KPT1 XPTS 1387 24 PTST2
2292 1 81 2019-03-22 23:00:00 20 3.585 3.581 320.80 2.779 0.252 4.214 ... Friona/Parmer County 34.65450 -102.69097 4005 ft. FRIO KFAS XFAS 1383 20 FAST2
2580 1 81 2019-03-22 23:00:00 3 5.826 5.721 193.90 10.870 0.996 8.620 ... Plainview/Hale County 34.17872 -101.70788 3358 ft. PLVW KPVS XPVS 1367 3 PVST2
2868 1 81 2019-03-22 23:00:00 17 5.328 5.305 175.90 5.349 0.508 6.893 ... Plains/Yoakum County 33.22814 -102.83936 3711 ft. PLAI KPPS XPPS 1380 17 PPST2
3156 1 81 2019-03-22 23:00:00 30 6.357 6.345 173.20 3.463 0.417 7.450 ... Anton/Hockley County 33.72525 -102.19082 3405 ft. ANTO KAOS XAOS 1393 30 AOST2
3444 1 81 2019-03-22 23:00:00 40 3.763 3.723 182.60 8.360 0.582 5.357 ... Pampa/Gray County 35.53950 -100.92772 3216 ft. PAMP KPMS XPMS 1405 40 PAMT2
3732 1 81 2019-03-22 23:00:00 15 4.048 4.032 193.70 4.998 0.311 4.900 ... Seagraves/Gaines County 32.93644 -102.57442 3360 ft. SEAG KSGV XSGV 1392 15 SGST2
4020 1 81 2019-03-22 23:00:00 12 5.309 5.186 222.80 12.330 0.901 7.770 ... Morton/Cochran County 33.73476 -102.73978 3754 ft. MORT KMNS XMNS 1375 12 MNST2
4308 1 81 2019-03-22 23:00:00 10 5.711 5.679 187.00 6.086 0.616 7.710 ... Brownfield/Terry County 33.15188 -102.27102 3314 ft. BROW KBWS XBWS 17 10 BWST2
4596 1 81 2019-03-22 23:00:00 9 4.280 4.271 181.70 3.595 0.383 5.880 ... Ralls/Crosby County 33.66840 -101.37576 3097 ft. RALL KRLS XRLS 14 9 RAST2
4884 1 81 2019-03-22 23:00:00 19 6.002 5.938 218.30 8.390 0.698 7.640 ... Hart/Castro County 34.42319 -102.10735 3694 ft. HART KHRS XHRS 20 19 HAST2
5172 1 81 2019-03-22 23:00:00 1 4.511 4.499 184.60 4.039 0.333 5.259 ... Reese Center/Lubbock County 33.60759 -102.04597 3343 ft. REES2 KREE XREE 1364 1 REST2
5460 1 81 2019-03-22 23:00:00 130 6.697 6.667 192.10 5.401 0.681 8.200 ... Olton/Lamb County 34.09378 -102.11808 3566 ft. OLTO KONS XONS 19 130 ONST2
5748 1 81 2019-03-22 23:00:00 21 5.958 5.929 204.70 5.675 0.715 7.580 ... Dimmitt/Castro County 34.56751 -102.29317 3876 ft. DIMM KDMS XDMS 21 21 DMST2
6035 1 81 2019-03-22 23:00:00 37 5.043 5.024 164.10 5.022 0.432 6.141 ... Snyder/Scurry County 32.71614 -100.86167 2431 ft. SNYD KSYS XSYS 1401 37 SYST2
6323 1 81 2019-03-22 23:00:00 18 5.015 4.987 169.20 6.077 0.461 6.272 ... Lamesa/Dawson County 32.70592 -101.93617 2956 ft. LAMS KLES XLES 1381 18 LMST2
6611 1 81 2019-03-22 23:00:00 31 4.942 4.931 174.90 3.871 0.389 5.782 ... Northeast Borden County 32.89903 -101.20189 2705 ft. FLUV KFVS XFVS 32 31 FVST2
6899 1 81 2019-03-22 23:00:00 25 4.725 4.711 179.30 4.324 0.544 5.847 ... Seminole/Gaines County 32.74075 -102.63581 3313 ft. SEMI KSM1 XSMS 1388 25 SMST2
7187 1 81 2019-03-22 23:00:00 33 5.804 5.728 171.20 9.230 0.917 8.260 ... Lubbock/Lubbock County 33.60408 -101.89919 3232 ft. LBBW2 KLBW XLBW 1397 33 LWST2
7475 1 81 2019-03-22 23:00:00 16 4.248 4.209 233.70 7.720 0.464 5.325 ... Amherst/Lamb County 34.02178 -102.40453 3647 ft. AMHE KAMH XAMH 15 16 ATST2
7763 1 81 2019-03-22 23:00:00 5 5.211 5.206 184.30 2.629 0.509 6.762 ... Slaton/Lubbock County 33.45690 -101.61723 3065 ft. SLAT KSLS XSLS 1368 5 SLST2
8051 1 81 2019-03-22 23:00:00 6 5.575 5.541 185.70 6.277 0.523 7.090 ... Levelland/Hockley County 33.52650 -102.36000 3496 ft. LEVE KLDS XLDS 16 6 LDST2
8339 1 81 2019-03-22 23:00:00 41 6.002 5.949 151.70 7.630 0.886 8.260 ... Aspermont (Stonewall Co. Airp.) 33.16789 -100.19602 1740 ft ASPE KASR XASR 1402 41 ASST2
8627 1 81 2019-03-22 23:00:00 26 2.817 2.811 161.00 3.893 0.233 3.528 ... Gail/Borden County 32.75508 -101.41439 2547 ft. GAIL KGGS XGGS 1389 26 GGST2
8915 1 81 2019-03-22 23:00:00 49 5.988 5.957 151.00 5.881 0.583 7.220 ... Turkey/Hall County 34.37896 -100.93175 2450 ft. VALL KTUR XTUR 1413 49 TUST2
9203 1 81 2019-03-22 23:00:00 2 7.670 7.650 172.30 3.565 0.772 9.670 ... Abernathy/Hale County 33.87538 -101.75718 3333 ft. ABER KARS XARS 1365 2 ARST2
9491 1 81 2019-03-22 23:00:00 27 6.712 6.630 188.20 8.930 0.669 8.100 ... White River Lake/Crosby County 33.52533 -101.16506 2704 ft. WHIT KWVS XWVS 1390 27 WLST2
9779 1 81 2019-03-22 23:00:00 51 6.559 6.550 252.20 3.006 0.328 7.190 ... Tatum, NM/Lea County 33.23877 -103.35211 4018 ft TATU KTAT XTAT 1415 51 TTSN5
10067 1 81 2019-03-22 23:00:00 11 3.627 3.623 249.70 2.558 0.118 3.920 ... Muleshoe/Bailey County 34.20635 -102.74240 3806 ft. MULE KMUS XMUS 1373 11 MUST2
10355 1 81 2019-03-22 23:00:00 34 4.498 4.474 167.20 5.883 0.545 5.684 ... Guthrie/King County 33.56703 -100.48061 1998 ft. PITC KPFS XPFS 1398 34 PFST2
10643 1 81 2019-03-22 23:00:00 22 7.300 7.260 211.40 5.911 1.009 9.540 ... Tulia/Swisher County 34.54294 -101.74050 3478 ft. TULI KTIS XTIS 1385 22 TLST2
10931 1 81 2019-03-22 23:00:00 47 6.985 6.938 185.00 6.657 0.766 9.280 ... Wolfforth/Lubbock County 33.42068 -102.04983 3307 ft. WOLF KWOS XWOS 1411 47 WOST2
11219 1 81 2019-03-22 23:00:00 52 10.420 10.310 166.10 8.430 1.413 13.980 ... Northfield/Motley County 34.27303 -100.60444 2088 ft. NORT KNOR XNOR 1416 52 NORT2
11507 1 81 2019-03-22 23:00:00 36 5.693 5.661 192.90 6.149 0.667 7.420 ... Paducah/Cottle County 33.89053 -100.39886 2021 ft. PADU KPAD XPAD 1400 36 PADT2
11795 1 81 2019-03-22 23:00:00 58 4.608 4.580 144.20 6.264 0.351 5.521 ... Seymour/Baylor County 33.63233 -99.29098 1302 ft SEYM KSE1 XSE1 1422 58 SEST2
12083 1 81 2019-03-22 23:00:00 14 4.655 4.606 171.30 8.340 0.740 6.403 ... O'Donnell/Lynn County 32.97988 -101.83220 3054 ft. ODON2 KOES XOES 18 14 OEST2
12371 1 81 2019-03-22 23:00:00 55 7.660 7.630 170.30 4.870 0.856 10.090 ... St. Lawrence/Glasscock County 31.65645 -101.60019 2693 ft. STLW KST1 XST1 1419 55 GCMT2
13763 1 81 2019-03-22 23:00:00 60 7.250 7.210 259.50 6.131 0.802 8.530 ... Guadalupe Mtns NP/Culberson 31.89132 -104.80997 5571 ft. GUMO KGU1 XGU1 1424 60 PSST2
14099 1 81 2019-03-22 23:00:00 43 6.963 6.920 163.10 6.384 1.027 9.470 ... Western Hardeman County 34.34389 -99.93972 1644 ft. GOOD KGDS XGDS 1407 43 GDST2
14387 1 81 2019-03-22 23:00:00 53 1.759 1.736 281.10 9.440 0.262 2.287 ... Dora NM/Roosevelt County 33.92005 -103.35778 4340 ft. DORA KDR1 XDR1 1417 53 DORN5
14675 1 81 2019-03-22 23:00:00 44 5.319 5.294 186.80 5.570 0.587 6.795 ... McLean/Gray County 35.23719 -100.57492 2863 ft. MCLE KMCS XMCS 1408 44 MCST2
14963 1 81 2019-03-22 23:00:00 29 4.830 4.797 189.10 6.698 0.614 6.337 ... Southeast Cochran County 33.38912 -102.60994 3625 ft. MALL KSDS XSDS 1395 29 SDST2
15251 1 81 2019-03-22 23:00:00 35 1.890 1.850 114.60 11.690 1.011 3.561 ... Clarendon/Donley County 34.92492 -100.93000 2836 ft. CLAR KCES XCES 1399 35 CEST2
15539 1 81 2019-03-22 23:00:00 23 6.925 6.868 198.50 7.330 0.862 8.590 ... Silverton/Briscoe County 34.44540 -101.19050 3202 ft. SILV KSVS XSVS 1386 23 STST2
15827 1 81 2019-03-22 23:00:00 50 8.820 8.780 173.40 5.479 1.404 12.580 ... Childress/Childress County 34.45650 -100.19891 1943 ft. CHIL KCXS XCXS 1414 50 CXST2
16115 1 81 2019-03-22 23:00:00 8 8.150 8.100 175.60 6.315 1.177 11.430 ... Floydada/Floyd County 34.00158 -101.32588 3179 ft. FLOY KFLS XFLS 1371 8 FLST2
16403 1 81 2019-03-22 23:00:00 28 8.870 8.760 181.70 9.140 1.371 13.030 ... Graham/Garza County 33.08152 -101.51615 2870 ft MACY KGHS XGHS 22 28 GHST2
16691 1 81 2019-03-22 23:00:00 38 4.342 4.260 9.31 11.140 0.911 6.109 ... Memphis/Hall County 34.73136 -100.52543 2057 ft. MEMP KMES XMES 1409 38 MEST2
16979 1 81 2019-03-22 23:00:00 45 4.517 4.480 183.40 7.360 0.430 5.325 ... Denver City/Yoakum County 32.99082 -102.93871 3652 ft. DENV KDVS XDVS 1403 45 DVST2
17267 1 81 2019-03-22 23:00:00 48 5.423 5.367 167.90 8.210 0.905 8.130 ... Andrews/Andrews County 32.32008 -102.51669 3169 ft. ANDR KANS XANS 1412 48 AWST2

56 rows × 30 columns

Creating a Station Plot

Let's say we want to plot temperature, dewpoint, and wind barbs. Given our data from the two mesonets, do we have what we need? If not, use MetPy to calculate what you need!

In [16]:
import metpy.calc as mpcalc
from metpy.units import units

# Your code here
In [17]:
# %load solutions/data_conversion.py


# Cell content replaced by load magic replacement.
ok_dewpoint = mpcalc.dewpoint_from_relative_humidity(ok_data['TAIR'].values * units.degC, ok_data['RELH'].values * units.percent)
ok_u, ok_v = mpcalc.wind_components(ok_data['WSPD'].values * units.mph, ok_data['WDIR'].values * units.degrees)
tx_u, tx_v = mpcalc.wind_components(tx_one_time['10m_scalar_wind_speed'].values * units.mph, tx_one_time['10m_wind_direction'].values * units.degrees)

Now, let's make a Station Plot with our data using MetPy and CartoPy.

In [18]:
from metpy.plots import StationPlot
import cartopy.crs as ccrs
import cartopy.feature as cfeature
import matplotlib.pyplot as plt
In [19]:
# Set up a plot with map features
fig = plt.figure(figsize=(12, 12))
proj = ccrs.Stereographic(central_longitude=-100, central_latitude=35)
ax = fig.add_subplot(1, 1, 1, projection=proj)
ax.add_feature(cfeature.STATES.with_scale('50m'), edgecolor='black')
ax.gridlines()


# Create a station plot pointing to an Axes to draw on as well as the location of points
stationplot = StationPlot(ax, ok_data['LON'].values, ok_data['LAT'].values, transform=ccrs.PlateCarree(),
                          fontsize=10)
stationplot.plot_parameter('NW', ok_data['TAIR'], color='red')
stationplot.plot_parameter('SW', ok_dewpoint, color='green')
stationplot.plot_barb(ok_u, ok_v)

# Texas Data
stationplot = StationPlot(ax, tx_one_time['Long'].values, tx_one_time['Lat'].values, transform=ccrs.PlateCarree(),
                          fontsize=10)
stationplot.plot_parameter('NW', tx_one_time['2m_temperature'], color='red')
stationplot.plot_parameter('SW', tx_one_time['dewpoint'], color='green')
stationplot.plot_barb(tx_u, tx_v)

This is an informative plot, but it is rather crowded. Using MetPy's reduce_point_density function, try cleaning up this plot into something that would be presentable/publishable. This function returns a mask, which you'll apply to all arrays in the plotting commands to filter down the data.
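The key step is that the boolean mask returned by reduce_point_density must index every array you pass to the plotting calls. A toy sketch with a hand-made mask standing in for the real one (no MetPy needed, values made up):

```python
import numpy as np

# Toy station coordinates and temperatures
lons = np.array([-100.1, -100.2, -99.5, -98.9])
temps = np.array([21.3, 20.9, 19.5, 22.0])

# Stand-in for the mask reduce_point_density would return:
# True keeps a station, False drops it as too close to a kept one
mask = np.array([True, False, True, True])

# The same mask must be applied consistently to every plotted array
print(lons[mask])   # [-100.1  -99.5  -98.9]
print(temps[mask])  # [21.3 19.5 22. ]
```

If you mask the coordinates but forget to mask a parameter array, the lengths won't line up and the station plot will fail.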

In [20]:
# Oklahoma
xy = proj.transform_points(ccrs.PlateCarree(), ok_data['LON'].values, ok_data['LAT'].values)
# Reduce point density so that there's only one point within a 50km circle
ok_mask = mpcalc.reduce_point_density(xy, 50000)

# Texas

# Your code here

# Plot

# Your code here
In [21]:
# %load solutions/reduce_and_plot.py


# Cell content replaced by load magic replacement.
xy = proj.transform_points(ccrs.PlateCarree(), tx_one_time['Long'].values, tx_one_time['Lat'].values)
tx_mask = mpcalc.reduce_point_density(xy, 50000)

#Plot

# Set up a plot with map features
fig = plt.figure(figsize=(12, 12))
proj = ccrs.Stereographic(central_longitude=-100, central_latitude=35)
ax = fig.add_subplot(1, 1, 1, projection=proj)
ax.add_feature(cfeature.STATES.with_scale('50m'), edgecolor='black')
ax.gridlines()

# Create a station plot pointing to an Axes to draw on as well as the location of points
stationplot = StationPlot(ax, ok_data['LON'].values[ok_mask], ok_data['LAT'].values[ok_mask], transform=ccrs.PlateCarree(),
                          fontsize=10)
stationplot.plot_parameter('NW', ok_data['TAIR'][ok_mask], color='red')
stationplot.plot_parameter('SW', ok_dewpoint[ok_mask], color='green')
stationplot.plot_barb(ok_u[ok_mask], ok_v[ok_mask])

# Texas Data
stationplot = StationPlot(ax, tx_one_time['Long'].values[tx_mask], tx_one_time['Lat'].values[tx_mask], transform=ccrs.PlateCarree(),
                          fontsize=10)
stationplot.plot_parameter('NW', tx_one_time['2m_temperature'][tx_mask], color='red')
stationplot.plot_parameter('SW', tx_one_time['dewpoint'][tx_mask], color='green')
stationplot.plot_barb(tx_u[tx_mask], tx_v[tx_mask])

Creating Time Series for Stations

What if we want to take data from all times from a single station to make a time series (or meteogram) plot? How can we easily do that with Pandas without having to aggregate the data by hand?
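The core trick is pandas' time-aware grouping: put times on the index, then resample. A minimal sketch with a toy 5-minute series (synthetic values, not station data):

```python
import pandas as pd

# Toy 5-minute series spanning two hours
times = pd.date_range('2019-03-22 06:00', periods=24, freq='5min')
series = pd.Series(range(24), index=times)

# Resample to hourly bins and average within each bin
hourly = series.resample('h').mean()
print(hourly)
```

Each hourly value is the mean of the twelve 5-minute samples in that hour; the real cell below does the same thing after first grouping by station.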

In [22]:
import numpy as np

# Select daylight hours
tx_daytime = tx_data[(tx_data['Time'] >= '2019-03-22 06:00') & (tx_data['Time'] <= '2019-03-22 20:00')]

# Create sub-tables for each station
tx_grp = tx_daytime.groupby('ID')

# Get data from station DIMM
station_data = tx_grp.get_group('DIMM')

# Create hourly averaged data
# time_bins = pd.cut(station_data['Time'], np.arange(600, 2100, 100))
# xarray has groupby_bins, but pandas has cut
station_data.index = station_data['Time']
station_hourly = station_data.resample('H')


# station_hourly = station_data.groupby(time_bins)
station_hourly_mean = station_hourly.mean()
station_hourly_mean = station_hourly_mean.reset_index() # no longer index by time so that we get it back as a regular variable.

# The times are reported at the beginning of the interval, but really represent 
# the mean symmetric about the half hour. Let's fix that.
station_hourly_mean['Time'] += pd.to_timedelta(30, 'minutes')
print(station_hourly_mean['Time'])
print(station_data['Time'])
0    2019-03-22 06:30:00
1    2019-03-22 07:30:00
2    2019-03-22 08:30:00
3    2019-03-22 09:30:00
4    2019-03-22 10:30:00
5    2019-03-22 11:30:00
6    2019-03-22 12:30:00
7    2019-03-22 13:30:00
8    2019-03-22 14:30:00
9    2019-03-22 15:30:00
10   2019-03-22 16:30:00
11   2019-03-22 17:30:00
12   2019-03-22 18:30:00
13   2019-03-22 19:30:00
14   2019-03-22 20:30:00
Name: Time, dtype: datetime64[ns]
Time
2019-03-22 06:00:00   2019-03-22 06:00:00
2019-03-22 06:05:00   2019-03-22 06:05:00
2019-03-22 06:10:00   2019-03-22 06:10:00
2019-03-22 06:15:00   2019-03-22 06:15:00
2019-03-22 06:20:00   2019-03-22 06:20:00
                              ...        
2019-03-22 19:40:00   2019-03-22 19:40:00
2019-03-22 19:45:00   2019-03-22 19:45:00
2019-03-22 19:50:00   2019-03-22 19:50:00
2019-03-22 19:55:00   2019-03-22 19:55:00
2019-03-22 20:00:00   2019-03-22 20:00:00
Name: Time, Length: 169, dtype: datetime64[ns]

Use the data above to make a time series plot of the instantaneous data and the hourly averaged data:

In [23]:
# Your code here
In [24]:
# %load solutions/mesonet_timeseries.py


# Cell content replaced by load magic replacement.
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(station_hourly_mean['Time'], station_hourly_mean['solar_radiation'])
ax.plot(station_data['Time'], station_data['solar_radiation'])
Out[24]:
[<matplotlib.lines.Line2D at 0x7f9a2d658e90>]