Mohawk and Hudson River Discharge Data, 1980-1997
General Problem: Understanding patterns of river discharge.
Tip 4: Examining long timeseries
- Make several plots with different time scales:
- The whole time series.
- Small sections of the time series.
- When working with several timeseries collected at the
same time, make co-registered plots at the same time scale
and the same 'vertical' scale.
- Are there long term cycles in the data?
- Smoothing the data with a running-average can make the
cycles easier to see.
- Averaging ("stacking") many cycles together can reveal the
typical shape of one cycle.
- Subtracting a repeating series of mean cycles from the data
can bring out variations.
- Are there short term events in the data?
- Subtracting a running-average from the data can make
short term events easier to see.
- The statistics of short term events are often interesting,
and different than the statistics of the timeseries as
a whole.
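The smoothing, stacking and subtraction steps above can be sketched in a few lines of plain Python. This is only an illustration, not a prescribed method: the function names are made up for this example, the series is assumed to be a simple list of evenly spaced samples with no gaps, and the window is assumed to be an odd number of samples.

```python
def running_average(values, window):
    """Centered moving average over `window` samples (assumed odd).

    Near the two ends of the series, only the samples that actually
    fall inside the window are averaged.
    """
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def stack_cycles(values, period):
    """Average all complete cycles of length `period` into one mean cycle."""
    n_cycles = len(values) // period
    return [
        sum(values[c * period + k] for c in range(n_cycles)) / n_cycles
        for k in range(period)
    ]


def remove_mean_cycle(values, period):
    """Subtract a repeating series of mean cycles from the data."""
    mean_cycle = stack_cycles(values, period)
    return [v - mean_cycle[i % period] for i, v in enumerate(values)]


def remove_running_average(values, window):
    """Subtract the running average, leaving the short term events."""
    smoothed = running_average(values, window)
    return [v - s for v, s in zip(values, smoothed)]
```

For daily data, a period of 365 samples approximates one annual cycle; leap days make the alignment drift slightly, which this sketch ignores.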
Things to Do:
- Session 1: Familiarization
- Look at the USGS's web site at
http://waterdata.usgs.gov/ to get a feeling for what the data mean and
how they are presented. Check out the "Get Help" link.
- Get one raw data set from the USGS (say the Mohawk at Rome, NY), using the instructions below.
- Look over one raw data file, to understand its contents and format. What
are the units of discharge?
- Plot the 11 Gaging Stations on a map of New York, and work out the
order of the stations along the two rivers. Note where the two rivers join.
- Reformat 1 raw data file (say the Mohawk at Rome, NY) into a tab-separated value file. What are the
best units for time?
- Plot the data for each station as a function of time, using the same horizontal
and vertical scales for each plot. Describe the general behavior. What is the problem with station Newcomb?
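One possible way to do the reformatting step is sketched below. It converts dates to day numbers, with 01/01/1980 as day 0. The layout of the raw file is an assumption here, not something taken from the USGS documentation: comment lines are assumed to start with '#', and each data line is assumed to hold a YYYY.MM.DD date and a discharge value separated by a tab. Check this against the file you actually downloaded. The names day_number and reformat are made up for this example.

```python
from datetime import date

EPOCH = date(1980, 1, 1)  # 01/01/1980 is labeled day 0


def day_number(date_text):
    """Convert a 'YYYY.MM.DD' string to days elapsed since 01/01/1980."""
    year, month, day = (int(part) for part in date_text.split("."))
    return (date(year, month, day) - EPOCH).days


def reformat(lines):
    """Return a list of (day number, discharge) pairs.

    ASSUMPTION: '#' starts a comment line, and data lines look like
    'YYYY.MM.DD<TAB>discharge'. Lines that do not parse (column headers,
    missing values) are skipped rather than guessed at.
    """
    rows = []
    for line in lines:
        fields = line.strip().split("\t")
        if line.startswith("#") or len(fields) < 2:
            continue
        try:
            rows.append((day_number(fields[0]), float(fields[1])))
        except ValueError:
            continue
    return rows


def to_tsv(rows):
    """Format the pairs as tab-separated text, one sample per line."""
    return "\n".join(f"{day}\t{discharge}" for day, discharge in rows)
```

A day count is only one choice of time unit; decimal years would also work and may be easier to read on a plot axis.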
- Session 2: Seasonal and Annual Means
- Calculate the total discharge (i.e. in cubic feet per year) for all stations for
the year 1984. Plot it on the NY State map. Does it increase downstream in a sensible way?
- New York State receives about 30 inches of rain per year. What fraction of rain
falling in the Hudson drainage basin makes it into the river?
- Growing 1 kilogram of maize ("corn") requires about 2000 kilograms of water. How
much maize could be grown if all the water in the Hudson were used to irrigate maize fields
(ignoring the contribution of rain, of course)?
- Calculate the seasonal discharge (i.e. in cubic feet per season) for the most
northerly and southerly Hudson River stations (excl. Newcomb) for the entire 17 year period.
Plot it versus time. What seasons have the lowest and highest discharge? (Save this
calculation. You will need it during the next session).
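The arithmetic for this session can be sketched as follows. Several things here are assumptions and should be checked: the daily values are assumed to be mean discharge in cubic feet per second (confirm the units in the raw file), the seasons are assumed to be three-month blocks with winter running from December through February, and the basin area is left as a parameter because the handout does not give it. All function names are made up for this example.

```python
from collections import defaultdict
from datetime import date, timedelta

EPOCH = date(1980, 1, 1)        # day 0 of the reformatted data
SECONDS_PER_DAY = 86400
SQFT_PER_SQMILE = 5280 ** 2
KG_PER_CUBIC_FOOT = 28.3168     # water at about 1000 kg per cubic meter

# ASSUMPTION: seasons defined by calendar month.
SEASON_OF_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def total_volume(daily_cfs):
    """Total volume in cubic feet from daily mean flows in cubic feet per second."""
    return sum(q * SECONDS_PER_DAY for q in daily_cfs)


def runoff_fraction(volume_cuft, rain_inches, basin_sq_miles):
    """Fraction of the rain falling on the basin that leaves as river flow."""
    rain_volume = (rain_inches / 12.0) * basin_sq_miles * SQFT_PER_SQMILE
    return volume_cuft / rain_volume


def maize_kg(volume_cuft, water_kg_per_maize_kg=2000.0):
    """Kilograms of maize that a given volume of water could grow."""
    return volume_cuft * KG_PER_CUBIC_FOOT / water_kg_per_maize_kg


def seasonal_totals(rows):
    """Sum (day number, cfs) rows into cubic feet per (year, season).

    December is counted with the winter of the following year.
    """
    totals = defaultdict(float)
    for day, q in rows:
        d = EPOCH + timedelta(days=day)
        year = d.year + 1 if d.month == 12 else d.year
        totals[(year, SEASON_OF_MONTH[d.month])] += q * SECONDS_PER_DAY
    return dict(totals)
```

Note that the first and last winters in the record are incomplete under this definition, so their totals are not comparable with the others.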
- Session 3: Interannual variability
- Make histograms of the winter, spring, summer and autumn discharge of the two
stations that you used in the last session. How much variability about the mean is there?
- Smooth the daily discharge data for the Hudson at the Ft. Edward site using a 1-week
moving-window average. Overlay plots of the original data and the average.
- Cut the daily discharge data for the Hudson at the Ft. Edward site up into 17 1-year
segments, and overlay plots of them. Describe the similarities between years. Create a dataset
of the minimum, maximum and average discharge for each day of the year.
- Suppose that a factory is dumping waste into the river at a rate that is just legal
for a river flowing with the average discharge (based on the 17-year history) of the Hudson at
the Ft. Edward site. How many days during the year will the factory be out of compliance?
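A minimal sketch of the day-of-year dataset and the compliance count is given below. It assumes the data are (day number, discharge) pairs with 01/01/1980 as day 0. It also takes one particular reading of the factory question, namely that the factory is out of compliance on any day when the discharge falls below the long-term mean, since less water is then available to dilute the waste. Both function names are made up for this example.

```python
from collections import defaultdict
from datetime import date, timedelta

EPOCH = date(1980, 1, 1)  # day 0 of the reformatted data


def day_of_year_stats(rows):
    """Map day of year (1..366) to (minimum, maximum, mean) discharge."""
    by_day = defaultdict(list)
    for day, q in rows:
        day_of_year = (EPOCH + timedelta(days=day)).timetuple().tm_yday
        by_day[day_of_year].append(q)
    return {
        doy: (min(flows), max(flows), sum(flows) / len(flows))
        for doy, flows in sorted(by_day.items())
    }


def days_below_mean_per_year(rows):
    """Average number of days per year with discharge below the overall mean.

    ASSUMPTION: 'out of compliance' means flow below the long-term mean.
    """
    flows = [q for _, q in rows]
    mean_flow = sum(flows) / len(flows)
    n_below = sum(1 for q in flows if q < mean_flow)
    return 365.25 * n_below / len(flows)
```

Day 366 only occurs in leap years, so its statistics rest on fewer samples than the other days.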
- Session 4: Flooding Statistics
- For each of 4 stations:
- Make a histogram of daily discharge rates using the entire 17-year dataset. The
histogram gives the number of 'floods' of a given size in 17 years. Divide by 17, to get
an estimate of the number of floods per year. Multiply by 100, to get an estimate of the
number of floods per one-hundred years.
- Sum up the values from largest to smallest. This is a cumulative histogram, the
number of floods per 100 years with a discharge greater than or equal to a given size. Plot it
on a lin-log plot.
- Extrapolate the histogram out to the value of one flood per 1000 years (i.e. 0.1
flood per 100 years). How much bigger than the mean discharge is the predicted value?
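The histogram, cumulative sum and extrapolation can be sketched as below. The extrapolation fits a straight line to the logarithm of the cumulative count over the last few bins, which amounts to assuming that the tail is a straight line on the lin-log plot. That is an assumption to check against your own plot, not a property of the data. The function names, the bin width and the number of tail points are all choices made for this example.

```python
import math


def exceedance_per_century(daily_flows, bin_width, years=17.0):
    """Cumulative histogram: floods per 100 years at or above each bin edge.

    Returns (lower bin edge, floods per 100 years) pairs, smallest edge first.
    """
    counts = {}
    for q in daily_flows:
        index = int(q // bin_width)
        counts[index] = counts.get(index, 0) + 1
    scale = 100.0 / years          # divide by years, multiply by 100
    running = 0.0
    curve = []
    for index in sorted(counts, reverse=True):   # sum from largest to smallest
        running += counts[index] * scale
        curve.append((index * bin_width, running))
    return curve[::-1]


def extrapolate(curve, target=0.1, tail_points=5):
    """Discharge at which the fitted tail reaches `target` floods per 100 years.

    ASSUMPTION: log10(count) falls linearly with discharge in the tail.
    """
    tail = curve[-tail_points:]
    xs = [q for q, _ in tail]
    ys = [math.log10(n) for _, n in tail]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return (math.log10(target) - intercept) / slope
```

Dividing the extrapolated discharge by the mean daily discharge gives the ratio that the last question asks for. The result can be sensitive to the number of tail points used in the fit, so it is worth trying a few values.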
- Session 5: Time delays and correlations
- Floods often 'move downstream', as the surge of water from rain in the headwaters
flows out to sea. Identify this effect in the data. How fast does the surge move? Is its
speed related to the size of the flood?
- How similar are the discharge data from neighboring stations? Make a scatter
plot of the data from two stations over short time periods (weeks, say), after accounting
for any delays between the stations.
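One common way to estimate the delay between two stations is to shift the downstream record by a trial number of days and see which shift gives the highest correlation with the upstream record. The sketch below does this for whole-day lags. It assumes the two lists cover the same dates with no gaps, and the function names are made up for this example.

```python
import math


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0   # a constant series carries no information about lag
    return cov / math.sqrt(var_a * var_b)


def best_lag(upstream, downstream, max_lag):
    """Return (lag in days, correlation) for the best-matching delay.

    A lag of k means the downstream record trails the upstream one by k days.
    """
    best = (0, -2.0)
    for lag in range(max_lag + 1):
        n = min(len(upstream), len(downstream) - lag)
        r = correlation(upstream[:n], downstream[lag:lag + n])
        if r > best[1]:
            best = (lag, r)
    return best


def aligned_pairs(upstream, downstream, lag):
    """Pairs of (upstream, downstream) values after removing the delay."""
    return list(zip(upstream, downstream[lag:]))
```

The pairs from aligned_pairs are what go on the scatter plot. Dividing the river distance between the two stations by the best lag gives a rough surge speed, limited by the one-day sampling of the data.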
- Session 6: Summarization
- Use this session to work on your lab report.
- Try to find some interesting aspect of the data that we haven't discussed in class.
Raw data from USGS:
Tab-separated data with 01/01/1980 labeled day 0:
To get the data from the US Geological Survey via the WWW:
- go to the USGS's NWIS-W Data Retrieval Page,
http://waterdata.usgs.gov/
- click on a state (e.g. New York)
- click on 'Search the list of New York Gaging Stations'
- enter a Basin name in the form (e.g. hudson or mohawk) and click Search
- a list of stations should appear, click on the desired station name
- a page of station information should appear. Click on Historical Streamflow Daily Values
- a page with a retrieval form should appear. Set the Dates to retrieve form (e.g.
From 01/01/1980 until 09/30/1997), set the output format to Tab-delimited text data file
in YYYY.MM.DD format, and click on Retrieve Data.
- when the data appears, save it to a file