Environmental Data Analysis BC ENV 3017
Lab: Histogram and normal distribution
-
Open a new MS Excel workbook and create a column of 100 random numbers
between 0 and 1 using the MS Excel rand() function.
-
In a second column create 100 numbers that are the sum of 10 random numbers
(type: 'rand()+rand()+rand() ....').
-
Paste these numbers into the adjacent column using the 'Paste Special,
values' command.
-
Determine max, min, average, and standard deviation of these lists.
-
Normalize the frequency to '% per bin size' . Calculate the center of each
bin. Make histograms of the two lists of data, using the center of the
bins on the x-axis. If you use a column diagram, note that you might have
to set the x-values manually (=> go to "Series" during the dialogue).
-
How do the two distributions compare?
Now calculate the normal distribution function using the average and
standard deviation that you already have derived as a function of the center
of the bins and plot it with the columns of the second histogram. Do it
both ways, using the actual normal distribution function and the EXCEL
equivalent NORMDIST.
*100
(==NORMDIST(centered bins,average,standar deviation,false)*100)
-
Use the "Custom Types - Line - Column graph" for this and adjust the x-axis
manually again.
-
Are your data normally distributed?
-
Calculate averages of your list using 2, 5, 10, 20, 40, 75, and 100 samples
(entries) on your list and plot the average versus the number of samples
you have used as a basis for calculation of the avarage. What should be
the average if you had an indefinite number of samples? How do you explain
the pattern that you see?