Environmental Data Analysis BC ENV 3017

Lab: Histogram and normal distribution

  1. Open a new MS Excel workbook and create a column of 100 random numbers between 0 and 1 using the MS Excel rand() function.
  2. In a second column create 100 numbers that are the sum of 10 random numbers (type: 'rand()+rand()+rand() ....').
  3. Paste these numbers into the adjacent column using the 'Paste Special, values' command.
  4. Determine max, min, average, and standard deviation of these lists.
  5. Normalize the frequency to '% per bin size' . Calculate the center of each bin. Make histograms of the two lists of data, using the center of the bins on the x-axis. If you use a column diagram, note that you might have to set the x-values manually (=> go to "Series" during the dialogue).
  6. How do the two distributions compare?

  7. Now calculate the normal distribution function using the average and standard deviation that you already have derived as a function of the center of the bins and plot it with the columns of the second histogram. Do it both ways, using the actual normal distribution function and the EXCEL equivalent NORMDIST.
    *100

    (==NORMDIST(centered bins,average,standar deviation,false)*100)

  1. Use the "Custom Types - Line - Column graph" for this and adjust the x-axis manually again.
  2. Are your data normally distributed?
  3. Calculate averages of your list using 2, 5, 10, 20, 40, 75, and 100 samples (entries) on your list and plot the average versus the number of samples you have used as a basis for calculation of the avarage. What should be the average if you had an indefinite number of samples? How do you explain the pattern that you see?