Environmental Data Analysis BC ENV 3017

Histogram and normal distribution

  1. Open a new MS Excel workbook and create a column of 100 random numbers between 0 and 1 using the MS Excel rand() function.
  2. In a second column, create 100 numbers that are each the average of 10 random numbers (type: '=(rand()+rand()+rand() ....)/10').
  3. Copy these two columns and paste them into adjacent columns using the 'Paste Special, Values' command (otherwise Excel will recalculate your random numbers after every additional entry).
  4. Determine the maximum, minimum, average, and standard deviation of each list.
  5. Make histograms of both columns and overlay the normal distribution using the StatPlus histogram function.
  6. How do the two histograms/distributions compare?
  7. Are your data normally distributed?
  8. Calculate averages of your lists using 2, 5, 10, 20, 40, 75, and 100 samples (entries) from each list, and plot the average versus the number of samples used to calculate it. What should the average be if you had an infinite number of samples? How do you explain the pattern that you see?
  9. (A variation of this would be to build up the (normalized) histograms step by step and overlay the normal distribution function.)
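
For anyone who wants to cross-check the spreadsheet results, the steps above can be sketched in Python. This is only an illustrative sketch, not part of the assignment: the seed, the choice of 10 equal-width bins on [0, 1], the function names, and the reading of step 8 as "the first n entries of each list" are all assumptions made here. The normal overlay is expressed as the expected count per bin for a normal curve fitted to the sample mean and standard deviation.

```python
import random
import statistics

random.seed(3017)  # arbitrary seed (an assumption) so the run is repeatable

N = 100
# Step 1: 100 uniform random numbers in [0, 1), similar to Excel's rand().
uniform = [random.random() for _ in range(N)]
# Step 2: 100 numbers, each the average of 10 uniform random numbers.
averaged = [sum(random.random() for _ in range(10)) / 10 for _ in range(N)]


def summarize(values):
    """Step 4: max, min, average, and sample standard deviation."""
    return {
        "max": max(values),
        "min": min(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
    }


def histogram(values, bins=10, lo=0.0, hi=1.0):
    """Step 5: count the values falling into each of `bins` equal-width bins."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)  # clamp the top edge
        counts[idx] += 1
    return counts


def normal_overlay(values, bins=10, lo=0.0, hi=1.0):
    """Expected count per bin under a normal curve fitted to `values`."""
    fit = statistics.NormalDist(statistics.mean(values), statistics.stdev(values))
    width = (hi - lo) / bins
    edges = [lo + i * width for i in range(bins + 1)]
    return [len(values) * (fit.cdf(b) - fit.cdf(a)) for a, b in zip(edges, edges[1:])]


def running_means(values, sizes=(2, 5, 10, 20, 40, 75, 100)):
    """Step 8 (assumed reading): average of the first n entries for each n."""
    return [(n, statistics.mean(values[:n])) for n in sizes]


for name, data in (("uniform", uniform), ("averaged", averaged)):
    print(name, summarize(data))
    print("  counts:", histogram(data))
    print("  normal:", [round(x, 1) for x in normal_overlay(data)])
    for n, m in running_means(data):
        print(f"  n={n:3d}  mean={m:.3f}")
```

Only the Python standard library is used, so the counts are printed rather than plotted; comparing the "counts" and "normal" rows for each list gives a rough numerical version of the histogram overlay asked for in step 5.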