Statistics part, assignment 2 due Sunday 28 Oct
Please elaborate on assignment 1: Further describe your data from assignment 1 (or new data generated in the same way) verbally, graphically, and with summary statistics. Try to discover and clarify characteristics of the data, in particular why it may or may not seem normally distributed, and which graphs are the best (or at least very good) for describing it.
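As a starting point for the summary statistics, a minimal sketch in R might look like the following. The vector "mydata" is only a placeholder drawn from an exponential distribution, which is an assumption made for illustration; substitute your own data from assignment 1.

```r
set.seed(1)                       # fixed seed so the sketch is reproducible
mydata <- rexp(200, rate = 1)     # ASSUMPTION: placeholder data, not the assignment 1 data

summary(mydata)                   # minimum, quartiles, median, mean, maximum
sd(mydata)                        # sample standard deviation
IQR(mydata)                       # interquartile range, a robust measure of spread
mean(mydata) - median(mydata)     # a gap that is large relative to the spread hints at skewness
```

Comparing the mean with the median, and the standard deviation with the IQR, is one simple way to put numbers on what the graphs show.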
Does the structure of the initial data pass through into the sample means and medians? If so, discuss how. What effect might the sample size have here?
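One way to explore this question is to simulate many samples at different sample sizes and compare the resulting means and medians. The sketch below is only an illustration: the helper "sample_stats", the exponential generator, and the sample sizes 5 and 100 are all assumptions, not part of the assignment.

```r
set.seed(2)

# Hypothetical helper: draws n_rep samples of size n from the generator gen
# and returns one statistic (mean by default) per sample.
sample_stats <- function(n, n_rep = 1000, stat = mean, gen = rexp) {
  replicate(n_rep, stat(gen(n)))
}

means_small   <- sample_stats(n = 5)                  # means of small samples
means_large   <- sample_stats(n = 100)                # means of larger samples
medians_small <- sample_stats(n = 5, stat = median)   # medians of small samples

# Compare the spread of the means at the two sample sizes
sd(means_small)
sd(means_large)

# Side-by-side histograms to see how much of the original skewness remains
op <- par(mfrow = c(1, 3))
hist(means_small,   main = "Means, n = 5")
hist(means_large,   main = "Means, n = 100")
hist(medians_small, main = "Medians, n = 5")
par(op)
```

Replacing "rexp" with the generator you used in assignment 1 should show whether features such as skewness survive in the means and medians, and how quickly they fade as n grows.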
Which types of pictures reveal the data? We might need more than one in order to reveal different characteristics. If histograms are used, we will most likely need the “breaks” parameter when studying the data; consider which breakpoints and bin sizes are most convenient, and explain that choice in the report. Another picture that sometimes helps clarify the data is a plot of the sorted data: “plot(sort(mydata))”. Plots of the raw data (value against index) were already included in some of the solutions to the first assignment.
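The effect of the "breaks" argument can be seen by drawing the same data with different settings. In this sketch "mydata" is again a placeholder (an assumption), and the bin width of 0.25 is an arbitrary choice for illustration.

```r
set.seed(3)
mydata <- rexp(200)                 # ASSUMPTION: placeholder data, use your own vector

op <- par(mfrow = c(2, 2))

hist(mydata, main = "Default breaks")

# A single number is treated by hist() as a suggestion for the number of cells
hist(mydata, breaks = 30, main = "breaks = 30")

# An explicit vector of breakpoints; it must cover the whole range of the data
b <- seq(0, ceiling(max(mydata)), by = 0.25)
hist(mydata, breaks = b, main = "Bin width 0.25")

# Sorted data: a straight line suggests a uniform spread, curvature suggests skewness
plot(sort(mydata), main = "Sorted data")

par(op)
```

Too few bins can hide features such as several modes, while too many bins make the picture noisy, so it is worth trying several settings before choosing one for the report.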
Do the data (vector of means or medians) follow a normal distribution? If they might (or seem to), you should try the qqnorm function in R and compare the results visually with simulated data. There is also a formal test of normality you should try, shapiro.test. A more general test, ks.test, might be used, but is not necessary here. Also discuss whether the prerequisites of the Central Limit Theorem are relevant and whether the CLT might apply. Could there be other reasons for the data to be close to normally distributed? In particular, consider whether the data are fairly symmetric “in their nature”.
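A possible workflow for the normality checks is sketched below. The vector "means" is a placeholder built from exponential samples of size 30, which is an assumption for illustration only; use your own vector of means or medians.

```r
set.seed(4)
means <- replicate(500, mean(rexp(30)))   # ASSUMPTION: placeholder vector of sample means

# Visual check: normal Q-Q plot of the data next to one for simulated normal data
op <- par(mfrow = c(1, 2))
qqnorm(means, main = "Sample means")
qqline(means)
sim <- rnorm(length(means), mean = mean(means), sd = sd(means))
qqnorm(sim, main = "Simulated normal")
qqline(sim)
par(op)

# Formal test of normality; shapiro.test() accepts between 3 and 5000 observations
sw <- shapiro.test(means)
sw$p.value                                # a small p-value is evidence against normality

# More general test. The parameters are estimated from the same data,
# so the reported p-value is only approximate.
ks <- ks.test(means, "pnorm", mean = mean(means), sd = sd(means))
ks$p.value
```

Repeating the simulated Q-Q plot a few times gives a feeling for how much a truly normal sample of the same size deviates from the straight line by chance alone.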