====== T-Tests & Linear Models ======
=== Concepts ===
| H1 | Your hypothesis: this is what you are predicting, e.g. "there is a difference between conditions A and B on measure C." |
| H0 | "Null-hypothesis": This is the inverse of H1, "there will be no difference", \\ or more precisely, "your manipulation of the independent variable(s) does not cause anything". |
| Probability | Chance may always affect the outcome of any measurement, to a greater or lesser extent. Taking lots and lots of measurements can help us be more certain that our results are not a coincidence, but that is expensive and takes time. \\ We can save time by taking fewer measurements, if we understand how doing so affects the probability of getting things wrong. |
| Statistical test | Helps us estimate the probability that an observed result arose by chance, i.e. the likelihood of us being wrong. |
| One- and Two-Tailed Tests | Scenario: You measure something under two conditions and expect there to be a difference between the measures. If you have strong suspicions that one measure will be higher than the other, you use a one-tailed test. \\ As you know, the result of any experiment could be a coincidence; a statistical test helps us figure out the probability of this. \\ If we have a pre-determined idea of which direction a certain difference will go, our hypothesis is **stronger** than if it simply says "there will be a difference". \\ This needs to be taken into account in the statistical test when figuring out whether the result might have been a coincidence. For a hypothesis that specifies the direction of a difference, use a one-tailed test; otherwise use a two-tailed test (see the sketch below this table). |
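A minimal sketch in Python (SciPy 1.6+ for the ''alternative'' argument), using made-up data for two conditions, of how the same measurements give different p-values under a two-tailed versus a one-tailed hypothesis:

<code python>
# Minimal sketch: one- vs. two-tailed p-values on the same (made-up) data.
# Requires SciPy >= 1.6 for the `alternative` argument.
from scipy import stats

a = [5.1, 4.9, 5.6, 5.8, 5.3, 5.7]   # measurements under condition A
b = [4.6, 4.8, 5.0, 4.7, 5.1, 4.5]   # measurements under condition B

# Two-tailed: H1 only says "A and B differ" (no direction).
t, p_two = stats.ttest_ind(a, b, alternative='two-sided')

# One-tailed: H1 says, in advance, "A is higher than B".
t, p_one = stats.ttest_ind(a, b, alternative='greater')

print(p_two, p_one)   # for this symmetric test, p_one is about half of p_two
</code>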
\\
\\
\\
\\
=== Gathering Data ===
| Example hypothesis | "The fish in the North of Iceland are healthier than the fish in the South of Iceland." \\ H1: Fish in the North are healthier. |
| Sample | 20 individual fish are tested. |
| Variables | Dependent: Health. Independent: Oceanic area (N, S). The dependent variable is measured with the "famous Health Probe". |
| Subject pool | N=20; random sample. Specify the means/method by which the randomness is generated and followed. |
| Gathering data | Two independent samples: 20 measurements of the health index (10 per area): \\ North: 97, 99, 88, 77, 99, 20, 87, 88, 89, 65; \\ South: 66, 48, .... |
| **What we have so far** | Basically, we have a bunch of measurements which came from two different parts of the country. They will probably have different means, medians, etc. -- it is unlikely that they will be equal. What we would like to find out is whether this difference is a true representation of the actual fish populations in the two locations. |
| Sample distribution | How all the collected measurements are distributed (for both N and S). \\ (Not to be confused with the //sampling distribution// of a statistic.) |
| Population distribution | How the total fish population, in North, South (and everywhere else that may matter) is distributed on this measure of health. |
| Standard Deviation (SD) | If the population is normally distributed, 68.3% of it falls within +/- 1 SD of the mean, 95.4% within +/- 2 SD, and 99.7% within +/- 3 SD (see the sketch below this table). \\ Illustration from Wikipedia: http://en.wikipedia.org/wiki/Image:Standard_deviation_diagram.svg |
| **What we want to know** | How likely is it that the observed difference in Health (as measured with the Health Probe) between the two separate samples (N, S) is a pure coincidence? |
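A minimal sketch in Python (SciPy) showing where the SD percentages above come from: they follow directly from the normal distribution's cumulative distribution function.

<code python>
# Minimal sketch: the 68/95/99.7 rule from the normal CDF.
from scipy.stats import norm

for k in (1, 2, 3):
    within = norm.cdf(k) - norm.cdf(-k)   # P(mean - k*SD < X < mean + k*SD)
    print(f"within +/- {k} SD: {within:.1%}")
# prints approximately: 68.3%, 95.4%, 99.7%
</code>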
\\
\\
\\
\\
=== t-tests ===
| A.k.a. | "Student's t-test" |
| When to use | To test the difference between two means, when the standard deviation of the population is unknown. |
| Input | Data from two samples. |
| NB: Underlying assumption | On the measure of health, the **total** fish population is normally distributed. We say that "The population is normally distributed." |
| Standard deviation of sample | We use this as an estimate of the (actual) population standard deviation, pooling the measures from both North and South. |
| Output | t-value, p-value |
| t-value | A measure of the difference between the sample means, relative to the variability within the samples. |
| p-value | Probability value: the probability of seeing a difference at least this large if H0 were true. (Loosely: the percent likelihood of this result being a coincidence is p*100 -- but see the word of warning at the end of this page.) |
| Typical thresholds for p | p<0.05 and p<0.01 \\ ...that is, the difference between two (sample) populations is called "statistically significant" if the p-value falls below the chosen threshold. \\ (Which one to use depends on the circumstances.) |
| One-sample and two-sample t-test | In the fish example above we have two separate sample populations, hence we use a two-sample t-test (see the sketch below this table). |
| Paired t-test | A.k.a. matched-sample or repeated-measures t-test: used when both sets of measurements come from the same subjects. It amounts to a one-sample t-test on the pairwise differences. |
| More information | http://biology.nebrwesleyan.edu/courses/labs/biology_of_animals/t-test_flash.html |
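A minimal sketch in Python (SciPy) of the fish example. The North values are the ones listed above; only the first two South values are given there, so the remaining eight are invented here purely to make the example run:

<code python>
# Minimal sketch: two-sample t-test on the fish data.
from scipy import stats

north = [97, 99, 88, 77, 99, 20, 87, 88, 89, 65]   # from the table above
south = [66, 48, 55, 70, 61, 52, 68, 59, 63, 57]   # only 66 and 48 are from
                                                   # the table; rest made up

t, p = stats.ttest_ind(north, south)   # two-sample, two-tailed by default
print(f"t = {t:.2f}, p = {p:.4f}")

# If the variances of the two populations may differ, use Welch's version:
t_w, p_w = stats.ttest_ind(north, south, equal_var=False)
</code>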
\\
\\
\\
\\
=== Linear Models: Regression Analysis ===
| Purpose of Regression Analysis | Discover a function that allows prediction of the values of the dependent variable y from values of the independent variable x. |
| Scatterplot | Shows the distribution of y-values for given (sampled) x-values |
| First-order linear function | Y = a + bX \\ Provides us with a single straight line that gets as close to all the points in the scatterplot as possible (given that it is straight). |
| Residual | For each (x, y) point, the vertical distance from the observed y to the line (i.e. y minus the y predicted by the line). |
| How do we find the line? | Least Squares Criterion: we select the linear function that yields the smallest sum of squared residuals (see the sketch below this table). |
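A minimal sketch in Python (NumPy), with made-up (x, y) data, of fitting Y = a + bX by the least squares criterion:

<code python>
# Minimal sketch: least-squares fit of a first-order linear function.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])   # made-up sample points

b, a = np.polyfit(x, y, deg=1)   # polyfit returns [slope, intercept]
y_pred = a + b * x               # the line's prediction for each x
residuals = y - y_pred           # vertical distances to the line

print(f"Y = {a:.2f} + {b:.2f}X")
print("sum of squared residuals:", np.sum(residuals ** 2))
</code>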
\\
\\
\\
\\
=== Linear Correlation ===
| Given a linear function | Given an X-score, the predicted Y-score is given by the line. However, in reality the Y-score rarely falls exactly on the line. |
| Need estimate of error | We must estimate how closely real Ys (Y) follow the predicted Ys (Y') |
| The measure most commonly used | Standard Error of Estimate |
| Formula for Std. Err. of Est. | s_est = sqrt( Σ(Y - Y')² / (N - 2) ) \\ (using the N-2 regression degrees of freedom; some texts divide by N instead -- see also the sketch below this table) \\ http://cs.gmu.edu/cne/modules/dau/stat/regression/multregsn/mreg_2_frm.html |
| What it tells us | How far, on average, real Ys fall from the line |
| The smaller the Std. Err. of Est. is ... | ... the better a predictor the line is |
| Main limitation of linear models | Assumes -- a priori! -- a linear relationship. |
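A minimal sketch in Python (NumPy) of the standard error of estimate, using the N-2 regression degrees of freedom and the same made-up data as the regression sketch above:

<code python>
# Minimal sketch: standard error of estimate for a fitted line.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])   # made-up sample points

b, a = np.polyfit(x, y, deg=1)
y_pred = a + b * x                       # Y' for each X
n = len(y)
s_est = np.sqrt(np.sum((y - y_pred) ** 2) / (n - 2))
print(s_est)   # the smaller this is, the better the line predicts Y
</code>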
\\
\\
\\
\\
=== Beyond Linear Models & T-Tests ===
| Pick the appropriate statistics | Depending on your experimental design, or on the nature of your field test, some tests are more appropriate than others. |
| **What you study** | **What you use** |
| Two factors varying along a continuum | Correlation/regression measures |
| Two factors, where the independent variable has (or can have) a few discrete values | Kolmogorov-Smirnov two-sample test, t-test (if the distribution is normal), t-test for unequal variances (if the variances of the underlying populations differ), Wilcoxon-Mann-Whitney (if the distribution is not normal) \\ cf. \\ http://beheco.oxfordjournals.org/content/17/4/688.full \\ http://advan.physiology.org/content/34/3/128 \\ http://rsos.royalsocietypublishing.org/content/1/3/140216 \\ https://xkcd.com/882/ |
| One dependent variable, multiple independent variables, each with two or more levels | ANOVA - Analysis of Variance (see the sketch below this table) |
| Many dependent variables, many independent variables | MANOVA (multivariate analysis of variance) |
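A minimal sketch in Python (SciPy), with made-up data, of the simplest ANOVA case: one dependent variable and one independent variable with three levels. (Multi-factor ANOVA needs more machinery, e.g. the statsmodels package.)

<code python>
# Minimal sketch: one-way ANOVA across three levels of one factor.
from scipy import stats

level_1 = [23, 25, 21, 24, 26]   # made-up measurements per level
level_2 = [30, 29, 31, 28, 32]
level_3 = [22, 24, 23, 25, 21]

f, p = stats.f_oneway(level_1, level_2, level_3)
print(f"F = {f:.2f}, p = {p:.4f}")

# Non-normal distributions: the rank-based alternative from the table above.
u, p_u = stats.mannwhitneyu(level_1, level_2)
</code>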
\\
\\
\\
\\
=== p<0.05: A Word of Warning ===
| What Does p<0.05 Mean? | David Colquhoun says: If there were actually no effect (if the true difference between means were zero) then the probability of observing a value for the difference equal to, or greater than, that actually observed would be p=0.05. In other words, there is a 5% chance of seeing a difference at least as big as we have done, by chance alone (the sketch after this table illustrates this). \\ http://beheco.oxfordjournals.org/content/17/4/688.full |
| The number will be right only if all the assumptions made by the test are true | One of the assumptions is that the sampling is truly random -- that nothing but the independent variable being tested systematically relates to the measurements of the dependent variable. This assumption is, however, frequently broken. |
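A minimal sketch in Python (NumPy/SciPy) illustrating the quote above: when H0 is actually true, a p<0.05 threshold still flags roughly 5% of experiments as "significant", by chance alone.

<code python>
# Minimal sketch: false-positive rate of the t-test when H0 is true.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_experiments = 10_000
false_positives = 0

for _ in range(n_experiments):
    a = rng.normal(loc=50, scale=10, size=20)   # both samples drawn from
    b = rng.normal(loc=50, scale=10, size=20)   # the SAME population
    _, p = stats.ttest_ind(a, b)
    if p < 0.05:
        false_positives += 1

print(false_positives / n_experiments)   # roughly 0.05
</code>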
\\
\\
\\
\\