--- public:rem4:rem4-20:design_of_comparative_experiments_ii [2020/01/23 09:29] – created thorisson
+++ public:rem4:rem4-20:design_of_comparative_experiments_ii [2024/04/29 13:33] (current) – external edit 127.0.0.1
@@ Line 18: / Line 18: @@
 | Internal validity  | How likely is it that the independent variables caused the dependent variables?  |
 | External validity  | How likely is it that the results generalize to other instances of the phenomenon under study?  |
+| Type I Error  | Falsely rejecting the null hypothesis. \\ The null hypothesis states that differences in the dependent variable(s) between levels of the independent variable(s) are not due to the independent variable(s). \\ Falsely rejecting the null hypothesis means that you concluded there was an "effect" - your manipulations made a difference - when in fact they didn't.  |
+| Type II Error  | Falsely accepting (more precisely, failing to reject) the null hypothesis. \\ The null hypothesis states that differences in the dependent variable(s) between levels of the independent variable(s) are not due to the independent variable(s). \\ Falsely accepting the null hypothesis means that you concluded there was **no** "effect" - your manipulations made no difference - when in fact they did.    |
 \\
@@ Line 50: / Line 52: @@
 | One dependent variable, multiple independent variables, each with two or more levels  | ANOVA - Analysis of variance   |
 | Many dependent variables, many independent variables  | MANOVA (multivariate analysis of variance)  |
+| REF for M/ANOVA | https://www.statisticshowto.datasciencecentral.com/probability-and-statistics/hypothesis-testing/anova/  |
@@ Line 63: / Line 66: @@
 | Paired t-test  | Used for within-subjects designs  |
 | Standard t-test  | For between-subjects designs  |
+| One-tailed t-test  | If your hypothesis specifies the direction in which your dependent variable will differ from the comparison (neutral) condition.   |
+| Two-tailed t-test  | If your hypothesis only says that your dependent variable will be affected, but does NOT specify in which direction.   |
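A minimal sketch, using only the Python standard library, of how the t statistic is computed for the two designs in the table above. The function names and the scores are hypothetical and chosen only for illustration. The between-subjects version assumes equal variances in the two groups (pooled variance). Turning t into a one- or two-tailed p-value requires the t distribution from a statistics package and is not shown here.

```python
import math
from statistics import mean, stdev


def independent_t(group_a, group_b):
    """Standard (between-subjects) t statistic with pooled variance.

    Returns (t, degrees_of_freedom). Assumes equal group variances.
    """
    na, nb = len(group_a), len(group_b)
    pooled_var = (
        (na - 1) * stdev(group_a) ** 2 + (nb - 1) * stdev(group_b) ** 2
    ) / (na + nb - 2)
    std_err = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(group_a) - mean(group_b)) / std_err, na + nb - 2


def paired_t(first, second):
    """Paired (within-subjects) t statistic on the per-subject differences.

    Returns (t, degrees_of_freedom).
    """
    diffs = [s - f for f, s in zip(first, second)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / math.sqrt(n)), n - 1


# Hypothetical scores: four subjects measured under two conditions.
before = [1, 2, 3, 4]
after = [2, 4, 5, 7]

t_between, df_between = independent_t(after, before)
t_within, df_within = paired_t(before, after)
print(round(t_between, 3), df_between)  # prints 1.633 6
print(round(t_within, 3), df_within)    # prints 4.899 3
```

In this made-up data the paired statistic is much larger than the between-subjects one, because differencing removes the variation between subjects. Which of the two is appropriate depends on the design, not on which gives the larger value.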
 \\
@@ Line 69: / Line 74: @@
 \\
+=== Example of an Experiment: Fish ===
-=== Using Models to Validate and Measure - a.k.a. Simulation ===
+| Theory   | Temperature has an effect on cell growth of animals. This goes for fish as well.    |
+| Motivation   | If we can find evidence for this we might be able to grow larger fish in captivity; larger fish means fewer people starving (or more revenue - or both). \\ Fishing further South might be better for everyone, even those living in the North.   |
+| Hypothesis  | The size of fish varies with ocean temperature.   |
+| Experiment   | Comparing the size of fish in the Atlantic Ocean by taking a sample from various latitudes. Argument: Temperature falls the further North one goes; thus, fish at higher latitudes should be smaller.  |
+| Sample | 100 fish south of Iceland. 100 fish north of Iceland.  |
+| Dependent variable  | Size of fish (continuous).   |
+| Independent variable  | Latitude (two levels - South and North).   |
+| Statistics  | Linear regression.    |
+\\
+\\
+\\
+\\
+=== Example of an Experiment: Routers ===
+| Theory    | Congestion on a network gets worse the smaller the "visibility horizon" <m>H_v</m> that each node <m>N_i</m> has of traffic on adjacent nodes. \\ <m>H_v</m>: Information about traffic, including past, present, and predicted.       |
+| Motivation   | Whether nodes from router manufacturer X or Y are the better purchase might be decided by looking at their implemented routing methods. \\ Setting parameters on already-purchased routing nodes might be put on a more scientific footing.   |
+| Experiment   | Comparing routers from ZYX and Cis. The former advertise their routers to be "network-aware" whereas the latter brag about being "perfect for P2P networks" because each node doesn't need to know anything about the rest of the network.     |
+| Hypothesis   | Routers from ZYX will perform better at handling congestion than routers from Cis.     |
+| Independent variables   | 1. Router type. \\ 2. Traffic. \\ 3. Network size.    |
+| Dependent variables    | 1. Congestion. \\ 2. Congestion recovery. \\ 3. Routing efficiency.    |
+| Statistics    | MANOVA   |
+\\
+\\
+\\
+\\
+=== Linear Models: Regression Analysis ===
+| Purpose of Regression Analysis  | Discover a function that allows prediction of the values of dependent variable y based on values of independent variable x  |
+| Scatterplot  | Shows the distribution of y-values for given (sampled) x-values  |
+| First-order linear function  | <m>Y = a + bX</m> (a: intercept, b: slope) \\ Provides us with a single, straight line that gets as close to all the points in the scatterplot as possible (given that it is straight)  |
+| Residual  | For each x,y point, the vertical distance to the line: the observed y minus the y predicted by the line   |
+| How do we find the line?  | Least Squares Criterion: We select the linear function that will yield the smallest sum of squared residuals  |
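A minimal sketch of the least squares criterion for a single independent variable, using only the Python standard library. The function names and the five data points are hypothetical and only illustrate the idea. The slope and intercept formulas used are the standard closed-form solution that minimizes the sum of squared residuals.

```python
from statistics import mean


def fit_line(xs, ys):
    """Return (a, b) for the least squares line y = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x, and of cross-products of x and y.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx          # slope
    a = y_bar - b * x_bar  # intercept: the line passes through (x_bar, y_bar)
    return a, b


def residuals(xs, ys, a, b):
    """Observed y minus predicted y for every point."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]


# Hypothetical sample.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)
sum_sq = sum(r ** 2 for r in residuals(xs, ys, a, b))
print(round(a, 3), round(b, 3))  # prints 2.2 0.6
print(round(sum_sq, 3))          # prints 2.4
```

Any other straight line through these points gives a larger sum of squared residuals than 2.4, which is what "least squares" means.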
+\\
+\\
+\\
+\\
+=== Linear Correlation ===
+| Given a linear function  | Given an X-score, the predicted Y-score is given by the line. However, in reality the Y-score rarely falls exactly on the line.   |
+| Need estimate of error  | We must estimate how closely real Ys (Y) follow the predicted Ys (Y')  |
+| The measure most commonly used  | Standard Error of Estimate  |
+| Formula for Std. Err. of Est. | https://www.youtube.com/watch?v=r-txC-dpI-E (walk-through video)   |
+| What it tells us  | How far, on average, real Ys fall from the line  |
+| The smaller the Std. Err. of Est. is ... | ... the better a predictor the line is  |
+| Main limitation of linear models  | Assumes -- a priori! -- a linear relationship  |
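A minimal sketch of the Standard Error of Estimate for an already-fitted line, using only the Python standard library. The function name, the data, and the line coefficients are hypothetical. The denominator n - 2 is one common convention for a sample with a fitted intercept and slope; some textbooks divide by N instead, so check which one your course uses.

```python
import math


def standard_error_of_estimate(xs, ys, a, b):
    """Square root of the mean squared residual around the line y = a + b*x.

    Uses n - 2 in the denominator (assumption: sample estimate with two
    fitted parameters, intercept a and slope b).
    """
    n = len(xs)
    sum_sq_residuals = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sum_sq_residuals / (n - 2))


# Hypothetical sample and its least squares line y = 2.2 + 0.6*x.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

s_est = standard_error_of_estimate(xs, ys, a=2.2, b=0.6)
print(round(s_est, 3))  # prints 0.894
```

Here the real Ys fall, roughly speaking, a little under one unit from the line, in the units of the dependent variable.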
+\\
+\\
+\\
+\\
-| What simulation is  | A simplified model of the subject under study - that is, a simplification not of the key causal factors in the phenomenon, which must remain in our model for it to be useful, but rather a reduction (sometimes a radical one) of the "extra stuff that really doesn't matter".  |
-| What it does  | Simplifies! Makes it easier to \\ (A) set up testing conditions, \\ (B) control independent variables, \\ (C) make changes to the independent variables, \\ (D) measure the results.  |
-| When to use  | When the complexity of that which is to be modeled/understood becomes so great that mathematical models are intractable and hypothesis falsification would take decades, centuries or millennia, or is simply out of the question (as in e.g. astrophysics).  |
-| Kinds of simulation methodologies | Continuous time and state: E.g. differential equations. \\  Discrete time/state: E.g. automata. \\ |
-| Relation between scientific theories and simulations  | To build a simulation we need a theory that tells us how things relate to each other.  |
-| Procedure  | Pick methodology. \\ Decide which kinds of questions to answer. \\ Model major states/transitions or input/output/functional properties of system. \\ Run simulations with variations in independent variables. \\ Note outcome. \\ Fix model. \\ Repeat.  |
 \\