Center for Analysis and Design of Intelligent Agents

2020 Lecture Notes

Concepts

Independent variables	We select their values - the values are known when we start an experiment. Any independent variable must have at least 2 levels (values), so its effect can be evaluated.
Levels	Relating to an independent variable: The number of levels of an independent variable is equal to the number of variations of that variable used in an experiment.
Dependent variables	Values are measured during and/or after the experiment.
Sample: subject selection from a “population”	A representative subset, drawn from a population, of the phenomenon we are studying. Examples: a. Siggi, Maggi and Biggi representing human males. b. 10 lakes representing all freshwater on the Earth's surface. c. Rust on the bottom of doors representing the overall state of an automobile. A sample should be randomly chosen to (1) minimize spurious correlations and thus (2) maximize the generalizability of the results of measuring only a small subset of the phenomenon.
Spurious correlation	“false” correlation - correlation that implies a connection between things measured, where there is no causal relationship between them, in and of themselves.
Between-subjects design	If our control group in an experiment contains different instances than the experimental group.
Within-subjects design	When the instances in our experimental group serve as their own control group.
Internal validity	How likely is it that the independent variables caused the dependent variables?
External validity	How likely is it that the results generalize to other instances of the phenomenon under study?
Type I Error	Falsely rejecting the null hypothesis. The null hypothesis states that any difference in the dependent variable(s) between levels of the independent variable(s) is not due to the independent variables. Falsely rejecting the null hypothesis means that you thought there was an “effect” - your manipulations made a difference - when in fact they didn't.
Type II Error	Falsely accepting (failing to reject) the null hypothesis. The null hypothesis states that any difference in the dependent variable(s) between levels of the independent variable(s) is not due to the independent variables. Falsely accepting the null hypothesis means that you thought there was no “effect” - your manipulations made no difference - when in fact they did.
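
The Type I error definition above can be illustrated with a small simulation. This is a minimal sketch, not a prescribed method: it assumes a normally distributed population with a known standard deviation (so a simple two-sample z-test can be used), and the population mean of 100, standard deviation of 15, group size of 20 and the function names are all made up for illustration. Both groups are drawn from the same population, so the null hypothesis is true and every "significant" result is a Type I error.

```python
import random
from statistics import NormalDist, mean


def two_sample_z_p_value(a, b, sigma):
    """Two-tailed p-value for a difference in means, assuming a KNOWN population sigma."""
    standard_error = sigma * (1 / len(a) + 1 / len(b)) ** 0.5
    z = (mean(a) - mean(b)) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


def type_i_error_rate(trials=4000, n=20, alpha=0.05, seed=1):
    """Fraction of experiments that reject the null hypothesis although it is true."""
    rng = random.Random(seed)
    false_rejections = 0
    for _ in range(trials):
        # Hypothetical population: mean 100, sigma 15. Both groups are drawn
        # from it, so there is no real effect to find.
        control = [rng.gauss(100, 15) for _ in range(n)]
        experimental = [rng.gauss(100, 15) for _ in range(n)]
        if two_sample_z_p_value(control, experimental, sigma=15) < alpha:
            false_rejections += 1  # we "found" an effect that does not exist
    return false_rejections / trials


print(f"Observed Type I error rate: {type_i_error_rate():.3f}")
```

Under these assumptions the observed rate should land close to the chosen alpha, which is what a significance level of 0.05 means in practice: roughly 1 in 20 experiments with no real effect will still look significant.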

True Experimental Designs: Procedure

Identify the phenomenon to study	Characterize it in ways that make it easy to study.
Ask the right question(s)	“A question well asked is a question half-answered.”
Identify variables that matter	Independent and dependent.
Choose experimental design	Based on the nature of the experiment, but with some flexibility with regard to how detailed/reliable/etc. the results should be.
Design the setup	Identify all factors that could potentially confound your results.
Execute the experiment	Double-blind procedure: Neither the subject nor the experimenter knows which group a subject belongs to and/or which level of an independent variable is being tested.
Collect the data	Use tables and graphs, as appropriate - it is very important to choose the right presentation method.
Apply statistical tests	Make sure you select the right statistical test based on your design and your knowledge of the relationship between your sample and your population, and the distribution and means of the population that the sample is drawn from.
Draw conclusions from statistical tests	Use inference, based on probabilities, statistical significance.
Write up the report

Some Statistical Methods for Experimental Designs: What to Use When

Selecting between hypotheses	Statistical tests help you figure out whether the difference (in means and distribution) observed in a dependent variable (as measured between two samples) is large enough to indicate a non-coincidence. To make this judgement, the “natural” variation in each group is used as a “baseline”. The significance level sets how unlikely a result must be under chance alone before it is considered “significant”. p<0.05 and p<0.01 are most common (less than 5% or 1% probability of observing a difference at least this large if only chance were at work).
What you study	What you use
Two factors varying along a continuum	Correlation/regression measures
Two factors, where the independent variable has (or can have) a few discrete values	t-test (compares two levels at a time)
One dependent variable, multiple independent variables, each with two or more levels	ANOVA - Analysis of variance
Many dependent variables, many independent variables	MANOVA (multivariate analysis of variance)
REF for M/ANOVA	https://www.statisticshowto.datasciencecentral.com/probability-and-statistics/hypothesis-testing/anova/

t-test

A fairly robust test for simple comparison experiments	Assumptions about population means and distributions can be violated without too much trouble.
Sample size	Good for small sample sizes
Paired t-test	Used for within-subjects designs
Standard t-test	For between-subjects designs
One-tailed t-test	If your hypothesis specifies in which direction your dependent variable will differ from the comparative (neutral) condition.
Two-tailed t-test	If your hypothesis only says that your dependent variable will be affected, but does NOT specify how.
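
The difference between the standard and the paired t-test can be sketched in a few lines. This is an illustrative sketch only: the function names and the numbers in the usage example are hypothetical, and the standard test is shown in its pooled-variance (Student) form, which assumes the two groups have similar variances.

```python
from math import sqrt
from statistics import mean, stdev, variance


def independent_t(group_a, group_b):
    """Standard (between-subjects) t statistic with pooled variance.

    Returns (t, degrees_of_freedom). Assumes roughly equal group variances.
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_variance = ((n_a - 1) * variance(group_a)
                       + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


def paired_t(first, second):
    """Paired (within-subjects) t statistic on per-subject differences.

    Returns (t, degrees_of_freedom). Both lists must be in the same subject order.
    """
    differences = [s - f for f, s in zip(first, second)]
    n = len(differences)
    standard_error = stdev(differences) / sqrt(n)
    return mean(differences) / standard_error, n - 1


# Hypothetical measurements, for illustration only.
control = [1, 2, 3, 4, 5]
experimental = [3, 4, 5, 6, 7]
print("between-subjects:", independent_t(control, experimental))

before = [10, 12, 9, 11, 13]
after = [12, 14, 10, 13, 16]
print("within-subjects:", paired_t(before, after))
```

Turning t into a p-value requires the t distribution for the given degrees of freedom (a printed table or a statistics library). When the effect lies in the predicted direction, the one-tailed p-value is half the two-tailed one, which is why a directional hypothesis reaches significance more easily.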

Example of an Experiment: Fish

Theory	Temperature has an effect on cell growth of animals. This goes for fish as well.
Motivation	If we can find evidence for this we might be able to grow larger fish in captivity; larger fish means fewer people starving (or more revenue - or both). Fishing further South might be better for everyone, even those living in the North.
Hypothesis	That size of fish varies with ocean temperature.
Experiment	Comparing the size of fish in the Atlantic Ocean by taking a sample from various latitudes. Argument: Temperature falls the further North one goes; thus, fish at higher latitudes should be smaller.
Sample	100 fish south of Iceland. 100 fish north of Iceland.
Dependent variable	Size of fish (continuous).
Independent variable	Latitude (two levels - South and North).
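Statistics	t-test (latitude has only two levels, South and North; linear regression would fit if latitude were sampled along a continuum).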

Example of an Experiment: Routers

Theory	Network congestion gets worse the smaller the “visibility horizon” <m>H_v</m> that each node <m>N_i</m> in a network has of traffic on adjacent nodes. <m>H_v</m>: Information about traffic, including past, present, and predicted.
Motivation	Knowing whether nodes from router manufacturer X or Y are a better purchase might be decided by looking at their implemented routing methods. Knowing how to set parameters on already-purchased routing nodes might be put on a more scientific ground.
Experiment	Comparing routers from ZYX and Cis. The former advertise their routers to be “network-aware” whereas the latter brag about being “perfect for P2P networks” because each node doesn't need to know anything about the rest of the network.
Hypothesis	Routers from ZYX will perform better at handling congestion than routers from Cis.
Independent variables	1. Router type. 2. Traffic. 3. Network size.
Dependent variables	1. Congestion. 2. Congestion recovery. 3. Routing efficiency.
Statistics	MANOVA

Linear Models: Regression Analysis

Purpose of Regression Analysis	Discover a function that allows prediction of the values of dependent variable y based on values of independent variable x
Scatterplot	Shows the distribution of y-values for given (sampled) x-values
First-order linear function	Y = a + bX. Provides us with a single straight line that gets as close as possible to all the points in the scatterplot (given that it is straight)
Residual	For each x,y point, the vertical distance to the line (observed y minus predicted y)
How do we find the line?	Least Squares Criterion: We select the linear function that will yield the smallest sum of squared residuals
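
The least squares criterion has a closed-form solution for a first-order linear function. The sketch below uses the standard textbook formulas for slope and intercept (named b and a in the code); the function names and the data points are hypothetical and only serve to show the mechanics.

```python
from statistics import mean


def least_squares_line(xs, ys):
    """Return (a, b) for the line y = a + b*x that minimizes the sum of squared residuals."""
    x_bar, y_bar = mean(xs), mean(ys)
    sum_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sum_xx = sum((x - x_bar) ** 2 for x in xs)
    b = sum_xy / sum_xx      # slope
    a = y_bar - b * x_bar    # intercept: the line passes through (x_bar, y_bar)
    return a, b


def residuals(xs, ys, a, b):
    """Observed y minus predicted y for every point."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = least_squares_line(xs, ys)
sum_of_squares = sum(r ** 2 for r in residuals(xs, ys, a, b))
print(f"y = {a:.2f} + {b:.2f}x, sum of squared residuals = {sum_of_squares:.2f}")
```

Any other straight line through the same points gives a larger sum of squared residuals, which is what "best fit" means under this criterion.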

Linear Correlation

Given a linear function	Given an X-score, the predicted Y-score is given by the line. However, in reality the Y-score rarely falls straight on the line.
Need estimate of error	We must estimate how closely real Ys (Y) follow the predicted Ys (Y')
The measure most commonly used	Standard Error of Estimate
Formula for Std. Err. of Est.	https://www.youtube.com/watch?v=r-txC-dpI-E (walk-through video)
What it tells us	How far, on average, real Ys fall from the line
The smaller the Std. Err. of Est. is …	… the better a predictor the line is
Main limitation of linear models	Assumes – a priori! – a linear relationship
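
The standard error of estimate can be computed directly from the residuals. The sketch below assumes the common sample form, the square root of the sum of squared residuals divided by N - 2; some textbooks divide by N instead, so check which convention your course uses. The function name, the line coefficients and the data are hypothetical.

```python
from math import sqrt


def standard_error_of_estimate(xs, ys, a, b):
    """Typical distance of real Ys from the line y = a + b*x.

    Uses an N - 2 denominator (an assumption; some texts use N).
    """
    n = len(xs)
    if n < 3:
        raise ValueError("need at least 3 points for an N - 2 denominator")
    sum_squared_residuals = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return sqrt(sum_squared_residuals / (n - 2))


# Hypothetical data and a line previously fitted to it (intercept 2.2, slope 0.6).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(f"Std. Err. of Est.: {standard_error_of_estimate(xs, ys, a=2.2, b=0.6):.3f}")
```

The result is in the same units as Y, so it can be read directly as "predictions are typically off by about this much".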
