Center for Analysis and Design of Intelligent Agents

Overview

True Experimental Designs: Procedure

Some Statistical Methods for Experimental Designs: What to Use When

t-test

Using Models to Validate and Measure: The Model Human Processor

Next Project: Write Contributions, Results, Conclusion

Concepts

Independent variables	We select their values - the values are known when we start an experiment. Any independent variable must have at least 2 levels (values), so its effect can be evaluated.
Levels	Relating to an independent variable: The number of levels of an independent variable is equal to the number of variations of that variable used in an experiment.
Dependent variables	Values are measured during and/or after the experiment.
Sample: subject selection from a “population”	A representative subset, drawn from a population, of the phenomenon we are studying. Examples: a. Siggi, Maggi and Biggi representing human males. b. 10 lakes representing all freshwater on the Earth's surface. c. Rust on the bottom of the doors representing the overall state of an automobile. A sample should be randomly chosen to (1) minimize spurious correlations and thus (2) maximize the generalizability of results obtained by measuring only a small subset of the phenomenon.
Spurious correlation	A “false” correlation: one that suggests a connection between the things measured when there is no causal relationship between them, in and of themselves.
Between-subjects design	The control group contains different instances than the experimental group.
Within-subjects design	The instances in the experimental group serve as their own control group.
Internal validity	How likely is it that the independent variables caused the observed changes in the dependent variables?
External validity	How likely is it that the results generalize to other instances of the phenomenon under study?
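
Example: a minimal sketch of random sampling followed by random assignment to groups for a between-subjects design. The population of lake names, the function names and the seeds are all made up for illustration; only Python's standard `random` module is used.

```python
import random

def draw_sample(population, n, seed=None):
    """Draw a simple random sample of n instances, without replacement."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def assign_between_subjects(sample, seed=None):
    """Randomly split a sample into a control and an experimental group."""
    rng = random.Random(seed)
    shuffled = list(sample)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical population: 1000 lakes, of which we can afford to measure 10.
population = [f"lake_{i}" for i in range(1000)]
sample = draw_sample(population, 10, seed=1)
control, experimental = assign_between_subjects(sample, seed=2)

print("control:     ", control)
print("experimental:", experimental)
```

In a within-subjects design there would be no split: every instance in `sample` would be measured at each level of the independent variable.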

True Experimental Designs: Procedure

Identify the phenomenon to study	Characterize it in ways that make it easy to study.
Ask the right question(s)	“A question well asked is a question half-answered.”
Identify variables that matter	Independent and dependent.
Choose experimental design	Based on the nature of the experiment, but with some flexibility with regard to how detailed/reliable/etc. the results should be.
Design the setup	Identify all factors that could potentially confound your results.
Execute the experiment	Double-blind procedure: Neither the subject nor the experimenter knows which group a subject belongs to and/or which level of an independent variable is being tested.
Collect the data	Use tables, graphs, as appropriate - very important to choose the right presentation method.
Apply statistical tests	Make sure you select the right statistical test based on your design and your knowledge of the relationship between your sample and your population, and the distribution and means of the population that the sample is drawn from.
Draw conclusions from statistical tests	Use inference, based on probabilities, statistical significance.
Write up the report
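
Example: the two statistical steps of the procedure (apply tests, draw conclusions) come down to asking how often a difference this large would show up if group membership made no difference. One way to make that concrete is an exact permutation test. This technique is not named in these notes and is used here only as an illustration; the function name and the data are made up.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference in means.

    Tries every way of splitting the pooled measurements into two groups
    of the original sizes and returns the fraction of splits whose
    absolute mean difference is at least as large as the observed one.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    indices = range(len(pooled))
    extreme = 0
    total = 0
    for chosen in combinations(indices, n_a):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance so floating-point ties count as "at least as large".
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total

# Made-up measurements of one dependent variable in two groups.
control = [1, 2, 3, 4]
experimental = [11, 12, 13, 14]
p = permutation_p_value(control, experimental)
print(f"p = {p:.4f}")
```

Enumerating every split is only practical for small groups; it is shown here because it makes the logic of "could this be coincidence?" explicit.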

Some Statistical Methods for Experimental Designs: What to Use When

Selecting between hypotheses	Statistical tests help you judge whether the difference (in means) observed in a dependent variable (as measured between two samples) is large enough to be unlikely to be a coincidence. To make this judgement, the “natural” variation within each group is used as a “baseline”. The significance level is the threshold, set in advance, that a result must pass to be considered “significant”. p<0.05 and p<0.01 are most common: if there were no real effect, a difference at least this large would arise by chance less than 5% (or 1%) of the time.
What you study	What you use
Two factors varying along a continuum	Correlation/regression measures
Two factors, where the independent variable has two discrete values (levels)	t-test
One dependent variable, multiple independent variables, each with two or more levels	ANOVA - Analysis of variance
Many dependent variables, many independent variables	MANOVA (multivariate analysis of variance)
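
Example: for the first row of the table (two factors varying along a continuum), a minimal sketch of the Pearson correlation coefficient written out in plain Python. The function name and the data are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Co-variation of x and y, and the variation of each on its own.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)

# Made-up data: y rises with x in the first case and falls in the second.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))
```

A value near +1 or -1 indicates a strong linear relationship; on its own it says nothing about causation (see spurious correlation under Concepts).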

t-test

A fairly robust test for simple comparison experiments	Assumptions about population means and distributions can be violated without too much trouble.
Sample size	Good for small sample sizes
Paired t-test	Used for within-subjects designs
Standard t-test	Independent-samples test, for between-subjects designs
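
Example: a minimal sketch of how the two t statistics are computed, using only Python's standard library. It assumes that the "standard" test is Student's pooled-variance form; the function names and data are made up. The result is a t value and its degrees of freedom, to be compared against a t table at the chosen significance level.

```python
import math
from statistics import mean, stdev, variance

def independent_t(group_a, group_b):
    """Student's t for two independent groups (between-subjects design).

    Uses the pooled sample variance, i.e. assumes both groups have
    roughly equal variance. Returns (t, degrees_of_freedom).
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_error, df

def paired_t(first, second):
    """Paired t for repeated measures on the same instances
    (within-subjects design). Returns (t, degrees_of_freedom)."""
    if len(first) != len(second):
        raise ValueError("paired samples must have equal length")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    std_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / std_error, n - 1

# Made-up measurements.
t_ind, df_ind = independent_t([1, 2, 3], [4, 5, 6])
t_pair, df_pair = paired_t([1, 2, 3, 4], [2, 4, 5, 7])
print(f"independent: t = {t_ind:.3f}, df = {df_ind}")
print(f"paired:      t = {t_pair:.3f}, df = {df_pair}")
```

The paired version works on the per-instance differences, which removes the variation between instances from the baseline; that is why it suits within-subjects designs.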

Using Models to Validate and Measure - a.k.a. Simulation

What simulation is	A simplified model of the subject under study - that is, a simplification not of the key causal factors in the phenomenon, which must remain in our model for it to be useful, but rather a reduction (sometimes a radical one) of the “extra stuff that really doesn't matter”.
What it does	Simplifies! Makes it easier to (A) set up testing conditions, (B) control independent variables, (C) make changes to the independent variables, and (D) measure the results.
When to use	When the complexity of that which is to be modeled/understood becomes so great that mathematical models are intractable and hypothesis falsification would take decades, centuries or millennia, or is simply out of the question (as in e.g. astrophysics).
Kinds of simulation methodologies	Continuous time and state: E.g. differential equations. Discrete time/state: E.g. automata.
Relation between scientific theories and simulations	To build a simulation we need a theory that tells us how things relate to each other.
Procedure	Pick methodology. Decide which kinds of questions to answer. Model major states/transitions or input/output/functional properties of system. Run simulations with variations in independent variables. Note outcome. Fix model. Repeat.
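
Example: a minimal sketch of the procedure using a discrete-time model. The logistic growth model, its parameters and the chosen rates are made up for illustration and do not come from these notes. The growth rate plays the role of the independent variable and the final population that of the dependent variable.

```python
def simulate_growth(rate, capacity=1000.0, initial=10.0, steps=50):
    """Discrete-time logistic growth: one state variable, updated per step.

    Returns the population at every time step, starting with the initial one.
    """
    population = initial
    history = [population]
    for _ in range(steps):
        # Growth slows down as the population approaches the capacity.
        population += rate * population * (1 - population / capacity)
        history.append(population)
    return history

# Run the simulation with variations in the independent variable
# and note the outcome for each level.
for rate in (0.1, 0.3, 0.5):
    final = simulate_growth(rate)[-1]
    print(f"rate = {rate}: final population = {final:.1f}")
```

If the outcomes disagree with what is observed in the real system, the model is fixed and the runs are repeated, as in the last steps of the procedure.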

Levels of System Knowledge in Simulation

0 Source Level	What variables to measure and how to observe them
1 Data	Data collected from a source system
2 Generative	Means to generate data in the system
3 Structure	Components (at lower levels) coupled together to form a generative system
Source: G. Klir 1985

Using Models to Validate and Measure: The Model Human Processor

Card, Moran & Newell	1983, 1986
Model Human Processor	An attempt at an engineering approach to usability studies
Use data from psychological studies	Construct a model of a human user
Various elements of a user's mind	Memories, perception modules, cycle times, decay times, etc., plus a number of typical performance values and principles on how to use model to predict performance
Interest has been growing	But has been slower than most predicted
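
Example: a minimal sketch of how cycle times can be used to predict performance. The three cycle times are the typical values commonly quoted from Card, Moran & Newell; they are not given in these notes and should be checked against the original before use. The function name and the task breakdown are made up for illustration.

```python
# Typical processor cycle times in milliseconds. Assumption: commonly
# quoted values from Card, Moran & Newell, not taken from these notes.
CYCLE_MS = {"perceptual": 100, "cognitive": 70, "motor": 70}

def predict_time_ms(cycles):
    """Sum the time of a task given the number of cycles per processor."""
    return sum(CYCLE_MS[processor] * count
               for processor, count in cycles.items())

# Simple reaction: perceive a signal, decide to respond, press a key.
simple_reaction = predict_time_ms(
    {"perceptual": 1, "cognitive": 1, "motor": 1}
)
print(f"predicted simple reaction time: {simple_reaction} ms")
```

The prediction is only as good as the task breakdown: the model supplies the per-cycle times, while the analyst decides how many cycles of each kind a task needs.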
