Notes On Research Methodology, spring 2012

Concepts

Theory (ísl. kenning)	“A set of statements or principles devised to explain a group of facts or phenomena, especially one that has been repeatedly tested or is widely accepted and can be used to make predictions about natural phenomena.” REF A theory is a relatively big explanation, covering several phenomena, often through a single principle, or a set of simple principles.
Hypothesis (ísl. tilgáta)	A prediction about the relationship between a limited set of phenomena, as explained by a particular theory.
Experimental design	“A planned interference in the natural order of events.”
Quasi-Experimental	When conditions do not permit an ideal design and a controlled experiment is impossible, other techniques can be used. These are called quasi-experimental designs.
Dependent variable(s)	These are “the things we want to measure”, e.g. the speedup seen with the new word processor. Values are measured during and/or after the experiment.
Independent variables	These are factors that need to be controlled for the results to be more intelligible. Example: If we want to study the efficiency speedup seen by a new multi-cultural word processor we would want to have all or some of the cultures represented when we do the study. We select their values - the values are known when we start an experiment. Any independent variable must have at least 2 levels (values), so its effect can be evaluated.
Subject(s)	Means the subject under study, which can be people, technology and natural phenomena
Sample: Subject selection from a “population”. A representative subset, drawn from a population, of the phenomenon we are studying.	Typically you can't study all the individuals of a particular subject pool, so in your experiment you use a sample and hope that the results generalize to the rest of the subjects. Examples: a. Siggi, Maggi and Biggi representing human males. b. 10 lakes representing all freshwater on the Earth's surface. c. rust on bottom of doors representing the overall state of an automobile. A sample should be randomly chosen to (1) minimize spurious correlations and thus (2) maximize the generalizability of the results of measuring only a small subset of the phenomenon.
Data	Typically “raw numbers” – only contain low-level semantics
Information	Processed and prepared data
Statistics	Mathematical methods for dealing with uncertainty.

The Scientific Method: Classical Description

Identification, description and formalization of phenomenon	1. Observation and description of a phenomenon or group of phenomena.
Hypothesis, null-hypothesis	2. Formulation of an hypothesis to explain the phenomena. In physics, the hypothesis often takes the form of a causal mechanism or a mathematical relation.
Creation of experimental setup to test hypothesis	3. Use of the hypothesis to predict the existence of other phenomena, or to predict quantitatively the results of new observations.
Performance of experiment, collection and analysis of results	4. Performance of experimental tests of the predictions by several independent experimenters and properly performed experiments. Basic assumption: Repeatability — Can be repeated by anyone anywhere
Repeatability requires formal framework	Detailed description, clear goals, clear (limited) scope, hence the formalities in their execution
Key idea: Comparison	A baseline is collected in the same experimental setup, without any other intervention by the experimenter
Key way of comparing	Experiments

Types of Experimental Designs: Concepts

Experimental design	“A planned interference in the natural order of events.”
Subject(s)	Means the subject under study, which can be people, technology and natural phenomena
Sample	Typically you can't study all the individuals of a particular subject pool, so in your experiment you use a sample and hope that the results generalize to the rest of the subjects.
Between subjects vs. within subjects design	Between subjects: Two separate groups of subjects/phenomena are measured. Within subjects: The same subjects/phenomena are measured twice, on different occasions.
Quasi-Experimental	When conditions do not permit an ideal design and a controlled experiment is impossible, other techniques can be used. These are called quasi-experimental designs.
Internal validity	How likely is it that the independent variables caused the dependent variables?
External validity	How likely is it that the results generalize to other instances of the phenomenon under study?

Controlled Experiment

What it is	A fairly recent research method, historically speaking, for testing hypotheses / theories
When to use	When it is possible to control and select everything of importance to the subject of study
How to do	Select subjects freely, randomize samples, remove experimenter effect through double-blind procedure, use control groups, select independent and dependent variables as necessary to answer the questions raised.
Why randomize?	Given a complex phenomenon, it is impossible to know all potential causal chains that may exist between the various elements under study. Randomization lessens the probability that there is systematic bias in any factors that are not under study but could affect the results and thus imply different conclusions.
What is randomized?	The sample should be randomized; subjects should be randomly assigned to the control group versus the experimental group; and so should the levels of any identified independent variable that could affect the results but is not of interest to the research at hand.
Between subjects vs. within subjects design	Between subjects: Two separate groups of subjects/phenomena are measured. Within subjects: The same subjects/phenomena are measured twice, on different occasions.
Internal validity	How likely is it that the independent variables caused the dependent variables?
External validity	How likely is it that the results generalize to other instances of the phenomenon under study?
Bottom line	Controlled Experiment: The most powerful mechanism for generating reliable knowledge known to mankind.
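The two randomization steps named above (random sampling from the population, then random assignment to groups) can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the function name `sample_and_assign`, the subject identifiers and the seed are all assumptions made for the example.

```python
import random


def sample_and_assign(population, sample_size, seed=None):
    """Draw a random sample, then split it into control and experimental groups.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)  # a fixed seed makes the assignment reproducible
    # Random selection without replacement; the returned order is itself random,
    # so cutting the list in half is a random assignment to the two groups.
    sample = rng.sample(population, sample_size)
    half = sample_size // 2
    return {"control": sample[:half], "experimental": sample[half:]}


if __name__ == "__main__":
    subject_pool = [f"subject_{i:03d}" for i in range(200)]  # made-up subject pool
    groups = sample_and_assign(subject_pool, sample_size=20, seed=2012)
    print("control:", groups["control"])
    print("experimental:", groups["experimental"])
```

Recording the seed in the report is one way to let others reproduce the exact assignment.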

Steps in Research Execution

Identify the phenomenon to study	Characterize it in ways that make it easy to study.
Ask the right question(s)	“A question well asked is a question half-answered.”
Identify variables that matter	Independent and dependent.
Choose experimental design	Based on the nature of the experiment, but with some flexibility with regard to how detailed/reliable/etc. the results should be.
Designing the experiment	How do we measure the dependent variables? How do we control the independent variables? What are the hypotheses? How will we run the experiment? Many ways to avoid spurious correlation, e.g. double-blind procedure: The experimenter does not know which group a subject belongs to and/or which level of an independent variable is being tested.
Choose statistical tests	Statistical methods should be chosen with the experimental design: Make sure you select the right statistical test based on your design and your knowledge of the relationship between your sample and your population, and the distribution and means of the population that the sample is drawn from.
Design the setup	Identify all factors that could potentially confound your results and take appropriate measures to get the best results possible.
Execute the experiment	Follow the method designed!
Collect data, analyze results	Careful experimental design makes this straightforward (bad design makes it a nightmare!). Use tables, graphs, as appropriate - very important to choose the right presentation method.
Draw conclusions from statistical tests	Use inference, based on probabilities, statistical significance.
Write up the report	in theses or scientific publications

Correlational Studies & Quasi-Experimental Design

Correlation	Some factors/variables co-vary: changes in one variable are related to changes in the other, negatively or positively
Correlation: Powerful tool	Any variables in the world can be measured for correlation. Only two variables are needed (independent and dependent) for doing correlation studies
Main operating principle behind correlation	There is no causation without correlation
Correlation: Pitfall	Correlation does not imply causation between the variables measured!
Quasi-experimental designs	Purpose: Where a true experimental design is not possible, e.g. because direct control over the dependent/independent variables is not possible, approximate it.
How it works	1. One-shot case study (no control group) 2. Single group pre- and post-test (minimal control) 3. ABAB: Single-group repeated measures (slightly less minimal control)
Limitations	Much greater uncertainty as to the internal and external validity of the quasi-experiments than true experimental designs
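As a worked illustration of measuring how two variables co-vary, here is a minimal sketch of the Pearson correlation coefficient (one common correlation measure; the notes do not name a specific one). The function name `pearson_r` is an assumption for the example.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences.

    Returns a value between -1 (perfect negative) and +1 (perfect positive).
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (covariance, up to a constant factor)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)
```

A value near +1 or -1 only says that the two variables move together; it says nothing about which one, if either, causes the other.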

Field Studies

What is it?	Quasi-experimental design. To study a phenomenon “in the wild”.
When	When a controlled experimental setup is not possible.
Examples	Example hypothesis: “The popularity of Nokia phones has to do with the quality of their user interface.”
How	Try to approximate a true experimental design as closely as possible, by randomizing where possible, and by controlling the independent variables, if possible. Make the best attempt possible at analyzing potential alternative variables related to the dependent variable to be measured.
Example method	Questionnaires: A set of questions posed to people intended to reflect reality.
Important note on questionnaires	Take good care designing/writing the questions! (It's harder to get it right than you think!)
Bottom line	Unavoidable in all fields of study. Very useful as a supportive method to true experiments.

Pilots

What is it?	A looser pre-study that uses the intended experimental design in order to tune it. Alternatively, a pre-study intended to gauge the nature, scales or other factors of the variables to be measured, or of the subject to be measured.
Why and when	Pilots are much more useful than you might think. Yes, it will increase the duration and effort of your experiment BUT: It can significantly improve the quality of the subsequent experiment in many cases. It will certainly clarify and sharpen the experimenter's understanding of one or more of: the experiment, experimental procedure, variables and subjects.
Bottom line	Do not try to “save time” by skipping a pilot! A pilot study makes sense — most of the time.

ABAB (aka Repeated Measures)

What is it?	Repeated measurements of the same sample, varying the independent variables between sessions
When	When control group is not possible; When the group of subjects is small or single-case (e.g. medical studies)
Example
Often done with only ABA	Adding the last “B” tremendously increases internal validity
Bottom line	Much more powerful than most books on experimental designs will tell you
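A minimal sketch of how ABAB measurements might be summarized, assuming A denotes the baseline condition and B the intervention. The function names and the numbers in the demo are made up for illustration. This is a descriptive summary only, not a statistical test.

```python
from statistics import mean


def abab_summary(phases):
    """Summarize an ABAB session list.

    phases: list of (label, measurements) in the order they were run,
            e.g. [("A", [...]), ("B", [...]), ("A", [...]), ("B", [...])].
    Returns the per-phase means and the shift between consecutive phases.
    """
    means = [(label, mean(values)) for label, values in phases]
    shifts = [later[1] - earlier[1] for earlier, later in zip(means, means[1:])]
    return means, shifts


def effect_replicates(shifts):
    """True if consecutive shifts alternate in sign.

    Alternating signs mean the effect appears, disappears and reappears
    as the independent variable is switched on, off and on again.
    """
    return all(a * b < 0 for a, b in zip(shifts, shifts[1:]))


if __name__ == "__main__":
    # Illustrative, invented measurements of one dependent variable
    sessions = [
        ("A", [10, 11, 9]),
        ("B", [15, 16, 14]),
        ("A", [10, 10, 11]),
        ("B", [16, 15, 15]),
    ]
    phase_means, phase_shifts = abab_summary(sessions)
    print(phase_means, phase_shifts, effect_replicates(phase_shifts))
```

With only ABA there are two shifts to compare; the final B adds a third, which is one way to see why it strengthens the case for internal validity.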

Usability Studies

What is it?	The study of human use of technology. Not an experimental design paradigm in and of itself, yet important enough to warrant special discussion
When	When technology and/or its users are of interest
How	Experimental setup - easy to use true experimental design, but field studies are also common. Not as common: Models of users - simulations, e.g. Model Human Processor (Card, Moran, Newell) - typically used in addition to basic experiments or as a pilot
Origin	As people interact more with technology, questions regarding the outcome necessitate studying users and technology in context with each other
Bottom line	Increasingly important in a world where more and more technology is interacting with humans

True Experimental Design: Overview

True Experimental Designs: Procedure

Some Statistical Methods for Experimental Designs: What to Use When

t-test

Using Models to Validate and Measure: The Model Human Processor

Next Project: Write Contributions, Results, Conclusion

True Experimental Design: Main Concepts

Independent variables	We select their values - the values are known when we start an experiment. Any independent variable must have at least 2 levels (values), so its effect can be evaluated.
Levels	Relating to an independent variable: The number of levels of an independent variable is equal to the number of variations of that variable used in an experiment.
Dependent variables	Values are measured during and/or after the experiment.
Sample: subject selection from a “population”. A representative subset, drawn from a population, of the phenomenon we are studying.	Examples: a. Siggi, Maggi and Biggi representing human males. b. 10 lakes representing all freshwater on the Earth's surface. c. rust on bottom of doors representing the overall state of an automobile. A sample should be randomly chosen to (1) minimize spurious correlations and thus (2) maximize the generalizability of the results of measuring only a small subset of the phenomenon.
Spurious correlation	“false” correlation - correlation that implies a connection between things measured, where there is no causal relationship between them, in and of themselves.
Between-subjects design	If our control group in an experiment contains different instances than the experimental group.
Within-subjects design	When the instances in our experimental group serve as their own control group.
Internal validity	How likely is it that the independent variables caused the dependent variables?
External validity	How likely is it that the results generalize to other instances of the phenomenon under study?

True Experimental Design: Procedure

Identify the phenomenon to study	Characterize it in ways that make it easy to study.
Ask the right question(s)	“A question well asked is a question half-answered.”
Identify variables that matter	Independent and dependent.
Choose experimental design	Based on the nature of the experiment, but with some flexibility with regard to how detailed/reliable/etc. the results should be.
Design the setup	Identify all factors that could potentially confound your results.
Execute the experiment	Double-blind procedure: The experimenter does not know which group a subject belongs to and/or which level of an independent variable is being tested.
Collect the data	Use tables, graphs, as appropriate - very important to choose right presentation method.
Apply statistical tests	Make sure you select the right statistical test based on your design and your knowledge of the relationship between your sample and your population, and the distribution and means of the population that the sample is drawn from.
Draw conclusions from statistical tests	Use inference, based on probabilities, statistical significance.
Write up the report

Methods for Experimental Designs: What to Use When

Selecting between hypotheses	Statistical tests help you figure out whether the difference (in means) observed in a dependent variable (as measured between two samples) is large enough to be unlikely to be a coincidence. To make this judgement, the “natural” variation in each group is used as a “baseline”. The significance level states how unlikely under chance alone a result must be before it is considered “significant”. p<0.05 and p<0.01 are most common (less than a 5% or 1% probability of seeing a difference at least this large if chance alone were at work).
What you study	What you can use
Two factors varying along a continuum	Correlation/regression measures
Two factors, where the independent variable has (or can be reduced to) two discrete values	t-test
One dependent variable, multiple independent variables, each with two or more levels	ANOVA - Analysis of variance
Many dependent variables, many independent variables	MANOVA (multivariate analysis of variance)
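The idea of judging a difference in means against chance variation can be made concrete with a permutation test. This technique is swapped in here for illustration and is not one of the tests listed in the table above. The function name `permutation_test` and its parameters are assumptions.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate how often chance alone produces a difference in means
    at least as large as the one observed between the two groups.

    Returns an approximate two-sided p-value.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Reassign the measurements to the two groups at random:
        # this is what the data would look like if group membership did not matter.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimate is never exactly zero
    return (at_least_as_extreme + 1) / (n_permutations + 1)
```

If the returned value falls below the chosen significance level (e.g. 0.05), the observed difference would be called significant at that level.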

t-test

A fairly robust test for simple comparison experiments	Assumptions about population means and distributions can be violated without too much trouble.
Sample size	Good for small sample sizes
Paired t-test	Used for within-subjects designs
Standard t-test	For between-subjects designs
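A minimal sketch of the two t statistics, using only the Python standard library. It assumes that the "standard" t-test means Student's pooled-variance version, which the notes do not specify. The function names are illustrative. The functions return the t value and the degrees of freedom, which would then be compared against a t table or passed to a statistics package to obtain p.

```python
import math
from statistics import mean, stdev, variance


def independent_t(group_a, group_b):
    """Student's t for a between-subjects design (pooled variance).

    Returns (t, degrees_of_freedom).
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pool the two sample variances, weighted by their degrees of freedom
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


def paired_t(first, second):
    """Paired t for a within-subjects design: each subject is measured twice.

    Returns (t, degrees_of_freedom).
    """
    if len(first) != len(second):
        raise ValueError("paired measurements must have equal length")
    # Work on per-subject differences, so each subject is its own control
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1
```

The paired version removes between-subject variation from the comparison, which is why it suits within-subjects designs.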

Using Models to Validate and Measure - a.k.a. Simulation

What simulation is	A simplified model of subject under study. Sometimes only the environment is simulated.
What it does	Simplifies! Makes it easier to (A) set up testing conditions, (B) control independent variables, (C) make changes to the independent variables, (D) measure the results.
When to use	When the complexity of that which is to be modeled/understood becomes so great that mathematical models are intractable and hypothesis falsification would take decades, centuries or millennia.
Kinds of simulation methodologies	Continuous time and state: E.g. differential equations. Discrete time/state: E.g. automata.
Relation between scientific theories and simulations	To build a simulation we need a theory that tells us how things relate to each other.
Procedure	Pick methodology. Decide which kinds of questions to answer. Model major states/transitions or input/output/functional properties of system. Run simulations with variations in independent variables. Note outcome. Fix model. Repeat.
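The procedure above can be sketched with a small example. The model chosen here (logistic growth, stepped forward with Euler's method) and all parameter values are assumptions made for illustration. The growth rate plays the role of the independent variable, and the time to reach half of the capacity is the dependent variable.

```python
def simulate_logistic(growth_rate, capacity=1000.0, initial=10.0, dt=0.1, steps=500):
    """Discrete-time approximation (Euler steps) of dN/dt = r * N * (1 - N / K)."""
    n = initial
    trajectory = [n]
    for _ in range(steps):
        n += dt * growth_rate * n * (1 - n / capacity)
        trajectory.append(n)
    return trajectory


def steps_to_half_capacity(trajectory, capacity=1000.0):
    """Dependent variable: index of the first step at or above half capacity."""
    for step, n in enumerate(trajectory):
        if n >= capacity / 2:
            return step
    return None  # never reached within the simulated time span


def run_experiment(growth_rates):
    """Run one simulation per level of the independent variable and note the outcome."""
    return {r: steps_to_half_capacity(simulate_logistic(r)) for r in growth_rates}


if __name__ == "__main__":
    for rate, step in run_experiment([0.25, 0.5, 1.0]).items():
        print(f"growth rate {rate}: half capacity reached at step {step}")
```

Comparing such outcomes with measurements of the real system is what drives the "fix model, repeat" part of the procedure.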

Levels of System Knowledge in Simulation

0 Source Level	What variables to measure and how to observe them
1 Data	Data collected from a source system
2 Generative	Means to generate data in the system
3 Structure	Components (at lower levels) coupled together to form a generative system
Source: G. Klir 1985

Using Engineering Models to Validate and Measure: The Model Human Processor

Card, Moran & Newell	1983, 1986
Model Human Processor	An attempt at an engineering approach to usability studies
Use data from psychological studies	Construct a model of a human user
Various elements of a user's mind	Memories, perception modules, cycle times, decay times, etc., plus a number of typical performance values and principles on how to use model to predict performance
Interest has been growing	But has been slower than most predicted
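A minimal sketch of how cycle times can be used to predict performance. The cycle-time values below are the commonly cited typical and range figures attributed to Card, Moran & Newell. They are not given in these notes and should be checked against the original before use. The names `CycleTime` and `predict_ms` are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CycleTime:
    """Processor cycle time in milliseconds: typical value and fast/slow bounds."""
    typical: int
    fast: int
    slow: int


# Commonly cited values (assumption: verify against Card, Moran & Newell 1983)
PERCEPTUAL = CycleTime(typical=100, fast=50, slow=200)
COGNITIVE = CycleTime(typical=70, fast=25, slow=170)
MOTOR = CycleTime(typical=70, fast=30, slow=100)


def predict_ms(perceptual_cycles, cognitive_cycles, motor_cycles):
    """Predict task time by summing the cycles each processor needs for the task."""
    counts = (
        (PERCEPTUAL, perceptual_cycles),
        (COGNITIVE, cognitive_cycles),
        (MOTOR, motor_cycles),
    )
    return {
        "typical": sum(cycle.typical * n for cycle, n in counts),
        "fast": sum(cycle.fast * n for cycle, n in counts),
        "slow": sum(cycle.slow * n for cycle, n in counts),
    }


if __name__ == "__main__":
    # Simple reaction: perceive a signal, decide, press a key (one cycle each)
    print(predict_ms(1, 1, 1))
```

More complex tasks are modeled by counting how many cycles of each processor the task requires, which is where the model's principles of use come in.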

EOF

Center for Analysis and Design of Intelligent Agents
