Theory (ísl. kenning) | “A set of statements or principles devised to explain a group of facts or phenomena, especially one that has been repeatedly tested or is widely accepted and can be used to make predictions about natural phenomena.” REF A theory is a relatively big explanation, covering several phenomena, often through a single principle, or a set of simple principles. |
Hypothesis (ísl. tilgáta) | A prediction about the relationship between a limited set of phenomena, as explained by a particular theory. |
Experimental design | “A planned interference in the natural order of events.” |
Quasi-Experimental | When conditions do not permit an ideal design to be used and a controlled experiment is impossible, there are other techniques that can be used. These are called quasi-experimental designs. |
Dependent variable(s) | These are “the things we want to measure”, e.g. the speedup seen with the new word processor. Values are measured during and/or after the experiment. |
Independent variables | These are factors that need to be controlled for the results to be more intelligible. Example: If we want to study the efficiency speedup seen by a new multi-cultural word processor we would want to have all or some of the cultures represented when we do the study. We select their values - the values are known when we start an experiment. Any independent variable must have at least 2 levels (values), so its effect can be evaluated. |
Subject(s) | The entity under study, which can be people, technology or natural phenomena. |
Sample (subject selection from a “population”): a representative subset, drawn from a population, of the phenomenon we are studying. | Typically you can't study all the individuals of a particular subject pool, so in your experiment you use a sample and hope that the results generalize to the rest of the subjects. Examples: (a) Siggi, Maggi and Biggi representing human males; (b) 10 lakes representing all freshwater on the Earth's surface; (c) rust on the bottom of doors representing the overall state of an automobile. A sample should be randomly chosen to (1) minimize spurious correlations and thus (2) maximize the generalizability of the results of measuring only a small subset of the phenomenon. |
Data | Typically “raw numbers” – only contain low-level semantics |
Information | Processed and prepared data |
Statistics | Mathematical methods for dealing with uncertainty. |
Identification, description and formalization of phenomenon | 1. Observation and description of a phenomenon or group of phenomena. |
Hypothesis, null-hypothesis | 2. Formulation of an hypothesis to explain the phenomena. In physics, the hypothesis often takes the form of a causal mechanism or a mathematical relation. |
Creation of experimental setup to test hypothesis | 3. Use of the hypothesis to predict the existence of other phenomena, or to predict quantitatively the results of new observations. |
Performance of experiment, collection and analysis of results | 4. Performance of experimental tests of the predictions by several independent experimenters and properly performed experiments. Basic assumption: Repeatability — Can be repeated by anyone anywhere |
Repeatability requires formal framework | Detailed description, clear goals, clear (limited) scope, hence the formalities in their execution |
Key idea: Comparison | Baseline collected in the same experimental setup, without any other intervention by the experimenter |
Key way of comparing | Experiments |
Sample | Typically you can't study all the individuals of a particular subject pool, so in your experiment you use a sample and hope that the results generalize to the rest of the subjects. |
Between subjects vs. within subjects design | Between subjects: two separate groups of subjects/phenomena are measured. Within subjects: the same subjects/phenomena are measured twice, on different occasions. |
Internal validity | How likely is it that the independent variables caused the dependent variables? |
External validity | How likely is it that the results generalize to other instances of the phenomenon under study? |
What it is | A fairly recent research method, historically speaking, for testing hypotheses / theories |
When to use | When it is possible to control and select everything of importance to the subject of study |
How to do | Select subjects freely, randomize samples, remove experimenter effect through double-blind procedure, use control groups, select independent and dependent variables as necessary to answer the questions raised. |
Why randomize? | Given a complex phenomenon, it is impossible to know all potential causal chains that may exist between the various elements under study. Randomization lessens the probability that there is systematic bias in any factors that are not under study but could affect the results and thus imply different conclusions. |
What is randomized? | The sample should be randomly selected, and subjects should be randomly assigned to control group versus experimental group. The same goes for any identified independent variable that could affect the results but is not of interest to the research at hand: its values should be randomized too. |
Bottom line | Controlled Experiment: The most powerful mechanism for generating reliable knowledge known to mankind. |
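
The random assignment of subjects to control and experimental groups described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function name `assign_groups`, the subject IDs and the fixed seed are all assumptions made for the example.

```python
import random

def assign_groups(subjects, seed=None):
    """Randomly split subjects into a control and an experimental group."""
    rng = random.Random(seed)   # a seed is used here only to make the example reproducible
    shuffled = list(subjects)   # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"control": shuffled[:half], "experimental": shuffled[half:]}

# Hypothetical subject IDs, invented for illustration.
subjects = [f"S{i:02d}" for i in range(1, 21)]
groups = assign_groups(subjects, seed=42)
print(groups["control"])
print(groups["experimental"])
```

Because the shuffle ignores every property of the subjects, any factor that is not under study is spread across both groups by chance rather than by the experimenter's choices.
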
Identify the phenomenon to study | Characterize it in ways that make it easy to study. |
Ask the right question(s) | “A question well asked is a question half-answered.” |
Identify variables that matter | Independent and dependent. |
Choose experimental design | Based on the nature of the experiment, with some flexibility with regard to how detailed/reliable/etc. the results should be. |
Designing the experiment | How do we measure the dependent variables? How do we control the independent variables? What are the hypotheses? How will we run the experiment? Many ways to avoid spurious correlation, e.g. double-blind procedure: The experimenter does not know which group a subject belongs to and/or which level of an independent variable is being tested. |
Choose statistical tests | Statistical methods should be chosen with the experimental design: Make sure you select the right statistical test based on your design and your knowledge of the relationship between your sample and your population, and the distribution and means of the population that the sample is drawn from. |
Design the setup | Identify all factors that could potentially confound your results and take appropriate measures to get the best results possible. |
Execute the experiment | Follow the method designed! |
Collect data, analyze results | Careful experimental design makes this straightforward (bad design makes it a nightmare!). Use tables and graphs as appropriate; choosing the right presentation method is very important. |
Draw conclusions from statistical tests | Use inference, based on probabilities, statistical significance. |
Write up the report | in theses or scientific publications |
Correlation | Two factors/variables co-vary when changes in one are related to changes in the other, either positively or negatively. |
Correlation: Powerful tool | Any variables in the world can be measured for correlation. Only two variables are needed (independent and dependent) for doing correlation studies |
Main operating principle behind correlation | There is no causation without correlation |
Correlation: Pitfall | Correlation does not imply causation between the variables measured! |
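
As a concrete illustration of measuring how two variables co-vary, here is a minimal sketch of the Pearson correlation coefficient in plain Python. The notes do not name a specific correlation measure, so the choice of Pearson's r is an assumption, as are the function name `pearson_r` and the invented data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: hours of training vs. number of errors made.
hours = [1, 2, 3, 4, 5]
errors = [10, 8, 6, 4, 2]
r = pearson_r(hours, errors)
print(round(r, 3))
```

A value near -1 or +1 indicates strong co-variation, but, as the pitfall above states, it says nothing by itself about which variable (if either) causes the other.
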
Quasi-experimental designs | Purpose: to approximate a true experimental design where one is not possible, for example when direct control over the dependent/independent variables is not possible. |
How it works | 1. One-shot case study (no control group) 2. Single group pre- and post-test (minimal control) 3. ABAB: Single-group repeated measures (slightly less minimal control) |
Limitations | Much greater uncertainty as to the internal and external validity of the quasi-experiments than true experimental designs |
What is it? | A quasi-experimental design used to study a phenomenon “in the wild”. |
When | When a controlled experimental setup is not possible. |
Examples | Example hypothesis: “The popularity of Nokia phones has to do with the quality of their user interface.” |
How | Approximate a true experimental design as closely as possible, by randomizing where possible and by controlling the independent variables where possible. Make the best possible attempt at analyzing potential alternative variables related to the dependent variable to be measured. |
Example method | Questionnaires: A set of questions posed to people intended to reflect reality. |
Important note on questionnaires | Take good care designing/writing the questions! (It's harder to get it right than you think!) |
Bottom line | Unavoidable in all fields of study. Very useful as a supportive method to true experiments. |
What is it? | A looser pre-study that uses the intended experimental design in order to tune it; or a pre-study intended to gauge the nature, scales or other factors of the variables to be measured, or of the subject to be measured. |
Why and when | Pilots are much more useful than you might think. Yes, it will increase the duration and effort of your experiment BUT: It can significantly improve the quality of the subsequent experiment in many cases. It will certainly clarify and sharpen the experimenter's understanding of one or more of: the experiment, experimental procedure, variables and subjects. |
Bottom line | Do not try to “save time” by skipping a pilot! A pilot study makes sense — most of the time. |
What is it? | Repeated measurements of the same sample, varying the independent variables between sessions |
When | When control group is not possible; When the group of subjects is small or single-case (e.g. medical studies) |
Example | |
Often done with only ABA | Adding the last “B” tremendously increases internal validity |
Bottom line | Much more powerful than most books on experimental designs will tell you |
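
A small sketch of how ABAB measurements might be summarized: group consecutive sessions into phases and compare phase means. It assumes the common convention that A is the baseline condition and B the intervention; the function name `phase_means` and all numbers are invented for illustration.

```python
from statistics import mean

def phase_means(measurements):
    """Collapse (phase_label, value) pairs, in session order, into per-phase means."""
    phases = []
    for label, value in measurements:
        if phases and phases[-1][0] == label:
            phases[-1][1].append(value)      # same phase continues
        else:
            phases.append((label, [value]))  # a new phase starts
    return [(label, mean(values)) for label, values in phases]

# Hypothetical sessions for a single subject: A = baseline, B = intervention.
sessions = [("A", 10), ("A", 11), ("A", 9),
            ("B", 15), ("B", 16), ("B", 14),
            ("A", 10), ("A", 10), ("A", 10),
            ("B", 16), ("B", 15), ("B", 17)]
summary = phase_means(sessions)
print(summary)

# The effect "replicates" if each B phase differs from the A phase before it.
replicated = all(summary[i + 1][1] > summary[i][1] for i in (0, 2))
print(replicated)
```

Seeing the measure return to baseline in the second A phase and rise again in the second B phase is what makes alternative explanations (such as a general trend over time) less plausible.
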
What is it? | The study of human use of technology. Not an experimental design paradigm in and of itself, yet important enough to warrant special discussion |
When | When technology and/or its users are of interest |
How | Experimental setup: true experimental designs are easy to use, but field studies are also common. Less common: models of users (simulations), e.g. the Model Human Processor (Card, Moran, Newell), typically used in addition to basic experiments or as a pilot. |
Origin | As people interact more with technology, questions regarding the outcome necessitate studying users and technology in context with each other |
Bottom line | Increasingly important in a world where more and more technology is interacting with humans |
True Experimental Designs: Procedure |
Some Statistical Methods for Experimental Designs: What to Use When |
t-test |
Using Models to Validate and Measure: The Model Human Processor |
Next Project: Write Contributions, Results, Conclusion |
Independent variables | We select their values - the values are known when we start an experiment. Any independent variable must have at least 2 levels (values), so its effect can be evaluated. |
Levels | Relating to an independent variable: The number of levels of an independent variable is equal to the number of variations of that variable used in an experiment. |
Dependent variables | Values are measured during and/or after the experiment. |
Sample: subject selection from a “population” A representative subset, drawn from a population, of the phenomenon we are studying. | Examples: a. Siggi, Maggi and Biggi representing human males. b. 10 lakes representing all freshwater on the Earth's surface. c. rust on bottom of doors representing the overall state of an automobile. A sample should be randomly chosen to (1) minimize spurious correlations and thus (2) maximize the generalizability of the results of measuring only a small subset of the phenomenon. |
Spurious correlation | “false” correlation - correlation that implies a connection between things measured, where there is no causal relationship between them, in and of themselves. |
Between-subjects design | A design in which the control group contains different instances than the experimental group. |
Within-subjects design | A design in which the instances in the experimental group serve as their own control group. |
Internal validity | How likely is it that the independent variables caused the dependent variables? |
External validity | How likely is it that the results generalize to other instances of the phenomenon under study? |
Identify the phenomenon to study | Characterize it in ways that make it easy to study. |
Ask the right question(s) | “A question well asked is a question half-answered.” |
Identify variables that matter | Independent and dependent. |
Choose experimental design | Based on the nature of the experiment, with some flexibility with regard to how detailed/reliable/etc. the results should be. |
Design the setup | Identify all factors that could potentially confound your results. |
Execute the experiment | Double-blind procedure: The experimenter does not know which group a subject belongs to and/or which level of an independent variable is being tested. |
Collect the data | Use tables, graphs, as appropriate - very important to choose right presentation method. |
Apply statistical tests | Make sure you select the right statistical test based on your design and your knowledge of the relationship between your sample and your population, and the distribution and means of the population that the sample is drawn from. |
Draw conclusions from statistical tests | Use inference, based on probabilities, statistical significance. |
Write up the report |
Selecting between hypotheses | Statistical tests help you judge whether the difference (in means) observed in a dependent variable (as measured between two samples) is large enough to be unlikely to be a coincidence. To make this judgement, the “natural” variation within each group is used as a “baseline”. The significance level is the threshold you choose in advance for how unlikely a result must be under pure chance to count as “significant”. p<0.05 and p<0.01 are most common (less than a 5% or 1% probability of seeing a difference at least this large if there were in fact no real effect). |
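
One way to make the idea of a p-value concrete is an exact permutation test: relabel the pooled measurements in every possible way and count how often a difference in means at least as large as the observed one arises by relabeling alone. This is a sketch under stated assumptions; the permutation test is chosen here for illustration and is not a method named in these notes, and the function name and the task-time data are invented.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test on the difference in group means."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    relabelings = 0
    at_least_as_extreme = 0
    # Every way of picking which pooled values are labelled "group A".
    for indices in combinations(range(len(pooled)), n_a):
        a_sum = sum(pooled[i] for i in indices)
        diff = abs(a_sum / n_a - (total - a_sum) / n_b)
        relabelings += 1
        if diff >= observed - 1e-12:   # small tolerance for float rounding
            at_least_as_extreme += 1
    return at_least_as_extreme / relabelings

# Hypothetical task times (minutes) with the old and the new word processor.
old_times = [12, 14, 11, 13, 15]
new_times = [9, 8, 10, 7, 9]
p = permutation_p_value(old_times, new_times)
print(p < 0.01)
```

With these invented numbers only 2 of the 252 possible relabelings produce a difference as large as the observed one, so the result would count as significant at the p<0.01 level.
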
What you study | What you can use |
Two factors varying along a continuum | Correlation/regression measures |
Two factors, where independent variable has (or can have) a few discrete values | t-test |
One dependent variable, multiple independent variables, each with two or more levels | ANOVA - Analysis of variance |
Many dependent variables, many independent variables | MANOVA (multivariate analysis of variance) |
A fairly robust test for simple comparison experiments | Assumptions about population means and distributions can be violated without too much trouble. |
Sample size | Good for small sample sizes |
Paired t-test | Used for within-subjects designs |
Standard t-test | For between-subjects designs |
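
The difference between the paired and the standard t-test can be sketched by computing both t statistics by hand with the Python standard library. The function names and data are illustrative assumptions, and the standard test shown is the pooled-variance (Student's) form, which assumes equal population variances.

```python
import math
from statistics import mean, stdev, variance

def paired_t(before, after):
    """t statistic and degrees of freedom for a within-subjects design."""
    diffs = [a - b for b, a in zip(before, after)]   # per-subject change
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

def independent_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for a between-subjects design."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(group_a) - mean(group_b)) / std_err
    return t, n_a + n_b - 2

# Hypothetical task times (minutes); the same numbers are analyzed both ways.
before = [12, 14, 11, 13, 15]
after = [10, 13, 9, 12, 12]

t_paired, df_paired = paired_t(before, after)
t_indep, df_indep = independent_t(before, after)
print(round(t_paired, 3), df_paired)
print(round(t_indep, 3), df_indep)
```

The resulting t value is then compared against the t distribution with the given degrees of freedom. On the same numbers the paired statistic is much larger in magnitude, because differences between subjects are removed from the "natural" variation when each subject serves as their own control.
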
What simulation is | A simplified model of subject under study. Sometimes only the environment is simulated. |
What it does | Simplifies! Makes it easier to (A) set up testing conditions, (B) control independent variables, (C) make changes to the independent variables,(D) measure the results. |
When to use | When the complexity of that which is to be modeled/understood becomes so great that mathematical models are intractable and hypothesis falsification would take decades, centuries or millennia. |
Kinds of simulation methodologies | Continuous time and state: E.g. differential equations. Discrete time/state: E.g. automata. |
Relation between scientific theories and simulations | To build a simulation we need a theory that tells us how things relate to each other. |
Procedure | Pick methodology. Decide which kinds of questions to answer. Model major states/transitions or input/output/functional properties of system. Run simulations with variations in independent variables. Note outcome. Fix model. Repeat. |
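
The procedure above can be sketched with a tiny discrete-time simulation. The logistic growth model used here is only a stand-in chosen for illustration (it is not taken from these notes): the growth rate plays the role of the independent variable and the final state is the dependent variable.

```python
def simulate(growth_rate, steps=50, initial=0.1):
    """Discrete-time logistic model: x[t+1] = r * x[t] * (1 - x[t])."""
    x = initial
    for _ in range(steps):
        x = growth_rate * x * (1 - x)
    return x

# Run the simulation with variations in the independent variable; note the outcome.
results = {rate: simulate(rate) for rate in (0.5, 1.5, 2.5)}
for rate, final_state in results.items():
    print(rate, round(final_state, 4))
```

If the noted outcomes disagree with observations of the real system, the model is fixed and the runs are repeated, as in the last steps of the procedure.
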
0 Source | What variables to measure and how to observe them |
1 Data | Data collected from a source system |
2 Generative | Means to generate data in the system |
3 Structure | Components (at lower levels) coupled together to form a generative system |
Source: G. Klir 1985 |
Card, Moran & Newell | 1983, 1986 |
Model Human Processor | An attempt at an engineering approach to usability studies |
Use data from psychological studies | Construct a model of a human user |
Various elements of a user's mind | Memories, perception modules, cycle times, decay times, etc., plus a number of typical performance values and principles on how to use model to predict performance |
Interest has been growing | But has been slower than most predicted |
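
To illustrate how cycle times can be used to predict performance, the sketch below adds up processor cycles for a task. The nominal cycle times (perceptual 100 ms, cognitive 70 ms, motor 70 ms) are the values commonly quoted from Card, Moran & Newell and should be checked against the original before being relied on; the function name and the task breakdowns are assumptions made for this example.

```python
# Commonly quoted nominal cycle times in milliseconds (assumption: verify against the source).
CYCLE_MS = {"perceptual": 100, "cognitive": 70, "motor": 70}

def predict_time_ms(perceptual=1, cognitive=1, motor=1, cycles=CYCLE_MS):
    """Predicted task time as a sum of processor cycles."""
    return (perceptual * cycles["perceptual"]
            + cognitive * cycles["cognitive"]
            + motor * cycles["motor"])

# Simple reaction: perceive a stimulus, decide, press a key.
simple_reaction = predict_time_ms()
# Hypothetical task needing one extra cognitive cycle (e.g. a comparison step).
with_comparison = predict_time_ms(cognitive=2)
print(simple_reaction, with_comparison)
```
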