T-713-MERS-2025 Main 
Link to Lecture Notes
Task Theory
| What it is | A systematic framework for describing, comparing, and analyzing tasks, independent of any specific agent. Provides the foundations for evaluating intelligent systems empirically (measurable outcomes, repeatable experiments, controlled variables). | 
| Purpose | In AI today, evaluation is benchmark-driven (ImageNet, Atari, Go) but lacks a unifying science. Task theory proposes such a foundation, playing a role analogous to that of physics for engineering. It lets us treat tasks as objects of study, enabling systematic experimentation in Empirical Reasoning Systems (ERS). | 
| Task vs. Environment | A Task is a desired transformation of the world (e.g., get ball into goal). The Environment is the context where variables evolve. Together: ⟨Task, Environment⟩. In ERS, this separation allows us to ask: *what is the structure of the problem?* before asking *how the agent solves it*. | 
| Agent Separation | Describing tasks independently of the agent prevents conflating “what is to be achieved” with “who/what achieves it.” This is central for ERS: it allows us to evaluate reasoning systems across different domains and agents. | 
| Why Important | Enables: (1) Comparison of tasks across domains; (2) Abstraction into task classes; (3) Estimation of resource needs (time, energy, precision); (4) General evaluation of reasoning systems, beyond one-off benchmarks. | 
| Example Analogy | In physics, wind tunnels test many airplane designs under the same controlled conditions. In ERS, task theory plays a similar role: controlling task variables so that reasoning systems can be compared fairly. | 
Task: How it Hangs Together
| T: A Task | T = { G, V, F, C } | 
| G: Goal | Set of desired states or outcomes. Goals define what counts as “success” from the observer’s perspective. Example: robot reaches waypoint within 1 m tolerance. | 
| V: Variables | V = { v₁, v₂, … }. Measurable and manipulatable aspects of the environment relevant to the task. Observer defines these formally (e.g., position, temperature); agent may only have partial/noisy access. | 
| F: Transformation Rules | Describe how variables evolve (physics, rules of a game, causal dynamics). These are objective world relations, available in principle to the observer. Agents must infer or approximate them. | 
| C: Constraints | Boundaries of what is possible (time, energy, error bounds, resource limits). Again, observer’s perspective = formal definition; agent’s perspective = experienced as difficulty or failure when limits are exceeded. | 
| Simple Task | Few variables, deterministic (press a button). | 
| Complex Task | Many variables, uncertainty, multi-step (cooking, multi-agent negotiation). | 
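The tuple above can be made concrete as a small data structure. A minimal sketch in Python, using the waypoint example from the G row; the names (`Task`, `goal`, `transform`) and the numbers are illustrative assumptions, not part of the lecture notes:

```python
# Minimal sketch of T = {G, V, F, C}; all parts are defined by the observer,
# independently of any agent. Names and values are hypothetical.
from dataclasses import dataclass
from typing import Callable, Dict

State = Dict[str, float]  # V as named, measurable variables (e.g., position, time)

@dataclass
class Task:
    goal: Callable[[State], bool]                # G: predicate defining "success"
    variables: State                             # V: initial values of task-relevant variables
    transform: Callable[[State, float], State]   # F: how variables evolve given an action
    constraints: Dict[str, float]                # C: limits (time budget, error bounds, ...)

# Example: "robot reaches waypoint at x = 10 within 1 m tolerance, within 30 s"
waypoint_task = Task(
    goal=lambda s: abs(s["x"] - 10.0) <= 1.0,
    variables={"x": 0.0, "t": 0.0},
    transform=lambda s, a: {"x": s["x"] + a, "t": s["t"] + 1.0},
    constraints={"max_time": 30.0},
)
```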
Intricacy & Difficulty
| Intricacy (Observer) | Structural complexity of a task, derived from number of variables, their couplings, and constraints in {V, F, C}. Defined independently of the agent (Eberding 2021). | 
| Effective Intricacy (Agent) | How complicated the task *appears to an agent*, given its sensors, prior knowledge, reasoning, and precision. For a perfect agent, effective intricacy → 0. | 
| Difficulty | A relation: Difficulty(T, Agent) = f(Intricacy(T), Agent Capacities). Same task can be easy for one agent, impossible for another. | 
| Example | Catching a ball: Observer sees physical intricacy (variables: position, velocity, gravity, timing). Agent: a human child has low effective intricacy after learning; a simple robot has very high effective intricacy. | 
| Connection to ERS | Difficulty is the bridge between objective task description (for observers) and empirical performance measures (for agents). ERS requires both views: tasks must be defined *in the world* (observer) but evaluated *through agent behavior*. | 
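A toy illustration of how the observer-side and agent-side views connect; the functions and numbers below are hypothetical stand-ins, not the definitions from Eberding (2021):

```python
# Illustrative only: a toy stand-in for Difficulty(T, Agent) = f(Intricacy(T), Agent capacities).

def intricacy(num_variables: int, num_couplings: int, num_constraints: int) -> float:
    """Observer-side structural complexity derived from {V, F, C}; agent-independent."""
    return float(num_variables + num_couplings + num_constraints)

def effective_intricacy(task_intricacy: float, agent_capacity: float) -> float:
    """How complicated the task appears to a given agent; approaches 0 for a 'perfect' agent."""
    return task_intricacy / max(agent_capacity, 1e-9)

# Catching a ball: same observer-side intricacy, very different effective intricacy per agent.
ball = intricacy(num_variables=4, num_couplings=3, num_constraints=2)  # position, velocity, gravity, timing
print(effective_intricacy(ball, agent_capacity=100.0))  # practiced child: low
print(effective_intricacy(ball, agent_capacity=2.0))    # simple robot: high
```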
Dimensions of Task Environments (Thórisson et al., 2015)
| Determinism | Whether the same action in the same state always leads to the same result (deterministic) or whether outcomes vary (stochastic). | 
| Ergodicity | The degree to which all relevant states can in principle be reached, and how evenly/consistently they can be sampled through interaction. | 
| Controllable Continuity | Whether small changes in agent output produce small, continuous changes in the environment (high continuity) or abrupt/discontinuous ones (low continuity). | 
| Asynchronicity | Whether the environment changes only in response to the agent (synchronous) or independently of it, on its own time (asynchronous). | 
| Dynamism | Extent to which the environment changes over time without agent input; static vs. dynamic worlds. | 
| Observability | How much of the environment state is accessible to the agent (full, partial, noisy). | 
| Controllability | The extent to which the agent can influence the environment state; fully controllable vs. only partially or weakly controllable. | 
| Multiple Parallel Causal Chains | Whether multiple independent processes can run in parallel, influencing outcomes simultaneously. | 
| Number of Agents | Whether there is only a single agent or multiple agents (cooperative, competitive, or mixed). | 
| Periodicity | Whether the environment exhibits cycles or repeating structures that can be exploited for prediction. | 
| Repeatability | Whether experiments in the environment can be repeated under the same conditions, producing comparable results. | 
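These dimensions can be read as a profile on which task environments are placed and compared. A minimal sketch, assuming hypothetical field names and 0..1 scales; this is not a formalization from Thórisson et al. (2015):

```python
# Sketch of an environment "profile" over the dimensions listed above,
# so two task environments can be compared along the same axes.
from dataclasses import dataclass

@dataclass
class EnvironmentProfile:
    deterministic: bool
    ergodicity: float               # 0..1: how much of the relevant state space is reachable
    controllable_continuity: float  # 0..1: small outputs -> small, continuous changes
    asynchronous: bool              # environment changes on its own time
    dynamism: float                 # 0..1: change without agent input
    observability: float            # 0..1: full = 1, partial/noisy < 1
    controllability: float          # 0..1: how much the agent can influence the state
    parallel_causal_chains: int
    num_agents: int
    periodic: bool
    repeatable: bool

# Illustrative values, in field order: a turn-based board game vs. physical ball-catching.
chess = EnvironmentProfile(True, 1.0, 0.0, False, 0.0, 1.0, 1.0, 1, 2, False, True)
catch = EnvironmentProfile(False, 0.6, 0.9, True, 0.8, 0.5, 0.4, 3, 1, False, False)
```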
Why Task Theory Matters for Empirical Reasoning
| For Science (Observer) | Provides systematic, measurable, repeatable description of tasks — necessary for empirical study of reasoning systems. Comparable to controlled experiments in physics or biology. | 
| For Engineering (Agent & System Design) | Allows construction of benchmarks that measure generality (performance across task classes), not just single skills. Supports systematic curricula for training agents. | 
| For Empirical Evaluation (ERS Core) | Clarifies whether failure is due to the task (high intricacy, under-specified goals) or the agent (limited sensors, reasoning). Enables falsifiable claims about system capability. | 
| Reflection | In ERS, intelligence boils down to: *Given a formally defined task, how well does an agent reason about it empirically, under uncertainty and constraints?* Task theory provides the shared language to answer this. | 
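A sketch of what such an empirical evaluation could look like in code, building on the `Task` sketch above: score an agent over a sampled class of tasks rather than a single benchmark instance. The names and loop are illustrative assumptions, not the interface of SAGE or any other simulator:

```python
# Toy evaluation harness: fraction of sampled task instances an agent solves.
import random

def run_episode(task, agent, max_steps=100):
    """Return True if the agent reaches the task's goal within its constraints.
    Assumes a waypoint-style Task (see sketch above) with variables "x", "t"
    and a "max_time" constraint."""
    state = dict(task.variables)
    for _ in range(max_steps):
        if task.goal(state):
            return True
        action = agent(state)                  # the agent only sees the (possibly partial) state
        state = task.transform(state, action)  # the world evolves according to F
        if state["t"] > task.constraints["max_time"]:
            return False
    return False

def generality_score(make_task, agent, n=50, seed=0):
    """Score over a task class: sample n task instances and report the success rate."""
    rng = random.Random(seed)
    tasks = [make_task(rng) for _ in range(n)]
    return sum(run_episode(t, agent) for t in tasks) / n
```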
Discussion Prompts
| Question | Observer Angle | Agent Angle | 
| -------- | -------------- | ----------- | 
| How is a “task” different from a “problem” in classical AI? | Problem = symbolic puzzle; Task = measurable transformation in a world | Must act in the world to achieve it | 
| Why must tasks be agent-independent? | To compare systems systematically | Otherwise evaluation collapses into “how this agent did” | 
| Can you think of a task with low intricacy but high difficulty for humans? | Observer: low variable count | Agent: limited memory/attention makes it hard (e.g., memorizing 200 digits) | 
| What role does causality play in defining tasks? | Observer: rules F define dynamics | Agent: must infer/approximate causal relations from data | 
| How does a variable-task simulator (like SAGE) help ERS? | Observer: controls task parameters systematically | Agent: experiences wide range of tasks, supports empirical generality tests | 