T-713-MERS-2025 Main
Link to Lecture Notes
Task Theory
What it is | A systematic framework for describing, comparing, and analyzing tasks, independent of any specific agent. Provides the foundations for evaluating intelligent systems empirically (measurable outcomes, repeatable experiments, controlled variables). |
Purpose | In AI today, evaluation is benchmark-driven (ImageNet, Atari, Go) but lacks a unifying science. Task theory proposes such a foundation, analogous to the role physics plays for engineering. It lets us treat tasks as objects of study, enabling systematic experimentation in Empirical Reasoning Systems (ERS). |
Task vs. Environment | A Task is a desired transformation of the world (e.g., get ball into goal). The Environment is the context where variables evolve. Together: ⟨Task, Environment⟩. In ERS, this separation allows us to ask: *what is the structure of the problem?* before asking *how the agent solves it*. |
Agent Separation | Describing tasks independently of the agent prevents conflating “what is to be achieved” with “who/what achieves it.” This is central for ERS: it allows us to evaluate reasoning systems across different domains and agents. |
Why Important | Enables: (1) Comparison of tasks across domains; (2) Abstraction into task classes; (3) Estimation of resource needs (time, energy, precision); (4) General evaluation of reasoning systems, beyond one-off benchmarks. |
Example Analogy | In physics, wind tunnels test many airplane designs under the same controlled conditions. In ERS, task theory plays a similar role: controlling task variables so that reasoning systems can be compared fairly. |
Task: How it Hangs Together
T: A Task | T = { G, V, F, C } |
G: Goal | Set of desired states or outcomes. Goals define what counts as “success” from the observer’s perspective. Example: robot reaches waypoint within 1 m tolerance. |
V: Variables | V = { v₁, v₂, … }. Measurable and manipulatable aspects of the environment relevant to the task. The observer defines these formally (e.g., position, temperature); the agent may only have partial/noisy access. |
F: Transformation Rules | Describe how variables evolve (physics, rules of a game, causal dynamics). These are objective world relations, available in principle to the observer. Agents must infer or approximate them. |
C: Constraints | Boundaries of what is possible (time, energy, error bounds, resource limits). Again, observer’s perspective = formal definition; agent’s perspective = experienced as difficulty or failure when limits are exceeded. |
Simple Task | Few variables, deterministic (press a button). |
Complex Task | Many variables, uncertainty, multi-step (cooking, multi-agent negotiation). |
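A minimal sketch (hypothetical Python, not part of the course material) of how the tuple T = { G, V, F, C } could be written down as a data structure, so that a task can be stored, instantiated, and compared without reference to any agent. All names here (Variable, Task, waypoint_task) are illustrative assumptions, not lecture definitions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical encoding of T = {G, V, F, C}; names are illustrative only.

@dataclass
class Variable:
    name: str          # e.g. "position", "temperature"
    value: float       # current (observer-side) value
    unit: str = ""     # measurement unit, e.g. "m", "K"

@dataclass
class Task:
    # G: predicate over variable values that defines "success"
    goal: Callable[[Dict[str, float]], bool]
    # V: measurable, manipulatable aspects of the environment
    variables: List[Variable]
    # F: how variable values evolve over one time step of length dt
    transform: Callable[[Dict[str, float], float], Dict[str, float]]
    # C: e.g. {"max_time": 10.0, "max_error": 0.1}
    constraints: Dict[str, float] = field(default_factory=dict)

# Example from the table above: "robot reaches waypoint within 1 m tolerance"
waypoint_task = Task(
    goal=lambda v: abs(v["x"] - 5.0) < 1.0,    # success: within 1 m of x = 5
    variables=[Variable("x", 0.0, "m")],
    transform=lambda v, dt: {"x": v["x"]},     # placeholder dynamics (world stands still)
    constraints={"max_time": 10.0},
)
```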
Intricacy & Difficulty
Intricacy (Observer) | Structural complexity of a task, derived from number of variables, their couplings, and constraints in {V, F, C}. Defined independently of the agent (Eberding 2021). |
Effective Intricacy (Agent) | How complicated the task *appears to an agent*, given its sensors, prior knowledge, reasoning, and precision. For a perfect agent, effective intricacy → 0. |
Difficulty | A relation: Difficulty(T, Agent) = f(Intricacy(T), Agent Capacities). Same task can be easy for one agent, impossible for another. |
Example | Catching a ball: Observer sees physical intricacy (variables: position, velocity, gravity, timing). Agent: a human child has low effective intricacy after learning; a simple robot has very high effective intricacy. |
Connection to ERS | Difficulty is the bridge between objective task description (for observers) and empirical performance measures (for agents). ERS requires both views: tasks must be defined *in the world* (observer) but evaluated *through agent behavior*. |
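One possible way to make the intricacy/difficulty distinction concrete in code. The scoring functions below are illustrative stand-ins, not Eberding's (2021) formal measure; the numbers are made up for the ball-catching example.

```python
# Illustrative only: intricacy is observer-side and agent-independent,
# difficulty relates that intricacy to a specific agent's capacities.

def intricacy(num_variables: int, num_couplings: int, num_constraints: int) -> float:
    """Observer-side structural complexity: grows with the number of variables,
    their couplings, and the constraints in {V, F, C}."""
    return num_variables + 2 * num_couplings + num_constraints

def difficulty(task_intricacy: float, agent_capacity: float) -> float:
    """Agent-relative difficulty: the same task is harder for a less capable agent.
    As agent_capacity grows (a 'perfect' agent), difficulty approaches 0."""
    return task_intricacy / agent_capacity

# Ball catching: variables include position, velocity, gravity, timing.
ball_catching = intricacy(num_variables=4, num_couplings=3, num_constraints=1)
print(difficulty(ball_catching, agent_capacity=10.0))  # practiced human child: low
print(difficulty(ball_catching, agent_capacity=0.5))   # simple robot: high
```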
Dimensions of Task Environments (Thórisson et al., 2015)
Determinism | Whether the same action in the same state always leads to the same result (deterministic) or whether outcomes vary (stochastic). |
Ergodicity | The degree to which all relevant states can in principle be reached, and how evenly/consistently they can be sampled through interaction. |
Controllable Continuity | Whether small changes in agent output produce small, continuous changes in the environment (high continuity) or abrupt/discontinuous ones (low continuity). |
Asynchronicity | Whether the environment changes only in response to the agent (synchronous) or independently of it, on its own time (asynchronous). |
Dynamism | Extent to which the environment changes over time without agent input; static vs. dynamic worlds. |
Observability | How much of the environment state is accessible to the agent (full, partial, noisy). |
Controllability | The extent to which the agent can influence the environment state; fully controllable vs. only partially or weakly controllable. |
Multiple Parallel Causal Chains | Whether multiple independent processes can run in parallel, influencing outcomes simultaneously. |
Number of Agents | Whether there is only a single agent or multiple agents (cooperative, competitive, or mixed). |
Periodicity | Whether the environment exhibits cycles or repeating structures that can be exploited for prediction. |
Repeatability | Whether experiments in the environment can be repeated under the same conditions, producing comparable results. |
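The dimensions above can be collected into a profile so that different task environments can be placed side by side. The sketch below is a hypothetical encoding: the field names, the 0 to 1 scaling, and the example values are illustrative choices, not part of the Thórisson et al. (2015) taxonomy itself.

```python
from dataclasses import dataclass

# Hypothetical profile over the environment dimensions listed above.

@dataclass
class EnvironmentProfile:
    determinism: float             # 1.0 = fully deterministic, 0.0 = highly stochastic
    ergodicity: float              # how evenly all relevant states can be reached
    controllable_continuity: float # small outputs -> small changes?
    asynchronicity: bool           # True if the world changes on its own time
    dynamism: float                # change over time without agent input
    observability: float           # 1.0 = full, lower = partial/noisy
    controllability: float         # how strongly the agent can influence state
    parallel_causal_chains: int    # number of independent concurrent processes
    num_agents: int
    periodicity: bool              # exploitable repeating structure
    repeatability: float           # how well experiments can be reproduced

chess = EnvironmentProfile(
    determinism=1.0, ergodicity=1.0, controllable_continuity=0.2,
    asynchronicity=False, dynamism=0.0, observability=1.0, controllability=0.5,
    parallel_causal_chains=1, num_agents=2, periodicity=False, repeatability=1.0,
)
robot_soccer = EnvironmentProfile(
    determinism=0.4, ergodicity=0.6, controllable_continuity=0.8,
    asynchronicity=True, dynamism=0.9, observability=0.6, controllability=0.3,
    parallel_causal_chains=5, num_agents=22, periodicity=False, repeatability=0.3,
)
```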
Why Task Theory Matters for Empirical Reasoning
For Science (Observer) | Provides systematic, measurable, repeatable description of tasks — necessary for empirical study of reasoning systems. Comparable to controlled experiments in physics or biology. |
For Engineering (Agent & System Design) | Allows construction of benchmarks that measure generality (performance across task classes), not just single skills. Supports systematic curricula for training agents. |
For Empirical Evaluation (ERS Core) | Clarifies whether failure is due to the task (high intricacy, under-specified goals) or the agent (limited sensors, reasoning). Enables falsifiable claims about system capability. |
Reflection | In ERS, intelligence boils down to: *Given a formally defined task, how well does an agent reason about it empirically, under uncertainty and constraints?* Task theory provides the shared language to answer this. |
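A short sketch of what "measuring generality rather than single skills" could look like operationally: score an agent on many sampled instances from several task classes and aggregate per class. The function name, signature, and task representation are hypothetical, assumed only for illustration.

```python
from statistics import mean
from typing import Callable, Dict, List

def generality_profile(
    agent: Callable[[dict], float],        # returns a score in [0, 1] for one task instance
    task_classes: Dict[str, List[dict]],   # class name -> sampled task instances
) -> Dict[str, float]:
    """Mean performance per task class; a general agent scores well across all classes,
    not just on one benchmark."""
    return {name: mean(agent(task) for task in tasks)
            for name, tasks in task_classes.items()}
```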
Discussion Prompts
Question | Observer Angle | Agent Angle |
---- | ---- | ----
How is a “task” different from a “problem” in classical AI? | Problem = symbolic puzzle; Task = measurable transformation in a world | Must act in the world to achieve it |
Why must tasks be agent-independent? | To compare systems systematically | Otherwise evaluation collapses into “how this agent did” |
Can you think of a task with low intricacy but high difficulty for humans? | Observer: low variable count | Agent: limited memory/attention makes it hard (e.g., memorizing 200 digits) |
What role does causality play in defining tasks? | Observer: rules F define dynamics | Agent: must infer/approximate causal relations from data |
How does a variable-task simulator (like SAGE) help ERS? | Observer: controls task parameters systematically | Agent: experiences wide range of tasks, supports empirical generality tests |
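A hypothetical sketch (not SAGE's actual API) of what a variable-task simulator does conceptually: the observer fixes parameter ranges for goal precision, constraints, and noise, and repeatable task variants are sampled from them, giving the wide, controlled range of tasks that empirical generality tests need.

```python
import random

# Illustrative only: parameter names and ranges are invented for this sketch.

def sample_task_variant(seed: int) -> dict:
    rng = random.Random(seed)  # seeding makes each variant repeatable
    return {
        "target_distance_m": rng.uniform(1.0, 20.0),  # controlled task variable (V)
        "tolerance_m": rng.uniform(0.1, 1.0),          # goal precision (G)
        "time_limit_s": rng.uniform(5.0, 60.0),        # constraint (C)
        "noise_level": rng.uniform(0.0, 0.5),          # observability degradation
    }

# 100 comparable, reproducible task variants for a curriculum or evaluation sweep.
curriculum = [sample_task_variant(seed) for seed in range(100)]
```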