[[/public:t-713-mers:mers-25:main|T-713-MERS-2025 Main]] \\ [[/public:t-713-mers:mers-25:lecture_notes|Link to Lecture Notes]]
\\
====== Task Theory ======
\\
| What it is | A systematic framework for describing, comparing, and analyzing **tasks**, independent of any specific agent. It provides the foundations for evaluating intelligent systems empirically (measurable outcomes, repeatable experiments, controlled variables). |
| Purpose | In AI today, evaluation is benchmark-driven (ImageNet, Atari, Go) but lacks a unifying science. Task theory proposes such a foundation, comparable to what physics provides for engineering. It lets us treat **tasks as objects of study**, enabling systematic experimentation in Empirical Reasoning Systems (ERS). |
| Task vs. Environment | A **Task** is a desired transformation of the world (e.g., get the ball into the goal). The **Environment** is the context in which the task's variables evolve. Together: ⟨Task, Environment⟩. In ERS, this separation allows us to ask //what is the structure of the problem?// before asking //how does the agent solve it?// |
| Agent Separation | Describing tasks independently of the agent prevents conflating "what is to be achieved" with "who/what achieves it." This is central for ERS: it allows us to evaluate reasoning systems across different domains and agents. |
| Why Important | Enables: (1) **Comparison** of tasks across domains; (2) **Abstraction** into task classes; (3) **Estimation** of resource needs (time, energy, precision); (4) **General evaluation** of reasoning systems, beyond one-off benchmarks. |
| Example Analogy | In physics, wind tunnels test many airplane designs under the same controlled conditions. In ERS, task theory plays a similar role: controlling task variables so that reasoning systems can be compared fairly. |
\\
==== Task: How it Hangs Together ====
| **T**: A Task | **T = { G, V, F, C }** |
| **G**: Goal | Set of desired states or outcomes. Goals define what counts as "success" from the **observer's perspective**. Example: robot reaches waypoint within 1 m tolerance. |
| **V**: Variables | **V = { v₁, v₂, … }**: measurable and manipulatable aspects of the environment relevant to the task. The observer defines these formally (e.g., position, temperature); the agent may only have partial/noisy access to them. |
| **F**: Transformation Rules | Describe how variables evolve (physics, rules of a game, causal dynamics). These are **objective world relations**, available in principle to the observer. Agents must infer or approximate them. |
| **C**: Constraints | Boundaries of what is possible (time, energy, error bounds, resource limits). Again, **observer's perspective** = formal definition; **agent's perspective** = experienced as difficulty or failure when limits are exceeded. |
| Simple Task | Few variables, deterministic (press a button). |
| Complex Task | Many variables, uncertainty, multi-step (cooking, multi-agent negotiation). |
\\
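To make the **T = { G, V, F, C }** structure concrete, here is a minimal Python sketch. It is illustrative only: the class, field, and variable names (''Task'', ''waypoint_task'', etc.) are assumptions made for this example, not part of the formal theory.
<code python>
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Illustrative sketch only: names and type choices are assumptions, not part of task theory itself.
State = Dict[str, float]    # an assignment to the task-relevant variables V
Action = Dict[str, float]   # an agent output that manipulates some of those variables

@dataclass
class Task:
    goal: Callable[[State], bool]                # G: predicate defining "success" (observer's perspective)
    variables: Tuple[str, ...]                   # V: names of the measurable, manipulatable variables
    transform: Callable[[State, Action], State]  # F: how variables evolve given an action (world dynamics)
    constraints: Dict[str, float]                # C: resource/error bounds (time, energy, tolerance, ...)

# Example from the table above: "robot reaches waypoint within 1 m tolerance".
waypoint_task = Task(
    goal=lambda s: abs(s["x"] - 10.0) <= 1.0,                       # success as judged by the observer
    variables=("x", "t"),
    transform=lambda s, a: {"x": s["x"] + a["dx"], "t": s["t"] + 1.0},
    constraints={"max_time": 30.0},                                 # exceeding limits is experienced by the agent as failure
)
</code>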
==== Intricacy & Difficulty ====
| Intricacy (Observer) | Structural complexity of a task, derived from number of variables, their couplings, and constraints in {V, F, C}. Defined **independently of the agent**. |
| Effective Intricacy (Agent) | How complicated the task **appears to an agent**, given its sensors, prior knowledge, reasoning, and precision. For a perfect agent, effective intricacy → 0. |
| Intricacy of Tasks | Based on (at least) three dimensions: |
| | (1) The minimal number of causal-relational models needed to represent the relations of the causal structure related to the goal(s). |
| | (2) The number, length, and type of mechanisms of causal chains that affect observable variables on a causal path to at least one goal. |
| | (3) The number of hidden confounders influencing causal structures related to the goal. |
| Difficulty | A relation: **Difficulty(T, Agent) = f(Intricacy(T), Agent Capacities)**. The same task can be easy for one agent, impossible for another. |
| Example | Catching a ball: the observer sees the physical intricacy (variables: position, velocity, gravity, timing). For the agent: a human child has low effective intricacy after learning; a simple robot has very high effective intricacy. |
| Connection to ERS | Difficulty is the bridge between **objective task description** (for observers) and **empirical performance measures** (for agents). ERS requires both views: tasks must be defined **in the world** (observer) but evaluated **through agent behavior**. |
\\
==== Example of a Task with Different Intricacy ====
{{ :public:t-713-mers:tasktheoryflowchart.png?nolink&700 |}}
Taken from [[https://www.researchgate.net/profile/Kristinn-Thorisson/publication/357637172_About_the_Intricacy_of_Tasks/links/620d1c8fc5934228f9701333/About-the-Intricacy-of-Tasks.pdf|About the Intricacy of Tasks]] by L. M. Eberding et al.
\\
==== Dimensions of Task Environments (Thórisson et al., 2015) ====
| Determinism | Whether the same action in the same state always leads to the same result (deterministic) or whether outcomes vary (stochastic). |
| Ergodicity | The degree to which all relevant states can in principle be reached, and how evenly/consistently they can be sampled through interaction. |
| Controllable Continuity | Whether small changes in agent output produce small, continuous changes in the environment (high continuity) or abrupt/discontinuous ones (low continuity). |
| Asynchronicity | Whether the environment changes only in response to the agent (synchronous) or independently of it, on its own time (asynchronous). |
| Dynamism | The extent to which the environment changes over time without agent input; static vs. dynamic worlds. |
| Observability | How much of the environment state is accessible to the agent (full, partial, noisy). |
| Controllability | The extent to which the agent can influence the environment state; fully controllable vs. only partially or weakly controllable. |
| Multiple Parallel Causal Chains | Whether multiple independent processes can run in parallel, influencing outcomes simultaneously. |
| Number of Agents | Whether there is only a single agent or multiple agents (cooperative, competitive, or mixed). |
| Periodicity | Whether the environment exhibits cycles or repeating structures that can be exploited for prediction. |
| Repeatability | Whether experiments in the environment can be repeated under the same conditions, producing comparable results. |
\\
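The dimensions above can be read as a profile along which task environments are compared. Below is a minimal sketch of such a profile; the field names follow the table, but the encoding (booleans, 0–1 scores) and the example values are assumptions chosen only for illustration.
<code python>
from dataclasses import dataclass

# Illustrative sketch: one field per dimension from the table above.
# The boolean/float encoding and the example values are assumptions, not a standard scale.
@dataclass
class TaskEnvironmentProfile:
    deterministic: bool            # same action in same state -> same result?
    ergodicity: float              # 0..1: how evenly all relevant states can be reached/sampled
    controllable_continuity: bool  # small output changes -> small, continuous environment changes?
    asynchronous: bool             # does the environment change on its own time?
    dynamism: float                # 0..1: rate of change without agent input
    observability: float           # 0..1: how much of the state is accessible to the agent
    controllability: float         # 0..1: how much of the state the agent can influence
    parallel_causal_chains: int    # number of independent processes running in parallel
    num_agents: int                # single agent vs. multiple agents (cooperative/competitive/mixed)
    periodic: bool                 # repeating structure that can be exploited for prediction?
    repeatable: bool               # can experiments be re-run under the same conditions?

# Two environments described on the same dimensions (rough, illustrative values):
board_game = TaskEnvironmentProfile(True, 1.0, False, False, 0.0, 1.0, 0.5, 1, 2, False, True)
city_driving = TaskEnvironmentProfile(False, 0.3, True, True, 0.8, 0.4, 0.3, 10, 50, True, False)
</code>
Describing different environments on the same dimensions is what allows controlled, fair comparison of reasoning systems across them.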
==== Levels of Detail in Task Theory ====
| What it is | Tasks can be described at different levels of detail, from coarse abstract goals to fine-grained physical variables. The chosen level shapes both evaluation (observer) and execution (agent). |
| Observer's Perspective | The observer can choose how finely to specify variables, transformations, and constraints. A higher level of detail allows precise measurement but may make analysis intractable. |
| Agent's Perspective | The agent perceives and reasons at its own level of detail, often coarser than the environment's "true" detail. A mismatch between the observer's definition and the agent's accessible level creates difficulty. |
| Coarse Level | Only abstract goals and broad categories of variables are specified. Example: "Deliver package to location." |
| Intermediate Level | Includes some measurable variables and causal relations. Example: "Move package from x to y using navigation map." |
| Fine Level | Explicit representation of detailed physical dynamics, constraints, and noise. Example: "Motor torque, wheel slip, GPS error bounds, battery usage." |
| Implications for ERS | Enables systematic scaling of task complexity in experiments. \\ Supports fair comparison: two agents can be tested at the same or different levels of detail. \\ Clarifies where errors originate: poor reasoning vs. inadequate detail in the task definition. |
\\
==== Intricacy and Level of Detail ====
| Maximum Intricacy | Any agent that is constrained by resources (time, energy, computational power, etc.) has a maximum intricacy of tasks it can solve. |
| Problem | Even simple tasks like walking to the bus station, if defined at the finest level of detail (every motor command, etc.), have massive intricacy attached. Planning through every step is computationally infeasible. |
| Changing the Task | If a task is too intricate to be performed, it must be adjusted to fit the agent's capabilities. However, we still want to get the task done! |
| Changing the Level of Detail | This is the only way to change the task, and thus its intricacy, without losing the task's goal (see the sketch at the end of this page). |
\\
==== Why Task Theory Matters for Empirical Reasoning ====
| For Science (Observer) | Provides a systematic, measurable, repeatable description of tasks, which is necessary for the empirical study of reasoning systems. Comparable to controlled experiments in physics or biology. |
| For Engineering (Agent & System Design) | Allows construction of benchmarks that measure **generality** (performance across task classes), not just single skills. Supports systematic curricula for training agents. |
| For Empirical Evaluation (ERS Core) | Clarifies whether failure is due to the **task** (high intricacy, under-specified goals) or the **agent** (limited sensors, reasoning). Enables falsifiable claims about system capability. |
| Reflection | In ERS, intelligence boils down to: //Given a formally defined task, how well does an agent reason about it empirically, under uncertainty and constraints?// Task theory provides the shared language to answer this. |
\\
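To close, here is a minimal sketch tying level of detail back to intricacy, as referenced in the //Intricacy and Level of Detail// table above. The variable lists are hypothetical, and the variable count is used only as a crude stand-in for intricacy.
<code python>
# Illustrative sketch: the same delivery task described at a coarse and a fine level of detail.
# The variable count below is only a crude stand-in for intricacy.
coarse_delivery = {
    "goal": "package at destination",
    "variables": ["package_location"],
}

fine_delivery = {
    "goal": "package at destination",   # the goal is unchanged; only the level of detail differs
    "variables": [
        "package_location", "robot_pose", "wheel_slip", "motor_torque",
        "gps_error", "battery_level", "obstacle_positions", "time",
    ],
}

def intricacy_proxy(task: dict) -> int:
    """Crude proxy: more task-relevant variables (and couplings) -> higher intricacy."""
    return len(task["variables"])

print(intricacy_proxy(coarse_delivery))  # 1 -> within reach of a resource-bounded agent
print(intricacy_proxy(fine_delivery))    # 8 -> may exceed the agent's maximum intricacy
</code>
Coarsening the level of detail lowers the intricacy the agent must handle while the goal stays the same, which is exactly the adjustment described in the table above.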