==== Why Task Theory Matters for Empirical Reasoning ====

| \\ What it is | A systematic framework for describing, comparing, and analyzing **tasks**, independent of any specific agent. Provides the foundations for evaluating intelligent systems empirically (measurable outcomes, repeatable experiments, controlled variables). |
| Purpose | In AI today, evaluation is benchmark-driven (ImageNet, Atari, Go) but lacks a unifying science. Task theory proposes such a foundation, comparable to what physics provides for engineering. It lets us treat **tasks as objects of study**, enabling systematic experimentation in Empirical Reasoning Systems (ERS). |
| Task vs. Environment | A **Task** is a desired transformation of the world (e.g., get the ball into the goal). The **Environment** is the context in which the variables evolve. Together, the pair ⟨Task, Environment⟩ is the Task-Environment. In ERS, this separation allows us to ask *What is the structure of the problem?* before asking *how the agent solves it* (see the sketch below the table). |
| Agent Separation | Describing tasks independently of the agent prevents conflating “what is to be achieved” with “who/what achieves it.” This is central for ERS: it allows us to evaluate reasoning systems across different domains and agents. |
| Why Important | Enables: (1) **Comparison** of tasks across domains; (2) **Abstraction** into task classes; (3) **Estimation** of resource needs (time, energy, precision); (4) **General evaluation** of reasoning systems, beyond one-off benchmarks. |
| For Engineering (Agent & System Design) | Allows construction of benchmarks that measure **generality** (performance across task classes), not just single skills. Supports systematic curricula for training agents. |
| For Empirical Evaluation (ERS Core) | Clarifies whether failure is due to the **task** (high intricacy, under-specified goals) or the **agent** (limited sensors, reasoning). Enables falsifiable claims about system capability. |
| Reflection | In ERS, intelligence boils down to: *Given a formally defined task, how well does an agent reason about it empirically, under uncertainty and constraints?* Task theory provides the shared language to answer this. |
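The following is a minimal Python sketch of the agent-independent ⟨Task, Environment⟩ pair described above. All names (''Task'', ''Environment'', ''kick_dynamics'') are illustrative assumptions, not part of any published task-theory implementation; the only point is that the task description contains no reference to the agent that will perform it.

<code python>
# Minimal sketch (assumed names): a task-environment description
# that makes no reference to any particular agent.
from dataclasses import dataclass, field
from typing import Callable, Dict

State = Dict[str, float]    # environment variables and their current values
Action = Dict[str, float]   # variables an agent is allowed to manipulate


@dataclass
class Environment:
    """Context in which variables evolve: initial state plus dynamics F."""
    initial_state: State
    dynamics: Callable[[State, Action], State]   # rules F: how variables change per step


@dataclass
class Task:
    """Desired transformation of the world, stated as a goal over states."""
    goal: Callable[[State], bool]                                 # e.g. ball inside the goal area
    constraints: Dict[str, float] = field(default_factory=dict)  # e.g. time or energy budgets


def kick_dynamics(state: State, action: Action) -> State:
    """Toy dynamics for the ball-into-goal example."""
    new_state = dict(state)
    new_state["ball_x"] += action.get("kick_force", 0.0)
    return new_state


# The pair <Task, Environment> is the object of study; no agent appears in it.
task_environment = (
    Task(goal=lambda s: s["ball_x"] >= 10.0, constraints={"max_steps": 20}),
    Environment(initial_state={"ball_x": 0.0}, dynamics=kick_dynamics),
)
</code>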
| |
| \\ | \\ |
| ==== Discussion Prompts ==== | |
| |
| Question | Observer Angle | Agent Angle |
| How is a "task" different from a "problem" in classical AI? | Problem = symbolic puzzle; Task = measurable transformation in a world | Must act in the world to achieve it |
| Why must tasks be agent-independent? | To compare systems systematically | Otherwise evaluation collapses into “how this agent did” |
| Can you think of a task with low intricacy but high difficulty for humans? | Observer: low variable count | Agent: limited memory/attention makes it hard (e.g., memorizing 200 digits) |
| What role does causality play in defining tasks? | Observer: rules F define dynamics | Agent: must infer/approximate causal relations from data |
| How does a variable-task simulator (like SAGE) help ERS? | Observer: controls task parameters systematically | Agent: experiences a wide range of tasks, supporting empirical generality tests (see the sketch below) |
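As a rough illustration of the last prompt, the sketch below shows how a variable-task generator can support empirical generality tests: task instances are produced by systematically varying parameters, and any agent (here reduced to a simple policy function) is scored across the whole class rather than on a single instance. This is an assumed, simplified interface for illustration only; it does not reflect SAGE's actual API.

<code python>
# Minimal sketch (assumed names, not SAGE's interface): score an agent
# across a systematically varied class of task instances.
import random
from typing import Callable, Dict, List

State = Dict[str, float]
Policy = Callable[[State], float]   # any agent: maps observed state to a kick force


def make_task(goal_distance: float, noise: float) -> Dict[str, float]:
    """One task instance, defined only by its parameters (agent-independent)."""
    return {"goal_distance": goal_distance, "noise": noise}


def run_episode(task: Dict[str, float], policy: Policy, max_steps: int = 20) -> bool:
    """Let the policy act in the task's environment; report whether the goal was reached."""
    state = {"ball_x": 0.0}
    for _ in range(max_steps):
        force = policy(state)
        # Dynamics F with task-dependent noise on the outcome of each action.
        state["ball_x"] += force + random.gauss(0.0, task["noise"])
        if state["ball_x"] >= task["goal_distance"]:
            return True
    return False


def generality_score(policy: Policy, tasks: List[Dict[str, float]]) -> float:
    """Fraction of task instances in the class that the policy solves."""
    return sum(run_episode(t, policy) for t in tasks) / len(tasks)


# A task class: systematically varied goal distances and noise levels.
task_class = [make_task(d, n) for d in (5.0, 10.0, 20.0) for n in (0.0, 0.5, 1.0)]
print(generality_score(lambda s: 2.0, task_class))
</code>

Because the task parameters, the dynamics, and the scoring are all defined without reference to a particular agent, the same task class can be reused to compare very different reasoning systems.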