====== Task Theory ======

|  \\ What it is  | A systematic framework for describing, comparing, and analyzing **tasks**, independent of any specific agent. Provides the foundations for evaluating intelligent systems empirically (measurable outcomes, repeatable experiments, controlled variables). |
|  Purpose  | In AI today, evaluation is benchmark-driven (ImageNet, Atari, Go) but lacks a unifying science. Task theory proposes such a foundation, comparable to what physics provides for engineering. It lets us treat **tasks as objects of study**, enabling systematic experimentation in Empirical Reasoning Systems (ERS). |
|  Task vs. Environment  | A **Task** is a desired transformation of the world (e.g., get the ball into the goal). The **Environment** is the context in which the relevant variables evolve. Together, the pair ⟨Task, Environment⟩ is the Task-Environment (a minimal data-structure sketch follows this table). In ERS, this separation allows us to ask *What is the structure of the problem?* before asking *how the agent solves it*. |
|  Agent Separation  | Describing tasks independently of the agent prevents conflating “what is to be achieved” with “who/what achieves it.” This is central for ERS: it allows us to evaluate reasoning systems across different domains and agents. |
|  Why Important  | Enables: (1) **Comparison** of tasks across domains; (2) **Abstraction** into task classes; (3) **Estimation** of resource needs (time, energy, precision); (4) **General evaluation** of reasoning systems, beyond one-off benchmarks. |
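The ⟨Task, Environment⟩ separation can be made concrete as a data structure. The following is a minimal sketch only, assuming a discrete-time world whose variables evolve under rules F; the names (''TaskEnvironment'', ''run'', ''policy'') are illustrative assumptions, not part of any existing library or of the course materials.

<code python>
from dataclasses import dataclass
from typing import Callable, Dict

State = Dict[str, float]    # the world's variables at one time step
Action = Dict[str, float]   # the variables an agent can set directly

@dataclass
class TaskEnvironment:
    """Agent-independent description of a task in its environment."""
    initial: State                               # starting values of the variables
    dynamics: Callable[[State, Action], State]   # rules F: how variables evolve
    goal: Callable[[State], bool]                # the desired transformation, as a predicate
    horizon: int                                 # time budget (a resource constraint)

def run(te: TaskEnvironment, policy: Callable[[State], Action]) -> bool:
    """Evaluate any agent (here reduced to a policy) on the same task description."""
    state = dict(te.initial)
    for _ in range(te.horizon):
        if te.goal(state):
            return True
        state = te.dynamics(state, policy(state))
    return te.goal(state)
</code>

Because ''goal'' and ''dynamics'' mention only world variables and never the agent, the same ''TaskEnvironment'' can be handed to different policies, which is exactly the agent separation described in the table above.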

==== Why Task Theory Matters for Empirical Reasoning ====

|  For Engineering (Agent & System Design)  | Allows construction of benchmarks that measure **generality** (performance across task classes), not just single skills. Supports systematic curricula for training agents. |
|  For Empirical Evaluation (ERS Core)  | Clarifies whether failure is due to the **task** (high intricacy, under-specified goals) or the **agent** (limited sensors, reasoning). Enables falsifiable claims about system capability (a sketch of such an evaluation follows this table). |
|  Reflection  | In ERS, intelligence boils down to: *Given a formally defined task, how well does an agent reason about it empirically, under uncertainty and constraints?* Task theory provides the shared language to answer this. |
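As a rough illustration of how such claims could be operationalized, the sketch below (reusing the hypothetical ''TaskEnvironment'' and ''run'' from the sketch above) scores several policies over a whole task class rather than a single benchmark instance. It is an assumption-laden sketch, not an established evaluation protocol.

<code python>
from typing import Callable, Dict, Iterable

def generality_profile(policies: Dict[str, Callable[[State], Action]],
                       task_class: Iterable[TaskEnvironment]) -> Dict[str, float]:
    """Success rate of each policy across a task class, not on one benchmark."""
    tasks = list(task_class)
    return {name: sum(run(te, policy) for te in tasks) / len(tasks)
            for name, policy in policies.items()}

# Failure attribution, informally: if all policies fail on the same instances,
# suspect the task (high intricacy, under-specified goal); if only one policy
# fails, suspect that agent (sensors, memory, reasoning).
</code>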
\\
==== Discussion Prompts ====

^  Question  ^  Observer Angle  ^  Agent Angle  ^
|  How is a "task" different from a "problem" in classical AI?  | Problem = symbolic puzzle; Task = measurable transformation in a world | Must act in the world to achieve it |
|  Why must tasks be agent-independent?  | To compare systems systematically | Otherwise evaluation collapses into “how this agent did” |
|  Can you think of a task with low intricacy but high difficulty for humans?  | Observer: low variable count | Agent: limited memory/attention makes it hard (e.g., memorizing 200 digits) |
|  What role does causality play in defining tasks?  | Observer: the rules F define the dynamics | Agent: must infer/approximate causal relations from data |
|  How does a variable-task simulator (like SAGE) help ERS?  | Observer: controls task parameters systematically (a toy sketch follows this table) | Agent: experiences a wide range of tasks, supporting empirical generality tests |
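The generator below is **not** SAGE's API, only a hedged toy sketch of what a variable-task simulator provides: task instances (again reusing the hypothetical ''TaskEnvironment'' above) whose intricacy-related parameters, such as the number of variables, noise level, and time budget, the experimenter controls explicitly.

<code python>
import random
from typing import Iterator

def make_task_family(n_variables: int, noise: float, horizon: int,
                     count: int, seed: int = 0) -> Iterator[TaskEnvironment]:
    """Yield task instances that differ only in systematically controlled parameters."""
    rng = random.Random(seed)
    for _ in range(count):
        targets = {f"x{i}": rng.uniform(-1.0, 1.0) for i in range(n_variables)}

        def dynamics(state, action, rng=rng):
            # Each variable moves by the agent's action plus bounded noise.
            return {k: v + action.get(k, 0.0) + rng.uniform(-noise, noise)
                    for k, v in state.items()}

        def goal(state, targets=targets):
            # Task: drive every variable to within 0.1 of its target value.
            return all(abs(state[k] - targets[k]) < 0.1 for k in targets)

        yield TaskEnvironment(initial={k: 0.0 for k in targets},
                              dynamics=dynamics, goal=goal, horizon=horizon)
</code>

Holding the agent fixed while sweeping ''n_variables'' or ''noise'' then gives the kind of controlled, repeatable experiment described above.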
