The evaluation of intelligent systems is a complex task, due (among other things) to the fact that they adapt and change over time, and operate in environments that do the same. When a system has *general* intelligence the task gets even harder. What is needed is a better theoretical basis for both learning/adaptation and task construction/decomposition. Task Theory focuses on the latter - hopefully easier - part of the challenge: creating a mathematical framework around task-environments, grounded in physics, that can ultimately be used to predict and explain the differences and similarities between task-environments in physical terms. Among the things we would like to look at are the time taken to do a task, how it can be discretized, what energy consumption it requires, how it can be decomposed, and so on.
The highest-level goals of this effort are:

- Given a learner and a task, to say whether and how well the learner can learn the task, identify which parts it will have trouble with (if any), remove parts to make the task simpler or harder for the learner, etc. - all without doing any experiments.
- Given two or more tasks and two or more learners, to enumerate their similarities and differences in a way that relates to how the learners would perform when learning and performing the tasks, expressed in physical parameters including time, energy, best-case/worst-case/average-case behavior, etc. - again without doing any physical experiments.
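To make the second goal concrete, here is a minimal sketch of what comparing two task-environments on physical parameters might look like. Everything here is a hypothetical illustration - the `TaskProfile` fields and the `compare` function are our own assumptions for this example, not part of any established Task Theory formalism:

```python
from dataclasses import dataclass, fields

@dataclass
class TaskProfile:
    """A task-environment summarized by a few physical parameters (illustrative)."""
    name: str
    duration_s: float     # expected time taken to do the task
    energy_j: float       # expected energy consumption
    best_case_s: float    # best-case completion time
    worst_case_s: float   # worst-case completion time

def compare(a: TaskProfile, b: TaskProfile) -> dict:
    """Enumerate per-parameter differences (b minus a) between two tasks,
    skipping non-numeric fields such as the name."""
    return {
        f.name: getattr(b, f.name) - getattr(a, f.name)
        for f in fields(TaskProfile)
        if isinstance(getattr(a, f.name), float)
    }

# Example: two hypothetical tasks compared without running either one.
pick = TaskProfile("pick-and-place", duration_s=4.0, energy_j=12.0,
                   best_case_s=2.5, worst_case_s=9.0)
sort = TaskProfile("sorting", duration_s=6.0, energy_j=15.0,
                   best_case_s=3.0, worst_case_s=20.0)
print(compare(pick, sort))
```

Of course, a real theory would derive these parameters from first principles rather than take them as given; the point of the sketch is only the shape of the output - a task-to-task comparison stated entirely in physical quantities.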
This will ultimately require two theories, one about tasks and one about learners, and they would have to be compatible. But we have to start somewhere, and making a theory of learners and intelligence seems harder than making a theory about tasks, for at least two reasons:
While we won’t get there any time soon, we have already started, and we need to keep moving: I think real progress towards a theory of intelligence requires proper methods for testing, comparing, measuring, and evaluating it.
Papers that might give us some more ideas for how to think about this: