Course notes.
What is Intelligence?
It seems like such a very easy question to answer, yet people frequently argue about what intelligence is. And not just laypeople, experts in fields claiming to study some of the very phenomena related to intelligence don't seem to agree on what it is either. Yet there is one example of intelligence that everyone seems to agree on – most people agree that most if not all average human beings harbor intelligence. Even those who do not agree with that will tend to agree that intelligence is a capability inherent in human minds and which it sometimes exhibits, if not every day then at least every once in a while, or at the very least they agree that some individuals on planet Earth have exhibited intelligence at some point or other in past times.
Intelligence exists because there is complexity in the world. The purpose of intelligence is to deal with the complexity of the world. Complexity, as some readers probably have already discovered, is different from both simplicity – which is, as the word implies, simple – and complete randomness, because randomness has no structure and therefore cannot be measured on a scale from simple to complex. Randomness is not totally irrelevant to complexity, as we shall see, but it is more of a footnote than a main player. Complexity can be attributed to any system implementing long causal chains. The causation can be of various kinds, but let's for a moment stick to completely deterministic mechanical couplings, of which a mechanical wind-up clock is an excellent example. The combination of intertwining gears, springs and escapement (a way to generate mechanical rhythm) causes the hands on the clock to map in a predictable and reasonably consistent way (for all practical purposes) to an external reality (called the real world) so that those who can read a clock can coordinate temporally dependent actions ahead of time. Complex as it may be, a mechanical clock is not nearly as complex as many of the subjects that science has made a point of studying through the ages, some examples being ecosystems, societies, natural language, biochemistry, genetics, and so on. What do these topics of study have, in addition to the complex mechanical causal chains of the wind-up clock? They have a mixture of causal chain types.
There are at least two axes along which causal chains can be classified, coupling density and coupling strength. Density refers to the number of causal connections between any identifiable intermediate stages in the causal chain; strength refers to the strength of the effect of each identifiable cause on subsequent events. This figure helps explain this:
Nodes stand for identified causal factors, edges stand for the causal relationship between these. a. Sparsely coupled system; b. Densely coupled system; c. Tightly coupled system; d. Loosely coupled system; e. System with multi-variable couplings; f. Densely coupled system with mixed loose and tight, and multi-variable causal chains.
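The two axes can be made concrete with a small sketch. It is only an illustration under stated assumptions: the `Coupling` type, the 0.0 to 1.0 strength scale, and the two measures (fraction of possible directed couplings present, and mean strength) are hypothetical choices made here, not definitions taken from these notes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coupling:
    """One causal edge: `cause` influences `effect`."""
    cause: str
    effect: str
    strength: float  # assumed scale: 0.0 (no effect) to 1.0 (fully determines the effect)


def coupling_density(nodes, couplings):
    """Fraction of possible directed cause->effect pairs that are actually coupled."""
    n = len(nodes)
    possible = n * (n - 1)
    return len(couplings) / possible if possible else 0.0


def mean_strength(couplings):
    """Average strength over all couplings (0.0 for a system with none)."""
    if not couplings:
        return 0.0
    return sum(c.strength for c in couplings) / len(couplings)


nodes = ["A", "B", "C", "D"]

# A sparse but tightly coupled chain, loosely in the spirit of the wind-up clock.
sparse_tight = [
    Coupling("A", "B", 0.9),
    Coupling("B", "C", 0.95),
    Coupling("C", "D", 0.9),
]

# A dense but loosely coupled system: every factor weakly affects every other.
dense_loose = [Coupling(a, b, 0.2) for a in nodes for b in nodes if a != b]

for name, system in [("sparse/tight", sparse_tight), ("dense/loose", dense_loose)]:
    print(name, coupling_density(nodes, system), round(mean_strength(system), 3))
```

Multi-variable couplings (panel e) would need edges with several causes per effect, which this sketch deliberately leaves out.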
In most natural systems worthy of study, causal chains include subsystems whose causal connections vary to different degrees along these two dimensions. In the real world of course there are oftentimes many nodes that are not directly observable in any way (as for example atoms were not observable before the scanning tunneling microscope) yet can be inferred, and their existence can be verified by experimentation (whether formal or informal). In cases where causal chains have factors that are not directly observable and whose couplings with the rest of the system are loose, the best way to characterize the system is to note correlations between surface phenomena and/or system behavior. For example, if there are two types of plant very similar to each other, one being edible and the other deadly poisonous, and where the main cause of this difference can only be traced to the plants' DNA (an unobservable cause for anyone without DNA inspection capabilities), an intelligent being's only recourse would be to identify key correlates between edibility and surface features such as subtle differences in color, leaf structure, or perhaps the place where the plant grows. The picture gets significantly more complex when the edibility of a particular type of plant depends in fact on the type of soil, place of growth, freshness, etc. This is of course how most things in the real world are: there is a lot of gray area – difficult-to-identify-and-classify features – that can mean the difference between life and death of a being. This is where intelligence comes in. This is in fact why intelligence exists at all.
In the real world the kinds of systems that are important to various intelligent species will of course vary depending on the species, and the complexity of the basic environments that the beings typically inhabit. There is, for example, no animal other than humans that could consider living in outer space (some would say this is beyond our capabilities at present). Communication can help overcome some limitations of non-communicative individuals, so that time-consuming exploration of how the world works can be communicated instead of repeated – as summarizing years of research in natural language does not have to recount all the dead ends that nevertheless had to be explored, and selecting symbols whose transmission may take 300 milliseconds can stand in for actions that took days to perform (e.g. “He rode his motorbike across most of Europe”). Telling your fellow citizens not to eat a certain plant because it is poisonous can save hundreds of lives, so once on the scene these are surviving traits: to be able to communicate and to be motivated to share knowledge.
And so it is with a huge number of functions that we normally count as part of the repertoire of any system we call “intelligent” – being able to perceive at sufficient spatio-temporal resolution, the ability to direct this perception for various purposes (steerable, learnable attention), being able to mentally classify and catalogue stimuli along a vast number of multi-modal dimensions (and even to learn new such dimensions as necessary), being able to retrieve all (or most) relevant such information at the time when it is needed, and being able to communicate physical and cognitive events to fellow beings as needed, whether from the past, present, or future (and of course being cognizant of whether they actually are from the past, present, or future), etc. – these are hallmark traits of intelligent systems.
So, because the behavior of systems that are built out of a variety of causal chains is predictable yet non-obvious, intelligence exists. Intelligence is a practical solution to a practical problem. Since all events in the real world take time, but of course not infinite time, the speed of thought matters. For many tasks that require intelligence to be solved, individuals – the bearers of the “natural intelligence engines” (brains) – may sometimes take days or weeks to ponder what to do. In other cases immediate action is required. In some cases the fastest and most immediate action possible (approximately 70 milliseconds – the fastest possible choice reaction time recorded in humans) is not fast enough to steer the individual away from deadly danger (being hit by lightning is one example). But by far the most numerous cases of human-world encounters in which a fast reaction is required are much slower than this lower limit. Because the world consists of a mixture of systems implementing various types of causal chains, intelligence too employs a mixture of techniques to deal with them. However, at the core of any intelligent system are perception capabilities (selectively sampling states of the inhabited world), one or more types of memory, decision-making capabilities (deciding to do something, or nothing, which is also a decision, whether deliberate or not), and the ability to affect the world in some way. All of this put together into a perception-action loop, whereby the same entity samples the world, makes some decision about doing some cognitive and/or real-world action, and subsequently executes those decisions, is what all natural intelligences consist of. There exist of course possibilities to do things differently when designing an artificial intelligence, but by and large it is difficult to fit any system that does not have these features under the definition of being “intelligent”.
That is, there is something about the model for intelligence, provided to us by nature, that makes it a unified whole: Take any one part out and the whole thing becomes less like the thing we are trying to imitate/understand and starts to look more like something else (and we are invariably hard pressed to call that something else “intelligent” or “intelligence”).
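The four core ingredients named above (perception, memory, decision making, and action) can be sketched as a single loop. This is a deliberately minimal toy, not a model of any natural intelligence: the one-dimensional world, the `run_loop` function and its parameters are all hypothetical, and the loop is strictly sequential for readability.

```python
def run_loop(position, target, steps):
    """Toy perception-action loop in a one-dimensional world.

    The agent repeatedly samples the world, stores what it saw,
    decides on an action, and executes it.
    """
    memory = []  # record of past percepts
    for _ in range(steps):
        # Perception: selectively sample one state of the world (signed distance to target).
        percept = target - position
        # Memory: keep what was perceived.
        memory.append(percept)
        # Decision: move one step toward the target, or do nothing
        # (doing nothing is also a decision).
        if percept > 0:
            action = 1
        elif percept < 0:
            action = -1
        else:
            action = 0
        # Action: affect the world.
        position += action
    return position, memory


final_position, percepts_seen = run_loop(position=0, target=3, steps=5)
print(final_position, percepts_seen)  # 3 [3, 2, 1, 0, 0]
```

Removing any one stage makes the point of the paragraph above visible: without perception the agent cannot tell where the target is, without a decision step it has nothing to execute, and without action the world never changes.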
The ability to adapt and learn is an important feature of naturally intelligent systems – it seems so critical to our understanding of the phenomenon of intelligence that many researchers (and laymen alike) will not call a system “intelligent” unless it can learn and adapt. There are other ways to approach a definition of intelligence that do not require adaptation and learning, emphasizing instead the ability to act in environments where the number of possible states is not enumerable. In the context of necessary and sufficient features a useful concept is that of “marginal necessity”. It is defined such that for a given entity <m>E</m> to belong to category <m>C</m> it must possess features <m>f1</m>, <m>f2</m> … <m>fn</m>, some notable number of which are marginally necessary for the entity to qualify as being of category <m>C</m>. Such features are called “marginally necessary” for <m>E</m> ∈ <m>C</m> to hold – i.e. it is in some sense disputable whether they are necessary or optional. Any system of reasonable complexity for which we nevertheless regularly use a single concept label, including “society”, “ecosystem”, “social system”, “living system”, and of course “intelligent system”, has a large number of such marginally necessary features. Instead of spending our time arguing whether something is or isn't intelligent, we should spend our time trying to characterize better the systems within immediate and semi-distal range of our target category or categories. Some of that work will of course include talking about definitions, but we should remember that it is not the definitions that we are after; what we are after is understanding particular natural (and artificial) phenomena better.
Since intelligence exists because the world is complex yet predictable, and to do anything in the world one needs to know some of the states of the world to do anything with a particular goal in mind, inevitably some sort of feedback loop is a necessary part of any and all intelligence. We call this the perception-action feedback loop. Put this way it may sound to some like a pipeline process, sample-decide-act, but in fact that is not true of any intelligence in nature. Brains, being the only way which we are aware of (as of yet) for producing intelligence, are inherently parallel processing systems. At any point in time they carry on many “threads” of processing, so to speak, at the same time. Otherwise we would not be able to avoid that tree falling towards us while we recite our favorite poem. In some sense the parallelism of the brain is trivially obvious, and thus one might be led to believe that it should not be too hard to replicate artificially. But when trying to build a machine that does that with its processing tasks it is surprisingly difficult to implement the kind of “parallel multitasking” that even simple brains such as those of squirrels must be able to implement.
One of the core assumptions of artificial intelligence is that intelligence and thought are essentially an information process. While for practical purposes it may matter how the processing is implemented, in principle one can do computations in a number of ways: a change in the substrate on which these are carried out does not change the outcome. Addition is still addition, whether one does it on a souped-up modern computer or on an abacus. Some have taken this to mean that it is not necessary to discuss matters of implementation in A.I. – any limitations on the speed of the processes would be solved over time by faster processors. This seemingly innocent and minor assumption has permeated the whole field for more than three decades. It is a totally incorrect assumption. It is incorrect because intelligence is a response to an environment; the relationship between a mind and its environment is key to defining numerous aspects of an intelligence, including how smart it is. The speed of change of the environment places requirements on the intelligence; solutions for computing actions for environments moving at one pace may prove significantly different from those good for an environment moving at another pace. If the environment changes rapidly, a mind capable of computing useful responses at rates that match the environment better than those produced by another mind should by all accounts be considered more intelligent than the other. This is an important fact about the phenomenon of intelligence: It can only be evaluated in relation to the world in which it operates. In the real world, of course, we are bound to a certain reality, to certain physics, and it is this reality that is the ultimate target of any and all A.I. systems that we might implement. Therefore, to discuss A.I. systems without reference to the limitations of the real world is to make two mistakes.
First, the relevance of those systems to any real-world application is seriously cast in doubt. Second, because intelligence is a response to our physical reality, the systems we may come up with during those discussions are bound to ignore critical features that are necessary (but not necessarily sufficient on their own) for implementing intelligence. One example of this classic mistake is ignoring time. Time is one of the very reasons for which intelligence exists – that is, the passage of time, and our lack of infinite time and resources to simply lie in bed and think. By de-emphasizing time to the point of ignoring it completely, a critical feature of the environment for which intelligence exists is banished from the discussion. The result can only be to reduce the concept of intelligence to a level where it does not resemble that which was its inspiration – natural intelligence. As a result, people end up arguing endlessly about what “is and isn't intelligent” – i.e. the nature and definition of intelligence.
What must be done is to create a list of reasonable marginally necessary features so that we can replicate what we perceive to be valuable in natural intelligences, in order to fulfill the full vision of artificial intelligence – that is to replicate human intelligence in all major ways, in a way that allows us to mold it into a form that is of practical use for various ends and means – and allows us to bring it to the next level – superhuman levels – at which point we hope to start using it to help solve the many complex problems that humanity faces.
One way to start listing the things that an intelligence must be capable of to deserve the label is to look at one extreme end of the intelligence spectrum, starting with human intelligence, and stripping it down – inspecting key functions of this system one by one to see if it is a nice-to-have or a must-have.
For this exercise I will start with a somewhat unconventional feature: The human mind's ability to break down problems into smaller, more addressable problems. Given a problem <m>P</m>, a human generates a goal <m>G</m> of solving the problem. When more than one potential roadblock towards achieving the goal is identified, the goal is broken down into sub-goals that could remove those roadblocks. This is called sub-goaling, and is illustrated in this picture:
A root goal (top) represents an end-state that is supposed to solve a particular novel problem. Part of meeting the root goal is generating simpler goals (sub-goals) for which a solution can more easily be found. In this figure sub-goal 4 cannot be sub-goaled because some knowledge is missing for how to generate it. In some cases new sub-goals may be blocked by other goals, i.e. depend on other goals being achieved (d).
As the sub-goals are created action can be taken to meet them at the same time as the space is getting generated (continuous planning and execution), or the full sub-goal space may be generated before any action is taken. When doing sub-goaling the intelligence brings in knowledge from prior experience, and its ability to do this effectively for an unknown problem depends in part on how quickly it can retrieve relevant knowledge to bring to bear: creating a sub-goal which there is little or no hope to meet will be highly detrimental to solving the root goal. As goals are solved one by one the intelligence gets closer and closer to having solved the initial problem <m>P</m>.
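The sub-goaling process described above can be sketched as a recursive decomposition. This is one simple way to illustrate the idea, not a claim about how any mind implements it: the goal names, the `DECOMPOSITIONS` and `PRIMITIVES` tables, and the `plan` function are hypothetical, the whole sub-goal space is generated before any action is taken, and the knowledge is assumed to contain no circular decompositions.

```python
# Hypothetical prior knowledge: how some goals break down into sub-goals.
DECOMPOSITIONS = {
    "serve tea": ["make tea", "find cup"],
    "make tea": ["boil water", "steep leaves"],
    "boil water": ["fill kettle", "heat kettle"],
}

# Hypothetical goals that can be met directly by a single action.
PRIMITIVES = {"fill kettle", "heat kettle", "steep leaves"}


def plan(goal, decompositions, primitives):
    """Return (ordered primitive actions, goals that could not be sub-goaled).

    A goal is blocked when it is neither directly achievable nor
    covered by any known decomposition (missing knowledge).
    """
    if goal in primitives:
        return [goal], []
    if goal not in decompositions:
        return [], [goal]
    actions, blocked = [], []
    for sub_goal in decompositions[goal]:
        sub_actions, sub_blocked = plan(sub_goal, decompositions, primitives)
        actions.extend(sub_actions)
        blocked.extend(sub_blocked)
    return actions, blocked


actions, blocked = plan("serve tea", DECOMPOSITIONS, PRIMITIVES)
print(actions)  # ['fill kettle', 'heat kettle', 'steep leaves']
print(blocked)  # ['find cup']
```

Here "find cup" plays the role of a sub-goal that cannot be generated further because the knowledge for it is missing. Interleaving planning and execution, or handling goals that depend on other goals, would require extending the sketch.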
While it is unclear whether intelligences other than humans (e.g. dogs, birds, sea lions, etc.) do such sub-goaling in some way explicitly, the behavior of any animal solving a puzzle of one kind or other (e.g. how to get around an obstacle) can be mapped by an observer onto a goal-subgoal analysis – whenever any intelligence solves any problem, it can in an abstract sense be said to be doing goal-generation. Any intelligence worthy of being called general must for sure be capable of doing goal-subgoal generation, because for certain classes of problems it is the only sensible and available way of attacking them. It can also be argued that in fact sub-goaling activity – whether explicit, an approximation, or an emulation – is what any mind does when faced with any challenge, no matter how trivial: A cat chasing a mouse, a bee centering its flight on a flower. This is because solving a problem (e.g. catching a mouse) requires a logical combination of actions without which the goal would never be met.
In any case, from the above analysis we see that one of the functions that any intelligence must have is a memory. A memory system that is implemented in the real world will of course have features of its physical operation that include the speed of writing to it and the speed of retrieving from it. It will also have a certain character that may make it more or less suited for particular tasks and environments. Humans have a highly associative memory that seems to be reasonably general for the many kinds of things that humans do and are capable of doing. Another necessary function is some way of perceiving a complex environment: An intelligence's sensors must be suited to retrieve information from the environment in a sufficiently fast manner so as to allow processing of the information and production of decisions quickly enough to ensure the survival of the being. While sensors are always limited in some ways, e.g. by field of view or color spectrum sensitivity, sufficient planning and anticipatory action may enable the being to live with those constraints. The capabilities of a perceptual system must also encompass an ability to separate important from unimportant information in the environment. An action repertoire will also be required, for if the intelligence doesn't do anything in the world there is really no point in its existence. As we have seen above, the ability to dissect problems into smaller parts, each of which moves the system towards achieving a set goal, must be possessed (in some form) by this system. In fact, the ability to generate goal-subgoal structures, and accompanying actions to achieve the sub-goals as appropriate, follows from the assumption we can make for any intelligent system: That it must operate in environments vastly more complex than it is able to process at any point in time (limited computational resources), with limited knowledge/memory.
A system that is limited in what it can hold in memory and compute at any point in time must, by definition, pick and choose what it works on at any point in time. For tasks that ideally should be given more “working memory” than is available, planning and scheduling can be called on to offset some of these limitations. Given limited computational speed and memory capacity, and the fact that the environment presents vastly more potential information than the system needs – or for that matter can process – at any point in time, some sort of resource management and coordination system is needed. In human psychology these are called attention and task planning. The latter depends a lot on the task, and the class of tasks it belongs to, so it is task-dependent; the former tends to be fairly task-independent, having more to do with internally generated goals referencing the management of the memory itself, the external sensors, from moment to moment, and the short-term coordination of external actuators.
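The resource-management idea can be illustrated with a very small attention sketch: given more percepts than the system can hold, keep only the most relevant few. Everything here is an assumption made for illustration: the `attend` function, the fixed `capacity`, and the hand-assigned relevance scores are hypothetical, and a real attention mechanism would have to compute relevance from the system's current goals.

```python
import heapq


def attend(percepts, capacity):
    """Keep only the `capacity` most relevant percepts.

    `percepts` is a list of (label, relevance) pairs; higher relevance wins.
    """
    return heapq.nlargest(capacity, percepts, key=lambda p: p[1])


# More is going on in the environment than fits in "working memory".
percepts = [
    ("poem line", 0.4),
    ("falling tree", 0.99),
    ("bird song", 0.1),
    ("itchy nose", 0.2),
]

focus = attend(percepts, capacity=2)
print(focus)  # [('falling tree', 0.99), ('poem line', 0.4)]
```

With a capacity of two, the system keeps reciting the poem while still tracking the falling tree; everything else is filtered out for the moment.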
One aspect of intelligence that we must admit to being central is learning. A system that cannot learn cannot adapt; a system that cannot adapt is essentially incapable of operating in complex environments, because by definition these will have so many possible “legal” states as to be innumerable, and thus, as long as they are logical worlds (i.e. with a significant amount of reliable causal connections), will have a large set of operational rules that the intelligence will not be provided with from the start. The intelligence must therefore have the capability of extracting “rules” or regularities from the environment. The more it can generalize these, the smarter it is. The faster and better it can figure out which generalizations to use to act in the world, understand, and solve problems in it, the more intelligent it is. Acquisition and “insightful” or creative application of knowledge is therefore a requirement for any higher intelligence.
For any system that we may want to classify as intelligent or not, removing the above features one by one will make us more hard-pressed to agree on applying the label “intelligent” to describe the system.
2012©K.R.Thórisson