Course notes.

What is NARS?

NARS is a general-purpose reasoning system and an AGI-aspiring cognitive architecture. Here we will discuss the key principles underlying this architecture, and the theory it is built on.

The most important principle of NARS is P. Wang's significant contribution to AI, namely the definition of intelligence as “the ability to reason under constraints of insufficient resources and knowledge”.

What is meant by insufficient resources? Resources are the available means, methods, materials, and abilities for carrying out tasks, including mental operations. The term insufficient, in relation to a cognitive agent, means that the task or tasks that the agent must perform would, in an ideal situation, demand resources over and above what is actually available to the agent for those tasks. Insufficient resources thus refers to a relationship between an environment and a cognitive agent: The agent is given goals in its environment – e.g. the goal of surviving – for which it does not have an immediate set of actions or ways of achieving them, and it must go through a series of steps to derive such ways. Instead of being given a set of pre-determined context-action mappings, the agent has cognitive processes that, given some time, can produce actions that are sufficient (though not necessarily, and almost never, optimal) to achieve the goals. The goals can of course also be self-imposed, such as getting a promotion.

A human being trying to get a promotion is of course aspiring to a time-constrained goal, as no human will hold the same job forever. This is the other major way in which cognition deals with limited resources: The time for any agent to deal with some situation when trying to reach a goal is limited. In fact, if a cognitive system were not constrained by time, the other constraints would not matter, because with infinite time every possible solution, situation, consequence, etc., could simply be evaluated, and the best one chosen. So, in a very real sense intelligence would be unnecessary if we did not have temporal constraints.
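The effect of a time limit can be illustrated with a minimal sketch. It is not taken from NARS; the function name best_within_budget, the toy task, and the use of an evaluation count as a stand-in for time are all assumptions made for illustration. The agent keeps the best option found so far and must act on it when the budget runs out, so its answer is sufficient rather than optimal.

```python
from typing import Callable, Iterable, Optional, Tuple, TypeVar

T = TypeVar("T")


def best_within_budget(
    candidates: Iterable[T],
    score: Callable[[T], float],
    budget: int,
) -> Tuple[Optional[T], int]:
    """Return the best candidate seen before the budget runs out.

    `budget` is the number of candidates the agent can afford to
    evaluate; it stands in for limited time (an assumption of this
    sketch). Returns (best candidate so far, evaluations used).
    """
    best: Optional[T] = None
    best_score = float("-inf")
    used = 0
    for candidate in candidates:
        if used >= budget:
            break  # out of time: act on what we have
        used += 1
        s = score(candidate)
        if s > best_score:
            best, best_score = candidate, s
    return best, used


def quality(plan: int) -> float:
    """Hypothetical task: plan 73 is ideal, others are worse by distance."""
    return -abs(plan - 73)


limited, _ = best_within_budget(range(100), quality, budget=10)
unlimited, _ = best_within_budget(range(100), quality, budget=100)
print(limited, unlimited)  # prints: 9 73
```

With enough budget to evaluate everything, plain enumeration finds the ideal plan and no cleverness is needed. With a budget of 10 the agent has to settle for the best of what it had time to consider, and how it chooses what to consider first starts to matter.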

NARS is also based on the key principle that intelligence is at its core reasoning. While the field of mathematics has produced various precise definitions of what reasoning means, in the context of NARS the concept is broader, more like what we are used to in everyday dialogue, as in, for example, “while understandable, it is not rational to want to have the cake and eat it too”. The reason for this more everyday usage of the term in the context of NARS is that the real world allows neither the conceptual crispness nor the level of certainty that the world of mathematics allows. In the real world we may not know, upon seeing numerous examples of white swans, to take a well-known example, whether or not there exist black swans. (They do in fact exist.) When we understand the genetic control of the bird's color we can say whether black swans are possible, but we still don't know if they exist in nature until we find one. So knowledge about the real world is always incomplete. This is another fundamental principle behind NARS: That of experience-based reasoning.
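The swan example can be made concrete with a small sketch of experience-based belief. It follows the frequency/confidence idea described in the NARS literature, where frequency is the share of positive evidence and confidence grows with the total amount of evidence but never reaches 1. The class name Evidence, the constant K = 1, and the default frequency for the no-evidence case are assumptions of this sketch, not a description of the actual NARS implementation.

```python
from dataclasses import dataclass

K = 1.0  # assumed "evidential horizon" constant for this sketch


@dataclass
class Evidence:
    """Running tally of evidence for a statement such as 'swans are white'."""

    positive: float = 0.0  # observations that support the statement
    total: float = 0.0     # all relevant observations

    def observe(self, supports: bool) -> None:
        self.total += 1
        if supports:
            self.positive += 1

    @property
    def frequency(self) -> float:
        # Share of supporting evidence; 0.5 with no evidence is an assumption.
        return self.positive / self.total if self.total else 0.5

    @property
    def confidence(self) -> float:
        # Approaches 1 as evidence accumulates but never reaches it.
        return self.total / (self.total + K)


swans_are_white = Evidence()
for _ in range(100):
    swans_are_white.observe(True)   # a hundred white swans
print(swans_are_white.frequency, swans_are_white.confidence)

swans_are_white.observe(False)      # one black swan
print(swans_are_white.frequency, swans_are_white.confidence)
```

The point of the representation is that a single black swan does not make the belief "false"; it only lowers the frequency slightly, and no amount of white swans ever pushes the confidence to certainty.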

The reasoning in cognitive agents cannot be similar to mathematical reasoning because in the real world we never know the “ground truth”. As an example, we know that there probably exist atoms, but at one point these were hypothetical entities. And they were thought to be “atomic” – indivisible. Mathematical reasoning is based on axioms – given “truth”. Clearly, in systems that operate in the real world, certainty cannot be thus encoded, and therefore not assumed. If we want a flexible cognitive system, that can adapt to a variety of environments, that system must have mechanisms for generating its own “ground truth” – this cannot be given by the designer. Hence, the reasoning mechanisms for such systems must be different from axiomatic reasoning systems. That is where the “N” in “NARS” comes from: Non-axiomatic.

What does insufficient knowledge refer to? “Knowledge” here means any information that the agent can collect by relatively direct means, as well as any information that it can derive from that information through various forms of reasoning. The term refers to available information, because the insufficiency of memory resources may render some knowledge inaccessible. A smart agent may invent ways of dealing with limitations of its own internal memory, such as books, alarm clocks, computers, telephones, etc. – this way it does not need to keep everything in its head at all times, or rely on recalling everything at the right time, all the time, which for many tasks, e.g. building skyscrapers, is simply impossible for a human cognitive agent to do. This way the knowledge for how to use these technologies replaces the need to train the cognitive system itself for the specific tasks that they ultimately serve, e.g. to wake up at the right time.

We can now summarize a bit. Intelligence is necessary because of real world time constraints, and because of resource constraints that derive directly from temporal ones, including limited processing power, limited memory, limited mobility (in the case of embodiment), etc. Intelligence is the activity of achieving goals under temporal and resource constraints.

NARS is one of the very few systems – if not the only one to date – where the assumption of insufficient resources and knowledge is explicitly put at the center of the system's design and implementation. As a result it is one of the few – if not the only one – that explicitly targets AGI.

The choice of reasoning is made for reasons of manageability, as well as for the reason that no other feasible method exists to date for allowing such systems to construct their own knowledge over time. For any AGI system understanding is a key part of its operation – unless the system is capable of increasing its understanding of the world/domain over time, the system will not get smarter with experience. For a system whose knowledge is not hand-coded up front we have to build in mechanisms for the system to uncover “how the world works”. This means some mechanism that can “dissect” causal chains, based on experiential evidence, which is often loose, unreliable, and noisy – what may appear to the system as noise at one time may turn out to contain critical information for the causal chains at work. No approach exists, other than a logical/reasoning framework, that gives any hope of how such a bootstrapping mechanism could be implemented.

Let's look at some alternatives to reasoning for implementing systems capable of incrementally building their own knowledge. Probabilistic approaches assume that characterizing the connection between real-world phenomena in a statistical way may be sufficient to achieve intelligence. Substantial progress has been made in e.g. self-driving vehicles and machine perception using a probabilistic approach. However, upon closer examination a probabilistic approach will not suffice for building a complete AGI system, because for one we would expect such systems to be able to explain their knowledge – to some extent at least – and this is not possible to do in a “random access” fashion (in response to e.g. an external interviewer whose questions are not known in advance) unless causal chains are explicitly inferred. To take an example, in response to the question “why do you buckle your seatbelt?” humans will respond along the lines of “being buckled up makes each trip safer”. Follow-up questions about the relationship between moving automobiles, road conditions, causes for car crashes, speed, visibility, heavy traffic, etc., are relatively easy for any city-dwelling human to discuss and explain, as they are well aware that human reaction time has a lower limit, and that the faster a car goes the further it travels before any action can be taken. Humans learned this from experience (talking to others, reading, driving, watching movies, etc.). Reasoning about the causal connections between all these various things comes naturally to humans, and is demonstrably a key force behind human learning.
While a probability-based system might be able to predict, based on correlations, that using a seat belt reduces the probability of injury, a system that represents such relationships statistically has two major problems to deal with: First, the system is likely to be stuck at a particular level, or levels, of description – the correlation between fatal injuries and the use of seat belts is limited to that level of description, meaning that questions about issues higher up or lower down in abstraction are a problem. For example, questions about the material of the seat belt cannot be answered, as no methods exist for inferring the causal connection between the safety of seat belts and the material they are made from. So, at a minimum, probability-based approaches must supplement their knowledge with some inference capabilities. Second, such systems are not capable of cognitive growth – of autonomously changing from very dumb (at the outset) to very smart (after a lot of varied experience). Cognitive growth requires exploring the relationship between phenomena at many levels of detail, from materials to garments to the garment industry, from split-second events to the unfolding of each day to the meaning of a lifespan. Restricting representations to probability-based methods makes such exploration extremely difficult: Unless they are augmented with inference capabilities – that enable them to move up and down in an abstraction hierarchy of part-whole causal chains – these systems cannot grow cognitively. They would instead be limited to minute incremental additions to their knowledge. They would also require a significant part of their knowledge to be provided by an outside source – a designer that feeds in the knowledge at the abstraction level at which the system will from then on largely operate.

Another popular method for constructing learning systems is artificial neural networks (ANNs). The key limitation with this approach is that it restricts the AGI design to a particular size of building block but does not provide any ideas for how to construct architectures – larger structures that enable large numbers of clusters of ANNs to operate together to implement a large set of cognitive functions in an agent. It is like trying to build a modern skyscraper in the 1700s with piles of sand – the principles for how to “glue” the sand together were not robust enough for high rises, glass could not be fashioned into the large plates necessary, and the principles for harnessing electricity to implement elevators were nowhere to be found, not to mention the seemingly minute but critical details for how to make elevators safe to ride in case of faults and electric blackouts. For ANNs, the problems of how to construct AGI systems capable of cognitive growth are at least as great as those for probabilistically-based approaches: Mechanisms for incrementally building knowledge at many levels of detail, for a variety of contexts and domains, are missing from the equation, and no obvious solutions have been suggested as of yet. We know that implementing cognitive systems with neurons is possible, because we have natural intelligence all around us, but the principles for how to do this are elusive, and may represent the long road to our goal of AGI.

Because we need to implement mechanisms that are capable of building models of the causal chains at work in the world, reasoning cannot be avoided (random search is categorically out of the question, as it will take more time than the age of the universe to come up with anything useful). If reasoning is a necessary part of the only viable approach, what kinds of technologies allow us to implement reasoning? To some extent one can think of reasoning as particular sequences of pattern matching. The implementation task therefore revolves around how to efficiently implement sequential pattern matching mechanisms. But first we need to get an idea of what kinds of pattern matching we are talking about. Humans use several types of reasoning, chief among them deduction, abduction, and induction. The latter two are often called ampliative reasoning, because their conclusions go beyond what the premises strictly contain. NARS implements all three via a kind of term logic, which is different from propositional logic in important ways – a key one being that abduction and induction, two forms of reasoning considered extremely challenging to implement in AI systems, become much easier to do. The catch is that deduction becomes a bit less obvious in this approach. But the benefits far outweigh the negatives, because no AGI can be envisioned without abduction and induction. Why is that?

Abduction is the act of inferring causes from effects: If you see that the ground is wet, you may infer that it recently rained. Abduction is the kind of intelligence that the detective Sherlock Holmes had in such plentiful amounts. It is very difficult to imagine how a cognitive agent that does not possess at least some ability to do abductive reasoning could reach any sort of general intelligence. Induction is the act of generalizing from observed instances – to invent a global rule for a particular kind of phenomenon. Einstein's theory of relativity, with its famous result E=mc^2, is the ultimate example of inductive reasoning: In its extreme forms induction is essentially scientific discovery, and starts to resemble very closely what professional inventors are so good at. (Some philosophers argue that all scientific discoveries are inventions, because until we have a complete theory of anything and everything in the universe, scientific “discoveries” are actually only approximations to something we hope is the truth; approximations are not discoveries at all – they are inventions.) As with abduction, it is difficult to imagine an AGI that doesn't have at least some form of inductive reasoning capabilities. Without inductive reasoning one would be hard pressed to see a system create anything more complex than that which is informed by trial and error, or some form of inductive bias imparted to the system at birth by its designer. Needless to say, no one has suggested how either abduction or induction can be implemented in a statistical framework or in an ANN framework. We are thus forced to fall back on known methods from logic programming for implementing these methods.
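To see why a term logic makes abduction and induction easy to express, the three inference patterns discussed above can be sketched as simple pattern matching over statements of the form "subject -> predicate". The sketch follows the usual textbook presentation of the term-logic syllogisms; it leaves out truth values entirely and is in no way the NARS implementation. The type Inh, the function names, and the example terms are made up for illustration.

```python
from typing import NamedTuple, Optional


class Inh(NamedTuple):
    """A term-logic statement 'subject -> predicate' (S is a kind of P)."""

    subject: str
    predicate: str


def deduction(a: Inh, b: Inh) -> Optional[Inh]:
    # {M -> P, S -> M} |- S -> P : the shared term M is a's subject, b's predicate
    if a.subject == b.predicate:
        return Inh(b.subject, a.predicate)
    return None


def abduction(a: Inh, b: Inh) -> Optional[Inh]:
    # {P -> M, S -> M} |- S -> P : the two premises share their predicate M
    if a.predicate == b.predicate:
        return Inh(b.subject, a.subject)
    return None


def induction(a: Inh, b: Inh) -> Optional[Inh]:
    # {M -> P, M -> S} |- S -> P : the two premises share their subject M
    if a.subject == b.subject:
        return Inh(b.predicate, a.predicate)
    return None


# Deduction: birds are animals, robins are birds, so robins are animals.
print(deduction(Inh("bird", "animal"), Inh("robin", "bird")))

# Abduction: rainy days are wet-ground days, today is a wet-ground day,
# so (tentatively) today is a rainy day.
print(abduction(Inh("rainy_day", "wet_ground_day"), Inh("today", "wet_ground_day")))

# Induction: the observed swans are white, the observed swans are swans,
# so (tentatively) swans are white.
print(induction(Inh("observed_swan", "white"), Inh("observed_swan", "swan")))
```

All three rules are the same kind of operation: find the term two statements share and connect the remaining two terms. Only the deductive conclusion is guaranteed by its premises; the abductive and inductive conclusions are guesses, which is why a system that relies on them also needs some measure of how much evidence stands behind each statement.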