ATAI-20 Reykjavik University
Engineering Assignment 3:
Non-Axiomatic Reasoning System (NARS) and OpenNARS for Applications (ONA)
Aim: This assignment is meant to introduce you to OpenNARS-for-Applications (ONA), a general machine intelligence (GMI) aspiring system created by Patrick Hammer which is derived from Pei Wang’s Non-Axiomatic Reasoning System (NARS).
Summary: This assignment is divided into two different parts:
- Introduction to ONA: In this part you will read about Narsese (the language used to feed information to and receive information from NARS/ONA) and apply what you have learned by creating your own small experiment in which ONA is deployed to answer different questions about the environment.
- ONA on the Cart-Pole task: You will (once again) use the cart-pole environment and run ONA on different variations of it, similar to Assignment 1. This time, however, ONA offers many different possibilities for task-environment changes, such as hiding variables at runtime. Even though ONA was not designed to tackle tasks such as the cart-pole task-environment, you will have the opportunity to see a new, different system applied to an already well-known task-environment.
Part 1 - Introduction to ONA
ONA is a different kind of system from any reinforcement learner or artificial neural network. It is built on non-axiomatic reasoning. To interact with ONA (or NARS), the language Narsese was developed. In this part you will get to know Narsese better and build your own “Fuzzy Logic” problem for ONA to “solve”. To get a first grasp of Narsese (or NAL), please refer to the slides from 2018 by Xiang Li: nars-tutorial.pdf.
First, build ONA on your system. Follow the installation steps described at https://github.com/opennars/OpenNARS-for-Applications. If you do not have a Linux/Ubuntu system installed on your computer, please find out how to get ONA running. You can, for example, use a VM or Linux for Windows (presumably), but please look into this early and ask in good time if something does not work!
Once you have ONA set up correctly (try the evaluation as explained at the GitHub link), you can get started with the task.
Your Task:
See the examples given in the ONA source code. There you can find examples in Narsese (*.nal) and English (*.english). Your task is to create your own experiment similar to the “school.nal” example (https://github.com/opennars/OpenNARS-for-Applications/blob/master/examples/nal/school.nal). Think of a problem that can be described in NAL and create the corresponding *.nal file. Then run ONA on it. Define your own questions or commands which ONA has to answer, and describe the expected output. Attach your *.nal file when submitting the assignment. Your experiment should include at least one statement from each of the eight NAL levels described in the lecture slides (NAL-1 to NAL-8).
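As a starting point, the following is a minimal sketch of how a small *.nal file could be generated from Python. The terms (robin, bird, animal), the file name and the helper functions are hypothetical and not taken from school.nal. The sketch covers only NAL-1 inheritance statements plus one question, and the sentence check is deliberately rough (it does not handle truth values, comments or cycle-step lines).

```python
from pathlib import Path

# Narsese sentence punctuation: '.' belief, '!' goal, '?' question.
PUNCTUATION = (".", "!", "?")

# Hypothetical NAL-1 example; replace with your own problem description.
STATEMENTS = [
    "<robin --> bird>.",    # belief: robin is a kind of bird
    "<bird --> animal>.",   # belief: bird is a kind of animal
    "<robin --> animal>?",  # question that could be answered by deduction
]


def is_sentence(line):
    """Rough check that a line ends like a Narsese sentence.

    An optional trailing present-tense marker ':|:' is ignored.
    """
    core = line.strip()
    if core.endswith(":|:"):
        core = core[:-3].rstrip()
    return core.endswith(PUNCTUATION)


def write_nal(path, statements):
    """Write one statement per line; raise ValueError on malformed lines."""
    bad = [s for s in statements if not is_sentence(s)]
    if bad:
        raise ValueError(f"not a Narsese sentence: {bad[0]!r}")
    path = Path(path)
    path.write_text("\n".join(statements) + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    print(write_nal("my_experiment.nal", STATEMENTS))
```

The resulting file can then be given to ONA in the same way as the examples shipped with the ONA source code.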
Part 2 - ONA on the cart-pole task
To start working with ONA on the cart-pole task you must make certain changes and rebuild ONA before you can start using it.
Prerequisites:
One of the changes is in the environment itself: please download the newest version of the cart-pole environment. It now includes the option to hide observables during runtime, as well as a Python script called ona.py, which is the interface used to pass data to and receive data from ONA.
Furthermore, ONA itself needs to be adjusted to work with the cart-pole environment. This restricts ONA to using only “^left” and “^right” as actions (similar to the actor-critic or yourself in Assignments 1 and 2).
For this you will have to change a few lines in two of ONA's C source files:
1. Open …/OpenNARS-for-Applications/src/Shell.c and comment out lines 75-82 so that the Shell_NARInit() function looks like this:
void Shell_NARInit()
{
    fflush(stdout);
    NAR_INIT();
    PRINT_DERIVATIONS = true;
    int k=0; if(k >= OPERATIONS_MAX) { return; };
    NAR_AddOperation(Narsese_AtomicTerm("^left"), Shell_op_left); if(++k >= OPERATIONS_MAX) { return; };
    NAR_AddOperation(Narsese_AtomicTerm("^right"), Shell_op_right); if(++k >= OPERATIONS_MAX) { return; };
    //NAR_AddOperation(Narsese_AtomicTerm("^up"), Shell_op_up); if(++k >= OPERATIONS_MAX) { return; };
    //NAR_AddOperation(Narsese_AtomicTerm("^down"), Shell_op_down); if(++k >= OPERATIONS_MAX) { return; };
    //NAR_AddOperation(Narsese_AtomicTerm("^say"), Shell_op_say); if(++k >= OPERATIONS_MAX) { return; };
    //NAR_AddOperation(Narsese_AtomicTerm("^pick"), Shell_op_pick); if(++k >= OPERATIONS_MAX) { return; };
    //NAR_AddOperation(Narsese_AtomicTerm("^drop"), Shell_op_drop); if(++k >= OPERATIONS_MAX) { return; };
    //NAR_AddOperation(Narsese_AtomicTerm("^go"), Shell_op_go); if(++k >= OPERATIONS_MAX) { return; };
    //NAR_AddOperation(Narsese_AtomicTerm("^activate"), Shell_op_activate); if(++k >= OPERATIONS_MAX) { return; };
    //NAR_AddOperation(Narsese_AtomicTerm("^deactivate"), Shell_op_deactivate); if(++k >= OPERATIONS_MAX) { return; };
    assert(false, "Shell_NARInit: Ran out of operators, add more there, or decrease OPERATIONS_MAX!");
}
Only the atomic terms “^left” and “^right” should remain registered as operations.
2. Open …/OpenNARS-for-Applications/Config.h and change the value of “OPERATIONS_MAX” in line 86 to 2 (instead of 10):
//Maximum amount of operations which can be registered
#define OPERATIONS_MAX 2
3. Rebuild ONA.
Now that everything is set up, you can start the task for this part of the assignment. Download the latest Python project, exercise_3.zip, and proceed as before by installing the packages listed in requirements.txt. You will have to change the path to your ONA NAR shell in ona.py, line 26. Everything else should be set up correctly. Also have a look at ona.py in the downloaded Python source code and try to understand how the data is parsed before being passed to ONA, as well as the changed reward condition. Note that the reward function is no longer a simple plus or minus.
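To illustrate the general idea of turning continuous observations into Narsese input, here is a minimal sketch. It is not the code from ona.py: the function names, the bin width, the term naming scheme (for example theta_p2 or v_n1) and the way hidden variables are skipped are all assumptions made for illustration. Each visible variable is mapped to an integer bin and emitted as a present-tense event.

```python
def discretize(value, bin_width):
    """Map a continuous value to an integer bin index via floor division."""
    return int(value // bin_width)


def bin_label(index):
    """Hypothetical naming scheme: 'p<k>' for bins >= 0, 'n<k>' for bins < 0."""
    return f"p{index}" if index >= 0 else f"n{-index}"


def observation_to_events(observation, bin_width, hidden=()):
    """Turn a dict of observables into Narsese event strings.

    Variables listed in `hidden` are skipped, so the reasoner never sees them.
    The suffix ':|:' marks a present-tense event in Narsese.
    """
    events = []
    for name, value in observation.items():
        if name in hidden:
            continue
        label = bin_label(discretize(value, bin_width))
        events.append(f"{name}_{label}. :|:")
    return events


if __name__ == "__main__":
    obs = {"x": 1.2, "v": -0.3, "theta": 0.0, "omega": 0.7}
    for event in observation_to_events(obs, bin_width=0.5, hidden={"v"}):
        print(event)
```

A coarser bin width produces fewer distinct terms, while a finer one preserves more detail at the cost of more distinct events; comparing this sketch with the actual parsing in ona.py is a useful way to see which trade-off was chosen there.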
Your Task:
- Plain Vanilla. Evaluate ONA’s performance on the cart-pole task given to you as Python code:
- 1.a Run the learner repeatedly (at least 5 times); collect the data. Stop each run when 300 epochs are reached.
- 1.b Plot its improvement in performance over time (for example the mean of the 5+ runs or a running average or similar).
- Modified Version. Evaluate the learner’s performance on modified versions of the cart-pole task and compare the results to those from the plain vanilla runs:
- 2.a Limited Observations. Hide each variable one by one (from the start of the experiment) and run ONA at least 5 times for each condition, at least 300 epochs (increase this number, if you believe that it is necessary), and plot its improvement in performance over time (for example the mean of the 5+ runs or a running average or similar).
- 2.b Sudden Availability. Hide each variable one by one from the start of the experiment; then expose it after 200 epochs. Let the system run until it relearns the new task, and continue for another 200 epochs before hiding the variable again and continuing for another 200 epochs. Do this at least 5 times per variable. Stop if ONA cannot relearn the new task after 1500 episodes. Plot its improvement in performance over time (for example the mean of the 5+ runs or a running average or similar).
- 2.c Sudden Disappearance. The exact opposite of 2.b: all variables (x, v, theta, omega) are exposed at the beginning, and a single variable is hidden after 200 epochs before being re-exposed. Apply the same epoch rules for hiding/exposure as described in 2.b (just the other way around). Do this for the variables one by one, again at least 5 times each, and plot your results.
- 2.d Custom Task Mod. Think of a way to change the task after a certain number of epochs (e.g. 200). Come up with at least three different changes; recall the first assignment for ideas.
- 2.e Try Out Your Own Ideas with ONA. For example, what happens if you change the discretization of the observations (see the data to Narsese parsing)? What happens if you change the reward conditions? Try things out that you are curious about and try to figure out some of the possibilities (and limitations) of ONA.
- Report. Summarize your results in a report. Compare them to the results from the first assignment (where appropriate). Try to explain your results. What makes ONA different from a Deep Reinforcement Learner?
The summary of the results from the first assignment can be found here: summary-general-remarks.pdf
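The bookkeeping for the tasks above can be sketched as follows. This is one possible reading of the task descriptions, not part of the provided project: all function names and the relearned_at parameter are hypothetical, and the criterion for deciding when ONA has "relearned" the task is left to you. The sketch shows the hide/expose schedule for 2.b (2.c is its negation) and two ways of summarizing per-epoch scores for plotting.

```python
def hidden_sudden_availability(epoch, relearned_at=None, phase=200):
    """Return True if the variable is hidden at `epoch` under the 2.b rules.

    Hidden for the first `phase` epochs, then exposed. Once relearning has
    been judged complete at epoch `relearned_at`, the variable stays exposed
    for another `phase` epochs and is hidden again afterwards.
    """
    if epoch < phase:
        return True
    if relearned_at is None:  # still waiting for the system to relearn
        return False
    return epoch >= relearned_at + phase


def hidden_sudden_disappearance(epoch, relearned_at=None, phase=200):
    """2.c is read here as the exact opposite of the 2.b schedule."""
    return not hidden_sudden_availability(epoch, relearned_at, phase)


def mean_across_runs(runs):
    """Element-wise mean of per-epoch scores over several runs.

    Runs of different length are truncated to the shortest one.
    """
    n_epochs = min(len(run) for run in runs)
    return [sum(run[i] for run in runs) / len(runs) for i in range(n_epochs)]


def running_average(values, window):
    """Trailing running average; early entries average what is available."""
    averaged = []
    total = 0.0
    for i, value in enumerate(values):
        total += value
        if i >= window:
            total -= values[i - window]  # drop the value leaving the window
        averaged.append(total / min(i + 1, window))
    return averaged


if __name__ == "__main__":
    runs = [[1, 2, 3, 4], [3, 4, 5, 6]]
    mean_curve = mean_across_runs(runs)
    print(mean_curve)
    print(running_average(mean_curve, window=2))
```

The resulting lists can be passed to any plotting library; plotting the mean curve together with a running average makes trends easier to see when individual runs are noisy.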