public:t-720-atai:atai-21:engineering_assignment_2
— | public:t-720-atai:atai-21:engineering_assignment_2 [2021/10/11 08:21] – leonard | ||
---|---|---|---|
+ | [[public: | ||
+ | \\ | ||
+ | |||
+ | ====ATAI-21 Reykjavik University==== | ||
+ | |||
+ | \\ | ||
+ | \\ | ||
+ | ====Engineering Assignment 2:==== | ||
+ | =====Non-Axiomatic Reasoning System (NARS) and OpenNARS for Applications (ONA)===== | ||
+ | |||
+ | \\ | ||
+ | |||
+ | **Aim:** This assignment introduces you to OpenNARS-for-Applications (ONA), a system aspiring to general machine intelligence (GMI). ONA was created by Patrick Hammer and is derived from Pei Wang’s Non-Axiomatic Reasoning System (NARS). | ||
+ | |||
+ | **Summary:** | ||
+ | - **Introduction to ONA:** In this part you will read about Narsese (the language used to feed information to, and receive information from, NARS/ONA) and apply what you have learned by creating your own small experiment, in which ONA is deployed to answer different questions about the environment. | ||
+ | - **ONA on the Cart-Pole task:** You will (once again) use the cart-pole environment to try ONA on different task-environment variations, similar to Assignment 1. However, ONA offers many different possibilities for task-environment changes, such as hiding variables at runtime. Even though ONA was not designed to tackle tasks such as the cart-pole task-environment, you will have the opportunity to see a new and different system applied to an already well-known task-environment. | ||
+ | |||
+ | \\ | ||
+ | \\ | ||
+ | |||
+ | ====Part 1 - Introduction to ONA==== | ||
+ | |||
+ | ONA differs from reinforcement learners and artificial neural networks: it is built on non-axiomatic reasoning. The language Narsese was developed for interacting with ONA (or NARS). In this part you will get to know Narsese better and build your own “Fuzzy Logic” problem for ONA to “solve”. For a first grasp of Narsese (or NAL), please refer to the slides from 2018 by Xiang Li: {{/ | ||
+ | |||
+ | First, build ONA on your system by following the installation steps described in [[https:// | ||
+ | |||
+ | Once you have ONA set up correctly (try the evaluation as explained in the GitHub link), you can get started with the task. | ||
+ | |||
+ | \\ | ||
+ | |||
+ | ===Your Task:=== | ||
+ | See the examples given in the ONA source code. There you can find examples in Narsese (*.nal) and English (*.english). Your task is to create your own experiment similar to the described “school.nal” example ([[https:// | ||
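As a starting point for your own experiment, the following minimal sketch generates the text of a small Narsese file from Python. It assumes the standard Narsese inheritance notation (`<subject --> predicate>.` for a belief, `<subject --> predicate>?` for a question) and that a bare number on its own line asks ONA to run that many inference cycles. The function names and the robin/bird/animal content are invented for illustration and are not taken from the ONA examples.

```python
def inheritance(subject, predicate):
    """Return a Narsese inheritance belief, e.g. '<robin --> bird>.'"""
    return f"<{subject} --> {predicate}>."


def question(subject, predicate):
    """Return a Narsese inheritance question, e.g. '<robin --> animal>?'"""
    return f"<{subject} --> {predicate}>?"


def build_experiment(beliefs, questions, cycles=10):
    """Build the text of a .nal file: beliefs, some inference cycles, questions.

    `beliefs` and `questions` are lists of (subject, predicate) pairs.
    The layout is an assumption for illustration, not ONA's required format.
    """
    lines = [inheritance(s, p) for s, p in beliefs]
    lines.append(str(cycles))  # assumed: a bare number runs that many cycles
    lines.extend(question(s, p) for s, p in questions)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    text = build_experiment(
        beliefs=[("robin", "bird"), ("bird", "animal")],
        questions=[("robin", "animal")],
    )
    print(text, end="")
```

Redirecting the printed text into a file such as `my_experiment.nal` (a hypothetical name) gives you an input you can adapt and feed to ONA in the same way as the bundled examples.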
+ | |||
+ | \\ | ||
+ | \\ | ||
+ | ---- | ||
+ | |||
+ | |||
+ | |||
+ | ====Part 2 - ONA on the cart-pole task==== | ||
+ | Before you can use ONA on the cart-pole task, you must make a few changes and rebuild ONA. | ||
+ | |||
+ | \\ | ||
+ | |||
+ | ===Prerequisites:=== | ||
+ | |||
+ | The first change is to the environment itself: please download the newest version of the cart-pole environment. It now includes the possibility to //hide observables// | ||
+ | |||
+ | Further, ONA itself needs to be adjusted to work with the cart-pole environment, so that it is restricted to using only “^left” and “^right” as actions (like the actor-critic, or yourself, in Assignment 1).\\ | ||
+ | To do this, you will have to change a few lines in two C files of ONA:\\ | ||
+ | |||
+ | \\ | ||
+ | |||
+ | 1. Open …/ | ||
+ | void Shell_NARInit() | ||
+ | { | ||
+ | fflush(stdout); | ||
+ | NAR_INIT(); | ||
+ | PRINT_DERIVATIONS = true; | ||
+ | int k=0; if(k >= OPERATIONS_MAX) { return; }; | ||
+ | NAR_AddOperation(Narsese_AtomicTerm(" | ||
+ | NAR_AddOperation(Narsese_AtomicTerm(" | ||
+ | // | ||
+ | // | ||
+ | // | ||
+ | // | ||
+ | // | ||
+ | // | ||
+ | // | ||
+ | // | ||
+ | assert(false, | ||
+ | } | ||
Only the atomic terms “^left” and “^right” should remain registered; all other operations are commented out.\\
+ | |||
+ | \\ | ||
+ | 2. Open .../ | ||
+ | //Maximum amount of operations which can be registered | ||
+ | #define OPERATIONS_MAX 2 | ||
+ | |||
+ | \\ | ||
+ | |||
+ | 3. Rebuild ONA.\\ | ||
+ | |||
+ | Now that everything is set up, you can start on the task for this part of the assignment. To do so, download the latest Python project: {{: | ||
+ | |||
+ | \\ | ||
+ | |||
+ | ===Your Task:=== | ||
+ | - **Plain Vanilla.** Evaluate ONA’s performance on the cart-pole task given to you as Python code: | ||
+ | * 1.a Run the learner repeatedly (at least 2 times); collect the data. Stop each run when 300 epochs are reached. | ||
+ | * 1.b Plot its improvement in performance over time. | ||
+ | - **Modified Version**. Evaluate the learner’s performance on a modified version of the cart-pole task and compare the results to those from the plain vanilla runs: | ||
+ | * 2.a **Limited Observations.** Hide each variable one by one (from the start of the experiment) and run ONA 1-2 times for each condition, for at least 300 epochs (increase this number if you believe it is necessary), and plot its improvement in performance over time. | ||
+ | * 2.b **Sudden Availability**. Hide each variable one by one from the start of the experiment; then expose it after 200 epochs. Let the system run until it relearns the new task, and continue for another 200 epochs before hiding the variable again and continuing for another 200 epochs. Do this 1-2 times per variable. Stop if ONA cannot relearn the new task after 1500 episodes. Plot its improvement in performance over time. | ||
+ | * 2.c **Sudden Disappearance**. The exact opposite of 2.b: all variables (x, v, theta, omega) are exposed at the beginning, and a single variable is hidden after 200 epochs before being re-exposed. Apply the same epoch rules for hiding/exposure as described in 2.b (just the other way around). Do this for each variable in turn, 1-2 times, and plot your results. | ||
+ | * 2.d **Custom Task Mod**. Think of a way to change the task after a certain number of epochs (e.g. 200). Think of at least 2 different changes. | ||
+ | * 2.e **Try Out Your Own Ideas** with ONA. For example, what happens if you change the discretization of the observations (see the data to Narsese parsing)? What happens if you change the reward conditions? Try things out that you are curious about and try to figure out some of the possibilities (and limitations) of ONA. | ||
+ | - **Report.** Summarize your results in a report. Compare them to the results from the first assignment (where appropriate). Try to explain your results. What makes ONA different from a Deep Reinforcement Learner? | ||
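The hide/expose rules in 2.b and 2.c and the "improvement over time" plots can be organized with two small helpers, sketched below. This is only one possible way to structure the bookkeeping: `variable_hidden`, `moving_average` and all parameter names are hypothetical and not part of the provided Python project, and how you actually hide a variable depends on the cart-pole environment's own interface.

```python
def variable_hidden(epoch, switch_epoch, revert_epoch, start_hidden):
    """Decide whether a variable is hidden at the given epoch.

    start_hidden=True models 2.b (hidden, then exposed, then hidden again);
    start_hidden=False models 2.c (exposed, then hidden, then exposed again).
    revert_epoch may be None while the system has not yet relearned the task,
    because the second switch is only known once relearning is observed.
    """
    toggled = epoch >= switch_epoch and (
        revert_epoch is None or epoch < revert_epoch
    )
    return start_hidden != toggled


def moving_average(values, window):
    """Smooth a per-epoch score series with a trailing window.

    The first window-1 entries average over the epochs seen so far.
    """
    smoothed = []
    total = 0.0
    for i, value in enumerate(values):
        total += value
        if i >= window:
            total -= values[i - window]  # drop the value leaving the window
        smoothed.append(total / min(i + 1, window))
    return smoothed


if __name__ == "__main__":
    # 2.b style run: hidden from the start, exposed at epoch 200,
    # hidden again at epoch 600 (relearned at 400, plus 200 more epochs).
    for epoch in (0, 199, 200, 599, 600):
        print(epoch, variable_hidden(epoch, 200, 600, start_hidden=True))
    print(moving_average([1, 2, 3, 4], window=2))
```

Plotting the smoothed series rather than the raw per-epoch scores usually makes the effect of each switch point easier to see; mark the switch epochs on the plot so the conditions can be compared across runs.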
Last modified: 2024/04/29 13:33