Center for Analysis and Design of Intelligent Agents

P1. Pick a Topic for Your Paper

Over the course period you will write one empirical research paper, in the form of a prototypical science paper: a study that requires formulating hypotheses, designing a comparative experiment, and obtaining and presenting empirical data.

What To Do

Your first assignment for the writing project is to decide on a topic to write a paper on.
Ideally, you should pick something that interests you, i.e. a topic you'd like to explore that fits the empirical paradigm.

Example Topics

Exploring variability in wireless signal reception - describes an experiment to answer a question about variability that interferes with the quality of wireless signals
Comparison of three different malware detection techniques - describes an experiment comparing three algorithms
Applying machine learning to load balancing - experiment proposes to test the hypothesis that a particular kind of machine learning can load balance better than humans
Comparison of unigram, trigram and five-gram language models in speech recognition - experimental comparison of three methods for doing speech recognition

How To Do It

The teacher will help you pick the best available topic. The topic must be approved by the teacher – not all topics are equally well suited for the purpose of learning how to write scientific papers. The paper's focus can be selected by proposing tentative titles – ideas for what the final title might be. Keep this in mind: The eventual title of a scientific paper should be as descriptive as possible and as short as possible. The tentative version may be longer, for the purpose of helping you stay focused on the topic at hand as your writing progresses.

Note: Topic and tentative title must be approved by the teacher. No submissions of further assignments are allowed until the tentative title and topic have been approved. (See further down on this page for the status of your topic selection.)

When your topic has been approved you may want to polish the tentative title – but it can also be honed as the paper progresses.

SEND AN EMAIL TO THE TEACHER (thorisson - at - gmail dot com) SUGGESTING SOME IDEAS, IN THE FORM OF A SHORT DESCRIPTION OF A TOPIC AND TENTATIVE TITLE (OR SOME ALTERNATIVES THAT YOU WOULD LIKE TO PICK FROM).

When the teacher has accepted it, you should

SUBMIT A SHORT DESCRIPTION OF THE TOPIC AND THE TENTATIVE TITLE ON MYSCHOOL IN A TXT FILE.

Please put your initials in the title of the submitted file, as well as the assignment number (e.g. “P1-JonJonsson”).

Typical format of a txt file turned in to MySchool:

MY TENTATIVE TITLE
MY NAME
SHORT DESCRIPTION OF WHAT QUESTION(S) WILL BE ADDRESSED IN THE PAPER

APPROVED TENTATIVE TITLES 2015

Here all the approved tentative titles will be posted as they get agreed on by teacher and student.

1.

Improving IceTagger - increasing its accuracy with a verb-based dictionary

Starkaður Barkarson

IceTagger is a rule-based tagger for Icelandic text, developed by Hrafn Loftsson. Its aim is to mark each word in a text with an appropriate part-of-speech tag (indicating for example the word's class, gender, whether it is singular or plural, etc.). In my thesis I will try to improve its accuracy (which is now around 91%) by adding a list of verbs to an attached dictionary, with information about which case they call for. My hypothesis is that by doing that the tagger will be able to resolve ambiguity better.

2.

Improving Medical Diagnosis' Accuracy Using Machine Learning

Ragnar Adolf Árnason

Measure the accuracy of machine learning algorithm(s) in medical diagnosis and compare it to that of doctors. My hypothesis is that ML algorithms should deliver at least as good results as doctors, given the same amount of information (medical records). Rarer diseases might be harder for the algorithms to learn, given that we have fewer examples of them. On the other hand, few doctors are exposed to them, so less experienced doctors might be more prone to misdiagnosing those diseases. Should the experiment be formed in such a way that one group is the diagnoses of doctors, and the other group is the diagnoses of doctors who have access to an ML “support system”? The doctors would handle the gathering of data (questioning patients and collecting symptoms), and the ML system would be fed that data and give the doctors a suggestion of likely diagnoses, which the doctors could use to assist their decision making. This introduces another variable: whether or not the doctors would actually use the system.

3.

General game description learning for picking optimal play strategy

Þorgeir Auðunn

There are a lot of different algorithms that are used to solve games. However, knowing which one fits best is a hard problem. Solutions range from specialized single strategies to picking between multiple strategies on the fly.

The question that will be answered in this paper is: H1: Can we learn which strategy is most effective on a game, based solely on its description?

Note that this is different from switching between different strategies on the fly, which has been shown in an existing paper to be an effective game play strategy. http://www.mini.pw.edu.pl/~swiechowskim/Miniplayer.pdf

For this experiment we pick three strategies: Automatic Heuristic Construction (AHC), Monte Carlo tree search (MCTS) and History Heuristic (HH).

We create our selection mechanism by running all strategies on many different games and using machine learning to learn which strategy was best for which game description. The “best” strategy is the one with the most wins in a particular game.

Then we can test if the selection process can correctly pick the best strategy for 5 unseen game descriptions (Chess, ConnectFour, Checkers, Othello and 9 Board Tic-Tac-Toe).

The expected result is a confidence factor for each unseen game description on how good each strategy will be. For example, classifying the optimal strategy for ConnectFour might give us HH = 20%, MCTS = 35% and AHC = 45%. The percentage returned should indicate how many games each strategy should win against the others. The question then becomes whether there is a statistical correlation between the numbers we get and the actual win ratios.
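The comparison between predicted confidence factors and measured win ratios could be sketched as follows. This is only an illustration, assuming the Pearson correlation coefficient is the chosen statistic (the description does not name one). The function name `pearson` and the `actual` values are hypothetical; only the `predicted` values come from the ConnectFour example above.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Predicted shares for ConnectFour from the text, in the order HH, MCTS, AHC.
predicted = [0.20, 0.35, 0.45]
# Hypothetical measured win ratios, invented purely for illustration.
actual = [0.25, 0.30, 0.45]

r = pearson(predicted, actual)
print(f"correlation between predicted and actual win ratios: {r:.3f}")
```

With only three strategies per game, a single coefficient says little on its own; pooling the pairs over all five unseen games would give a more meaningful estimate.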

4.

Self optimizing activation functions in deep neural networks

Kristján Árni Gerhardsson

In this paper we would like to find out what kind of effect self-learning an activation function has on the learning rate for different data sets. We would also like to find out whether individual activation functions for neurons have a positive effect on learning compared with using a single 'optimal' activation function.

5.

Power through Suggestion: Accounting for Outside Influence in the Formulation of Choice

Unnar Kristjánsson

A game simulation is proposed to evaluate scenarios in which a human player who is presented with a varied choice is also subject to external influence in the form of another human player.

Assuming these choices are equally viable as paths to completion, will the presence of another player affect player choice in any meaningful way, such that a decision towards a certain path accounts for this player's influence?

6.

Comparison between manual and automated testing of web applications: having to choose only one, which method is more useful?

Anna Vigdís Rúnarsdóttir

Testing a web application before it goes live can prevent critical failures that would otherwise be revealed to the public. Having a good testing methodology is therefore vital for releasing a quality web-based system. Manual and automated testing are two examples of useful testing methods that can be used, and they are often combined. But if one has to choose between manual and automated testing, which method would prove more useful for catching major and critical bugs?

Hypothesis: When the same web application is tested with either manual or automated tests, one of the methods should find more critical and major bugs than the other.

7.

Image bundling for increasing download efficiency of multiple images

Dovydas Stankevicius

Would image bundling make image downloading more efficient? Would this method still be more efficient if we are dealing with a higher number of images?

Hypothesis: Image concatenation is a more effective image download strategy than downloading images separately, because it would decrease the number of required HTTP requests, hence ultimately decreasing the HTTP overhead.

8.

Comparison of diabetic management and machine learning algorithm

Bjarni Kristján Leifsson

If people with diabetes don't control their insulin intake, they put themselves at high risk of damaging their body and risk the failure of various organs and tissues. I intend to look into the possibility of using machine learning to prevent this damage to the body by learning how the body reacts to insulin intake and nutrition consumption, in order to predict the volume of insulin needed to keep the individual's glucose levels optimal. To do this I intend to do a comparison study of current research into machine learning and data mining.

9.

Local vs remote computation on mobile applications

Joy Rossi

Due to the spread of mobile devices, such as smartphones, smartwatches and similar devices, and therefore the spread of mobile applications, new software engineering problems have appeared. They relate to applications that must run on small devices, usually without much computational power. One of these problems is deciding when it is better to perform a computation on the device and when to perform it on a remote server and then send the result to the mobile device over the internet connection. Assuming that it is possible to repeat the same experiment without external interference on the system, it is possible to test how local and remote computation are affected by changes in the internet connection speed (bandwidth) and the computational power of the device. For instance, if we consider an application that converts an image from one data type to another, remote computation could be better if the mobile device has a good internet connection; if instead it has a bad connection, sending the image to the server could take more time than the computation itself would take. Thanks to this experiment we could define which option is more suitable for the computation in relation to the variables defined, and make mobile applications more efficient.
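The trade-off described above can be illustrated with a rough timing model. This is a minimal sketch under simplifying assumptions that are not part of the description: computation cost is expressed in abstract "work units", transfer time is payload size divided by bandwidth, and all function names, parameters and numbers are hypothetical.

```python
def local_seconds(work_units, device_units_per_s):
    """Time to run the computation on the mobile device itself."""
    return work_units / device_units_per_s


def remote_seconds(work_units, server_units_per_s, upload_bytes,
                   download_bytes, bandwidth_bytes_per_s, round_trip_s=0.0):
    """Time to upload the input, compute on the server, and download the result."""
    transfer = (upload_bytes + download_bytes) / bandwidth_bytes_per_s
    return round_trip_s + transfer + work_units / server_units_per_s


def cheaper_side(local_s, remote_s):
    """Return which side finishes first; ties go to the device."""
    return "local" if local_s <= remote_s else "remote"


# Hypothetical image conversion: 4 MB uploaded, 4 MB downloaded, 10 work units.
job = dict(work_units=10.0, server_units_per_s=20.0,
           upload_bytes=4_000_000, download_bytes=4_000_000, round_trip_s=0.1)

local_s = local_seconds(10.0, device_units_per_s=2.0)
fast = remote_seconds(bandwidth_bytes_per_s=10_000_000, **job)  # good connection
slow = remote_seconds(bandwidth_bytes_per_s=500_000, **job)     # bad connection

print("good connection:", cheaper_side(local_s, fast))
print("bad connection: ", cheaper_side(local_s, slow))
```

In an actual experiment, the times would be measured rather than modelled; the sketch only shows how bandwidth and device power act as the independent variables.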

10.

A comparison experiment of different union-find algorithms on three different processors

Sigurgrímur Unnar Ólafsson

Theoretically it makes no difference for time or space complexity in union-find algorithms whether you run union(p, q) or union(q, p) (find citation). I hypothesise that this holds true in practice, and to test this I will run both implementations on three different machines running Windows OS and compare the results. One machine has an i7 processor and 16 GB of RAM, one has an i5 processor and 8 GB of RAM and the last one has an i3 processor and 4 GB of RAM. I expect the first machine will be the fastest and the last the slowest, but the space complexity will be the same for all machines. If I am wrong, then the results should show that it matters which order p and q are in when you call union().
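For reference, the two call orders can be compared on a small example. The sketch below assumes a weighted quick-union with path halving; the description does not say which union-find variants will be tested, so this choice, the class name `UnionFind` and the helper `components` are illustrative only.

```python
class UnionFind:
    """Weighted quick-union with path halving (an assumed variant)."""

    def __init__(self, n):
        self.parent = list(range(n))  # every element starts as its own root
        self.size = [1] * n           # size of the tree rooted at each element

    def find(self, p):
        while self.parent[p] != p:
            # Path halving: point p at its grandparent, then step up.
            self.parent[p] = self.parent[self.parent[p]]
            p = self.parent[p]
        return p

    def union(self, p, q):
        root_p, root_q = self.find(p), self.find(q)
        if root_p == root_q:
            return
        # Attach the smaller tree under the larger one.
        if self.size[root_p] < self.size[root_q]:
            root_p, root_q = root_q, root_p
        self.parent[root_q] = root_p
        self.size[root_p] += self.size[root_q]


def components(uf, n):
    """Return the connected components as a sorted list of sorted lists."""
    groups = {}
    for i in range(n):
        groups.setdefault(uf.find(i), []).append(i)
    return sorted(sorted(g) for g in groups.values())


pairs = [(0, 1), (2, 3), (1, 3), (4, 5)]
forward, swapped = UnionFind(6), UnionFind(6)
for p, q in pairs:
    forward.union(p, q)   # union(p, q)
    swapped.union(q, p)   # union(q, p)

print(components(forward, 6) == components(swapped, 6))
```

Both orders produce the same components, although the chosen root can differ when the two trees have equal size. The experiment would add timing and memory measurements around such calls.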

11.

A comparison of machine learning classification techniques

Ingibergur Sindri Stefnisson

The question that the paper will try to answer is whether decision trees can be at least as fast and as correct as other machine learning algorithms for simple datasets. The reason for this is that decision trees are very human-readable and simpler than other types of machine learning classification algorithms. The idea is to determine whether there is any need for other machine learning algorithms for simple datasets.

12.

Compiling SCCharts: Dataflow Approach vs. Priority-based Approach

Caroline Butschek

For the synchronous language SCCharts, two compilers have been proposed: a priority-based compiler and a data-flow compiler. Results from 2014 exist, but the implementation of the data-flow compiler has since been optimized, and the compiler used for the priority-based compilation was the predecessor of the current compiler, which used a slightly different algorithm. Therefore it makes sense to compare them again. Furthermore, the so-called “railway project” used SCCharts, so there is an additional test program alongside the old test suite, which only contained small programs. Interesting parameters: compile time, program size, and response time. The programs I would like to test are the railway project and some smaller tests.

13.

Creating simple UI client components with AngularJS, React/Ember and Backbone

Patrekur Patreksson

My idea is for the paper to be about JavaScript frameworks, more specifically a comparison of three different frameworks for client applications. The frameworks in question are AngularJS, React/Ember and Backbone.

The question I would like to answer is how these frameworks compare when creating relatively small and simple components. How do the learning curves compare?

The main questions that I would like to answer refer to the following features/aspects:

-Two-way data binding/one-way data binding: When changing something in the view (UI), the model (state data) changes automatically. When changing the model, the view (UI) changes automatically. When building simple client components/applications, how do the frameworks in question implement this feature? What is different? How easy is it to use?

-Dependency injection: Technique where the framework loads modules automatically for you. When building simple client components/applications, how do the frameworks in question implement this feature? What is different? How easy is it to use?

-Routing: Allows referencing a view with a specific URL and having the URL change the application state. When building simple client components/applications, how do the frameworks in question implement this feature? What is different? How easy is it to use?

-Templates: Allows specifying a predefined, reusable display template (HTML) that can accept data. When building simple client components/applications, how do the frameworks in question implement this feature? What is different? How easy is it to use?

14.

Performance Improvements Using Open-Channel SSD Interface With LightNVM

Freysteinn Alfreðsson

Abstract (22nd Sep, 2016). Solid State Drives (SSDs) are replacing magnetic disks as secondary storage for database management, as they offer orders of magnitude improvement in terms of bandwidth and latency. In terms of system design, the advent of SSDs raises considerable challenges, which have sparked the debate about the necessary death of the block device interface [Bjørling, Bonnet, Bouganim, Dayan], because it encapsulates decisions made by the SSD's Flash Translation Layer (FTL), which can add expensive overhead, unpredictable latency spikes, and suboptimal performance that prematurely wears out the Flash media. A new interface called Open-Channel SSD, which shares with the operating system the FTL responsibilities that SSDs traditionally keep strictly in firmware, has been proposed as a replacement for the block device interface. In this paper we will show the benefit of using the new Open-Channel SSD interface in comparison to the traditional block layer interface by comparing performance results for the RocksDB database using the new LightNVM Linux kernel subsystem to manage the Open-Channel SSD. We will do this by comparing five different vendor SSDs which either implement the Open-Channel SSD interface or whose firmware we have replaced with our own implementation.

Older description:

My topic is my master's project, which is on the new LightNVM subsystem in the Linux kernel that moves part of the Solid State Disk (SSD) functionality into the kernel. My project is to implement this functionality in SSD disk drivers and evaluate how it compares to the manufacturers' on-disk implementations on domain-specific IO patterns using selected databases.

15.

Speech recognition on sign language with a (recurrent) neural network

Josephine Rutten

What would be the different methods to 'translate' American Sign Language to English? What kind of movement recognition could we use? How should we handle the different grammars? How is this done in translating other languages? How much training data would we need? Where could we find training data?

16.

Visible Efficient Electricity Consumption

Hrannar Már Ágústsson

Electricity is a limited resource and should be used efficiently. Today people have no easy way to see how much electricity they are using at any given time, and therefore they don't know whether they are using electricity efficiently at that time.

By measuring each circuit in the house and storing all the data, we could make an algorithm that can determine whether the electricity consumer is using more or less electricity than usual, based on earlier consumption.

If we show this to the electricity consumer in real time, he or she is more likely to turn off unnecessary devices and therefore use less electricity.

17.

Monophasic vs Polyphasic sleep: how learning is affected

Marco Massetti

Every person in the world has one thing in common with everyone else: sleep. All people sleep several hours per day, but the main difference is between those who practice monophasic sleep and those who practice polyphasic sleep. Do these methods influence memory and speed of learning? We can test this by measuring the time spent on comprehending a difficult paper: how much time subjects need, and how much they remember after (for example) 1 month of following the sleeping method. In this way we can measure the effects of different sleep schedules on the human brain.

18.

The world of music: why do we get tired of a song?

Leonardo Veroli

In order to escape the stress of everyday life, everyone takes refuge in the world of music, which is able to provide us with a sort of shelter in which we find relief, happiness and relaxation. But a question arises: why do we get tired of a song? The answer may be that a sort of relationship is created between the song and our state of mind when we listen to it, caused by the melody, the tone, the beat and the lyrics of that song: it is no coincidence that, e.g., when we are sad we listen to slow songs with melancholy lyrics. In this paper we will try to see whether there is such a connection between mood and songs and try to explain the reasons behind this behaviour.

19.

Using convolutional neural networks to optimize bone search in cod fish

Guðmundur Már Einarsson

Can convolutional neural networks work better than our current method in finding bones in cod fish? What are the differences? What effect will better bone search have on the final product?

20.

Fine Motor Skill Object Manipulations with HTC Vive Controllers in a Virtual Environment

Halldór Snær Kristjánsson

A study to determine the HTC Vive’s capability to support fine motor skill movement (and thus the manipulation of small objects with precision). The research question is answered by designing a virtual environment in which the individual is to perform fine motor motions with the controllers in a test-like manner. If the user fails to perform the instructed movements within the precision bounds of the experiment, the mistake is noted and the magnitude of the failure is recorded. Evaluation will be based on the capabilities of the individuals to perform a set of tasks, which will require different levels of precision. Such an evaluation could empirically show how well the technology handles fine motor movements and object manipulations.

21.

Time classification of Hip-Hop lyrics

Guðmundur Bjarni Kristjánsson

This research project will attempt to position hip-hop songs on a timeline by building a classifier that uses clustering algorithms to identify patterns in the lyrics of the songs, and perhaps other features such as rhyme, to predict the decade in which the song was written. Some research has been done on the classification of songs by lyrics, but all previous efforts have focused on music genre classification, with some results. This research will instead focus on the distinguishing features attributed to time across the hip-hop music genre.

22.

Exploring Testing Techniques in Swift

Magnús Ólafur Magnússon

TBD
