====== Lab 1 ======

**Hand in your code in MySchool before midnight today (20 August) as a single .py file containing the code in the same order as the given problems.** You can use File->New File in IDLE to create the file.

If you can't manage to complete a particular problem, please hand in your incomplete code anyway, and comment it out if it produces an error.

===== 1. Getting to know some helpful functions =====

<code python>
#Use dir() to see the names that exist in the current scope.

dir()

#You can use help() to see what dir does.

help(dir)

#Now define three variables and a list. Feel free to change the values:

my_str = "This is an ordinary string"
my_int = 5
my_float = 4.6
my_list = ['A','B','C','D']

#Use dir() again. Has anything changed?

#Use type() to see the type of each.

type(my_str)
...

#Now use dir on the four types you defined.

dir(my_str)
...

#Most of the names are functions that can be applied to the types.
#For example, dir(my_str) lists 'upper', so it's possible to do the following:

my_str.upper()

#You can use help to see what each function does:

help(my_str.upper)

#Use dir and help to select one function to apply to each of the variables
#and the list.

</code>

**Hand in your code for applying the four functions you selected in the last part.**
Remember to use print(my_...) to show the change the function made.
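
The output of dir() on a value is long because it also lists many names that begin with an underscore. As a small sketch (the helper name ''public_names'' is made up for this illustration and is not part of Python), the following filters those names out so the ordinary functions are easier to spot:

<code python>
def public_names(obj):
    """Return the names from dir(obj) that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith('_')]

my_str = "This is an ordinary string"
my_list = ['A', 'B', 'C', 'D']

# Shorter lists than plain dir(), containing names such as 'upper' or 'append'.
print(public_names(my_str))
print(public_names(my_list))
</code>

Any of the listed names can then be passed to help(), e.g. help(my_list.append).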


===== 2. Naive is_male =====

<code python>
#Define a very simple and naive function to check if an Icelandic proper name belongs to a male
#(e.g. ends with "son").

def is_male(proper_name):
    return  #add code here
</code>

Example usage:

<code>
>>> is_male("Örvar Kárason")
True
>>> is_male("Glódís Káradóttir")
False
>>> is_male("Gillian Anderson")
True
>>> is_male("Tucson")
True
>>> is_male("a person")
True
</code>

**Can you improve the function so it handles some of the false positives in the example above?**

===== 3. Replace bad with good =====

Define a function that takes a string (text), a list of strings (bad_list) and an optional string (good) as arguments. It should return the text with every occurrence of each string in bad_list replaced by good. If good is not given, the bad strings are simply removed.

<code python>
def str_replace(text, bad_list, good=''):
    #add code here
    return text
</code>

Example usage:

<code>
>>> str_replace("Duck", ['u','c'], '*')
'D**k'
>>> str_replace("Python has strange rules!", ['strange ','has '])
'Python rules!'
</code>
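
A building block that may help here is the string method replace(). It does not change the original string but returns a new one, so the result has to be assigned to a name to be kept. A minimal illustration with a single replacement:

<code python>
text = "Duck"

# The result of replace() is discarded here, so text is unchanged.
text.replace('u', '*')
print(text)   # Duck

# Assigning the result back keeps the change.
text = text.replace('u', '*')
print(text)   # D*ck
</code>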

===== 4. NLTK functions =====

Before you can get started with the NLTK corpora you have to download them once with nltk.download().

<code python>
import nltk
</code>

Apply NLTK functions to do the following:

  * Import text6 from nltk
  * Show the concordance of the word "coconut"
  * Find words occurring in similar contexts to "coconut"
  * Find the collocations in text6

===== 5. NLTK coding =====

Write code to do the following with the NLTK:

  * List all words starting with 'z' alphabetically in text6
  * List all uppercase words in text6 (problem 23 [[http://www.nltk.org/book/ch01.html#exercises]])
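
An NLTK text can be looped over like a list of word strings, so the filtering can be tried out first on a plain Python list. The sketch below uses a made-up token list (''tokens'' is a stand-in for illustration, not data from text6) to show how set() removes duplicates and sorted() puts the result in alphabetical order:

<code python>
# Made-up token list standing in for an NLTK text.
tokens = ['zoo', 'The', 'KNIGHTS', 'zebra', 'say', 'NI', 'zoo', 'NI', '!']

# Keep the tokens that pass a test, drop duplicates, then sort.
z_words = sorted(set(w for w in tokens if w.startswith('z')))
upper_words = sorted(set(w for w in tokens if w.isupper()))

print(z_words)       # ['zebra', 'zoo']
print(upper_words)   # ['KNIGHTS', 'NI']
</code>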

===== 6. A dictionary of rules =====

You are given a dictionary (string:list) of context-free grammar (CFG) production rules. Make some changes to the rules and then print them nicely.

<code python>
rules = {"S": ["NP VP"],
         "VP": ["V NP"],
         "NP": ["Det N", "Adj NP"],
         "N": ["boy", "girl"],
         "V": ["sees", "likes"],
         "Adj": ["big", "small"],
         "Det": ["a", "the"]}

#Add code to add the verb "hates" to "V".

#Add code to add the nouns "dog" and "cat" to "N".

#Add code to print out the rules giving the following output.

</code>
//Hint: items() [[https://docs.python.org/3.4/tutorial/datastructures.html#looping-techniques]]//

Expected output (the order of the rules does not matter):

<code>
N -> boy | girl | dog | cat
S -> NP VP
NP -> Det N | Adj NP
Adj -> big | small
Det -> a | the
VP -> V NP
V -> sees | likes | hates
</code>
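
The items() hint can be illustrated on an unrelated dictionary. In this sketch the data (''opening_hours'') is invented for the example. Looping over items() gives each key together with its list, and join() turns the list into one string with a separator between the elements:

<code python>
# Invented example data, unrelated to the grammar rules.
opening_hours = {"Mon": ["9-12", "13-17"], "Sat": ["10-14"]}

lines = []
for day, slots in opening_hours.items():
    # join() puts ", " between the list elements.
    lines.append(day + ": " + ", ".join(slots))

for line in lines:
    print(line)
# Mon: 9-12, 13-17
# Sat: 10-14
</code>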

===== Possible Solutions =====

<code python>
#1
my_str.upper()
my_int.bit_length()
my_float.is_integer()
my_list.pop(2)

#2
def is_male(proper_name):
    return proper_name[-3:] == "son"
    #return proper_name.endswith("son")

# return ... and ' ' in proper_name and proper_name.istitle() and "Gillian" not in proper_name
# return ... and proper_name.find(' ') != -1 and proper_name.istitle() and proper_name.find("Gillian") == -1

#3
def str_replace(text, bad_list, good=''):
    for bad in bad_list:
        text = text.replace(bad, good)
    return text

#4
from nltk.book import text6
text6.concordance("coconut")
text6.similar("coconut")
text6.collocations()

#5
sorted(w for w in set(text6) if w.startswith('z')) # w[0] == 'z'
sorted(set(w for w in text6 if w.isupper()))

#6
rules["N"] += ["dog", "cat"]
rules["V"].append("hates")

for left, right in rules.items():
    print(left, "->", ' | '.join(right))


</code>
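
For problem 2, the ideas in the commented-out lines can be combined into one function. This is only one possible sketch and it rests on assumptions: a name has at least two words, is written in title case, and female first names are caught with a small hand-written set (''KNOWN_EXCEPTIONS'' is made up for this example and is not a real name database).

<code python>
# Made-up, hand-maintained exceptions; not a real name database.
KNOWN_EXCEPTIONS = {"Gillian"}

def is_male(proper_name):
    """Naive check: title-cased name of two or more words whose last word ends in 'son'."""
    parts = proper_name.split()
    return (len(parts) >= 2
            and proper_name.istitle()
            and parts[-1].endswith("son")
            and parts[0] not in KNOWN_EXCEPTIONS)

print(is_male("Örvar Kárason"))      # True
print(is_male("Gillian Anderson"))   # False
print(is_male("Tucson"))             # False
print(is_male("a person"))           # False
</code>

The function is still naive: any first name missing from the exception set is treated as male when the last word ends in "son".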