====== Lab 1 ======

**Hand in your code in MySchool before midnight today (20 August). Submit a single .py file containing the code in the same order as the given problems.** You can use File->New File in IDLE to create the file. If you can't complete a particular problem, hand in your incomplete code anyway and comment out anything that produces an error.

===== 1. Getting to know some helpful functions =====

  #Use dir() to see the names that exist in the current scope.
  dir()
  #You can use help() to see what dir does.
  help(dir)
  #Now define three variables and a list. Feel free to change the values:
  my_str = "This is an ordinary string"
  my_int = 5
  my_float = 4.6
  my_list = ['A','B','C','D']
  #Use dir() again. Has anything changed?
  #Use type() to see the type of each.
  type(my_str)
  ...
  #Now use dir on each of the four objects you defined.
  dir(my_str)
  ...
  #Most of the names are methods that can be called on the object.
  #For example dir(my_str) lists 'upper' so it's possible to do the following:
  my_str.upper()
  #You can use help to see what each method does:
  help(my_str.upper)
  #Use dir and help to select one method to apply to each of the variables
  #and the list.

**Hand in the code for applying the four methods you selected in the last part.** Note: use print(my_...) to show the change each method made.

===== 2. Naive is_male =====

  #Define a very simple and naive function to check if an Icelandic proper name belongs to a male
  #(e.g. ends with "son").
  def is_male(proper_name):
      return #add code here

Example usage:

  >>> is_male("Örvar Kárason")
  True
  >>> is_male("Glódís Káradóttir")
  False
  >>> is_male("Gillian Anderson")
  True
  >>> is_male("Tucson")
  True
  >>> is_male("a person")
  True

**Can you improve the function so it handles some of the false positives in the example above?**

===== 3. Replace bad with good =====

Define a function that takes a string (text), a list (bad_list) and an optional string (good) as arguments.
It should return the text with all occurrences of the strings in bad_list replaced by the good string.

  def str_replace(text, bad_list, good=''):
      #add code here
      return text

Example usage:

  >>> str_replace("Duck", ['u','c'], '*')
  'D**k'
  >>> str_replace("Python has strange rules!", ['strange ','has '])
  'Python rules!'

===== 4. NLTK functions =====

Before you can get started with the NLTK corpora you have to download them once with nltk.download().

  import nltk

Apply NLTK functions to do the following:
  * Import text6 from nltk
  * Show the concordance of the word "coconut"
  * Find words occurring in similar contexts to "coconut"
  * Find the collocations in text6

===== 5. NLTK coding =====

Write code to do the following with the NLTK:
  * List, in alphabetical order, all words in text6 that start with 'z'
  * List all uppercase words in text6 (problem 23 [[http://www.nltk.org/book/ch01.html#exercises]])

===== 6. A dictionary of rules =====

You are given a dictionary (string:list) of CFG production rules. Make some changes to the rules and then print them nicely.

  rules = {"S": ["NP VP"],
           "VP": ["V NP"],
           "NP": ["Det N", "Adj NP"],
           "N": ["boy", "girl"],
           "V": ["sees", "likes"],
           "Adj": ["big", "small"],
           "Det": ["a", "the"]}
  #Add code to add the verb "hates" to "V".
  #Add code to add the nouns "dog" and "cat" to "N".
  #Add code to print out the rules giving the following output.

//Hint: items() [[https://docs.python.org/3.4/tutorial/datastructures.html#looping-techniques]]//

Expected output (the order of the rules does not matter):

  N -> boy | girl | dog | cat
  S -> NP VP
  NP -> Det N | Adj NP
  Adj -> big | small
  Det -> a | the
  VP -> V NP
  V -> sees | likes | hates

===== Possible Solutions =====

  #1
  my_str.upper()
  my_int.bit_length()
  my_float.is_integer()
  my_list.pop(2)

  #2
  def is_male(proper_name):
      return proper_name[-3:] == "son"
      #return proper_name.endswith("son")
      # return ... and ' ' in proper_name and proper_name.istitle() and "Gillian" not in proper_name
      # return ... and proper_name.find(' ') != -1 and proper_name.istitle() and proper_name.find("Gillian") == -1

  #3
  def str_replace(text, bad_list, good=''):
      for bad in bad_list:
          text = text.replace(bad, good)
      return text

  #4
  from nltk.book import text6
  text6.concordance("coconut")
  text6.similar("coconut")
  text6.collocations()

  #5
  sorted(w for w in set(text6) if w.startswith('z')) # w[0] == 'z'
  sorted(set(w for w in text6 if w.isupper()))

  #6
  rules["N"] += ["dog", "cat"]
  rules["V"].append("hates")
  for left, right in rules.items():
      print(left, "->", ' | '.join(right))
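
The commented-out hints in solution #2 can be combined into one runnable function. The following is only a sketch of one possible improvement, not a reference answer: the ''known_exceptions'' parameter is a hypothetical addition for illustration, and the check is still naive (for example, it rejects every male name that does not end in "son").

```python
def is_male(proper_name, known_exceptions=("Gillian",)):
    """Naive check for a male Icelandic-style name.

    known_exceptions is a hypothetical parameter (not part of the lab):
    given names that should never be classified as male.
    """
    return (
        proper_name.endswith("son")      # patronymic ending
        and " " in proper_name           # at least two words, rules out "Tucson"
        and proper_name.istitle()        # every word capitalised, rules out "a person"
        and not any(name in proper_name for name in known_exceptions)
    )


if __name__ == "__main__":
    for name in ["Örvar Kárason", "Glódís Káradóttir",
                 "Gillian Anderson", "Tucson", "a person"]:
        print(name, "->", is_male(name))
```

Under these assumptions only the first name in the loop is classified as male. A hard-coded exception list does not scale, so a longer list of given names would be needed for anything beyond this exercise.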