Lab 1
Hand in your code in MySchool before midnight today (20 August) as a single .py file containing the code in the same order as the given problems. You can use File→New File in IDLE to create the file.
If you can't manage to complete a particular problem please hand in your incomplete code – comment it out if it produces an error.
1. Getting to know some helpful functions
#Use dir() to see the names that exist in the current scope.
dir()
#You can use help() to see what dir does.
help(dir)
#Now define three variables and a list. Feel free to change the values:
my_str = "This is an ordinary string"
my_int = 5
my_float = 4.6
my_list = ['A','B','C','D']
#Use dir() again. Has anything changed?
#Use type() to see the type of each.
type(my_str)
...
#Now use dir on the four types you defined.
dir(my_str)
...
#Most of the names are functions that can be applied to the types.
#For example dir(my_str) lists 'upper' so it's possible to do the following:
my_str.upper()
#You can use help to see what each function does:
help(my_str.upper)
#Use dir and help to select one function to apply to each of the variables
#and the list.
Hand in your code for applying the four functions you selected in the last part. Use print(my_…) to show the change each function made.
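When printing the effect of a function, it helps to know that string methods return a new value while some list methods change the list in place. A minimal sketch, reusing two of the variables defined above (the choice of upper and pop is only an example, and the names upper_str and popped are introduced here for illustration):

```python
my_str = "This is an ordinary string"
my_list = ['A', 'B', 'C', 'D']

# Strings are immutable: upper() returns a new string, my_str is unchanged.
upper_str = my_str.upper()
print(my_str)
print(upper_str)

# Lists are mutable: pop(2) removes the item at index 2 and returns it.
popped = my_list.pop(2)
print(popped)
print(my_list)
```

Printing the variable both before and after the call makes it clear which kind of function you picked.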
2. Naive is_male
#Define a very simple and naive function to check if an Icelandic proper name belongs to a male
#(e.g. ends with "son").
def is_male(proper_name):
    return #add code here
Example usage:
>>> is_male("Örvar Kárason")
True
>>> is_male("Glódís Káradóttir")
False
>>> is_male("Gillian Anderson")
True
>>> is_male("Tucson")
True
>>> is_male("a person")
True
Can you improve the function so it handles some of the false positives in the example above?
3. Replace bad with good
Define a function that takes a string (text), a list (bad_list) and an optional string (good) as arguments. It should return the text with all occurrences of the strings in bad_list replaced by good.
def str_replace(text, bad_list, good=''):
    #add code here
    return text
Example usage:
>>> str_replace("Duck", ['u','c'], '*')
'D**k'
>>> str_replace("Python has strange rules!", ['strange ','has '])
'Python rules!'
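The optional third argument relies on Python's default parameter values: when the caller leaves the argument out, the default from the def line is used. A small sketch with a hypothetical function shout (not part of the assignment) shows the mechanism:

```python
# Hypothetical helper for illustration only: 'mark' is an optional argument.
def shout(text, mark='!'):
    """Return text in upper case followed by mark."""
    return text.upper() + mark

print(shout("hello"))       # mark falls back to its default value
print(shout("hello", '?'))  # mark is given explicitly
```

The second example call to str_replace above works the same way: good is left out, so the bad strings are replaced by the empty string.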
4. NLTK functions
Before you can get started with the NLTK corpora you have to download them once with nltk.download().
import nltk
Apply NLTK functions to do the following:
- Import text6 from nltk
- Show the concordance of the word “coconut”
- Find words occurring in similar contexts to “coconut”
- Find the collocations in text6
5. NLTK coding
Write code to do the following with the NLTK:
- List all words starting with 'z' alphabetically in text6
- List all uppercase words in text6 (problem 23 http://www.nltk.org/book/ch01.html#exercises)
6. A dictionary of rules
You are given a dictionary (string:list) of CFG production rules. Make some changes to the rules and then print them nicely.
rules = {"S": ["NP VP"],
         "VP": ["V NP"],
         "NP": ["Det N", "Adj NP"],
         "N": ["boy", "girl"],
         "V": ["sees", "likes"],
         "Adj": ["big", "small"],
         "Det": ["a", "the"]}
#Add code to add the verb "hates" to "V".
#Add code to add the nouns "dog" and "cat" to "N".
#Add code to print out the rules giving the following output.
Hint: items() https://docs.python.org/3.4/tutorial/datastructures.html#looping-techniques
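As an illustration of the hint, here is a sketch of looping over a dictionary of lists with items(). The dictionary colours and the list lines are made up for this example and are not part of the assignment:

```python
# Hypothetical toy dictionary mapping a string to a list of strings.
colours = {"warm": ["red", "orange"], "cold": ["blue"]}

lines = []
for key, values in colours.items():
    # str.join glues the list items together with the given separator.
    lines.append(key + ": " + ", ".join(values))

for line in lines:
    print(line)
```

items() yields one (key, value) pair per entry, so the loop body can use both without a second lookup.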
Expected output (the order of the rules does not matter):
N -> boy | girl | dog | cat
S -> NP VP
NP -> Det N | Adj NP
Adj -> big | small
Det -> a | the
VP -> V NP
V -> sees | likes | hates
Possible Solutions
#1
my_str.upper()
my_int.bit_length()
my_float.is_integer()
my_list.pop(2)

#2
def is_male(proper_name):
    return proper_name[-3:] == "son"
    #return proper_name.endswith("son")
    #Ideas for reducing false positives:
    #proper_name.find(' ')
    #proper_name.istitle()
    #proper_name.find("Gillian")

#3
def str_replace(text, bad_list, good=''):
    for bad in bad_list:
        text = text.replace(bad, good)
    return text

#4
from nltk.book import text6
text6.concordance("coconut")
text6.similar("coconut")
text6.collocations()

#5
sorted(w for w in set(text6) if w.startswith('z'))
sorted(w for w in set(text6) if w.isupper())

#6
rules["N"] += ["dog", "cat"]
rules["V"].append("hates")
for left, right in rules.items():
    print(left, "->", ' | '.join(right))
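One possible way to act on the hints in #2 is sketched below. It assumes that a male name has at least two words and that the last word is a capitalised patronymic longer than "son" itself. These rules are assumptions chosen for illustration, not the only reasonable answer.

```python
def is_male(proper_name):
    """Naive check: the last word is a capitalised name ending in 'son'."""
    words = proper_name.split()
    if len(words) < 2:
        # A single word such as "Tucson" has no given name in front of it.
        return False
    last = words[-1]
    # Require a capital first letter and more than just "son".
    return last[0].isupper() and len(last) > 3 and last.endswith("son")

print(is_male("Örvar Kárason"))
print(is_male("Glódís Káradóttir"))
print(is_male("Tucson"))
print(is_male("a person"))
print(is_male("Gillian Anderson"))
```

This version rejects "Tucson" and "a person" but still accepts "Gillian Anderson". Catching that case would need extra knowledge, such as a list of given names.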