====== Lab 1 ======
**Hand in your code in MySchool before midnight today (20 August). A single .py file containing the code in the same order as the given problems.** You can use File->New File in IDLE to create the file.
If you can't complete a particular problem, please hand in your incomplete code -- comment it out if it produces an error.
===== 1. Getting to know some helpful functions =====
#Use dir() to see the names that exist in the current scope.
dir()
#You can use help() to see what dir does.
help(dir)
#Now define three variables and a list. Feel free to change the values:
my_str = "This is an ordinary string"
my_int = 5
my_float = 4.6
my_list = ['A','B','C','D']
#Use dir() again. Has anything changed?
#Use type() to see the type of each.
type(my_str)
...
#Now use dir on the four types you defined.
dir(my_str)
...
#Most of the names are functions that can be applied to the types.
#For example, dir(my_str) lists 'upper', so it's possible to do the following:
my_str.upper()
#You can use help to see what each function does:
help(my_str.upper)
#Use dir and help to select one function to apply to each of the variables
#and the list.
**Hand in your code for applying the four functions you selected in the last part.**
Remember to use print(my_...) to show the result of each function call.
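Printing matters here because string methods return a new value and leave the variable unchanged, while some list methods modify the list in place. A minimal sketch reusing two of the variables defined above (the names upper_str and popped are only illustrative):

```python
my_str = "This is an ordinary string"
my_list = ['A', 'B', 'C', 'D']

# str.upper() returns a new string; my_str itself is not modified.
upper_str = my_str.upper()
print(upper_str)
print(my_str)

# list.pop(index) removes the item in place and returns the removed item.
popped = my_list.pop(2)
print(popped)
print(my_list)
```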
===== 2. Naive is_male =====
#Define a very simple and naive function to check if an Icelandic proper name belongs to a male
#(e.g. ends with "son").
def is_male(proper_name):
    return #add code here
Example usage:
>>> is_male("Örvar Kárason")
True
>>> is_male("Glódís Káradóttir")
False
>>> is_male("Gillian Anderson")
True
>>> is_male("Tucson")
True
>>> is_male("a person")
True
**Can you improve the function so it handles some of the false positives in the example above?**
===== 3. Replace bad with good =====
Define a function that takes a string (text), a list (bad_list) and an optional string (good) as arguments. It should return the text with all occurrences of the strings in bad_list replaced by good.
def str_replace(text, bad_list, good=''):
    #add code here
    return text
Example usage:
>>> str_replace("Duck", ['u','c'], '*')
'D**k'
>>> str_replace("Python has strange rules!", ['strange ','has '])
'Python rules!'
===== 4. NLTK functions =====
Before you can get started with the NLTK corpora, you have to download them once with nltk.download().
import nltk
Apply NLTK functions to do the following:
* Import text6 from nltk
* Show the concordance of the word "coconut"
* Find words occurring in similar contexts to "coconut"
* Find the collocations in text6
===== 5. NLTK coding =====
Write code to do the following with the NLTK:
* List all words starting with 'z' alphabetically in text6
* List all uppercase words in text6 (problem 23 [[http://www.nltk.org/book/ch01.html#exercises]])
===== 6. A dictionary of rules =====
You are given a dictionary (string:list) of CFG production rules. Make some changes to the rules and then print them nicely.
rules = {"S": ["NP VP"],
"VP": ["V NP"],
"NP": ["Det N", "Adj NP"],
"N": ["boy", "girl"],
"V": ["sees", "likes"],
"Adj": ["big", "small"],
"Det": ["a", "the"]}
#Add code to add the verb "hates" to "V".
#Add code to add the nouns "dog" and "cat" to "N".
#Add code to print out the rules giving the following output.
//Hint: items() [[https://docs.python.org/3.4/tutorial/datastructures.html#looping-techniques]]//
Expected output (the order of the rules does not matter):
N -> boy | girl | dog | cat
S -> NP VP
NP -> Det N | Adj NP
Adj -> big | small
Det -> a | the
VP -> V NP
V -> sees | likes | hates
===== Possible Solutions =====
#1
my_str.upper()
my_int.bit_length()
my_float.is_integer()
my_list.pop(2)
#2
def is_male(proper_name):
    return proper_name[-3:] == "son"
    #return proper_name.endswith("son")
    # return ... and ' ' in proper_name and proper_name.istitle() and "Gillian" not in proper_name
    # return ... and proper_name.find(' ') != -1 and proper_name.istitle() and proper_name.find("Gillian") == -1
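The commented-out alternatives can be combined into one runnable function. The following is only a sketch of one possible improvement: it assumes a male name has at least two words, is written in title case, and has a last word ending in "son". It still accepts "Gillian Anderson", since that case cannot be decided from the form of the name alone, and title-case checks would wrongly reject names with internal capitals.

```python
def is_male(proper_name):
    """Naive check: two or more title-cased words, the last ending in 'son'."""
    parts = proper_name.split()
    if len(parts) < 2:
        return False  # rejects single words such as "Tucson"
    if not proper_name.istitle():
        return False  # rejects lowercase phrases such as "a person"
    return parts[-1].endswith("son")

print(is_male("Örvar Kárason"))
print(is_male("Tucson"))
print(is_male("a person"))
```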
#3
def str_replace(text, bad_list, good=''):
    for bad in bad_list:
        text = text.replace(bad, good)
    return text
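The loop above applies the replacements one after another, so a later item in bad_list can match text that only appeared because of an earlier replacement. If that is not wanted, a single-pass variant is possible with the re module. This is a sketch of an alternative, not part of the assignment, and the name str_replace_once is hypothetical:

```python
import re

def str_replace_once(text, bad_list, good=''):
    """Replace every item of bad_list in a single pass over text."""
    if not bad_list:
        return text
    # re.escape makes each item match literally, even if it contains
    # characters such as '.' or '*'.
    pattern = '|'.join(re.escape(bad) for bad in bad_list)
    # A function as replacement keeps backslashes in good from being
    # interpreted as escape sequences.
    return re.sub(pattern, lambda match: good, text)

print(str_replace_once("Duck", ['u', 'c'], '*'))
print(str_replace_once("abcd", ['bc', 'ad']))
```

With the text "abcd" and the list ['bc', 'ad'], the loop version returns an empty string because removing 'bc' creates a new 'ad', while the single-pass version returns 'ad'.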
#4
from nltk.book import text6
text6.concordance("coconut")
text6.similar("coconut")
text6.collocations()
#5
sorted(w for w in set(text6) if w.startswith('z')) # w[0] == 'z'
sorted(set(w for w in text6 if w.isupper()))
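The same two expressions can be tried without downloading the NLTK data, because they only need a sequence of token strings. In the sketch below, tokens is a made-up stand-in for text6, not real corpus content:

```python
# Hypothetical stand-in for text6: any sequence of token strings works.
tokens = ["ARTHUR", ":", "It", "is", "I", ",", "zoosh", "ZOOT", "zone", "zoosh", "ARTHUR"]

# set() removes duplicates before sorting; startswith is case-sensitive,
# so "ZOOT" is not included.
z_words = sorted(w for w in set(tokens) if w.startswith('z'))

# isupper() is False for tokens without cased characters, such as ":" and ",".
upper_words = sorted(set(w for w in tokens if w.isupper()))

print(z_words)
print(upper_words)
```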
#6
rules["N"] += ["dog", "cat"]
rules["V"].append("hates")
for left, right in rules.items():
    print(left, "->", ' | '.join(right))
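The order of the printed rules does not matter for the assignment, but a fixed order makes the output easier to compare between runs. A self-contained sketch that sorts the rules by their left-hand side (the list name lines is only illustrative):

```python
rules = {"S": ["NP VP"],
         "VP": ["V NP"],
         "NP": ["Det N", "Adj NP"],
         "N": ["boy", "girl"],
         "V": ["sees", "likes"],
         "Adj": ["big", "small"],
         "Det": ["a", "the"]}

rules["V"].append("hates")          # add a single item
rules["N"].extend(["dog", "cat"])   # add several items at once

# sorted() on items() orders the (key, value) pairs by key.
lines = ["{} -> {}".format(left, ' | '.join(right))
         for left, right in sorted(rules.items())]
print("\n".join(lines))
```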