--- public:t-malv-15-3:3 [2015/09/02 22:33] – [2. myscript.py: argv] orvark
+++ public:t-malv-15-3:3 [2024/04/29 13:33] (current) – external edit 127.0.0.1
@@ Line 1: / Line 1: @@
 ====== Lab 3 ======
-**Try to complete as many problems as you can. Hand in your code files with what you have finished in MySchool before midnight today (3 September). **
+**Try to complete as many of the problems as you can. Hand in your code files with what you have done in MySchool before midnight today (3 September). **
-The first and the last problems should be in a file named ''lab3.py'', the others in ''myscript.py'', ''tokenize.py'' and ''rot13.py''.
+The first and the last problems should be in a file named ''lab3.py'', the others in ''myscript.py'', ''mytokenize.py'' and ''rot13.py''.
 If you can't manage to complete a particular problem please hand in your incomplete code -- comment it out if it produces an error.
@@ Line 13: / Line 13: @@
 </code>
-We can specify a "step" size for the slice. The following returns every second character within the slice: ''monty[6:11:2]''. It also works in the reverse direction: ''monty[10:5:-2]'' Try these for yourself, then experiment with different step values.
+We can specify a "step" size for the slice. The following returns every second character within the slice: ''monty[6:11:2]''. It also works in the reverse direction: ''monty[10:5:-2]''. **Try these for yourself, then experiment with different step values.**
-What happens if you ask the interpreter to evaluate ''monty[::-1]''? Explain why this is a reasonable result.
+**What happens if you ask the interpreter to evaluate ''monty[::-1]''? Explain why this is a reasonable result.**
 (Problems 4 and 5 in [[http://www.nltk.org/book/ch03.html|Chapter 3]])
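A short sketch of how the step argument behaves. It assumes ''monty'' is bound to the string 'Monty Python', as in the book chapter; the assignment itself is not shown in this excerpt.

```python
# Assumption: monty is bound to 'Monty Python', as in Chapter 3 of the NLTK book.
monty = 'Monty Python'

every_second = monty[6:11:2]   # picks indices 6, 8, 10
backwards = monty[10:5:-2]     # picks indices 10, 8, 6
reversed_all = monty[::-1]     # with a negative step, omitted bounds run from the end to the start

print(every_second)   # Pto
print(backwards)      # otP
print(reversed_all)   # nohtyP ytnoM
```

With a negative step the slice walks from right to left, so the start index must be larger than the stop index for the result to be non-empty.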
@@ Line 30: / Line 30: @@
 <code>
-% python3 myscript.py One TWO three
+$ python myscript.py One TWO three
-['lab3-1.py', 'One', 'TWO', 'three']
+['myscript.py', 'One', 'TWO', 'three']
 </code>
@@ Line 42: / Line 42: @@
 </code>
-Create a script that produces the following output when executed with the parameters indicated:
+**Create a script named ''myscript.py'' that produces the following output when executed with the parameters indicated:**
 <code>
@@ Line 51: / Line 51: @@
 Second parameter: file2.txt
 </code>
+**NOTE: The Python installer for Windows does not seem to add Python to the path by default. If you can't invoke ''python'' in the Command Prompt (cmd), the simplest solution might be to run the installer again (choose "Change Python") and then make sure "Add python.exe to Path" is selected (the last option under "Customize Python").**
+{{:public:t-malv-15-3:python-path-install.png?direct&200|}}
 ([[https://docs.python.org/3.3/using/cmdline.html#using-on-cmdline|Information on executing python scripts in Windows]])
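One possible way to build the report from an argv-style list is sketched below. The helper name ''describe_args'' is hypothetical, and the list is passed in as a parameter (rather than read from ''sys.argv'') only so the sketch can be tried without a command line.

```python
def describe_args(argv):
    """Return the report lines for an argv-style list (hypothetical helper)."""
    lines = [
        'Number of parameters: {}'.format(len(argv) - 1),  # argv[0] is the script itself
        'Script name: {}'.format(argv[0]),
    ]
    ordinals = ['First', 'Second', 'Third']
    # zip stops at the shorter sequence, so missing parameters are simply skipped.
    for ordinal, value in zip(ordinals, argv[1:]):
        lines.append('{} parameter: {}'.format(ordinal, value))
    return lines

# The list the interpreter would build for: python myscript.py One TWO three
print('\n'.join(describe_args(['myscript.py', 'One', 'TWO', 'three'])))
```

In an actual script the same function could be called with ''sys.argv'' after ''import sys''.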
-===== 3. tokenize.py: Read file =====
+===== 3. mytokenize.py: Read file =====
-Read a file contents, tokenize them, remove stopwords and print out the remaining tokens.
+**Create a script named ''mytokenize.py'' that reads a file's contents, tokenizes them, removes stopwords and prints out the remaining tokens, one per line.**
 <code python>
@@ Line 65: / Line 69: @@
 #Get file name from argv (see problem 2).
 #Open file for reading.
-#Read contents into string.
+#Read contents into a string.
 #Tokenize the string.
 #Remove stopwords (words in stopwords.words('english')).
@@ Line 71: / Line 75: @@
 </code>
-You should be able to invoke the script using ''python tokenize.py test.txt''.
+You should be able to invoke the script using ''python mytokenize.py test.txt''.
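The tokenize-then-filter steps can be tried without NLTK installed using the rough sketch below. Both ''simple_tokenize'' and ''SAMPLE_STOPWORDS'' are hypothetical stand-ins for ''word_tokenize'' and ''stopwords.words('english')''; they are much cruder than the real thing and are only meant to show the shape of the pipeline.

```python
import re

# Stand-in for stopwords.words('english'): a tiny illustrative sample, not NLTK's list.
SAMPLE_STOPWORDS = {'the', 'a', 'an', 'is', 'of', 'and', 'in', 'on'}

def simple_tokenize(text):
    """Rough stand-in for nltk.word_tokenize: runs of word characters, or single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)

def remove_stopwords(tokens, stop_set=SAMPLE_STOPWORDS):
    """Keep the tokens whose lowercase form is not in the stopword set."""
    return [t for t in tokens if t.lower() not in stop_set]

text = 'The cat sat on the mat.'
for token in remove_stopwords(simple_tokenize(text)):
    print(token)   # one token per line
```

Lowercasing before the membership test matters because stopword lists are usually lowercase, while a sentence-initial "The" is not.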
 ===== 4. rot13.py: Read and Write file =====
-Lets now create a script that reads the contents from one file, line by line and alters the lines with a simple algorithm before writing them to another file.
+**Now create a script named ''rot13.py'' that reads the contents from one file, line by line and alters the lines with a simple algorithm before writing them to another file.**
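The line-by-line transformation can be tried in memory before touching real files. The sketch below uses ''io.StringIO'' objects in place of opened files; the helper name ''rot13_lines'' is hypothetical.

```python
import io
from codecs import encode

def rot13_lines(infile, outfile):
    """Copy infile to outfile line by line, applying ROT13 to each line (hypothetical helper)."""
    for line in infile:
        outfile.write(encode(line, 'rot_13'))

# StringIO objects behave like text files, so no files on disk are needed here.
source = io.StringIO('Hello\nWorld\n')
target = io.StringIO()
rot13_lines(source, target)
print(target.getvalue())
```

ROT13 shifts each letter by 13 places and leaves other characters (including newlines) alone, so applying it twice gives back the original text.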
 <code python>
@@ Line 85: / Line 89: @@
 #Open file1 for reading.
 #Open file2 for writing.
-#  Read one line from file1.
+#  Loop; read one line from file1.
      line = encode(line, 'rot_13')
 #    Write the line to file2.
@@ Line 105: / Line 109: @@
 ===== 5. List comprehension =====
-Rewrite the following loop as a **list comprehension**:
+**Rewrite the following loop as a "list comprehension"**:
 <code python>
@@ Line 124: / Line 128: @@
 **If you feel this problem is easy you should also try your hand at problems 31 and 41.**
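The general pattern of turning an append-loop into a list comprehension is sketched below. The loop from the problem is not reproduced in this excerpt, so the sketch assumes a loop that appends a (word, length) pair for each word; the contents of ''sent'' are a hypothetical example.

```python
# Hypothetical input; the lab's own sent is not shown here.
sent = ['The', 'dog', 'gave', 'John', 'the', 'newspaper']

# Loop form: start with an empty list and append one item per iteration.
result_loop = []
for word in sent:
    result_loop.append((word, len(word)))

# Comprehension form: the appended expression moves to the front,
# followed by the same for-clause.
result_comp = [(word, len(word)) for word in sent]

print(result_comp == result_loop)   # True
```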
+===== Possible Solutions =====
+<code python>
+#1
+>>> monty[::-1] == 'nohtyP ytnoM'
+True
+#2
+from sys import argv
+print('Number of parameters: ', len(argv)-1)
+print('Script name: ', argv[0])
+print('First parameter: ', argv[1])
+print('Second parameter: ', argv[2])
+#3
+from sys import argv
+from nltk import word_tokenize
+from nltk.corpus import stopwords
+stops = set(stopwords.words('english'))  #Build the stopword set once instead of once per token.
+with open(argv[1]) as infile:
+    for w in word_tokenize(infile.read()):
+        if w.lower() not in stops:
+            print(w)
+#Since files are context managers, they can be used in a with-statement.
+#The file will close when the code block is finished, even if an exception occurs
+#4
+from sys import argv
+from codecs import encode
+with open(argv[1]) as infile, open(argv[2], 'w') as outfile:
+    for line in infile:
+        outfile.write(encode(line, 'rot_13'))
+#5
+[(w, len(w)) for w in sent]
+</code>