public:t-malv-15-3:3

Differences

This shows you the differences between two versions of the page.

Old revision: public:t-malv-15-3:3 [2015/09/03 10:17] – [2. myscript.py: argv] orvark
New revision: public:t-malv-15-3:3 [2024/04/29 13:33] (current) – external edit 127.0.0.1
Line 31:
 <code>
 $ python myscript.py One TWO three
-['lab3-1.py', 'One', 'TWO', 'three']
+['myscript.py', 'One', 'TWO', 'three']
 </code>
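The output above follows from how Python fills `sys.argv`: element 0 is the script name as typed on the command line, and the remaining elements are the arguments, always as strings. A minimal sketch, assuming a hypothetical helper named `describe_args` (not part of the exercise) so the behaviour can be checked without a shell:

```python
import sys


def describe_args(argv):
    """Summarise an argv-style list: script name first, then parameters.

    `describe_args` is a hypothetical helper used for illustration only.
    """
    script, *params = argv
    lines = [f"Script name: {script}", f"Number of parameters: {len(params)}"]
    for position, value in enumerate(params, start=1):
        lines.append(f"Parameter {position}: {value}")
    return lines


if __name__ == "__main__":
    # When run as a script, report on the real command line.
    print("\n".join(describe_args(sys.argv)))
```

Calling `describe_args(["myscript.py", "One", "TWO", "three"])` mirrors the invocation `python myscript.py One TWO three` without leaving the interpreter.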
  
Line 60:
 ===== 3. mytokenize.py: Read file =====
 
-**Create a script named ''tokenize.py'' that reads a file's contents, tokenizes them, removes stopwords, and prints the remaining tokens, one per line.**
+**Create a script named ''mytokenize.py'' that reads a file's contents, tokenizes them, removes stopwords, and prints the remaining tokens, one per line.**
 
 <code python>
Line 69:
 #Get file name from argv (see problem 2).
 #Open file for reading.
-#Read contents into string.
+#Read contents into string.
 #Tokenize the string.
 #Remove stopwords (words in stopwords.words('english')).
Line 75: Line 75:
 </code> </code>
  
-You should be able to invoke the script using ''python tokenize.py test.txt''.+You should be able to invoke the script using ''python mytokenize.py test.txt''.
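The stopword-removal step can be sketched on its own. This is a minimal sketch, assuming the tokens and the stopword list arrive as plain Python sequences; in the exercise they would come from NLTK's `word_tokenize` and `stopwords.words('english')`. The function name `remove_stopwords` and the tiny sample lists are hypothetical. Building a set once keeps each membership test cheap, compared with looking the word up in a list for every token.

```python
def remove_stopwords(tokens, stopword_list):
    """Return the tokens whose lowercase form is not a stopword."""
    # Build the lookup set once instead of searching a list per token.
    stopword_set = {word.lower() for word in stopword_list}
    return [token for token in tokens if token.lower() not in stopword_set]


# Hypothetical sample data standing in for NLTK's tokenizer and stopword list.
sample_tokens = ["The", "cat", "sat", "on", "the", "mat", "."]
sample_stopwords = ["the", "on", "a", "an"]

# Print the remaining tokens, one per line, as the exercise asks.
for token in remove_stopwords(sample_tokens, sample_stopwords):
    print(token)
```

Note that punctuation tokens such as `.` survive this filter, since they are not in the stopword list.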
  
  
Line 128:
 
 **If you feel this problem is easy you should also try your hand at problems 31 and 41.**
+
+===== Possible Solutions =====
+
+<code python>
+#1
+>>> monty[::-1] == 'nohtyP ytnoM'
+True
+
+#2
+from sys import argv
+
+print('Number of parameters: ', len(argv)-1)
+print('Script name: ', argv[0])
+print('First parameter: ', argv[1])
+print('Second parameter: ', argv[2])
+
+#3
+from sys import argv
+from nltk import word_tokenize
+from nltk.corpus import stopwords
+
+with open(argv[1]) as infile:
+    for w in word_tokenize(infile.read()):
+        if w.lower() not in stopwords.words('english'):
+            print(w)
+
+#Since files are context managers, they can be used in a with-statement.
+#The file will close when the code block is finished, even if an exception occurs
+
+#4
+from sys import argv
+from codecs import encode
+
+with open(argv[1]) as infile, open(argv[2], 'w') as outfile:
+    for line in infile:
+        outfile.write(encode(line, 'rot_13'))
+
+#5
+[(w, len(w)) for w in sent]
+</code>
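Solutions 1, 4 and 5 can be tried in isolation. The sketch below assumes `monty` is the string `'Monty Python'` (consistent with the reversed string checked in solution 1) and uses a hypothetical word list for `sent`, since neither name is defined in the listing. The rot13 check works on an in-memory string rather than the files that solution 4 reads.

```python
from codecs import encode

# Assumed values: neither name is defined in the solutions listing.
monty = "Monty Python"
sent = ["colorless", "green", "ideas"]  # hypothetical sample sentence

# 1: a slice with step -1 reverses the string.
print(monty[::-1])

# 4: rot13 is its own inverse, so encoding twice restores the text.
secret = encode("Hello, World!", "rot_13")
print(secret)
print(encode(secret, "rot_13"))

# 5: pair each word with its length.
print([(w, len(w)) for w in sent])
```

Because rot13 is its own inverse, running the script from solution 4 on its own output file should give back the original text.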
/var/www/cadia.ru.is/wiki/data/attic/public/t-malv-15-3/3.1441275430.txt.gz · Last modified: 2024/04/29 13:32 (external edit)
