public:t-malv-15-3:5
Differences
This shows you the differences between two versions of the page.
public:t-malv-15-3:5 [2015/09/17 02:15] – [4. Most frequent tag of Hapax legomenon] orvark | public:t-malv-15-3:5 [2024/04/29 13:33] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 7: | Line 7: | ||
===== 1. Down the Garden Path ===== | ===== 1. Down the Garden Path ===== | ||
- | Let's get started by trying out the POS-Tagger in NLTK. See if you can think of some ambiguous sentences that confuse the tagger -- or google for [[http:// | + | Let's get started by trying out the POS-Tagger in NLTK. See if you can think of some ambiguous sentences that confuse the tagger -- or google for some [[http:// |
<code python> | <code python> | ||
Line 28: | Line 28: | ||
</ | </ | ||
- | See if making minor changes to the wording of the sentences is enough for the tagger to get the correct result. For example, adding '' | + | See if making minor changes to the wording of the sentences is enough for the tagger to tag them correctly. For example, adding '' |
- | ===== 2. Training and Testing Data and Finding the Baseline | + | ===== 2. Training and Testing Data, and Finding the Baseline |
<code python> | <code python> | ||
Line 41: | Line 41: | ||
Train the **Unigram Tagger** on the training sentences using **Default Tagger** as backoff. Use ' | Train the **Unigram Tagger** on the training sentences using **Default Tagger** as backoff. Use ' | ||
- | Evaluate the tagger's performance on the testing sentences. How well does it do? Do you think this is a fair baseline | + | Evaluate the tagger's performance on the testing sentences. How well does it do? Do you think this is a fair baseline |
===== 3. A Cascade of Taggers ===== | ===== 3. A Cascade of Taggers ===== | ||
Line 55: | Line 55: | ||
</ | </ | ||
- | Evalute | + | Evaluate |
===== 4. Most frequent tag of Hapax legomenon | ===== 4. Most frequent tag of Hapax legomenon | ||
Line 61: | Line 61: | ||
Instead of using the most common tag overall for the Default Tagger some say that [[https:// | Instead of using the most common tag overall for the Default Tagger some say that [[https:// | ||
- | See if you can write code to find the most common | + | See if you can write code to find the most common |
What is the most common tag of //hapax legomenon//? | What is the most common tag of //hapax legomenon//? | ||
+ | |||
+ | Looking at the twenty most common tags, how do you think the difference between the overall model and hapax legomenon model will develop as the training corpus grows larger? |
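
One possible way to approach exercise 4 is sketched below in plain Python, without NLTK. It assumes the training data is a list of sentences, each a list of ''(word, tag)'' pairs, which is the shape NLTK's tagged corpora return. The ''train_sents'' toy data and the ''most_common_tags'' helper are hypothetical names made up for this illustration and are not part of the exercise; lowercasing words before counting is also an assumption you may want to drop.

```python
from collections import Counter

# Hypothetical toy data standing in for the real training sentences:
# each sentence is a list of (word, tag) pairs.
train_sents = [
    [("the", "DT"), ("old", "JJ"), ("man", "NN"), ("the", "DT"), ("boats", "NNS")],
    [("the", "DT"), ("horse", "NN"), ("raced", "VBD"), ("past", "IN"),
     ("the", "DT"), ("barn", "NN")],
    [("the", "DT"), ("man", "NN"), ("sold", "VBD"), ("a", "DT"), ("dog", "NN")],
]


def most_common_tags(tagged_sents):
    """Return (overall tag counts, tag counts restricted to hapax legomena)."""
    pairs = [pair for sent in tagged_sents for pair in sent]
    # Count how often each word form occurs (case-insensitive by assumption).
    word_counts = Counter(word.lower() for word, _ in pairs)
    # Tag distribution over all tokens.
    overall = Counter(tag for _, tag in pairs)
    # Tag distribution over words that occur exactly once in the data.
    hapax = Counter(tag for word, tag in pairs if word_counts[word.lower()] == 1)
    return overall, hapax


overall_tags, hapax_tags = most_common_tags(train_sents)
print(overall_tags.most_common(1))  # [('DT', 6)]
print(hapax_tags.most_common(1))    # [('NN', 3)]
```

On this toy data the two distributions already disagree: the overall winner is a closed-class tag, while the hapax winner is an open-class one. Calling ''most_common(20)'' on both counters for the real training sentences gives the lists that the last question asks you to compare.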
/var/www/cadia.ru.is/wiki/data/attic/public/t-malv-15-3/5.1442456138.txt.gz · Last modified: 2024/04/29 13:32 (external edit)