By the end of this unit you should have:
Work in pairs. Discuss your answers to the questions below.
Read.
Currently, expert witnesses in criminal cases who need to examine evidence and present their findings to the court do not have access to a system that can compare datasets. The primary purpose of your expert system is to identify the linguistic similarity among datasets (corpora). The system should compare two or more known corpora to a questioned corpus. It evaluates which known corpus is most similar to the questioned corpus and which is most dissimilar. The most dissimilar corpus is ruled out. This continues until only one corpus remains. That corpus is identified as the most similar, and it is therefore assumed to have been written by the same author as the questioned corpus.
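The elimination procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say how similarity should be measured, so the sketch assumes cosine similarity over relative word frequencies, and the function names (`word_profile`, `cosine_similarity`, `identify_most_similar`) are hypothetical. Your own expert system may use quite different linguistic features.

```python
import math
import re
from collections import Counter


def word_profile(text):
    """Return the relative frequency of each lowercase word token in a text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()} if total else {}


def cosine_similarity(p, q):
    """Cosine similarity between two sparse frequency profiles (assumed measure)."""
    dot = sum(p[w] * q[w] for w in p.keys() & q.keys())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


def identify_most_similar(known, questioned):
    """Rule out the least similar known corpus until only one remains.

    known: dict mapping a corpus label to its text.
    questioned: text of the questioned corpus.
    Returns (remaining_label, labels_in_order_of_elimination).
    """
    if not known:
        raise ValueError("at least one known corpus is required")
    target = word_profile(questioned)
    scores = {label: cosine_similarity(word_profile(text), target)
              for label, text in known.items()}
    eliminated = []
    while len(scores) > 1:
        least_similar = min(scores, key=scores.get)  # most dissimilar corpus
        eliminated.append(least_similar)
        del scores[least_similar]
    remaining = next(iter(scores))
    return remaining, eliminated
```

Because each known corpus is scored once against the questioned corpus, the loop ends with the highest-scoring corpus. It is written as a loop so that the order in which corpora are ruled out can be reported alongside the result.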
Check these lists of features for the expert system which show what the system needs to do.
The expert system will be able to:
absolutely + JJ
essential
The expert system will be able to:
absolutely + JJ
The expert system will be able to:
Read.
POS tagging is the act of labelling each word with its part of speech. The common parts of speech are noun, verb, adverb and adjective. However, most POS taggers use a much larger set of tags. The most popular POS tagset has 36 tags. NLP pipelines that aim to map syntax or disambiguate meanings often use this layer. The Penn Treebank tagset is shown in the table below.
CC Coordinating conjunction | CD Cardinal number | DT Determiner |
EX Existential there | FW Foreign word | IN Preposition or subordinating conjunction |
JJ Adjective | JJR Adjective, comparative | JJS Adjective, superlative |
LS List item marker | MD Modal | NN Noun, singular or mass |
NNS Noun, plural | NNP Proper noun, singular | NNPS Proper noun, plural |
PDT Predeterminer | POS Possessive ending | PRP Personal pronoun |
PRP$ Possessive pronoun | RB Adverb | RBR Adverb, comparative |
RBS Adverb, superlative | RP Particle | SYM Symbol |
TO to | UH Interjection | VB Verb, base form |
VBD Verb, past tense | VBG Verb, gerund or present participle | VBN Verb, past participle |
VBP Verb, non-3rd person singular present | VBZ Verb, 3rd person singular present | WDT Wh-determiner |
WP Wh-pronoun | WP$ Possessive wh-pronoun | WRB Wh-adverb |
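To make the tags concrete, the sketch below labels words with Penn Treebank tags and then searches for a word followed by a given tag, such as "absolutely" followed by an adjective (JJ). It is a toy only: `TOY_LEXICON`, `toy_tag` and `match_word_plus_tag` are hypothetical names invented for this illustration, the lexicon is hand-written, and unknown words are simply assumed to be nouns. A real POS tagger learns its tags from annotated text and uses context to decide between them.

```python
# Hand-written toy lexicon (assumption for illustration, not a real tagger).
TOY_LEXICON = {
    "the": "DT", "and": "CC", "it": "PRP",
    "is": "VBZ", "was": "VBD",
    "absolutely": "RB", "very": "RB",
    "essential": "JJ", "certain": "JJ",
    "evidence": "NN", "witness": "NN",
}


def toy_tag(sentence):
    """Return (word, tag) pairs; unknown words default to NN in this sketch."""
    tokens = sentence.lower().split()
    return [(token, TOY_LEXICON.get(token, "NN")) for token in tokens]


def match_word_plus_tag(tagged, word, tag):
    """Find every place where `word` is directly followed by a token tagged `tag`."""
    return [(w1, w2)
            for (w1, _), (w2, t2) in zip(tagged, tagged[1:])
            if w1 == word and t2 == tag]


tagged = toy_tag("The evidence is absolutely essential and the witness was absolutely certain")
matches = match_word_plus_tag(tagged, "absolutely", "JJ")
print(matches)
```

Counting how often such patterns occur in each corpus is one way that tagged text could be turned into features for comparison.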
Check the slide deck and see if there are any ideas that you could use to improve your expert system. Slides were created by students and combined into a single slide deck.
Can you:
If you cannot, make sure that you can before your next class.