By the end of this unit you should:
Work in pairs. Discuss your answers to the following questions.
What effect do the following have on authorship analysis?
Explain the following terms in simple English.
Your name is Yuki Abe. You have just received a study abroad scholarship and will go to the United States. You will stay with a family. There is a university student in the family called Joey. Write an email of about 200 words introducing yourself to Joey. You CAN use an online dictionary but you CANNOT use translation tools (e.g. DeepL, Google Translate) or AI-generation tools (e.g. ChatGPT). Include the following information:
Submit your work via ELMS.
A 19-year-old British woman claimed that the police statement that she had written by hand was dictated to her by a Cypriot police officer. If the language used is typical of a young British woman, then her claim is likely to be untrue. However, if the language is not typical of a young British woman, then her claim is likely to be true. The full text is given below:
Statement
The report that I did on the 17th of July 2019 that I was raped at ayia napa was not the truth. The truth is that I wasnt raped and everything that happened in that appartment was with my consent. The reason I made the statement with the fake report is because I did not know they were recording & humiliating me that night I discovered them recording me doing sexual intercourse and I felt embarrassed so I want to appologise, say I made a mistake.
Source: Donlan, L., & Nini, A. (2022). A forensic authorship analysis of the Ayia Napa rape statement. In I. Picornell, R. Perkins, & M. Coulthard (Eds.), Methodologies and Challenges in Forensic Linguistic Casework (pp. 29-43). Wiley.
To solve this case an authorship profiling approach was adopted. The forensic linguist identified the language features listed below as being worthy of analysis.
Analyze the features listed and submit your work via ELMS.
Read.
N-grams are often used in forensic linguistics as the access point to identify the distinguishing, that is, the idiosyncratic features of an author. Access point is the name given to the way in which a text is first analyzed. The overuse or underuse of particular n-grams can help narrow down or identify the author of a questioned text. In the fields of computational linguistics and probability, an n-gram (sometimes also called Q-gram) is a contiguous sequence of n items from a given sample of text or speech. The items can be phonemes, syllables, letters, words or base pairs according to the application. Using Latin numerical prefixes, an n-gram of size 1 is referred to as a "unigram"; size 2 is a "bigram" (or, less commonly, a "digram"); size 3 is a "trigram". English cardinal numbers are sometimes used, e.g., "four-gram", "five-gram", and so on.
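The definition of an n-gram above can be illustrated with a short Python sketch. It is only one possible way to extract and count n-grams, and it makes assumptions that are not part of the reading text: words are separated by spaces, and the function names (ngrams, relative_frequencies) and the sample sentence are made up for this example.

```python
from collections import Counter


def ngrams(items, n):
    """Return every contiguous sequence of n items as a tuple."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [tuple(items[i:i + n]) for i in range(len(items) - n + 1)]


def relative_frequencies(items, n):
    """Return each n-gram's share of all n-grams in the sequence."""
    grams = ngrams(items, n)
    if not grams:
        return {}
    counts = Counter(grams)
    return {gram: count / len(grams) for gram, count in counts.items()}


# Illustrative sample only; splitting on spaces is a simplifying assumption.
sample = "the truth is that the truth matters"
words = sample.split()

word_bigrams = ngrams(words, 2)           # items are words, n = 2
char_trigrams = ngrams(list("truth"), 3)  # items are letters, n = 3
bigram_counts = Counter(word_bigrams)

print(word_bigrams)
print(char_trigrams)
print(bigram_counts.most_common(1))
```

Comparing the relative frequencies of the same n-grams in two texts is one simple way to see which n-grams one writer overuses or underuses compared with another.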
Discuss the meaning of the following with a partner.
The reading text in Activity 4 is a combination of sentences written by your tutor and someone else.
Work with a partner to try to identify which sentences were NOT written by your tutor.
Individual work
Work alone. Insert a sentence taken from Wikipedia into the email that you submitted in Activity 2.
Whole class work
Share your revised version by displaying it on your screen if you are in a classroom. Divide into two groups of students: authors and detectives. Authors stay by your screen. Detectives move around the classroom and identify the sentences copied from Wikipedia.
Team work
Discuss how to automatically identify the text NOT written by the author.
Make sure you can explain the following in simple English: