
Texts and Tools (TNT) Lab

The current focus in the Texts and Tools lab is on creating practical online tools that help people learn English. The tools we create often detect and/or visualize particular language features. Some language features are easy to detect automatically while others are much more challenging. Our research draws on corpus linguistics to analyze texts and on computational linguistics to create rule-based and probabilistic pattern-searching tools or pipelines.

Lab overview

  • Vision: Enhancing language learning using technology
  • Mission: Help scientists share their research with the world
  • Aims:
    Create tools for scientists to
    1. understand written research documents
    2. produce written research documents
  • Objectives for AY2020-22:
    1. Deploy feature visualizer at micro-, meso- and macro-discourse levels for annotated texts, including graduation theses and academic essays
    2. Develop functionality of tense identifier
    3. Develop and deploy dynamic language assessment tool
    4. Design and prototype authorship analysis tool
    5. Timeline generator: Automatic generation of timeline to visualize time-tense relationship
    6. Various NLTK-inspired tools, e.g. Gist highlighter based on simplified SVOCA analysis using NLP pipeline coded in Python
    7. Evaluate existing tools and increase their accuracy and usability
  • Slogan: Grit and a razor-focus leads to success
  • Values: Integrity and work ethic (plodding and bursting)
  • Culture: Tracking board showing key performance indicators (KPIs) - "what gets measured gets done"

Research themes

  • Pronunciation Scaffolder. This version annotates presentation scripts to help users read their script aloud. Features annotated include pausing, intonation, content words, word stress, tricky sounds and linking between words. The initial prototypes were created by computer science majors in the EL317 Language and patterns class of 2017.
  • Error detector. This tool detects errors found in a corpus of short research articles in information and computer science. Automated feedback is given for accuracy, brevity, clarity, objectivity and formality errors.
  • Language feature detector. This tool detects various language features, including modality (hedges, approximations and boosters), voice, pronouns and articles.
  • Language feature visualizer. This tool visualizes language features in a pre-loaded pre-annotated corpus of short research articles and academic essays (beta release expected soon).
  • Gist Visualizer. This tool highlights the main gist in texts through simplified SVOCA analysis. An NLP pipeline coded in Python (subject to funding) identifies the finite verbs and their grammatical subjects.
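
To give a flavour of the rule-based side of this work, the following is a minimal sketch of how modality markers (hedges, approximations and boosters) might be detected and marked up in a text. It is not the lab's actual tool: the word lists, the function names (build_pattern, detect_modality, annotate) and the bracket-style output are all assumptions made for illustration, and a usable detector would need much larger lexicons plus context checks.

```python
import re

# Hypothetical, deliberately tiny word lists (assumptions for illustration only).
MODALITY_LEXICON = {
    "hedge": ["may", "might", "could", "possibly", "perhaps", "suggest", "suggests"],
    "approximation": ["about", "approximately", "roughly", "around"],
    "booster": ["clearly", "certainly", "definitely", "obviously"],
}


def build_pattern(words):
    """Compile one case-insensitive, whole-word pattern for a list of words."""
    alternatives = "|".join(re.escape(w) for w in sorted(words, key=len, reverse=True))
    return re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)


PATTERNS = {label: build_pattern(words) for label, words in MODALITY_LEXICON.items()}


def detect_modality(text):
    """Return (start, end, matched_text, label) tuples sorted by position."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), match.end(), match.group(0), label))
    return sorted(hits)


def annotate(text):
    """Wrap each detected item in [label: ...] as a plain-text visualization."""
    pieces = []
    last = 0
    for start, end, matched, label in detect_modality(text):
        pieces.append(text[last:start])       # untouched text before the hit
        pieces.append(f"[{label}: {matched}]")  # the marked-up hit
        last = end
    pieces.append(text[last:])                # remaining text after the last hit
    return "".join(pieces)


sample = "These results may suggest that roughly half of the errors are clearly avoidable."
print(annotate(sample))
```

A purely lexical rule like this is easy to write but over-generates: it would, for example, flag the month "May" as a hedge. Cases like that are one reason some features are much harder to detect automatically than others.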

Lab member recruitment

If you want to develop practical language-related online tools, consider joining this lab. I am keen to recruit students who want to use their coding skills to create language tools that will help others improve their reading and writing skills. English is used as the primary lingua franca. Online communication is via Slack. If you have no interest in coding, this is not the lab for you. I have one expectation for lab members: show grit. If you are interested in joining the TNT lab, check out this video slideshow.

Lab schedule

The TNT lab aims to offer members a supportive yet challenging atmosphere. Each lab member brings a different set of skills, behaviours, knowledge and interests. We aim to make the most of these individual differences and match members with the research projects that best suit them. As a B3 student, you are expected to contribute to a variety of projects. This gives you valuable experience that will help you design and lead your own project as a B4 student. That project will be the vehicle for your graduation thesis.

Formal lab meetings are held in Semester 1 for B3 students and Semester 2 for B4 students.

B3 students
  • Quarter 1: problem-based workshops on professionalism and basic research methodology
  • Quarter 2: problem-based workshops on pattern detection and language analysis
  • Quarter 3: practical programming - Python
  • Quarter 4: practical programming - NLTK and Keras
B4 students
  • Quarter 1: GT research - GT relevant works
  • Quarter 2: GT research - GT poster presentation
  • Quarter 3: GT research - GT first draft (Introduction & Method)
  • Quarter 4: GT research - GT submission and presentation

Recommended courses for lab members

Recommended (but not required) courses

  • FU08 Automata and languages
  • FU10 Language processing systems
  • EL317 Language and patterns
  • EL236 Visualizing time and tense

Lab members

Current:

  • Associate professor: John Blake
  • Graduation thesis student (due to graduate Autumn 2020):
    • Jun Inoue (B5, NLP)
  • Graduation thesis students (from April 2020 to March 2022):
    • Akihiro Oda (B3, NLP)
    • Yusuke Niiyama (B3, NLP)
    • Izumu Koshihara (B3, NLP)
    • Kento Miura (B3, NLP)

(B3 = junior, B4 = senior, M1 = first-year master's degree, M2 = second-year master's degree)

Lab alumni:

  • Takumi Kondo. Graduation thesis submitted 2020: Pattern detection and video typology.
  • Hiroki Inoue. Graduation thesis submitted 2018: Verification and improvement of software to support reading English aloud.