Texts and Tools (TNT) Lab

The current focus of the Texts and Tools lab is on creating practical online tools that help people learn English. The tools we create often detect or visualize particular language features, or do both. Some language features are easy to detect automatically, while others are much more challenging. Our research draws on corpus linguistics to analyze texts and on computational linguistics to create rule-based and probabilistic pattern-searching tools and pipelines.

Lab overview

  • Vision: Enhancing language learning using technology
  • Mission: Help scientists share their research with the world
  • Aims:
    Create tools for scientists to
    1. understand written research documents
    2. produce written research documents
  • Objectives for AY2020-22:
    1. Deploy feature visualizer at micro-, meso- and macro-discourse levels for annotated texts, including graduation theses and academic essays
    2. Develop functionality of tense identifier
    3. Develop and deploy dynamic language assessment tool
    4. Design and prototype authorship analysis tool
    5. Timeline generator: Automatic generation of timeline to visualize time-tense relationship
    6. Various NLTK-inspired tools, e.g. a gist highlighter based on simplified SVOCA analysis using an NLP pipeline coded in Python
    7. Evaluate existing tools and increase their accuracy and usability
  • Slogan: Grit and a razor-sharp focus lead to success
  • Values: Integrity and work ethic (plodding and bursting)
  • Culture: Tracking board showing key performance indicators (KPIs) - "what gets measured gets done"

Research themes

  • Pronunciation Scaffolder. This tool annotates presentation scripts to help users read their scripts aloud. Features annotated include pausing, intonation, content words, word stress, tricky sounds and linking between words. The initial prototypes were created by computer science majors in the EL317 Language and patterns class of 2017.
  • Error detector. This tool detects errors of the kinds found in a corpus of short research articles in information and computer science. Automated feedback is given on accuracy, brevity, clarity, objectivity and formality errors.
  • Language feature detector. This tool detects various language features, including modality (hedges, approximations and boosters), voice, pronouns and articles.
  • Language feature visualizer. This tool visualizes language features in a pre-loaded, pre-annotated corpus of short research articles and academic essays (beta release expected soon).
  • Gist Visualizer. This tool highlights the main gist of a text. It uses simplified SVOCA analysis, implemented as an NLP pipeline coded in Python, to identify the finite verbs and their grammatical subjects (subject to funding).
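
To give a feel for the rule-based end of this work, here is a minimal sketch of how a detector for a few of the features listed above (hedges, boosters and articles) might look in Python. It is not the lab's actual code: the word lists, the function names detect_features and highlight, and the bracketed output format are all assumptions made for illustration, and a real tool would need far larger lexicons plus context-sensitive rules.

```python
import re

# Illustrative word lists only (assumed for this sketch, not the lab's lexicons).
HEDGES = {"may", "might", "could", "possibly", "perhaps", "suggests", "appears"}
BOOSTERS = {"clearly", "certainly", "definitely", "obviously", "undoubtedly"}
ARTICLES = {"a", "an", "the"}

# Map every listed word to the label of the feature it signals.
LABELS = {}
for words, label in ((HEDGES, "hedge"), (BOOSTERS, "booster"), (ARTICLES, "article")):
    for word in words:
        LABELS[word] = label


def detect_features(text):
    """Return (word, label, start, end) for each token found in the word lists."""
    matches = []
    for m in re.finditer(r"[A-Za-z]+", text):
        label = LABELS.get(m.group().lower())
        if label is not None:
            matches.append((m.group(), label, m.start(), m.end()))
    return matches


def highlight(text):
    """Wrap each detected token as [label:token] so the feature is visible."""
    out = text
    # Work from the end of the text so earlier offsets stay valid.
    for word, label, start, end in reversed(detect_features(text)):
        out = out[:start] + f"[{label}:{word}]" + out[end:]
    return out


if __name__ == "__main__":
    print(highlight("The results clearly suggest that this may work."))
```

A pure lookup like this over-matches (for example, "may" as a month name) and misses inflected forms such as "suggest", which is one reason some features are much harder to detect automatically than others.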

Lab member recruitment: undergraduate and graduate

If you want to develop practical language-related online tools, consider joining this lab. I am keen to recruit students who want to use their coding skills to create language tools that will help others improve their reading and writing skills. English is used as the primary lingua franca. Online communication is via Slack. If you have no interest in coding, this is not the lab for you. I have one expectation for lab members: show grit.

The TNT lab aims to offer its members a supportive yet challenging atmosphere. Each lab member brings a different set of skills, behaviours, knowledge and interests. We aim to make the best use of these individual differences and to match members with the research projects that best suit them. As a B3 student, you are expected to contribute to a variety of projects. This will give you valuable experience that will help you design and lead your own project as a B4 student. That project will be the vehicle for your graduation thesis.

Schedule for 2022 cohort

Formal lab meetings are usually held in Semester 1 for B3 students and Semester 2 for B4 students. In AY2022, the planned schedule is as follows:

Year 3 (B3 students)
  • Quarter 1: problem-based workshops on professionalism and basic research methodology
  • Quarter 2: problem-based workshops on pattern detection and language analysis
  • Quarter 3: practical programming - Python
  • Quarter 4: practical programming - NLTK and Keras
Year 4 (B4 students)
  • Quarter 1: GT research - GT relevant works
  • Quarter 2: GT research - GT poster presentation
  • Quarter 3: GT research - GT first draft (Introduction & Method)
  • Quarter 4: GT research - GT submission and presentation

Previous Year 4 schedule

The previous schedule was as follows:

  • Quarter 1
    • Blended course: Professionalism
    • Intensive course: Python
  • Quarter 2
    • Blended course: Research methods
    • Intensive course: Pattern detection with Python (NLTK or Keras)
  • Quarter 3
    • GT course: Poster, GT first draft
    • Individual project: Development
  • Quarter 4
    • GT course: Thesis and presentation
    • Individual project: Development and Evaluation

Recommended courses for lab members

Recommended (but not required) courses

  • FU08 Automata and languages
  • FU10 Language processing systems
  • EL317 Language and patterns
  • EL236 Visualizing time and tense

Lab members


  • Senior associate professor: John Blake
  • Graduation thesis students (from April 2022 to March 2024):
    • Kazuma Tamura (B3)
    • Kaito Asai (B3)
    • Fumito Takeue (B3)

(B3 = junior, B4 = senior, M1 = first-year master's student, M2 = second-year master's student)

KPI table

Lab alumni:

  • Yusuke Niiyama. Graduation thesis submitted 2022: Development of an app of trend description generation.
  • Izumu Koshihara. Graduation thesis submitted 2022: Authorship attribution application and algorithm.
  • Kento Miura. Graduation thesis submitted 2022: Comparison of document features by passive voice, n-gram and readability using natural language processing.
  • Takumi Kondo. Graduation thesis submitted 2020: Pattern detection and video typology.
  • Hiroki Inoue. Graduation thesis submitted 2018: Verification and improvement of software to support reading English aloud.