
Unit 4: Intelligent CALL

Learning outcomes

By the end of this unit you should:

  • be able to explain what intelligent CALL is
  • have tried out some intelligent CALL tools designed for UoA students
  • have designed a grammatical parsing tool
  • know the pros and cons of using regular expressions
  • have developed a program to implement the design

Activity 1: The rise of intelligent CALL

Read page 4 of this book chapter.

When you are ready, work in pairs. Compare and contrast the following:

  1. CALL and intelligent CALL
  2. Natural language processing and intelligent CALL
  3. Natural language processing and natural language generation
  4. Natural language processing and machine learning

Activity 2: Feature visualizer

The feature visualizer is a prototype tool that is under development. It is designed to help learners become familiar with the genre of short scientific research articles. At the UoA, the graduation thesis for undergraduates takes the form of such an article. The feature visualizer comprises a small database of research articles. Various language features can be identified and highlighted in the current version. Video and textual explanations will be added in the near future to provide users with more details about the language that is visualized. The accuracy and usability of the functions vary.

Check out the functionalities listed below. Evaluate the accuracy, scope and usability of each function. Remember the purpose of this tool is pedagogic.

  1. Sections: Introduction, Method, etc.
  2. Moves: Related works, Importance, Novelty, Overview
  3. Functions: Refer to figure, table or equation
  4. Connections: Cohesion and coherence
  5. Linking: Prepositions, Conjunctions and Adverbial transitions.
  6. Tense: Past simple, Present perfect, etc.
  7. Voice: Passive voice
  8. Modality: Boosters, hedges and approximations
  9. Word profile: Words are highlighted according to frequency

Discuss ways to improve the tool with a partner. Share your conclusions with your tutor.

Activity 3: Parsing

Work in pairs. Design a rule-based expert system that identifies the appropriate grammatical structure to use in one of the following scenarios. Once you have worked out your system, create a decision tree using flowchart symbols, or a set of if-then decision rules (i.e. a rulebase).

  1. Indefinite, definite or null article (a, an, the, -- ): e.g. I bought ** sandwich and put it in my bag. Later that day, I shared ** sandwich with my friend. The user inputs a sentence with ** prior to any phrase that may use an article. Your task is to replace the ** symbol with the appropriate article.
  2. Rejoinders (so do I, neither can she, etc.): Prompt: I can speak Latin. Responses: So can I. I can't. The user inputs a declarative statement as a prompt. Your task is to generate two appropriate responses - one that shows agreement or similarity and one that shows disagreement or dissimilarity. The adverbs so and neither are used to show agreement.
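
As a starting point for scenario 2, a rulebase can be written directly as if-then statements. The sketch below is a minimal illustration rather than a complete solution. It assumes a first-person prompt that contains one of six modal verbs, and the names NEGATIVE, POSITIVE and rejoinders are hypothetical.

```python
import re

# Assumption: a deliberately small rulebase covering six modal verbs only.
NEGATIVE = {
    "can": "can't", "could": "couldn't", "will": "won't",
    "would": "wouldn't", "should": "shouldn't", "must": "mustn't",
}
# Reverse lookup: negative contraction -> positive modal.
POSITIVE = {neg: pos for pos, neg in NEGATIVE.items()}


def rejoinders(prompt):
    """Return (agreement, disagreement) for a first-person prompt,
    or None if no rule applies."""
    words = re.findall(r"[A-Za-z']+", prompt.lower())
    for word in words:
        # Rule 1: IF the prompt has a positive modal THEN agree with "so".
        if word in NEGATIVE:
            return f"So {word} I.", f"I {NEGATIVE[word]}."
        # Rule 2: IF the prompt has a negative modal THEN agree with "neither".
        if word in POSITIVE:
            return f"Neither {POSITIVE[word]} I.", f"I {POSITIVE[word]}."
    # No rule matched, e.g. "I like tea." needs a rule for do-support.
    return None


print(rejoinders("I can speak Latin."))
print(rejoinders("I can't swim."))
```

Prompts without a modal verb, such as "I like tea.", fall through to None. Deciding how to handle those cases is part of designing your own rulebase.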

Activity 4: Regular expressions

Read.

Regular expressions look daunting. There is a lot to learn, but the topic can be approached systematically by dividing the knowledge to be acquired into manageable blocks. There are many tutorial sites geared to helping learners understand and use regular expressions. I recommend using the website RegexOne to help you practise each of the following concepts. You will learn how to match, skip and capture characters and groups. There is a lesson and an exercise for each of the concepts listed below. Each lesson should only take a few minutes. If you are an expert, you should be finished in 15 minutes or less.

  • Lesson 1: matching letters
  • Lesson 1.5: matching letters and numbers
  • Lesson 2: wildcard character
  • Lesson 3: matching specific characters
  • Lesson 4: excluding specific characters
  • Lesson 5: character ranges
  • Lesson 6: matching repeated characters
  • Lesson 7: matching repeated characters (part 2)
  • Lesson 8: matching optional characters
  • Lesson 9: matching whitespace
  • Lesson 10: matching lines
  • Lesson 11: capturing groups
  • Lesson 12: capturing nested groups
  • Lesson 13: capturing multiple nested groups
  • Lesson 14: matching conditional text using the pipe symbol
  • Lesson 15: using metacharacters
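
Several of the concepts listed above can also be tried in Python using the built-in re module. The snippet below is a small sketch. The patterns and sample strings are my own illustrations and are not taken from the RegexOne exercises.

```python
import re

# Lessons 5 and 6: a character range with repetition matches runs of digits.
digits = re.findall(r"[0-9]+", "Room 12, floor 3")

# Lesson 4: [^b] excludes the letter b before "og".
not_bog = re.findall(r"[^b]og", "hog dog bog")

# Lesson 8: ? makes the preceding character optional.
spellings = re.findall(r"colou?r", "color and colour")

# Lesson 11: parentheses capture part of a match.
match = re.search(r"^(\w+)\.pdf$", "thesis_draft.pdf")
filename = match.group(1)

# Lesson 14: the pipe symbol matches one alternative or the other.
pets = re.findall(r"I love (cats|dogs)", "I love cats. I love dogs. I love logs.")

print(digits, not_bog, spellings, filename, pets)
```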

The solution to the exercise for Lesson 1 is shown below. When you solve the problem, you can continue to the next stage. The website offers solutions, but I strongly advise you to attempt the exercises yourself. If you cannot solve these exercises, you will almost certainly struggle with the assignment later in this course. Learn regex now.

[Image: solution to the exercise for Lesson 1]

Activity 5: Part-of-speech tagging

Read

POS tagging is the act of labelling words with a particular part of speech. The common parts of speech are noun, verb, adverb and adjective. However, most POS taggers use a much larger set of tags. The most popular POS tagset has 36 tags. NLP pipelines that aim to map syntax or disambiguate meanings often include POS tagging as one layer. The Penn Treebank tagset is shown in the table below.

Tag    Description
CC     Coordinating conjunction
CD     Cardinal number
DT     Determiner
EX     Existential there
FW     Foreign word
IN     Preposition or subordinating conjunction
JJ     Adjective
JJR    Adjective, comparative
JJS    Adjective, superlative
LS     List item marker
MD     Modal
NN     Noun, singular or mass
NNS    Noun, plural
NNP    Proper noun, singular
NNPS   Proper noun, plural
PDT    Predeterminer
POS    Possessive ending
PRP    Personal pronoun
PRP$   Possessive pronoun
RB     Adverb
RBR    Adverb, comparative
RBS    Adverb, superlative
RP     Particle
SYM    Symbol
TO     to
UH     Interjection
VB     Verb, base form
VBD    Verb, past tense
VBG    Verb, gerund or present participle
VBN    Verb, past participle
VBP    Verb, non-3rd person singular present
VBZ    Verb, 3rd person singular present
WDT    Wh-determiner
WP     Wh-pronoun
WP$    Possessive wh-pronoun
WRB    Wh-adverb
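
The fine-grained Penn Treebank tags relate to the four common parts of speech through their first two letters. The sketch below shows one possible mapping. The function name coarse_pos is hypothetical, and the example sentence was tagged by hand for illustration rather than by a tagger.

```python
def coarse_pos(tag):
    """Map a Penn Treebank tag to a coarse part of speech by its prefix."""
    prefixes = {"NN": "noun", "VB": "verb", "JJ": "adjective", "RB": "adverb"}
    # Tags outside the four common classes (e.g. DT, MD, PRP) become "other".
    return prefixes.get(tag[:2], "other")


# Assumption: tags assigned manually for this example sentence.
tagged = [
    ("She", "PRP"), ("quickly", "RB"), ("wrote", "VBD"),
    ("two", "CD"), ("short", "JJ"), ("articles", "NNS"),
]

coarse = [(word, coarse_pos(tag)) for word, tag in tagged]
for word, label in coarse:
    print(word, label)
```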

Activity 6: Natural Language Toolkit (NLTK)

The Natural Language Toolkit (NLTK) is one of the most popular libraries for creating NLP pipelines. There are many tutorials online to show you how to get started. For those who prefer a video introduction, check out the first video in a playlist by Sentdex, a popular programming YouTuber. The topic is tokenizing.

Watch and listen to this short introduction to using NLTK with Python.

Activity 7: Program creation

Create a program that runs in Terminal/Command line for the expert system you have created.
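
One possible skeleton for a command-line program is sketched below. It is only a starting point. The function apply_rules is a hypothetical placeholder that chooses between a and an from the first letter of the next word, so it ignores the, the null article, and exceptions such as "an hour". Replace it with your own rulebase. The file name articles.py is also an assumption.

```python
import re
import sys


def apply_rules(sentence):
    """Placeholder rule: replace each ** with 'a' or 'an' based only on
    the first letter of the following word."""
    def choose(match):
        next_word = match.group(1)
        article = "an" if next_word[0].lower() in "aeiou" else "a"
        return f"{article} {next_word}"

    return re.sub(r"\*\*\s*(\w+)", choose, sentence)


def main(args):
    """Process one sentence given as command-line arguments."""
    if not args:
        print('Usage: python articles.py "I bought ** sandwich."')
        return 1
    print(apply_rules(" ".join(args)))
    return 0


if __name__ == "__main__":
    main(sys.argv[1:])
```

Put the sentence in quotation marks when you run the program, because many shells treat the * character as a wildcard.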

Activity 8: Sharing your program

Discuss the best way to make your program accessible online. Identify the pros and cons of the different approaches.

Activity 9: Evaluating programs

Try out these expert systems developed by participants in the 2023 cohort. Consider the following questions:

  1. How accurate is the system? Are there false positive or false negative results?
  2. How user-friendly is the system? Can you suggest improvements?
  3. Do you think users can learn from interacting with this system? Why or why not? Can its effectiveness be improved?
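
To put numbers on the first question, you can compare the output of a system with the answers you expected. The sketch below counts true positives, false positives and false negatives for one target label, then derives precision and recall. The function name evaluate and the sample data are hypothetical. They are not results from any of the systems linked here.

```python
def evaluate(expected, predicted, target):
    """Count hits and errors for one target label and compute
    precision and recall."""
    pairs = list(zip(expected, predicted))
    tp = sum(e == target and p == target for e, p in pairs)
    fp = sum(e != target and p == target for e, p in pairs)  # false positives
    fn = sum(e == target and p != target for e, p in pairs)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall}


# Assumption: invented answers for five test sentences.
expected = ["a", "the", "an", "the", "a"]
predicted = ["a", "a", "an", "the", "the"]

result = evaluate(expected, predicted, "the")
print(result)
```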

Here are the links:

  1. Article generator (a, an, the or null): Codepen
  2. Article generator (a, an, the or null): Webpage
  3. Rejoinders to show agreement (so/neither): Colab link
  4. Rejoinders to show agreement (so/neither): Webpage

Review

Can you do the following?

  1. Explain part-of-speech tags
  2. Describe what an expert system is
  3. Create a decision tree or rulebase for an expert system.
  4. Create a program using.

If you cannot, make sure that you can before your next class.
