ITCS 4111/5111: Introduction to Natural Language Processing

Fall 2023

| Instructor & TAs | Razvan Bunescu | Sivani Josyula | Manogna Chennuru |
|---|---|---|---|
| Office | Woodward 210F | Zoom & Burson 239B | Zoom & Burson 239B |
| Office hours | Tue, Thu 4:00 – 5:00pm | Tue, Wed 10:00 – 11:00am | Thu, Fri 10:00 – 11:00am |
| Email | rbunescu @ charlotte edu | sjosyul2 @ charlotte edu | mchennu2 @ charlotte edu |

Natural Language Processing (NLP) is a branch of Artificial Intelligence concerned with developing computer systems that can analyze or generate natural language. This course will introduce fundamental linguistic analysis tasks, including tokenization, word representations, text classification, syntactic and semantic parsing, and coreference resolution. Machine learning (ML) based techniques will be introduced, ranging from Naive Bayes and logistic regression to Transformer-based language models, which will be used in a number of NLP applications such as sentiment classification, information extraction, and question answering. Overall, the aim of this course is to equip students with an array of techniques and tools that they can use to solve known NLP tasks, as well as new types of NLP problems.

Students are expected to be comfortable with programming in Python and with data structures and algorithms (ITSC 2214), and to have basic knowledge of linear algebra (MATH 2164), statistics, and formal languages (regular and context-free grammars). Knowledge of machine learning will be very useful, though not strictly necessary. Relevant background material will be made available on this website throughout the course.

- Syllabus & Introduction
- Python for programming, linear algebra, and visualization
- Python lecture
- Python tutorial
- NumPy tutorial
- Matplotlib tutorial
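
As a quick illustration of the kind of NumPy operations used throughout the course (not part of the official materials; the vector values are made up), cosine similarity between two vectors can be computed as:

```python
import numpy as np

# Two toy "word vectors" (hypothetical values, for illustration only).
v_cat = np.array([0.2, 0.7, 0.1])
v_dog = np.array([0.3, 0.6, 0.2])

def cosine_similarity(a, b):
    # Cosine similarity: dot product normalized by the vector lengths.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(v_cat, v_dog))
```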

- Tokenization: From text to sentences and tokens
- Examples from lecture on Aug 31
- Hand notes from lecture on Sep 5
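
A minimal sketch of regex-based tokenization (not from the lecture materials; the example sentence is made up). Real tokenizers handle many more cases, such as abbreviations like "Dr.", contractions, URLs, and emoji:

```python
import re

text = 'Dr. Smith arrived. He said, "Hello, world!"'

# A naive tokenizer: match word sequences (optionally with an internal
# apostrophe) or any single non-space punctuation character.
tokens = re.findall(r"\w+(?:'\w+)?|[^\w\s]", text)
print(tokens)
```

Note how the naive pattern splits "Dr." into "Dr" and ".", one of the ambiguities that motivates more careful sentence and token segmentation.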

- Regular expressions
- Examples and hand notes from lecture on Sep 7
- Regular expressions in Python with *re*
- Regular expressions in UNIX with *sed*
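
Two of the most common *re* operations, `findall` with a capturing group and `sub` for substitution, can be sketched as follows (the log string is an invented example):

```python
import re

log = "Error 404 at 10:32, error 500 at 11:05"

# findall with a capturing group: extract just the status codes.
codes = re.findall(r"[Ee]rror (\d{3})", log)

# sub: replace times like 10:32 with a placeholder token.
anonymized = re.sub(r"\d{1,2}:\d{2}", "<TIME>", log)

print(codes)       # ['404', '500']
print(anonymized)  # Error 404 at <TIME>, error 500 at <TIME>
```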

- Strengths and Weaknesses of Language Models
- Application development using GPT and Llama-2 through the Chat completion API
- Examples and notebooks for GPT and Llama-2 from lecture on Sep 14
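
Chat completion APIs take a list of role/content messages. The sketch below shows only that message format; the helper function, prompts, and any client call or model name are hypothetical placeholders, not the lecture's actual code:

```python
def build_messages(system_prompt, user_question):
    # Chat completion requests are lists of {"role", "content"} dicts:
    # a system message sets behavior, user messages carry the queries.
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_question},
    ]

messages = build_messages(
    "You answer questions about restaurant data.",
    "Which restaurants are open after 10pm?",
)
```

A list like this would then be passed to the provider's chat completion endpoint (OpenAI for GPT, or a hosted Llama-2 endpoint).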

- Text classification using Naive Bayes
- Logistic regression
- Hand notes from lecture on Oct 10: probabilities and the logistic sigmoid.
- Hand notes from lectures on Oct 12, Oct 17, and Oct 19.
- Slides 1 to 13 from CS 4156 lecture on Intro to ML
- Slides 1 to 24 (LR) and slide 27 (LR + L2 regularization) from CS 4156 lecture on Logistic Regression
- Slides 1 to 21 from CS 4156 lecture on Gradient Descent
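
The logistic sigmoid, the cross-entropy loss gradient, and batch gradient descent from these lectures can be combined into a tiny end-to-end sketch (the 1-D dataset and learning rate are made up for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy 1-D data: negative x -> label 0, positive x -> label 1.
X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])

w = np.zeros(1)
b = 0.0
lr = 0.5

# Batch gradient descent on the average cross-entropy loss;
# the gradient w.r.t. w is X^T (p - y) / n.
for _ in range(200):
    p = sigmoid(X @ w + b)
    w -= lr * (X.T @ (p - y)) / len(y)
    b -= lr * np.mean(p - y)

print(sigmoid(X @ w + b))  # predicted probabilities should track the labels
```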

- Biases vs. fairness and rationality in NLP models
- Bias Mitigation for Machine Learning Classifiers: A Comprehensive Survey, by Hort et al., 2023.
- Challenging the appearance of machine intelligence: Cognitive bias in LLMs, by Talboy and Fuller, 2023.
- Benchmarking Cognitive Biases in Large Language Models as Evaluators, by Koo et al., 2023.
- Capturing Failures of Large Language Models via Human Cognitive Biases, by Jones and Steinhardt, NeurIPS 2022.

- Manual annotation for NLP
- Brat rapid annotation tool

- Word meanings; Sparse vs. dense representations of words
- N-grams and Neural models for Language Modeling and Sequence Processing
- Hand notes from lectures on Nov 14 and Nov 21.
- Slides 1 to 20 from CS 4156 lecture on Feed-forward neural networks.
- All Our N-gram are Belong to You
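
A maximum-likelihood bigram language model reduces to counting: P(w2 | w1) = count(w1 w2) / count(w1). A minimal sketch on an invented toy corpus:

```python
from collections import Counter

corpus = "the cat sat on the mat the cat ran".split()

# Count unigrams and adjacent word pairs (bigrams).
unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))

def bigram_prob(w1, w2):
    # MLE estimate: count(w1 w2) / count(w1).
    return bigrams[(w1, w2)] / unigrams[w1]

print(bigram_prob("the", "cat"))  # 2 of the 3 "the" tokens precede "cat"
```

Unseen bigrams get probability zero under this estimate, which is what motivates the smoothing techniques covered in lecture.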

- Machine translation, Sequence-to-sequence models and Attention
- Chapter 9 in J & M on Deep Learning Architectures for Sequence Processing

- Transformer: Self-Attention Networks
- Hand notes from lectures on Nov 28 and Nov 30.
- Chapter 10 in J & M on Transformers and Pretrained Language Models
- Jay Alammar's Illustrated Transformer
- HuggingFace course section on How do Transformers work?
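
The core of the Transformer, scaled dot-product self-attention, can be written in a few lines of NumPy (a single head, no masking or multi-head reshaping; the sizes and random projection matrices are illustrative only):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # Project the input tokens into queries, keys, and values.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.
    scores = Q @ K.T / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)
    return weights @ V, weights

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))            # 4 tokens, embedding dim 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out, weights = self_attention(X, Wq, Wk, Wv)
print(out.shape, weights.shape)        # (4, 8) (4, 4)
```

Each row of `weights` sums to 1: every output token is a convex combination of all the value vectors.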

- Language Models: Pretraining and Fine-tuning
- Language Models: Prompting, In-context Learning, Chain of Thought, Instruct Tuning, RLHF
- The Ethico-Political Universe of ChatGPT, by John Levi Martin, Journal of Social Computing, March 2023.

- Coreference resolution
- Syntax, constituency parsing, dependency parsing

- Assignment 0 on Python lists and strings.
- Assignment 1 on Word distributions.
- Assignment 2 on Wikipedia processing with regular expressions.
- Assignment 3 on Question Answering on semi-structured restaurant data using the chat completion API.
- Assignment 4 on Sentiment Analysis with Naive Bayes.
- Assignment 5 on Sentiment Analysis with Logistic Regression and engineered features.
- Assignment 6 on corpus acquisition and annotation.
- Assignment 7 on Vector Representations of Words.
- Assignment 8 on RNNs for Sentiment Classification.
- Skeleton code and data.
- Educational cluster instructions.

- Assignment 9 on Transformer-based models for NLP.

- Resources and guidelines for the final project
- Tips for choosing a project topic:
- Project report guidelines

- Python programming:
- Probability and statistics:
- Basic probability theory (pp. 12-19) in Pattern Recognition and Machine Learning.
- Chapter 3 in DL textbook on Probability and Information Theory.
- Chapters 1-5 in Probability & Statistics for Engineers & Scientists.
- Nathaniel E. Helwig's Introduction to Probability Theory
- Statistical Inference book, Casella and Berger, 2001.
- Seeing Theory: A visual introduction to probability and statistics, Kunin et al., 2018.

- Linear Algebra:
- Chapter 2 in DL textbook on Linear Algebra.
- Chapter 2 on Linear Algebra in Mathematics for Machine Learning.
- Inderjit Dhillon's Linear Algebra Background
- Gilbert Strang's Introduction to Linear Algebra
- Petersen et al.'s The Matrix Cookbook
- Mike Brookes' Matrix Reference Manual

- Calculus:
- Basic properties for derivatives, integrals, exponentials, and logarithms.
- Chapter 4.3 in DL textbook on Numerical Computation.
- Gilbert Strang's Calculus textbook.

- Training language models to follow instructions with human feedback, Ouyang et al., NeurIPS 2022
- Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task, Li et al., ICLR 2023.
- Theory of Mind May Have Spontaneously Emerged in Large Language Models, Michal Kosinski, Stanford 2023.
- Are Emergent Abilities of Large Language Models a Mirage?, Schaeffer et al., DeployableGenerativeAI 2023.
- What’s the Meaning of Superhuman Performance in Today’s NLU?, Tedeschi et al., ACL 2023.

- Natural language processing:
- Machine learning:
- PyTorch Deep learning in Python
- PyTorch tutorial from University of Amsterdam.
- Scikit-learn Machine learning in Python
- Hugging Face ML models and datasets