ITCS 4101: Introduction to Natural Language Processing
Spring 2026


Time and Location: Tue, Thu 2:30 – 3:45pm, Duke 207

Instructor: Razvan Bunescu
  Office: Woodward 410G
  Office hours: Tue, Thu 4:00 – 5:00pm
  Email: rbunescu@charlotte.edu

IA: Andrew Morgan
  Office: Cone 164
  Office hours: Mon 3:00 – 4:00pm
  Email: amorga94@charlotte.edu

IA: Kento Hopkins
  Office: Cone 164
  Office hours: Wed 11:00am – 12:00pm
  Email: khopki22@charlotte.edu

Textbook (PDF available online):
  • Speech and Language Processing (3rd edition draft), by Daniel Jurafsky and James H. Martin; draft released on Jan 6, 2026.

Course description:
  Natural Language Processing (NLP) is a branch of Artificial Intelligence that focuses on the development of computer systems that process or generate natural language. This course will first introduce fundamental linguistic analysis tasks, including tokenization, syntactic parsing, semantic parsing, and coreference resolution. We will then study vector-based representations of text, ranging from bag-of-words and TF-IDF to neural word and text embeddings. The course will survey the machine learning models and techniques underlying modern NLP, including attention and Transformer-based language models, which will be used in a number of NLP applications such as sentiment classification, information extraction, and question answering. In parallel, the course will introduce standard frameworks for developing workflows in which LLM-based agents connect with tools and communicate with other agents. Overall, the aim of this course is to equip students with an array of techniques and tools that they can use to solve known NLP tasks, as well as new types of NLP problems.
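
  As a small preview of the style of code used in the course, below is a minimal sketch of a TF-IDF bag-of-words sentiment classifier in Python. It assumes the scikit-learn library; the toy sentences, labels, and variable names are invented for illustration and are not taken from course materials:

    # Minimal sketch: TF-IDF bag-of-words features + logistic regression.
    # (Illustrative only; the toy sentences and labels are made up.)
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    train_texts = ["I loved this movie", "A wonderful, moving story",
                   "Terrible acting and a boring plot", "I hated every minute"]
    train_labels = [1, 1, 0, 0]  # 1 = positive sentiment, 0 = negative

    # Map raw text to sparse TF-IDF vectors (one dimension per vocabulary word).
    vectorizer = TfidfVectorizer()
    X_train = vectorizer.fit_transform(train_texts)

    # Fit a logistic regression classifier on the TF-IDF vectors.
    classifier = LogisticRegression()
    classifier.fit(X_train, train_labels)

    # Vectorize a new sentence with the same vocabulary and classify it.
    X_new = vectorizer.transform(["A wonderful story"])
    print(classifier.predict(X_new))  # predicted label; [1] on this toy data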

Prerequisites:
  Introduction to Machine Learning (ITCS 3156). Students are expected to be comfortable with programming in Python, data structures and algorithms (ITSC 2214), and basic machine learning techniques. Relevant background material will be made available on this website throughout the course.

Lecture notes:
  1. Syllabus & Introduction
  2. Python for programming, linear algebra, and visualization
  3. Tokenization: from text to sentences and tokens
  4. Regular expressions
  5. Text classification using logistic regression
  6. LLMs: use scenarios, strengths and weaknesses
  7. LLMs: application development through APIs
  8. LLMs: connecting applications with tools and external resources
  9. LLMs: efficient factual grounding using RAG and vector DBs
  10. LLMs: connecting AI applications to external resources using MCP
  11. LLMs: developing and deploying multi-agent systems
  12. Word meanings; sparse vs. dense representations of words
  13. N-grams and neural models for language modeling and sequence processing
  14. Machine translation: sequence-to-sequence models and attention
  15. Transformers and self-attention
  16. Biases vs. fairness and rationality in NLP models

Homework assignments:
Background reading materials:
Supplemental readings:
Tools and packages: