Week 1 (Jan 20)
  Lecture: Introduction
Week 2 (Jan 27)
  Lecture 1: Text Classification, Neural Networks [slides]
  Lecture 2: Backpropagation [slides]
  Readings: J&M 7 (7.1–7.4), Primer; J&M 7, Intro to Computation Graphs
Week 3 (Feb 3)
  Lecture 1: Word Representations [slides]
  Lecture 2: RNNs, Seq2Seq, Attention [slides]
  Out: a1
  Readings: J&M 6, word2vec explained; J&M 9, J&M 10 (10.2, 10.3), Luong 2015
Week 4 (Feb 10)
  Lecture: (continued from previous week)
Week 5 (Feb 17)
  Lecture 1: (Optional) PyTorch & Transformers Tutorial
  Quiz: quiz
  Note: If you have attended the tutorial in CS6320, or have a background in these frameworks (i.e., PyTorch and Hugging Face), you may skip this lecture.
  Lecture 2: Self-attention, Transformers [slides]
  Due/Out: a1 due, a2 out
  Readings: (Blog) The Illustrated Transformer (Jay Alammar); (Blog) The Annotated Transformer (Sasha Rush); the original paper
Week 6 (Feb 24)
  Lecture 1: (continued from previous week)
  Lecture 2: Pretrained Language Models (PLMs) [slides]
  Readings: (Blog) Generalized Language Models (2019); Pre-trained Models for Natural Language Processing: A Survey (Qiu et al. 2020); Percy Liang's introduction to LLMs
Week 7 (Mar 3)
  Session: Paper Presentations × 2 (PLMs)
    Papers:
      - Encoder-only models: BERT, ELECTRA
      - Encoder-decoder models: T5, mT5
      - (Optional) FLAN, T0, Scaling Instruction-Finetuned Language Models (FLAN)
    Presenters: Group 17, Group 20
  Lecture: DL for NLP Applications (QA, NLG) [slides] [slides]
    Readings: Hugging Face Datasets, Papers with Code
Week 8 (Mar 10)
  Session: Project Proposals × 21: check out our students' awesome proposal presentations! ➡️
  Due: a2
  Presenters: Group 5, Group 6, Group 7, Group 8, Group 9, Group 10, Group 11, Group 12, Group 13, Group 14, Group 15, Group 16, Group 17, Group 18, Group 19, Group 20, Group 21, Group 22, Group 23, Group 24
Week 9
  Spring Break!
Week 10 (Mar 24)
  Session: Paper Presentation × 1 (PLMs)
    Papers:
      - Decoder-only models: GPT-2, GPT-3 (OpenAI)
      - (Optional) PaLM, OPT
      - (Optional) How does GPT Obtain its Ability? (blog post)
    Presenter: Group 11
  Lecture: Prompting, In-context Learning (PL, RZ) [slides]
    Due: project proposal report
    Readings: Survey: Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing (Liu et al. 2021)
Week 11 (Mar 31)
  Session 1: Paper Presentations × 3 (Prompting)
    Papers:
      - Prompting for few-shot learning: Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference (Schick and Schütze 2021)
      - Making Pre-trained Language Models Better Few-shot Learners (Gao et al. 2021)
      - Prompting as parameter-efficient fine-tuning: Prefix-Tuning: Optimizing Continuous Prompts for Generation (Li and Liang 2021)
    Presenters: Group 16, Group 7, Group 12
  Session 2: Paper Presentations (Prompting, continued)
    Papers:
      - In-context learning: Rethinking the Role of Demonstrations: What Makes In-Context Learning Work? (Min et al. 2022)
      - What Makes Good In-Context Examples for GPT-3? (Liu et al. 2021)
    Presenters: Group 10, Group 14
Week 12 (Apr 7)
  Lecture 1: Interpretability, Explainability, Model Analysis (RZ); Robustness (RZ) [slides] [slides]
    Readings: (Optional) Probing Classifiers: Promises, Shortcomings, and Advances; (Optional) ACL 2020 Tutorial
  Lecture 2: Reasoning (D, CoT)
    Readings:
      - CoT: Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
      - Logical Reasoning over Natural Language as Knowledge Representation: A Survey
      - Table 2 in https://arxiv.org/pdf/2302.00923.pdf
      - Table 6 in A Survey of Deep Learning for Mathematical Reasoning
Week 13 (Apr 14)
  Session 1: Paper Presentations × 3 (Interpretability; RZ, D)
    Papers:
      - Attention: Attention is not Explanation
      - Probing: A Structural Probe for Finding Syntax in Word Representations
      - Causal inference: Causal Mediation Analysis for Interpreting Neural NLP: The Case of Gender Bias (presented on Mar 31)
    Presenters: Group 21, Group 8, Group 18
  Session 2: Paper Presentations × 3 (Robustness; RZ)
    Papers:
      - Motivation: Explaining and Harnessing Adversarial Examples; Adversarial Examples for Evaluating Reading Comprehension Systems
      - Design: Universal Adversarial Triggers for Attacking and Analyzing NLP
      - (Optional) Evaluating Models' Local Decision Boundaries via Contrast Sets
      - Improvement: Extracting or Guessing? Improving Faithfulness of Event Temporal Relation Extraction (to be presented on Apr 21)
    Presenters: Group 19, Group 22, Group 9
Week 14 (Apr 21)
  Lecture: Vision-Language Models (RZ) [slides]
    Readings:
      - (Optional) Blog post: Generalized Visual Language Models
      - Do As I Can, Not As I Say: Grounding Language in Robotic Affordances
      - Flamingo: a Visual Language Model for Few-Shot Learning
      - Multimodal Chain-of-Thought Reasoning in Language Models
  Session: Paper Presentations × 3 (Reasoning)
    Papers:
      - Zero-shot: Large Language Models are Zero-Shot Reasoners (Kojima et al. 2022)
      - Self-consistency: Self-Consistency Improves Chain of Thought Reasoning in Language Models (Wang et al. 2022)
      - (Optional) Complexity-Based Prompting for Multi-Step Reasoning (Fu et al. 2022)
      - Faithfulness: Faithful Reasoning Using Large Language Models
      - (Optional) Selection-Inference: Exploiting Large Language Models for Interpretable Logical Reasoning (DeepMind)
    Presenters: Group 5, Group 6, Group 13
Week 15 (Apr 28)
  Session 1: Paper Presentations × 3 (Vision-Language)
    Papers:
      - Vision Transformer (ViT): An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (2020)
      - (Optional) Pretraining: VisualBERT: A Simple and Performant Baseline for Vision and Language (2019)
      - Grounding: Do As I Can, Not As I Say: Grounding Language in Robotic Affordances
      - Contrastive pretraining (CLIP): Learning Transferable Visual Models from Natural Language Supervision (2021)
      - (Optional, one application) ClipCap: CLIP Prefix for Image Captioning
      - (Optional) ViT to VisualBERT to CLIP (slides 1–7)
    Presenters: Group 15, Group 24, Group 23
  Session 2: Final Presentations × 6: check out our students' awesome final presentations! ➡️
    Presenters: Group 24, Group 23, Group 22, Group 21, Group 20, Group 19
    Format: 8 min talk + 2 min Q&A
Week 16 (May 5)
  Session: Final Presentations × 15
    Due: project final report
    Presenters: Group 18, Group 17, Group 16, Group 15, Group 14, Group 13, Group 12, Group 11, Group 10, Group 9, Group 8, Group 7, Group 6, Group 5
    Format: 8 min talk + 2 min Q&A