Course Information

Course Description

This graduate course covers deep learning for natural language processing, along with recent research in the area. It mainly consists of three parts:

Textbooks and Materials

Schedule

Each week's entry below lists the date, topic, assignment, papers / recommended readings, and presenting group(s).
Week 1 (Jan 20)
Topic: Introduction

Week 2 (Jan 27)
Topic: Text Classification, Neural Networks; Backpropagation [slides] [slides]
Readings: J&M 7 (7.1-7.4), Primer; J&M 7, Intro to Computation Graphs

Week 3 (Feb 3)
Topic: Word Representations [slides]; RNNs, Seq2Seq, Attention [slides]
Assignment: a1 out
Readings: J&M 6, word2vec explained (see the short word-vector sketch after this entry); J&M 9, J&M 10 (10.2, 10.3), Luong 15

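As a companion to the word-representation readings above, here is a minimal sketch of training skip-gram word vectors and querying nearest neighbors. It assumes the gensim library, which is not part of the course materials, and it is unrelated to the a1 starter code.

    # Hypothetical word2vec example (not the a1 starter code); assumes gensim is installed.
    from gensim.models import Word2Vec

    # Toy corpus: each sentence is a list of tokens.
    corpus = [
        ["the", "king", "rules", "the", "kingdom"],
        ["the", "queen", "rules", "the", "kingdom"],
        ["dogs", "and", "cats", "are", "animals"],
    ]

    # Train skip-gram vectors (sg=1); the hyperparameters here are purely illustrative.
    model = Word2Vec(corpus, vector_size=50, window=2, min_count=1, sg=1, epochs=200)

    # Inspect a learned vector and its nearest neighbors by cosine similarity.
    print(model.wv["king"][:5])
    print(model.wv.most_similar("king", topn=3))
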
Week 4 (Feb 10)
Topic: (continued from Week 3)

Week 5 (Feb 17)
Topic: (Optional) PyTorch & Transformers Tutorial; Self-attention, Transformers [slides] (see the attention sketch after this entry)
Assignment: quiz; a1 due, a2 out
Note: If you have attended the tutorial in CS6320, or have a background in the frameworks (i.e., PyTorch and Hugging Face Transformers), you can skip this lecture.
Readings: (Blog) The Illustrated Transformer (Jay Alammar); (Blog) The Annotated Transformer (Sasha Rush); Original Paper

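To accompany the self-attention readings above, here is a minimal sketch of scaled dot-product self-attention, assuming PyTorch (the framework covered in the optional tutorial). It is an illustration only, not the course's tutorial code.

    # Illustrative scaled dot-product self-attention (single head, no masking).
    import torch
    import torch.nn.functional as F

    def self_attention(x, w_q, w_k, w_v):
        """x: (seq_len, d_model); w_q, w_k, w_v: (d_model, d_k) projection matrices."""
        q, k, v = x @ w_q, x @ w_k, x @ w_v                    # queries, keys, values
        scores = q @ k.transpose(-2, -1) / q.size(-1) ** 0.5    # scaled similarity of queries to keys
        weights = F.softmax(scores, dim=-1)                     # attention distribution over positions
        return weights @ v                                      # weighted sum of values

    x = torch.randn(4, 8)                                       # 4 tokens, d_model = 8
    w_q, w_k, w_v = (torch.randn(8, 8) for _ in range(3))
    print(self_attention(x, w_q, w_k, w_v).shape)               # torch.Size([4, 8])
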
Week 6 (Feb 24)
Topic: Self-attention, Transformers (continued); Pretrained Language Models (PLMs) [slides]
Readings: (Blog post) Generalized Language Models (2019); Pre-trained Models for Natural Language Processing: A Survey (Liu et al 2019); Percy Liang's introduction to LLMs

Week 7 (Mar 3)
Topic: Paper Presentations × 2 (PLMs); DL for NLP applications (QA, NLG) [slides] [slides]
Papers:
- Encoder-only models: BERT, ELECTRA
- Encoder-decoder models: T5, mT5; (Optional) FLAN, T0, Scaling Instruction-Finetuned Language Models (FLAN)
Groups: Group 17, Group 20
Readings: Hugging Face Datasets, Papers with Code

Week 8 (Mar 10)
Topic: Project proposal presentations × 21. Check our students' awesome proposal presentations!
Assignment: a2 due
Groups: Group 5, Group 6, Group 7, Group 8, Group 9, Group 10, Group 11, Group 12, Group 13, Group 14, Group 15, Group 16, Group 17, Group 18, Group 19, Group 20, Group 21, Group 22, Group 23, Group 24

Week 9: Spring Break!

Week 10 (Mar 24)
Topic: Paper Presentation × 1 (PLMs); Prompting, In-context Learning (PL, RZ) [slides] (see the prompt-construction sketch after this entry)
Assignment: project proposal report due
Papers:
- Decoder-only models: GPT-2, GPT-3 (OpenAI); (Optional) PaLM, OPT; (Optional) How does GPT Obtain its Ability? (blog post)
Group: Group 11
Readings: Survey: Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing (Liu et al 2021)

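To make the prompting and in-context-learning topic concrete, here is a minimal sketch of how a few-shot prompt is assembled from labeled demonstrations. The task, labels, and wording are invented for illustration and are not taken from the course materials.

    # Few-shot (in-context learning) prompt construction; the task and examples are made up.
    demonstrations = [
        ("the movie was a delight", "positive"),
        ("a tedious, joyless film", "negative"),
    ]
    test_review = "surprisingly sharp writing and great pacing"

    prompt = "Classify the sentiment of each review as positive or negative.\n\n"
    for review, label in demonstrations:
        prompt += f"Review: {review}\nSentiment: {label}\n\n"
    prompt += f"Review: {test_review}\nSentiment:"

    # The assembled string is sent to a frozen language model; its continuation is the prediction.
    print(prompt)
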
Week 11 (Mar 31)
Topic: Paper Presentations × 3 (Prompting)
Papers:
- Prompting for few-shot learning: Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference (Schick and Schütze 2021); Making Pre-trained Language Models Better Few-shot Learners (Gao et al 2021)
- Prompting as parameter-efficient fine-tuning: Prefix-Tuning: Optimizing Continuous Prompts for Generation (Li and Liang 2021)
Groups: Group 16, Group 7, Group 12
Topic (continued): Paper Presentations (In-context Learning)
Papers:
- In-context learning: Rethinking the Role of Demonstrations: What Makes In-Context Learning Work? (Min et al. 2021); What Makes Good In-Context Examples for GPT-3? (Liu et al 2021)
Groups: Group 10, Group 14

Week 12 (Apr 7)
Topic: Interpretability, Explainability, Model Analysis (RZ); Robustness (RZ) [slides] [slides]; Reasoning (D, CoT) (see the chain-of-thought sketch after this entry)
Readings:
- (Optional) Probing Classifiers: Promises, Shortcomings, and Advances
- (Optional) ACL 2020 Tutorial
- CoT: Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
- Logical Reasoning over Natural Language as Knowledge Representation: A Survey
- Table 2 in https://arxiv.org/pdf/2302.00923.pdf
- Table 6 in A Survey of Deep Learning for Mathematical Reasoning

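To accompany the chain-of-thought reading above, here is a minimal sketch contrasting a standard few-shot demonstration with a chain-of-thought demonstration. The example problems and wording are invented, not taken from the paper.

    # Chain-of-thought vs. standard few-shot demonstrations; all text below is invented.
    standard_demo = (
        "Q: There are 3 boxes with 4 pens in each box. How many pens are there?\n"
        "A: 12\n"
    )
    cot_demo = (
        "Q: There are 3 boxes with 4 pens in each box. How many pens are there?\n"
        "A: Each box has 4 pens and there are 3 boxes, so 3 * 4 = 12. The answer is 12.\n"
    )
    question = "Q: A shelf holds 5 rows of 6 books, and 2 books are removed. How many books remain?\nA:"

    # CoT prompting swaps in demonstrations whose answers spell out intermediate reasoning,
    # which encourages the model to generate a reasoning chain before its final answer.
    cot_prompt = cot_demo + "\n" + question
    print(cot_prompt)
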
Week 13 (Apr 14)
Topic: Paper Presentations × 3 (Interpretability, RZ, D)
Papers:
- Attention: Attention is not Explanation
- Probing: A Structural Probe for Finding Syntax in Word Representations
- Causal Inference: Causal Mediation Analysis for Interpreting Neural NLP: The Case of Gender Bias (presented on Mar 31)
Groups: Group 21, Group 8, Group 18
Topic (continued): Paper Presentations × 3 (Robustness, RZ)
Papers:
- Motivation: Explaining and Harnessing Adversarial Examples; Adversarial Examples for Evaluating Reading Comprehension Systems
- Design: Universal Adversarial Triggers for Attacking and Analyzing NLP; (Optional) Evaluating Models’ Local Decision Boundaries via Contrast Sets
- Improve: Extracting or Guessing? Improving Faithfulness of Event Temporal Relation Extraction (to be presented on Apr 21)
Groups: Group 19, Group 22, Group 9

Week 14 (Apr 21)
Topic: Vision-Language Models (RZ) [slides]; Paper Presentations × 3 (Reasoning)
Readings:
- (Optional) Blog post: Generalized Visual Language Models
- Do As I Can, Not As I Say: Grounding Language in Robotic Affordances
- Flamingo: a Visual Language Model for Few-Shot Learning
- Multimodal Chain-of-Thought Reasoning in Language Models
Papers:
- Zero-shot: Large Language Models are Zero-Shot Reasoners (Kojima et al 2022)
- Self-consistency: Self-Consistency Improves Chain of Thought Reasoning in Language Models (Wang et al 2022); (Optional) Complexity-Based Prompting for Multi-Step Reasoning (Fu et al. 2022)
- Faithful: Faithful Reasoning Using Large Language Models; (Optional) Selection-Inference: Exploiting Large Language Models for Interpretable Logical Reasoning (DeepMind)
Groups: Group 5, Group 6, Group 13

Week 15 (Apr 28)
Topic: Paper Presentations × 3 (VL)
Papers:
- Visual Transformer (ViT): An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (2020); (Optional) Pretraining: VisualBERT: A Simple and Performant Baseline for Vision and Language (2019)
- Grounding: Do As I Can, Not As I Say: Grounding Language in Robotic Affordances
- Contrastive Pretraining (CLIP): Learning Transferable Visual Models From Natural Language Supervision (2021); (Optional, one application) ClipCap: CLIP Prefix for Image Captioning; (Optional) ViT to VisualBERT to CLIP (slides 1-7)
Groups: Group 15, Group 24, Group 23
Topic (continued): Final Presentations × 6 (8 min talk + 2 min Q&A). Check our students' awesome final presentations!
Groups: Group 24, Group 23, Group 22, Group 21, Group 20, Group 19

Week 16 (May 5)
Topic: Final Presentations × 15 (8 min talk + 2 min Q&A)
Assignment: project final report due
Groups: Group 18, Group 17, Group 16, Group 15, Group 14, Group 13, Group 12, Group 11, Group 10, Group 9, Group 8, Group 7, Group 6, Group 5