Course Schedule
Esfand 99
Session | Date | Topic |
---|---|---|
1 | 3 Esfand | Introduction |
2 | 5 Esfand | Word representations (Distributional semantics, co-occurrence matrix, dimensionality reduction and SVD, language models) Readings: [cs224n-1][cs224n-1-notes] |
3 | 10 Esfand | Word embeddings (Word2vec, GloVe) Readings: [cs224n-1][cs224n-1-notes] |
4 | 12 Esfand | Word embeddings (Evaluation, cross-lingual space, ambiguity and sense embeddings) Readings: [cs224n-2] [cs224n-2-notes] |
5 | 17 Esfand | Word embeddings (Sub-word embeddings, retrofitting, debiasing) Readings: [nn4nlp2021] |
6 | 19 Esfand | Text classification and regression Readings: [info256-5][info256-6] |
7 | 24 Esfand | Language modeling (n-gram, probability computation, back-off and interpolation, sparsity and smoothing, feedforward NN for LM) Readings: [cs224n-5][Voita-LM] |
8 | 26 Esfand | Language modeling with RNNs (backprop through time, text generation, perplexity, sampling with temperature) Readings: [cs224n-5][Voita-LM] |
Farvardin 00
Session | Date | Topic |
---|---|---|
9 | 15 Farvardin | Vanishing/exploding gradients and fancy RNNs (LSTMs, bidirectional and stacked RNNs) Readings: [cs224n-6] [cs224n-6-notes] |
10 | 17 Farvardin | Machine Translation (SMT, NMT, seq2seq models, beam-search decoding, evaluation) Readings: [cs224n-7] [cs224n-7-notes] |
11 | 22 Farvardin | Paper discussion on RNNs |
12 | 24 Farvardin | Attention mechanism (seq2seq attention, attention variants, hierarchical attention networks) Readings: [cs224n-7] [cs224n-7-notes] |
13 | 29 Farvardin | Progress Report I |
14 | 31 Farvardin | Word senses and contextualization (skipped) |
Ordibehesht 00
Session | Date | Topic |
---|---|---|
15 | 5 Ordibehesht | Transformers (BERT model, self-attention, multi-head attention, positional encoding, contextualised embeddings, BERT variants) Readings: [slides] [cs224n-9] |
16 | 7 Ordibehesht | More about Transformers and Pretraining (subwords, byte-pair encoding, pretrain/finetune, architecture types: decoders, encoders, and encoder-decoders) Readings: [cs224n-10] |
17 | 12 Ordibehesht | Paper discussion on Transformers |
18 | 19 Ordibehesht | *Isotropicity of Semantic Spaces (Rajaee) Readings: [slides] |
19 | 21 Ordibehesht | Question Answering (reading comprehension, SQuAD, LSTM-based and BERT models, BiDAF, open-domain QA) Readings: [cs224n-11] |
20 | 26 Ordibehesht | Progress Report II |
21 | 28 Ordibehesht | *LM-based Word Sense Disambiguation (Rezaee) Readings: [slides] |
Khordad 00
Session | Date | Topic |
---|---|---|
22 | 2 Khordad | *Interpretability (Modaressi & Mohebbi) Readings: [slides] |
23 | 4 Khordad | *Dialogue (Pourdabiri) Readings: [slides] |
24 | 9 Khordad | Integrating knowledge in language models (knowledge-aware LMs, entity embedding, ERNIE, memory-based models, KGLM, kNN-LM, modified training, WKLM, evaluation, prompting) Readings: [cs224n-15] |
25 | 11 Khordad | Neural Language Generation (applications, maximum likelihood training, teacher forcing, greedy and random sampling, top-k and nucleus sampling, unlikelihood training, exposure bias, evaluating NLG, bias and ethical concerns) Readings: [cs224n-12] |
26 | 18 Khordad | *Zero-shot applications of Cloze tests (Tabasi) Readings: [slides] |
27 | 23 Khordad | Paper discussion on knowledge-enhanced models |
28 | 25 Khordad | Progress Report III |