Semester 7 · Year 4 · Odd · Core Subject · ★★★★★ Hard
CS 703

Natural Language Processing

Study of text processing, language models, transformers, sentiment analysis, machine translation, and NLP applications.

4 Units
28 Topics
4 Credits
60h Lecture hrs
100 Max marks
Overview
🎯
Why it matters
ChatGPT, Google Translate, Siri, Alexa — all NLP. Understanding tokenization, transformers, BERT, GPT is essential for building conversational AI, chatbots, search engines, and text analytics.
💼
Placement relevance
NLP Engineer roles at Google, Microsoft, OpenAI. Chatbot developers. Search ranking teams. ₹35-70 LPA for NLP specialists. Demand has surged since the ChatGPT boom.
🔗
Prerequisites for
Conversational AI · Chatbot Development · Machine Translation · Text Analytics · Voice Assistants · LLM Fine-tuning
📚
Recommended books
Speech and Language Processing by Jurafsky and Martin · Natural Language Processing with Python by Steven Bird · Natural Language Processing in Action by Hobson Lane · Transformers for Natural Language Processing by Denis Rothman
Curriculum — 4 Units
U1
Unit 1 · 7 Topics
Text Processing & Basics
Key Formulae
TF-IDF: TF-IDF = TF(t,d) × log(N/DF(t))
Word2Vec: CBOW (context→word) vs Skip-gram (word→context)
Tokenization
Stemming & Lemmatization
Stop Words Removal
Bag of Words (BoW)
TF-IDF
Word Embeddings (Word2Vec, GloVe)
N-grams
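The Unit 1 TF-IDF formula can be sketched in a few lines of Python. Raw term counts and the natural log are assumptions here; textbooks also use length-normalized TF or log base 10, so state your convention when computing by hand.

```python
import math

def tf_idf(term, doc, corpus):
    """TF-IDF(t, d) = TF(t, d) × log(N / DF(t)).

    Assumes raw term frequency and natural log; common variants
    normalize TF by document length or use log base 10.
    """
    tf = doc.count(term)                      # term frequency in this doc
    df = sum(1 for d in corpus if term in d)  # number of docs containing term
    return tf * math.log(len(corpus) / df)    # rare terms get higher weight

corpus = [
    ["machine", "learning"],      # Doc1
    ["machine", "intelligence"],  # Doc2
    ["deep", "learning"],         # Doc3
]
score = tf_idf("machine", corpus[0], corpus)  # 1 × ln(3/2) ≈ 0.405
```

Note that 'machine' scores 0 in Doc3 (TF = 0), and a term appearing in every document would also score 0 because log(N/N) = 0.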
U2
Unit 2 · 7 Topics
Language Models & Sequence Processing
Key Formulae
Language Model: P(w₁, w₂, ..., wₙ) = ∏ P(wᵢ | w₁...wᵢ₋₁)
Attention: Context vector = weighted sum of encoder states
N-gram Language Models
RNN for NLP
LSTM for Text
Sequence-to-Sequence Models
Encoder-Decoder Architecture
Attention Mechanism
Beam Search
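The chain-rule factorization in the Unit 2 formulae, truncated to a bigram context, can be estimated directly from counts. The toy corpus and `<s>` start-of-sentence padding below are illustrative, and no smoothing is applied, so unseen bigrams would get probability zero.

```python
from collections import Counter

def bigram_prob(sentence, corpus):
    """P(w₁..wₙ) ≈ ∏ P(wᵢ | wᵢ₋₁), with MLE counts and no smoothing."""
    unigrams, bigrams = Counter(), Counter()
    for sent in corpus:
        tokens = ["<s>"] + sent               # pad with a start symbol
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    prob = 1.0
    tokens = ["<s>"] + sentence
    for prev, cur in zip(tokens, tokens[1:]):
        prob *= bigrams[(prev, cur)] / unigrams[prev]  # count ratio = MLE
    return prob

corpus = [["i", "like", "nlp"], ["i", "like", "ai"]]
bigram_prob(["i", "like", "nlp"], corpus)  # (2/2)·(2/2)·(1/2) = 0.5
```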
U3
Unit 3 · 7 Topics
Transformers & Modern NLP
Key Formulae
Self-Attention: Attention(Q,K,V) = softmax(QKᵀ/√d_k)V
BERT: Masked Language Modeling + Next Sentence Prediction
Self-Attention Mechanism
Transformer Architecture
BERT (Bidirectional Encoder)
GPT (Generative Pre-trained Transformer)
Fine-tuning Pre-trained Models
Transfer Learning in NLP
Hugging Face Transformers
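The Unit 3 self-attention formula, softmax(QKᵀ/√d_k)V, can be sketched with plain Python lists for clarity; a real implementation uses tensor libraries, batching, and multiple heads.

```python
import math

def softmax(xs):
    m = max(xs)                               # subtract max for stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention over lists of row vectors."""
    d_k = len(K[0])                           # key dimension
    out = []
    for q in Q:
        # score each key against this query, scaled by sqrt(d_k)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in K]
        weights = softmax(scores)             # attention weights sum to 1
        # output = weighted sum of value vectors
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

Q = K = V = [[1.0, 0.0], [0.0, 1.0]]          # toy 2-token, 2-dim example
result = attention(Q, K, V)
```

With identical Q, K, V, each token attends most strongly to itself, so the first output row leans toward the first value vector.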
U4
Unit 4 · 7 Topics
NLP Applications
Key Formulae
NER: Sequence tagging with BIO tags (Begin, Inside, Outside)
Sentiment: Classify as Positive, Negative, or Neutral
Sentiment Analysis
Named Entity Recognition (NER)
Machine Translation
Text Summarization
Question Answering
Chatbots & Dialogue Systems
Topic Modeling (LDA)
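Recovering entities from BIO tags (the sequence-tagging scheme in the Unit 4 formulae) reduces to grouping B-/I- runs. The tokens and tag labels below are illustrative; a real NER model would predict the tag sequence.

```python
def extract_entities(tokens, tags):
    """Group BIO-tagged tokens into (entity_text, label) spans."""
    entities, current, label = [], [], None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):              # Begin: flush, start new span
            if current:
                entities.append((" ".join(current), label))
            current, label = [tok], tag[2:]
        elif tag.startswith("I-") and current: # Inside: extend current span
            current.append(tok)
        else:                                  # Outside: flush any open span
            if current:
                entities.append((" ".join(current), label))
            current, label = [], None
    if current:                                # flush span at end of sentence
        entities.append((" ".join(current), label))
    return entities

tokens = ["Sundar", "Pichai", "leads", "Google"]
tags   = ["B-PER", "I-PER", "O", "B-ORG"]
extract_entities(tokens, tags)  # [("Sundar Pichai", "PER"), ("Google", "ORG")]
```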
Previous Year Questions
Unit 1 · 2023 · End Semester · 10 marks
Calculate TF-IDF scores for the word 'machine' in 3 documents. Given: Doc1: 'machine learning', Doc2: 'machine intelligence', Doc3: 'deep learning'. Show all steps.
Unit 3 · 2023 · End Semester · 8 marks
Explain Transformer architecture with self-attention mechanism. How does multi-head attention work? What are the advantages over RNN/LSTM?
Unit 4 · 2022 · End Semester · 6 marks
Design a sentiment analysis pipeline for tweets. Mention preprocessing steps, feature extraction (TF-IDF or embeddings), and classification algorithm.
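A minimal sketch of the pipeline this question asks for (preprocessing → features → classification). The tiny word lists below are stand-ins for the weights a trained classifier would learn from labeled tweets; in the exam, name a real feature extractor (TF-IDF or embeddings) and classifier (Naive Bayes, logistic regression) instead.

```python
import re

POSITIVE = {"love", "great", "good"}   # toy lexicon; a trained model
NEGATIVE = {"hate", "bad", "awful"}    # learns such weights from data

def preprocess(tweet):
    """Lowercase, strip URLs/mentions, tokenize into words."""
    tweet = re.sub(r"https?://\S+|@\w+", "", tweet.lower())
    return re.findall(r"[a-z']+", tweet)

def classify(tweet):
    """Score tokens against the lexicon → Positive/Negative/Neutral."""
    tokens = preprocess(tweet)
    score = (sum(t in POSITIVE for t in tokens)
             - sum(t in NEGATIVE for t in tokens))
    return "Positive" if score > 0 else "Negative" if score < 0 else "Neutral"

classify("I love this phone! @user https://t.co/x")  # "Positive"
```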
Exam Strategy
🔢
TF-IDF calculations
Practice TF-IDF computations with 2-3 documents. Show term frequency, document frequency, final TF-IDF score. Common exam question.
🔄
Transformers are key
Attention mechanism, multi-head attention, BERT vs GPT comparison. Draw Transformer architecture diagram. Explain positional encoding.
💡
Real applications
Sentiment analysis, NER, chatbots — explain with pipeline diagrams. Preprocessing → Feature extraction → Model → Output. Give examples.
Related Subjects
Semester 7
Deep Learning
CS 701
Semester 5
Machine Learning
CS 501
Semester 6
Artificial Intelligence
CS 601