Deep Learning
A comprehensive exploration of neural networks, covering foundational architectures like MLPs and advanced models including CNNs, RNNs, and Transformers.
This course provides a deep dive into the historical evolution and modern application of neural networks. We begin with the mathematical foundations of the McCulloch-Pitts neuron and the Perceptron, progressing to the representation power of Multi-Layer Perceptrons and the mechanics of Backpropagation. The curriculum covers a wide array of optimization strategies, including modern variants such as Adam and NAdam, alongside regularization techniques such as Dropout and Batch Normalization. Students will gain expertise in state-of-the-art architectures, from Convolutional Neural Networks (ResNet, Inception) to sequence models such as LSTMs and the transformative Attention mechanism.
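As a concrete taste of the optimization material, below is a minimal NumPy sketch of the Adam update rule as published by Kingma & Ba (2015); the toy quadratic objective, learning rate, and step count are illustrative assumptions rather than course code.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: exponential moment estimates plus bias correction."""
    m = beta1 * m + (1 - beta1) * grad           # first moment (mean of gradients)
    v = beta2 * v + (1 - beta2) * grad ** 2      # second moment (uncentered variance)
    m_hat = m / (1 - beta1 ** t)                 # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)                 # bias-corrected second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy objective (an assumption for illustration): f(theta) = ||theta||^2,
# whose gradient is 2 * theta, so the minimizer is the zero vector.
theta = np.array([3.0, -2.0])
m = np.zeros_like(theta)
v = np.zeros_like(theta)
for t in range(1, 2001):
    grad = 2 * theta
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.05)
print(theta)  # approaches [0, 0]
```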
Instructor
Prof. Mitesh M. Khapra, Department of Computer Science and Engineering, IIT Madras
Course Schedule & Topics
The course is structured over 12 weeks, transitioning from foundational theory to complex deep learning architectures.
| Week | Primary Focus | Key Topics Covered |
|---|---|---|
| 1 | Foundations | History of DL, the McCulloch-Pitts Neuron, and the Perceptron Learning Algorithm (sketched below the table). |
| 2 | MLPs & Gradient Descent | Multilayer Perceptrons, Sigmoid Neurons, and the basics of Gradient Descent. |
| 3 | Feedforward Networks | Representation Power and the Backpropagation algorithm. |
| 4 | Optimization Algorithms | Momentum, Nesterov, Adagrad, RMSProp, Adam, and Learning Rate Schedulers. |
| 5 | Unsupervised Learning | Autoencoders, PCA relations, Denoising, and Contractive Autoencoders. |
| 6 | Regularization & Bias | Bias-Variance Tradeoff, L2 Regularization, Data Augmentation, and Dropout. |
| 7 | Training Improvements | Activation functions (ReLU, etc.), Weight Initialization, and Batch Normalization. |
| 8 | Convolutional Networks (CNN) | Learning Vectorial Representations of Words; LeNet, AlexNet, VGGNet, and ResNet architectures. |
| 9 | CNN Visualization | Guided Backpropagation, Deep Dream, Deep Art, and Adversarial Attacks (Fooling CNNs). |
| 10 | Recurrent Neural Networks (RNN) | BPTT, Vanishing and Exploding Gradients, and Truncated BPTT. |
| 11 | Gated RNNs (LSTM/GRU) | Gated Recurrent Units, LSTM Cells, and overcoming the vanishing gradient problem. |
| 12 | Advanced Sequence Models | Encoder-Decoder Models, Attention Mechanisms (also sketched below), and an introduction to Transformers. |
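To make Week 1's Perceptron Learning Algorithm concrete, here is a minimal NumPy sketch; the OR-gate training data, the {+1, -1} label convention, and the epoch budget are assumptions chosen for illustration.

```python
import numpy as np

def perceptron_train(X, y, epochs=100):
    """Perceptron learning: add misclassified positives, subtract misclassified
    negatives. X: (n, d) inputs; y: labels in {+1, -1}."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # absorb the bias into w
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        errors = 0
        for xi, yi in zip(Xb, y):
            if yi * (w @ xi) <= 0:   # misclassified (or on the boundary)
                w += yi * xi         # the classic perceptron update
                errors += 1
        if errors == 0:              # converged: the data is separated
            break
    return w

# Toy linearly separable data (an assumption): the logical OR function.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([-1, 1, 1, 1])
w = perceptron_train(X, y)
print(np.sign(np.hstack([X, np.ones((4, 1))]) @ w))  # matches y
```

By the perceptron convergence theorem, the loop above terminates for any linearly separable dataset.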
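The Attention mechanism from Week 12 similarly reduces to a few lines of linear algebra. The sketch below implements scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V, as defined by Vaswani et al. (2017); the matrix shapes and random inputs are illustrative assumptions.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                          # query-key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                                       # weighted sum of values

# Toy shapes (assumptions): 3 queries attending over 4 key/value positions, d_k = 8.
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 8): one context vector per query
```

In a full Transformer, Q, K, and V are produced by learned linear projections of the token representations; the sketch omits those projections to isolate the attention computation itself.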
Material Used
- *Deep Learning* by Ian Goodfellow, Yoshua Bengio, and Aaron Courville (MIT Press).