Notebooks, Website, GitHub, and Forum
Installing the Deep Learning Framework and the d2l Package
Downloading and Running the Code
Probability and Information Theory
One point one A Motivating Example
One point two Key Components
One point two point two Models
One point two point three Objective Functions
One point two point four Optimization Algorithms
One point three Kinds of Machine Learning Problems
One point three point one Supervised Learning
One point three point two Unsupervised and Self-Supervised Learning
One point three point three Interacting with an Environment
One point three point four Reinforcement Learning
One point five The Road to Deep Learning
One point six Success Stories
One point seven The Essence of Deep Learning
One point nine Exercises
Two point one Data Manipulation
Two point one point one Getting Started
Two point one point two Indexing and Slicing
Two point one point three Operations
Two point one point four Broadcasting
Two point one point five Saving Memory
Two point one point six Conversion to Other Python Objects
Two point one point seven Summary
Two point one point eight Exercises
Two point two Data Preprocessing
Two point two point one Reading the Dataset
Two point two point three Conversion to the Tensor Format
Two point two point five Exercises
Two point three Linear Algebra
Two point three point one Scalars
Two point three point two Vectors
Two point three point three Matrices
Two point three point four Tensors
Two point three point five Basic Properties of Tensor Arithmetic
Two point three point six Reduction
Two point three point seven Non-Reduction Sum
Two point three point eight Dot Products
Two point three point nine Matrix-Vector Products
Two point three point ten Matrix-Matrix Multiplication
Two point three point eleven Norms
Two point three point twelve Discussion
Two point three point thirteen Exercises
Two point four point one Derivatives and Differentiation
Two point four point two Visualization Utilities
Two point four point three Partial Derivatives and Gradients
Two point four point four Chain Rule
Two point four point five Discussion
Two point four point six Exercises
Two point five Automatic Differentiation
Two point five point one A Simple Function
Two point five point two Backward for Non-Scalar Variables
Two point five point three Detaching Computation
Two point five point four Gradients and Python Control Flow
Two point five point five Discussion
Two point five point six Exercises
Two point six Probability and Statistics
Two point six point one A Simple Example: Tossing Coins
Two point six point two A More Formal Treatment
Two point six point three Random Variables
Two point six point four Multiple Random Variables
Two point six point five An Example
Two point six point six Expectations
Two point six point seven Discussion
Two point six point eight Exercises
Two point seven point one Functions and Classes in a Module
Two point seven point two Specific Functions and Classes
Three point one Linear Regression
Three point one point one Basics
Minibatch Stochastic Gradient Descent
Three point one point two Vectorization for Speed
Three point one point three The Normal Distribution and Squared Loss
Three point one point four Linear Regression as a Neural Network
Three point one point five Summary
Three point one point six Exercises
Three point two point one Utilities
Three point two point two Models
Three point two point three Data
Three point two point four Training
Three point two point six Exercises
Three point three Synthetic Regression Data
Three point three point one Generating the Dataset
Three point three point two Reading the Dataset
Three point three point three Concise Implementation of the Data Loader
Three point three point four Summary
Three point three point five Exercises
Three point four Linear Regression Implementation from Scratch
Three point four point one Defining the Model
Three point four point two Defining the Loss Function
Three point four point three Defining the Optimization Algorithm
Three point four point four Training
Three point four point five Summary
Three point four point six Exercises
Three point five point one Defining the Model
Three point five point two Defining the Loss Function
Three point five point three Defining the Optimization Algorithm
Three point five point four Training
Three point five point five Summary
Three point five point six Exercises
Three point six point one Training Error and Generalization Error
Three point six point two Underfitting or Overfitting?
Three point six point three Model Selection
Three point six point four Summary
Three point six point five Exercises
Three point seven Weight Decay
Three point seven point one Norms and Weight Decay
Three point seven point two High-Dimensional Linear Regression
Three point seven point three Implementation from Scratch
Defining L two Norm Penalty
Training without Regularization
Three point seven point four Concise Implementation
Three point seven point five Summary
Three point seven point six Exercises
Four Linear Neural Networks for Classification
Four point one Softmax Regression
Four point one point one Classification
Four point one point two Loss Function
Softmax and Cross-Entropy Loss
Four point one point three Information Theory Basics
Four point one point four Summary and Discussion
Four point one point five Exercises
Four point two point one Loading the Dataset
Four point two point two Reading a Minibatch
Four point two point four Summary
Four point two point five Exercises
Four point three The Base Classification Model
Four point three point one The Classifier Class
Four point three point two Accuracy
Four point three point three Summary
Four point three point four Exercises
Four point four Softmax Regression Implementation from Scratch
Four point four point one The Softmax
Four point four point two The Model
Four point four point three The Cross-Entropy Loss
Four point four point five Prediction
Four point four point six Summary
Four point four point seven Exercises
Four point five Concise Implementation of Softmax Regression
Four point five point one Defining the Model
Four point five point two Softmax Revisited
Four point five point three Training
Four point five point four Summary
Four point five point five Exercises
Four point six Generalization in Classification
Four point six point one The Test Set
Four point six point two Test Set Reuse
Four point six point three Statistical Learning Theory
Four point six point four Summary
Four point six point five Exercises
Four point seven Environment and Distribution Shift
Four point seven point one Types of Distribution Shift
Four point seven point two Examples of Distribution Shift
Nonstationary Distributions
Four point seven point three Correction of Distribution Shift
Covariate Shift Correction
Four point seven point four A Taxonomy of Learning Problems
Four point seven point five Fairness, Accountability, and Transparency in Machine Learning
Four point seven point six Summary
Four point seven point seven Exercises
Five Multilayer Perceptrons
Five point one Multilayer Perceptrons
Five point one point one Hidden Layers
Limitations of Linear Models
Incorporating Hidden Layers
Five point one point two Activation Functions
Five point one point three Summary and Discussion
Five point one point four Exercises
Five point two Implementation of Multilayer Perceptrons
Five point two point one Implementation from Scratch
Initializing Model Parameters
Five point two point two Concise Implementation
Five point two point three Summary
Five point two point four Exercises
Five point three Forward Propagation, Backward Propagation, and Computational Graphs
Five point three point one Forward Propagation
Five point three point two Computational Graph of Forward Propagation
Five point three point three Backpropagation
Five point three point four Training Neural Networks
Five point three point five Summary
Five point three point six Exercises
Five point four Numerical Stability and Initialization
Five point four point one Vanishing and Exploding Gradients
Five point four point two Parameter Initialization
Five point four point three Summary
Five point four point four Exercises
Five point five Generalization in Deep Learning
Five point five point one Revisiting Overfitting and Regularization
Five point five point two Inspiration from Nonparametrics
Five point five point three Early Stopping
Five point five point four Classical Regularization Methods for Deep Networks
Five point five point five Summary
Five point five point six Exercises
Five point six point one Dropout in Practice
Five point six point two Implementation from Scratch
Five point six point three Concise Implementation
Five point six point four Summary
Five point six point five Exercises
Five point seven Predicting House Prices on Kaggle
Five point seven point one Downloading Data
Five point seven point two Kaggle
Five point seven point three Accessing and Reading the Dataset
Five point seven point four Data Preprocessing
Five point seven point five Error Measure
Five point seven point six K-Fold Cross-Validation
Five point seven point seven Model Selection
Five point seven point eight Submitting Predictions on Kaggle
Submitting data to Kaggle
Five point seven point ten Exercises
Six point one Layers and Modules
Six point one point one A Custom Module
Six point one point two The Sequential Module
Six point one point three Executing Code in the Forward Propagation Method
Six point one point four Summary
Six point one point five Exercises
Six point two Parameter Management
Six point two point one Parameter Access
Six point two point two Tied Parameters
Six point two point three Summary
Six point two point four Exercises
Six point three Parameter Initialization
Six point three point one Built-in Initialization
Six point three point two Summary
Six point three point three Exercises
Six point four point one Summary
Six point four point two Exercises
Six point five point one Layers without Parameters
Six point five point three Summary
Six point five point four Exercises
Six point six point one Loading and Saving Tensors
Six point six point three Summary
Six point six point four Exercises
Six point seven point one Computing Devices
Six point seven point two Tensors and GPUs
Six point seven point three Neural Networks and GPUs
Six point seven point four Summary
Six point seven point five Exercises
Seven Convolutional Neural Networks
Seven point one From Fully Connected Layers to Convolutions
Seven point one point one Invariance
Seven point one point two Constraining the MLP
Seven point one point three Convolutions
Seven point one point four Channels
Seven point one point five Summary and Discussion
Seven point one point six Exercises
Seven point two Convolutions for Images
Seven point two point one The Cross-Correlation Operation
Seven point two point two Convolutional Layers
Seven point two point three Object Edge Detection in Images
Seven point two point four Learning a Kernel
Seven point two point five Cross-Correlation and Convolution
Seven point two point six Feature Map and Receptive Field
Seven point two point seven Summary
Seven point two point eight Exercises
Seven point three Padding and Stride
Seven point three point one Padding
Seven point three point two Stride
Seven point three point three Summary and Discussion
Seven point three point four Exercises
Seven point four Multiple Input and Multiple Output Channels
Seven point four point one Multiple Input Channels
Seven point four point two Multiple Output Channels
Seven point four point three One by One Convolutional Layer
Seven point four point four Discussion
Seven point four point five Exercises
Seven point five point one Maximum Pooling and Average Pooling
Seven point five point two Padding and Stride
Seven point five point three Multiple Channels
Seven point five point four Summary
Seven point five point five Exercises
Seven point six Convolutional Neural Networks (LeNet)
Seven point six point one LeNet
Seven point six point two Training
Seven point six point three Summary
Seven point six point four Exercises
Eight Modern Convolutional Neural Networks
Eight point one Deep Convolutional Neural Networks (AlexNet)
Eight point one point one Representation Learning
Missing Ingredient: Hardware
Eight point one point two AlexNet
Capacity Control and Preprocessing
Eight point one point three Training
Eight point one point four Discussion
Eight point one point five Exercises
Eight point two Networks Using Blocks (VGG)
Eight point two point one VGG Blocks
Eight point two point two VGG Network
Eight point two point three Training
Eight point two point four Summary
Eight point two point five Exercises
Eight point three Network in Network (NiN)
Eight point three point one NiN Blocks
Eight point three point two NiN Model
Eight point three point three Training
Eight point three point four Summary
Eight point three point five Exercises
Eight point four Multi-Branch Networks (GoogLeNet)
Eight point four point one Inception Blocks
Eight point four point two GoogLeNet Model
Eight point four point three Training
Eight point four point four Discussion
Eight point four point five Exercises
Eight point five Batch Normalization
Eight point five point one Training Deep Networks
Eight point five point two Batch Normalization Layers
Batch Normalization During Prediction
Eight point five point three Implementation from Scratch
Eight point five point four LeNet with Batch Normalization
Eight point five point five Concise Implementation
Eight point five point six Discussion
Eight point five point seven Exercises
Eight point six point one Function Classes
Eight point six point two Residual Blocks
Eight point six point three ResNet Model
Eight point six point four Training
Eight point six point five ResNeXt
Eight point six point six Summary and Discussion
Eight point six point seven Exercises
Eight point seven Densely Connected Networks (DenseNet)
Eight point seven point one From ResNet to DenseNet
Eight point seven point two Dense Blocks
Eight point seven point three Transition Layers
Eight point seven point five Training
Eight point seven point six Summary and Discussion
Eight point seven point seven Exercises
Eight point eight Designing Convolution Network Architectures
Eight point eight point one The AnyNet Design Space
Eight point eight point two Distributions and Parameters of Design Spaces
Eight point eight point three RegNet
Eight point eight point four Training
Eight point eight point five Discussion
Eight point eight point six Exercises
Nine Recurrent Neural Networks
Nine point one Working with Sequences
Nine point one point one Autoregressive Models
Nine point one point two Sequence Models
Nine point one point three Training
Nine point one point four Prediction
Nine point one point five Summary
Nine point one point six Exercises
Nine point two Converting Raw Text into Sequence Data
Nine point two point one Reading the Dataset
Nine point two point two Tokenization
Nine point two point three Vocabulary
Nine point two point four Putting It All Together
Nine point two point five Exploratory Language Statistics
Nine point two point six Summary
Nine point two point seven Exercises
Nine point three Language Models
Nine point three point one Learning Language Models
Markov Models and n-grams
Nine point three point two Perplexity
Nine point three point three Partitioning Sequences
Nine point three point four Summary and Discussion
Nine point three point five Exercises
Nine point four point one Neural Networks without Hidden States
Nine point four point two Recurrent Neural Networks with Hidden States
Nine point four point three RNN-Based Character-Level Language Models
Nine point four point four Summary
Nine point four point five Exercises
Nine point five Recurrent Neural Network Implementation from Scratch
Nine point five point one RNN Model
Nine point five point two RNN-Based Language Model
Transforming RNN Outputs
Nine point five point three Gradient Clipping
Nine point five point four Training
Nine point five point five Decoding
Nine point five point six Summary
Nine point five point seven Exercises
Nine point six Concise Implementation of Recurrent Neural Networks
Nine point six point one Defining the Model
Nine point six point two Training and Predicting
Nine point six point three Summary
Nine point six point four Exercises
Nine point seven Backpropagation Through Time
Nine point seven point one Analysis of Gradients in RNNs
Nine point seven point two Backpropagation Through Time in Detail
Nine point seven point three Summary
Nine point seven point four Exercises
Ten Modern Recurrent Neural Networks
Ten point one Long Short-Term Memory
Ten point one point one Gated Memory Cell
Input Gate, Forget Gate, and Output Gate
Memory Cell Internal State
Ten point one point two Implementation from Scratch
Initializing Model Parameters
Ten point one point three Concise Implementation
Ten point one point four Summary
Ten point one point five Exercises
Ten point two point one Reset Gate and Update Gate
Implementation from Scratch
Initializing Model Parameters
Ten point three point one Implementation from Scratch
Ten point three point two Concise Implementation
Ten point three point four Exercises
Ten point four Bidirectional Recurrent Neural Networks
Ten point four point one Implementation from Scratch
Ten point four point two Concise Implementation
Ten point four point three Summary
Ten point four point four Exercises
Ten point five Machine Translation and the Dataset
Ten point five point one Downloading and Preprocessing the Dataset
Ten point five point two Tokenization
Ten point five point three Loading Sequences of Fixed Length
Ten point five point four Reading the Dataset
Ten point five point five Summary
Ten point five point six Exercises
Ten point six The Encoder-Decoder Architecture
Ten point six point one Encoder
Ten point six point two Decoder
Ten point six point three Putting the Encoder and Decoder Together
Ten point six point four Summary
Ten point six point five Exercises
Ten point seven Sequence-to-Sequence Learning for Machine Translation
Ten point seven point one Teacher Forcing
Ten point seven point two Encoder
Ten point seven point three Decoder
Ten point seven point four Encoder-Decoder for Sequence-to-Sequence Learning
Ten point seven point five Loss Function with Masking
Ten point seven point six Training
Ten point seven point seven Prediction
Ten point seven point eight Evaluation of Predicted Sequences
Ten point seven point nine Summary
Ten point seven point ten Exercises
Ten point eight Beam Search
Ten point eight point one Greedy Search
Ten point eight point two Exhaustive Search
Ten point eight point three Beam Search
Ten point eight point four Summary
Ten point eight point five Exercises
Eleven Attention Mechanisms and Transformers
Eleven point one Queries, Keys, and Values
Eleven point one point one Visualization
Eleven point one point two Summary
Eleven point one point three Exercises
Eleven point two Attention Pooling by Similarity
Eleven point two point one Kernels and Data
Eleven point two point two Attention Pooling via Nadaraya-Watson Regression
Eleven point two point three Adapting Attention Pooling
Eleven point two point four Summary
Eleven point two point five Exercises
Eleven point three point one Dot Product Attention
Eleven point three point two Convenience Functions
Batch Matrix Multiplication
Eleven point three point three Scaled Dot Product Attention
Eleven point three point four Additive Attention
Eleven point three point five Summary
Eleven point three point six Exercises
Eleven point four point two Defining the Decoder with Attention
Eleven point four point three Training
Eleven point four point four Summary
Eleven point four point five Exercises
Eleven point five point one Model
Eleven point five point two Implementation
Eleven point five point three Summary
Eleven point five point four Exercises
Eleven point six point one Self-Attention
Eleven point six point two Comparing CNNs, RNNs, and Self-Attention
Eleven point six point three Positional Encoding
Absolute Positional Information
Relative Positional Information
Eleven point six point four Summary
Eleven point six point five Exercises
Eleven point seven point one Model
The Transformer architecture.
Eleven point seven point two Positionwise Feed-Forward Networks
Eleven point seven point three Residual Connection and Layer Normalization
Eleven point seven point four Encoder
Eleven point seven point five Decoder
Eleven point seven point six Training
Eleven point seven point seven Summary
Eleven point seven point eight Exercises
Eleven point eight Transformers for Vision
Eleven point eight point one Model
Eleven point eight point two Patch Embedding
Eleven point eight point three Vision Transformer Encoder
Eleven point eight point four Putting It All Together
Eleven point eight point five Training
Eleven point eight point six Summary and Discussion
Eleven point eight point seven Exercises
Eleven point nine Large-Scale Pretraining with Transformers
Eleven point nine point one Encoder-Only
Eleven point nine point two Encoder-Decoder
Eleven point nine point three Decoder-Only
Eleven point nine point four Scalability
Eleven point nine point five Large Language Models
Eleven point nine point six Summary and Discussion
Eleven point nine point seven Exercises
Twelve Optimization Algorithms
Twelve point one Optimization and Deep Learning
Twelve point one point one Goal of Optimization
Twelve point one point two Optimization Challenges in Deep Learning
Twelve point one point three Summary
Twelve point one point four Exercises
Twelve point two Convexity
Twelve point two point one Definitions
Twelve point two point two Properties
Local Minima Are Global Minima
Convexity and Second Derivatives
Twelve point two point three Constraints
Twelve point two point four Summary
Twelve point two point five Exercises
Twelve point three Gradient Descent
Twelve point three point one One-Dimensional Gradient Descent
Twelve point three point three Adaptive Methods
Gradient Descent with Line Search
Twelve point three point four Summary
Twelve point three point five Exercises
Twelve point four Stochastic Gradient Descent
Twelve point four point one Stochastic Gradient Updates
Twelve point four point two Dynamic Learning Rate
Twelve point four point three Convergence Analysis for Convex Objectives
Twelve point four point four Stochastic Gradients and Finite Samples
Twelve point four point five Summary
Twelve point four point six Exercises
Twelve point five Minibatch Stochastic Gradient Descent
Twelve point five point one Vectorization and Caches
Twelve point five point two Minibatches
Twelve point five point three Reading the Dataset
Twelve point five point four Implementation from Scratch
Twelve point five point five Concise Implementation
Twelve point five point six Summary
Twelve point five point seven Exercises
Twelve point six point one Basics
An Ill-conditioned Problem
Twelve point six point two Practical Experiments
Implementation from Scratch
Twelve point six point three Theoretical Analysis
Quadratic Convex Functions
Twelve point six point four Summary
Twelve point six point five Exercises
Twelve point seven Adagrad
Twelve point seven point one Sparse Features and Learning Rates
Twelve point seven point two Preconditioning
Twelve point seven point three The Algorithm
Twelve point seven point four Implementation from Scratch
Twelve point seven point five Concise Implementation
Twelve point seven point seven Exercises
Twelve point eight RMSProp
Twelve point eight point one The Algorithm
Twelve point eight point two Implementation from Scratch
Twelve point eight point three Concise Implementation
Twelve point eight point four Summary
Twelve point eight point five Exercises
Twelve point nine point one The Algorithm
Twelve point nine point two Implementation
Twelve point nine point three Summary
Twelve point nine point four Exercises
Twelve point ten point one The Algorithm
Twelve point ten point two Implementation
Twelve point ten point three Yogi
Twelve point ten point four Summary
Twelve point ten point five Exercises
Twelve point eleven Learning Rate Scheduling
Twelve point eleven point one Toy Problem
Twelve point eleven point two Schedulers
Twelve point eleven point three Policies
Twelve point eleven point four Summary
Twelve point eleven point five Exercises
Thirteen Computational Performance
Thirteen point one Compilers and Interpreters
Thirteen point one point one Symbolic Programming
Thirteen point one point two Hybrid Programming
Thirteen point one point three Hybridizing the Sequential Class
Acceleration by Hybridization
Thirteen point one point four Summary
Thirteen point one point five Exercises
Thirteen point two point one Asynchrony via Backend
Thirteen point two point two Barriers and Blockers
Thirteen point two point four Summary
Thirteen point two point five Exercises
Thirteen point three point one Parallel Computation on GPUs
Thirteen point three point two Parallel Computation and Communication
Thirteen point three point three Summary
Thirteen point three point four Exercises
Thirteen point four point one Computers
Thirteen point four point two Memory
Thirteen point four point three Storage
Thirteen point four point four CPUs
Thirteen point four point five GPUs and other Accelerators
Thirteen point four point six Networks and Buses
Thirteen point four point seven More Latency Numbers
Thirteen point four point eight Summary
Thirteen point four point nine Exercises
Thirteen point five point one Splitting the Problem
Thirteen point five point two Data Parallelism
Thirteen point five point three A Toy Network
Thirteen point five point four Data Synchronization
Thirteen point five point five Distributing Data
Thirteen point five point six Training
Thirteen point five point seven Summary
Thirteen point five point eight Exercises
Thirteen point six Concise Implementation for Multiple GPUs
Thirteen point six point one A Toy Network
Thirteen point six point three Training
Thirteen point six point four Summary
Thirteen point six point five Exercises
Thirteen point seven Parameter Servers
Thirteen point seven point one Data-Parallel Training
Thirteen point seven point two Ring Synchronization
Thirteen point seven point three Multi-Machine Training
Multi-machine multi-GPU distributed parallel training.
Thirteen point seven point four Key-Value Stores
Thirteen point seven point five Summary
Thirteen point seven point six Exercises
Fourteen point one Image Augmentation
Fourteen point one point one Common Image Augmentation Methods
Combining Multiple Image Augmentation Methods
Fourteen point one point two Training with Image Augmentation
Fourteen point one point three Summary
Fourteen point one point four Exercises
Fourteen point two Fine-Tuning
Fourteen point two point one Steps
Fourteen point two point two Hot Dog Recognition
Defining and Initializing the Model
Fourteen point two point three Summary
Fourteen point two point four Exercises
Fourteen point three Object Detection and Bounding Boxes
Fourteen point three point one Bounding Boxes
Fourteen point three point two Summary
Fourteen point three point three Exercises
Fourteen point four Anchor Boxes
Fourteen point four point one Generating Multiple Anchor Boxes
Fourteen point four point two Intersection over Union (IoU)
Fourteen point four point three Labeling Anchor Boxes in Training Data
Assigning Ground-Truth Bounding Boxes to Anchor Boxes
Labeling Classes and Offsets
Fourteen point four point four Predicting Bounding Boxes with Non-Maximum Suppression
The following nms function sorts confidence scores in descending order and returns their indices.
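As a rough illustration of the sentence above, here is a minimal pure-Python sketch of non-maximum suppression. It is not the book's implementation: the helper `box_iou`, the corner format `(x1, y1, x2, y2)`, the parameter name `iou_threshold`, and the sample boxes are all assumptions made for this example.

```python
def box_iou(a, b):
    """IoU of two boxes in assumed (x1, y1, x2, y2) corner format."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold):
    """Return indices of the kept boxes, in descending order of confidence."""
    # Sort box indices by confidence score, highest first.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)  # most confident remaining box
        keep.append(best)
        # Discard remaining boxes that overlap the kept box too strongly.
        order = [i for i in order
                 if box_iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


# Hypothetical sample data: three overlapping boxes and one separate box.
boxes = [(0.10, 0.08, 0.52, 0.92), (0.08, 0.20, 0.56, 0.95),
         (0.15, 0.30, 0.62, 0.91), (0.55, 0.20, 0.90, 0.88)]
scores = [0.9, 0.8, 0.7, 0.6]
kept = nms(boxes, scores, iou_threshold=0.5)
print(kept)  # [0, 3]
```

With a threshold of 0.5, the two lower-scoring boxes that overlap box 0 are suppressed, while the non-overlapping box 3 survives. A tensor-based version would typically vectorize the IoU computation instead of looping.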
Fourteen point four point five Summary
Fourteen point four point six Exercises
Fourteen point five Multiscale Object Detection
Fourteen point five point one Multiscale Anchor Boxes
Fourteen point five point two Multiscale Detection
Fourteen point five point three Summary
Fourteen point five point four Exercises
Fourteen point six The Object Detection Dataset
Fourteen point six point one Downloading the Dataset
Fourteen point six point two Reading the Dataset
Fourteen point six point three Demonstration
Fourteen point six point four Summary
Fourteen point six point five Exercises
Fourteen point seven Single Shot Multibox Detection
Fourteen point seven point one Model
Bounding Box Prediction Layer
Concatenating Predictions for Multiple Scales
Fourteen point seven point two Training
Reading the Dataset and Initializing the Model
Defining Loss and Evaluation Functions
Fourteen point seven point three Prediction
Fourteen point seven point four Summary
Fourteen point seven point five Exercises
Fourteen point eight Region-based CNNs
Fourteen point eight point one R-CNNs
Fourteen point eight point two Fast R-CNN
Fourteen point eight point three Faster R-CNN
Fourteen point eight point four Mask R-CNN
Fourteen point eight point five Summary
Fourteen point eight point six Exercises
Fourteen point nine Semantic Segmentation and the Dataset
Fourteen point nine point one Image Segmentation and Instance Segmentation
Fourteen point nine point two The Pascal VOC two thousand twelve Semantic Segmentation Dataset
Custom Semantic Segmentation Dataset Class
Fourteen point nine point three Summary
Fourteen point nine point four Exercises
Fourteen point ten point one Basic Operation
Fourteen point ten point two Padding, Strides, and Multiple Channels
Fourteen point ten point three Connection to Matrix Transposition
Fourteen point ten point four Summary
Fourteen point ten point five Exercises
Fourteen point eleven point one The Model
Fourteen point eleven point two Initializing Transposed Convolutional Layers
Fourteen point eleven point three Reading the Dataset
Fourteen point eleven point four Training
Fourteen point eleven point five Prediction
Fourteen point eleven point six Summary
Fourteen point eleven point seven Exercises
Fourteen point twelve Neural Style Transfer
Fourteen point twelve point one Method
Fourteen point twelve point two Reading the Content and Style Images
Fourteen point twelve point three Preprocessing and Postprocessing
Fourteen point twelve point four Extracting Features
Fourteen point twelve point five Defining the Loss Function
Reducing the total variation loss makes values of neighboring pixels on the synthesized image closer.
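The smoothing effect on neighboring pixels is commonly obtained with a total variation penalty. The following is a minimal sketch under that assumption, not the book's code: the function name `tv_loss` and the plain list-of-lists grayscale image are illustrative choices, and the image is assumed to be at least 2 by 2.

```python
def tv_loss(img):
    """Average absolute difference between adjacent pixels of a 2D image.

    `img` is a list of rows of numbers (assumed at least 2 x 2).
    """
    h, w = len(img), len(img[0])
    # Differences between each pixel and the one directly below it.
    vert = [abs(img[i + 1][j] - img[i][j]) for i in range(h - 1) for j in range(w)]
    # Differences between each pixel and the one directly to its right.
    horiz = [abs(img[i][j + 1] - img[i][j]) for i in range(h) for j in range(w - 1)]
    return 0.5 * (sum(vert) / len(vert) + sum(horiz) / len(horiz))


# Hypothetical inputs: a flat image and a checkerboard.
smooth = [[1, 1], [1, 1]]
noisy = [[0, 1], [1, 0]]
smooth_penalty = tv_loss(smooth)
noisy_penalty = tv_loss(noisy)
```

The flat image incurs no penalty while the checkerboard incurs the largest one for its value range, so minimizing this term pushes neighboring pixel values together.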
Fourteen point twelve point six Initializing the Synthesized Image
Fourteen point twelve point seven Training
Fourteen point twelve point eight Summary
Fourteen point twelve point nine Exercises
Fourteen point thirteen Image Classification (CIFAR-Ten) on Kaggle
Fourteen point thirteen point one Obtaining and Organizing the Dataset
Fourteen point thirteen point two Image Augmentation
Fourteen point thirteen point three Reading the Dataset
Fourteen point thirteen point four Defining the Model
Fourteen point thirteen point five Defining the Training Function
Fourteen point thirteen point six Training and Validating the Model
Fourteen point thirteen point seven Classifying the Testing Set and Submitting Results on Kaggle
Fourteen point thirteen point eight Summary
Fourteen point thirteen point nine Exercises
Fourteen point fourteen Dog Breed Identification (ImageNet Dogs) on Kaggle
Fourteen point fourteen point one Obtaining and Organizing the Dataset
Fourteen point fourteen point two Image Augmentation
Fourteen point fourteen point three Reading the Dataset
Fourteen point fourteen point four Fine-Tuning a Pretrained Model
Fourteen point fourteen point five Defining the Training Function
Fourteen point fourteen point six Training and Validating the Model
Fourteen point fourteen point eight Summary
Fourteen point fourteen point nine Exercises
Fifteen Natural Language Processing: Pretraining
Fifteen point one Word Embedding (word2vec)
Fifteen point one point one One-Hot Vectors Are a Bad Choice
Fifteen point one point two Self-Supervised word2vec
Fifteen point one point three The Skip-Gram Model
Fifteen point one point four The Continuous Bag of Words (CBOW) Model
Fifteen point one point five Summary
Fifteen point one point six Exercises
Fifteen point two Approximate Training
Fifteen point two point one Negative Sampling
Fifteen point two point two Hierarchical Softmax
Fifteen point two point three Summary
Fifteen point two point four Exercises
Fifteen point three point one Reading the Dataset
Fifteen point three point two Subsampling
Fifteen point three point three Extracting Center Words and Context Words
Fifteen point three point four Negative Sampling
Fifteen point three point five Loading Training Examples in Minibatches
Fifteen point three point six Putting It All Together
Fifteen point three point seven Summary
Fifteen point three point eight Exercises
Fifteen point four Pretraining word2vec
Fifteen point four point one The Skip-Gram Model
Defining the Forward Propagation
Fifteen point four point two Training
Binary Cross-Entropy Loss
Initializing Model Parameters
Defining the Training Loop
Fifteen point four point three Applying Word Embeddings
Fifteen point four point four Summary
Fifteen point four point five Exercises
Fifteen point five Word Embedding with Global Vectors (GloVe)
Fifteen point five point one Skip-Gram with Global Corpus Statistics
Fifteen point five point two The GloVe Model
Fifteen point five point three Interpreting GloVe from the Ratio of Co-occurrence Probabilities
Fifteen point five point four Summary
Fifteen point five point five Exercises
Fifteen point six Subword Embedding
Fifteen point six point one The fastText Model
Fifteen point six point two Byte Pair Encoding
Fifteen point six point three Summary
Fifteen point six point four Exercises
Fifteen point seven Word Similarity and Analogy
Fifteen point seven point one Loading Pretrained Word Vectors
Fifteen point seven point two Applying Pretrained Word Vectors
Fifteen point seven point three Summary
Fifteen point seven point four Exercises
Fifteen point eight Bidirectional Encoder Representations from Transformers (BERT)
Fifteen point eight point one From Context-Independent to Context-Sensitive
Fifteen point eight point two From Task-Specific to Task-Agnostic
Fifteen point eight point three BERT: Combining the Best of Both Worlds
Fifteen point eight point four Input Representation
Fifteen point eight point five Pretraining Tasks
Fifteen point eight point six Putting It All Together
Fifteen point eight point seven Summary
Fifteen point eight point eight Exercises
Fifteen point nine The Dataset for Pretraining BERT
Fifteen point nine point one Defining Helper Functions for Pretraining Tasks
Generating the Next Sentence Prediction Task
Generating the Masked Language Modeling Task
Fifteen point nine point two Transforming Text into the Pretraining Dataset
Fifteen point nine point three Summary
Fifteen point nine point four Exercises
Fifteen point ten point one Pretraining BERT
Fifteen point ten point two Representing Text with BERT
Fifteen point ten point three Summary
Fifteen point ten point four Exercises
Sixteen point one Sentiment Analysis and the Dataset
Sixteen point one point one Reading the Dataset
Sixteen point one point two Preprocessing the Dataset
Sixteen point one point three Creating Data Iterators
Sixteen point one point four Putting It All Together
Sixteen point one point five Summary
Sixteen point one point six Exercises
Sixteen point two point one Representing Single Text with RNNs
Sixteen point two point two Loading Pretrained Word Vectors
Sixteen point two point three Training and Evaluating the Model
Sixteen point two point four Summary
Sixteen point two point five Exercises
Sixteen point three point one One-Dimensional Convolutions
Sixteen point three point two Max-Over-Time Pooling
Sixteen point three point three The textCNN Model
Loading Pretrained Word Vectors
Training and Evaluating the Model
Sixteen point three point five Exercises
Sixteen point four Natural Language Inference and the Dataset
Sixteen point four point one Natural Language Inference
Sixteen point four point two The Stanford Natural Language Inference Dataset
Defining a Class for Loading the Dataset
Sixteen point four point three Summary
Sixteen point four point four Exercises
Sixteen point five Natural Language Inference: Using Attention
Sixteen point five point one The Model
Sixteen point five point two Training and Evaluating the Model
Training and Evaluating the Model
Sixteen point five point three Summary
Sixteen point five point four Exercises
Sixteen point six Fine-Tuning BERT for Sequence-Level and Token-Level Applications
Sixteen point six point one Single Text Classification
Sixteen point six point two Text Pair Classification or Regression
Sixteen point six point three Text Tagging
Sixteen point six point four Question Answering
Sixteen point six point five Summary
Sixteen point six point six Exercises
Sixteen point seven Natural Language Inference: Fine-Tuning BERT
Sixteen point seven point one Loading Pretrained BERT
Sixteen point seven point two The Dataset for Fine-Tuning BERT
Sixteen point seven point three Fine-Tuning BERT
Sixteen point seven point four Summary
Sixteen point seven point five Exercises
Seventeen Reinforcement Learning
Seventeen point one Markov Decision Process
Seventeen point one point one Definition of a Markov Decision Process
Seventeen point one point two Return and Discount Factor
Seventeen point one point three Discussion of the Markov Assumption
Seventeen point one point four Summary
Seventeen point one point five Exercises
Seventeen point two Value Iteration
Seventeen point two point one Stochastic Policy
Seventeen point two point two Value Function
Seventeen point two point three Action-Value Function
Seventeen point two point four Optimal Stochastic Policy
Seventeen point two point five Principle of Dynamic Programming
Seventeen point two point six Value Iteration
Seventeen point two point seven Policy Evaluation
Seventeen point two point eight Implementation of Value Iteration
Seventeen point two point nine Summary
Seventeen point two point ten Exercises
Seventeen point three point one The Q-Learning Algorithm
Seventeen point three point two An Optimization Problem Underlying Q-Learning
Seventeen point three point three Exploration in Q-Learning
Seventeen point three point four The "Self-correcting" Property of Q-Learning
Seventeen point three point five Implementation of Q-Learning
Seventeen point three point six Summary
Seventeen point three point seven Exercises
Eighteen Gaussian Processes
Eighteen point one Introduction to Gaussian Processes
Eighteen point one point one Summary
Eighteen point one point two Exercises
Eighteen point two Gaussian Process Priors
Eighteen point two point one Definition
Eighteen point two point two A Simple Gaussian Process
Eighteen point two point three From Weight Space to Function Space
Eighteen point two point four The Radial Basis Function Kernel
Eighteen point two point five The Neural Network Kernel
Eighteen point two point six Summary
Eighteen point two point seven Exercises
Eighteen point three Gaussian Process Inference
Eighteen point three point one Posterior Inference for Regression
Eighteen point three point two Equations for Making Predictions and Learning Kernel Hyperparameters in GP Regression
Eighteen point three point three Interpreting Equations for Learning and Predictions
Eighteen point three point four Worked Example from Scratch
Eighteen point three point five Making Life Easy with GPyTorch
Eighteen point three point six Summary
Eighteen point three point seven Exercises
Nineteen point one What Is Hyperparameter Optimization?
Nineteen point one point one The Optimization Problem
Nineteen point one point two Random Search
Nineteen point one point three Summary
Nineteen point one point four Exercises
Nineteen point two Hyperparameter Optimization API
Nineteen point two point one Searcher
Nineteen point two point two Scheduler
Nineteen point two point three Tuner
Nineteen point two point four Bookkeeping the Performance of HPO Algorithms
Nineteen point two point five Example: Optimizing the Hyperparameters of a Convolutional Neural Network
Nineteen point two point six Comparing HPO Algorithms
Nineteen point two point seven Summary
Nineteen point two point eight Exercises
Nineteen point three Asynchronous Random Search
Nineteen point three point one Objective Function
Nineteen point three point two Asynchronous Scheduler
Nineteen point three point three Visualize the Asynchronous Optimization Process
Nineteen point three point five Exercises
Two. Advanced. The goal of this exercise is to implement a new scheduler in Syne Tune.
Nineteen point four point one Successive Halving
Nineteen point four point two Summary
Nineteen point five point one Objective Function
Nineteen point five point two Asynchronous Scheduler
Nineteen point five point three Visualize the Optimization Process
Nineteen point five point four Summary
Twenty point one Generative Adversarial Networks
Twenty point one point one Generate Some "Real" Data
Twenty point one point two Generator
Twenty point one point three Discriminator
Twenty point one point four Training
Twenty point one point five Summary
Twenty point one point six Exercises
Twenty point two point two The Generator
Twenty point two point three Discriminator
Twenty point two point four Training
Twenty point two point five Summary
Twenty point two point six Exercises
A point one Geometry and Linear Algebraic Operations
A point one point one Geometry of Vectors
A point one point two Dot Products and Angles
A point one point three Hyperplanes
A point one point four Geometry of Linear Transformations
A point one point five Linear Dependence
A point one point six Rank
A point one point seven Invertibility
A point one point eight Determinant
A point one point nine Tensors and Common Linear Algebra Operations
Common Examples from Linear Algebra
A point one point ten Summary
A point one point eleven Exercises
A point two Eigendecompositions
A point two point one Finding Eigenvalues
A point two point two Decomposing Matrices
A point two point three Operations on Eigendecompositions
A point two point four Eigendecompositions of Symmetric Matrices
A point two point five Gershgorin Circle Theorem
A point two point six A Useful Application: The Growth of Iterated Maps
Eigenvectors as Long Term Behavior
Relating Back to Eigenvectors
A point two point seven Discussion
A point two point eight Summary
A point two point nine Exercises
A point three Single Variable Calculus
A point three point one Differential Calculus
A point three point two Rules of Calculus
A point three point three Summary
A point three point four Exercises
A point four point one Higher-Dimensional Differentiation
A point four point two Geometry of Gradients and Gradient Descent
A point four point three A Note on Mathematical Optimization
A point four point four Multivariate Chain Rule
Another more subtle example of the chain rule.
A point four point five The Backpropagation Algorithm
A point four point six Hessians
A point four point seven A Little Matrix Calculus
A point four point eight Summary
A point four point nine Exercises
A point five Integral Calculus
A point five point one Geometric Interpretation
A point five point two The Fundamental Theorem of Calculus
A point five point three Change of Variables
A point five point four A Comment on Sign Conventions
A point five point five Multiple Integrals
A point five point six Change of Variables in Multiple Integrals
A point five point seven Summary
A point five point eight Exercises
A point six point one Continuous Random Variables
From Discrete to Continuous
Probability Density Functions
Cumulative Distribution Functions