Deep Learning
28 videos • 764 views • by DataMListic
1. Why Batch Normalization (batchnorm) Works
2. Capsule Networks Explained | Why Using Pooling is a Bad Idea
3. Why Deep Neural Networks (DNNs) Underperform Tree-Based Models on Tabular Data
4. AMSGrad - Why Adam FAILS to Converge
5. Why Neural Networks Can Learn Any Function | The Universal Approximation Theorem
6. Why Residual Connections (ResNet) Work
7. Why Neural Networks (NN) Are Deep | The Number of Linear Regions of Deep Neural Networks
8. Why ReLU Is Better Than Other Activation Functions | Tanh Saturating Gradients
9. Why The Reset Gate is Necessary in GRUs
10. Why Recurrent Neural Networks (RNN) Suffer from Vanishing Gradients - Part 2
11. Why We Need Activation Functions In Neural Networks
12. Why Convolutional Neural Networks Are Not Permutation Invariant
13. Why Recurrent Neural Networks Suffer from Vanishing Gradients - Part 1
14. Multi-Head Attention (MHA), Multi-Query Attention (MQA), Grouped Query Attention (GQA) Explained
15. How to Fine-tune Large Language Models Like ChatGPT with Low-Rank Adaptation (LoRA)
16. Gated Recurrent Unit (GRU) Equations Explained
17. Long Short-Term Memory (LSTM) Equations Explained
18. LLM Prompt Engineering with Random Sampling: Temperature, Top-k, Top-p
19. Two Towers vs Siamese Networks vs Triplet Loss - Compute Comparable Embeddings
20. LLM Tokenizers Explained: BPE Encoding, WordPiece and SentencePiece
21. The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits - Paper Explained
22. Chain-of-Verification (COVE) Reduces Hallucination in Large Language Models - Paper Explained
23. RLHF: Training Language Models to Follow Instructions with Human Feedback - Paper Explained
24. BART Explained: Denoising Sequence-to-Sequence Pre-training
25. Sliding Window Attention (Longformer) Explained
26. BLEU Score Explained
27. ROUGE Score Explained
28. Vector Database Search - Hierarchical Navigable Small Worlds (HNSW) Explained