
DataMListic @UCRM1urw2ECVHH7ojJw8MXiQ@youtube.com

9.4K subscribers

Welcome to DataMListic (formerly WhyML)! On this channel I exp…


05:40  Marginal, Joint and Conditional Probabilities Explained
04:49  Least Squares vs Maximum Likelihood
03:50  AI Reading List (by Ilya Sutskever) - Part 5
04:27  AI Reading List (by Ilya Sutskever) - Part 4
04:48  AI Reading List (by Ilya Sutskever) - Part 3
05:02  AI Reading List (by Ilya Sutskever) - Part 2
04:31  AI Reading List (by Ilya Sutskever) - Part 1
08:03  Vector Database Search - Hierarchical Navigable Small Worlds (HNSW) Explained
05:40  Singular Value Decomposition (SVD) Explained
03:27  ROUGE Score Explained
05:48  BLEU Score Explained
03:38  Cross-Validation Explained
03:51  Sliding Window Attention (Longformer) Explained
03:36  BART Explained: Denoising Sequence-to-Sequence Pre-training
20:28  RLHF: Training Language Models to Follow Instructions with Human Feedback - Paper Explained
27:43  Chain-of-Verification (CoVe) Reduces Hallucination in Large Language Models - Paper Explained
13:59  The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits - Paper Explained
05:14  LLM Tokenizers Explained: BPE Encoding, WordPiece and SentencePiece
03:15  Hyperparameter Tuning: Grid Search vs Random Search
26:16  Jailbroken: How Does LLM Safety Training Fail? - Paper Explained
02:44  Word Error Rate (WER) Explained - Measuring the Performance of Speech Recognition Systems
03:15  Spearman Correlation Explained in 3 Minutes
03:40  Two Towers vs Siamese Networks vs Triplet Loss - Compute Comparable Embeddings
08:11  LLM Prompt Engineering with Random Sampling: Temperature, Top-k, Top-p
03:21  Kullback-Leibler (KL) Divergence Mathematics Explained
04:36  Covariance and Correlation Explained
07:35  Eigendecomposition Explained
07:24  Multi-Head Attention (MHA), Multi-Query Attention (MQA), Grouped Query Attention (GQA) Explained
04:32  Kabsch-Umeyama Algorithm - How to Align Point Patterns
04:03  How to Fine-tune Large Language Models Like ChatGPT with Low-Rank Adaptation (LoRA)
04:54  Discrete Fourier Transform (DFT and IDFT) Explained in Python
02:55  XGBoost Explained in Under 3 Minutes
03:56  Why Batch Normalization (BatchNorm) Works
04:57  Object Detection Part 8: Grounding DINO, Open-Set Object Detection
04:49  Gaussian Mixture Models (GMM) Explained
10:15  Fourier Transform Formula Explained
04:28  Object Detection Part 7: Detection Transformers (DETR), Object Queries
05:07  Object Detection Part 6: The Hungarian Matching Algorithm, Tracking, Bounding Box Matching
06:06  Capsule Networks Explained | Why Using Pooling is a Bad Idea
04:43  Object Detection Part 5: You Only Look Once (YOLO), YOLOv1 Architecture
08:52  Why We Don't Use the Mean Squared Error (MSE) Loss in Classification
05:45  Object Detection Part 4: Mask R-CNN, Mask Prediction Branch and Region of Interest Align (RoIAlign)
05:55  Why Language Models Hallucinate
05:33  Object Detection Part 3: Faster R-CNN, Region Proposal Network and Intersection over Union
03:41  Object Detection Part 2: Fast R-CNN, Region Projection and Region of Interest (RoI) Pooling Layer
05:40  Object Detection Part 1: R-CNN, Sliding Window and Selective Search
07:18  How to Select the Best Threshold for Your Model Using the ROC Curve
06:21  Why We Divide by N-1 in the Sample Variance (Standard Deviation) Formula | Bessel's Correction
04:18  The Brier Score Explained | Model Calibration
04:09  Gradient Boosting with Regression Trees Explained
04:02  Measuring Artificial Intelligence (AI) Fairness - Disparate Impact Explained
04:47  Why Models Overfit and Underfit - The Bias-Variance Trade-off
02:58  Spectral Features - Deltas and Delta-Deltas Explained
05:37  Why Deep Neural Networks (DNNs) Underperform Tree-Based Models on Tabular Data
04:23  Bagging vs Boosting - Ensemble Learning in Machine Learning Explained
08:19  AMSGrad - Why Adam Fails to Converge
03:27  AdamW Optimizer Explained | L2 Regularization vs Weight Decay
01:54  Why AI Winters Happen...
14:48  Adversarial Discriminative Domain Adaptation (ADDA) Paper Explained
05:32  Why We Perform Feature Scaling in Machine Learning