
DataMListic @UCRM1urw2ECVHH7ojJw8MXiQ@youtube.com

9.4K subscribers

Welcome to DataMListic (formerly WhyML)! On this channel I exp…


05:40  Marginal, Joint and Conditional Probabilities Explained
04:49  Least Squares vs Maximum Likelihood
03:50  AI Reading List (by Ilya Sutskever) - Part 5
04:27  AI Reading List (by Ilya Sutskever) - Part 4
04:48  AI Reading List (by Ilya Sutskever) - Part 3
05:02  AI Reading List (by Ilya Sutskever) - Part 2
04:31  AI Reading List (by Ilya Sutskever) - Part 1
08:03  Vector Database Search - Hierarchical Navigable Small Worlds (HNSW) Explained
05:40  Singular Value Decomposition (SVD) Explained
03:27  ROUGE Score Explained
05:48  BLEU Score Explained
03:38  Cross-Validation Explained
03:51  Sliding Window Attention (Longformer) Explained
03:36  BART Explained: Denoising Sequence-to-Sequence Pre-training
20:28  RLHF: Training Language Models to Follow Instructions with Human Feedback - Paper Explained
27:43  Chain-of-Verification (CoVe) Reduces Hallucination in Large Language Models - Paper Explained
13:59  The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits - Paper Explained
05:14  LLM Tokenizers Explained: BPE Encoding, WordPiece and SentencePiece
03:15  Hyperparameter Tuning: Grid Search vs Random Search
26:16  Jailbroken: How Does LLM Safety Training Fail? - Paper Explained
02:44  Word Error Rate (WER) Explained - Measuring the Performance of Speech Recognition Systems
03:15  Spearman Correlation Explained in 3 Minutes
03:40  Two Towers vs Siamese Networks vs Triplet Loss - Compute Comparable Embeddings
08:11  LLM Prompt Engineering with Random Sampling: Temperature, Top-k, Top-p
03:21  Kullback-Leibler (KL) Divergence Mathematics Explained
04:36  Covariance and Correlation Explained
07:35  Eigendecomposition Explained
07:24  Multi-Head Attention (MHA), Multi-Query Attention (MQA), Grouped Query Attention (GQA) Explained
04:32  Kabsch-Umeyama Algorithm - How to Align Point Patterns
04:03  How to Fine-tune Large Language Models Like ChatGPT with Low-Rank Adaptation (LoRA)
04:54  Discrete Fourier Transform (DFT and IDFT) Explained in Python
02:55  XGBoost Explained in Under 3 Minutes
03:56  Why Batch Normalization (BatchNorm) Works
04:57  Object Detection Part 8: Grounding DINO, Open-Set Object Detection
04:49  Gaussian Mixture Models (GMM) Explained
10:15  Fourier Transform Formula Explained
04:28  Object Detection Part 7: Detection Transformers (DETR), Object Queries
05:07  Object Detection Part 6: The Hungarian Matching Algorithm, Tracking, Bounding Box Matching
06:06  Capsule Networks Explained | Why Using Pooling is a Bad Idea
04:43  Object Detection Part 5: You Only Look Once (YOLO), YOLOv1 Architecture
08:52  Why We Don't Use the Mean Squared Error (MSE) Loss in Classification
05:45  Object Detection Part 4: Mask R-CNN, Mask Prediction Branch and Region of Interest Align (RoIAlign)
05:55  Why Language Models Hallucinate
05:33  Object Detection Part 3: Faster R-CNN, Region Proposal Network and Intersection over Union
03:41  Object Detection Part 2: Fast R-CNN, Region Projection and Region of Interest (RoI) Pooling Layer
05:40  Object Detection Part 1: R-CNN, Sliding Window and Selective Search
07:18  How to Select the Best Threshold for Your Model Using the ROC Curve
06:21  Why We Divide by N-1 in the Sample Variance (Standard Deviation) Formula | Bessel's Correction
04:18  The Brier Score Explained | Model Calibration
04:09  Gradient Boosting with Regression Trees Explained
04:02  Measuring Artificial Intelligence (AI) Fairness - Disparate Impact Explained
04:47  Why Models Overfit and Underfit - The Bias-Variance Trade-off
02:58  Spectral Features - Deltas and Delta-Deltas Explained
05:37  Why Deep Neural Networks (DNNs) Underperform Tree-Based Models on Tabular Data
04:23  Bagging vs Boosting - Ensemble Learning in Machine Learning Explained
08:19  AMSGrad - Why Adam Fails to Converge
03:27  AdamW Optimizer Explained | L2 Regularization vs Weight Decay
01:54  Why AI Winters Happen...
14:48  Adversarial Discriminative Domain Adaptation (ADDA) Paper Explained
05:32  Why We Perform Feature Scaling in Machine Learning