
Gabriel Mongaras @UCYUq87t77YNTG5m256fOXeQ@youtube.com

8.5K subscribers - no pronouns :c

Just some guy exploring and making videos about curre


01:13:10
Deterministic Image Editing with DDPM Inversion, DDIM Inversion, Null Inversion and Prompt-to-Prompt
42:25
Attending to Topological Spaces: The Cellular Transformer
35:52
Learning to (Learn at Test Time): RNNs with Expressive Hidden States
52:39
WARP: On the Benefits of Weight Averaged Rewarded Policies
28:52
CoDeF: Content Deformation Fields for Temporally Consistent Video Processing
01:14:43
Mamba 2 - Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality
38:55
CoPE - Contextual Position Encoding: Learning to Count What's Important
45:48
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models
43:26
xLSTM: Extended Long Short-Term Memory
37:09
KAN: Kolmogorov-Arnold Networks
30:07
LADD: Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation
37:00
Visual AutoRegressive Modeling: Scalable Image Generation via Next-Scale Prediction
32:49
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
40:14
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
04:54
Q* AGI Achieved (Apr Fools)
01:02:30
Stable Diffusion 3: Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
37:08
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
46:25
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits and BitNet
31:15
DoRA: Weight-Decomposed Low-Rank Adaptation
01:02:38
OpenAI Sora and DiTs: Scalable Diffusion Models with Transformers
33:55
A Decoder-only Foundation Model For Time-series Forecasting
37:30
Lumiere: A Space-Time Diffusion Model for Video Generation
28:56
Exphormer: Sparse Transformers for Graphs
25:56
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
40:23
Boundary Attention: Learning to Find Faint Boundaries at Any Resolution
29:38
Cached Transformers: Improving Transformers with Differentiable Memory Cache
39:02
Translatotron 3: Speech to Speech Translation with Monolingual Data
44:02
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
47:32
Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference
28:39
Adversarial Diffusion Distillation
40:51
Unsupervised Discovery of Semantic Latent Directions in Diffusion Models
18:45
DALL-E 3 - Improving Image Generation with Better Captions
38:18
LRM: Large Reconstruction Model for Single Image to 3D
30:46
CodeFusion: A Pre-trained Diffusion Model for Code Generation
22:14
Matryoshka Diffusion Models Explained
36:04
UniAudio: An Audio Foundation Model Toward Universal Audio Generation
57:43
QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models
33:27
StreamingLLM - Efficient Streaming Language Models with Attention Sinks Explained
28:51
FreeU: Free Lunch in Diffusion U-Net Explained
26:26
InstaFlow: One Step is Enough for High-Quality Diffusion-Based Text-to-Image Generation Explained
50:20
Llama/WizardLM Finetuning with Huggingface on RunPod
50:14
2x Faster Language Model Pre-training via Masked Structural Growth
53:53
Bayesian Flow Networks (BFN) Explained
33:54
WizardLM: Empowering Large Language Models to Follow Complex Instructions Explained
43:59
From Sparse to Soft Mixtures of Experts Explained
42:16
BK-SDM: Architecturally Compressed Stable Diffusion for Efficient T2I Generation Explained
36:25
Direct Preference Optimization (DPO): Your Language Model is Secretly a Reward Model Explained
31:51
Universal and Transferable Adversarial Attacks on Aligned Language Models Explained
45:45
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis Explained
47:16
SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations Explained
35:57
ReLoRA: Stack More Layers Differently: High-Rank Training Through Low-Rank Updates Explained
43:49
MiniLLM: Knowledge Distillation of Large Language Models
01:09:57
RetNet: A Successor to Transformer for Large Language Models Explained
54:21
HyperDreamBooth: HyperNetworks for Fast Personalization of Text-to-Image Explained
39:17
Mixture-of-Experts Meets Instruction Tuning: A Winning Combination for LLMs Explained
01:00:14
Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale Explained
37:21
LongNet: Scaling Transformers to 1,000,000,000 Tokens Explained
29:17
Extending Context Window of Large Language Models via Positional Interpolation Explained
39:52
RoFormer: Enhanced Transformer with Rotary Position Embedding Explained
37:47
RoboCat: A Self-Improving Foundation Agent for Robotic Manipulation Explained