
AI Coffee Break with Letitia @UCobqgqE4i5Kf7wrxRxhToQA@youtube.com

45K subscribers

Lighthearted bite-sized ML videos for your AI Coffee Break!


11:38
GaLore EXPLAINED: Memory-Efficient LLM Training by Gradient Low-Rank Projection
09:59
Shapley Values Explained | Interpretability for AI models, even LLMs!
18:49
Stealing Part of a Production LLM | API protects LLMs no more
09:22
Genie explained 🧞 Generative Interactive Environments paper explained
22:27
MAMBA and State Space Models explained | SSM explained
13:17
Sparse LLMs at inference: 6x faster transformers! | DEJAVU paper explained
19:48
Transformers explained | The architecture behind LLMs
08:55
Direct Preference Optimization: Your Language Model is Secretly a Reward Model | DPO paper explained
11:36
LLM hallucinations discover new math solutions!? | FunSearch explained
08:03
DALL-E 3 is better at following Text Prompts! Here is why. – DALL-E 3 explained
13:06
Adversarial Attacks and Defenses. The Dimpled Manifold Hypothesis. David Stutz from DeepMind #HLF23
08:22
What is LoRA? Low-Rank Adaptation for finetuning LLMs EXPLAINED
11:53
Are ChatBots their own death? | Training on Generated Data Makes Models Forget – Paper explained
14:37
The first law on AI regulation | The EU AI Act
50:36
Author Interviews, Poster Highlights, Summary of the ACL 2023 NLP Conference in Toronto
04:46
ChatGPT is not an intelligent agent. It is a cultural technology. – Gopnik Keynote
06:55
[Own work] MM-SHAP to measure modality contributions
14:46
Eight Things to Know about Large Language Models
14:50
Moral Self-Correction in Large Language Models | paper explained
16:39
AI beats us at another game: STRATEGO | DeepNash paper explained
11:35
Why ChatGPT fails | Language Model Limitations EXPLAINED
16:05
"Watermarking Language Models" paper and GPTZero EXPLAINED | How to detect text by ChatGPT?
12:56
Training learned optimizers: VeLO paper EXPLAINED
16:23
ChatGPT vs Sparrow - Battle of Chatbots
10:12
Paella: Text to image FASTER than diffusion models | Paella paper explained
13:28
Generate long form video with Transformers | Phenaki from Google Brain explained
14:38
Movie Diffusion explained | Make-a-Video from MetaAI and Imagen Video from Google Brain
13:16
Beyond neural scaling laws – Paper Explained
13:16
How does Stable Diffusion work? – Latent Diffusion Models EXPLAINED
17:39
Machine Translation for 1000 languages – Paper explained
09:11
DALLE-2 has a secret language!? | Theories and explanations
15:04
Imagen, the DALL-E 2 competitor from Google Brain, explained | Diffusion models illustrated
37:20
A New Physics-Inspired Theory of Deep Learning | Optimal initialization of Neural Nets
10:31
[Own work] VALSE 💃: Benchmark for Vision and Language Models Centered on Linguistic Phenomena
16:32
PaLM Pathways Language Model explained | 540 Billion parameters can explain jokes!?
10:47
SEER explained: Vision Models more Robust & Fair when pretrained on UNCURATED images!?
06:49
[Quiz] Regularization in Deep Learning, Lipschitz continuity, Gradient regularization
16:43
Diffusion models explained. How does OpenAI's GLIDE work?
19:15
How do Vision Transformers work? โ€“ Paper explained | multi-head self-attention & convolutions
19:20
ConvNeXt: A ConvNet for the 2020s โ€“ Paper Explained (with animations)
11:12
[Quiz] Interpretable ML, VQ-VAE w/o Quantization / infinite codebook, Pearson's, PointClouds
09:42
[Quiz] Eigenfaces, Domain adaptation, Causality, Manifold Hypothesis, Denoising Autoencoder
18:18
Linear algebra with Transformers โ€“ Paper Explained
12:56
Masked Autoencoders Are Scalable Vision Learners โ€“ Paper explained and animated!
10:23
The efficiency misnomer | Size does not matter | What does the number of parameters mean in a model?
04:23
Do Transformers process sequences of FIXED or of VARIABLE length? | #AICoffeeBreakQuiz
09:13
Generalization โ€“ Interpolation โ€“ Extrapolation in Machine Learning: Which is it now!?
12:44
SimVLM explained | What the paper doesn't tell you
12:56
Data BAD | What Will it Take to Fix Benchmarking for NLU?
11:10
Swin Transformer paper animated and explained
07:53
Eyes tell all: How to tell that an AI generated a face?
14:12
How modern search engines work โ€“ Vector databases explained! | Weaviate open-source
15:02
Foundation Models | On the opportunities and risks of calling pre-trained models "Foundation Models"
04:19
What is tokenization and how does it work? Tokenizers explained.
07:42
Data leakage during data preparation? | Using AntiPatterns to avoid MLOps Mistakes
10:18
Self-Attention with Relative Position Representations โ€“ Paper explained
09:21
Adding vs. concatenating positional embeddings & Learned positional encodings
09:40
Positional embeddings in transformers EXPLAINED | Demystifying positional encodings.
13:20
Charformer: Fast Character Transformers via Gradient-based Subword Tokenization +Tokenizer explained
07:53
How cross-modal are vision and language models really? 👀 Seeing past words. [Own work]