AI Papers Academy

05:11

LLaMA-Mesh by Nvidia: LLM for 3D Mesh Generation

06:53

Tokenformer: The Next Generation of Transformers?

07:50

Generative Reward Models: Merging the Power of RLHF and RLAIF for Smarter AI

04:50

Writing in the Margins: Better LLM Inference Pattern for Long Context Retrieval

04:33

Sapiens by Meta AI: Foundation for Human Vision Models

07:37

Mixture of Nested Experts: Adaptive Processing of Visual Tokens | AI Paper Explained

04:41

Introduction to Mixture-of-Experts (MoE)

03:54

Mixture-of-Agents (MoA) Enhances Large Language Model Capabilities

04:52

Arithmetic Transformers with Abacus Positional Embeddings | AI Paper Explained

07:26

CLLMs: Consistency Large Language Models | AI Paper Explained

07:30

ReFT: Representation Finetuning for Language Models | AI Paper Explained

09:21

Stealing Part of a Production Language Model | AI Paper Explained

06:10

The Era of 1-bit LLMs by Microsoft | AI Paper Explained

11:35

V-JEPA by Meta AI - A Human-Like Computer Vision Video-based Model

06:50

Self-Rewarding Language Models by Meta AI - Path to Open-Source AGI?

11:59

Fast Inference of Mixture-of-Experts Language Models with Offloading

05:23

TinyGPT-V: Small but Mighty Multimodal Large Language Model

06:28

LLM in a flash: Efficient Large Language Model Inference with Limited Memory

04:32

Introduction to Vision Transformers (ViT) | An Image is Worth 16x16 Words

06:20

Orca 2 by Microsoft: Teaching Small Language Models How to Reason

05:49

LCM-LoRA: From Diffusion Models to Fast SDXL with Latent Consistency Models

05:26

CODEFUSION by Microsoft: A Pre-trained Diffusion Model for Code Generation

09:30

Table-GPT by Microsoft: Empower LLMs To Understand Tables

09:20

Vision Transformers Need Registers - Fixing a Bug in DINOv2?

06:00

Emu by Meta AI: Enhancing Image Generation Models Using Photogenic Needles in a Haystack

09:14

NExT-GPT: Any-to-Any Multimodal LLM

06:27

Large Language Models As Optimizers - OPRO by Google DeepMind

06:46

FACET by Meta AI - Fairness in Computer Vision Evaluation Benchmark

08:31

Code Llama Paper Explained

08:25

WizardMath from Microsoft - Best Open Source Math LLM with Reinforced Evol-Instruct

07:36

Shepherd by Meta AI - A Critic for Large Language Models

07:31

Soft Mixture of Experts - An Efficient Sparse Transformer

06:43

Universal and Transferable LLM Attacks - A New Threat to AI Safety

06:36

Meta-Transformer: A Unified Framework for Multimodal Learning

07:36

Google HyperDreamBooth - HyperNetworks for Fast Personalization of Text-to-Image Models

05:12

LongNet from Microsoft - 1B Tokens Transformer with Dilated Attention

06:51

DreamDiffusion - Thought to Image Generation | Paper Summary

08:30

Wanda Network Pruning - Prune LLMs Efficiently

08:17

I-JEPA from Meta AI - A Human-Like Computer Vision Model | Paper Summary

05:55

Orca from Microsoft - The Future of Imitation Learning?

06:37

StyleDrop from Google AI - Text-to-Image Generation in Any Style!

06:09

LIMA from Meta AI - Less Is More for Alignment of LLMs

06:02

ImageBind from Meta AI - One Embedding Space To Bind Them All

06:02

MPT Model - Extrapolate LLM Context with ALiBi

04:58

YOLO-NAS - A New Best Object Detection Model!

04:38

How WizardLM Got Better Results Than ChatGPT For Complex Instructions?

07:31

DINOv2 from Meta AI - Finally a Foundational Model in Computer Vision?

05:20

Introduction to Consistency Models

17:58

What is a Topological Sort of a Graph and how to find it using Kahn's Algorithm

22:16

DFS Algorithm | Depth First Search Algorithm for Graph Search With Animated Example

14:04

BFS Algorithm | Breadth First Search Algorithm for Graph Search

16:10

Graphs Representations - Adjacency Lists vs Adjacency Matrix

11:25

Introduction to Computer Science | From Algorithm to Running a Program

10:43

HTML Tables Tutorial | How To Create and Customize Tables with HTML

10:56

Introduction to HTML - What are Tags, Elements and Attributes | HTML Document Structure