Interpretability

40 videos • 365 views • by Tunadorable

MechInterp, Theoretic Interpretability, and Functional Interpretability

Turns out the AIs actually UNDERSTAND the world (Paper Breakdown)

Tunadorable

Do Transformers Travel Through an Embedding Vector Landscape? (Paper Breakdown)

Tunadorable

The AI Hydra Effect - Is DeepMind WRONG??? (Paper Breakdown)

Tunadorable

Understanding Deep Learning (Still) Requires Rethinking Generalization (Paper Breakdown)

Tunadorable

How to Win the AI Lottery (Paper Breakdown)

Tunadorable

Explaining Emergent Abilities in AI (Paper Breakdown)

Tunadorable

Neural Networks Meet Holographic Brain Theory - FUTURE of AI??? (Paper Breakdown)

Tunadorable

AI Activation Functions Magnify Infinite-Dimensional Space (Paper Breakdown)

Tunadorable

Local minima aren't a problem in ML training (Paper Breakdown)

Tunadorable

Inceptionism - How AIs Dream and Compute (Paper Breakdown)

Tunadorable

Critical Learning Periods Emerge Even in AI (Paper Breakdown)

Tunadorable

What is meant by "Explainable Model or XAI" (Paper Breakdown)

Tunadorable

Does the AI Loss landscape Have Fractal Crevices?

Tunadorable

Turns Out Language Models Represent Space and Time (paper breakdown)

Tunadorable

Discovering Moral Dimensions in ChatGPT (paper breakdown)

Tunadorable

Loss Landscapes DO Exhibit Fractal Dynamics (paper breakdown)

Tunadorable

Explaining Grokking Through Circuit Efficiency (Paper Breakdown)

Tunadorable

Understanding LLMs Through The Problem They're Trained To Solve (paper breakdown)

Tunadorable

Computational Consciousness vs the Feeling of Being Conscious

Tunadorable

Structured World Representations in Maze-Solving Transformers

Tunadorable

They Found Universal Neurons in GPT-2?!?!?!!?!

Tunadorable

Interview w/ AI Researcher at Meta - Transformers are Multi-State RNNs

Tunadorable

Uncovering Hidden Geometry in Transformers

Tunadorable

Can (Looped) Transformers Learn Iterative Algorithms?

Tunadorable

Hallucination in Language Models is NEVER Going Away

Tunadorable

Deep Networks Always Grok and Here is Why

Tunadorable

Peeking into Residual States

Tunadorable

Mamba has HUNDREDS of Implicit Attention Heads?!?!

Tunadorable

Generalization Benefits of Late Learning Rate Decay

Tunadorable

Is Cosine Similarity Really About Similarity?

Tunadorable

Anisotropy Is Inherent to Self-Attention in Transformers

Tunadorable

What does AI have to do with Plato's Allegory of the Cave?

Tunadorable

LASER: Improving LLMs with Layer-Selective Rank Reduction

Tunadorable

Retrieval Heads Mechanistically Explain Long-Context Factuality

Tunadorable

Anthropic's New Mech-Interp Paper, A Deep Dive

Tunadorable

Transformers Represent Belief State Geometry in their Residual Stream

Tunadorable

Information over-squashing in language tasks

Tunadorable

Underlying Mechanisms Behind Learning Rate Warmup's Success

Tunadorable

parallel processes in multi-hop LLM reasoning

Tunadorable

What would it mean for an AI to "understand"?

Tunadorable