Alignment Newsletter Podcast @UCfGGFXwKpr-TJ5HfxEFaFCg@youtube.com

953 subscribers


16:43 - Alignment Newsletter #173: Recent language model results from DeepMind
05:52 - Alignment Newsletter #172: Sorry for the long hiatus!
14:22 - Alignment Newsletter #171: Disagreements between alignment "optimists" and "pessimists"
13:02 - Alignment Newsletter #170: Analyzing the argument for risk from power-seeking AI
15:08 - Alignment Newsletter #169: Collaborating with humans without human data
16:21 - Alignment Newsletter #168: Four technical topics for which Open Phil is soliciting grant proposals
17:11 - Alignment Newsletter #167: Concrete ML safety problems and their relevance to x-risk
15:42 - Alignment Newsletter #166: Is it crazy to claim we're in the most important century?
16:05 - Alignment Newsletter #165: When large models are more likely to lie
18:41 - Alignment Newsletter #164: How well can language models write code?
19:28 - Alignment Newsletter #163: Using finite factored sets for causal and temporal inference
15:46 - Alignment Newsletter #162: Foundation models: a paradigm shift within AI
17:39 - Alignment Newsletter #161: Creating generalizable reward functions for multiple tasks by...
17:27 - Alignment Newsletter #160
27:01 - Alignment Newsletter #159: Building agents that know how to experiment, by training on...
15:40 - Alignment Newsletter #158: Should we be optimistic about generalization?
14:18 - Alignment Newsletter #157: Measuring misalignment in the technology underlying Copilot
14:17 - Alignment Newsletter #156: The scaling hypothesis: a plan for building AGI
12:44 - Alignment Newsletter #155: A Minecraft benchmark for algorithms that learn without reward functions
16:06 - Alignment Newsletter #154: What economic growth theory has to say about transformative AI
15:38 - Alignment Newsletter #153: Experiments that demonstrate failures of objective robustness
15:00 - Alignment Newsletter #152: How we’ve overestimated few-shot learning capabilities
11:14 - Alignment Newsletter #151: How sparsity in the final layer makes a neural net debuggable
12:35 - Alignment Newsletter #150: The subtypes of Cooperative AI research
14:15 - Alignment Newsletter #149: The newsletter's editorial policy
21:57 - Alignment Newsletter #148: Analyzing generalization across more axes than just accuracy or loss
13:29 - Alignment Newsletter #147: An overview of the interpretability landscape
15:11 - Alignment Newsletter #146: Plausible stories of how we might fail to avert an existential...
13:40 - Alignment Newsletter #145: Our three year anniversary!
12:46 - Alignment Newsletter #144: How language models can also be finetuned for non-language tasks
14:46 - Alignment Newsletter #143: How to make embedded agents that reason probabilistically about their...
15:56 - Alignment Newsletter #142: The quest to understand a network well enough to reimplement it by hand
16:01 - Alignment Newsletter #141: The case for practicing alignment work on GPT-3 and other large models
19:20 - Alignment Newsletter #140: Theoretical models that predict scaling laws
22:15 - Alignment Newsletter #139: How the simplicity of reality explains the success of neural nets
16:42 - Alignment Newsletter #138: Why AI governance should find problems rather than just solving them
15:48 - Alignment Newsletter #137: Quantifying the benefits of pretraining on downstream task performance
17:20 - Alignment Newsletter #136: How well will GPT-N perform on downstream tasks?
15:49 - Alignment Newsletter #135: Five properties of goal-directed systems
13:18 - Alignment Newsletter #134: Underspecification as a cause of fragility to distribution shift
17:13 - Alignment Newsletter #133: Building machines that can cooperate (with humans, institutions, or...
17:45 - Alignment Newsletter #132: Complex and subtly incorrect arguments as an obstacle to debate
17:06 - Alignment Newsletter #131: Formalizing the argument of ignored attributes in a utility function
18:31 - Alignment Newsletter #128: Prioritizing research on AI existential safety based on its...
22:57 - Alignment Newsletter #127: Rethinking agency: Cartesian frames as a formalization of ways to...
12:09 - Alignment Newsletter #130: A new AI x-risk podcast, and reviews of the field
17:00 - Alignment Newsletter #126: Avoiding wireheading by decoupling action feedback from action effects
15:00 - Alignment Newsletter #123: Inferring what is valuable in order to align recommender systems
18:15 - Alignment Newsletter #124: Provably safe exploration through shielding
14:41 - Alignment Newsletter #125: Neural network scaling laws across multiple modalities
13:12 - Alignment Newsletter #129: Explaining double descent by measuring bias and variance
21:33 - Alignment Newsletter #119: AI safety when agents are shaped by environments, not rewards
18:50 - Alignment Newsletter #97: Are there historical examples of large, robust discontinuities?
12:32 - Alignment Newsletter #115: AI safety research problems in the AI-GA framework
21:11 - Alignment Newsletter #108: Why we should scrutinize arguments for AI risk
19:48 - Alignment Newsletter #118: Risks, solutions, and prioritization in a world with many AI systems
16:39 - Alignment Newsletter #98: Understanding neural net training by seeing which gradients were helpful
20:25 - Alignment Newsletter #105: The economic trajectory of humanity, and what we might mean by...
15:41 - Alignment Newsletter #104: The perils of inaccessible information, and what we can learn about...
22:53 - Alignment Newsletter #100: What might go wrong if you learn a reward function while acting