“reinforcement learning”
Hear how "reinforcement learning" sounds in real speech — 316 examples from 5 videos, movies, and TV shows. Channels: Lex Fridman, The Diary Of A CEO.
Examples in video
YouTube · Lex Fridman · 1.3M views · 2024-03-07
Yann Lecun: Meta AI, Open Source, Limits of LLMs, AGI & the Future of AI | Lex Fridman Podcast #416
“reinforcement learning
with human feedback.”
- So you've mentioned RLHF, … Why do you still hate
reinforcement learning?
Play at 90:00
YouTube · The Diary Of A CEO · 2.4M views · 2025-08-18
Brain Experts WARNING: Watch This Before Using ChatGPT Again! (Shocking New Discovery)
“reinforcement learning. and and if we”
have basal ganglia. They they don't use … want to make them uh to be adopt a
Play at 24:55
YouTube · Lex Fridman · 4.3M views · 2022-02-26
Mark Zuckerberg: Meta, Facebook, Instagram, and the Metaverse | Lex Fridman Podcast #267
“reinforcement learning on it i was gonna”
around and i've just been doing … release it
Play at 36:20
YouTube · Lex Fridman · 2.1M views · 2023-03-30
Eliezer Yudkowsky: Dangers of AI and the End of Human Civilization | Lex Fridman Podcast #368
“reinforcement learning by human feedback”
showing that … has made the GPT series worse in some
Play at 9:40
YouTube · Lex Fridman · 787K views · 2026-01-31
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490
“reinforcement learning with human feedback.
So it's more on the algorithmic side than the”
What was new was adding
supervised fine-tuning and … architecture.
Play at 45:40
YouTube · Lex Fridman · 787K views · 2026-01-31
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490
“reinforcement learning with verifiable
rewards training just kind of let the models”
when we say enabled, is almost entirely
downstream of the fact that this … pick up these skills very
easily. So let the models learn,
Play at 50:21
YouTube · Lex Fridman · 787K views · 2026-01-31
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490
“Reinforcement learning is all about
optimizing reward. In practice,”
completions, and these completions
are what you're going to grade. … you can have a lot of different actors
in different parts of the world
Play at 59:57
YouTube · Lex Fridman · 787K views · 2026-01-31
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490
“reinforcement learning with verifiable
rewards. You can scale up the training”
- The biggest one from 2025 is learning this … there, which means doing a lot of
this kind of iterative generate-grade
Play at 97:35
YouTube · Lex Fridman · 787K views · 2026-01-31
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490
“reinforcement learning from human
feedback, where in that era the score”
RL gradient updates. The
infrastructure evolved from … they were trying to optimize was
a learned reward model of human
Play at 100:05
YouTube · Lex Fridman · 787K views · 2026-01-31
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490
“Reinforcement Learning with Verifiable
Rewards—in real scientific domains,”
- There are interesting bets. A lot
of people are trying to do RLVR— … where startups with hundreds of millions
of funding have wet labs where they're
Play at 193:42