ClipPhrase

reinforcement learning

Hear how "reinforcement learning" sounds in real speech — 316 examples from 5 videos, movies, and TV shows. Channels: Lex Fridman, The Diary Of A CEO.

316 clips found · 5 videos

Examples in video

YouTube · Lex Fridman · 1.3M views · 2024-03-07
Yann Lecun: Meta AI, Open Source, Limits of LLMs, AGI & the Future of AI | Lex Fridman Podcast #416
“- So you've mentioned RLHF, reinforcement learning with human feedback. Why do you still hate reinforcement learning?”
Play at 90:00
YouTube · The Diary Of A CEO · 2.4M views · 2025-08-18
Brain Experts WARNING: Watch This Before Using ChatGPT Again! (Shocking New Discovery)
“have basal ganglia. They they don't use reinforcement learning. and and if we want to make them uh to be adopt a”
Play at 24:55
YouTube · Lex Fridman · 4.3M views · 2022-02-26
Mark Zuckerberg: Meta, Facebook, Instagram, and the Metaverse | Lex Fridman Podcast #267
“around and i've just been doing reinforcement learning on it i was gonna release it”
Play at 36:20
YouTube · Lex Fridman · 2.1M views · 2023-03-30
Eliezer Yudkowsky: Dangers of AI and the End of Human Civilization | Lex Fridman Podcast #368
“showing that reinforcement learning by human feedback has made the GPT series worse in some”
Play at 9:40
YouTube · Lex Fridman · 787K views · 2026-01-31
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490
“What was new was adding supervised fine-tuning and reinforcement learning with human feedback. So it's more on the algorithmic side than the architecture.”
Play at 45:40
YouTube · Lex Fridman · 787K views · 2026-01-31
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490
“when we say enabled, is almost entirely downstream of the fact that this reinforcement learning with verifiable rewards training just kind of let the models pick up these skills very easily. So let the models learn,”
Play at 50:21
YouTube · Lex Fridman · 787K views · 2026-01-31
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490
“completions, and these completions are what you're going to grade. Reinforcement learning is all about optimizing reward. In practice, you can have a lot of different actors in different parts of the world”
Play at 59:57
YouTube · Lex Fridman · 787K views · 2026-01-31
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490
“- The biggest one from 2025 is learning this reinforcement learning with verifiable rewards. You can scale up the training there, which means doing a lot of this kind of iterative generate-grade”
Play at 97:35
YouTube · Lex Fridman · 787K views · 2026-01-31
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490
“RL gradient updates. The infrastructure evolved from reinforcement learning from human feedback, where in that era the score they were trying to optimize was a learned reward model of human”
Play at 100:05
YouTube · Lex Fridman · 787K views · 2026-01-31
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490
“- There are interesting bets. A lot of people are trying to do RLVR (Reinforcement Learning with Verifiable Rewards) in real scientific domains, where startups with hundreds of millions of funding have wet labs where they're”
Play at 193:42