ClipPhrase

reinforcement learning

Hear how the phrase "reinforcement learning" sounds in real speech: 316 examples from 5 videos, films, and TV series. Channels: Lex Fridman, The Diary Of A CEO.

316 clips found · 5 videos

Examples from videos

YouTube · Lex Fridman · 1.3M views · 2024-03-07
Yann Lecun: Meta AI, Open Source, Limits of LLMs, AGI & the Future of AI | Lex Fridman Podcast #416
“- So you've mentioned RLHF, reinforcement learning with human feedback. Why do you still hate reinforcement learning?”
Play 90:00
YouTube · The Diary Of A CEO · 2.4M views · 2025-08-18
Brain Experts WARNING: Watch This Before Using ChatGPT Again! (Shocking New Discovery)
“have basal ganglia. They they don't use reinforcement learning. and and if we want to make them uh to be adopt a”
Play 24:55
YouTube · Lex Fridman · 4.3M views · 2022-02-26
Mark Zuckerberg: Meta, Facebook, Instagram, and the Metaverse | Lex Fridman Podcast #267
“around and i've just been doing reinforcement learning on it i was gonna release it”
Play 36:20
YouTube · Lex Fridman · 2.1M views · 2023-03-30
Eliezer Yudkowsky: Dangers of AI and the End of Human Civilization | Lex Fridman Podcast #368
“showing that reinforcement learning by human feedback has made the GPT series worse in some”
Play 9:40
YouTube · Lex Fridman · 787K views · 2026-01-31
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490
“What was new was adding supervised fine-tuning and reinforcement learning with human feedback. So it's more on the algorithmic side than the architecture.”
Play 45:40
YouTube · Lex Fridman · 787K views · 2026-01-31
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490
“when we say enabled, is almost entirely downstream of the fact that this reinforcement learning with verifiable rewards training just kind of let the models pick up these skills very easily. So let the models learn,”
Play 50:21
YouTube · Lex Fridman · 787K views · 2026-01-31
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490
“completions, and these completions are what you're going to grade. Reinforcement learning is all about optimizing reward. In practice, you can have a lot of different actors in different parts of the world”
Play 59:57
YouTube · Lex Fridman · 787K views · 2026-01-31
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490
“- The biggest one from 2025 is learning this reinforcement learning with verifiable rewards. You can scale up the training there, which means doing a lot of this kind of iterative generate-grade”
Play 97:35
YouTube · Lex Fridman · 787K views · 2026-01-31
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490
“RL gradient updates. The infrastructure evolved from reinforcement learning from human feedback, where in that era the score they were trying to optimize was a learned reward model of human”
Play 100:05
YouTube · Lex Fridman · 787K views · 2026-01-31
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490
“- There are interesting bets. A lot of people are trying to do RLVR, Reinforcement Learning with Verifiable Rewards, in real scientific domains, where startups with hundreds of millions of funding have wet labs where they're”
Play 193:42