ClipPhrase

reinforcement learning

Hear how "reinforcement learning" sounds in real speech: 316 examples from 5 videos, movies, and series. Channels: Lex Fridman, The Diary Of A CEO.

316 clips found
5 videos

Examples in videos

YouTube · Lex Fridman · 1.3M views · 2024-03-07
Yann Lecun: Meta AI, Open Source, Limits of LLMs, AGI & the Future of AI | Lex Fridman Podcast #416
“- So you've mentioned RLHF, reinforcement learning with human feedback. Why do you still hate reinforcement learning?”
Play at 90:00
YouTube · The Diary Of A CEO · 2.4M views · 2025-08-18
Brain Experts WARNING: Watch This Before Using ChatGPT Again! (Shocking New Discovery)
“have basal ganglia. They they don't use reinforcement learning. and and if we want to make them uh to be adopt a”
Play at 24:55
YouTube · Lex Fridman · 4.3M views · 2022-02-26
Mark Zuckerberg: Meta, Facebook, Instagram, and the Metaverse | Lex Fridman Podcast #267
“around and i've just been doing reinforcement learning on it i was gonna release it”
Play at 36:20
YouTube · Lex Fridman · 2.1M views · 2023-03-30
Eliezer Yudkowsky: Dangers of AI and the End of Human Civilization | Lex Fridman Podcast #368
“showing that reinforcement learning by human feedback has made the GPT series worse in some”
Play at 9:40
YouTube · Lex Fridman · 787K views · 2026-01-31
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490
“What was new was adding supervised fine-tuning and reinforcement learning with human feedback. So it's more on the algorithmic side than the architecture.”
Play at 45:40
YouTube · Lex Fridman · 787K views · 2026-01-31
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490
“when we say enabled, is almost entirely downstream of the fact that this reinforcement learning with verifiable rewards training just kind of let the models pick up these skills very easily. So let the models learn,”
Play at 50:21
YouTube · Lex Fridman · 787K views · 2026-01-31
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490
“completions, and these completions are what you're going to grade. Reinforcement learning is all about optimizing reward. In practice, you can have a lot of different actors in different parts of the world”
Play at 59:57
YouTube · Lex Fridman · 787K views · 2026-01-31
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490
“- The biggest one from 2025 is learning this reinforcement learning with verifiable rewards. You can scale up the training there, which means doing a lot of this kind of iterative generate-grade”
Play at 97:35
YouTube · Lex Fridman · 787K views · 2026-01-31
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490
“RL gradient updates. The infrastructure evolved from reinforcement learning from human feedback, where in that era the score they were trying to optimize was a learned reward model of human”
Play at 100:05
YouTube · Lex Fridman · 787K views · 2026-01-31
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490
“- There are interesting bets. A lot of people are trying to do RLVR, Reinforcement Learning with Verifiable Rewards, in real scientific domains, where startups with hundreds of millions of funding have wet labs where they're”
Play at 193:42

Related phrases