“reinforcement learning”

Hear how "reinforcement learning" sounds in real speech: 316 examples from 5 videos, films, and series. Channels: Lex Fridman, The Diary Of A CEO.

316 clips found · 5 videos

Examples in videos

YouTube · Lex Fridman · 1.3M views · 2024-03-07

Yann Lecun: Meta AI, Open Source, Limits of LLMs, AGI & the Future of AI | Lex Fridman Podcast #416

“reinforcement learning with human feedback.”

- So you've mentioned RLHF, … Why do you still hate reinforcement learning?

Play at 90:00

YouTube · The Diary Of A CEO · 2.4M views · 2025-08-18

Brain Experts WARNING: Watch This Before Using ChatGPT Again! (Shocking New Discovery)

“reinforcement learning. and and if we”

have basal ganglia. They they don't use … want to make them uh to be adopt a

Play at 24:55

YouTube · Lex Fridman · 4.3M views · 2022-02-26

Mark Zuckerberg: Meta, Facebook, Instagram, and the Metaverse | Lex Fridman Podcast #267

“reinforcement learning on it i was gonna”

around and i've just been doing … release it

Play at 36:20

YouTube · Lex Fridman · 2.1M views · 2023-03-30

Eliezer Yudkowsky: Dangers of AI and the End of Human Civilization | Lex Fridman Podcast #368

“reinforcement learning by human feedback”

showing that … has made the GPT series worse in some

Play at 9:40

YouTube · Lex Fridman · 787K views · 2026-01-31

State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490

“reinforcement learning with human feedback. So it's more on the algorithmic side than the”

What was new was adding supervised fine-tuning and … architecture.

Play at 45:40

YouTube · Lex Fridman · 787K views · 2026-01-31

State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490

“reinforcement learning with verifiable rewards training just kind of let the models”

when we say enabled, is almost entirely downstream of the fact that this … pick up these skills very easily. So let the models learn,

Play at 50:21

YouTube · Lex Fridman · 787K views · 2026-01-31

State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490

“Reinforcement learning is all about optimizing reward. In practice,”

completions, and these completions are what you're going to grade. … you can have a lot of different actors in different parts of the world

Play at 59:57

YouTube · Lex Fridman · 787K views · 2026-01-31

State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490

“reinforcement learning with verifiable rewards. You can scale up the training”

- The biggest one from 2025 is learning this … there, which means doing a lot of this kind of iterative generate-grade

Play at 97:35

YouTube · Lex Fridman · 787K views · 2026-01-31

State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490

“reinforcement learning from human feedback, where in that era the score”

RL gradient updates. The infrastructure evolved from … they were trying to optimize was a learned reward model of human

Play at 100:05

YouTube · Lex Fridman · 787K views · 2026-01-31

State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490

“Reinforcement Learning with Verifiable Rewards—in real scientific domains,”

- There are interesting bets. A lot of people are trying to do RLVR— … where startups with hundreds of millions of funding have wet labs where they're

Play at 193:42
