ClipPhrase

reinforcement learning

Hear "reinforcement learning" in real speech: 316 examples from 5 videos, movies, and TV series. Channels: Lex Fridman, The Diary Of A CEO.

316 clips found
5 videos

Examples in videos

YouTube · Lex Fridman · 1.3M views · 2024-03-07
Yann Lecun: Meta AI, Open Source, Limits of LLMs, AGI & the Future of AI | Lex Fridman Podcast #416
- So you've mentioned RLHF, reinforcement learning with human feedback. Why do you still hate reinforcement learning?
Play at 90:00
YouTube · The Diary Of A CEO · 2.4M views · 2025-08-18
Brain Experts WARNING: Watch This Before Using ChatGPT Again! (Shocking New Discovery)
have basal ganglia. They they don't use reinforcement learning. and and if we want to make them uh to be adopt a
Play at 24:55
YouTube · Lex Fridman · 4.3M views · 2022-02-26
Mark Zuckerberg: Meta, Facebook, Instagram, and the Metaverse | Lex Fridman Podcast #267
around and i've just been doing reinforcement learning on it i was gonna release it
Play at 36:20
YouTube · Lex Fridman · 2.1M views · 2023-03-30
Eliezer Yudkowsky: Dangers of AI and the End of Human Civilization | Lex Fridman Podcast #368
showing that reinforcement learning by human feedback has made the GPT series worse in some
Play at 9:40
YouTube · Lex Fridman · 787K views · 2026-01-31
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490
What was new was adding supervised fine-tuning and reinforcement learning with human feedback. So it's more on the algorithmic side than the architecture.
Play at 45:40
YouTube · Lex Fridman · 787K views · 2026-01-31
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490
when we say enabled, is almost entirely downstream of the fact that this reinforcement learning with verifiable rewards training just kind of let the models pick up these skills very easily. So let the models learn,
Play at 50:21
YouTube · Lex Fridman · 787K views · 2026-01-31
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490
completions, and these completions are what you're going to grade. Reinforcement learning is all about optimizing reward. In practice, you can have a lot of different actors in different parts of the world
Play at 59:57
YouTube · Lex Fridman · 787K views · 2026-01-31
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490
- The biggest one from 2025 is learning this reinforcement learning with verifiable rewards. You can scale up the training there, which means doing a lot of this kind of iterative generate-grade
Play at 97:35
YouTube · Lex Fridman · 787K views · 2026-01-31
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490
RL gradient updates. The infrastructure evolved from reinforcement learning from human feedback, where in that era the score they were trying to optimize was a learned reward model of human
Play at 100:05
YouTube · Lex Fridman · 787K views · 2026-01-31
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490
- There are interesting bets. A lot of people are trying to do RLVR (Reinforcement Learning with Verifiable Rewards) in real scientific domains, where startups with hundreds of millions of funding have wet labs where they're
Play at 193:42