Anything else I missed about Q*? Leave a comment & let me know!
AI Jason
118K subscribers
Links
- Jim Fan's tweet: https://twitter.com/DrJimFan/status/1728100123862004105
- Reinforcement learning deep dive: ...
22 Comments
Hey, can you please make a video on detecting some significant insight using reinforcement learning?
I was curious about making the model learn by itself about the irregular pattern that needs to be classified using reinforcement learning.
Great overview! Jason, your videos on the AI topic are the best!
00:00 "Q Star" is generating a lot of discussion in the AI community, and it's associated with OpenAI's recent actions, but its exact nature remains speculative.
01:08 Reinforcement learning is a machine learning framework where an agent learns from trial and error, aiming to maximize future rewards. Many approaches use policy networks and value networks.
03:25 Reinforcement learning allows AI agents to discover new strategies, including through self-play, as demonstrated by DeepMind's results on Breakout and by AlphaGo.
08:01 There's speculation that "Q Star" could involve using policy networks and value networks, similar to AlphaGo, to improve reasoning and logic in large language models like GPT.
11:14 You can experiment with reinforcement learning in simple games with open-source projects, even if you're new to the field.
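For anyone who wants to try the trial-and-error idea from the summary above, here is a minimal sketch of tabular Q-learning. It is not from the video: the five-cell corridor, the `step` and `train` names, and the hyperparameters are all made up for illustration, and it uses a plain lookup table instead of the policy and value networks mentioned above.

```python
import random

N_STATES = 5          # hypothetical toy corridor: cells 0..4, cell 4 is the goal
ACTIONS = (-1, +1)    # action 0 = step left, action 1 = step right


def step(state, action_idx):
    """Move one cell (clamped to the corridor); reward 1.0 only on reaching the goal."""
    nxt = min(max(state + ACTIONS[action_idx], 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table q[state][action] of estimated discounted future reward."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Trial and error: explore at random sometimes (and on ties),
            # otherwise take the action that currently looks best.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            nxt, reward, done = step(state, action)
            # Q-learning update: nudge the estimate toward
            # reward + discounted value of the best next action.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q


q_table = train()
# Greedy action for each non-goal cell after training (1 means "step right").
greedy_policy = [0 if left > right else 1 for left, right in q_table[:-1]]
print(greedy_policy)
```

The reward arrives only at the goal, so the agent has to propagate that signal backward through the table over many episodes, which is the "maximize future rewards" part of the summary.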
How does the reward system work for reinforcing behavior beyond Pavlovian bell sounds that signal approval?
Can't wait for it to be open sourced
Summary of Key Points:
- Reinforcement learning is a machine learning framework that allows AI to learn from its own trials and errors by receiving rewards or penalties based on its actions.
- AI systems like DeepMind's AlphaGo have achieved superhuman performance at Go through reinforcement learning, discovering new strategies in the process.
- Reinforcement learning could be applied to large language models like GPT, improving reasoning and logic capabilities by proposing multiple solutions and evaluating their value.
- OpenAI's research paper "Let's Verify Step by Step" explores a reward model for large language models, in which another model critiques the reasoning process for better results.
Additional Insights and Observations:
- "The ability of AI to explore different paths and uncover novel solutions is seen as a promising development."
- No specific data or statistics were mentioned in the video.
- OpenAI's research paper "Let's Verify Step by Step" can be referenced for further information on the reward model for large language models.
Concluding Remarks:
Reinforcement learning is a powerful framework in AI that allows machines to learn from their own experiences. It has shown remarkable success in tasks like playing games and could potentially enhance the reasoning and logic capabilities of large language models. OpenAI's reported breakthrough, Q*, has sparked excitement and speculation within the AI community, and further research, like the "Let's Verify Step by Step" paper, is exploring new ways to improve language models through reinforcement learning.
Generated using Talkbud (Browser Extension)
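The "propose multiple solutions and evaluate their value" idea in the summary above can be sketched roughly as follows. This is only an illustration under stated assumptions: `score_step` is a hypothetical stand-in for a learned reward model (here it just checks simple `a + b = c` arithmetic), and multiplying the per-step scores is one possible way to rank whole solutions, not a confirmed description of Q* or of any OpenAI system.

```python
from math import prod


def score_step(step: str) -> float:
    """Hypothetical stand-in for a learned step-level reward model.

    Returns a high score if an 'a + b = c' step is arithmetically correct,
    and a low score otherwise.
    """
    left, right = step.split("=")
    a, b = (int(x) for x in left.split("+"))
    return 0.95 if a + b == int(right) else 0.05


def score_solution(steps: list[str]) -> float:
    """Score a whole solution as the product of its step scores (an assumption)."""
    return prod(score_step(s) for s in steps)


def pick_best(candidates: list[list[str]]) -> list[str]:
    """Return the candidate solution with the highest overall score."""
    return max(candidates, key=score_solution)


# Three made-up candidate solutions to "2 + 3 + 4".
candidates = [
    ["2 + 3 = 5", "5 + 4 = 10"],  # second step is wrong
    ["2 + 3 = 5", "5 + 4 = 9"],   # every step is correct
    ["2 + 3 = 6", "6 + 4 = 10"],  # first step is wrong
]
best = pick_best(candidates)
print(best)
```

Scoring each step rather than only the final answer is what lets a single bad step sink an otherwise plausible-looking solution.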
Very well organized and informative presentation.
I think Q* must be OPEN SOURCE for the benefit of humanity. Not only for big companies.
Open*AI
Q is question and * is repeat, so if you make a synthesis of a lot of answers you get a generally intelligent answer. My noob opinion.