Anything else I missed about Q*? Leave a comment & let me know!
AI Jason
100K subscribers
Links - Jim Fan's tweet: https://twitter.com/DrJimFan/status/1728100123862004105 - Reinforcement learning deep dive: ...
20 Comments
Hey, can you please make a video on detecting significant insights using reinforcement learning?
I was curious about making the model learn by itself about the irregular patterns that need to be classified using reinforcement learning.
Great overview! Jason, your videos on the AI topic are the best!
00:00 "Q Star" is generating a lot of discussion in the AI community, and it's associated with OpenAI's recent actions, but its exact nature remains speculative.
01:08 Reinforcement learning is a machine learning framework where an agent learns from trial and error, aiming to maximize future rewards. It can involve policy networks and value networks.
03:25 Reinforcement learning allows AI agents to self-play and discover new strategies, as demonstrated by DeepMind's achievements in Atari Breakout and, with AlphaGo, in Go.
08:01 There's speculation that "Q Star" could involve using policy networks and value networks, similar to AlphaGo, to improve reasoning and logic in large language models like GPT.
11:14 You can experiment with reinforcement learning in simple games with open-source projects, even if you're new to the field.
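For anyone who wants to see the trial-and-error loop from the 01:08 point in code, here is a minimal sketch using tabular Q-learning on a toy corridor. It is only an illustration of classic Q-learning, not anything confirmed about Q*. The corridor environment, the function names (`step`, `train_q_table`, `greedy_policy`), and the hyperparameters are all made up for this example.

```python
import random

N_STATES = 5          # hypothetical corridor: states 0..4, goal at state 4
ACTIONS = (-1, +1)    # action index 0 moves left, index 1 moves right


def step(state, action):
    """Toy environment: reward 1.0 only when the goal state is reached."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train_q_table(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values by trial and error with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, otherwise act greedily
            # (ties are broken in favour of moving right).
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning target: immediate reward plus discounted best future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best move (-1 or +1) for every non-terminal state."""
    return [ACTIONS[0 if row[0] > row[1] else 1] for row in q[:-1]]
```

Calling `greedy_policy(train_q_table())` gives the move the agent prefers in each non-goal state. The learned values are higher the closer a state is to the goal, which is the "maximize future rewards" idea in miniature.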
How does the reward system work for reinforcing behavior beyond Pavlovian bell sounds that signal approval?
Can't wait for it to be open sourced
Summary of Key Points:
Reinforcement learning is a machine learning framework that allows AI to learn from its own trials and errors by receiving rewards or penalties based on its actions.
AI systems like DeepMind's AlphaGo have achieved superhuman performance in tasks through reinforcement learning, discovering new strategies in the process.
Reinforcement learning could be applied to large language models like GPT, improving reasoning and logic capabilities by proposing multiple solutions and evaluating their value.
OpenAI's research paper "Let's Verify Step by Step" explores a reward model for large language models, involving another model critiquing the reasoning process for better results.
Additional Insights and Observations:
"The ability of AI to explore different paths and uncover novel solutions is seen as a promising development."
No specific data or statistics were mentioned in the video.
OpenAI's research paper "Let's Verify Step by Step" can be referenced for further information on the reward model for large language models.
Concluding Remarks:
Reinforcement learning is a powerful framework in AI that allows machines to learn from their own experiences. It has shown remarkable success in tasks like playing games and could potentially enhance the reasoning and logic capabilities of large language models. OpenAI's reported breakthrough, Q*, has sparked excitement and speculation within the AI community, and further research, like the "Let's Verify Step by Step" paper, is exploring new ways to improve language models through reinforcement learning.
Generated using Talkbud (Browser Extension)
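The idea of "proposing multiple solutions and evaluating their value" with a model that critiques each reasoning step can be sketched roughly as below. This is a toy under loud assumptions: `score_step` is a hypothetical rule-based stand-in for a learned step verifier (it only checks simple integer arithmetic), the 0.95 / 0.05 / 0.5 scores are arbitrary, and multiplying the step scores is just one simple way to rank whole solutions. None of it is taken from OpenAI's code.

```python
import math
import re

# Matches steps such as "2 + 3 = 5" (integers only, operators +, -, *).
STEP_PATTERN = re.compile(r"^\s*(-?\d+)\s*([+\-*])\s*(-?\d+)\s*=\s*(-?\d+)\s*$")


def score_step(step: str) -> float:
    """Hypothetical verifier: pseudo-probability that one reasoning step is correct."""
    match = STEP_PATTERN.match(step)
    if match is None:
        return 0.5  # step cannot be parsed, so stay neutral
    left, op, right, claimed = match.groups()
    a, b = int(left), int(right)
    actual = {"+": a + b, "-": a - b, "*": a * b}[op]
    return 0.95 if actual == int(claimed) else 0.05


def score_solution(steps: list[str]) -> float:
    """Score a whole solution as the product of its per-step scores."""
    return math.prod(score_step(s) for s in steps)


def pick_best(candidates: list[list[str]]) -> list[str]:
    """Best-of-N selection: keep the candidate the verifier trusts most."""
    return max(candidates, key=score_solution)
```

For example, given the candidates `[["2 + 3 = 5", "5 * 4 = 21"], ["2 + 3 = 5", "5 * 4 = 20"]]`, `pick_best` keeps the second one, because a single bad step drags down the score of the whole chain.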
Very well organized and informative presentation.
I think Q* must be open source for the benefit of humanity, not only for big companies.
Open*AI
Q is question and * is repeat, so if you synthesize the many answers you get, you have a generally intelligent answer. My noob opinion.