Andrej, please train an llm with dilated convs, please show people we were doing it wrong this whole time, I know you can do it.
Andrej Karpathy
446K subscribers
We take the 2-layer MLP from previous video and make it deeper with a tree-like structure, arriving at a convolutional neural ...
29 Comments
Andrej, please train an llm with dilated convs, please show people we were doing it wrong this whole time, I know you can do it.
Just wanna say thank you for sharing your experience -- love this from-scratch series starting from first principles!
Awesome series!
Hi @AndrejKarpathy, thanks for recording this for us. I was following the whole way through, and funnily enough, I also wasn't able to beat the 1.993 that you got from this fancy hierarchical network. I actually went back and tuned the single hidden layer network you mentioned above and was able to get that one to perform even better than 1.993. To be exact, I got
train 1.7930818796157837
val 1.9838893413543701
test 1.9920368194580078
just from making the network bigger and running it for longer. But don't worry, it's not that embarrassing, as I'm sure there is a setting in this hierarchical network that will produce better results. It seems like there should be.
I have a challenging question (for me :) ). I made a very simple network which takes x as input and produces y as output. The network looks like this: y = sin(ax + b), where a and b are ... variables. The training data is built from sin(3x+4312). The loss function is quadratic mean. Using the usual approaches, I couldn't make it work! What do you think the problem is?
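The setup in this comment can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the sampling range for x, the learning rate, the step count, and the starting point are not given in the comment, and the helper names (`make_data`, `mse`, `grad`, `fit`) are made up for the example. "Quadratic mean" is read here as mean squared error.

```python
import math

def make_data(n=200, lo=-3.0, hi=3.0):
    # Hypothetical sampling grid; the comment does not say how x was chosen.
    xs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    ys = [math.sin(3 * x + 4312) for x in xs]
    return xs, ys

def mse(a, b, xs, ys):
    # Mean squared error between sin(a*x + b) and the targets.
    return sum((math.sin(a * x + b) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def grad(a, b, xs, ys):
    # Analytic partial derivatives of the mean squared error w.r.t. a and b.
    ga = gb = 0.0
    for x, y in zip(xs, ys):
        err = math.sin(a * x + b) - y
        c = math.cos(a * x + b)
        ga += 2 * err * c * x
        gb += 2 * err * c
    n = len(xs)
    return ga / n, gb / n

def fit(a, b, xs, ys, lr=0.05, steps=2000):
    # Plain gradient descent; lr and steps are arbitrary choices for the sketch.
    for _ in range(steps):
        ga, gb = grad(a, b, xs, ys)
        a -= lr * ga
        b -= lr * gb
    return a, b

xs, ys = make_data()

# The loss is the same at b = 4312 and at b wrapped into [0, 2*pi).
b_wrapped = 4312 % (2 * math.pi)
print("loss at (3, 4312):       ", mse(3.0, 4312.0, xs, ys))
print("loss at (3, 4312 mod 2pi):", mse(3.0, b_wrapped, xs, ys))

# Gradient descent from an arbitrary start; the result depends on the start.
a_fit, b_fit = fit(1.0, 0.0, xs, ys)
print("fitted a, b, loss:", a_fit, b_fit, mse(a_fit, b_fit, xs, ys))
```

Two properties of this loss are worth keeping in mind when debugging: because sine is periodic, b = 4312 and b = 4312 mod 2π give the same fit, and the loss is non-convex in a, so plain gradient descent may be sensitive to where it starts. Whether that is what went wrong in the commenter's run cannot be told from the comment alone.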
Um, can I find Part 6 somewhere? (RNN, LSTM, GRU...) I was under the impression that the next video in the playlist is about building GPT from scratch.
That was a great playlist, easy to understand and very helpful, thank you very much!!
So far THE BEST lecture series I have come across on YouTube. Alongside learning neural networks in this series, I have learned more PyTorch than by watching a PyTorch video s... 6 hrs from a YouTuber.
Thank you so much for creating this video lecture series. Your passion for this topic comes through so vividly in your lectures. I learned so much from every lecture and especially appreciated how the lectures started from the foundational concepts and built up to the state-of-the-art techniques. Thank you!
Please show how to implement conv2D layers as matrix multiplication and cover the math.
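For the request above, one common way to see a 2D convolution as a matrix multiplication is the im2col trick: every kernel-sized patch of the input is unrolled into a row, and the output is that patch matrix times the flattened kernel, i.e. Y[i, j] = sum over (di, dj) of X[i + di, j + dj] * K[di, dj]. The sketch below is a plain-Python illustration, not code from the video. It assumes a single channel, stride 1, no padding, and the cross-correlation convention (no kernel flip) that deep learning libraries typically use. The function names are hypothetical.

```python
def im2col(image, kh, kw):
    # Unroll every kh x kw patch of a 2D list into one row.
    # Result shape: (out_h * out_w) rows, each of length kh * kw.
    height, width = len(image), len(image[0])
    rows = []
    for i in range(height - kh + 1):
        for j in range(width - kw + 1):
            rows.append([image[i + di][j + dj]
                         for di in range(kh) for dj in range(kw)])
    return rows

def conv2d_as_matmul(image, kernel):
    # Convolution expressed as (patch matrix) @ (flattened kernel vector).
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    patches = im2col(image, kh, kw)
    flat_kernel = [v for row in kernel for v in row]
    flat_out = [sum(p * k for p, k in zip(patch, flat_kernel))
                for patch in patches]
    # Reshape the flat result back into an out_h x out_w grid.
    return [flat_out[r * out_w:(r + 1) * out_w] for r in range(out_h)]

def conv2d_direct(image, kernel):
    # Reference implementation with explicit sliding-window loops.
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]

print(conv2d_as_matmul(image, kernel))  # [[-4, -4], [-4, -4]]
print(conv2d_as_matmul(image, kernel) == conv2d_direct(image, kernel))  # True
```

With several output channels, the flattened kernels would be stacked as columns of a matrix, so the whole layer becomes one matrix product over the same patch matrix.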