A naive way around that specific type of continual learning would be to see how well it lowers loss on knowable, falsifiable objectives, the way you do with reasoning.
If you just feed racism into it, that's not going to help it write code better, and if anything it will make it worse, because that would be teaching it to ignore empirical evidence.
In general, destructive, ignorant thinking doesn't help you achieve much pragmatically.
The one place I can think of where this might go wrong is those experiments performed by Nazi neuroscientists, but even there, I think it already has the foundational knowledge to appreciate more nuance on this topic.
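Roughly what I mean, as a toy Python sketch (the helper names, datasets, and loss numbers are all made up for illustration, not anything any lab has described): a candidate weight update learned from interaction only gets committed if it doesn't raise loss on held-out, verifiable objectives like unit-tested code or math with known answers.

# Hypothetical gate for continual-learning updates: keep an update only if it
# does not get worse on any fixed, falsifiable evaluation set.
def avg_loss(loss_fn, dataset):
    # Average per-example loss of a model (represented here as a loss callable).
    return sum(loss_fn(x) for x in dataset) / len(dataset)

def gate_update(current_loss_fn, candidate_loss_fn, heldout_sets, tolerance=0.0):
    # Reject the candidate if it regresses on any held-out, verifiable set.
    for ds in heldout_sets.values():
        if avg_loss(candidate_loss_fn, ds) > avg_loss(current_loss_fn, ds) + tolerance:
            return False
    return True

# Toy usage with stand-in loss functions and made-up benchmark names:
heldout = {"code_tests": [1, 2, 3], "math_answers": [4, 5]}
current = lambda x: 1.0     # pretend per-example loss of the deployed model
candidate = lambda x: 0.9   # pretend per-example loss after the candidate update
print(gate_update(current, candidate, heldout))   # True -> safe to commit the update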
The Definition of AGI paper really feels more like a survey of different attempts to define AGI than a conclusive definition.
Aside from that, another excellent video, dude!
I actually felt a sinking feeling in my gut after hearing 7:39.
Can't you just give each company a small private submodel that can be trained privately without contaminating the main GPT-5?
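One common way that could plausibly look, as a rough PyTorch sketch (the tenant names and sizes are made up, and this is not how OpenAI actually deploys anything): freeze the shared base weights and train only a small per-company adapter on that company's data, so its updates never touch the main model.

import torch
import torch.nn as nn

class SharedBase(nn.Module):
    # Stand-in for the shared, frozen base model ("the main GPT-5").
    def __init__(self, d=64):
        super().__init__()
        self.layer = nn.Linear(d, d)
    def forward(self, x):
        return torch.relu(self.layer(x))

class CompanyAdapter(nn.Module):
    # Small low-rank adapter trained only on one company's data.
    def __init__(self, d=64, rank=4):
        super().__init__()
        self.down = nn.Linear(d, rank, bias=False)
        self.up = nn.Linear(rank, d, bias=False)
    def forward(self, h):
        return h + self.up(self.down(h))

base = SharedBase()
for p in base.parameters():
    p.requires_grad_(False)   # the shared weights never change

adapters = {"acme_corp": CompanyAdapter(), "globex": CompanyAdapter()}  # hypothetical tenants
opt = torch.optim.Adam(adapters["acme_corp"].parameters(), lr=1e-3)     # train only this adapter

x = torch.randn(8, 64)
out = adapters["acme_corp"](base(x))   # per-company output; other tenants are unaffected
loss = out.pow(2).mean()               # stand-in training objective
loss.backward()
opt.step()                             # updates only acme_corp's adapter weights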
Continual learning is basically the holy grail of AI, particularly if it can learn from a small number of examples. But one of the issues is that when the model is updated, it has a tendency to degrade previous knowledge or abilities, just like when you fine-tune a normal model. E.g., if you fine-tune an LLM to tell jokes, it becomes better at telling jokes but worse at doing everything else. I believe this is called "catastrophic forgetting".
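One standard partial fix is rehearsal/replay: mix retained examples from older tasks into every fine-tuning batch so the gradient keeps pulling toward the old abilities too. A toy PyTorch sketch of just that idea (the model and data here are stand-ins, not a real LLM):

import random
import torch
import torch.nn as nn

model = nn.Linear(16, 4)                               # stand-in for the model being fine-tuned
opt = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

def make_batch(n=8):
    # Toy data generator standing in for real training examples.
    return torch.randn(n, 16), torch.randint(0, 4, (n,))

old_task_buffer = [make_batch() for _ in range(50)]    # retained examples of earlier abilities
new_task_data = [make_batch() for _ in range(200)]     # e.g. the "joke telling" fine-tune set

for new_x, new_y in new_task_data:
    # Replay: pair every new-task batch with a sampled old-task batch.
    old_x, old_y = random.choice(old_task_buffer)
    x = torch.cat([new_x, old_x])
    y = torch.cat([new_y, old_y])
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    opt.step()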
What happened to your ringing endorsement for "Free Palestine"?
While the U.K. is under foreign occupation.
Sorry Philip, but seeing GPT-5 Pro below Gemini 2.5 Pro on SimpleBench makes me question the relevance/utility of the benchmark. Anyone who's used these models knows that they are not even on the same plane of existence.
Models learning from user interaction would have to be able to highly curate their own learning. How that could be done... no idea.
As for the methodology to enable RL from real-time interaction, they could go back a couple of decades to what Cisco did back then (I haven't kept up enough to know if they still do, or if others have done something like this, since I've retired). They used to give and track a sort of "reputation" score for end users of Cisco routers, etc. (often network engineers within major companies, top consultants, etc.), who could view active tech support issues with the company name blanked out, so just the raw issue: what model of device, what the problem was. If there was an unsolvable tech support issue, the body of users at large could weigh in if they thought they had a solution to suggest. But the key was that for the highest tier of those super-high-reputation users, Cisco tech support wouldn't vet the solution before releasing it: Cisco would let the solution go straight to the person facing the problem, because the user had earned such a high trust level that they would, like a doctor, "do no harm". This seemed brilliant, since it could shorten the duration of what was an unknown but possibly huge issue for a major company, and it got a body of users helping each other out of binds with their pooled knowledge.
I was involved in Knowledge Management in the early Internet years and wound up in a couple of cover stories for the industry mags back then, PC Week and InfoWorld, for adopting self-service tech support quite early on. I recognized that the Cisco reputation system could have applicability in many other use cases, so today it could score end users of AI systems highly enough that they would again "do no harm" with their data, which could then be trusted as an RL source for unfiltered inclusion.
Imagine the huge body of engineers and top users that should be able to earn a high enough trust score that their data can be intrinsically trusted. That could be hugely powerful. It might also create some "competition" for that user base, since they might feel some "skin in the game" with a particular AI they are "supporting" in that manner, and in fact a company like OpenAI could even incentivize this by rewarding them. Imagine if, for example, the ChatGPT Pro tier were discounted 25% for these trusted users, and you've got a completely new symbiotic relationship.
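As a toy Python sketch of how that gating could work (the threshold, the scoring rule, and the 25% discount number are purely illustrative, not anything OpenAI or Cisco has published):

from dataclasses import dataclass

TRUST_THRESHOLD = 0.9   # hypothetical cutoff for the "do no harm" tier
PRO_DISCOUNT = 0.25     # the 25% Pro-tier discount suggested above

@dataclass
class UserReputation:
    accepted: int = 0   # contributions later confirmed as good
    rejected: int = 0   # contributions later found wrong or harmful

    @property
    def score(self):
        total = self.accepted + self.rejected
        return self.accepted / total if total else 0.0

def route_contribution(rep, contribution, rl_queue, review_queue):
    # High-trust users' data goes straight into the RL data queue, unvetted;
    # everyone else's goes through review first.
    if rep.score >= TRUST_THRESHOLD and (rep.accepted + rep.rejected) >= 20:
        rl_queue.append(contribution)
    else:
        review_queue.append(contribution)

def subscription_price(base_price, rep):
    # Incentive: discount the subscription for trusted contributors.
    return base_price * (1 - PRO_DISCOUNT) if rep.score >= TRUST_THRESHOLD else base_price

# Toy usage:
rep = UserReputation(accepted=30, rejected=1)
rl_queue, review_queue = [], []
route_contribution(rep, {"prompt": "example", "fix": "example"}, rl_queue, review_queue)
print(len(rl_queue), subscription_price(200, rep))   # 1 150.0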
How is Sora making money?