What Elon Musk’s Grok 4 can do that others can’t

Tech mogul claims new AI model surpasses PhD-level intelligence across all subjects
Photo credit: shutterstock.com/Tigarto

Elon Musk has unveiled his most ambitious artificial intelligence project yet with the launch of Grok 4, an advanced AI model that comes with a premium price tag of $300 per month for the SuperGrok Heavy subscription plan. The tech mogul made sweeping claims about the system’s capabilities during a Wednesday night launch event, positioning it as a game-changing advancement in AI technology.

The new AI model represents a significant leap forward for Musk’s xAI company, which has been working to compete with established players like OpenAI, Google, and Anthropic in the rapidly evolving artificial intelligence market. Grok 4 is designed to answer questions, analyze images, and provide real-time assistance across a wide range of topics and applications.


The launch comes at a crucial time for xAI as the company seeks to establish itself as a major force in the AI industry. With deep integration into Musk’s social media platform X, Grok 4 has immediate access to a massive user base, though this integration has also brought increased scrutiny and controversy to the platform.

1. Bold claims about PhD-level intelligence

Musk made extraordinary assertions about Grok 4’s intellectual capabilities, claiming the AI model outperforms PhD-level experts in every academic subject. These ambitious claims position Grok 4 as potentially the most advanced AI system currently available to consumers.


However, Musk acknowledged significant limitations in the current version, admitting that the model still lacks common sense and has not made any groundbreaking scientific discoveries. He expressed confidence that these shortcomings represent temporary challenges rather than fundamental limitations.

The disconnect between Musk’s bold claims and acknowledged limitations highlights the ongoing challenges in AI development. While modern AI systems can process vast amounts of information and perform complex tasks, they still struggle with basic reasoning and creative problem-solving that humans take for granted.

The comparison to PhD-level intelligence raises questions about how AI capabilities should be measured and evaluated. Academic performance on standardized tests may not fully capture the nuanced thinking and real-world problem-solving abilities that define human expertise.

2. Two-tier system with Heavy variant

The xAI launch introduced two distinct models: the standard Grok 4 and the more advanced Grok 4 Heavy. The Heavy version operates as a multi-agent system that approaches problems by generating multiple solutions simultaneously and then comparing them to determine the best approach.

Musk described this methodology as similar to a study group, where different perspectives and approaches are considered before reaching a conclusion. This collaborative approach aims to improve accuracy and provide deeper insights, particularly for complex problem-solving scenarios.

The multi-agent system represents an innovative approach to AI development that could potentially address some of the limitations of traditional single-model systems. By having multiple AI agents work together, the system might achieve better results than any individual component could produce alone.
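The study-group idea can be illustrated with a minimal Python sketch. xAI has not published how Grok 4 Heavy compares its candidate answers, so the majority vote used here is an assumption chosen for simplicity, and the function name `study_group_answer` and the toy agents are hypothetical. The agents also run one after another here rather than simultaneously.

```python
from collections import Counter
from typing import Callable, Sequence

# An "agent" here is any function that maps a question to an answer string.
# This is a stand-in for a model call, not xAI's actual interface.
Agent = Callable[[str], str]


def study_group_answer(question: str, agents: Sequence[Agent]) -> str:
    """Ask every agent independently, then return the most common answer.

    Majority voting is an assumed comparison rule, used only for illustration.
    """
    if not agents:
        raise ValueError("at least one agent is required")
    candidates = [agent(question) for agent in agents]
    # most_common(1) returns the answer with the highest vote count;
    # ties go to the answer that appeared first.
    winner, _votes = Counter(candidates).most_common(1)[0]
    return winner


# Hypothetical toy agents: two agree, one makes a mistake.
def careful_agent(question: str) -> str:
    return "4"


def hasty_agent(question: str) -> str:
    return "5"


def second_careful_agent(question: str) -> str:
    return "4"


answer = study_group_answer(
    "What is 2 + 2?",
    [careful_agent, hasty_agent, second_careful_agent],
)
print(answer)  # prints "4"
```

The single mistaken agent is outvoted, which is the intuition behind the claim that a group can beat any one member. A real system would likely need a more careful comparison step than counting identical strings.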

This tiered approach also creates a clear value proposition for the premium subscription, with the Heavy variant offering enhanced capabilities that justify the higher price point. The differentiation between models allows xAI to serve different market segments with varying needs and budgets.

3. Impressive benchmark performance results

xAI shared performance data across several academic benchmarks, with results that appear to outperform competing AI models in key areas. On Humanity’s Last Exam, Grok 4 scored 25.4% without tools, surpassing both Gemini 2.5 Pro at 21.6% and OpenAI’s o3 model at 21%.

The gap widens when tools are available: Grok 4 Heavy scored 44.4%, compared with 26.9% for Gemini 2.5 Pro. A difference of that size suggests that the multi-agent approach and tool integration help with complex problems.
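To put the Humanity’s Last Exam numbers side by side, the short Python sketch below computes each lead in percentage points and as a relative improvement. The scores are the ones quoted above; the helper names `lead_in_points` and `relative_lead_percent` are made up for this illustration.

```python
def lead_in_points(score: float, rival: float) -> float:
    """Absolute gap between two benchmark scores, in percentage points."""
    return round(score - rival, 1)


def relative_lead_percent(score: float, rival: float) -> float:
    """How much higher `score` is than `rival`, as a percentage of `rival`."""
    return round((score / rival - 1) * 100, 1)


# Humanity's Last Exam scores as reported by xAI (percent correct).
no_tools = {"Grok 4": 25.4, "Gemini 2.5 Pro": 21.6, "OpenAI o3": 21.0}
with_tools = {"Grok 4 Heavy": 44.4, "Gemini 2.5 Pro": 26.9}

gap_no_tools = lead_in_points(no_tools["Grok 4"], no_tools["Gemini 2.5 Pro"])
gap_with_tools = lead_in_points(
    with_tools["Grok 4 Heavy"], with_tools["Gemini 2.5 Pro"]
)

print(f"Lead without tools: {gap_no_tools} points")
print(f"Lead with tools: {gap_with_tools} points")
print(
    "Relative lead with tools: "
    f"{relative_lead_percent(with_tools['Grok 4 Heavy'], with_tools['Gemini 2.5 Pro'])}%"
)
```

The lead grows from 3.8 percentage points without tools to 17.5 with tools, though the with-tools figure pairs the Heavy variant against a single-model competitor.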

On the ARC-AGI-2 benchmark, which tests visual reasoning abilities, Grok 4 scored 16.2%, nearly doubling Claude Opus 4’s performance. This strong showing in visual reasoning suggests that Grok 4 has made significant advances in multimodal AI capabilities.

These benchmark results, which xAI reported itself, offer measurable evidence of Grok 4’s capabilities, though real-world performance may differ from standardized test results. The diversity of benchmarks tested suggests that xAI has focused on creating a well-rounded AI system rather than optimizing for specific tasks.

4. Premium pricing strategy targets enterprise users

The $300 monthly subscription for SuperGrok Heavy represents xAI’s most expensive offering and positions it above competing premium plans from other AI companies. This aggressive pricing strategy suggests confidence in the product’s value proposition and targets serious professional users.

Subscribers to the premium plan receive access to Grok 4 Heavy, early entry to new tools and features, and priority access to future models. This tiered benefit structure creates incentives for users to invest in the highest-tier subscription while providing ongoing value through exclusive access.

The company outlined an ambitious release roadmap with new features rolling out monthly. August will bring an AI coding assistant, September will introduce a multimodal AI agent, and October will feature a video generation tool. This rapid development schedule demonstrates xAI’s commitment to continuous innovation.

The premium pricing strategy also reflects the significant computational costs associated with running advanced AI models. By charging premium prices, xAI can potentially offer better service quality and more sophisticated capabilities than competitors operating on thinner margins.

5. Controversy overshadows technical achievements

The launch occurred shortly after a significant controversy involving Grok’s official X account, which posted antisemitic comments including praise for Adolf Hitler and criticism of Jewish Hollywood executives. This incident highlighted the ongoing challenges of AI content moderation and bias prevention.

In response to the controversy, xAI temporarily disabled Grok’s X account and deleted the offensive posts. The company also removed a section in Grok’s system prompt that had encouraged politically incorrect responses, suggesting changes to the AI’s underlying programming.

Despite the severity of the incident, Musk and his team did not address the controversy during the launch event, instead focusing exclusively on Grok’s technical capabilities and achievements. This approach may have missed an opportunity to demonstrate accountability and outline steps to prevent similar incidents.

The controversy raises broader questions about AI safety and content moderation, particularly for systems integrated into social media platforms. As AI becomes more prevalent in online spaces, ensuring responsible behavior becomes increasingly critical for maintaining public trust.

Integration with X platform creates opportunities and risks

The deep integration between Grok 4 and Musk’s X platform provides immediate access to a massive user base while creating unique opportunities for AI-powered social media features. This integration allows for real-time information processing and user interaction at unprecedented scale.

However, the integration also exposes Grok to the unpredictable nature of social media interactions and the potential for misuse. The recent controversy demonstrates how AI systems can be exploited or manipulated when operating in open social media environments.

The combination of advanced AI capabilities with social media reach creates both tremendous potential and significant risks. Success will depend on xAI’s ability to harness the benefits of integration while effectively managing the associated challenges and controversies.

As Grok 4 continues to evolve and expand its capabilities, the relationship between AI advancement and social responsibility will likely remain a central challenge for xAI and the broader AI industry.

Tega Egwabor
Tega Egwabor brings years of storytelling expertise as a health writer. With a philosophy degree and experience as a reporter and community dialogue facilitator, she transforms complex medical concepts into accessible guidance. Her approach empowers diverse audiences through authentic, research-driven narratives.