The Current State of LLMs
Why this understanding matters, and what's next

I am a software professional specialising in web development and generative AI. I enjoy sharing knowledge and learning from the community. I offer consulting services and speak at events, seminars, and workshops.

For the past few years, the story of artificial intelligence has been dominated by Large Language Models (LLMs). The leaps from GPT-3 to GPT-4, and later to GPT-5, were astonishing. Suddenly, machines could generate human-like text, write code, and even pass standardised exams. It felt like we were on an unstoppable growth curve.
However, the pace of breakthroughs has recently slowed. Many researchers and developers are starting to ask: Have LLMs reached a plateau? At the same time, a new trend is emerging: the rise of Small Language Models (SLMs). Far from a coincidence, these two developments are deeply connected.
Why are LLMs plateauing?
Diminishing Returns on Scale
Early on, increasing the size of a model (more parameters, more training data) led to dramatic improvements. Today, however, those gains are tapering off. Scaling from billions to trillions of parameters produces only marginal accuracy gains, often not worth the billions of dollars in compute required.
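To see why returns diminish, here is a toy sketch that assumes model loss follows a simple power law in parameter count. The exponent `ALPHA` and the reference size `N_REF` are made-up illustrative values, not measurements from any real model; the point is only the shape of the curve.

```python
# Toy model only: assume loss(N) = (N_REF / N) ** ALPHA.
# ALPHA and N_REF are hypothetical values chosen for illustration.
ALPHA = 0.08   # assumed scaling exponent
N_REF = 1e9    # assumed reference model size (1B parameters)


def relative_loss(n_params: float) -> float:
    """Loss relative to the reference model under the assumed power law."""
    return (N_REF / n_params) ** ALPHA


def improvement_per_10x(n_params: float) -> float:
    """Absolute loss reduction gained by making the model ten times larger."""
    return relative_loss(n_params) - relative_loss(n_params * 10)


if __name__ == "__main__":
    for n in (1e9, 1e10, 1e11, 1e12):
        gain = improvement_per_10x(n)
        print(f"{n:.0e} -> {n * 10:.0e} parameters: loss drops by {gain:.4f}")
```

Under this assumption, every tenfold increase in size costs roughly ten times more to run yet buys a smaller absolute improvement than the step before it.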
Inverse-Scaling Phenomenon
On certain nuanced or specialised tasks, larger models sometimes perform worse than smaller ones. This challenges the assumption that “bigger is always better.”
Sustainability Challenges
Training massive LLMs requires enormous energy and infrastructure. As the models grow, so does the financial and environmental cost, making it harder to justify the pursuit of scale for limited benefits.
Practicality Over Hype
Businesses are realising that a trillion-parameter LLM is often overkill for day-to-day tasks like customer support, document analysis, or code assistance. The mismatch between model power and business needs is becoming clearer.
Why do end-users matter?

It’s easy to get lost in benchmark scores, parameter counts, or the latest model release. But the question that truly matters is: Does this solve a real user problem?
A customer service rep doesn’t care if a model has 175B or 1T parameters; they care if it reduces their workload and increases customer satisfaction.
A hospital doesn’t want the world’s largest LLM; it wants a model that accurately processes patient records while ensuring privacy.
An SME doesn’t want to pay for GPUs in the cloud forever; it wants cost-effective automation that actually grows revenue or reduces risk.
When the AI industry focuses too much on technology-first narratives, it risks forgetting the human-first outcomes that drive adoption.
Overhyping AI to sell LLMs

Let’s be blunt: a lot of the “AI will take your job” rhetoric isn’t just caution; it’s a marketing strategy. By amplifying fear and hype, vendors create urgency around adopting their products.
“Adopt now, or be disrupted.”
“If you don’t use our AI, you’ll be left behind.”
This narrative serves LLM sellers, not necessarily businesses or workers. It overshadows the reality that many problems don’t need a massive general-purpose LLM. Sometimes a simple rules engine, an SLM, or a process redesign is enough.
The overhype risks creating AI fatigue, where organisations chase shiny demos rather than solutions that add measurable value.
The GPU monopoly problem

Behind the AI boom lies another story: hardware economics. NVIDIA’s GPUs are the beating heart of today’s AI models. The company holds a quasi-monopoly, with hyperscalers like AWS, Azure, and Google Cloud locked into the GPU supply chain.
GPU shortages inflate prices, creating barriers for startups and smaller players.
Hyperscalers resell GPU capacity at high margins, bundling it with their own AI services.
With one company dominating the AI hardware market, the pace of democratisation slows.
This dynamic isn’t sustainable. Just as the cloud once shifted from proprietary stacks to multi-cloud and open standards, AI will need diverse hardware ecosystems (custom ASICs such as TPUs, and CPUs running SLMs at the edge) to truly scale.
Why are SLMs gaining traction?

As LLM growth slows, SLMs are stepping into the spotlight, not as a replacement, but as a strategic alternative.
1. Cost-Effective
SLMs are much cheaper to train, deploy, and run. This makes them accessible to startups, mid-sized businesses, and edge applications, not just tech giants with deep pockets.
2. Faster Training & Fine-Tuning
Smaller models can be fine-tuned on domain-specific datasets in days or even hours, enabling rapid iteration and deployment.
3. Higher Accuracy in Specialised Tasks
By focusing on a narrower domain (e.g., healthcare documents, legal contracts, or financial forecasting), SLMs can outperform general-purpose LLMs in accuracy and reliability.
4. Deploy Anywhere
SLMs can run on-premises or even on edge devices like smartphones, IoT devices, or local servers. This reduces latency, enhances data privacy, and unlocks real-time use cases.
What does this shift mean?
The rise of SLMs doesn’t mean LLMs are obsolete. Instead, it signals the maturation of the AI ecosystem.
LLMs will remain the general-purpose “brains” for complex, multi-faceted reasoning and multimodality.
SLMs will become the go-to tools for efficient, specialised, and cost-sensitive applications.
In other words, the future of AI isn’t about one model to rule them all; it’s about choosing the right tool for the job. Just as software diversified into operating systems, cloud services, and microservices, AI is diversifying into large, small, and hybrid models.
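One way to picture "the right tool for the job" is a small routing function. This is a minimal sketch, not a production design: the `Task` fields and the `choose_tool` rules are hypothetical and only encode the rough split described above (rules where rules suffice, SLMs for narrow domains, LLMs for broad reasoning).

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical description of a piece of work to automate."""
    solvable_with_rules: bool    # fixed rules or existing automation are enough
    narrow_domain: bool          # one well-defined domain, e.g. legal contracts
    needs_broad_reasoning: bool  # multi-step, cross-domain, or multimodal work


def choose_tool(task: Task) -> str:
    """Pick the simplest tool that plausibly fits the task."""
    if task.solvable_with_rules:
        return "rules engine"
    if task.narrow_domain and not task.needs_broad_reasoning:
        return "SLM"
    return "LLM"


if __name__ == "__main__":
    contract_review = Task(solvable_with_rules=False,
                           narrow_domain=True,
                           needs_broad_reasoning=False)
    print(choose_tool(contract_review))
```

The ordering is deliberate: the cheapest option is checked first, and the general-purpose LLM is the fallback rather than the default.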
How to analyse whether your business really needs AI
Amid the hype, every business leader should pause and ask: Do we really need AI here, or is there a simpler, more cost-effective solution?
Here’s a simple framework:
Define the Problem Clearly
Is it efficiency (reducing cost/time), accuracy (fewer errors), or growth (new revenue)?
Many “AI use cases” collapse when the problem is vague.
Check if Existing Tools Work
Could automation, analytics, or rule-based software solve 80% of the problem already?
If yes, AI may not be necessary, or should only be layered on top.
Assess Data Readiness
AI thrives on quality data. If your data is fragmented, unclean, or scarce, investing in AI prematurely may fail.
Evaluate ROI and Sustainability
Can the AI solution pay for itself in 12–24 months?
Does it reduce reliance on hyperscaler GPUs or expensive models?
Start Small
Pilot with Small Language Models (SLMs) or domain-specific tools before committing to expensive LLM contracts.
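The ROI question in this framework comes down to simple payback arithmetic. The sketch below assumes you can estimate three numbers: a one-off upfront cost, a monthly running cost, and a monthly benefit. The function names and the example figures are hypothetical, and the 24-month default just reflects the upper end of the 12–24 month window mentioned above.

```python
import math


def payback_months(upfront_cost: float,
                   monthly_running_cost: float,
                   monthly_benefit: float) -> float:
    """Months until cumulative net benefit covers the upfront cost.

    Returns math.inf when running costs eat the whole benefit,
    i.e. the solution never pays for itself.
    """
    net_monthly = monthly_benefit - monthly_running_cost
    if net_monthly <= 0:
        return math.inf
    return upfront_cost / net_monthly


def passes_roi_check(months: float, limit_months: float = 24) -> bool:
    """True if the payback period falls within the chosen limit."""
    return months <= limit_months


if __name__ == "__main__":
    # Made-up figures for a pilot project.
    months = payback_months(upfront_cost=120_000,
                            monthly_running_cost=5_000,
                            monthly_benefit=15_000)
    print(f"Payback in {months:.1f} months, passes: {passes_roi_check(months)}")
```

Running the same calculation for an SLM pilot and for an LLM contract, each with its own cost estimates, gives a like-for-like comparison before any commitment is made.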
The way forward

The conversation about AI cannot be monopolised by GPU vendors, hyperscalers, or model makers alone. It must be anchored in end-user value, sustainability, and strategic adoption.
LLMs may be plateauing in raw gains, but they remain valuable for complex reasoning and orchestration.
SLMs are rising as the practical choice for businesses that need targeted, affordable, and deployable intelligence.
End-users must drive the narrative: asking not “What can AI do?” but “What problem are we solving, and is AI the best way?”
The companies that thrive in the next wave of AI won’t be those that bought into the hype, but those that balanced ambition with pragmatism, choosing the right tools for the right problems at the right time.





