Browser-Native AI: Key Findings
AI has to do more than just get bigger, and data shows this is increasingly true.
78% of organizations now use AI in at least one part of their business, according to McKinsey’s 2025 State of AI survey. It’s a sign that adoption is accelerating across industries.
But as AI moves from pilots to daily workflows, the challenge now is building systems that enterprises can trust.
Yet many teams still focus on scaling models, while missing the messy, real-world layer where work actually happens: the browser.
They overlook that agents must learn from experience and deliver consistent results, without needing retraining.
For today’s enterprise leaders, that means:
- Prioritize reliability over size to ensure consistent, trustworthy performance.
- Design agents to learn from experience, not just predict outcomes.
- Automate inside the browser where most real work still happens.
- Build in memory to drive long-term trust and smarter decisions.
The future of workplace automation belongs to tools that understand how people actually work.
Tessa is one of them. Built by General Agency AI, a browser-based automation startup, it learns by doing and takes repetitive tasks off your plate.
In an exclusive interview with DesignRush, the startup's founder, Mohammed Nasir, shares:
- GPT-5’s influence on how they diagnose and improve agent performance
- Why the browser is the most overlooked space for AI automation
- What it takes to build AI teammates people actually trust
Here’s what he revealed, and why it matters for every business betting on AI.
Who is Mohammed Nasir?
Mohammed Nasir is the co-founder and CEO of General Agency AI, the team behind Tessa AI — a browser-based coworker that actually learns from experience. Backed by Y Combinator, the startup is focused on taking repetitive, boring tasks off people’s plates so they can spend time on more meaningful work.
Before starting General Agency AI, Mo worked on self-optimizing motion control systems at Nvidia’s automotive division, where he filed four patents for his work.
Use GPT-5 Where Diagnosis Matters Most
When Nasir’s team put GPT-5 through its paces, one capability stood out: spotting bugs in code.
While it isn’t the best at writing new code compared to Anthropic’s models, it’s highly effective at surfacing issues that engineers might miss.
This makes it especially useful in agent workflows where code is generated on the fly and needs debugging in real time.
“One thing that we found that GPT-5 really excels at is diagnosing issues in code,” says Nasir.
“It’s not as good as Anthropic’s models at actually writing code, but identifying bugs that are very difficult to spot and could be happening for reasons that are unknown to the engineer, GPT-5 is really good at finding.”
But strength comes with trade-offs.
GPT-5’s more rigid personality can make it feel less natural in enterprise use. As Nasir put it:
“Some call it somewhat robotic compared to some of Anthropic's or even the Gemini models.”
Build Agents With 4 Core Capabilities
Tessa AI is built around four essentials that Nasir considers non-negotiable:
- Understanding what the user wants
- Observing the environment
- Taking action
- Remembering past experiences
Without these, an agent can’t handle the messy reality of day-to-day work.
People don’t phrase requests like code, so Tessa looks at both current and past instructions to respond in a way that feels natural.
And because multimodal inputs only give single snapshots, the system needs scaffolding to hold context over time.
Then there’s the browser challenge. Unlike clean API integrations, every website has its own quirks and hidden states, which makes reliable automation much harder.
“Websites each function differently and there are often hidden states that we have to keep track of," Nasir explains.
"For example, in Amazon, if you add something to your cart, unless you look at the tiny cart icon in the corner, you wouldn't know how many items or even what items are in your cart."
Finally comes memory.
Tessa learns from single experiences and recalls them at runtime, a system built to mimic how humans get better with practice.
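To make the four capabilities concrete, they can be pictured as a simple loop. The sketch below is purely illustrative and is not Tessa's implementation: the `FakeShop` page, the `Agent` class, and every method name are hypothetical, and intent parsing is reduced to keyword matching where a real agent would call a language model. The cart plays the role of the hidden state Nasir describes, which the agent reads explicitly before and after acting.

```python
from dataclasses import dataclass, field


@dataclass
class FakeShop:
    """Hypothetical stand-in for a web page with hidden state (the cart)."""
    cart: list = field(default_factory=list)

    def add_to_cart(self, item: str) -> None:
        self.cart.append(item)


@dataclass
class Agent:
    page: FakeShop
    memory: list = field(default_factory=list)  # one record per completed request

    def understand(self, request: str) -> tuple:
        # Assumption: keyword matching stands in for a language-model call.
        words = request.lower().split()
        if "add" in words:
            return ("add", words[-1])
        return ("unknown", "")

    def observe(self) -> dict:
        # Read the page state explicitly instead of assuming the last action worked.
        return {"cart": list(self.page.cart)}

    def act(self, intent: tuple) -> None:
        action, item = intent
        if action == "add":
            self.page.add_to_cart(item)

    def remember(self, request: str, intent: tuple, result: dict) -> None:
        self.memory.append({"request": request, "intent": intent, "result": result})

    def run(self, request: str) -> dict:
        intent = self.understand(request)
        before = self.observe()
        self.act(intent)
        after = self.observe()
        self.remember(request, intent, after)
        return {"before": before, "after": after}


agent = Agent(page=FakeShop())
outcome = agent.run("Please add batteries")
print(outcome["after"]["cart"])  # ['batteries']
```

Comparing the `before` and `after` observations is what lets the agent confirm that an action actually changed the page, rather than trusting that a click succeeded.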
Prioritize Reliability Above All Else
Unlike traditional software, large language models are stochastic, meaning the same prompt can lead to slightly different answers each time.
“Reliability, reliability, reliability. We are no longer dealing with ordinary software where you build it once and it works once… LLMs and AI in general today is stochastic,” Nasir says.
That challenge only grows with GPT-5’s model routing system, which boosts reasoning by switching between specialized models but also introduces more unpredictability.
The bottom line: unless systems are built to manage this variability, enterprises will remain cautious about rolling out agents at scale.
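One common way to manage this kind of variability is to ask the same question several times and accept an answer only when enough runs agree. The interview does not say whether Tessa works this way, so treat the following as a minimal sketch of the general idea: `flaky_model` is a hypothetical simulation of a stochastic model, and `agreed_answer`, `samples`, and `min_agreement` are illustrative names and defaults.

```python
import random
from collections import Counter
from typing import Callable, Optional


def flaky_model(prompt: str, rng: random.Random) -> str:
    """Hypothetical stand-in for an LLM call: right about 80% of the time."""
    return "42" if rng.random() < 0.8 else rng.choice(["41", "43"])


def agreed_answer(
    ask: Callable[[], str], samples: int = 5, min_agreement: float = 0.6
) -> Optional[str]:
    """Call the model several times; return the top answer only if enough runs agree."""
    answers = [ask() for _ in range(samples)]
    top, count = Counter(answers).most_common(1)[0]
    return top if count / samples >= min_agreement else None


rng = random.Random(0)  # seeded so the simulation is repeatable
result = agreed_answer(lambda: flaky_model("What is 6 * 7?", rng), samples=9)
print(result)  # the agreed answer, or None when the runs disagree too much
```

Returning `None` instead of a low-confidence answer gives the surrounding system a clear signal to retry or escalate to a person, at the cost of extra model calls.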
Automate Browser-Based Busy Work
Nasir’s career at Nvidia, TikTok, and YouTube gave him a front-row seat to how large companies actually work.
Much of it happens inside browsers, in tools that don’t offer APIs.
Employees end up spending countless hours updating, moving, and organizing data across these systems.
That experience shaped Tessa’s focus on browser-native automation. For enterprises, the payoff is clear: freeing up people’s time so they can focus on growth, not repetitive processes.
“There is a huge opportunity for these browser-based agents to automate away busy work so that the employees can focus on actually building and growing the company,” Nasir says.
Rethink Learning Beyond Fine-Tuning
One of the biggest surprises for Nasir’s team was discovering that better performance didn’t come from fine-tuning.
Instead, they found that adding memory systems around foundation models delivered many of the same benefits.
“The fact that we don’t need to fine-tune models to achieve improvements in performance because we can just build systems around the foundation models,” Nasir says.
“Our key innovation is that our agent can recall information about past experiences and iteratively figure out the best course of action.”
The takeaway? You don’t need to retrain models to improve results.
If your AI system can remember what worked, and what didn’t, it gets smarter over time, just like people do.
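As a rough illustration of improving without retraining, the sketch below keeps a small record of which approach worked for which task and consults it before the next attempt. It is an assumption-laden toy, not General Agency AI's system: `ExperienceMemory`, its methods, and the task and strategy names are all hypothetical, and the underlying model is never modified.

```python
from collections import defaultdict


class ExperienceMemory:
    """Hypothetical memory layer: records what worked per task, with no model retraining."""

    def __init__(self):
        # task -> strategy -> [successes, attempts]
        self.stats = defaultdict(lambda: defaultdict(lambda: [0, 0]))

    def record(self, task: str, strategy: str, success: bool) -> None:
        entry = self.stats[task][strategy]
        entry[0] += int(success)
        entry[1] += 1

    def best_strategy(self, task: str, options: list) -> str:
        tried = self.stats.get(task, {})
        # Try anything untested first, then prefer the highest success rate.
        for option in options:
            if option not in tried:
                return option
        return max(options, key=lambda o: tried[o][0] / tried[o][1])


memory = ExperienceMemory()
options = ["click_export_button", "use_keyboard_shortcut"]
memory.record("export report", "click_export_button", success=False)
memory.record("export report", "use_keyboard_shortcut", success=True)
memory.record("export report", "click_export_button", success=False)

print(memory.best_strategy("export report", options))  # use_keyboard_shortcut
print(memory.best_strategy("new task", options))       # click_export_button (untested)
```

Because the lookup happens at runtime, a single recorded experience can change the next decision immediately, which is the property the fine-tuning comparison is getting at.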
Opportunity for Enterprise AI
Enterprise AI isn’t just about bigger models anymore.
The real opportunity is in building systems people can actually trust: tools that are reliable, adaptable, and useful in the day-to-day chaos of work.
Nasir and his team at General Agency AI are betting on browser-native agents with memory, designed to take repetitive tasks off employees’ plates so they can focus on what really matters.
The companies that get this right will be the ones that see the biggest return from AI.
Enterprise AI FAQs
Why does reliability matter so much for enterprise AI?
Because AI doesn’t always give the same answer twice.
If businesses can’t count on consistent results, they won’t feel comfortable rolling it out across teams.
What’s the benefit of browser-native automation?
A lot of work still happens inside web apps that don’t have APIs. Browser-native agents can jump in where APIs don’t exist, handling the copy-paste, data entry, and other repetitive tasks that eat up hours.
How does adding memory make AI better?
With memory, an agent doesn’t have to start from scratch every time. It can recall what worked before, learn from mistakes, and get smarter over time, much like people do.
What’s special about GPT-5 in this context?
It’s particularly good at spotting bugs in code. That makes it valuable in workflows where AI is generating code on the fly and needs to catch problems quickly.