Skip to main content
Refine/Blog
All posts
13 min read

AI Agents: The Future of Enterprise Applications

Share on:
AI Agents: The Future of Enterprise Applications

Your customer support team is drowning. Tickets pile up overnight. Simple requests that should take minutes stretch into hours because someone has to log into three different systems, copy data between screens, and update a spreadsheet before closing the loop. Meanwhile, your competitors are answering inquiries in seconds and resolving issues end-to-end without human intervention.

This gap between manual operations and autonomous execution is where AI agents enter the picture. Not as another chatbot that forwards requests to humans, but as software that can reason through problems, access multiple systems, and complete tasks on your behalf.


Refine

What Are AI Agents and Why Do They Matter Now?

AI agents are software programs that can understand goals, plan actions, execute tasks, and adapt based on outcomes. They differ from traditional automation (which follows fixed rules) and chatbots (which respond to prompts but rely on humans for action).

Think of the difference this way: a chatbot can tell you your order status. An agent can check your order, notice it's delayed, contact the shipping provider, issue a partial refund, update your account, and email you with next steps, all without waiting for a human to approve each action.

This capability has been technically possible for some time, but three recent developments pushed agents from experimental to practical:

FactorWhat Changed
Reasoning modelsLLMs can now break complex problems into steps, evaluate options, and adjust plans when things go wrong
Tool use and APIsModels can call external systems, databases, and services as part of their workflow
Cost reductionRunning inference became cheap enough to let agents think through multi-step tasks economically

AWS describes four levels of AI agency, from basic rule-based automation up to fully autonomous systems that set their own goals and select their own tools. Most enterprise deployments in 2025-2026 sit at Level 2 or 3: agents that can plan and execute within defined domains, with some human oversight.


The Numbers Behind the Shift

The market data tells a clear story. According to Gartner, 15% of day-to-day work decisions will be made autonomously by AI agents by 2028, compared to essentially zero in 2024. The AI agents market itself is projected to grow from $13.2 billion in 2024 to $58.7 billion by 2028, a compound annual growth rate of 28.5%.

Google's 2025 ROI of AI Report found that 74% of executives achieved return on investment within the first year of deploying AI agents. Among companies reporting productivity gains, 39% saw productivity at least double. Perhaps most telling: 52% of executives now have AI agents running in production, not pilots, actual production systems.

But there's a trust gap. A Harvard Business Review survey of 603 business and technology leaders found that only 6% fully trust AI agents to autonomously run core business processes. Most organizations limit agents to routine tasks or supervised use cases. The enthusiasm is there; the confidence isn't, at least not yet.

The bottleneck isn't the technology. Only 20% of organizations say their infrastructure is ready to support agentic AI for core processes. Just 12% feel their governance controls are adequate. The challenge has shifted from "can agents do this?" to "are we ready to let them?"


How Real Companies Are Using AI Agents

The gap between pilot projects and production deployments is closing. Here's how several enterprises are putting agents to work.

Capital One: Turning Leads Into Buyers

Capital One deployed Chat Concierge, an agentic AI tool for auto dealerships. The agent handles customer questions about vehicles, schedules appointments, and sets up test drives. Rather than just answering FAQs, it actively works to convert interested shoppers into scheduled appointments.

The results: 55% higher conversion rate from leads to buyers. The company specifically chose this use case because it sits at the "low end of the risk spectrum" while still having enough complexity to generate real learning. After launch, Capital One reduced the agent's response latency fivefold by tuning their proprietary multi-agent workflow.

Genentech: Accelerating Drug Discovery

The biotech company built an agentic solution on AWS that automates the manual search process for biomarker validation. Scientists previously spent significant time searching across knowledge bases and internal databases. Now agents break down research tasks into dynamic workflows, adapting their approach based on what they find at each step.

The system accesses multiple knowledge bases using retrieval-augmented generation and interfaces with internal APIs and databases. Genentech expects this to reduce time-to-target identification and accelerate drug discovery across therapeutic areas.

Amazon: Upgrading Java at Scale

Amazon used Q Developer agents to migrate tens of thousands of production applications from Java 8/11 to Java 17. What traditionally required developers to manually update code, fix compatibility issues, and test changes became an agent-driven process.

Developers completed these upgrades in a fraction of the typical time. The project delivered both performance improvements and cost savings across Amazon's internal systems. This use case shows agents handling repetitive-but-complex work that would otherwise consume engineering cycles.

Rocket Mortgage: Personalized Financial Guidance

Rocket Mortgage needed a more personalized approach to home financing. They built an AI-powered support system using Amazon Bedrock Agents that aggregates 10 petabytes of financial data. The platform provides tailored mortgage recommendations and real-time personalized guidance.

Results include faster query resolution, improved personalization accuracy, and better customer experience navigating complex financing decisions. The agent doesn't just answer questions; it pulls relevant account data, analyzes the customer's situation, and provides specific recommendations.

PepsiCo: Testing Software and Serving Customers

PepsiCo deployed agents across three areas: technology infrastructure, customer service, and employee experience. The company measures productivity gains, cost impact, and how both customers and employees interact with these systems.

One notable finding: agents supporting software testing not only sped up validation cycles but also identified technical gaps that human testers missed. As their chief strategy officer noted, "It was better than expected in the pipeline. So that gives us a lot of confidence."

Salesforce: Scaling Agent Deployment

Salesforce launched Agentforce in October 2024 and has since closed 18,000 deals. Clients including Reddit, Pfizer, and OpenTable use the platform to build and launch agents across customer service, marketing, and sales functions.

The company has been iterating on its own internal deployment. After launching an agent for their support website, they discovered response times were too slow and the agent couldn't answer certain customer questions. Salesforce has since made improvements that reduced latency; the agent has now handled more than 2 million customer interactions. As their COO put it: "Once you start to get to scale, you want to monitor the agent's performance, analyze what the agent is doing, and you want to continuously improve the agent."

JLL: Property Management Automation

Commercial real estate firm JLL has 34 agents in discovery and development. One of them, an add-on to their Prism property management platform, handles autonomous tasks like adjusting building temperatures after tenant complaints.

The company's CTO frames her role as ensuring "control checks and balances between humans and agents" and managing agents "so that they don't go rogue." This reflects a broader theme: as agents become more capable, the governance question becomes central.


The Implementation Reality Check

Despite the success stories, adoption isn't straightforward. Deloitte's research shows that while 30% of organizations are exploring agentic AI, only 11% actively use it in production. McKinsey found that 88% of companies use AI in at least one business function, but only 20% work cross-functionally. Most AI tools remain siloed.

The challenges fall into predictable categories:

Security and privacy top the list. Thirty-one percent of organizations cite cybersecurity and privacy concerns as their main barrier. Agents need access to multiple systems and data sources to be useful, but that access creates risk.

Data quality comes next. Twenty-three percent worry about data output quality. As one executive put it: without solid data foundations, organizations risk "garbage in, garbage out" scenarios that erode trust rather than build it.

Process readiness matters more than technology. Twenty-two percent say their business processes aren't ready for agentic AI. Agents can only automate workflows that are well-defined; many enterprise processes are undocumented or inconsistent.

Governance gaps remain. Only 12% of organizations feel their risk and governance controls are adequate for agentic AI. The Harvard Business Review study found 44% are prioritizing training employees in AI oversight, while 39% focus on building responsible AI guardrails.


The Trust Problem

The 6% figure from Harvard Business Review deserves attention. Nearly three-quarters of executives say agentic AI benefits outweigh the risks, yet almost none trust agents with core processes. This isn't irrational; it's appropriate caution while capabilities mature.

IBM's research frames it as an "expectation gap": users demand perfection from technology while accepting imperfection in humans. An agent that makes errors at a far smaller rate than human workers can still erode trust faster than a human making the same mistakes.

The solution isn't perfect agents. It's explainability and traceability. As Maryam Ashoori from IBM put it: "Using an agent today is basically grabbing an LLM and allowing it to take actions on your behalf. What if this action is connecting to a dataset and removing a bunch of sensitive records? The challenge becomes transparency. And traceability of actions for every single thing that the agents do."

Organizations building trust successfully tend to:

  • Start with lower-risk use cases that build confidence
  • Maintain detailed audit logs of agent actions
  • Keep humans in the loop for high-stakes decisions
  • Gradually expand scope as reliability proves out

Multi-Agent Systems and What Comes Next

The next evolution is already emerging: multi-agent orchestration. Instead of single agents handling tasks, teams of specialized agents work together, coordinated by supervisor agents or orchestration platforms.

Several communication protocols are competing for dominance: Google's A2A, Cisco-led AGNTCY, Anthropic's MCP, and others. Deloitte predicts these will begin converging over the next year into two or three leading standards.

The key capabilities these protocols enable: peer-to-peer and hub-and-spoke interactions with shared context and memory, built-in negotiation and conflict resolution between agents, authentication and secure messaging, and audit trails for traceability. Businesses evaluating multi-agent systems should watch which protocols gain traction in their industry, since being locked into a "walled garden" ecosystem could limit future flexibility.

Gartner forecasts that by 2028, 33% of enterprise software applications will include agentic AI, up from less than 1% in 2024. This isn't agents as a separate category; it's agents embedded in the applications businesses already use.

For internal tools specifically, agents are starting to handle data entry, workflow automation, report generation, and CRUD operations. The story is the same everywhere: agents managing complete workflows that previously required human intervention, from reading data to making updates to notifying stakeholders.


Getting Started with AI Agents

If you're evaluating where agents fit in your organization, consider this framework:

QuestionWhat to Look For
Is the process well-defined?Agents need clear inputs, outputs, and success criteria
What's the risk of errors?Start with processes where mistakes are recoverable
Does it require multiple systems?Agents excel at coordinating across tools and databases
Is it repetitive but complex?Simple automation works for simple tasks; agents handle the complex cases
Can you measure success?You need metrics to evaluate whether the agent is actually helping

Common starting points include customer service resolution, internal tool automation, software testing, report generation, and data validation. These offer clear ROI metrics and build organizational confidence before tackling higher-stakes workflows.


Frequently Asked Questions

How are AI agents different from chatbots?

Chatbots respond to prompts and provide information. Agents take actions: they can access systems, make decisions, execute multi-step workflows, and complete tasks end-to-end. A chatbot tells you your order status; an agent resolves your order issue.

What's the typical ROI timeline for AI agents?

According to Google's research, 74% of executives achieved ROI within the first year. However, this varies significantly by use case complexity and organizational readiness. Companies with stronger data infrastructure tend to see faster returns.

Are AI agents ready for production use?

For well-defined use cases with appropriate oversight, yes. Fifty-two percent of executives now run agents in production. But most organizations limit agents to routine tasks rather than core processes. The technology is ready; organizational readiness is often the constraint.

What skills do teams need to work with AI agents?

Organizations are investing in training employees for AI oversight, not just AI building. The emerging need is "agent literacy": understanding how to supervise agents, interpret their actions, and intervene when needed. Some companies designate AI ambassadors in each function to identify use cases and guide adoption.

How do enterprises handle security concerns with agents?

Leading organizations implement detailed audit logs, role-based access controls, and human approval gates for sensitive actions. Agents inherit traditional security controls but their dynamic behavior requires additional safeguards. Context-aware guardrails and runtime monitoring are becoming standard.

What happens when an AI agent makes a mistake?

Accountability remains with humans. As one IBM researcher noted: "Technology doesn't think. It can't be responsible. A human being in that organization is going to be held responsible and accountable for those actions." Organizations need clear escalation paths and rollback mechanisms before deploying agents in production.

Should we build agents or buy platforms?

Both options exist. Platforms like Salesforce's Agentforce (which has closed 18,000 deals since October 2024) let businesses deploy agents without building from scratch. Custom development offers more control but requires more investment. Many organizations start with platforms for initial use cases and build custom solutions for differentiated capabilities.


What This Means for Internal Tools

The scenario from the opening, support teams drowning in manual work, is exactly where agents deliver immediate value. Internal tools that require humans to navigate multiple systems, copy data between screens, and follow multi-step procedures are prime candidates for agentic automation.

For teams building internal applications, this shifts the design question from "what screens do users need?" to "what outcomes should the system produce?" When agents can handle the tedious coordination work, humans focus on decisions that actually require judgment.

If you're building internal tools and want to incorporate AI assistance, you're designing for a future where the interface isn't just screens for humans, it's also structured data and clear workflows that agents can act on. The organizations getting this right today are the ones building tools that work well for both human and AI operators.

This is particularly relevant for teams building dashboards, admin panels, and back-office applications. These tools often involve repetitive workflows across multiple data sources: exactly the kind of work agents handle well. The question isn't whether to add AI capabilities, but how to structure your applications so that agents can participate effectively alongside human users.


This article draws on research from AWS, Google Cloud, McKinsey, Deloitte, Gartner, IBM, Harvard Business Review, and case studies from Capital One, Genentech, Amazon, Rocket Mortgage, PepsiCo, Salesforce, and JLL.