The CIO's Guide to AI Agents: Before You Buy, Ask These 5 Questions

Carles Abarca
Writing about AI, digital transformation, and the forces reshaping technology.

The AI agent market is projected to reach $236 billion by 2034. Every vendor wants your budget. Here’s how to separate hype from value.


Every enterprise software vendor now sells “AI agents.” Salesforce has Agentforce. ServiceNow acquired Moveworks for $2.85 billion. Microsoft promises an “agentic enterprise” through Copilot. Startups like Sierra have hit $100 million ARR in under two years.

The pressure to act is real. 89% of CIOs consider agent-based AI a strategic priority. 51% of large enterprises have already deployed agentic AI. Your board is asking questions. Your competitors are moving.

But here’s what nobody tells you: most AI agent purchases fail to deliver expected value. Not because the technology doesn’t work — but because organizations buy the wrong solution for their specific situation.

I spent two decades leading technology transformation at a major European bank and now drive digital innovation at one of Latin America’s largest universities. Along the way I’ve developed a framework for evaluating AI agents that cuts through vendor hype. Before you sign that contract, ask these five questions.

The AI Agent Landscape in 2026

First, let’s understand what you’re buying. The market has three distinct categories:

1. Embedded Agents from SaaS Vendors

Your existing vendors are adding agents to their platforms:

  • Salesforce Agentforce: $0.10 per action or $125-550/user/month
  • ServiceNow AI Agents: Full orchestration with AI Control Tower
  • Microsoft Copilot Studio: Included with M365, plus add-ons
  • Zendesk AI Agents: $1.50 per autonomous resolution

The pitch: “You already use our platform. Now it’s smarter.”
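When a vendor offers both per-action and per-seat pricing, a quick break-even check shows which model fits your volume. The sketch below uses only the Agentforce list prices quoted above. The function name is illustrative, and it assumes both pricing models cover the same capabilities, which may not hold in a real contract.

```python
def breakeven_actions_per_user(seat_price_cents: int, action_price_cents: int) -> int:
    """Monthly actions per user at which a per-seat plan costs the same as per-action billing."""
    return seat_price_cents // action_price_cents


# Prices quoted in the list above: $0.10 per action, $125 to $550 per user per month.
# Amounts are kept in cents to avoid floating-point rounding.
low = breakeven_actions_per_user(125_00, 10)
high = breakeven_actions_per_user(550_00, 10)
print(low, high)  # 1250 5500
```

Under these assumptions, a user who triggers fewer than about 1,250 actions a month is cheaper on per-action billing, even against the lowest seat price.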

2. Pure-Play Agent Startups

Companies building agents from the ground up:

  • Sierra (Bret Taylor, ex-Salesforce): $10B valuation, focused on customer service
  • Adept: Targeting workflow automation
  • Imbue, Reflection AI: Research-driven approaches

The pitch: “We’re not constrained by legacy architecture.”

3. Foundation Model Providers

The companies building the AI itself:

  • Anthropic: Claude with Computer Use and MCP (Model Context Protocol)
  • OpenAI: GPT-4 with Operator
  • Google: Gemini with Agentspace

The pitch: “Build custom agents on our infrastructure.”

Each category has trade-offs. Your job is to understand which trade-offs matter for your organization.

Question 1: What Problem Are You Actually Solving?

This sounds obvious. It isn’t.

“We need AI agents” is not a problem statement. Neither is “we need to reduce costs” or “we need to be more innovative.”

A proper problem statement looks like this:

  • “Our customer service team handles 50,000 tickets per month. 60% are password resets and order status checks. Average handling time is 8 minutes. We need to reduce that to under 2 minutes for routine inquiries.”
  • “Our compliance team manually reviews 10,000 transactions daily for AML screening. False positive rate is 95%. We need to reduce false positives while maintaining regulatory coverage.”

Notice the difference? Specific process. Measurable baseline. Clear target.
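A problem statement written this way can be turned directly into a number. The sketch below works through the customer service example above. The class and field names are illustrative, and it treats the 2-minute target as exact, so the result is a lower bound on the time saved if you actually get under 2 minutes.

```python
from dataclasses import dataclass


@dataclass
class ProblemStatement:
    """Baseline and target for one process (field names are illustrative)."""
    monthly_volume: int       # items handled per month
    routine_percent: int      # share of items that are routine, 0-100
    baseline_minutes: float   # current handling time per routine item
    target_minutes: float     # desired handling time per routine item

    def routine_items(self) -> int:
        return self.monthly_volume * self.routine_percent // 100

    def hours_saved_per_month(self) -> float:
        saved_minutes = self.routine_items() * (self.baseline_minutes - self.target_minutes)
        return saved_minutes / 60


# Figures from the ticket example: 50,000 tickets, 60% routine, 8 minutes down to 2.
tickets = ProblemStatement(50_000, 60, 8, 2)
print(tickets.routine_items())          # 30000
print(tickets.hours_saved_per_month())  # 3000.0
```

If you cannot fill in all four fields for your own process, that is the same red flag described below.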

The trap: Vendors will happily sell you a “general-purpose agent platform” that can theoretically do anything. In practice, a platform deployed without a defined use case rarely does any single thing well. Start with one high-value, well-defined use case. Prove it works. Then expand.

Red flag: If you can’t articulate the specific process, current metrics, and target improvement, you’re not ready to buy.

Question 2: Build, Buy, or Extend?

You have three paths. Let me illustrate each with real scenarios.

EXTEND: Add Agent Capabilities to Your Existing Platforms

What it means: You already use Salesforce, ServiceNow, or Microsoft 365. You activate their built-in agent features.

Real example — Global Retailer with Salesforce: A retail company with 200 customer service reps was already running Service Cloud. They activated Agentforce Service Agent in three weeks. Configuration, not development. The agent now handles 40% of incoming inquiries (order status, return policies, store hours) without human intervention. Cost: $0.10 per agent action. No new vendor relationship. No integration project.

Real example — Insurance Company with ServiceNow: An insurer using ServiceNow ITSM enabled AI Agents for incident categorization and routing. The agent reads incoming tickets, identifies the affected system, assigns priority, and routes to the right team. Implementation: 6 weeks. Result: 60% reduction in misrouted tickets, 25% faster resolution times.

When to Extend:

  • You’re already paying for the platform
  • Your use case is common (customer service, IT helpdesk, HR inquiries)
  • You need results in weeks, not months
  • You don’t have AI engineering talent

The limitation: You’re constrained by what your vendor offers. If Salesforce Agentforce doesn’t support your specific workflow, you’re stuck waiting for their roadmap.


BUY: Purchase a Specialized Agent Solution

What it means: You bring in a startup or specialized vendor that does one thing exceptionally well.

Real example — E-commerce Brand with Sierra: A direct-to-consumer brand with $500M revenue wasn’t satisfied with their existing chatbot. They deployed Sierra for customer service. Sierra’s agents handle complex conversations: processing returns while suggesting alternatives, managing subscription modifications, resolving billing disputes. The agents access their Shopify backend, payment processor, and shipping systems. Implementation: 4 months. Result: 70% of conversations resolved without human escalation. Customer satisfaction increased 15 points.

Real example — Enterprise IT with Moveworks (now ServiceNow): Before the acquisition, companies bought Moveworks specifically for IT helpdesk automation. The agent could reset passwords, provision software, troubleshoot VPN issues, and answer policy questions — all through natural conversation in Slack or Teams. It understood context: “I can’t access the server” triggered different workflows than “I need Photoshop installed.”

When to Buy:

  • Your existing vendor’s agents aren’t good enough for a strategic use case
  • A startup has proven traction in your specific domain
  • You can afford 3-6 months implementation
  • The use case is important enough to justify a new vendor relationship

The risk: Startup viability. What happens if Sierra gets acquired? If the startup pivots? You’re dependent on a company that may not exist in five years. Mitigate this by ensuring data portability and avoiding deep customizations that lock you in.


BUILD: Create Custom Agents Using Foundation Models

What it means: You use Claude, GPT-4, or Gemini APIs to build agents tailored to your unique processes.

Real example — Investment Bank’s Deal Analysis Agent: A bulge-bracket bank built a custom agent for M&A analysts. The agent ingests SEC filings, earnings transcripts, news articles, and internal research. Analysts ask natural language questions: “What are the key risks in Acme Corp’s debt structure?” or “Compare the margin profile of these three acquisition targets.” The agent synthesizes information that would take an analyst hours to compile manually. Built on Claude with custom RAG (retrieval-augmented generation) over proprietary databases. Development: 8 months. Team: 6 engineers, 2 ML specialists. The output is proprietary competitive advantage — no vendor offers this.

Real example — Pharmaceutical Company’s Clinical Trial Agent: A pharma company built an agent to monitor clinical trial data in real-time. The agent identifies adverse event patterns, flags protocol deviations, and generates regulatory-ready reports. This isn’t a use case any vendor serves — the domain knowledge is too specialized, the regulatory requirements too specific. Built on GPT-4 with extensive fine-tuning and custom safety guardrails. Development: 12 months.

Real example: University’s Custom Academic Ecosystem (TecGPT). At Tecnológico de Monterrey, we built TecGPT, our academic AI ecosystem. It is integrated with our LMS, assists both professors and students, and is aligned with our academic regulations. No vendor could offer what we needed out-of-the-box.

When to Build:

  • Your process is genuinely unique and defines competitive advantage
  • No vendor solution exists for your domain
  • You have AI/ML engineering talent (or can acquire it)
  • You can invest 6-12 months before seeing production value
  • The long-term value justifies the ongoing maintenance cost

The reality check: Building is expensive. A custom agent isn’t a one-time project — it’s an ongoing product. You need engineers to maintain it, improve it, and adapt it as foundation models evolve. Budget for 2-3 FTEs indefinitely, not just the initial build.


The Decision Matrix

| Factor | Extend | Buy | Build |
|---|---|---|---|
| Time to value | Weeks | 3-6 months | 6-12 months |
| Customization | Low | Medium | Unlimited |
| Vendor risk | Low | High | Medium |
| Talent required | Admins | Integrators | Engineers + ML |
| Total cost (3 years) | $ | $$ | $$$-$$$$ |
| Competitive advantage | None | Low | High |

My Recommendation

Start with Extend for your first agent deployment. Even if it’s not perfect, you’ll learn what works, what users actually need, and where the gaps are. That learning is invaluable before you commit to Buy or Build.

Move to Buy when you’ve proven agent value and need capabilities your platform vendor doesn’t offer. Choose startups with strong traction, clear use case focus, and enterprise customers who can serve as references.

Build only when you’ve exhausted Extend and Buy options, or when the process is so unique that it genuinely differentiates your business. If you’re building agents for commodity processes (customer service, IT helpdesk), you’re wasting engineering talent that could create actual competitive advantage.

Question 3: How Will You Measure Success?

Before deployment, define:

Efficiency metrics:

  • Time saved per task
  • Tasks handled without human intervention
  • Error rate reduction

Quality metrics:

  • Customer satisfaction (for customer-facing agents)
  • Compliance accuracy (for regulatory processes)
  • Decision quality (measured against human expert baseline)

Business metrics:

  • Cost per transaction
  • Revenue impact (if applicable)
  • Employee satisfaction (agents should help workers, not threaten them)

The 90-day rule: If you can’t demonstrate measurable improvement within 90 days, something is wrong. Either the use case was poorly chosen, the implementation was flawed, or the vendor oversold capabilities. Don’t extend pilots indefinitely hoping for results.

Hidden metric: Adoption. The most sophisticated agent is worthless if your team doesn’t use it. Track actual usage, not just availability.
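Both the 90-day check and the adoption metric are simple arithmetic, as long as you captured a baseline before deployment. The sketch below is one way to compute them. All names and numbers are hypothetical placeholders, not results from any real deployment.

```python
from dataclasses import dataclass


@dataclass
class Metric:
    name: str
    baseline: float            # value measured before deployment
    day_90: float              # value measured 90 days after deployment
    lower_is_better: bool = True

    def improvement_pct(self) -> float:
        """Positive means the metric moved in the desired direction."""
        if self.lower_is_better:
            change = self.baseline - self.day_90
        else:
            change = self.day_90 - self.baseline
        return 100 * change / self.baseline


def adoption_rate(tasks_via_agent: int, eligible_tasks: int) -> float:
    """Share of eligible tasks that actually went through the agent."""
    return tasks_via_agent / eligible_tasks if eligible_tasks else 0.0


# Hypothetical pilot figures for illustration only.
handling_time = Metric("avg handling time (min)", baseline=8, day_90=6)
print(handling_time.improvement_pct())  # 25.0
print(adoption_rate(1_200, 4_000))      # 0.3
```

Measuring adoption against eligible tasks, rather than against licensed users, is what separates actual usage from mere availability.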

Question 4: What’s Your Human-in-the-Loop Strategy?

In 2026, no responsible organization deploys fully autonomous agents for consequential decisions. The question is where to place human oversight.

The spectrum:

  1. Human-initiated: Human starts task, agent assists, human approves result
  2. Agent-initiated with approval: Agent identifies opportunity, proposes action, human approves
  3. Agent-executed with audit: Agent acts autonomously, human reviews after the fact
  4. Fully autonomous: Agent acts without oversight (appropriate only for low-risk, reversible actions)

My framework for choosing:

| Risk Level | Reversibility | Recommended Approach |
|---|---|---|
| Low | Easy to reverse | Fully autonomous |
| Low | Hard to reverse | Agent-executed with audit |
| High | Easy to reverse | Agent-initiated with approval |
| High | Hard to reverse | Human-initiated only |

Examples:

  • Password reset → Fully autonomous (low risk, reversible)
  • Customer refund under $50 → Agent-executed with audit
  • Credit decision → Agent-initiated with approval
  • Regulatory filing → Human-initiated only
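The framework above is small enough to encode as a lookup table, which makes the oversight rule explicit and testable instead of leaving it to case-by-case judgment. This is a minimal sketch. The enum and function names are illustrative, and classifying a given action as low or high risk remains a human decision.

```python
from enum import Enum


class Risk(Enum):
    LOW = "low"
    HIGH = "high"


class Oversight(Enum):
    FULLY_AUTONOMOUS = "Fully autonomous"
    AUDIT = "Agent-executed with audit"
    APPROVAL = "Agent-initiated with approval"
    HUMAN_INITIATED = "Human-initiated only"


# (risk level, easy to reverse?) -> oversight mode, copied from the framework above.
POLICY = {
    (Risk.LOW, True): Oversight.FULLY_AUTONOMOUS,
    (Risk.LOW, False): Oversight.AUDIT,
    (Risk.HIGH, True): Oversight.APPROVAL,
    (Risk.HIGH, False): Oversight.HUMAN_INITIATED,
}


def required_oversight(risk: Risk, easy_to_reverse: bool) -> Oversight:
    return POLICY[(risk, easy_to_reverse)]


print(required_oversight(Risk.LOW, True).value)    # Fully autonomous
print(required_oversight(Risk.HIGH, False).value)  # Human-initiated only
```

Keeping the policy in one table means a change in risk appetite is a one-line edit that can be reviewed and audited.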

The regulatory reality: Financial services, healthcare, and other regulated industries will require human oversight for most consequential decisions for the foreseeable future. Plan for this. Agents that “recommend and explain” are more valuable than agents that “decide and act” in these contexts.

Question 5: What’s Your Data Strategy?

AI agents are only as good as the data they can access. Before buying, audit:

Data availability:

  • Can the agent access all systems needed for the use case?
  • Are APIs available, or will you need custom integrations?
  • What’s the latency? Agents that wait 30 seconds for data lookups frustrate users.

Data quality:

  • Is your data accurate and up-to-date?
  • Are there known data quality issues that will cause agent errors?
  • Who’s responsible for data hygiene?

Data governance:

  • What data can the agent access? What’s off-limits?
  • How do you prevent agents from exposing sensitive information?
  • What audit trail exists for agent data access?
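The governance questions above come down to two mechanisms: an allowlist of data sources and a record of every access attempt. The sketch below shows the idea in its simplest form. The class, agent ID, and source names are hypothetical, and a real deployment would rely on the access controls and logging of your platform rather than an in-memory list.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class DataAccessPolicy:
    allowed_sources: set[str]
    audit_log: list[dict] = field(default_factory=list)

    def request(self, agent_id: str, source: str) -> bool:
        """Check the allowlist and record the attempt, whether or not it is allowed."""
        allowed = source in self.allowed_sources
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "agent": agent_id,
            "source": source,
            "allowed": allowed,
        })
        return allowed


# Hypothetical sources for a customer service agent.
policy = DataAccessPolicy(allowed_sources={"order_status", "return_policy"})
print(policy.request("support-agent", "order_status"))  # True
print(policy.request("support-agent", "payroll"))       # False
print(len(policy.audit_log))                            # 2
```

Denied requests are logged as well, since attempts to reach off-limits data are often the most useful entries in an audit trail.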

The integration tax: Most enterprises underestimate integration effort by 3-5x. If the vendor says “we integrate with everything,” ask for customer references running your specific system combination. Generic claims mean nothing.

MCP and the future: Anthropic’s Model Context Protocol (MCP) is emerging as a standard for agent-to-system communication. Consider whether your chosen platform supports open standards or locks you into proprietary integrations.

The Decision Framework

After answering these five questions, you should be able to complete this statement:

“We will deploy [specific agent type] to solve [specific problem] in [specific process]. We expect to achieve [specific metric improvement] within [timeframe]. Our human oversight model is [approach]. We have confirmed data access to [systems] and defined governance policies for [sensitive data types].”

If you can’t complete this statement with confidence, you’re not ready to buy.
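One way to make this readiness test concrete is to treat the statement as a form with required fields. The sketch below is illustrative only. The field names are my shorthand for the bracketed slots in the statement, and the example values are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class DecisionStatement:
    agent_type: str = ""
    problem: str = ""
    process: str = ""
    metric_improvement: str = ""
    timeframe: str = ""
    oversight_model: str = ""
    systems: str = ""
    sensitive_data_types: str = ""

    def missing(self) -> list[str]:
        """Names of slots that are still empty, in statement order."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def render(self) -> str:
        gaps = self.missing()
        if gaps:
            raise ValueError(f"Not ready to buy; unanswered: {', '.join(gaps)}")
        return (
            f"We will deploy {self.agent_type} to solve {self.problem} "
            f"in {self.process}. We expect to achieve {self.metric_improvement} "
            f"within {self.timeframe}. Our human oversight model is "
            f"{self.oversight_model}. We have confirmed data access to "
            f"{self.systems} and defined governance policies for "
            f"{self.sensitive_data_types}."
        )


# A hypothetical draft with only the first three slots answered.
draft = DecisionStatement(
    agent_type="an embedded service agent",
    problem="routine ticket volume",
    process="customer service",
)
print(draft.missing())
# ['metric_improvement', 'timeframe', 'oversight_model', 'systems', 'sensitive_data_types']
```

Calling `draft.render()` on this draft raises an error listing the open questions, which is the point: the contract waits until the list is empty.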

My Recommendations for 2026

For most enterprises: Start with your existing vendor’s agent capabilities. Salesforce Agentforce, ServiceNow AI Agents, or Microsoft Copilot Studio will handle 80% of common use cases. The integration is already done. The governance frameworks exist. The risk is manageable.

For customer service: Sierra has proven traction. If customer experience is strategic and your existing vendor’s agents aren’t cutting it, evaluate Sierra seriously. Their $100M ARR in 21 months signals real value delivery.

For unique processes: Build on Anthropic Claude or OpenAI, but only if you have the engineering talent to maintain custom solutions. The foundation models are extraordinary. The engineering required to productionize them is not trivial.

For everyone: Start small. One use case. 90-day proof of value. Then expand. The vendors want you to buy platforms. You should buy solutions to specific problems.

The Bottom Line

The AI agent market is real. The value is real. But so is the hype.

62% of companies investing in agentic AI expect more than 100% ROI. Some will achieve it. Many won’t — not because the technology failed, but because they bought the wrong solution for the wrong problem with the wrong implementation approach.

Don’t be a statistic. Ask the five questions. Complete the decision framework. Then — and only then — sign the contract.

The agents are ready. Make sure you are too.