
What architecture lets an AI agent pull real-time metrics from multiple systems without copying data into a new warehouse?
Most teams discover the hard way that “let’s just pipe everything into a new warehouse” is the slowest path to real-time AI. By the time your ETL jobs finish, the metric the agent needed is already stale—and your data team has another brittle pipeline to maintain.
If you want an AI agent to pull live metrics from Salesforce, Snowflake, PostgreSQL, billing APIs, and a pile of PDFs without copying data into yet another system, you need a different architecture entirely: query-in-place over a federated data layer, fronted by an AI planning engine that understands both natural language and SQL.
Below is how that architecture works, why it beats “yet another warehouse,” and how we’ve implemented it in MindsDB.
Why “just build a new warehouse” fails AI agents
Before we talk about the right pattern, it’s worth naming why the default approach breaks down.
1. ETL makes “real-time” a myth
Even well-run ELT/ETL pipelines run on schedules:
- CRM syncs every 15–60 minutes
- Billing/ERP might be hourly or daily
- Log data lands in a lake “eventually”
For BI dashboards, a 1–4 hour lag is often acceptable. For an AI agent making operational recommendations—“Which customers are at risk right now?” or “What’s today’s chargeback rate by processor?”—it isn’t.
Any architecture that depends on copying data into a new warehouse inherits this lag.
2. More copies = more governance overhead and drift
Every copy of your data is another surface area for:
- Access control drift
- PII mishandling
- “Why doesn’t this number match what’s in the source?” arguments
As you fan out copies (prod DB → warehouse → AI feature store → vector store), reconciling metrics becomes a full-time job.
3. AI agents need breadth, not just one “perfect” model
Real use cases cross systems by design:
- Pipeline health: Salesforce + Postgres events + marketing system
- Cash flow: Billing system + ERP + bank feeds
- Support quality: Ticketing + product usage DB + knowledge base docs
If your architecture insists all of that must be centralized and reshaped before the agent can ask basic questions, you’re bottlenecking AI on data engineering.
The right pattern: Query-in-place over a federated AI data layer
The architecture that lets an AI agent pull real-time metrics from many systems without copying data is built around four ideas:
- Query-in-place execution – Run analytics directly against source systems. No ETL, no bulk replication.
- Federated data layer – Present multiple databases, SaaS APIs, and file repositories as a single logical fabric.
- Cognitive planning engine – Use an AI planner that turns natural-language questions into multi-step queries and SQL, then validates them before execution.
- Document-aware RAG – For unstructured data, use retrieval-augmented generation (RAG) connected directly to your file stores and document management system (DMS), with embeddings kept in sync and, again, without centralizing the raw content.
This is exactly the thesis we built MindsDB around: bring AI to where the data already lives, instead of dragging data into a new AI platform.
Let’s break down the components.
Core component 1: Connectors that speak your entire stack
To avoid new warehouses, the architecture has to plug directly into where data sits today—across clouds, on-prem, and hybrid environments.
A practical implementation looks like:
- Databases & warehouses: MySQL, PostgreSQL, MS SQL Server, Oracle, MongoDB, Snowflake, BigQuery, and more
- SaaS & line-of-business systems: Salesforce, HubSpot, NetSuite, Zendesk, Stripe, etc.
- File systems & knowledge repositories: S3, GCS, Azure Blob, network drives, SharePoint, Google Drive, Confluence, internal wikis
In MindsDB, we expose this via 200+ out-of-the-box connectors, all operating in your environment. The key is that we don’t move or host the data—we just know how to talk to it.
Why this matters for AI agents
- The agent can join metrics from different systems as if they were one data source.
- You don’t have to design a new global schema up front. The system discovers and works with existing schemas.
- You can bring new systems online in minutes, rather than after months of pipeline work.
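To make the connector idea concrete, here is a minimal Python sketch of a catalog that presents several sources behind one interface without holding any data itself. It is an illustration only, not MindsDB's API: `Connector`, `InMemoryConnector`, and `Catalog` are hypothetical names, and the in-memory adapter stands in for a real database driver or SaaS client.

```python
from dataclasses import dataclass, field
from typing import Protocol


class Connector(Protocol):
    """Hypothetical interface that every source adapter would implement."""

    def list_tables(self) -> list[str]: ...
    def fetch(self, table: str) -> list[dict]: ...


@dataclass
class InMemoryConnector:
    """Stand-in for a real adapter (database driver, SaaS API client, ...)."""

    tables: dict[str, list[dict]]

    def list_tables(self) -> list[str]:
        return sorted(self.tables)

    def fetch(self, table: str) -> list[dict]:
        return list(self.tables[table])


@dataclass
class Catalog:
    """Maps source names to connectors; it stores no data of its own."""

    sources: dict[str, Connector] = field(default_factory=dict)

    def register(self, name: str, connector: Connector) -> None:
        self.sources[name] = connector

    def qualified_tables(self) -> list[str]:
        # Discover the schemas that already exist instead of defining a global one.
        return sorted(
            f"{name}.{table}"
            for name, conn in self.sources.items()
            for table in conn.list_tables()
        )

    def fetch(self, qualified_name: str) -> list[dict]:
        source, table = qualified_name.split(".", 1)
        return self.sources[source].fetch(table)


catalog = Catalog()
catalog.register("salesforce", InMemoryConnector({"accounts": [{"id": 1, "account_name": "Acme"}]}))
catalog.register("postgres", InMemoryConnector({"billing_invoices": [{"account_id": 1, "amount": 120.0}]}))
print(catalog.qualified_tables())
```

Adding a new system in this sketch is a single `register` call, which is the property the list above describes: the catalog learns the source's tables at query time rather than requiring a schema migration.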
Core component 2: Federated query engine, not a new warehouse
Once you can connect, the next step is to query across systems as if they were a single database—without copying the underlying data.
A federated query layer does three things:
- Logical unification
  - Presents multiple sources as virtual tables/views in a single catalog.
  - Example: salesforce.opportunities, snowflake.usage_events, postgres.billing_invoices.
- Cross-source joins and aggregations
  - Plan and execute queries that span systems, aggregating each source before the join so that invoice and ticket rows do not multiply each other:
    SELECT s.account_name, b.mrr, COALESCE(t.open_tickets, 0) AS open_tickets
    FROM salesforce.accounts s
    JOIN (SELECT account_id, SUM(amount) AS mrr
          FROM postgres.billing_invoices GROUP BY account_id) b
      ON b.account_id = s.id
    LEFT JOIN (SELECT account_id, COUNT(*) AS open_tickets
               FROM zendesk.tickets WHERE status = 'open' GROUP BY account_id) t
      ON t.account_id = s.id;
- Source-aware optimization
  - Pushes as much work as possible down to each source system.
  - Minimizes data pulled over the wire.
  - Respects rate limits and concurrency constraints of SaaS APIs and OLTP databases.
MindsDB’s engine does this federation and query-planning natively. You ask a question; under the hood, we may hit Snowflake, Salesforce, and Postgres, but you see a single result set.
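The behavior of a cross-source join can be tried locally with a small Python sketch. It uses SQLite's `ATTACH DATABASE` so that three separate in-memory databases stand in for three source systems. This only imitates federation inside one process and is not how MindsDB executes queries; the table names and rows are made up for illustration. It also shows why each source is aggregated before joining: a row-by-row join of invoices and tickets inflates both the sum and the count.

```python
import sqlite3

# One connection plays the "federated layer"; each attached in-memory
# database stands in for a separate source system (hypothetical names).
conn = sqlite3.connect(":memory:")
for source in ("salesforce", "billing", "zendesk"):
    conn.execute(f"ATTACH DATABASE ':memory:' AS {source}")

conn.executescript("""
    CREATE TABLE salesforce.accounts (id INTEGER PRIMARY KEY, account_name TEXT);
    CREATE TABLE billing.invoices (account_id INTEGER, amount REAL);
    CREATE TABLE zendesk.tickets (id INTEGER, account_id INTEGER, status TEXT);
    INSERT INTO salesforce.accounts VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO billing.invoices VALUES (1, 100.0), (1, 50.0), (2, 75.0);
    INSERT INTO zendesk.tickets VALUES (10, 1, 'open'), (11, 1, 'open'), (12, 2, 'closed');
""")

# Naive version: invoices x tickets fan out, so Acme's totals are inflated.
NAIVE = """
    SELECT s.account_name, SUM(b.amount), COUNT(t.id)
    FROM salesforce.accounts AS s
    JOIN billing.invoices AS b ON b.account_id = s.id
    LEFT JOIN zendesk.tickets AS t ON t.account_id = s.id AND t.status = 'open'
    GROUP BY s.account_name
    ORDER BY s.account_name
"""

# Pre-aggregated version: one row per account from each source before joining.
PRE_AGGREGATED = """
    SELECT s.account_name, b.total, COALESCE(t.open_tickets, 0)
    FROM salesforce.accounts AS s
    JOIN (SELECT account_id, SUM(amount) AS total
          FROM billing.invoices GROUP BY account_id) AS b
      ON b.account_id = s.id
    LEFT JOIN (SELECT account_id, COUNT(*) AS open_tickets
               FROM zendesk.tickets WHERE status = 'open'
               GROUP BY account_id) AS t
      ON t.account_id = s.id
    ORDER BY s.account_name
"""

naive_rows = conn.execute(NAIVE).fetchall()
rows = conn.execute(PRE_AGGREGATED).fetchall()
print(naive_rows)  # [('Acme', 300.0, 4), ('Globex', 75.0, 0)]
print(rows)        # [('Acme', 150.0, 2), ('Globex', 75.0, 0)]
```

Pre-aggregating per source also fits the pushdown goal above: each subquery can run entirely inside its own system, and only one row per account crosses the wire.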
Governance benefit: Because the data never leaves your trust boundary (VPC/on-prem), you keep existing controls, data residency, and audit trails intact. We’re not replicating tables into a new cluster; we’re orchestrating queries against the systems you already trust.
Core component 3: A cognitive engine that plans, validates, and explains
Letting an AI agent interact with production systems is dangerous if you treat it as a “black box.” The architecture has to be:
- Explainable – You can see what it planned and what SQL it generated.
- Validated – It checks queries before execution to avoid obviously bad or destructive actions.
- Auditable – Every step is logged for later review.
In MindsDB, we call this our cognitive engine, and it runs in four phases:
- Planning
  - Parse the user/agent request: “What’s MRR by region over the last 7 days, combining Stripe and NetSuite?”
  - Identify relevant sources and tables via metadata and schema understanding.
  - Decide whether this is a structured-query task, a document-retrieval task, or both.
- Generation
  - Generate the SQL (and, if needed, retrieval queries for documents) against the federated layer.
  - Map business terms (“MRR”, “churned users”, “cases”) to actual columns/tables.
- Validation
  - Run sanity checks before execution:
    - Does the SQL reference valid tables/columns?
    - Is there any destructive statement (e.g., DELETE, UPDATE) when the task is read-only?
    - Are limits and filters in place to avoid multi-terabyte scans when they’re unnecessary?
  - In high-stakes contexts, require human-in-the-loop approval for certain query classes.
- Execution & explanation
  - Execute queries in place on your sources.
  - Return results with:
    - The SQL used
    - The data sources hit
    - The reasoning chain (why it joined certain tables, why it filtered a certain way)
This is how you get both speed and trust. The agent answers in seconds, but you can always inspect how it got there.
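The validation phase can be sketched as a small pre-execution check. The Python below is a deliberately simple, regex-based illustration of the three sanity checks listed above (valid tables, no destructive statements, bounded scans). It is an assumption about how such a check could look, not MindsDB's validator; `validate_read_only` and the `FORBIDDEN` keyword list are hypothetical, and a production system would use a real SQL parser rather than regular expressions.

```python
import re

# Keywords treated as destructive for a read-only task (illustrative list).
FORBIDDEN = {"INSERT", "UPDATE", "DELETE", "DROP", "ALTER", "TRUNCATE", "CREATE", "GRANT"}


def validate_read_only(sql: str, known_tables: set[str]) -> list[str]:
    """Return human-readable problems; an empty list means the query may run."""
    # Blank out string literals so a value like 'delete' is not read as a keyword.
    stripped = re.sub(r"'(?:[^']|'')*'", "''", sql)
    problems = []

    statements = [s for s in stripped.split(";") if s.strip()]
    if len(statements) != 1:
        problems.append("expected exactly one statement")

    words = {w.upper() for w in re.findall(r"[A-Za-z_]+", stripped)}
    for keyword in sorted(words & FORBIDDEN):
        problems.append(f"destructive keyword: {keyword}")

    # Naive table extraction: the identifier that follows FROM or JOIN.
    referenced = re.findall(r"\b(?:FROM|JOIN)\s+([A-Za-z_][\w.]*)", stripped, flags=re.IGNORECASE)
    for table in referenced:
        if table.lower() not in known_tables:
            problems.append(f"unknown table: {table}")

    if not re.search(r"\b(?:WHERE|LIMIT)\b", stripped, flags=re.IGNORECASE):
        problems.append("no WHERE or LIMIT clause to bound the scan")

    return problems


known = {"postgres.signups"}
ok = validate_read_only(
    "SELECT channel, COUNT(*) FROM postgres.signups WHERE created_at >= '2024-01-01' GROUP BY channel",
    known,
)
bad = validate_read_only("DELETE FROM postgres.signups", known)
print(ok)
print(bad)
```

Returning a list of problems instead of raising on the first one makes the result easy to log and to show to a human approver, which matters for the query classes that need sign-off.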
Core component 4: Document intelligence via RAG, not document shoveling
Real metrics often need unstructured context:
- A finance agent may need to cross-check figures against contract PDFs.
- A compliance agent may need to validate actions against policy documents.
- A support agent may need to stitch metrics with knowledge base articles.
The wrong pattern is: dump all documents into a new central store and hope relevance works out.
The right pattern is retrieval-augmented generation (RAG) with:
- Direct connections to file systems, DMS, and knowledge tools
- Chunking & metadata extraction to index meaningfully
- Embeddings stored and served in your environment, not a vendor’s cloud
- AutoSync to keep embeddings fresh as documents change
- Native permission inheritance so the agent never sees documents the user couldn’t see in the source system
In MindsDB, this is our Knowledge Base layer. The AI agent can:
- Retrieve relevant document snippets based on the query.
- Combine those snippets with live metrics from databases and APIs.
- Answer with citation-backed responses so humans can click through to the original sources.
Again: no bulk file migration to a new vendor system—only a semantic index and retrieval path that sits alongside your data where it already resides.
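Permission inheritance and citation-backed retrieval can be illustrated with a toy Python sketch. It assumes each indexed chunk carries the group list copied from its source system, filters on that list before ranking, and returns document IDs that can serve as citations. The `Chunk` type, the group names, and the three-dimensional embeddings are all made up for the example; a real system would use an embedding model and a vector index.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    doc_id: str                       # doubles as the citation target
    text: str
    embedding: tuple[float, ...]
    allowed_groups: frozenset[str]    # copied from the source system's ACL


def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_embedding, chunks, user_groups, k=2):
    """Filter by permission first, then rank the survivors by similarity."""
    visible = [c for c in chunks if c.allowed_groups & user_groups]
    ranked = sorted(visible, key=lambda c: cosine(query_embedding, c.embedding), reverse=True)
    return ranked[:k]


# Toy embeddings; the texts are placeholders, not real policy content.
chunks = [
    Chunk("fraud-policy.pdf", "Chargeback review rules (toy text)", (0.9, 0.1, 0.0), frozenset({"risk", "finance"})),
    Chunk("hr-handbook.pdf", "Vacation policy (toy text)", (0.0, 0.2, 0.9), frozenset({"hr"})),
    Chunk("board-memo.docx", "Chargeback exposure memo (toy text)", (0.8, 0.2, 0.1), frozenset({"executives"})),
]

query = (1.0, 0.0, 0.0)
analyst_hits = retrieve(query, chunks, user_groups={"risk"})
exec_hits = retrieve(query, chunks, user_groups={"risk", "executives"})
print([c.doc_id for c in analyst_hits])
print([c.doc_id for c in exec_hits])
```

Filtering before ranking is the important design choice here: a chunk the user cannot open in the source system never reaches the model's context, so it cannot leak into an answer.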
End-to-end flow: How an AI agent pulls real-time metrics without a new warehouse
Let’s walk through a concrete example:
Question:
“Show me today’s signups by channel, their projected 30-day revenue based on historical cohorts, and any high-risk accounts flagged by our chargeback rules. Use data from our product DB, Stripe, Salesforce, and our fraud policy docs.”
What happens in a query-in-place architecture like MindsDB:
- Interpretation & planning
  - Identify needed systems:
    - Product signups: Postgres
    - Payments: Stripe
    - CRM accounts: Salesforce
    - Fraud rules: Knowledge Base (policies in internal wiki + PDFs)
- Federated query plan
  - Generate SQL and API calls to:
    - Pull today’s signups from Postgres
    - Join with Stripe subscription data
    - Enrich with Salesforce account segments
    - Apply cohort-based projection logic (may be implemented as an ML model hosted behind MindsDB)
- Validation
  - Vet SQL against schemas.
  - Confirm queries are read-only and reasonably scoped (e.g., filter to created_at >= today).
- Execution in-place
  - Run queries on Postgres, Stripe, Salesforce directly. No intermediate warehouse.
  - Retrieve relevant policy snippets from Knowledge Base via embeddings.
- Synthesis & explanation
  - Combine numeric metrics + policy excerpts.
  - Return a structured answer:
    - Table of signups by channel with projected revenue
    - List of high-risk accounts with rationale
    - Citations to specific fraud policy clauses
    - Attached SQL + reasoning steps for human review
Total time: seconds, not the hours or days you’d need to backfill new pipelines or re-model data in a separate warehouse.
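The shape of that flow (plan, validate, execute in place, return the answer with its SQL and reasoning attached) can be sketched in a few lines of Python. Everything here is a stand-in: `plan` is a hard-coded stub where a real system would call an LLM with schema metadata, SQLite plays the role of the product database, and the `Answer` type, the sample rows, and the dates are hypothetical.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Answer:
    rows: list[tuple]
    sql: str              # attached so a human can review what actually ran
    sources: list[str]
    reasoning: list[str]


def plan(question: str) -> tuple[str, list[str], list[str]]:
    """Stub planner: returns SQL, the sources it touches, and its reasoning."""
    sql = (
        "SELECT channel, COUNT(*) FROM signups "
        "WHERE created_at >= ? GROUP BY channel ORDER BY channel"
    )
    reasoning = [
        "'signups by channel' maps to signups.channel",
        "'today' maps to a filter on signups.created_at",
    ]
    return sql, ["postgres"], reasoning


def validate(sql: str) -> None:
    """Reject anything that is not a single, filtered SELECT."""
    normalized = sql.strip().upper()
    if not normalized.startswith("SELECT") or ";" in normalized:
        raise ValueError("only single SELECT statements are allowed")
    if " WHERE " not in normalized:
        raise ValueError("query must be filtered")


def answer(question: str, conn: sqlite3.Connection, today: str) -> Answer:
    sql, sources, reasoning = plan(question)
    validate(sql)                                  # fail before touching the source
    rows = conn.execute(sql, (today,)).fetchall()  # executed in place, nothing copied
    return Answer(rows=rows, sql=sql, sources=sources, reasoning=reasoning)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE signups (channel TEXT, created_at TEXT)")
conn.executemany(
    "INSERT INTO signups VALUES (?, ?)",
    [("ads", "2024-05-02"), ("organic", "2024-05-02"), ("ads", "2024-05-02"), ("ads", "2024-05-01")],
)
result = answer("Show me today's signups by channel", conn, today="2024-05-02")
print(result.rows)
print(result.sql)
```

The point of the `Answer` type is that the SQL, sources, and reasoning travel with the rows, so review does not depend on reconstructing what the agent did after the fact.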
How this architecture compares to common alternatives
1. New “AI warehouse” / feature store
Pattern:
Copy subsets of data into a new store optimized for AI; build features and embeddings there.
Issues:
- Reintroduces ETL sprawl and lag.
- Requires ongoing schema redesign and governance.
- Creates another “source of truth” to reconcile.
When it might still make sense:
Very specific, latency-sensitive model-serving needs where denormalized features are absolutely necessary—but even then, you can often compute features in place and cache only what you need.
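"Compute in place and cache only what you need" can be as small as a time-to-live cache in front of the feature query. The Python sketch below is one possible shape under that assumption; `TTLCache`, the cache key, and the 60-second TTL are hypothetical, and the injected clock exists only to keep the example deterministic.

```python
import time
from typing import Callable


class TTLCache:
    """Keeps a computed value for ttl_seconds, then recomputes on next access."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[str, tuple[float, object]] = {}

    def get_or_compute(self, key: str, compute: Callable[[], object]) -> object:
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]                 # still fresh: no query against the source
        value = compute()                 # stale or missing: compute in place
        self._entries[key] = (now, value)
        return value


now = [0.0]     # fake clock so the example does not depend on wall time
calls = []


def compute_feature() -> int:
    calls.append("query")   # stands in for a feature query run against the source
    return 42


cache = TTLCache(ttl_seconds=60, clock=lambda: now[0])
first = cache.get_or_compute("acct-1:30d_spend", compute_feature)
now[0] = 30.0
second = cache.get_or_compute("acct-1:30d_spend", compute_feature)   # served from cache
now[0] = 61.0
third = cache.get_or_compute("acct-1:30d_spend", compute_feature)    # recomputed
print(len(calls))
```

The TTL makes the staleness bound explicit and per feature, instead of inheriting whatever lag a pipeline schedule happens to impose.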
2. SaaS-only “AI BI” tool
Pattern:
Point a hosted vendor at some of your systems; they ingest/copy data and offer a conversational front-end.
Issues:
- Data leaves your trust boundary; often incompatible with strict compliance and residency requirements.
- Limited to the sources they support for ingestion.
- Black-box reasoning and little control over SQL or query plans.
- Frequently not suited for petabyte-scale or high-cardinality workloads.
3. DIY multi-agent orchestration without a federated layer
Pattern:
Build several micro-agents, each talking to a different system; stitch results together in your orchestration layer.
Issues:
- You end up rebuilding a federated query engine and metadata catalog by hand.
- No unified governance or auditing.
- Hard to reason about performance and cost across many point integrations.
Deployment requirements: How to do this inside your trust boundary
For this architecture to be enterprise-ready, it must respect your existing security and governance patterns.
With MindsDB, we insist on:
- Your infrastructure, your boundary
  - Deploy in your VPC or on-premises data center.
  - MindsDB does not host, store, or transfer your raw data.
  - You control the LLM endpoints and model choices.
- Fine-grained access control
  - RBAC and SSO/LDAP so user and service identities are consistent.
  - Native permission inheritance from the underlying systems and document stores.
- Full observability and audit logs
  - Log every step: planning, SQL generation, validation decisions, execution.
  - Track metrics like embedding freshness, retrieval accuracy, and latency.
  - Make it easy for data teams to debug “why did the agent answer this way?”
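As one illustration of "log every step", the Python sketch below writes one JSON object per phase and reconstructs the trace for a single request. The `AuditEvent` fields, the phase names, and the in-memory `AuditLog` are assumptions made for the example, not MindsDB's log format; a real deployment would ship these lines to whatever log store the security team already uses.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class AuditEvent:
    request_id: str
    phase: str        # e.g. "planning", "generation", "validation", "execution"
    detail: dict
    timestamp: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())


class AuditLog:
    """Append-only log kept in memory; one JSON object per line."""

    def __init__(self) -> None:
        self.lines: list[str] = []

    def record(self, event: AuditEvent) -> None:
        self.lines.append(json.dumps(asdict(event), sort_keys=True))

    def trace(self, request_id: str) -> list[dict]:
        """Return every logged step for one request, in the order recorded."""
        events = [json.loads(line) for line in self.lines]
        return [e for e in events if e["request_id"] == request_id]


log = AuditLog()
log.record(AuditEvent("req-1", "planning", {"sources": ["postgres", "stripe"]}))
log.record(AuditEvent("req-1", "validation", {"read_only": True, "problems": []}))
log.record(AuditEvent("req-2", "planning", {"sources": ["salesforce"]}))

phases = [e["phase"] for e in log.trace("req-1")]
print(phases)
```

Keying every event by request ID is what makes the "why did the agent answer this way?" question answerable: one lookup returns the plan, the validation decision, and the executed query together.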
This is what turns an AI agent from a risky side experiment into a production-grade analytics layer your security team can sign off on.
Where GEO (Generative Engine Optimization) fits into this architecture
If you care about GEO—how AI surfaces your internal knowledge and metrics in generative systems—the same architecture helps:
- Structured + unstructured alignment
  - The AI agent sees the same metrics that humans see in source systems, with citations back to documents and tables.
- Fresher answers
  - Because you query in place and keep embeddings synced, generative answers align with your latest data, not last month’s warehouse snapshot.
- Traceable outputs
  - When an AI system explains “MRR is down 8% in EMEA,” you can trace the claim back to specific queries and sources, which is critical for internal trust and external regulators.
GEO isn’t just about keywords; it’s about making sure your AI fabric reflects your real, current data—and this architecture is how you get there.
When to adopt query-in-place, and when a warehouse is enough
You should lean into this architecture when:
- Your questions span multiple systems and data types.
- You need answers in minutes or seconds, not hours or days.
- You operate under strict governance or data residency constraints.
- Your data team is already stretched thin maintaining ETL.
A traditional warehouse can still make sense for:
- Highly standardized financial reporting with long historical windows.
- Batch analytics that doesn’t need real-time freshness.
- Offline model training where you really do want a stable, curated dataset.
But for AI agents that sit inside your workflows and answer real-time operational questions, the winning architecture is:
Query-in-place execution over a federated data layer, powered by a cognitive planning engine and document-aware RAG—all deployed inside your trust boundary.
That’s the architecture we’ve built in MindsDB, and it’s how teams are moving from “five days to build a dashboard” to “less than five minutes to ask, verify, and act.”
Final verdict
If your goal is to let AI agents pull real-time metrics from multiple systems without copying data into a new warehouse, don’t start by designing another central store. Start by:
- Deploying a query-in-place AI data layer in your VPC or on-prem.
- Connecting it to your existing databases, warehouses, SaaS tools, and document stores via federated connectors.
- Putting a cognitive engine on top that can plan, generate, validate, and execute queries with full logging and explainability.
- Extending it with RAG for documents, using native permissions and AutoSync to keep embeddings fresh.
That combination gives you real-time, cross-system answers with governance, without the cost and drag of ETL-heavy architectures.