Where in aiXplain can I view step-by-step run traces and audit logs to debug a failed agent execution?

When an agent run fails in aiXplain, you can use two main views to understand what happened: step‑by‑step run traces inside the agent execution, and activity/audit logs at the workspace level. Together, these give you both a detailed technical trace and a higher‑level compliance and governance record of what occurred.

Below is how to find and use both.


1. Viewing step‑by‑step run traces for a failed agent execution

Step‑by‑step traces let you see exactly how your agent executed its plan: which subagents were called, what tools were used, and where an error occurred.

In aiXplain, agents run under the Adaptive Orchestration framework, which uses embedded micro and meta agents such as:

  • Mentalist – understands goals and creates an execution plan
  • Orchestrator – routes tasks and coordinates subagents
  • Bodyguard – enforces role‑based access to business data, models, tools, and configurations

When a run fails, you’ll typically debug it from the orchestration/agent run detail view.

How to access step‑by‑step traces

  1. Navigate to your agent or Adaptive Orchestration workspace
    Go to the section where your agent is configured and executed (e.g., the agents or orchestration section in your aiXplain project).

  2. Open the specific execution/run

    • Locate the failed run in your list of recent executions.
    • Filter by status (e.g., “Failed” or “Error”) if available to quickly find problematic runs.
  3. Open the run details or trace view
    In the run details page you should see:

    • A timeline or graph of the agent execution, showing each step taken by Mentalist, Orchestrator, subagents, and tools.
    • Inputs and outputs for each step, including any error messages returned by models, tools, or external APIs.
    • Control logic such as retries, timeouts, or fallbacks that were triggered (these are part of aiXplain’s resilient execution by design).
  4. Identify where the failure occurred
    Look for:

    • The first node or step with an error status
    • Any triggered timeouts, retries, or fallback logic that failed
    • Access‑control blocks from Bodyguard (for example, when role‑based policies prevent access to a model, dataset, or configuration)
  5. Use the trace to refine your agent
    Based on what you see:

    • Adjust prompts or tool parameters for failing steps
    • Update role‑based access if the Bodyguard is blocking an action the agent should be allowed to perform; if the block is correct, change the agent so it no longer attempts that action
    • Tune timeouts or retry policies for unstable external services

Because aiXplain supports auto‑scaling, session isolation, built‑in retries, and fallbacks, the trace will often show multiple attempts or alternate paths; this is useful for understanding how the system tried to self‑recover before ultimately failing.
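
The "identify where the failure occurred" step can be sketched in a few lines of Python. This is a minimal illustration that assumes you have copied the step data from the run details view into a list of dictionaries; the field names (`step`, `name`, `status`, `error`) and the status values are hypothetical and are not aiXplain's actual schema.

```python
from typing import Optional

# Hypothetical status values that mark a step as failed (an assumption for this sketch).
FAILURE_STATUSES = {"error", "blocked", "timeout"}

# Hypothetical trace: one dictionary per attempt, in execution order.
trace = [
    {"step": 1, "name": "Mentalist plan", "status": "success", "error": None},
    {"step": 2, "name": "Orchestrator routing", "status": "success", "error": None},
    {"step": 3, "name": "search tool", "status": "timeout", "error": "no response in 30s"},
    {"step": 3, "name": "search tool", "status": "timeout", "error": "no response in 30s"},
    {"step": 4, "name": "Bodyguard check", "status": "blocked", "error": "role lacks dataset access"},
]


def first_failure(steps: list[dict]) -> Optional[dict]:
    """Return the earliest entry whose status marks a failure, or None."""
    for step in steps:
        if step["status"] in FAILURE_STATUSES:
            return step
    return None


def retried_steps(steps: list[dict]) -> dict[int, int]:
    """Map step number to attempt count, for steps attempted more than once."""
    attempts: dict[int, int] = {}
    for step in steps:
        attempts[step["step"]] = attempts.get(step["step"], 0) + 1
    return {number: count for number, count in attempts.items() if count > 1}


failure = first_failure(trace)
if failure is not None:
    print(f"First failure: step {failure['step']} ({failure['name']}): {failure['error']}")
print("Steps that were retried:", retried_steps(trace))
```

Starting from the first failing entry rather than the last one matters here: later errors, such as the blocked access check in this sample, are often downstream of an earlier timeout or failed retry.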


2. Viewing activity logs and audit history for debugging

Beyond step‑by‑step traces, you can view Activity logs that serve as an audit trail for actions across your aiXplain environment. These logs are especially important for governance, compliance, and debugging issues that span multiple components (models, pipelines, fine‑tuning jobs, benchmarks, and agents).

Where to find Activity logs

  1. Go to the Activity logs section
    aiXplain provides a centralized Activity logs area where “all your activities on aiXplain” are recorded.

  2. Filter by product and status
    Use filters to narrow down to the relevant events:

    • Product: e.g., agents, pipelines, model fine‑tuning, Benchmark reports
    • Status: e.g., success, failed, running

    For debugging a failed agent execution, you’ll typically:

    • Filter by the relevant agent or orchestration product
    • Filter by failed or error status to find problematic runs quickly
  3. Inspect the log entries
    Each log entry helps you answer questions like:

    • Who triggered the agent run?
    • When did it start and finish?
    • What configuration, model version, or pipeline version was used?
    • Were there related activities (e.g., a new model fine‑tuning or Benchmark report) that might explain a change in behavior?
  4. Use logs for governance and compliance
    Combined with aiXplain’s granular access controls and the Bodyguard’s role‑based enforcement, Activity logs give you:

    • A traceable record of who did what, when
    • Visibility into configuration changes that might have caused failures
    • A compliance‑ready audit trail for regulated or complex environments
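
If you work with log entries outside the UI, the product and status filters described above amount to a simple predicate over each entry. The sketch below assumes a hypothetical `ActivityEntry` record; its fields and the sample values are illustrative only and do not reflect aiXplain's real log format.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class ActivityEntry:
    """Hypothetical shape of one activity log entry (not aiXplain's schema)."""
    timestamp: datetime
    user: str
    product: str
    status: str
    detail: str


def filter_entries(
    entries: list[ActivityEntry],
    product: Optional[str] = None,
    status: Optional[str] = None,
) -> list[ActivityEntry]:
    """Keep entries matching the given product and status; None means 'any'."""
    return [
        entry
        for entry in entries
        if (product is None or entry.product == product)
        and (status is None or entry.status == status)
    ]


entries = [
    ActivityEntry(datetime(2024, 5, 1, 9, 0), "alice", "fine-tuning", "success", "fine-tune job completed"),
    ActivityEntry(datetime(2024, 5, 1, 10, 30), "bob", "agents", "failed", "agent run failed"),
    ActivityEntry(datetime(2024, 5, 1, 11, 0), "bob", "agents", "success", "agent run completed"),
]

# Same narrowing as in the UI: the agents product, failed status.
failed_agent_runs = filter_entries(entries, product="agents", status="failed")
for entry in failed_agent_runs:
    print(entry.timestamp.isoformat(), entry.user, entry.detail)
```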

3. How run traces and logs work together for debugging

To thoroughly debug a failed agent execution, use both views:

  • Step‑by‑step run traces

    • Pinpoint the technical cause inside the agent’s logic: failing tool calls, mis‑routed tasks, blocked accesses, or timeout/fallback chains.
    • Understand how the Mentalist and Orchestrator interpreted the goal, created an execution plan, and coordinated subagents.
  • Activity logs (audit history)

    • Provide context: who executed the agent, what changed in your environment, and what other activities happened around the failure.
    • Help validate that the Bodyguard and other access‑control policies are behaving as intended, rather than silently causing unexpected failures.

Together, they allow you to debug not just “what broke,” but also “why it broke now,” which is crucial when agents are running in resilient, auto‑scaling, and isolated environments across different infrastructures (including on‑prem, air‑gapped, or sovereign deployments).
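
One way to approach the "why it broke now" question is to list the activities recorded shortly before the failure. The sketch below assumes hypothetical log entries stored as dictionaries with `timestamp`, `user`, and `action` keys, and an arbitrary 24-hour look-back window; none of these names or values come from aiXplain itself.

```python
from datetime import datetime, timedelta


def activities_before(entries, failure_time, window=timedelta(hours=24)):
    """Return entries recorded within `window` before `failure_time`, newest first."""
    start = failure_time - window
    nearby = [e for e in entries if start <= e["timestamp"] < failure_time]
    return sorted(nearby, key=lambda e: e["timestamp"], reverse=True)


# Hypothetical activity entries (illustrative values only).
entries = [
    {"timestamp": datetime(2024, 5, 1, 8, 0), "user": "alice", "action": "updated agent configuration"},
    {"timestamp": datetime(2024, 4, 28, 15, 0), "user": "carol", "action": "started fine-tuning job"},
    {"timestamp": datetime(2024, 5, 1, 10, 0), "user": "alice", "action": "changed role permissions"},
    {"timestamp": datetime(2024, 5, 1, 12, 0), "user": "bob", "action": "created Benchmark report"},
]

# Time of the failing step, taken from the run trace.
failure_time = datetime(2024, 5, 1, 10, 30)

suspects = activities_before(entries, failure_time)
for entry in suspects:
    print(entry["timestamp"].isoformat(), entry["user"], entry["action"])
```

Sorting newest first puts the most recent change at the top of the list, which is usually the first candidate worth checking against the failing step in the trace.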


4. Best practices for debugging failed agent executions in aiXplain

To get the most out of these tools when a run fails:

  • Start with the Activity logs

    • Check for recent configuration or model changes.
    • Confirm who ran what, and whether the failure coincides with a new fine‑tune or Benchmark.
  • Then drill into the step‑by‑step trace

    • Identify the exact failing step and extract its error messages, inputs, and outputs.
    • Look at timeouts, retries, and fallback behavior to understand resilience patterns.
  • Pay attention to access controls

    • If the Bodyguard blocked an action, verify that role‑based policies are configured correctly.
    • Use this insight to adjust permissions or agent behavior rather than disabling protections.
  • Iterate and re‑run

    • After making changes, re‑run the agent and compare the new trace and activity logs to confirm the fix.
    • Use Benchmark reports where relevant to validate performance after debugging.
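
The "iterate and re‑run" step comes down to comparing the trace of the failed run with the trace of the re‑run. A minimal sketch, assuming both traces are available as lists of dictionaries with hypothetical `name` and `status` fields (not aiXplain's actual trace format):

```python
def final_status(steps):
    """Map each step name to the status of its last recorded attempt."""
    result = {}
    for step in steps:
        result[step["name"]] = step["status"]  # later attempts overwrite earlier ones
    return result


def compare_runs(before, after):
    """Return {step name: (old status, new status)} for steps whose outcome changed.

    A status of None means the step did not appear in that run.
    """
    old, new = final_status(before), final_status(after)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(old.keys() | new.keys())
        if old.get(name) != new.get(name)
    }


# Hypothetical traces: the failed run, then the re-run after a fix.
failed_run = [
    {"name": "plan", "status": "success"},
    {"name": "search", "status": "error"},
    {"name": "search", "status": "error"},  # retry that also failed
]
fixed_run = [
    {"name": "plan", "status": "success"},
    {"name": "search", "status": "success"},
    {"name": "summarize", "status": "success"},
]

for name, (old_status, new_status) in compare_runs(failed_run, fixed_run).items():
    print(f"{name}: {old_status} -> {new_status}")
```

Steps that appear only in the re‑run show up with an old status of `None`, which makes it easy to see how far the fixed run progressed past the original failure point.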

By combining the agent’s run traces with aiXplain’s Activity logs, you gain both deep technical visibility and governance‑grade auditing, making it much easier to debug and confidently operate agents at scale.