Amazon Bedrock AgentCore Observability With Langfuse

Bonisiwe Shabane

The rise of artificial intelligence (AI) agents marks a change in how software is developed and in how applications make decisions and interact with users. While traditional systems follow predictable paths, AI agents engage in complex reasoning that remains hidden from view. This invisibility creates a challenge for organizations: how can they trust what they can’t see? This is where agent observability enters the picture, offering deep insights into how agentic applications perform, interact, and execute tasks. In this post, we explain how to integrate Langfuse observability with Amazon Bedrock AgentCore to gain deep visibility into an AI agent’s performance, debug issues faster, and optimize costs. We walk through a complete implementation using Strands Agents deployed on AgentCore Runtime, with step-by-step code examples.

Amazon Bedrock AgentCore is a comprehensive agentic platform for deploying and operating highly capable AI agents securely, at scale. It offers purpose-built infrastructure for dynamic agent workloads, powerful tools to enhance agents, and essential controls for real-world deployment. AgentCore is composed of fully managed services that can be used together or independently. These services work with any framework, including CrewAI, LangGraph, LlamaIndex, and Strands Agents, and any foundation model in or outside of Amazon Bedrock, offering flexibility and reliability. AgentCore emits telemetry data in a standardized OpenTelemetry (OTEL)-compatible format, enabling easy integration with an existing monitoring and observability stack. It offers detailed visualizations of each step in the agent workflow, so you can inspect an agent’s execution path, audit intermediate outputs, and debug performance bottlenecks and failures.

Langfuse uses OpenTelemetry to trace and monitor agents deployed on Amazon Bedrock AgentCore. OpenTelemetry is a Cloud Native Computing Foundation (CNCF) project that provides a set of specifications, APIs, and libraries that define a standard way to collect distributed traces and metrics from an application. Users can now track performance metrics including token usage, latency, and execution durations across different processing phases. The system creates hierarchical trace structures that capture both streaming and non-streaming responses, with detailed operation attributes and error states. Through the /api/public/otel endpoint, Langfuse functions as an OpenTelemetry Backend, mapping traces to its data model using generative AI conventions. This is particularly valuable for complex large language model (LLM) applications utilizing chains and agents with tools, where nested traces help developers quickly identify and resolve issues.
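To make the endpoint wiring concrete, here is a minimal sketch (assuming the Langfuse Cloud host shown and project keys held in environment variables; adjust for your region or a self-hosted deployment) of pointing the standard OTLP environment variables at Langfuse:

```python
import base64
import os

# Assumptions: Langfuse Cloud base URL and project keys supplied via
# environment variables -- substitute your own host and keys.
LANGFUSE_HOST = "https://cloud.langfuse.com"
public_key = os.environ["LANGFUSE_PUBLIC_KEY"]
secret_key = os.environ["LANGFUSE_SECRET_KEY"]

# Langfuse authenticates OTLP traffic with HTTP Basic auth:
# base64("public_key:secret_key") in the Authorization header.
auth_token = base64.b64encode(f"{public_key}:{secret_key}".encode()).decode()

# Point any OTLP/HTTP exporter at Langfuse's OTEL endpoint. Note that some
# OpenTelemetry SDK versions expect the space after "Basic" URL-encoded as %20.
os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = f"{LANGFUSE_HOST}/api/public/otel"
os.environ["OTEL_EXPORTER_OTLP_HEADERS"] = f"Authorization=Basic {auth_token}"
os.environ["OTEL_EXPORTER_OTLP_PROTOCOL"] = "http/protobuf"
```

Any OpenTelemetry-instrumented process started with these variables set will export its spans to the Langfuse project identified by the keys.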

The integration supports systematic debugging, performance monitoring, and audit trail maintenance, making it easier for teams to build and maintain reliable AI applications on Amazon Bedrock AgentCore.

What is Amazon Bedrock AgentCore?

Amazon Bedrock AgentCore is a managed service that enables you to build, deploy, and manage AI agents in production. It provides containerized agent runtimes that can execute complex workflows, use tools, and interact with external APIs while leveraging foundation models from Amazon Bedrock.

What is Langfuse?

Langfuse is an open-source platform for LLM engineering.

It provides tracing and monitoring capabilities for AI agents, helping developers debug, analyze, and optimize their products. Langfuse integrates with various tools and frameworks via native integrations, OpenTelemetry, and SDKs. This guide shows you how to integrate Langfuse with Amazon Bedrock AgentCore to trace your agent executions using OpenTelemetry. First, install the required Python packages for building and deploying AgentCore agents with Langfuse tracing, then configure your AWS and Langfuse credentials, as sketched below.
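The setup itself is short. The following sketch assumes the publicly available package names (strands-agents, bedrock-agentcore, and the AgentCore starter toolkit) and uses placeholder credential values that you would replace with your own:

```python
# Install the agent and telemetry dependencies (package names reflect the
# public SDKs at the time of writing -- adjust versions to your environment):
#
#   pip install strands-agents bedrock-agentcore bedrock-agentcore-starter-toolkit \
#               opentelemetry-sdk opentelemetry-exporter-otlp boto3

import os

# AWS: rely on the standard credential chain (environment variables, shared
# config/credentials files, or an attached IAM role); set the region explicitly.
os.environ.setdefault("AWS_REGION", "us-east-1")  # example region

# Langfuse project credentials -- placeholders copied from the project settings page.
os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-lf-..."   # placeholder
os.environ["LANGFUSE_SECRET_KEY"] = "sk-lf-..."   # placeholder
os.environ["LANGFUSE_HOST"] = "https://cloud.langfuse.com"
```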

The emergence of artificial intelligence (AI) agents is redefining how software applications are developed, how they make decisions, and how they interact with users. Unlike traditional systems, which follow predictable pathways, AI agents employ complex reasoning processes that are often obscured from developers and stakeholders. This lack of transparency raises a significant question: how can organizations cultivate trust in systems they cannot fully comprehend? Enter agent observability. Agent observability provides organizations with the tools to gain deep insight into their applications’ performance, interactions, and task execution. By making the once-invisible workings of AI agents visible, organizations can monitor, debug, and optimize their AI systems efficiently and effectively. In this post, we explore how to integrate Langfuse observability with Amazon Bedrock AgentCore.

This integration not only enhances visibility into an AI agent’s performance but also expedites issue resolution and cost optimization. Amazon Bedrock AgentCore is a robust platform designed to deploy and operate highly capable AI agents securely and at scale. It comprises fully managed services that can work together or independently, offering flexibility and reliability with purpose-built infrastructure for dynamic agent workloads. AgentCore is compatible with various frameworks, such as CrewAI, LangGraph, LlamaIndex, and Strands Agents, facilitating a seamless development experience. AgentCore emits telemetry data in a standardized OpenTelemetry (OTEL)-compatible format, allowing easy integration with existing monitoring and observability stacks. This capability enables detailed visualizations of each step in the agent workflow, facilitating inspections of execution paths and audits of intermediate outputs.

Amazon Web Services (AWS) has published a new technical guide showing how developers can integrate Langfuse observability with Amazon Bedrock AgentCore, enhancing visibility into the performance and behavior of advanced AI agents deployed on the platform. This combined solution aims to help teams monitor, debug, and optimize agentic applications that rely on large language models (LLMs) and complex workflows. As organizations build more intelligent applications, traditional monitoring tools fall short in tracking the intricate reasoning and interactions that AI agents perform. Agent observability fills this gap by exposing detailed traces of agent execution, making it possible to understand how agents make decisions and interact with tools in real time. This transparency is essential for reliability, performance tuning, and cost management. Amazon Bedrock AgentCore is AWS’s managed platform for deploying and operating scalable, secure AI agents.

It provides purpose‑built infrastructure that integrates with open‑source frameworks such as Strands, CrewAI, LangGraph, and LlamaIndex, and supports models both inside and outside of Bedrock. AgentCore emits telemetry data in an OpenTelemetry (OTEL)‑compatible format, enabling seamless integration with monitoring solutions. By pairing this telemetry with Langfuse, developers gain hierarchical trace structures that map an agent’s lifecycle, including token usage, latency, tool invocations, errors, and other key metrics. Traces flow through Langfuse’s OTEL backend endpoint, giving teams the ability to diagnose issues, audit agent decisions, and identify performance bottlenecks quickly—capabilities that are especially valuable for complex LLM chains with tools and nested traces. The blog post includes a step‑by‑step walkthrough showing how to deploy a Strands‑based agent on AgentCore Runtime, export its telemetry via OTEL, and visualize traces in Langfuse. Developers can use this approach to improve operational confidence, support observability best practices, and maintain cost efficiencies as they scale agentic AI in production environments.
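As a rough illustration of that walkthrough, the sketch below shows what such an agent entrypoint might look like. The imports follow the public strands-agents and bedrock-agentcore packages, but treat the exact class and decorator names as assumptions to verify against the current SDK documentation:

```python
# agent.py -- a minimal Strands agent hosted on AgentCore Runtime (sketch).
from strands import Agent
from bedrock_agentcore.runtime import BedrockAgentCoreApp

app = BedrockAgentCoreApp()

# Example Bedrock model id -- substitute a Claude model enabled in your account.
agent = Agent(model="anthropic.claude-3-5-sonnet-20240620-v1:0")

@app.entrypoint
def invoke(payload: dict) -> str:
    """Handle one AgentCore Runtime invocation and return the agent's answer."""
    prompt = payload.get("prompt", "Hello")
    result = agent(prompt)
    return str(result)

if __name__ == "__main__":
    app.run()
```

Once this is containerized and launched on AgentCore Runtime (for example with the starter toolkit), each invocation produces spans that the OTLP configuration described above forwards to Langfuse, where they appear as a hierarchical trace.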

Amazon Web Services (AWS) has unveiled an important extension to its AI agent platform by integrating Langfuse observability into Amazon Bedrock AgentCore, giving developers and operations teams detailed visibility into agent behavior, performance metrics, and costs. This move addresses one of the biggest emerging challenges in the era of agentic AI: understanding what autonomous AI systems are doing behind the scenes. Until now, AI agents (software that takes actions autonomously using large language models, or LLMs) have presented a black-box problem: they can make decisions, call APIs, and execute multi-step workflows without leaving transparent traces. With this Langfuse integration, telemetry data from Bedrock agents flows through OpenTelemetry (OTEL) standards into Langfuse’s dashboards, enabling developers to debug issues faster, trace nested operations, and optimize performance and costs in real time. The integration unlocks trace capture for execution details such as model token usage, tool calls, latency for each step, and hierarchical execution flows, allowing teams to pinpoint bottlenecks, audit unexpected behavior, and fine-tune costs. Amazon Bedrock AgentCore is AWS’s managed platform for deploying and scaling AI agents securely at enterprise scale, supporting any model inside or outside AWS and multiple popular frameworks such as Strands Agents, LangGraph, and CrewAI.

AgentCore emits structured telemetry in OTEL format, which can natively integrate with observability systems developers already use. Langfuse, an increasingly popular open-source observability and evaluation platform focused on LLM applications, serves as the backend for OTEL export. Through this integration, teams get hierarchical traces covering token usage, tool calls, per-step latency, and error states.

The rise of AI agents is reshaping how software systems make decisions and interact with users. Much like a well-oiled machine, AI agents require regular maintenance and clear insights into their inner workings to ensure efficient performance. Integrating Langfuse observability with Amazon Bedrock AgentCore provides businesses with a powerful tool to monitor, debug, and optimize AI systems in production environments.

This integration leverages the strengths of technologies such as OpenTelemetry (OTEL)—an industry standard for capturing and exporting performance data. For those unfamiliar, OTEL acts as a digital magnifying glass, capturing key metrics like token usage (the units of processing in language models), latency, execution durations, and cost metrics. The process uses the Strands framework, a Python-based toolkit that simplifies the creation of AI agents, and an Anthropic Claude-based model hosted through Amazon Bedrock. Together, they enable granular observability that can turn complex debugging into a more intuitive process. At the heart of the integration is the ability to disable Amazon Bedrock AgentCore’s default observability and route telemetry data to Langfuse via its dedicated OTEL endpoint (/api/public/otel). This step is essential to benefit from Langfuse’s comprehensive metrics and detailed dashboards, which include hierarchical traces—a structured view that breaks down each step of an operation like layers of an onion.

“Through the /api/public/otel endpoint, Langfuse functions as an OpenTelemetry Backend, mapping traces to its data model using generative AI conventions.” The technical process involves setting up a Strands agent using Python and the Strands SDK. By carefully configuring the Bedrock runtime and disabling its built-in monitoring, teams can redirect performance data directly to Langfuse. This data includes token usage, per-step latency, tool invocations, error states, and the hierarchical execution flow of each request.
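A condensed sketch of that setup is shown below. It assumes the OTLP environment variables are already pointed at Langfuse (as in the earlier snippet) and that the Strands SDK exposes a StrandsTelemetry helper for registering an OTLP exporter; confirm the exact hook against the Strands documentation for your version:

```python
from strands import Agent
from strands.telemetry import StrandsTelemetry  # assumption: Strands' OTEL helper

# Register an OTLP exporter for the spans the agent emits. Because
# OTEL_EXPORTER_OTLP_ENDPOINT/HEADERS already point at Langfuse's
# /api/public/otel endpoint, traces bypass the default observability sink.
StrandsTelemetry().setup_otlp_exporter()

# An Anthropic Claude model served through Amazon Bedrock (example model id).
agent = Agent(model="anthropic.claude-3-5-sonnet-20240620-v1:0")

# Each call now shows up in Langfuse as a hierarchical trace with token usage,
# per-step latency, tool invocations, and any error states.
response = agent("Summarize the benefits of agent observability in two sentences.")
print(response)
```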

AgentCore Observability helps you trace, debug, and monitor agent performance in production environments. It offers detailed visualizations of each step in the agent workflow, enabling you to inspect an agent's execution path, audit intermediate outputs, and debug performance bottlenecks and failures. AgentCore Observability gives you real-time visibility into agent operational performance through dashboards powered by Amazon CloudWatch and telemetry for key metrics such as session count, latency, duration, token usage, and error rates. Rich metadata tagging and filtering simplify issue investigation and quality maintenance at scale. AgentCore emits telemetry data in a standardized OpenTelemetry (OTEL)-compatible format, enabling you to easily integrate it with your existing monitoring and observability stack. By default, AgentCore outputs a set of key built-in metrics for agents, gateway resources, and memory resources. For memory resources, AgentCore also outputs spans and log data if you enable it.

You can also instrument your agent code to provide additional span and trace data, as well as custom metrics and logs. See Add observability to your Amazon Bedrock AgentCore resources to learn more. All of the metrics, spans, and logs output by AgentCore are stored in Amazon CloudWatch, and can be viewed in the CloudWatch console or downloaded from CloudWatch using the AWS CLI or one of the AWS SDKs.
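For example, a hand-written tool call can be wrapped in its own span using the standard OpenTelemetry Python API; the function and attribute names below are hypothetical, and the tracer assumes a provider has already been configured (by AgentCore or by the OTLP setup shown earlier):

```python
from opentelemetry import trace

# Acquire a tracer from the globally configured provider.
tracer = trace.get_tracer("my-agent.custom-instrumentation")

def look_up_order(order_id: str) -> dict:
    """Hypothetical tool call wrapped in a custom span with searchable attributes."""
    with tracer.start_as_current_span("tools.look_up_order") as span:
        span.set_attribute("order.id", order_id)
        # ... call the real backend here ...
        result = {"order_id": order_id, "status": "shipped"}
        span.set_attribute("order.status", result["status"])
        return result
```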
