Application Observability

Introduction

This tutorial shows you how to enhance RTI Connext’s built-in telemetry by adding application-level metrics published as DDS topics. This approach allows for a unified view of both system performance and application behavior, with the following benefits:

  • Comprehensive Observability: Seamlessly combine Connext DDS system-level telemetry with application-level insights to gain a holistic view of your system’s health and performance.

  • Scalability: Efficiently scale telemetry collection across multiple DDS applications using RTI’s Routing Service, ensuring minimal impact on application performance.

  • Centralized Data Collection: Collect telemetry data from distributed sources, preprocess it with the Routing Service, and export it in a Prometheus-compatible format via OpenTelemetry, simplifying data aggregation.

  • Easy Integration: Integrate application telemetry into your existing system with minimal configuration changes and no modifications to application code.

  • Dockerized Setup: The solution is packaged within a Docker container, enabling easy testing, building, and deployment without affecting the host system.

  • Flexibility and Extensibility: The custom adapter and data model are designed for future extensibility, enabling the addition of new metrics without breaking compatibility.
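To make "Prometheus-compatible format" concrete, here is a minimal Python sketch of the text exposition format that Prometheus scrapes. It is illustrative only: in this use case the export is handled by the Routing Service adapter through OpenTelemetry, and the helper names (`render_sample`, `render_metric`) and the metric and label names are hypothetical, not taken from RTI's code.

```python
def _escape(value):
    """Escape a label value as the Prometheus text format requires."""
    return (
        str(value)
        .replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def render_sample(name, labels, value):
    """Render one sample line: name{label="value",...} value."""
    if labels:
        pairs = ",".join(
            f'{key}="{_escape(val)}"' for key, val in sorted(labels.items())
        )
        return f"{name}{{{pairs}}} {value}"
    return f"{name} {value}"


def render_metric(name, kind, help_text, samples):
    """Render a metric family with its HELP and TYPE header lines.

    `samples` is a list of (labels_dict, value) pairs.
    `kind` is a Prometheus type string such as "counter" or "gauge".
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {kind}"]
    lines += [render_sample(name, labels, value) for labels, value in samples]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical application metric, one sample per application instance.
    print(
        render_metric(
            "app_messages_total",
            "counter",
            "Messages processed by the application.",
            [({"app": "sensor_a"}, 42), ({"app": "sensor_b"}, 7)],
        )
    )
```

Each application instance appears as a separate labeled sample under one metric name, which is what lets a single dashboard query aggregate or compare distributed sources.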

Build this use case

Here’s what you’ll build:

  • Custom RTI Routing Service Adapter: Collect application-level telemetry data from DDS topics and transform it into a Prometheus-compatible format using OpenTelemetry.

  • Data Model for Telemetry: Define a flexible data model that supports different types of metrics such as counters, histograms, and gauges using OpenTelemetry standards.

  • Prometheus & Grafana Configuration: Configure Prometheus to scrape telemetry data, and create Grafana dashboards to visualize both Connext and application-level metrics.

  • Grafana Dashboards: Combine DDS system metrics and application telemetry in Grafana for visualizing and analyzing system health.

  • Test Data Generation: Simulate telemetry data to validate your configuration and test the Prometheus-Grafana pipeline.
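The data model and test data generation items above can be sketched in plain Python. This is a rough illustration under stated assumptions, not the data model from RTI's repository: the type names (`MetricKind`, `MetricSample`), the field layout, the metric names, and the `simulate` helper are all hypothetical. In the actual use case the model would be defined as DDS types and the samples published on DDS topics.

```python
import random
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class MetricKind(Enum):
    """The three metric kinds named in the list above."""
    COUNTER = "counter"
    GAUGE = "gauge"
    HISTOGRAM = "histogram"


@dataclass
class MetricSample:
    """One telemetry sample (hypothetical layout).

    Optional fields carry defaults so that new fields can be added
    later without breaking code that builds older samples.
    """
    name: str
    kind: MetricKind
    value: float = 0.0
    labels: Dict[str, str] = field(default_factory=dict)
    bucket_bounds: List[float] = field(default_factory=list)
    bucket_counts: List[int] = field(default_factory=list)


def simulate(steps, seed=0):
    """Generate `steps` rounds of simulated counter, gauge, and histogram data."""
    rng = random.Random(seed)          # seeded for repeatable test runs
    bounds = [0.01, 0.1, 1.0]          # histogram upper bounds, in seconds
    counts = [0] * (len(bounds) + 1)   # final slot counts values above all bounds
    total = 0
    samples = []
    for _ in range(steps):
        # Counter: only ever increases.
        total += rng.randint(0, 5)
        samples.append(
            MetricSample("app_messages_total", MetricKind.COUNTER, float(total))
        )
        # Gauge: may go up or down between samples.
        depth = float(rng.randint(0, 100))
        samples.append(MetricSample("app_queue_depth", MetricKind.GAUGE, depth))
        # Histogram: place one simulated latency into its bucket.
        latency = rng.expovariate(10.0)
        index = next(
            (i for i, bound in enumerate(bounds) if latency <= bound), len(bounds)
        )
        counts[index] += 1
        samples.append(
            MetricSample(
                "app_latency_seconds",
                MetricKind.HISTOGRAM,
                bucket_bounds=list(bounds),
                bucket_counts=list(counts),
            )
        )
    return samples


if __name__ == "__main__":
    for sample in simulate(2):
        print(sample)
```

Seeding the generator keeps the simulated stream repeatable, which helps when checking that a dashboard panel shows the values you expect.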

For all the code and instructions to build the application telemetry use case, see RTI’s Use-cases GitHub repository.