2. Components

Connext Observability Framework consists of three RTI components:

  • RTI Observability Library enables you to instrument a Connext application to emit telemetry data. The library also accepts remote commands to change the set of emitted telemetry data at runtime.

  • RTI Observability Collector Service scalably collects telemetry data from multiple Connext applications and stores this data in a third-party observability backend.

  • RTI Observability Dashboards enable you to visualize Connext application metrics, raise alerts based on those metrics, and display Connext log messages.

In addition, Observability Framework requires third-party components for storing and visualizing telemetry data. This release provides native integration with Prometheus for metrics, Grafana Loki for logs, and Grafana for visualization. Future releases will use OpenTelemetry to enable integration with other third-party storage components.

All references to Observability Framework in this documentation include all third-party and RTI components except Observability Library.

Figure 2.1 shows a simple representation of how Observability Framework components work together.


Figure 2.1 Observability Framework Components

2.1. Observability Library

Observability Library includes the following key features:

  • Collection and emission of Connext metrics and logs. Secure logs will be supported in future releases.

  • Configuration using a new MONITORING QosPolicy (DDS Extension). The QoS policy can be set programmatically or via XML.

  • Runtime changes to the set of emitted telemetry data using remote commands from Observability Collector Service. In this release, the library only allows you to change the emission and generation log verbosity level for a Connext application.

  • Ability to enable and disable Observability Library at runtime by changing the MONITORING QosPolicy.

  • Lower overhead than RTI Monitoring Library.
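As an illustration of the XML configuration route, the following Python sketch generates a minimal QoS profile that turns the library on or off. The element names (participant_factory_qos, monitoring, enable) and the is_default_participant_factory_profile attribute are assumptions about the XML form of the MONITORING QosPolicy, and the library and profile names are hypothetical; verify all of them against the QoS reference for your release before relying on the output.

```python
import xml.etree.ElementTree as ET


def build_monitoring_profile(library_name: str, profile_name: str,
                             enable: bool = True) -> str:
    """Return a QoS XML document that enables or disables monitoring.

    Element and attribute names are assumptions, not a confirmed schema.
    """
    dds = ET.Element("dds")
    library = ET.SubElement(dds, "qos_library", name=library_name)
    profile = ET.SubElement(
        library,
        "qos_profile",
        name=profile_name,
        is_default_participant_factory_profile="true",
    )
    factory_qos = ET.SubElement(profile, "participant_factory_qos")
    monitoring = ET.SubElement(factory_qos, "monitoring")
    ET.SubElement(monitoring, "enable").text = "true" if enable else "false"
    ET.indent(dds)  # pretty-print; requires Python 3.9 or later
    return ET.tostring(dds, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical library and profile names, for illustration only.
    print(build_monitoring_profile("MyQosLibrary", "MyApplicationProfile"))
```

Generating the file from code is only a convenience for this sketch; the same profile can be written by hand.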

2.2. Observability Collector Service

Observability Collector Service scalably collects telemetry data emitted by Observability Library in a Connext application. Collector Service is distributed as a Docker™ image and can work in two modes:

  • Forwarder: Collector Service forwards the telemetry data from a Connext application to other collector instances. This mode is not supported in the current release.

  • Storage: Collector Service stores the telemetry data in a third-party observability backend. In this release, the observability backend uses Prometheus for metrics and Grafana Loki for logs. Future releases will use OpenTelemetry to allow integration with other third-party components.

Observability Collector Service includes the following key features:

  • Collection and filtering of telemetry data emitted by Connext applications (using Observability Library) or other collectors. This release does not provide filtering capabilities.

  • Storage of telemetry data in third-party components (Prometheus for metrics and Grafana Loki for logs).

  • Remote command forwarding from Observability Dashboards to the Connext applications and other resources to which the commands are directed. This release only allows forwarding commands that change the logging verbosity of Connext applications.
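Because Collector Service stores metrics in Prometheus, the stored data can also be read back outside of Grafana through the Prometheus HTTP API. The sketch below issues an instant query against /api/v1/query and flattens the result. The base URL assumes Prometheus is reachable on its default port 9090, which may not match your deployment, and the example queries Prometheus's built-in up metric because Connext metric names are not listed in this section.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

PROMETHEUS_URL = "http://localhost:9090"  # assumption: default Prometheus port


def instant_query_url(base_url: str, promql: str) -> str:
    """Build the URL for a Prometheus instant query."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def parse_vector(response: dict) -> dict:
    """Map each series (as a sorted tuple of labels) to its latest value."""
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response.get('error', 'unknown error')}")
    data = response["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector result, got {data['resultType']}")
    return {
        tuple(sorted(series["metric"].items())): float(series["value"][1])
        for series in data["result"]
    }


def run_query(promql: str, base_url: str = PROMETHEUS_URL,
              timeout: float = 5.0) -> dict:
    """Run an instant query against a live Prometheus server."""
    with urlopen(instant_query_url(base_url, promql), timeout=timeout) as reply:
        return parse_vector(json.load(reply))


if __name__ == "__main__":
    # "up" is a metric Prometheus generates itself for every scrape target.
    for labels, value in run_query("up").items():
        print(dict(labels), value)
```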

2.2.1. Storage Components

In this release, Observability Collector Service includes native integration with Prometheus and Grafana Loki to store metrics and logs, respectively.

Future releases will allow integrating with other third-party storage components using OpenTelemetry and the OpenTelemetry Collector.


Figure 2.2 OpenTelemetry Integration
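Logs stored in Grafana Loki can be retrieved in a similar way through Loki's HTTP API. The following sketch builds a /loki/api/v1/query_range request and flattens the returned streams into time-ordered entries. The base URL assumes Loki's default port 3100, and the job="connext" label selector is hypothetical; the labels that Collector Service attaches to log entries are not described in this section.

```python
import json
import time
from urllib.parse import urlencode
from urllib.request import urlopen

LOKI_URL = "http://localhost:3100"  # assumption: default Loki port


def query_range_url(base_url: str, logql: str, start_ns: int, end_ns: int,
                    limit: int = 100) -> str:
    """Build the URL for a Loki range query (timestamps in nanoseconds)."""
    params = {"query": logql, "start": start_ns, "end": end_ns, "limit": limit}
    return f"{base_url.rstrip('/')}/loki/api/v1/query_range?{urlencode(params)}"


def parse_streams(response: dict) -> list:
    """Flatten a 'streams' result into (timestamp_ns, labels, line) tuples."""
    if response.get("status") != "success":
        raise ValueError("Loki query failed")
    data = response["data"]
    if data["resultType"] != "streams":
        raise ValueError(f"expected streams, got {data['resultType']}")
    entries = []
    for stream in data["result"]:
        labels = stream["stream"]
        for timestamp, line in stream["values"]:
            entries.append((int(timestamp), labels, line))
    entries.sort(key=lambda entry: entry[0])  # oldest entry first
    return entries


if __name__ == "__main__":
    now_ns = time.time_ns()
    five_minutes_ns = 5 * 60 * 1_000_000_000
    # Hypothetical label selector; adjust it to the labels in your deployment.
    url = query_range_url(LOKI_URL, '{job="connext"}',
                          now_ns - five_minutes_ns, now_ns)
    with urlopen(url, timeout=5.0) as reply:
        for timestamp, labels, line in parse_streams(json.load(reply)):
            print(timestamp, labels, line)
```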

2.3. Observability Dashboards

A set of hierarchical Grafana dashboards sends alerts when a problem occurs and provides visualizations to help perform root cause analysis. The dashboards read telemetry data from the third-party backends (Prometheus and Grafana Loki), which store the data gathered by Observability Collector Service.

The first layer of the Grafana dashboards provides a health status summary focused on five golden signals: Bandwidth, Saturation, Data Loss, System Errors, and Delays.

The top-level dashboard also provides access to the system logs and indicates the number of entities running in the system. To get additional details on error conditions, select any of the golden signals displaying an error.

Dashboard delay error

2.4. How We Provide the Components

This section describes how Observability Framework components are provided in the current release and how RTI will provide them in future releases.

2.4.1. Observability Library

Observability Library is provided as a shared and static library called rtimonitoring2. For details on how to use the library, refer to Observability Library.

2.4.2. Collection, Storage, and Visualization Components

2.4.2.1. Current Release

2.4.2.1.1. Docker Compose (Prepackaged)

The Observability Framework package enables you to deploy and run Observability Collector Service and the third-party components Prometheus, Grafana Loki, and Grafana using Docker Compose on a single Linux® host. For details on the supported Docker Compose environments, see Supported Docker Compose Environments.

This installation option facilitates initial product evaluation because it does not require you to deploy all these components individually.

For additional information on how to use Docker Compose™ to run Observability Framework, see Configuring and Running Observability Framework Components.
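Once the components are running, a quick way to confirm that the third-party services are up is to probe their health endpoints. The sketch below checks Prometheus (/-/healthy), Grafana Loki (/ready), and Grafana (/api/health). The host and ports are assumptions based on each product's defaults (9090, 3100, and 3000) and may differ from the port mappings in your Docker Compose configuration. Collector Service is omitted because this section does not describe a health endpoint for it.

```python
from urllib.request import urlopen

# Assumption: default ports on the local host; adjust to your deployment.
HEALTH_ENDPOINTS = {
    "prometheus": "http://localhost:9090/-/healthy",
    "loki": "http://localhost:3100/ready",
    "grafana": "http://localhost:3000/api/health",
}


def is_healthy(url: str, timeout: float = 2.0, opener=urlopen) -> bool:
    """Return True if the endpoint answers with HTTP 200."""
    try:
        with opener(url, timeout=timeout) as reply:
            return reply.status == 200
    except OSError:
        # URLError and HTTPError are OSError subclasses, so this also covers
        # refused connections, timeouts, and non-2xx responses.
        return False


def check_stack(endpoints: dict = HEALTH_ENDPOINTS, opener=urlopen) -> dict:
    """Probe every endpoint and report a boolean per component."""
    return {name: is_healthy(url, opener=opener)
            for name, url in endpoints.items()}


if __name__ == "__main__":
    for name, healthy in check_stack().items():
        print(f"{name}: {'ok' if healthy else 'unreachable'}")
```

The opener parameter exists only so the probe can be exercised without a running stack; in normal use the default urlopen is sufficient.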

2.4.2.1.2. Docker (Separate Deployment)

As an alternative to the prepackaged Docker Compose provided by RTI, you can also run Observability Framework components standalone.

Observability Collector Service is distributed as a Docker image hosted on Docker Hub. This is the same publicly available image used by the prepackaged Docker Compose installation, and it requires a valid RTI license to run.

The Docker image included with Collector Service is pre-configured to run in storage mode using Prometheus to store metrics and Grafana Loki to store logs. The built-in configuration also enables you to send telemetry data to the collector on a LAN using UDPv4 Transport, or on a WAN using RTI Real-Time WAN Transport.

For additional information on how to use the Docker image included with Collector Service, refer to Docker’s Collector Service article.

The third-party components Prometheus, Grafana Loki, and Grafana are also distributed as Docker images by their respective vendors. You can use these images standalone instead of RTI’s prepackaged Docker Compose.

2.4.2.2. Future Releases

2.4.2.2.1. Executable

In future releases, Collector Service will be provided as a standalone executable that can be deployed without Docker.

2.4.2.2.2. Kubernetes

In future releases, you will be able to deploy the Docker images for Collector Service, Prometheus, Grafana Loki, and Grafana on an orchestrated platform such as Kubernetes. RTI will provide example deployment configurations when this option becomes available.