Metrics

Diffusion™ Cloud metrics provide information about the server, client sessions, topics and log events. Diffusion Cloud can provide metrics in three main ways: via the web console, via JMX-compatible MBeans and via Prometheus.

Methods of accessing metrics

There are multiple ways to access the metrics. As of Diffusion Cloud 6.3, the same information is available through each access method.

Note: In previous versions of Diffusion Cloud, metrics were sometimes referred to as "statistics".
Web console metrics
The metrics are available through the Diffusion Cloud web console. This is the most convenient way to access metrics for development and testing purposes, but it does not support aggregating metrics across multiple servers or recording and retrieving historical data. JMX or Prometheus access is more suitable for production systems.
MBeans for JMX
Diffusion Cloud registers MBeans with the Java Management Extensions (JMX) service. This enables monitoring of the metrics using the JMX tools that are available from a range of vendors.
Prometheus
Diffusion Cloud provides endpoints for the Prometheus monitoring system. To use Prometheus, your Diffusion Cloud server needs to have a Commercial with Scale & Availability license, or an evaluation license such as the Community Evaluation license. See License types for more information.

Accessing metrics

The metrics can be accessed in the following recommended ways:

  • As MBeans, using a JMX tool, such as VisualVM or JConsole. See the table below for MBean interfaces. For more information, see Using Java VisualVM or Using JConsole.
  • Using the Diffusion Cloud management console. For more information, see Diffusion management console.
  • As Prometheus endpoints at http://localhost:8080/metrics, provided you have a suitable license. If not accessing from the same machine as the Diffusion Cloud server, replace localhost with the IP address or hostname.

    You can change the request path or disable the Prometheus service by changing the http-service binding in the webserver configuration. See Configuring the Diffusion Cloud web server.

    The Prometheus service respects the Accept request header to determine whether to use the legacy Prometheus 0.0.4 or the OpenMetrics 1.0.0 text format. See https://prometheus.io/docs/instrumenting/exposition_formats/.
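
As a minimal sketch of the format negotiation described above, the following Python builds a request for the metrics endpoint with an explicit Accept header. The two media type strings are the ones given in the Prometheus exposition format documentation; that the Diffusion Cloud server matches on exactly these values is an assumption, as are the helper names build_metrics_request and fetch_metrics.

```python
from urllib.request import Request, urlopen

# Media types from the Prometheus exposition format documentation.
# Assumption: the server selects its output format from these Accept values.
OPENMETRICS = "application/openmetrics-text; version=1.0.0; charset=utf-8"
LEGACY = "text/plain; version=0.0.4"


def build_metrics_request(host="localhost", port=8080, path="/metrics",
                          openmetrics=True):
    """Build a GET request for the metrics endpoint in the chosen format."""
    url = f"http://{host}:{port}{path}"
    accept = OPENMETRICS if openmetrics else LEGACY
    return Request(url, headers={"Accept": accept})


def fetch_metrics(request, timeout=5.0):
    """Send the request and return the response body as text."""
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling fetch_metrics(build_metrics_request()) against a running, suitably licensed server should return the metrics as text. Pass openmetrics=False to request the legacy format, and change host, port or path if the server is remote or the http-service binding has been reconfigured.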

Collecting custom metrics using metric collectors

A metric collector is a way to collect metrics for a particular set of topics or sessions, configured by you.

You can use the Diffusion Cloud web console or JMX to define metric collectors. See Configuring metrics for details.

Collected metrics are published to the console, JMX and optionally via Prometheus.

Metric types

Metrics are divided into counters, gauges, and info metrics. These have the same meaning as OpenMetrics types used by Prometheus.

Counter metric
A counter metric has a cumulative value. The value is initialized to zero when the server is started and will only increase over a server's lifetime. For example, the total number of bytes received by the server is reported using a counter metric.
Gauge metric
A gauge metric has a value that can increase and decrease. For example, the number of connected sessions is reported using a gauge metric.
Info metric
An info metric reports informational text about the server. For example, the release version of the server is reported using an info metric.

Built-in metrics

This section describes the built-in metrics that are always available, aside from any metric collectors you may have created.

Metrics are not persisted between server restarts. Restarting the server will set all counter metrics back to zero.
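
Because counters return to zero on restart, any code that derives totals from periodic samples needs to tolerate a drop in value. The sketch below is one way to do that in Python; the function name counter_increase is hypothetical, and it assumes at most one restart between consecutive samples. The rate and increase functions in Prometheus apply similar reset handling.

```python
def counter_increase(samples):
    """Total increase across successive counter samples, tolerating resets.

    A sample lower than its predecessor is treated as a server restart:
    the counter began again from zero, so the whole new value is counted.
    """
    total = 0
    previous = None
    for value in samples:
        if previous is not None:
            total += value if value < previous else value - previous
        previous = value
    return total
```

For the samples 100, 150, 20, 50 the function counts 50 before the restart, then 20 and 30 after it.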

The following table lists all the top-level metrics and their attributes.

There are some differences between the legacy Prometheus 0.0.4 text format and the OpenMetrics 1.0.0 text format that affect this table. In particular:
  • The metric name may differ. In this case, the legacy name is listed in the column 'Legacy export'.
  • The Info type does not exist in the legacy format. If using the legacy format, 'Info' should be read as 'Gauge'.
Table 1. Metrics provided by Diffusion Cloud
Metric name | Type | Description | OpenMetrics export | Legacy export
Log metrics LogMetrics MBean
count | Counter | Number of log events for a given ID code and severity level (levels are error, warn, info, debug, trace). | diffusion_log_events_total{code="PUSH-12345",level="warn"} | diffusion_log_events{code="PUSH-12345",level="warn"}
Network metrics NetworkMetrics MBean
inbound_bytes | Counter | Data received from the network in bytes. | diffusion_network_inbound_bytes_total | diffusion_network_inbound_bytes_count
outbound_bytes | Counter | Data sent to the network in bytes. | diffusion_network_outbound_bytes_total | diffusion_network_outbound_bytes
Remote server metrics MBean
bytes | Gauge | Stored data replicated from remote servers, in bytes. | diffusion_remote_server_bytes |
Session metrics SessionMetrics MBean
connected | Gauge | Number of connected sessions. | diffusion_sessions_connected |
inbound_bytes | Counter | Session data received from the network in bytes. | diffusion_sessions_inbound_bytes_total | diffusion_sessions_inbound_bytes
inbound_messages | Counter | Session data received from the network in messages. | diffusion_sessions_inbound_messages_total | diffusion_sessions_inbound_messages
open | Gauge | Number of open sessions. | diffusion_sessions_open |
outbound_bytes | Counter | Session data sent to the network in bytes. | diffusion_sessions_outbound_bytes_total | diffusion_sessions_outbound_bytes
outbound_messages | Counter | Session data sent to the network in messages. | diffusion_sessions_outbound_messages_total | diffusion_sessions_outbound_messages
peak | Gauge | Peak number of sessions. | diffusion_sessions_peak |
total | Counter | Total sessions opened. | diffusion_sessions_total |
Topic metrics TopicMetrics MBean
count | Gauge | Current number of topics. | diffusion_topics_current | diffusion_topics_count
total | Counter | Total number of topics. | diffusion_topics_total |
bytes | Gauge | The value data stored by the topics, in bytes. | diffusion_topics_bytes |
persistence_bytes | Gauge | The value data stored to persist the topic, in bytes. | diffusion_topics_persistence_bytes |
subscriptions | Gauge | Number of direct subscriptions to the topics. | diffusion_topics_subscriptions |
subscribers | Gauge | Number of sessions subscribed to one or more topics. | diffusion_topics_subscribers |
subscriber_updates | Counter | Number of updates sent to subscribers. | diffusion_topics_subscriber_updates_total | diffusion_topics_subscriber_updates
subscriber_update_bytes | Counter | Data sent to subscribers, before message compression, in bytes. | diffusion_topics_subscriber_update_bytes_total | diffusion_topics_subscriber_update_bytes
subscriber_update_compressed_bytes | Counter | Data sent to subscribers, after message compression, in bytes. | diffusion_topics_subscriber_update_compressed_bytes_total | diffusion_topics_subscriber_update_compressed_bytes
value_updates | Counter | Number of updates to a topic that provide a full value. | diffusion_topics_value_updates_total | diffusion_topics_value_updates
value_count | Gauge | Number of values held by topics. | diffusion_topics_value_count |
delta_updates | Counter | Number of updates to a topic that provide a partial value. | diffusion_topics_delta_updates_total | diffusion_topics_delta_updates
value_bytes | Counter | On each change of topic value, this metric increases by the size of the new value. | diffusion_topics_value_bytes_total | diffusion_topics_value_bytes
delta_bytes | Counter | On each change of topic value, this metric increases by the size of an internal delta representing the difference between the previous and new values. | diffusion_topics_delta_bytes_total | diffusion_topics_delta_bytes
Server metrics Server MBean
release | Info | Diffusion release information. | diffusion_release_info | diffusion_release
license_properties | Info | Diffusion license information. | diffusion_license_info | diffusion_license
free_memory | Gauge | Free memory available in the Java heap. | diffusion_server_free_memory_bytes |
license_expiry_date | Info | License expiry date. | diffusion_server_license_expiry_date |
max_memory | Gauge | Maximum Java heap memory that can be allocated. | diffusion_server_max_memory_bytes |
number_of_topics | Gauge | Number of topics hosted by this server. | diffusion_server_number_of_topics |
session_locks | Info | Allocated session locks. | diffusion_server_session_locks |
start_date | Info | Date and time at which this server was started. | diffusion_server_start_date |
start_date_millis | Gauge | Time at which this server was started, as milliseconds since the epoch. | diffusion_server_start_date_millis |
time_zone | Info | Time zone this server is using. | diffusion_server_time_zone |
total_memory | Gauge | Total memory allocated to the Java heap. | diffusion_server_total_memory_bytes |
uptime | Info | Time this server has been running, as a formatted string, for example "3 hours 4 minutes 23 seconds". | diffusion_server_uptime |
uptime_millis | Gauge | Time this server has been running, in milliseconds. | diffusion_server_uptime_millis |
used_physical_memory_size | Gauge | Used physical memory, in bytes. | diffusion_server_used_physical_memory_size_bytes |
used_swap_space_size | Gauge | Used swap space, in bytes. | diffusion_server_used_swap_space_size_bytes |
user_directory | Info | Directory in which this server was started. | diffusion_server_user_directory |
user_name | Info | User account under which this server is running. | diffusion_server_user_name |
Operating System Metrics OperatingSystem MBean
os_architecture | Info | Operating system architecture. | os_architecture |
os_name | Info | Operating system name. | os_name |
os_version | Info | Operating system version. | os_version |
os_max_file_descriptors | Gauge | Maximum number of open file descriptors. | os_max_file_descriptors |
os_available_processors | Gauge | Available processors. | os_available_processors |
os_physical_memory_bytes | Gauge | Physical memory size in bytes. | os_physical_memory_bytes |
os_process_cpu_load | Gauge | Server CPU utilization. | os_process_cpu_load |
os_system_cpu_load | Gauge | System CPU utilization. | os_system_cpu_load |
os_system_load_average | Gauge | System load average. | os_system_load_average |
Memory Metrics Memory MBean
java_memory_heap_usage | Gauge | JVM heap memory usage. | java_memory_heap_usage |
java_memory_non_heap_usage | Gauge | JVM non-heap memory usage. | java_memory_non_heap_usage |
Connector Metrics Connector MBean
keep_alive_queue_maximum_depth | Gauge | The maximum queue depth used for clients in the keep-alive state. | diffusion_connector_keep_alive_queue_maximum_depth |
keep_alive_time | Gauge | The time in milliseconds that an unexpectedly disconnected client is kept alive before closing. | diffusion_connector_keep_alive_time |
number_of_acceptors | Gauge | The number of acceptors. | diffusion_connector_number_of_acceptors |
queue_definition | Info | The queue definition. | diffusion_connector_queue_definition |
total_number_of_connections | Counter | The number of connections accepted since the connector was started. | diffusion_connector_total_number_of_connections_total |
uptime | Info | The time this connector has been running as a formatted string, or 0 if the connector is not running. | diffusion_connector_uptime |
uptime_millis | Gauge | The time this connector has been running in milliseconds, or 0 if the connector is not running. | diffusion_connector_uptime_millis |
Client Statistics Metrics ClientStatistics MBean
client_output_frequency | Gauge | Statistics output frequency in milliseconds. | diffusion_client_statistics_client_output_frequency |
client_reset_frequency | Gauge | The frequency at which the counters are reset. | diffusion_client_statistics_client_reset_frequency |
concurrent_client_count | Gauge | The current client session count. | diffusion_client_statistics_concurrent_client_count |
connection_counts | Info | The current client session count, broken down by client type. | diffusion_client_statistics_connection_counts |
maximum_concurrent_client_count | Gauge | The maximum number of concurrent client sessions. | diffusion_client_statistics_maximum_concurrent_client_count |
maximum_daily_client_count | Gauge | The count of client sessions started in a day. | diffusion_client_statistics_maximum_daily_client_count |
Multiplexer Manager Metrics MultiplexerManager MBean
number_of_multiplexers | Gauge | The number of multiplexers. | diffusion_multiplexer_manager_number_of_multiplexers |
System Properties Metrics
java_version | Info | Java version. | java_version |
java_vendor | Info | Java vendor. | java_vendor |
java_vm_name | Info | Java VM name. | java_vm_name |

Delta compression ratio

value_bytes and delta_bytes can be used to capture the theoretical delta compression ratio of the application data flowing through the topics. Both the console and the JMX MBean perform this calculation. The ratio is a value between 0 and 1. The closer the ratio is to 1, the more benefit the application data will obtain from delta streaming. If value_bytes is 0, there have been no updates, so the delta compression ratio is reported as zero. Otherwise it is calculated as:

1 - delta_bytes / value_bytes
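
The calculation, including the zero-update case, can be sketched in Python as follows. The function name delta_compression_ratio is hypothetical; the inputs are the current values of the value_bytes and delta_bytes counters.

```python
def delta_compression_ratio(value_bytes, delta_bytes):
    """Theoretical delta compression ratio from the two topic counters."""
    if value_bytes == 0:
        # No updates have been recorded, so the ratio is reported as zero.
        return 0.0
    return 1 - delta_bytes / value_bytes
```

For example, with value_bytes at 1000 and delta_bytes at 250, the ratio is 0.75.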

Delta streaming is enabled for subscriptions by default, but can be disabled on a per-topic basis using the PUBLISH_VALUES_ONLY topic property. If delta streaming is enabled, a stable set of subscribers remain connected, and no session has a significant backlog (so conflation is not applied), the following relationship should hold:

subscriber_update_bytes ≅ delta_bytes × subscribers

Delta streaming can also be used to update topic values. If the delta compression ratio is high, but delta_updates is zero (or low, relative to value_updates), consider whether your application can use the stateful update stream API to take advantage of delta streaming.
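
The check suggested above can be sketched as a small heuristic. The function name and both thresholds (min_ratio and max_delta_share) are illustrative assumptions, not values defined by Diffusion Cloud; tune them for your own application.

```python
def should_consider_update_streams(value_bytes, delta_bytes,
                                   value_updates, delta_updates,
                                   min_ratio=0.5, max_delta_share=0.1):
    """Return True when the ratio is high but few updates arrive as deltas.

    min_ratio and max_delta_share are hypothetical thresholds.
    """
    total_updates = value_updates + delta_updates
    if value_bytes == 0 or total_updates == 0:
        return False  # no updates recorded yet, nothing to assess
    ratio = 1 - delta_bytes / value_bytes
    delta_share = delta_updates / total_updates
    return ratio >= min_ratio and delta_share <= max_delta_share
```

A True result only signals that the topic data would compress well as deltas while the updating application is mostly sending full values.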

Log metrics

Log metrics record information about server log events. Separate metrics are kept for each unique pair of log code and log severity level that has been logged.

The log severity levels are: error, warn, info, debug, trace.

A JMX MBean is created for each pair of log code and log severity that has been logged at least once.

Here is an example MBean name: com.pushtechnology.diffusion:type=LogMetrics,server="server_name",level=warn,code=PUSH-12345
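
When looking up these MBeans from a script, the name can be assembled from its parts. The sketch below assumes the key properties are comma-separated, as in standard JMX object names, and that the server name contains no characters needing escaping inside the quotes. The function name log_metrics_mbean_name is hypothetical.

```python
LOG_LEVELS = ("error", "warn", "info", "debug", "trace")


def log_metrics_mbean_name(server_name, level, code):
    """Build the LogMetrics MBean name for one log code and severity level."""
    if level not in LOG_LEVELS:
        raise ValueError(f"unknown log level: {level!r}")
    return (
        "com.pushtechnology.diffusion:type=LogMetrics,"
        f'server="{server_name}",level={level},code={code}'
    )
```

The MBean only exists once the given code has been logged at the given level at least once, so a lookup for a valid name can still fail.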

Session metrics versus network metrics

The network inbound_bytes and outbound_bytes metrics include bytes that are not counted by the equivalent session metrics.

The session metrics include bytes from transport framing and all session traffic (including additional HTTP traffic from long polling).

The network metrics include all bytes included in the session metrics as well as non-session bytes such as:
  • TLS overhead
  • Web server traffic (for example, browsers downloading the web console pages)
  • Rejected connection attempts
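
The difference between the two sets of counters gives a rough measure of non-session traffic. The following sketch uses a hypothetical function name, and it assumes both counters were sampled at about the same moment and refer to the same direction (both inbound or both outbound).

```python
def non_session_overhead(network_bytes, session_bytes):
    """Return (overhead_bytes, overhead_share) for one traffic direction.

    The result is clamped at zero because the two counters may be sampled
    at slightly different instants.
    """
    overhead = max(network_bytes - session_bytes, 0)
    share = overhead / network_bytes if network_bytes > 0 else 0.0
    return overhead, share
```

For example, 2000 network bytes against 1500 session bytes gives 500 bytes of non-session traffic, a quarter of the total.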