Diffusion 6.8 Release Notes

6.8.0 (28 February 2022)

New features in 6.8.0

Kafka Adapter

27942: Include headers of Kafka records in content of Diffusion topic updates

Headers in Kafka records can now be included in Diffusion updates. A new configuration parameter, 'headers', is available for 'regexSubscriptions' and 'topicSubscriptions' in the publisher configuration. It takes a list of header keys; the values of those headers are looked up in each Kafka record and published to Diffusion together with the key, value and partition details. If "$all" is the first item in the list, all headers are included.
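
The selection rule described above can be sketched as follows. This is an illustration only, not the adapter's implementation: the function name and the dictionary representation of headers are hypothetical, and skipping configured keys that are absent from a record is an assumption.

```python
from typing import Dict, List

ALL_HEADERS = "$all"  # wildcard value named in the release note


def select_headers(record_headers: Dict[str, str], configured: List[str]) -> Dict[str, str]:
    """Return the headers that would accompany the update (hypothetical helper)."""
    # "$all" as the first configured item selects every header in the record.
    if configured and configured[0] == ALL_HEADERS:
        return dict(record_headers)
    # Otherwise keep only the configured keys; absent keys are skipped (assumption).
    return {key: record_headers[key] for key in configured if key in record_headers}


record = {"trace-id": "abc123", "source": "orders", "schema": "v2"}
selected = select_headers(record, ["trace-id", "schema", "missing"])
everything = select_headers(record, ["$all"])
print(selected)
print(everything)
```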

Logging

27365: Journal feature

The new Diffusion journal allows certain 'actions' to be written out to a log file. Each journal entry contains data about the action being performed, along with the principal that performed it. The journal uses Log4j2, allowing the file output to be configured as required.

Please refer to the user manual for more details on how to configure the journal.

Python Client

27531: Python Core repo with CBOR and Delta bindings

This new package provides Python bindings for native functionality in the Python Client.

This includes:
1. the CBOR support previously provided by diffusion-cbor
2. Myers-Diff Binary deltas, used for deltas in Diffusion.

At present, we provide binary wheels *only*. We cover all Manylinux platforms, as well as Python 3.7-3.9 on macOS (10.14-11.1) and Windows.

Other binaries can be built as required, although the supported platforms should cover the vast majority of use cases.

Topic Views

20654: New 'process' transformation - providing conditionals and calculations

This release introduces a new feature to 'topic views'.

There is now a new 'process' transformation that can be used within a topic view to perform calculations on fields within JSON input and set the results in the output JSON. Conditional processing is also supported, so it is possible to generate reference topics only if certain conditions (based upon the JSON input) are true. Conditions and calculations can be used together, so it is possible to conditionally set fields in the output based upon calculations performed upon the input.

See the SDK documentation or the user manual for full details of how to use the new 'process' transformation.

Improvements in 6.8.0

.NET Client

26071: New SessionEstablishmentTransientException from SessionFactory.open

A new exception called SessionEstablishmentTransientException has been introduced which can be returned from SessionFactory.Open. This exception indicates a transient failure and the client application can reasonably retry the connection.
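
A retry loop around session establishment might look like the sketch below. It is written in Python for illustration only; the exception class and the open_session callable are hypothetical stand-ins, not the .NET client API, and the retry count and delay are arbitrary.

```python
import time


class SessionEstablishmentTransientError(Exception):
    """Hypothetical stand-in for SessionEstablishmentTransientException."""


def open_with_retry(open_session, attempts=3, delay_seconds=0.0):
    """Call open_session, retrying only when the failure is transient."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return open_session()
        except SessionEstablishmentTransientError:
            if attempt == attempts:
                raise  # out of retries: surface the last transient failure
            time.sleep(delay_seconds)


# Simulated opener that fails twice before succeeding.
calls = {"count": 0}


def flaky_open():
    calls["count"] += 1
    if calls["count"] < 3:
        raise SessionEstablishmentTransientError("server not ready")
    return "session"


result = open_with_retry(flaky_open, attempts=5)
print(result, calls["count"])
```

Other exception types propagate immediately, since only the transient failure is described as safe to retry.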

Adapters

24656: Changes to Adapter Security Permissions

From this release users will need specific permissions to control adapters (Kafka, CDC, and JMS) and to implement adapters.
A console user that controls adapters will need VIEW_SERVER permissions to view connected adapters and CONTROL_SERVER permissions to manipulate them.
The principal used to implement an adapter will need REGISTER_HANDLER permission.

C Client

27588: Include OpenSSL in C Client

OpenSSL is now internally linked in the Diffusion C Client.

Environment

26833: Additional environment variables for the Diffusion start scripts

From this release, the server start scripts can be customised using the additional environment variables DIFFUSION_EXT_DIR, LOG4J_CONFIGURATION, JVM_LOG_DIR, and EXTRA_JAVA_PARAMETERS. See the explanatory comments in the scripts for more details.

Java Client

28021: New subclasses of SessionSecurityException - AuthenticationException and PermissionsException

SessionSecurityException now has subclasses of AuthenticationException and PermissionsException to allow for differentiation between the two possible reasons for the security exception.

Replication

28074: Improve memory footprint of cluster partition log compaction

The log compaction process has been tuned, significantly reducing the working memory required to handle a series of large messages sent to a replicated topic.

System Monitoring/Statistics

23364: Topic metric grouping by path segments

A topic metric collector can be configured to partition its results into groups based on topic path. If the new "group by path segments" setting is configured to be a positive number, the metrics will be grouped by path prefix. The setting specifies the number of path segments in the prefix. This avoids the need to create and maintain separate metric collectors for each child path. The setting can be changed using the console or the client API.

In the path a/x, the path segments are "a" and "x". A topic metric collector with the topic selector of ?a// will produce a single set of aggregated metrics for the topics with paths starting a/. If the metric collector is altered to set group by path segments to 2, it will produce separate aggregated metrics for the topic with the path a, for topics with paths starting with a/x, for topics with paths starting a/y, and so on.
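
The grouping rule in this example can be sketched as a prefix calculation. The function names are hypothetical, and using an empty string as the key of the single ungrouped set is an assumption made for the illustration.

```python
from collections import defaultdict


def group_key(topic_path: str, group_by_path_segments: int) -> str:
    """Return the path prefix that identifies a topic's metric group."""
    if group_by_path_segments <= 0:
        return ""  # grouping disabled: one aggregated set (key chosen for this sketch)
    segments = topic_path.split("/")
    # A path with fewer segments than the setting forms its own group.
    return "/".join(segments[:group_by_path_segments])


def count_topics_by_group(paths, group_by_path_segments):
    """Aggregate a simple metric (topic count) per group."""
    groups = defaultdict(int)
    for path in paths:
        groups[group_key(path, group_by_path_segments)] += 1
    return dict(groups)


paths = ["a", "a/x", "a/x/1", "a/y", "a/y/2"]
grouped = count_topics_by_group(paths, 2)
ungrouped = count_topics_by_group(paths, 0)
print(grouped)
print(ungrouped)
```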

See also 27330 for a complementary, separate new setting to limit the number of groups created by a metric collector.

27330: Option to limit the number of groups created by a metric collector

A single metric collector can produce many sets of metrics. For example, a session metric collector can be configured to group by $SessionId, which will create a separate set of metrics for every unique session. Similarly from this release (see 23364), a topic metric collector can group by path segments to create a separate set of metrics for every branch of the topic tree having a unique path prefix with the configured number of segments.

A new "maximum groups" setting has been added to both session and topic metric collectors to place an upper limit on the number of groups created. This provides protection against a metric collector creating an arbitrary number of metric sets, potentially impacting system performance. The setting can be changed using the console or the client API.

Topic Views

27870: New getTopicView method in the TopicViews feature

The TopicViews feature of the Client APIs now has a new getTopicView method allowing a single topic view to be retrieved by name.

Deprecations in 6.8.0

JavaScript Client

27553: All existing members on enum-like types deprecated

All members on enum-like types have been deprecated. Affected types are CloseReason, ErrorReason, UnsubscribeReason, UpdateFailReason, and TopicAddFailReason.

Removals in 6.8.0

Configuration

27820: Deprecated WhoIs Service has been removed

The deprecated WhoIs service has been removed.

27822: Deprecated store directory removed from PersistenceConfig and Server.xml

The deprecated store-directory item from the persistence element in Server.xsd has been removed along with the corresponding deprecated methods in the PersistenceConfig interface of the server configuration API.

Logging

27821: Deprecated Diffusion logging library has been removed

Diffusion uses Log4j2 as its default logging library.

Previous releases included a legacy logging library, which was deprecated in Diffusion 6.4. The legacy library is no longer supported and has been removed from this release.

Fixes in 6.8.0

.NET Client

27630: Disconnection due to "Http fragmentation and extension not supported"

An issue was identified with the .NET client's handling of partial reads. This has now been addressed.

C Client

27527: hash_num_new is using minimum slots instead of maximum slots provided as parameter

hash_num_new now correctly uses the maximum number of slots when creating a hash map.

Console

17776: Console shows fractional users connected

The Diffusion management console previously displayed some metrics in graphs with unnecessary decimal places. Graphs consisting only of integer metrics will no longer have fractional ticks on the Y axis.

27314: Unable to set remote server missing topic notification filter through console

The Diffusion management console did not allow the configuration of a topic notification filter while creating remote servers, functionality which was added to the server in Diffusion 6.7.0. This setting can now be configured through the console user interface.

27371: Topic paths with trailing spaces not handled correctly

In the Diffusion management console, there was no provision made for distinguishing topics whose paths differed in trailing whitespace. Behaviour in such cases has been improved.

27377: Topic view editor discards patch when attempting to edit existing topic view

When using the Diffusion management console to view existing topic views with a JSON patch clause, the console could incorrectly display the topic view without the JSON patch clause. This has been fixed.

27860: Console does not allow connection timeout to be specified

The Diffusion management console did not allow a connection timeout to be specified at login. This option has been added.

27913: Nonsense on license page for commercial license

The Diffusion management console's license page could show some contradictory text when deployed with a commercial license. This has been resolved.

Federation

28042: Remote server connection failure to connect stalls multiplexer

A problem with thread locking could cause multiplexers to stall if secondary remote servers fail to connect. This problem has now been resolved.

28180: Inbound threads can be indefinitely blocked by Remote Server API calls

If a remote server connection was blocking for a long time due to other issues then other calls to the remote server feature (create, remove, check, get) could also block inbound threads indefinitely.
This has now been changed so that such calls will time out if unable to proceed due to locks being held by remote server connections.
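
The general technique, acquiring a lock with a timeout rather than waiting indefinitely, can be sketched as below. This is not the server's code: the lock, the helper name and the timeout value are hypothetical.

```python
import threading


def call_with_timeout(lock, action, timeout_seconds):
    """Run action under the lock, or raise TimeoutError if the lock stays held."""
    if not lock.acquire(timeout=timeout_seconds):
        raise TimeoutError("lock held by a remote server connection")
    try:
        return action()
    finally:
        lock.release()


connection_lock = threading.Lock()

# Simulate a stalled connection that is holding the lock.
connection_lock.acquire()
try:
    call_with_timeout(connection_lock, lambda: "checked", 0.05)
    outcome = "completed"
except TimeoutError:
    outcome = "timed out"
finally:
    connection_lock.release()

# Once the lock is free, the same call proceeds normally.
later = call_with_timeout(connection_lock, lambda: "checked", 0.05)
print(outcome, later)
```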

Java & Android Client

27491: Java examples do not build out of the box

It was not possible to build the Java examples with "mvn package" without first adding a dependency for jackson-annotations. This has now been resolved.

Java Client

27783: Memory leak in Java client on multiple reconnections

Repeated reconnections from the Java client could cause a memory leak of session related objects in the client VM. This has now been resolved.

JavaScript Client

24910: Sessions can reconnect even if explicitly closed by another session

An issue has been resolved where the server allowed clients to reconnect during the reconnection timeout, even if they had been explicitly closed by another session. This would only occur if session replication was enabled.

26449: Unresponsive shared session prevents login

When connecting to a shared session, a timeout has been introduced for the case where the SharedWorker is unresponsive.

27510: TypeScript definition for RemoteServerBuilder.missingTopicNotificationFilter doesn't allow null parameter

The documentation of RemoteServerBuilder.missingTopicNotificationFilter states that a null parameter can be used to clear the filter. The TypeScript definition didn't allow null to be passed. Now, the type definition has been updated to allow a null parameter.

27859: Connection timeout not configurable

The JavaScript client was missing an option to specify the connection timeout on establishing a session. This has been rectified.

27884: Authenticator throws 'Cannot read properties of null'

When closing an authenticator, it would throw a "REGISTERED_HANDLER_EXCEPTION TypeError: Cannot read properties of null". This has been fixed.

Kafka Adapter

27187: Time series topic creation does not work in Kafka adapter

Fixed a bug where creating time series Diffusion topics did not work when publishing to Diffusion from Kafka topics.

27260: Editing Diffusion topic related detail in Kafka adapter from console does not work

Fixed a bug where updating the Diffusion publisher service configuration at runtime prevented updates from being published to the updated Diffusion topics.

Persistence

28159: Persistence restore failure due to file corruption restores no topics

Topic persistence files can become corrupt. The most likely cause is a resource issue (memory or disk space) at the time of writing, which can lead to a truncated file.
Previously, when restarting a server with such corrupt files the restore would be abandoned, files would be moved to the recovery directory, and the server started with no topics restored.
This fix allows the server to proceed with topics restored so far if a file corruption is detected. The faulty files will still be copied to the recovery directory but the current state already read from files will be written back to the persistence directory as a compacted file.
An error will be logged if this occurs. Because file corruption typically occurs at the end of the persistence files, in most cases all topic state at the point of failure, except for the very last write, will be restored successfully.
Files copied to recovery are for diagnostic purposes only and should be manually deleted to save space. However, before deletion, they may be sent to Push Technology support for analysis.
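
The recovery behaviour can be illustrated with a toy reader that keeps every complete record and stops at the first truncated one. The length-prefixed record format here is hypothetical and is not the Diffusion persistence file format.

```python
import struct


def encode_records(records):
    """Write each record as a 4-byte big-endian length followed by its bytes."""
    return b"".join(struct.pack(">I", len(r)) + r for r in records)


def read_valid_records(data):
    """Return (records read before any corruption, whether corruption was seen)."""
    records = []
    offset = 0
    while offset < len(data):
        if offset + 4 > len(data):
            return records, True  # truncated length header
        (length,) = struct.unpack_from(">I", data, offset)
        start, end = offset + 4, offset + 4 + length
        if end > len(data):
            return records, True  # truncated record body
        records.append(data[start:end])
        offset = end
    return records, False


intact = encode_records([b"topic-a", b"topic-b", b"topic-c"])
truncated = intact[:-3]  # simulate a write cut short

restored, corrupt = read_valid_records(truncated)
print(restored, corrupt)

# The state read so far can be written back as a clean, compacted file.
compacted = encode_records(restored)
```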

Python Client

27347: Recursively decode Model-based objects

Fixes an issue where some pydantic.BaseModel-based objects were not being fully decoded from the CBOR. This only affected Session Metrics.

27678: python/client-docs/docs/usage.md embedded code is invalid

Updated API Documentation usage example to reflect breaking changes in the API.

Replication

27283: Correct cluster recovery of time-series topics

In previous releases, due to a coding error, re-distributing time series topic data when servers join and leave a cluster did not scale to large numbers of time series events. This could cause protracted instability whenever the cluster topology changed. The problem has been fixed in this release.

27321: "replicated-topics-restored" start condition does not work

Connectors can be configured not to accept connections until a set of conditions is satisfied.

Due to a coding error in previous releases, the "replicated-topics-restored" condition was never triggered. This has been corrected in this release. The condition is satisfied after a server has joined the cluster and received all of the topic data from existing members of the cluster. The server will log a PUSH-000834 INFO message when this happens.

27970: A server joining a stable cluster should not merge topics recovered from file persistence

When servers configured for topic replication first form a cluster, the replicated topics are initialised from the servers' persistent files. Each server that forms the initial cluster is responsible for recovering a proportion of the topics. While the cluster continues to run, persistent files are written but not read again.

Due to a bug in previous releases, a server joining a stable cluster could add topics from its persistent files. This bug has been fixed in this release.

27986: Unnecessary assertion from compaction

Due to a bug in previous releases, if the server was run with assertions enabled (-ea), topic replication could fail due to an assertion error during compaction of the persistence log. The bug has been fixed in this release.

28136: Connecting sessions that time out due to Hazelcast blocking never complete and remain in memory and metrics

When running in a cluster, connecting sessions could time out if there was never a response from Hazelcast during the connection phase. This led to the session remaining in an unconnected state in the server memory and still showing in session metrics. This problem has now been resolved so that if the Hazelcast interaction does not complete, the session closes tidily.

28201: Servers in a cluster are unresponsive when loading a large persistence file

When restoring from very large persistence files, replicated servers starting in a Diffusion cluster could slow down to such a degree that they appeared to have stalled completely. This was due to unnecessary delta calculations occurring when restoring the cluster.
This has now been resolved.

28266: A server recovering from a persistent file can corrupt a cluster's replicated topic data

Due to a bug in previous releases, a server joining a cluster and recovering topics from a persistent store could corrupt the topic data in the cluster.

This bug had several symptoms, including inconsistencies between the cluster members' topic trees, and internal failures to apply delta updates. (E.g. PUSH-000843 ... state REPLICA cannot have delta applied by REPLICA).

The bug has been fixed in this release.

Security

27968: Setting READ_TOPIC permissions update could fail with an IllegalArgumentException

Due to a bug in previous releases, applying particular combinations of path permission assignments using the security control API could fail with an IllegalArgumentException. The bug has been fixed in this release.

27972: Concurrency issue could lead to a corrupt permissions index

In previous releases, a bug in the code that creates the internal index of security permissions could leave the index in a corrupt form. The bug has been fixed in this release.

27976: Upgrade log4j2 to address CVE-2021-44228 security vulnerability

The log4j2 logging library used by Diffusion has been upgraded to version 2.15.0. This addresses a critical security bug [CVE-2021-44228] in log4j2. See https://logging.apache.org/log4j/2.x/security.html for details.

28009: Upgrade log4j2 to address CVE-2021-45046 security vulnerability

The log4j2 logging library used by Diffusion has been upgraded to version 2.17.0. This addresses a critical security bug [CVE-2021-45046] in log4j2. See https://logging.apache.org/log4j/2.x/security.html for details.

28053: Changes to the security configuration fail in a cluster if the security store file is read-only

A bug introduced in Diffusion 6.6.0, and present in later releases, corrupted the propagation of security configuration changes across a cluster if the corresponding security store file (SystemAuthentication.store, Security.store) was read-only.

Changing the file permissions so the security store files can be read but not written is supported, and can be useful if a separate mechanism is used to seed security configuration after a cluster is cold-started.

The bug has been fixed in this release. Security store changes are again propagated correctly across the cluster, regardless of whether the security store file is read-only.

Server

27605: Possible leak of sessions (and session metrics) that time out during connection

In certain situations, a client session failing to connect due to a timeout could lead to a memory leak where the server side client object remains. This would also affect metrics as the failed session would remain in the 'open' and 'connected' counts.

This problem has now been resolved.

28247: New subscription inadvertently removed existing subscriptions

Due to a bug in previous releases, the topic selectors maintained by the server for a session could be corrupted by subscription and unsubscription operations. Specifically, the problem could be triggered if a session subscribed to a topic selector with a descendant pattern qualifier ("/", or "//"), for example "?a//", then later redundantly subscribed to a topic selector that is a strict sub-selector of the first one, for example "a/b". The bug could cause another topic selector to be removed in ways that were hard to predict.

The bug has been fixed in this release.

Topic Views

27865: Inserts before patch clauses can cause indeterminate results

It was possible that having an 'insert' clause in a topic view specification before a 'patch' clause could produce indeterminate results and in some situations even lead to orphaned reference topics.
The validation of topic views has now been changed to ensure that any 'insert' clauses happen after 'patch' clauses. A failure will occur when parsing a topic view specification if this is not the case.
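
The ordering rule can be sketched as a simple check over clause names. This is a simplified illustration: representing a topic view specification as a list of clause names is hypothetical, and the real validation happens when the specification is parsed.

```python
def validate_clause_order(clauses):
    """Raise ValueError if a 'patch' clause appears after an 'insert' clause."""
    seen_insert = False
    for clause in clauses:
        if clause == "insert":
            seen_insert = True
        elif clause == "patch" and seen_insert:
            raise ValueError("'insert' clauses must come after 'patch' clauses")


def is_valid(clauses):
    """Convenience wrapper returning True or False instead of raising."""
    try:
        validate_clause_order(clauses)
        return True
    except ValueError:
        return False


print(is_valid(["patch", "insert"]))  # patch first: accepted
print(is_valid(["insert", "patch"]))  # insert before patch: rejected
```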

Topics

28170: JSON patch exception message is misleading

The error message given by an applyJSONPatch operation or a patch operation within a topic view could be misleading. If the second of two operations failed, the message would read 'failed on operation [1] of [2]', because it reported the zero-based index of the failed operation rather than its number.
This has now been changed so that if the second operation fails then it will read 'failed on operation [2] of [2]'.
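
The difference between an index and an operation number can be seen in this small sketch. The helper names are hypothetical, and the operations are plain callables rather than real JSON patch operations.

```python
def failure_message(failed_index, total):
    """Report the 1-based operation number, not the 0-based index."""
    return f"failed on operation [{failed_index + 1}] of [{total}]"


def apply_operations(operations):
    """Apply each operation in order, reporting which one failed."""
    for index, operation in enumerate(operations):
        try:
            operation()
        except Exception as error:
            raise RuntimeError(failure_message(index, len(operations))) from error


def failing_operation():
    raise ValueError("path not found")


try:
    apply_operations([lambda: None, failing_operation])
except RuntimeError as error:
    message = str(error)

print(message)
```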

28309: IllegalArgumentException in TopicTreeNodeImpl

A concurrency bug in previous releases could corrupt topics in the topic tree. One side-effect is that a subsequent attempt to add a topic could fail with an IllegalArgumentException. The bug has been fixed in this release.