MongoDB Replica Set: No Common Algorithm

I am using MongoDB Community Edition to self-host a private replica set with two members:

  • Windows 10 (19045.4046) - mongod 7.0.7
  • Debian 12 - mongod 7.0.8

Without TLS enabled, the replica set functions perfectly.

I have now enabled TLS: I generated a certificate authority, plus a certificate for each server and for some clients, all using RSA with SHA-256 signatures.
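For reference, the chain was generated along these lines (a sketch with placeholder file names and subjects, not my real values; the real certificates use the servers' actual hostnames):

```shell
# Sketch of the certificate setup (placeholder names, not my real values).
# CA: RSA key plus a self-signed, SHA-256 CA certificate.
openssl req -x509 -newkey rsa:2048 -sha256 -days 365 -nodes \
  -keyout ca.key -out ca.pem -subj "/CN=Example CA"

# Server: RSA key and CSR, then sign with the CA, again with SHA-256.
openssl req -newkey rsa:2048 -sha256 -nodes \
  -keyout server.key -out server.csr -subj "/CN=server.example"
printf "subjectAltName=DNS:server.example\n" > san.ext
openssl x509 -req -in server.csr -CA ca.pem -CAkey ca.key -CAcreateserial \
  -sha256 -days 365 -extfile san.ext -out server.crt

# mongod's certificateKeyFile expects the private key and certificate
# concatenated into a single PEM file.
cat server.key server.crt > server.pem
```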

The servers have a config of the form:

storage:
  dbPath: <path to data>

systemLog:
  destination: file
  logAppend: true
  path:  <path to log>

net:
  port: 27017
  bindIp: 0.0.0.0
  tls:
    mode: requireTLS
    CAFile: <path to ca file>
    certificateKeyFile: <path to key file>

replication:
  replSetName: rs0

I can successfully connect to both servers using mongosh or Compass, but the two members disagree about the state of the replica set.

The members array from rs.status() on the primary node:

    {
      _id: 0,
      name: '<primary node hostname>:27017',
      health: 1,
      state: 1,
      stateStr: 'PRIMARY',
      uptime: 82,
      optime: { ts: Timestamp({ t: 1715243969, i: 1 }), t: Long('60') },
      optimeDate: ISODate('2024-05-09T08:39:29.000Z'),
      lastAppliedWallTime: ISODate('2024-05-09T08:39:29.172Z'),
      lastDurableWallTime: ISODate('2024-05-09T08:39:29.172Z'),
      syncSourceHost: '',
      syncSourceId: -1,
      infoMessage: 'Could not find member to sync from',
      electionTime: Timestamp({ t: 1715243909, i: 1 }),
      electionDate: ISODate('2024-05-09T08:38:29.000Z'),
      configVersion: 8,
      configTerm: 60,
      self: true,
      lastHeartbeatMessage: ''
    },
    {
      _id: 1,
      name: '<secondary node hostname>:27017',
      health: 1,
      state: 2,
      stateStr: 'SECONDARY',
      uptime: 79,
      optime: { ts: Timestamp({ t: 1715238754, i: 1 }), t: Long('52') },
      optimeDurable: { ts: Timestamp({ t: 1715238754, i: 1 }), t: Long('52') },
      optimeDate: ISODate('2024-05-09T07:12:34.000Z'),
      optimeDurableDate: ISODate('2024-05-09T07:12:34.000Z'),
      lastAppliedWallTime: ISODate('2024-05-09T07:12:34.747Z'),
      lastDurableWallTime: ISODate('2024-05-09T07:12:34.747Z'),
      lastHeartbeat: ISODate('2024-05-09T08:39:37.264Z'),
      lastHeartbeatRecv: ISODate('1970-01-01T00:00:00.000Z'),
      pingMs: Long('1'),
      lastHeartbeatMessage: '',
      syncSourceHost: '',
      syncSourceId: -1,
      infoMessage: '',
      configVersion: 8,
      configTerm: 52
    }
  ]

The members array from rs.status() on the secondary node:

    {
      _id: 0,
      name: '<primary node hostname>:27017',
      health: 0,
      stateStr: '(not reachable/healthy)',
      uptime: 0,
      optime: { ts: Timestamp({ t: 0, i: 0 }), t: Long('-1') },
      optimeDurable: { ts: Timestamp({ t: 0, i: 0 }), t: Long('-1') },
      optimeDate: ISODate('1970-01-01T00:00:00.000Z'),
      optimeDurableDate: ISODate('1970-01-01T00:00:00.000Z'),
      lastAppliedWallTime: ISODate('1970-01-01T00:00:00.000Z'),
      lastDurableWallTime: ISODate('1970-01-01T00:00:00.000Z'),
      lastHeartbeat: ISODate('2024-05-09T08:39:00.829Z'),
      lastHeartbeatRecv: ISODate('2024-05-09T08:38:59.331Z'),
      pingMs: Long('0'),
      lastHeartbeatMessage: 'Error connecting to <primary node hostname>:27017 (<ip address>:27017) :: caused by :: onInvoke :: caused by :: The client and server cannot communicate, because they do not possess a common algorithm.',
      syncSourceHost: '',
      syncSourceId: -1,
      infoMessage: '',
      configVersion: -1,
      configTerm: -1
    },
    {
      _id: 1,
      name: '<secondary node hostname>:27017',
      health: 1,
      state: 2,
      stateStr: 'SECONDARY',
      uptime: 53,
      optime: { ts: Timestamp({ t: 1715238754, i: 1 }), t: Long('52') },
      optimeDate: ISODate('2024-05-09T07:12:34.000Z'),
      lastAppliedWallTime: ISODate('2024-05-09T07:12:34.747Z'),
      lastDurableWallTime: ISODate('2024-05-09T07:12:34.747Z'),
      syncSourceHost: '',
      syncSourceId: -1,
      infoMessage: '',
      configVersion: 8,
      configTerm: 52,
      self: true,
      lastHeartbeatMessage: ''
    }
  ]

Replication does work to some extent: write operations succeed when connected to either server (again with either mongosh or Compass), and the changes are reflected in subsequent queries, but each write hangs and has to be explicitly terminated. None of this happens when TLS is disabled.

Using

security:
  clusterAuthMode: x509

gives the same results, except that an admin user must be created and authenticated before rs.status() can be invoked.

It seems that the issue is a TLS incompatibility, and I suspect that Windows is the culprit: the "cannot communicate, because they do not possess a common algorithm" phrasing is a Windows Schannel error message.
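One generic way to see which protocol version and cipher suite a given node will actually negotiate is openssl s_client from the other machine. This is a diagnostic sketch, not output from my setup, and the helper name is mine:

```shell
# check_tls HOST: connect over TLS on port 27017 and print the negotiated
# protocol version and cipher suite (generic OpenSSL diagnostic).
check_tls() {
  openssl s_client -connect "$1:27017" </dev/null 2>/dev/null \
    | grep -E "Protocol|Cipher"
}

# Usage (placeholder hostname):
#   check_tls <primary node hostname>
```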

I have experimented with different TLS versions in Windows’ Internet Options, and with net.tls.disabledProtocols in the server configs, but with no success.
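For illustration, one such attempt (this one restricting both members to TLS 1.2; the protocol names are the ones mongod accepts) looked roughly like:

```yaml
net:
  tls:
    mode: requireTLS
    CAFile: <path to ca file>
    certificateKeyFile: <path to key file>
    # Leave only TLS 1.2 enabled on both members.
    disabledProtocols: TLS1_0,TLS1_1,TLS1_3
```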

Any help would be greatly appreciated!

To add to this, the secondary node’s logs are filled with the same error message:

    {"t":{"$date":"2024-05-09T15:31:37.936+02:00"},"s":"I",  "c":"REPL_HB",  "id":23974,   "ctx":"ReplCoord-6","msg":"Heartbeat failed after max retries","attr":{"target":"<primary node hostname>:27017","maxHeartbeatRetries":2,"error":{"code":6,"codeName":"HostUnreachable","errmsg":"Error connecting to <primary host name>:27017 (<ip address>:27017) :: caused by :: onInvoke :: caused by :: The client and server cannot communicate, because they do not possess a common algorithm."}}}

There are no traces of this in the primary node’s logs.