CSHARP-5693: Allow Instantiated MongoClients to Send Client Metadata On-Demand#2044
CSHARP-5693: Allow Instantiated MongoClients to Send Client Metadata On-Demand#2044adelinowona wants to merge 8 commits into
Conversation
There was a problem hiding this comment.
Warning
- Copilot's review of this pull request may be incomplete because some of the changed files are excluded by your Copilot content exclusion settings. See Excluding content from Copilot for details.
Pull request overview
Adds support for updating/sending MongoDB handshake client metadata on-demand from an already-instantiated MongoClient, and wires this through the cluster/connection initialization path. It also introduces unified and prose spec test coverage for the MongoDB handshake metadata update behavior.
Changes:
- Add
MongoClient.AppendMetadata(LibraryInfo)API and propagate metadata updates into cluster handshake metadata used for new connections. - Introduce
ClientMetadata(cached/appendable client handshake document) and thread it through cluster/connection factory construction. - Add unified-spec operation support (
appendMetadata) and handshake prose/unified tests.
Reviewed changes
Copilot reviewed 25 out of 25 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/MongoDB.Driver.Tests/UnifiedTestOperations/UnifiedTestOperationFactory.cs | Registers the new unified test operation name appendMetadata for client entities. |
| tests/MongoDB.Driver.Tests/UnifiedTestOperations/UnifiedAppendMetadataOperation.cs | Implements the unified test operation and argument parsing for driverInfoOptions. |
| tests/MongoDB.Driver.Tests/Specifications/UnifiedTestSpecRunner.cs | Adds a new unified spec suite entry for mongodb_handshake.tests.unified. |
| tests/MongoDB.Driver.Tests/Specifications/mongodb-handshake/prose-tests/MongoDbHandshakeProseTests.cs | Adds prose tests asserting metadata updates affect subsequent handshakes on new connections. |
| tests/MongoDB.Driver.Tests/MongoClientTests.cs | Adds unit tests for MongoClient.AppendMetadata argument validation and disposal behavior. |
| tests/MongoDB.Driver.Tests/Core/Connections/HelloHelperTests.cs | Updates reflection-based test to account for ConnectionInitializer now holding ClientMetadata. |
| tests/MongoDB.Driver.Tests/Core/Connections/ClientMetadataTests.cs | Adds targeted tests for ClientMetadata caching, append/dedup, and normalization behavior. |
| tests/MongoDB.Driver.Tests/Core/Connections/ClientDocumentHelperTests.cs | Adds coverage for appending library platform into the top-level platform handshake field. |
| tests/MongoDB.Driver.Tests/Core/Connections/BinaryConnectionFactoryTests.cs | Updates factory construction to pass ClientMetadata. |
| tests/MongoDB.Driver.Tests/Core/Configuration/LibraryInfoTests.cs | Adds tests for new LibraryInfo.Platform behavior and equality semantics. |
| src/MongoDB.Driver/MongoClient.cs | Adds public AppendMetadata(LibraryInfo) and validates separator constraints before delegating to cluster. |
| src/MongoDB.Driver/Core/Connections/ConnectionInitializer.cs | Switches from a static _clientDocument to _clientMetadata.GetClientDocument() at handshake time. |
| src/MongoDB.Driver/Core/Connections/ClientMetadata.cs | New class managing cached handshake client document + appended LibraryInfo entries (thread-safe). |
| src/MongoDB.Driver/Core/Connections/ClientDocumentHelper.cs | Appends LibraryInfo.Platform onto the handshake platform field when provided. |
| src/MongoDB.Driver/Core/Connections/BinaryConnectionFactory.cs | Accepts and validates ClientMetadata, passing it into ConnectionInitializer. |
| src/MongoDB.Driver/Core/Configuration/LibraryInfo.cs | Adds Platform field/ctor and updates equality/to-string accordingly. |
| src/MongoDB.Driver/Core/Configuration/ClusterBuilder.cs | Creates a shared ClientMetadata instance and threads it through server/pool/monitor factories and cluster factory. |
| src/MongoDB.Driver/Core/Clusters/SingleServerCluster.cs | Plumbs ClientMetadata through to base Cluster. |
| src/MongoDB.Driver/Core/Clusters/MultiServerCluster.cs | Plumbs ClientMetadata through to base Cluster. |
| src/MongoDB.Driver/Core/Clusters/LoadBalancedCluster.cs | Stores ClientMetadata and implements AppendClientMetadata for load balanced mode. |
| src/MongoDB.Driver/Core/Clusters/ICluster.cs | Adds internal AppendClientMetadata(LibraryInfo) to the IClusterInternal contract. |
| src/MongoDB.Driver/Core/Clusters/ClusterFactory.cs | Passes shared ClientMetadata when constructing clusters. |
| src/MongoDB.Driver/Core/Clusters/Cluster.cs | Stores ClientMetadata and implements AppendClientMetadata for non-load balanced clusters. |
Files excluded by content exclusion policy (2)
- specifications/mongodb-handshake/tests/unified/metadata-not-propagated.json
- specifications/mongodb-handshake/tests/unified/metadata-not-propagated.yml
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| lock (_lock) | ||
| { | ||
| if (_libraryInfos.Contains(libraryInfo)) |
There was a problem hiding this comment.
Should we use ConcurrentDictionary<LibraryInfo, bool> instead to avoid exclusive lock? As per discussion with @BorisDog most likely EF will need to AppendMetadata on each DbContext creation, which most likely is per-request. We might want to reduce on locks here.
There was a problem hiding this comment.
I believe the hot path here will be GetClientDocument(), which runs on every connection handshake — and it's already lock-free: it does a volatile read of the cached document and returns without taking the lock once warmed. The lock only guards Append, which the EF per-DbContext pattern will probably call with the same LibraryInfo each time, so after the first call every subsequent one is a dedup no-op over a tiny List.Contains. The critical section is a few nanoseconds and uncontended in that pattern, so I don't think the per-request lock is a real cost.
ConcurrentDictionary<LibraryInfo, bool> also wouldn't be a solution: its enumeration order is unspecified, and we rely on insertion order to build the name|version|platform lists deterministically (covered by Append_should_accumulate_library_infos_in_order). Switching to it would make the handshake string nondeterministic.
There was a problem hiding this comment.
_libraryInfos.Contains(libraryInfo) is under the lock as well. Should we switch to use ReaderWriterLockSlim then? So the Contains part could be done under shared read lock?
There was a problem hiding this comment.
IMO, as Adelin mentioned, it's not crucial optimization, as it's not a hot path, happens once on startup. So we can prefer a simpler code in this case. But not harm in introducing a quick optimization, if we can agree on one that doesn't complicate things.
There was a problem hiding this comment.
I think ReaderWriterLockSlim has materially higher per-acquire overhead than a plain Monitor/lock here. It only pays off when read critical sections are long and highly contended. Our read critical section is an O(n) Contains over a list of ~1–3 entries — a handful of nanoseconds. Wrapping that in RW-lock bookkeeping would likely make the common path slower, not faster but not sure on how to test this exactly. WDYT? @sanych-sun
There was a problem hiding this comment.
it's not a hot path, happens once on startup
I do not think it's actually true. If EF for example has to set this on DbContext creation (probably need input from @damieng here) it means this code will be executed once per request if we are talking about Web API application:
A DbContext instance is designed to be used for a single unit-of-work. This means that the lifetime of a DbContext instance is usually very short.
https://learn.microsoft.com/en-us/ef/core/dbcontext-configuration/
I still think that Contains should be done outside of the exclusive lock. I suppose we can do it outside of lock at all, because underlying collection is append only. Even if a concurrent thread will change the collection it could be false-negative check, which we will mitigate by double-checking inside the exclusive lock while adding the library info.
There was a problem hiding this comment.
a handful of nanoseconds.
It's true that Contains for a small lists are extremally fact, but having the exclusive lock - it makes all requests go through the single exclusive really short lock - which makes it a bottleneck for system performance.
There was a problem hiding this comment.
Yes, I missed this scenario, append with same metadata can actually be on a hot path.
So we should try to avoid lock for duplicates check.
| // entries are normalized on insertion, so null is the only unset form | ||
| var name = string.Join("|", _libraryInfos.Select(libraryInfo => libraryInfo.Name)); | ||
| var version = string.Join("|", _libraryInfos.Where(libraryInfo => libraryInfo.Version != null).Select(libraryInfo => libraryInfo.Version)); | ||
| var platform = string.Join("|", _libraryInfos.Where(libraryInfo => libraryInfo.Platform != null).Select(libraryInfo => libraryInfo.Platform)); |
There was a problem hiding this comment.
If we filtering out nulls, will it make the result readable?:
"A|B|C;5.1|1.2;"
How do we know if 1.2 it's a version for B or C?
Is this prescribed by spec?
There was a problem hiding this comment.
The handshake spec only requires that appended values go to their respective fields delimited by "|" . It doesn't prescribe positional alignment or empty placeholders for libraries missing a version. Omitting unset versions also matches our original single-append behavior (AppendLibraryInfoToDriverDocument already appends name unconditionally and version only when present), so this keeps the two paths consistent.
I am not sure how the analytics side parses this info accurately but I'll check
There was a problem hiding this comment.
Yeah that's a good question.
I'd expect "|" to be omitted for null values as well.
It's discussed here. I agree with Valentin that this should be clarified in the spec, and have proper tests.
- Move client metadata separator validation into LibraryInfo constructor - Move library info concatenation from ClientMetadata into ClientDocumentHelper - Remove unused ConnectionInitializer constructor overload - Clarify LibraryInfo.Platform documentation
|
|
||
| lock (_lock) | ||
| { | ||
| if (_libraryInfos.Contains(libraryInfo)) |
There was a problem hiding this comment.
IMO, as Adelin mentioned, it's not crucial optimization, as it's not a hot path, happens once on startup. So we can prefer a simpler code in this case. But not harm in introducing a quick optimization, if we can agree on one that doesn't complicate things.
| // entries are normalized on insertion, so null is the only unset form | ||
| var name = string.Join("|", _libraryInfos.Select(libraryInfo => libraryInfo.Name)); | ||
| var version = string.Join("|", _libraryInfos.Where(libraryInfo => libraryInfo.Version != null).Select(libraryInfo => libraryInfo.Version)); | ||
| var platform = string.Join("|", _libraryInfos.Where(libraryInfo => libraryInfo.Platform != null).Select(libraryInfo => libraryInfo.Platform)); |
There was a problem hiding this comment.
Yeah that's a good question.
I'd expect "|" to be omitted for null values as well.
It's discussed here. I agree with Valentin that this should be clarified in the spec, and have proper tests.
There was a problem hiding this comment.
Warning
- Copilot's review of this pull request may be incomplete because some of the changed files are excluded by your Copilot content exclusion settings. See Excluding content from Copilot for details.
Pull request overview
Copilot reviewed 28 out of 28 changed files in this pull request and generated 1 comment.
Files excluded by content exclusion policy (2)
- specifications/mongodb-handshake/tests/unified/metadata-not-propagated.json
- specifications/mongodb-handshake/tests/unified/metadata-not-propagated.yml
- Add IMongoClient.AppendMetadata extension method as a bridge until the method is added to the interface in 4.0 - Remove unnecessary doc comment on internal ClientMetadata - Add delegation tests for the instance and extension AppendMetadata
|
|
||
| // constructors | ||
| public ClusterFactory(ClusterSettings settings, IClusterableServerFactory serverFactory, IEventSubscriber eventSubscriber, ILoggerFactory loggerFactory) | ||
| public ClusterFactory(ClusterSettings settings, IClusterableServerFactory serverFactory, IEventSubscriber eventSubscriber, ILoggerFactory loggerFactory, ClientMetadata clientMetadata = null) |
There was a problem hiding this comment.
In what cases clientMetadata can be null?
There was a problem hiding this comment.
Only in tests. In production, ClusterFactory always receives a non-null clientMetadata instance from ClusterBuilder and forwards it to the created cluster. The null default exists purely so test code that constructs factories/clusters directly — for SDAM/topology tests that have nothing to do with metadata — doesn't have to supply one. I guess a nullable that "can't be null in prod" is a code smell? Lmk if you're not ok with that.
There was a problem hiding this comment.
I'd prefer not to have a code tailored to our testing here.
|
|
||
| if (!string.IsNullOrWhiteSpace(libraryInfo.Version)) | ||
| { | ||
| driverVersion = $"{driverVersion}|{libraryInfo.Version}"; |
There was a problem hiding this comment.
Following up on this comment
Has a decision to not append a '|' for empty value been made?
There was a problem hiding this comment.
I don't know if a decision has been made, I can't access the document you shared in the thread.
There was a problem hiding this comment.
You should have access now.
Intuitively I am leaning towards emitting an pipe for null values, but let's align with that doc.
|
|
||
| lock (_lock) | ||
| { | ||
| if (_libraryInfos.Contains(libraryInfo)) |
There was a problem hiding this comment.
Yes, I missed this scenario, append with same metadata can actually be on a hot path.
So we should try to avoid lock for duplicates check.
|
|
||
| var updated = new LibraryInfo[current.Length + 1]; | ||
| Array.Copy(current, updated, current.Length); | ||
| updated[current.Length] = libraryInfo; |
There was a problem hiding this comment.
Minor: I believe we can use fancy C# 12 feature here:
var updated = [..current, libraryInfo]
| } | ||
|
|
||
| // empty strings are considered unset, so normalize them to null for deduplication | ||
| private static LibraryInfo Normalize(LibraryInfo libraryInfo) |
There was a problem hiding this comment.
Should we move this to LibraryInfo too? Same way as we validate for | symbol, we can validate and adjust the values, instead of instantiating a new instance. Can save a little on allocations.
|
|
||
| // constructors | ||
| public ClusterFactory(ClusterSettings settings, IClusterableServerFactory serverFactory, IEventSubscriber eventSubscriber, ILoggerFactory loggerFactory) | ||
| public ClusterFactory(ClusterSettings settings, IClusterableServerFactory serverFactory, IEventSubscriber eventSubscriber, ILoggerFactory loggerFactory, ClientMetadata clientMetadata = null) |
There was a problem hiding this comment.
I'd prefer not to have a code tailored to our testing here.
|
|
||
| if (!string.IsNullOrWhiteSpace(libraryInfo.Version)) | ||
| { | ||
| driverVersion = $"{driverVersion}|{libraryInfo.Version}"; |
There was a problem hiding this comment.
You should have access now.
Intuitively I am leaning towards emitting an pipe for null values, but let's align with that doc.
… clusters - Store appended library infos as a copy-on-write immutable array so the dedup check and document build are lock-free; only a genuine addition takes the lock - Move empty/whitespace normalization (and keep '|' validation) into LibraryInfo, removing ClientMetadata.Normalize; simplify ClientDocumentHelper null checks - Require clientMetadata in ClusterFactory and the cluster constructors rather than defaulting to null for tests; remove the now-unreachable null guards
No description provided.