[EventPipe] Add Non-Lossy EventPipe Mode for gcdump by mdh1418 · Pull Request #129457 · dotnet/runtime

mdh1418 · 2026-06-16T04:45:42Z

Runtime side to fix dotnet/diagnostics#2404.

Problem

EventPipe's buffer manager is lossy by design: when a producer's per-thread buffer is full
and the circular pool is exhausted, the event is dropped (accounted via a sequence number)
rather than blocking the writer.
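
As an illustration of how a drop stays visible to the reader, here is a minimal sketch. It assumes each event carries a per-thread sequence number that advances even when the event is dropped; `sketch_count_dropped` and its parameters are hypothetical names, not EventPipe APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reader-side accounting: given the sequence numbers that actually
 * arrived from one thread (in order), count the events lost in the gaps. */
static uint32_t sketch_count_dropped(const uint32_t *seen, size_t count, uint32_t first_expected)
{
    uint32_t expected = first_expected;
    uint32_t dropped = 0;

    for (size_t i = 0; i < count; i++) {
        assert(seen[i] >= expected);      /* per-thread numbers never go backwards */
        dropped += seen[i] - expected;    /* size of the gap before this event */
        expected = seen[i] + 1;
    }
    return dropped;
}
```

With this kind of accounting the reader can report how many events were lost, but it cannot recover them, which is why a dropped GCBulkEdge is fatal for graph reconstruction.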

This corrupts dotnet-gcdump. A heap snapshot is rebuilt from the GCBulkNode/GCBulkEdge
stream emitted during a stop-the-world heap walk; on a large heap that burst overruns the buffer,
edges are dropped, and the graph cannot be reconstructed - gcdump fails with
System.ApplicationException: RootIndex not set and writes no dump (dotnet/diagnostics#2404).

This PR adds an opt-in, per-session EventPipeBufferingMode that defaults to DROP (today's
behavior) and can be set to BLOCK to make a streaming session non-lossy. Two opt-ins are
included - the IPC CollectTracing command (used by dotnet-gcdump --non-lossy) and a
DOTNET_EventPipeBufferingMode environment variable for the startup session.

Parking producers instead of dropping

The non-lossy guarantee comes down to one change in the write path: when a producer in Block mode
has filled its buffer and the pool cannot hand it another, instead of dropping the event it
parks until the reader frees capacity and then retries the same write. Parked producers wait
in a strict-FIFO queue; only the thread at the front may reserve the next freed buffer, each
thread waits on its own event, and the reader wakes the front waiter every time it returns a
drained buffer to the pool. A parked producer makes forward progress only once the event actually
lands.
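
The fairness rule can be sketched single-threaded, with the per-thread events and locking left out. This is a simplified model under stated assumptions, not the PR's code: `sketch_pool`, `sketch_try_reserve_fair`, and `sketch_release_buffer` are hypothetical names, and producers are identified by plain integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_WAITERS 8

/* Hypothetical stand-in for the buffer manager: a budget plus a FIFO of parked producers. */
typedef struct {
    int free_buffers;                   /* buffers the pool can still hand out */
    int waiters[SKETCH_MAX_WAITERS];    /* producer ids, oldest first */
    size_t waiter_count;
} sketch_pool;

static bool sketch_is_queued(const sketch_pool *pool, int producer)
{
    for (size_t i = 0; i < pool->waiter_count; i++)
        if (pool->waiters[i] == producer)
            return true;
    return false;
}

/* Reserve only when nobody is ahead of this producer; otherwise join the tail
 * and tell the caller to park. */
static bool sketch_try_reserve_fair(sketch_pool *pool, int producer)
{
    bool at_front = pool->waiter_count > 0 && pool->waiters[0] == producer;
    bool nobody_waiting = pool->waiter_count == 0;

    if ((at_front || nobody_waiting) && pool->free_buffers > 0) {
        pool->free_buffers--;
        if (at_front) {
            /* Leave the queue: shift the remaining waiters forward. */
            for (size_t i = 1; i < pool->waiter_count; i++)
                pool->waiters[i - 1] = pool->waiters[i];
            pool->waiter_count--;
        }
        return true;
    }

    if (!sketch_is_queued(pool, producer)) {
        assert(pool->waiter_count < SKETCH_MAX_WAITERS);
        pool->waiters[pool->waiter_count++] = producer;
    }
    return false;   /* caller parks on its own event and retries when woken */
}

/* Reader side: return a drained buffer and report which producer to wake (-1 if none). */
static int sketch_release_buffer(sketch_pool *pool)
{
    pool->free_buffers++;
    return pool->waiter_count > 0 ? pool->waiters[0] : -1;
}
```

Because only the front waiter may reserve, a late arrival cannot take a freed buffer ahead of a producer that has been parked longer.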

Parking has to cooperate with the existing buffer hand-off, which is where a new pin flag comes
in. Each producer advertises, in its per-thread session_use_in_progress slot, the index of the
session it is writing to; teardown spins on that slot to keep the session and its buffer manager
alive until every in-flight writer is done. We widen that slot with a high
EP_SESSION_USE_WRITE_BUFFER_IN_USE bit meaning "I currently own a write buffer," and the reader
drains a producer's buffer only once that bit is clear. A parked producer therefore clears the
bit but keeps the index: clearing the bit lets the reader drain the very buffer the producer
just filled - the only way capacity is ever freed - while keeping the index leaves the session
pinned so it cannot be torn down from under the sleeping thread. On wake the producer re-sets the
bit, reloads the session pointer (disable may have cleared it), and retries.
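
The slot transitions can be sketched as plain bit operations. The bit position and mask below are assumptions for illustration only (the text says "a high bit" without giving a value), the helper names are hypothetical, and the atomic loads and stores a real implementation needs are omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout for this sketch: top bit = "I currently own a write buffer",
 * remaining bits = session index. */
#define SKETCH_WRITE_BUFFER_IN_USE UINT32_C(0x80000000)
#define SKETCH_INDEX_MASK          UINT32_C(0x7FFFFFFF)

/* Producer starts writing: publish the index and claim the buffer. */
static uint32_t sketch_slot_begin_write(uint32_t session_index)
{
    assert((session_index & SKETCH_WRITE_BUFFER_IN_USE) == 0);
    return session_index | SKETCH_WRITE_BUFFER_IN_USE;
}

/* Producer parks: clear the bit so the reader may drain, keep the index so the
 * session stays pinned. */
static uint32_t sketch_slot_park(uint32_t slot)
{
    return slot & ~SKETCH_WRITE_BUFFER_IN_USE;
}

/* Producer wakes: re-set the bit before retrying the write. */
static uint32_t sketch_slot_wake(uint32_t slot)
{
    return slot | SKETCH_WRITE_BUFFER_IN_USE;
}

static bool sketch_reader_may_drain(uint32_t slot)
{
    return (slot & SKETCH_WRITE_BUFFER_IN_USE) == 0;
}

static uint32_t sketch_slot_session_index(uint32_t slot)
{
    return slot & SKETCH_INDEX_MASK;
}
```

The point of the split is that the two consumers of the slot look at different parts of it: the reader checks only the bit, while teardown waits on the index.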

Blocking is only ever allowed when it is actually safe to wait for a reader: the session is in
Block mode, it is not tearing down, and the caller is not the rundown thread. Teardown is
the first escape hatch - disable raises an aborting flag and wakes every parked producer, and
the producer, which re-checks aborting immediately before and after each wait, gives up and
drops rather than waiting on a reader that is going away. Rundown is the second - rundown events
are emitted from the drain side during disable, so a rundown writer that blocked would be waiting
on itself; it is excluded from blocking and falls back to the normal drop path.
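
The three conditions reduce to a single predicate. A minimal sketch, with hypothetical type and field names that stand in for the session state described above:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { SKETCH_MODE_DROP = 0, SKETCH_MODE_BLOCK = 1 } sketch_buffering_mode;

/* Hypothetical view of the state a writer consults before parking. */
typedef struct {
    sketch_buffering_mode mode;
    bool aborting;   /* raised by disable before it wakes parked producers */
} sketch_session_state;

/* A writer may park only if the session is in Block mode, is not tearing down,
 * and the caller is not the rundown thread. */
static bool sketch_may_block(const sketch_session_state *state, bool caller_is_rundown_thread)
{
    assert(state != NULL);
    return state->mode == SKETCH_MODE_BLOCK
        && !state->aborting
        && !caller_is_rundown_thread;
}
```

Any writer for which this returns false takes the ordinary drop path, so neither escape hatch can leave a thread waiting on a reader that will not come.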

Expressing these three outcomes cleanly is why the write path's old success/fail boolean became an
EventPipeWriteEventResult enum - WRITTEN (the event landed), DROPPED (discarded:
oversized, provider disabled, suspended, or a genuine Drop-mode overflow, all already tracked by
sequence numbers), and BLOCKED (Block mode, buffer full, session still live - park and retry).
Only BLOCKED drives the park loop.
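
A minimal sketch of the resulting park loop, using a simulated session rather than the real write path. All names are hypothetical; the enum mirrors the three outcomes as this description names them (a later commit in this PR renames DROPPED to NOT_WRITTEN).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum {
    SKETCH_RESULT_WRITTEN,   /* the event landed */
    SKETCH_RESULT_DROPPED,   /* discarded; nothing to retry */
    SKETCH_RESULT_BLOCKED    /* Block mode, buffer full, session live: park and retry */
} sketch_write_result;

/* Simulated session: the next N attempts find the buffer full. */
typedef struct {
    int full_attempts_remaining;
    bool aborting;
    int parks;               /* how many times the producer parked */
} sketch_session;

static sketch_write_result sketch_try_write(sketch_session *s)
{
    if (s->full_attempts_remaining > 0) {
        s->full_attempts_remaining--;
        return s->aborting ? SKETCH_RESULT_DROPPED : SKETCH_RESULT_BLOCKED;
    }
    return SKETCH_RESULT_WRITTEN;
}

/* Stand-in for waiting on the producer's own event until the reader frees capacity. */
static void sketch_park(sketch_session *s)
{
    s->parks++;
}

static sketch_write_result sketch_write_event(sketch_session *s)
{
    assert(s != NULL);
    for (;;) {
        sketch_write_result result = sketch_try_write(s);
        if (result != SKETCH_RESULT_BLOCKED)
            return result;   /* WRITTEN and DROPPED both end the loop */
        sketch_park(s);      /* only BLOCKED parks, then retries the same event */
    }
}
```

A boolean could not distinguish "give up" from "wait and try again", which is the distinction the loop depends on.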

Starting the drain thread before the heap walk

Enabling a session invokes each provider's enable callback. For a gcdump (GCHeapSnapshot)
session that callback synchronously forces a full, stop-the-world heap walk that emits the entire
GCBulkNode/GCBulkEdge stream - which in Block mode is exactly what fills the buffer and parks
the producer, and the producer can only make progress if a drain thread already exists. Originally
the callbacks fired inline inside ep_enable_3, before streaming starts, so the walk ran during
SuspendEE with no drainer: the buffer filled, the heap-walk thread parked forever, and the
target deadlocked.

Two changes fix the ordering:

- session_init / enable split. ep_enable_3 now only publishes the session inert
  (session_init) and callback dispatch moves to a single site in ep_start_streaming, which
  collects the provider-enable callbacks under the lock, starts the drain thread, waits for it
  to be running (EP_YIELD_WHILE (!ep_session_has_started)), and only then dispatches them.
  A consumer is therefore guaranteed to be in place before the walk can fill the buffer, and the
  drain loop runs in preemptive GC mode, so it keeps draining even while the heap walk holds
  the EE suspended.
- Native, eager drain threads. The session drain thread is now a raw native thread (like the
  diagnostics server thread) instead of a managed CLR thread, so it needs no GC / Thread Store and
  starts eagerly at enable time. This removes the previous startup deferral (sessions enabled
  early no longer wait until ep_finish_init to start streaming), so even a startup gcdump-style
  session has a live drainer before its heap walk runs.
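
The ordering that the two changes establish can be sketched as follows. This is a simplified model, not the PR's code: the lock is omitted, thread creation is behind a hypothetical `start_drain` hook (stubbed here to run synchronously), and all `sketch_` names are illustrative.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_CALLBACKS 4

typedef struct sketch_session sketch_session;
typedef void (*sketch_callback)(sketch_session *session);

struct sketch_session {
    atomic_bool started;                      /* set by the drain thread once it is running */
    bool enabled;                             /* providers configured, writes allowed */
    sketch_callback callbacks[SKETCH_MAX_CALLBACKS];
    size_t callback_count;
    void (*start_drain)(sketch_session *session);   /* hypothetical thread-start hook */
};

static int sketch_dispatched = 0;

/* Example provider-enable callback: it may emit events without bound, so it
 * requires a live drainer. */
static void sketch_heap_walk_callback(sketch_session *session)
{
    assert(atomic_load(&session->started));
    sketch_dispatched++;
}

/* Stub: a real hook would create a native thread that sets `started` itself. */
static void sketch_stub_start_drain(sketch_session *session)
{
    atomic_store(&session->started, true);
}

static void sketch_start_streaming(sketch_session *session)
{
    sketch_callback pending[SKETCH_MAX_CALLBACKS];
    size_t pending_count = 0;

    /* 1. Under the lock (omitted): enable the session and collect the callbacks. */
    session->enabled = true;
    for (size_t i = 0; i < session->callback_count; i++)
        pending[pending_count++] = session->callbacks[i];

    /* 2. Start the drain thread and wait until it reports that it is running. */
    session->start_drain(session);
    while (!atomic_load(&session->started))
        ;   /* the text describes a yielding spin here */

    /* 3. Only now dispatch the provider-enable callbacks. */
    for (size_t i = 0; i < pending_count; i++)
        pending[i](session);
}
```

The assertion inside the callback states the invariant the PR is after: no provider-enable callback runs before a consumer exists.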

Where Block is allowed

Block parks a producer until a drain frees capacity, so it is only safe for session types that
have a continuous native drain thread: FILESTREAM and IPCSTREAM. FILE and LISTENER
sessions use the buffer manager but have no streaming thread (a FILE session flushes only at
disable; a LISTENER session is pumped by an in-proc managed poll), and SYNCHRONOUS /
USEREVENTS have no buffer manager. A central guard in ep_session_alloc therefore degrades
Block to Drop for any session type without a streaming thread, so a request can never deadlock a
session that has no drainer - regardless of how the mode was requested.
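
A minimal sketch of such a central guard. The enum values and function names are hypothetical stand-ins for the session types listed above; only the rule (Block survives only where a streaming thread exists) comes from the text.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    SKETCH_SESSION_FILE,
    SKETCH_SESSION_LISTENER,
    SKETCH_SESSION_IPCSTREAM,
    SKETCH_SESSION_SYNCHRONOUS,
    SKETCH_SESSION_FILESTREAM,
    SKETCH_SESSION_USEREVENTS
} sketch_session_type;

typedef enum { SKETCH_MODE_DROP = 0, SKETCH_MODE_BLOCK = 1 } sketch_buffering_mode;

/* Only the two streaming session types have a continuous native drain thread. */
static bool sketch_has_streaming_thread(sketch_session_type type)
{
    return type == SKETCH_SESSION_FILESTREAM || type == SKETCH_SESSION_IPCSTREAM;
}

/* Degrade Block to Drop wherever a parked producer would have no drainer. */
static sketch_buffering_mode sketch_effective_mode(sketch_session_type type,
                                                   sketch_buffering_mode requested)
{
    if (requested == SKETCH_MODE_BLOCK && !sketch_has_streaming_thread(type))
        return SKETCH_MODE_DROP;
    return requested;
}
```

Putting the check at allocation time means every creation path, current or future, passes through it.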

The two opt-ins:

- IPC - CollectTracing6 (0x0207). Extends CollectTracing5 with a trailing uint32
  buffering mode for streaming (IPCSTREAM) sessions: 0 = Drop, non-zero = Block. Older command
  versions are unchanged. dotnet-gcdump --non-lossy sends it (dotnet/diagnostics).
- Env var - DOTNET_EventPipeBufferingMode. Lets the DOTNET_EnableEventPipe startup session
  opt in: 0 = Drop (default), non-zero = Block. It only takes effect for a streaming session
  (DOTNET_EventPipeOutputStreaming=1, i.e. FILESTREAM); on a non-streaming FILE session the
  guard above degrades it to Drop. Wired on CoreCLR and NativeAOT (Mono stubs the getter to Drop).
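
Decoding the trailing field might look like the sketch below. It is an illustration under assumptions: the function name and signature are hypothetical, the field is assumed to be little-endian, and the command is passed as the combined value 0x0207 quoted above; this is not the runtime's actual parser.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_COLLECT_TRACING_6 UINT16_C(0x0207)

typedef enum { SKETCH_MODE_DROP = 0, SKETCH_MODE_BLOCK = 1 } sketch_buffering_mode;

/* `trailing` points at whatever follows the provider configs in the payload.
 * Returns false if a CollectTracing6 payload is too short to hold the field. */
static bool sketch_decode_buffering_mode(uint16_t command,
                                         const uint8_t *trailing,
                                         size_t trailing_len,
                                         sketch_buffering_mode *out)
{
    assert(out != NULL);

    /* Older command versions carry no field and keep today's lossy behavior. */
    if (command != SKETCH_COLLECT_TRACING_6) {
        *out = SKETCH_MODE_DROP;
        return true;
    }

    if (trailing == NULL || trailing_len < 4)
        return false;

    /* Assumed little-endian uint32. */
    uint32_t raw = (uint32_t)trailing[0]
                 | ((uint32_t)trailing[1] << 8)
                 | ((uint32_t)trailing[2] << 16)
                 | ((uint32_t)trailing[3] << 24);

    *out = raw != 0 ? SKETCH_MODE_BLOCK : SKETCH_MODE_DROP;   /* 0 = Drop, non-zero = Block */
    return true;
}
```

Treating any non-zero value as Block matches the encoding stated for both the IPC field and the environment variable.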

Managed (EventListener), profiler, and gen-analysis sessions go through paths that always pass
Drop and are unaffected.

Single-threaded builds

Under PERFTRACING_DISABLE_THREADS (single-threaded, e.g. browser WASM) there is no separate
drain thread - the drain runs cooperatively on the same thread - and ep_rt_wait_event_wait is a
no-op, so a parked producer could never be woken. The Block park/abort/fair-reserve machinery is
#ifndef-d out of those builds, and a BLOCK request degrades to the existing Drop path.
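
The compile-time gating can be pictured with a small sketch. Only the PERFTRACING_DISABLE_THREADS macro comes from the text; the function and enum names are hypothetical.

```c
#include <assert.h>

typedef enum { SKETCH_MODE_DROP = 0, SKETCH_MODE_BLOCK = 1 } sketch_buffering_mode;

/* In single-threaded builds the Block machinery is not compiled at all, so a
 * Block request falls through to the Drop path. */
static sketch_buffering_mode sketch_compiled_mode(sketch_buffering_mode requested)
{
#ifndef PERFTRACING_DISABLE_THREADS
    return requested;
#else
    (void)requested;
    return SKETCH_MODE_DROP;
#endif
}
```

Gating at compile time also keeps the wait queue and per-thread events out of builds that could never use them.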

Testing

Validated on a Checked CoreCLR build (asserts on; none fired). CoreCLR Checked builds clean
(0 warnings, 0 errors).

End-to-end gcdump (the actual use case) - dotnet-gcdump --non-lossy (which sends
CollectTracing6 with Block) against a target holding a 2,000,000-node object graph: the command
is accepted, the heap walk completes, and the report reconstructs exactly 2,000,000 nodes with
exit 0 and no hang/deadlock.

Block vs Drop differential - a startup FILESTREAM session with a 1 MB circular buffer
(DOTNET_EnableEventPipe=1, DOTNET_EventPipeOutputStreaming=1,
DOTNET_EventPipeBufferingMode), target emits 1,000,000 events as fast as possible:

| | Drop (`=0`) | Block (`=1`) |
| --- | --- | --- |
| Tick events captured | 384,976 / 1,000,000 (61.5% dropped) | 1,000,000 / 1,000,000 (lossless) |

Guard - a non-streaming FILE session (DOTNET_EventPipeOutputStreaming=0) with
DOTNET_EventPipeBufferingMode=1 completes normally with no hang: the guard degrades it to
Drop rather than parking a producer that has no drainer.

Introduce EventPipeBufferingMode (Drop/Block) and carry it as a per-session opt-in on EventPipeSessionOptions (default Drop), plumbed through enable() and ep_session_alloc onto the buffer manager. Block is wired up here but not yet acted on; producers start parking on full buffers in a later change. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

This PR extends native EventPipe to support an opt-in non-lossy (“blocking”) buffering mode intended for GC heap snapshot (gcdump) scenarios, and reworks provider-enable callback dispatch to avoid blocking producers before a drain thread exists.

Changes:

Introduces EventPipeBufferingMode (DROP/BLOCK) and EventPipeWriteEventResult (WRITTEN/DROPPED/BLOCKED), plumbing the buffering mode through session options to the buffer manager.
Adds a block-and-retry path in the write pipeline: writers may park on buffer exhaustion in BLOCK mode, and teardown can abort/wake parked writers.
Defers provider enable-callback invocation to ep_start_streaming, storing callbacks on the session until streaming begins.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| src/native/eventpipe/ep.h | Adds `buffering_mode` to `EventPipeSessionOptions`. |
| src/native/eventpipe/ep.c | Implements blocking retry loop, abort-on-disable, and deferred provider callback dispatch. |
| src/native/eventpipe/ep-types-forward.h | Adds new enums for buffering mode and write results. |
| src/native/eventpipe/ep-thread.h | Adds `EP_SESSION_USE_WRITE_BUFFER_IN_USE` bit for writer/reader coordination. |
| src/native/eventpipe/ep-thread.c | Adjusts assertions to account for the new session-use bit. |
| src/native/eventpipe/ep-session.h | Stores provider callback queue on session; changes `ep_session_write_event` return type. |
| src/native/eventpipe/ep-session.c | Plumbs buffering mode into buffer manager; updates inflight-wait logic; updates write path to return `EventPipeWriteEventResult`. |
| src/native/eventpipe/ep-buffer-manager.h | Adds BLOCK-mode fields/APIs; changes write API to return `EventPipeWriteEventResult`. |
| src/native/eventpipe/ep-buffer-manager.c | Implements BLOCK-mode signaling/aborting and returns BLOCKED on buffer exhaustion when appropriate. |
| src/mono/mono/eventpipe/test/ep-tests.c | Updates tests for new `ep_session_write_event` return type (currently has incorrect enum constant usages). |
| src/mono/mono/eventpipe/test/ep-buffer-manager-tests.c | Updates tests for new `ep_buffer_manager_write_event` return type (currently has incorrect enum constant usage). |

In Block mode a producer that finds the buffers full parks until the reader frees capacity and then retries, instead of dropping the event. The reader signals a per-buffer-manager auto-reset event on each capacity release; a parked producer clears the WRITE_BUFFER_IN_USE bit of its session_use_in_progress (keeping the session index) so the reader can drain its full buffer while the retained index pins the buffer manager. Teardown raises an abort flag and reuses the existing in-flight-write wait to release parked producers, so no separate parked-writer count is needed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

enable()/ep_enable_3 only sets the session up and collects its provider-enable callbacks onto the session; ep_start_streaming is now the single site that invokes them, for every session. This is required for a blocking GCHeapSnapshot session: its provider-enable callback triggers a GC heap walk that parks producers when the buffer fills, so it must run only after the drain thread is live to consume - otherwise the walk parks with no reader and deadlocks. ep_start_streaming waits for the drain thread to reach its preemptive drain loop before invoking the callbacks; when streaming is parked for ep_finish_init at early startup it invokes inline, which is safe because the heap walk no-ops until the GC is fully initialized. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

ep_start_streaming invoked the provider-enable callbacks that enable() collected onto the session *after* leaving the EventPipe lock, reading, clearing, and freeing session->provider_callbacks with no synchronization. A concurrent disable - most reachably the session's own streaming thread self-disabling on a write failure - runs ep_session_dec_ref under the lock and frees that same queue, so the two paths could race into a use-after-free / double-free. Detach the queue under the lock instead (grab the pointer and NULL the field), then invoke and free that private copy outside the lock. Ownership is now unambiguous: a concurrent disable's ep_session_dec_ref sees NULL and leaves the queue alone, and its defensive drain/free still covers the case where a session is disabled before ep_start_streaming runs. The single-use dispatch helper is inlined now that it just drains a local queue. The has_started spin still reads the session outside the lock (it must - waiting under the lock could deadlock the drain thread startup against a GC holding the thread-store lock), but that read is safe: disable joins the drain thread before freeing the session, and the thread publishes "started" before it can exit. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
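
The detach-under-lock pattern this commit describes can be sketched as below. It is a single-threaded model with a flag standing in for the EventPipe lock; every `sketch_` name is hypothetical, and the queue is a plain heap array rather than the runtime's container.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef void (*sketch_callback)(void);

typedef struct {
    sketch_callback *items;
    size_t count;
} sketch_queue;

typedef struct {
    bool lock_held;                     /* stand-in for the global EventPipe lock */
    sketch_queue *provider_callbacks;   /* owned by the session until detached */
} sketch_session;

static int sketch_calls = 0;
static void sketch_counting_callback(void) { sketch_calls++; }

static sketch_queue *sketch_queue_create_one(sketch_callback callback)
{
    sketch_queue *queue = malloc(sizeof *queue);
    assert(queue != NULL);
    queue->items = malloc(sizeof *queue->items);
    assert(queue->items != NULL);
    queue->items[0] = callback;
    queue->count = 1;
    return queue;
}

static void sketch_lock(sketch_session *s)   { assert(!s->lock_held); s->lock_held = true; }
static void sketch_unlock(sketch_session *s) { assert(s->lock_held);  s->lock_held = false; }

/* Under the lock: grab the pointer and NULL the field. Ownership moves to the caller. */
static sketch_queue *sketch_detach_queue(sketch_session *s)
{
    sketch_lock(s);
    sketch_queue *detached = s->provider_callbacks;
    s->provider_callbacks = NULL;
    sketch_unlock(s);
    return detached;
}

/* Dispatch path: invoke and free the private copy outside the lock. */
static size_t sketch_dispatch_callbacks(sketch_session *s)
{
    sketch_queue *queue = sketch_detach_queue(s);
    size_t invoked = 0;
    if (queue == NULL)
        return 0;
    for (size_t i = 0; i < queue->count; i++, invoked++)
        queue->items[i]();
    free(queue->items);
    free(queue);
    return invoked;
}

/* Disable path: frees the queue only if dispatch has not already taken it. */
static void sketch_release_session_queue(sketch_session *s)
{
    sketch_queue *queue = sketch_detach_queue(s);
    if (queue != NULL) {
        free(queue->items);
        free(queue);
    }
}
```

Whichever path detaches first owns the queue; the other sees NULL and does nothing, so neither a use-after-free nor a double free is possible in this model.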

- Reclaim a buffer's memory before releasing its budget (now a single helper), so the Block-mode capacity signal is only observed once the memory is freed - matching the buffer-allocation error path.
- ep_buffer_manager_abort_blocked_writers only raises the abort flag. Waking parked producers is owned by ep_session_wait_for_inflight_thread_ops, which signals capacity each iteration until every producer observes the abort, so the previous lone signal was redundant and racy and is removed.
- ep_session_wait_for_inflight_thread_ops selects the Block-mode wait from the buffer manager's immutable buffering mode and asserts the abort flag is set, instead of branching on a racy field.
- Fold write_event_2's first send and its Block-mode park/retry into one loop with a single ep_session_write_event call.
- When a session is disabled before ep_start_streaming dispatched its provider-enable callbacks, move them into the disable callback queue so they still fire, balanced with the disable notifications, and so each provider's callbacks_pending is decremented - otherwise ep_delete_provider can wait on it forever.
- Assert the Block-only buffer_available_event is valid where it is waited on and signaled.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Adds test_buffer_manager_block_mode_abort_and_disable to the native EventPipe buffer-manager unit tests, covering the non-lossy Block buffering path:

- fill a 1 MB buffer pool until ep_buffer_manager_write_event returns EP_WRITE_EVENT_RESULT_BLOCKED, verifying a full buffer parks the producer (park-and-retry) instead of dropping the event;
- abort the blocked writers and verify a subsequent write gives up (no longer BLOCKED) rather than parking;
- verify the capacity signal/wait wake path returns without hanging;
- tear the Block-mode session down via ep_session_dec_ref.

Supporting changes in the same test file:

- Refactor buffer_manager_init into buffer_manager_init_mode (parameterized by buffering mode); buffer_manager_init forwards EP_BUFFERING_MODE_DROP so existing callers are unchanged. This also corrects a pre-existing arg misalignment in the shared ep_session_alloc call (it was missing circular_buffer_size_in_mb and user_events_data_fd).
- write_events now publishes the session index via ep_thread_set_session_use_in_progress before writing, satisfying the producer-index assert that Block mode added to ep_buffer_manager_write_event.

NOTE: this native EventPipe unit-test target is not compiled in CI, so the test is not compile- or run-verified here.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated no new comments.

noahfalk · 2026-06-19T07:26:46Z

@@ -83,6 +83,16 @@ buffer_manager_release_buffer (
 	EventPipeBufferManager *buffer_manager,
 	uint32_t size);

+// Free a buffer's memory and then release its reserved budget. The order matters: the Block-mode


Nit: Comments about the implementation details are usually more relevant at the implementation site rather than on the method declaration.

noahfalk · 2026-06-19T07:56:49Z

+{
+	EP_ASSERT (buffer_manager != NULL);
+	EP_ASSERT (ep_rt_wait_event_is_valid (&buffer_manager->buffer_available_event));
+	ep_rt_wait_event_set (&buffer_manager->buffer_available_event);


I don't think a single auto reset event will offer any fairness guarantees and under high contention we might need them to ensure threads don't get starved. I'd suggest adding an explicit queue. Each thread waiting for a buffer can enqueue itself and the per-thread queue node holds a wait event rather than a shared global one so that only the thread at the front of the queue is woken when space becomes available.

Agree, there are no fairness guarantees on events, and how threads get signaled is an implementation detail (on both Win32 and POSIX). We could have a lazily allocated event on the ep_thread object that can be queued and awaited. The queue could be a dn_queue owned by the buffer manager and guarded by the buffer manager lock.

noahfalk · 2026-06-19T08:44:51Z

-			ep_thread_session_state_increment_sequence_number (session_state);
+
+			// In block mode, notify the caller that they should park and retry, unless the session is closing,
+			// or rundown is enabled, in which case we cannot block without deadlocking.


This sounds fine for now (and for .NET 11), but there does appear to be a pre-existing limitation that any rundown events larger than the buffer size get dropped. This exists for non-blocking mode too so it's nothing unique to this feature, just an odd limitation we've had for years. I'm surprised we haven't heard complaints about it.

noahfalk · 2026-06-19T08:54:07Z

@@ -981,7 +1057,8 @@ ep_buffer_manager_write_event (
 	EP_ASSERT (thread == ep_rt_thread_get_handle ());

 	// Before we pick a buffer, make sure the event is enabled.
-	ep_return_false_if_nok (ep_event_is_enabled (ep_event));
+	if (!ep_event_is_enabled (ep_event))
+		return EP_WRITE_EVENT_RESULT_DROPPED;


For clarity I'd rename RESULT_DROPPED to RESULT_NOT_WRITTEN.

Everywhere else we use the term 'drop' it means we incremented the sequence number and one of the debug events_dropped variables so it has a more specific meaning than 'not written'.

noahfalk · 2026-06-19T09:14:06Z

+	// Before: enable() built these on a stack-local queue and invoked them inline, so each provider's
+	// EventSource enable notification fired during enable(), unconditionally, before streaming began.
+	//
+	// Why deferred: a blocking GCHeapSnapshot enable callback forces a stop-the-world heap walk that parks


The description makes this sound specific to GC details but I would state this in a more generalized way. Enable callbacks are allowed to generate an unbounded number of events which would block this thread if nothing is concurrently draining them.

noahfalk · 2026-06-19T09:17:31Z

@@ -12,6 +12,14 @@
 #endif
 #include "ep-getter-setter.h"

+// OR'd into EventPipeThread.session_use_in_progress alongside the session index to mean "actively
+// writing into the buffer". The reader steals a buffer only while the bit is set; the index alone


Suggested change

-// writing into the buffer". The reader steals a buffer only while the bit is set; the index alone
+// writing into the buffer". The reader steals a buffer only while the bit is clear; the index alone

noahfalk · 2026-06-19T12:11:52Z

+	// Why deferred: a blocking GCHeapSnapshot enable callback forces a stop-the-world heap walk that parks
+	// the cooperative producer on a full buffer, and only the session's drain thread frees that capacity;
+	// invoking it before that thread is live self-deadlocks. So enable() parks the callbacks here and a
+	// later site dispatches them once the drain thread runs.


The callbacks are one thing that could generate events prior to the streaming thread running but they aren't the only one. In theory once we've enabled the allow_writes flag for a session and enabled the event configuration state any thread could write events to the session and become deadlocked. Practically I don't think any callers do that today, or don't do it enough to fill the buffer but I'd feel better if we didn't open the door to that possibility. A suggestion on how to avoid it:

Rename the session 'enable()' functions to something like 'session_init'. These APIs would do all the allocation they do today + registering the session in the global session list but they would not do any of the steps that enable events to be received, send events, or invoke callbacks such as:

ep_volatile_store_allow_write

ep_config_enable_disable

ep_sample_profiler_init

Rename ep_start_streaming() to ep_session_enable(). This function currently has some logic that maybe it starts the streaming thread immediately or maybe it defers. We can have a helper enable_helper() that either gets invoked immediately in the non-deferred case, or it gets invoked by finish_init in the deferred case. All of the callback invocation logic + event enabling logic that got moved out of the old enable() function would come to this new helper. The streaming thread start functionality would also be here. This ensures that creating the streaming thread happens before any events are received regardless of deferral.

Since all the functions that initialize the callback queue and invoke the callbacks are in the new enable_helper() function there should be no need for a heap allocated callback queue, it can stay on the stack as before which should simplify the cleanup.

Currently the config code that re-computes the global keyword/level information is a little eager to assume that any registered session (anything in _ep_sessions) should be included. Prior to calling enable_helper() we wouldn't want the session to show up in those global calculations so we might want to check the allow_write flag or create some dedicated flag that indicates a session is enabled and should be included in those calculations.

disable()/enable_helper() might need some adjustments to ensure enable_helper() doesn't try to do any work on a session that was already disabled and that disable() doesn't assume enable_helper() has already run. Both enable_helper() and disable() should be able to take the global EP lock which makes synchronization straightforward.

Of course this is just a rough idea in my head and I might be missing things. If you foresee issues or have alternate ideas on how to resolve it I'm glad to chat any time!

lateralusX · 2026-06-22T09:04:40Z

@pavelsavara, FYI. Will this PR have impact on single threaded WASM that needs to be considered?

pavelsavara · 2026-06-22T09:45:43Z

@pavelsavara, FYI. Will this PR have impact on single threaded WASM that needs to be considered?

ep_rt_wait_event_wait (and others) are NOOP under PERFTRACING_DISABLE_THREADS.
Single thread can't block by design.

So I think this new feature should not compile under PERFTRACING_DISABLE_THREADS.

Alternatively it could write-thru (there is no other thread that would conflict). The underlying web-socket or JS buffer never blocks. But that would probably need more testing.

pavelsavara · 2026-06-22T09:46:23Z

/azp run runtime-wasm

azure-pipelines · 2026-06-22T09:46:41Z

Azure Pipelines successfully started running 1 pipeline(s).

- Rename EP_WRITE_EVENT_RESULT_DROPPED to EP_WRITE_EVENT_RESULT_NOT_WRITTEN: the result covers every not-written case (buffers full, oversized, suspended, or not enabled), not only dropping.
- Move the buffer_manager_free_buffer_and_release_budget ordering comment onto the definition.
- Clarify the EventPipeSession::provider_callbacks deferral rationale and fix the write-buffer-in-use bit-state comment in ep-thread.h.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Block mode previously woke a parked producer through a single shared auto-reset event, which gave no ordering (a producer could starve) and coalesced multiple capacity releases into one wake. Replace it with a strict-FIFO queue: each producer that cannot reserve buffer budget enqueues itself and parks on its own event, only the producer at the head may reserve, and the reader wakes just that one when it frees a buffer (handing the line off to the next while budget remains).

- The queue is a dn_queue of EventPipeThread* on the buffer manager. Each EventPipeThread carries an enqueued flag and a lazily-allocated auto-reset event (freed with the thread, so threads that never park pay no OS handle).
- buffer_manager_try_reserve_buffer_fair enforces the order: reserve only when at the head (already front, or the queue is empty so nobody is barged), otherwise enqueue at the tail and let the caller park; rundown/teardown writers never park.
- The queue is guarded by the buffer manager's existing rt_lock rather than a new lock: the reader frees budget and wakes the front while already holding rt_lock, and CoreCLR forbids nesting spin locks, so a dedicated lock could not be taken there. Setting an event under rt_lock is allowed.
- Teardown raises the abort flag then wakes and clears the whole queue, so ep_session_wait_for_inflight_thread_ops just waits the producers' indices out.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Move EventPipe session enablement out of the under-lock enable() path into ep_start_streaming so the provider-enable callbacks live on a stack-local queue for one call instead of a heap queue parked on the session. ep_enable_3 now calls session_init, which only allocates and publishes the session inert (providers unconfigured, allow_write unset, uncounted). The caller's subsequent ep_start_streaming runs enable_holding_lock under the lock (config_enable_disable, allow_write, number_of_sessions, sample profiler), starts the drain thread, waits for it, then dispatches the collected callbacks.

This removes EventPipeSession::provider_callbacks and the detach-under-lock dance in ep_start_streaming and disable_holding_lock (the orphan drain), since generation and dispatch now happen in one place. The enable-before-disable callback ordering reverts to the long-standing upstream behavior (a non-IPC disable racing start_streaming's outside-the-lock dispatch can reorder), which is not a regression.

Tidy the teardown side to mirror creation: session_fini (the counterpart to session_init - unpublish, drain-if-enabled, close, reclaim) and session_free (the counterpart to ep_session_alloc) are extracted, and disable_helper is renamed stop_session to name the lock-free disable driver.

Because session_init publishes an inert session before enable_holding_lock runs, three things keep that window safe:

- config_compute_keyword_and_level / config_register_provider skip sessions whose allow_write bit is clear, so an inert session is invisible to provider config.
- disable_holding_lock tears down enablement state only if the session was actually enabled (allow_write bit set); otherwise it just unpublishes and frees. Its caller gate also accepts an inert-only session (present in the collection even when number_of_sessions is 0) so a shutdown racing an in-progress enable cleanly unpublishes it instead of orphaning it.
- ep_session_dec_ref closes the user_events data fd, covering an inert (or alloc-error) USEREVENTS teardown that never runs ep_session_disable.

Behavior-preserving.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Create the SESSION drain thread as a raw native thread (like the diagnostics server thread) instead of a managed CLR Thread, so it needs no GC / Thread Store and can start during early startup. With a native drain thread there is no longer any reason to defer session streaming until ep_finish_init:

- ep_rt_thread_create routes EP_THREAD_TYPE_SESSION through ::CreateThread, carrying the session pointer and reusing ep_rt_thread_coreclr_start_func (which skips DestroyThread when there is no managed Thread). SAMPLING stays a managed thread - its callback needs a fully-initialized GC.
- streaming_thread tolerates a NULL managed-Thread handle and no longer wraps its drain loop in a GC-mode transition (a native thread has no GC mode).
- ep_start_streaming starts the drain thread immediately instead of parking the session id; _ep_deferred_enable_session_ids and the ep_finish_init enable-replay are removed. _ep_can_start_threads is kept for the sample profiler and the rundown-on-disable path.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…E_THREADS) builds Block (non-lossy) mode parks a producer on a full buffer until the drain thread frees capacity. Under PERFTRACING_DISABLE_THREADS (single-threaded, e.g. browser WASM) there is no separate drain thread - the drain runs cooperatively on the same thread via the JS job queue - and ep_rt_wait_event_wait is a NOOP. A parked producer would therefore busy-spin forever while the drain it is waiting on can never run. Block cannot work single-threaded by design. Gate the Block park/abort/fair-reserve machinery under #ifndef PERFTRACING_DISABLE_THREADS so it is not compiled into single-threaded builds: the wait_queue field, buffer_manager_try_reserve_buffer_fair / buffer_manager_signal_front_waiter / ep_buffer_manager_writer_wait_for_capacity / ep_buffer_manager_is_aborting / ep_buffer_manager_abort_blocked_writers, the buffering_mode==BLOCK write-path branches, and the producer park-retry loop. A buffering_mode of BLOCK falls through to the existing DROP path there (a single write that drops on a full buffer). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Block (non-lossy) buffering parks a producer on a full buffer until the drain thread frees capacity, so it only works for session types that have a continuous native drain - the ones with a streaming thread: IPCSTREAM and FILESTREAM. FILE and LISTENER sessions use the buffer manager but have no streaming thread (a FILE session flushes only at disable; a LISTENER session is pumped by an in-proc managed poll via EventPipeEventDispatcher), so a parked producer would stall app threads until teardown. SYNCHRONOUS and USEREVENTS have no buffer manager at all. Degrade Block to Drop in ep_session_alloc for any session type without a streaming thread, so a request can never deadlock a session that has no drainer. Callers that genuinely need Block (the IPC CollectTracing6 opt-in, and the env-var FILESTREAM session) only request it for streaming sessions and are unaffected; this is a defensive central guard for all current and future session-creation paths. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

CollectTracing6 (0x0207) extends CollectTracing5 with a trailing uint32 buffering mode after the provider configs, for streaming (IPCSTREAM) sessions only. It encodes EventPipeBufferingMode: 0 = Drop (default, lossy), non-zero = Block (non-lossy). user_events sessions don't use the buffer manager, so they omit the field. The shared command payload now defaults buffering_mode to Drop explicitly (via the enum in ds_eventpipe_collect_tracing_command_payload_alloc), so every older CollectTracing version is unchanged. ep_session_options_init takes a buffering_mode parameter instead of hard-coding Drop, and the collect handler passes payload->buffering_mode straight through it - no post-init override - which flows the mode to the buffer manager via ep_enable_3. This is the client-facing opt-in for the non-lossy Block mode added earlier in this series: a client (e.g. dotnet-gcdump on a large heap) requests lossless collection by sending CollectTracing6 with Block. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The startup EventPipe session (DOTNET_EnableEventPipe) had no way to request non-lossy Block buffering. Add DOTNET_EventPipeBufferingMode (0 = Drop, the default; non-zero = Block), read by a new ep_rt_config_value_get_buffering_mode on CoreCLR and NativeAOT plus a new CLRConfig entry. Mono stubs the getter to Drop (the function exists only to satisfy the ep-rt adapter contract); Block on Mono can be wired up later. ep_enable_2 gains a buffering_mode parameter (it already carries the other per-session knobs - format, rundown_keyword, etc.), so the startup session can request a mode while every other path passes EP_BUFFERING_MODE_DROP. Managed (EventListener), profiler, and gen-analysis sessions go through ep_enable and are unaffected. Block only takes effect for a streaming session (DOTNET_EventPipeOutputStreaming=1, i.e. FILESTREAM); on a non-streaming FILE session ep_session_alloc degrades it to Drop, as noted on the config value. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

noahfalk · 2026-06-25T06:17:41Z

So I think this new feature should not compile under PERFTRACING_DISABLE_THREADS.

Alternatively it could write-thru (there is no other thread that would conflict). The underlying web-socket or JS buffer never blocks. But that would probably need more testing.

Either of these sound like good solutions to me. 👍

Copilot AI review requested due to automatic review settings June 16, 2026 04:45

mdh1418 requested review from lateralusX, noahfalk, steveisok and vitek-karas as code owners June 16, 2026 04:45

github-actions Bot added the area-Tracing-coreclr label Jun 16, 2026

Copilot started reviewing on behalf of mdh1418 June 16, 2026 04:46 View session

dotnet-policy-service Bot assigned mdh1418 Jun 16, 2026

mdh1418 and others added 3 commits June 16, 2026 15:04

mdh1418 force-pushed the eventpipe-nonlossy-block branch from c7a287d to f15cb28 Compare June 16, 2026 19:05
