@tristanpemble tristanpemble commented Nov 26, 2025

This PR introduces a std.log.span() function, following the existing patterns for std.log:

  • support for log levels
  • support for scoping
  • all calls removed at comptime when scope/level is disabled
  • support for a custom traceFn implementation
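
One way the "removed at comptime" property can be achieved is to pick the span type from the level at comptime, so a disabled span is a zero-sized struct with empty methods. The sketch below is not the PR's implementation: `enabled_level`, `ActiveSpan`, `NoopSpan`, `SpanType`, and this simplified `span()` (no scope, format, or args) are all hypothetical names used for illustration.

```zig
const std = @import("std");

// Hypothetical stand-in for the maximum level configured via std.Options.
const enabled_level: std.log.Level = .info;

/// Span used when the level is enabled; it carries real state.
const ActiveSpan = struct {
    began: bool = false,

    pub fn begin(self: *ActiveSpan) void {
        self.began = true;
    }

    pub fn end(self: *ActiveSpan) void {
        self.began = false;
    }
};

/// Span used when the level is disabled: zero-sized, every method is empty.
const NoopSpan = struct {
    pub fn begin(_: *NoopSpan) void {}
    pub fn end(_: *NoopSpan) void {}
};

/// Picks the span type at comptime from the requested level.
/// std.log.Level orders err < warn < info < debug.
fn SpanType(comptime level: std.log.Level) type {
    return if (@intFromEnum(level) <= @intFromEnum(enabled_level)) ActiveSpan else NoopSpan;
}

/// Hypothetical, simplified analogue of std.log.span().
fn span(comptime level: std.log.Level) SpanType(level) {
    return .{};
}
```

With `enabled_level` set to `.info`, `span(.debug)` has size zero and its `begin()`/`end()` calls have empty bodies, so the optimizer has nothing left to emit for them.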

Usage

There are three different APIs being exposed here.

Instrumenting code

Anyone who wants to instrument their code (application and library authors) can add spans to their code like this:

const std = @import("std");
const log = std.log;

pub fn main() !void {
    var threaded = std.Io.Threaded.init(std.heap.page_allocator);
    defer threaded.deinit();

    const io = threaded.io();

    var work = log.span(.info, "scheduling work", .{});
    work.begin();
    defer work.end();

    var group: std.Io.Group = .init;
    group.async(io, doWork, .{io});
    group.async(io, doWork, .{io});
    group.async(io, doWork, .{io});
    group.wait(io);
}

fn doWork(io: std.Io) void {
    var span = log.span(.info, "doing a unit of work", .{});
    span.begin();
    defer span.end();

    io.sleep(.fromMilliseconds(1), .awake) catch {};
}

Implementing a tracing backend

Application authors can provide a traceFn on std.Options that consumes the events. Here's a crude one that I used for testing during development; it prints a column for each executor and visualizes the spans:

pub const std_options: std.Options = .{
    .traceFn = myTraceFn,
};

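var mutex: std.Thread.Mutex = .{};
// Assumed upper bound on executor columns (slot 0 is `.none`); the value is
// illustrative, the original snippet does not define it.
const max_executors = 16;
var state: [max_executors]u32 = @splat(0);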

fn myTraceFn(
    comptime level: log.Level,
    comptime scope: @EnumLiteral(),
    any: *log.AnySpan,
    event: log.SpanEvent,
    executor: log.ExecutorId,
    comptime format: []const u8,
    args: anytype,
) void {
    mutex.lock();
    defer mutex.unlock();

    const span = any.asSpan(level, scope, format, @TypeOf(args));
    const span_id: usize = @intCast(@intFromEnum(span.id));
    const exec_id: usize = if (executor == .none) 0 else @as(usize, @intCast(@intFromEnum(executor))) + 1;

    const sign: u8 = switch (event) {
        .begin => '[',
        .end => ']',
        .enter => '>',
        .exit => '<',
        .link => '+',
        .unlink => '-',
    };

    switch (event) {
        .begin => state[exec_id] += 1,
        .end => state[exec_id] -= 1,
        else => {},
    }

    for (0..max_executors) |e| {
        if (e == exec_id) {
            std.debug.print("{c}{:0>2} ", .{ sign, span_id });
        } else if (state[e] > 0) {
            std.debug.print(" || ", .{});
        } else {
            std.debug.print("    ", .{});
        }
    }
    std.debug.print("\n", .{});
}

Implementing a std.Io

This requires support by any implementation of std.Io. You can look at how I implemented it in Threaded, but the idea is simple:

  • each "executor" (thread, fiber, etc.) gets a unique ExecutorId
  • as the executor is moved between threads, it should call executor_id.enter() and executor_id.exit() on that thread
  • as work is created, it should store the std.log.current_span
  • as work is scheduled, it should call span.link(), linking the original span to this executor
  • as work is completed, it should call span.unlink(), unlinking the original span from this executor
  • as work is suspended/resumed, it should call span.exit() and span.enter() accordingly
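
The create/schedule/complete steps above can be sketched roughly as follows. This is a toy model, not the Threaded implementation: `ExecutorId`, `Span`, `Task`, `newExecutorId`, `current_span`, and `noop` are hypothetical stand-ins, spans only record which events they saw, and enter/exit (suspend/resume and thread migration) are left out.

```zig
const std = @import("std");

/// Hypothetical analogue of std.log.ExecutorId: a monotonic identifier.
const ExecutorId = enum(u32) {
    none = std.math.maxInt(u32),
    _,
};

var next_executor = std.atomic.Value(u32).init(0);

/// Hands out a fresh id for each new thread, fiber, etc.
fn newExecutorId() ExecutorId {
    return @enumFromInt(next_executor.fetchAdd(1, .monotonic));
}

const Event = enum { link, unlink };

/// Toy span that just remembers the events it received.
const Span = struct {
    events: [4]Event = undefined,
    len: usize = 0,

    fn record(self: *Span, e: Event) void {
        self.events[self.len] = e;
        self.len += 1;
    }

    fn link(self: *Span) void {
        self.record(.link);
    }

    fn unlink(self: *Span) void {
        self.record(.unlink);
    }
};

/// Hypothetical analogue of std.log.current_span.
threadlocal var current_span: ?*Span = null;

const Task = struct {
    span: ?*Span,
    func: *const fn () void,

    /// Work is created: capture the span that is current right now.
    fn create(func: *const fn () void) Task {
        return .{ .span = current_span, .func = func };
    }

    /// Work is scheduled on an executor: link, run, then unlink on completion.
    fn run(self: Task, executor: ExecutorId) void {
        _ = executor; // a real backend would receive this with every event
        if (self.span) |s| s.link();
        defer if (self.span) |s| s.unlink();
        self.func();
    }
};

/// Placeholder unit of work.
fn noop() void {}
```

The point of capturing the span at creation time is that the task may run later, on another executor, when `current_span` no longer points at the span that spawned it.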

Misc Details

Userdata

Some tracing backends might need userdata to be stored on the span. std.Options has a SpanUserdata: type field (default void). When the span is created, its userdata: SpanUserdata is uninitialized. The traceFn implementation can store user data on it as SpanEvents are handled.
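
A rough sketch of how a backend might use this, under simplifying assumptions: the `SpanUserdata` struct, the two-variant `SpanEvent`, the non-generic `Span`, and the cut-down `traceFn` signature below are hypothetical and much smaller than what the PR defines.

```zig
const std = @import("std");

// Hypothetical userdata type a backend might configure as SpanUserdata.
const SpanUserdata = struct { depth: u32 };

const SpanEvent = enum { begin, end };

/// Simplified span: userdata starts out undefined, as described above.
const Span = struct {
    name: []const u8,
    userdata: SpanUserdata = undefined,
};

// Per-thread nesting depth tracked by this toy backend.
threadlocal var current_depth: u32 = 0;

/// Toy traceFn: writes userdata on begin, relies on it being there on end.
fn traceFn(span: *Span, event: SpanEvent) void {
    switch (event) {
        .begin => {
            span.userdata = .{ .depth = current_depth };
            current_depth += 1;
        },
        .end => {
            // Restore the depth recorded when this span began.
            current_depth = span.userdata.depth;
        },
    }
}
```

Because the field is uninitialized until the backend writes it, a backend must set it on the first event it sees for a span (here, begin) before reading it on any later event.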

Source location

I'm conflicted on whether to include SourceLocation as part of the API. As a middle ground, I am tracking the @returnAddress() of the call to std.log.span(); the implementation can decide whether to use this to look up the SourceLocation from debug data.

If we do include it in the API, it will make the span() calls slightly more verbose.
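
To make the two options concrete, here is a small sketch contrasting them. `SpanSite`, `spanImplicit`, and `spanExplicit` are hypothetical names; only `@returnAddress()`, `@src()`, and `std.builtin.SourceLocation` are real Zig features.

```zig
const std = @import("std");

/// What a span could remember about where it was created.
const SpanSite = struct {
    return_address: usize,
    src: ?std.builtin.SourceLocation,
};

/// Option A: nothing extra at the call site. The address can be resolved to
/// a file and line later, but only by consulting debug info.
fn spanImplicit() SpanSite {
    return .{ .return_address = @returnAddress(), .src = null };
}

/// Option B: the caller passes @src(), so file, function, and line are known
/// at comptime, at the cost of a more verbose call.
fn spanExplicit(src: std.builtin.SourceLocation) SpanSite {
    return .{ .return_address = 0, .src = src };
}
```

A call then reads `spanImplicit()` versus `spanExplicit(@src())`, which is the extra verbosity in question.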

Timing data

I opted not to include timing data. Some backends use their own timing, some don't. It seemed like something that the traceFn implementation should handle.

Caveats

The biggest caveat to my design is that it depends on a linked list of stack allocated data. Each Span stores a pointer to its previous Span. What that means is that moving the span after you have called begin() is an illegal operation:

var span = log.span(.info, "hello", .{});
// move span wherever you want
span.begin(); // we have now stored the stack pointer -- moving after this point is illegal behavior!

I don't love it, but I also think it's kind of ok - the split design of initializing a span and beginning it allows you to move the span where it needs to be stored before beginning it.
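
A minimal sketch of the mechanism, to show where the stored address comes from. This `Span` is a hypothetical, non-generic stand-in for the PR's type, and `current_span` stands in for the threadlocal list head.

```zig
const std = @import("std");

/// Simplified span: an intrusive singly linked list node.
const Span = struct {
    name: []const u8,
    prev: ?*Span = null,

    /// Pushes this span onto the per-thread list. From here on the list
    /// holds the address of `self`, so `self` must not be copied or moved.
    pub fn begin(self: *Span) void {
        self.prev = current_span;
        current_span = self;
    }

    /// Pops this span, restoring its parent as the current span.
    pub fn end(self: *Span) void {
        std.debug.assert(current_span.? == self);
        current_span = self.prev;
    }
};

/// Head of the per-thread list of live spans.
threadlocal var current_span: ?*Span = null;
```

If a span were copied after `begin()`, `current_span` would still point at the old location, and `end()` on the copy would trip the assertion (or, without the assertion, leave a dangling pointer in the list).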

Open Questions

Status of IoUring/Kqueue?

I started working on IoUring but quickly found out that it looks like it's not currently complete. Is it? Let me know what's up here. Until I can test on IoUring/Kqueue, I can't validate this design against a fiber model.

Should std.log.ExecutorId be part of std.Io?

To me this concept feels pretty intrinsically tied to Io implementations; it's essentially a monotonic identifier for native threads/green threads/fibers/whatever.

Should this be moved to a new module, like std.trace?

I go back and forth on this, but it shares a lot with std.log, so I left it there.

Should we create a default executor ID in start(), or leave it as .none?

std.log.current_executor defaults to .none. This helps catch implementation errors in std.Io when adding support. If you're not using std.Io, all your executors will be .none, which I think is ok. But there could be an argument for automatically setting the main thread's "executor ID".

Should we update logFn to include span information?

People often like to tie logs to spans; right now I don't have that wired up. It would be a breaking change, but may be valuable.

Default traceFn?

I put a default traceFn that logs the beginning and end of the trace. Should there even be a default? If there is, what should it do?

Various bikeshedding on names, etc.

I have spent a lot of time on naming but I am still not sure what the best names for a lot of these concepts are. I am very open to suggestions.

Issues

implements #5987

@tristanpemble
Author

@MasonRemaley

steeve commented Nov 26, 2025

Minor aesthetic suggestion: let begin return the work so that:

    var work = log.span(.info, "scheduling work", .{}).begin();
    defer work.end();

steeve commented Nov 26, 2025

Source location

I'm conflicted on whether to include SourceLocation as part of the API. For a middleground, I am tracking the @returnAddress() of the call to std.log.span(); the implementation can decide if they would like to use this to gather the SourceLocation from debug data.

@returnAddress requires DWARF and stack unwinding will be slow (or at least slower than it could have been). Strong argument for something like @callerSrc, though!

@tristanpemble
Author

Minor aesthetic suggestion: let begin return the work so that:

    var work = log.span(.info, "scheduling work", .{}).begin();
    defer work.end();

let me caveat this by saying that I would really like to enable a simpler syntax, but I have yet to find a way to do it with the current traceFn. the constraints come down to two things:

first, we must not allocate since this is performance sensitive code. a child span needs to be able to revert to the parent span, so I arrived at a linked list with a threadlocal head.

second, Span is generic over its level/scope/format/Args, storing args: Args, and the traceFn currently has access to those args for all events: begin, end, link, unlink, etc. this meant that the linked list must store type erased pointers, rather than just the SpanId.

now, to answer your question, since we store a pointer to the Span as threadlocal state, its location must be stable when begin() is called. correct me if I am wrong, but in chained syntax, log.span() would return a temporary and we'd end up storing a dangling reference.

so then the question comes down to whether every event should have access to those fields. if not -- specifically, in the events emitted from std.Io (link/unlink/enter/exit) -- I think we can just store the SpanId per thread, remove the stack pointers, and then only begin/end will receive all the fields. any std.Io code will only be able to see SpanId, and maybe a pointer to SpanUserdata.
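
a rough sketch of that alternative, under the assumption that only an id needs to be tracked per thread. `SpanId`, `Span`, `current_id`, and `next_id` are hypothetical here, and level/scope/format/args are left out entirely:

```zig
const std = @import("std");

/// Hypothetical span identifier; 0 is reserved for "no span".
const SpanId = enum(u64) { none = 0, _ };

var next_id = std.atomic.Value(u64).init(1);

/// Per-thread state is just an id, not a pointer into someone's stack.
threadlocal var current_id: SpanId = .none;

const Span = struct {
    id: SpanId,
    parent: SpanId,

    /// Returns the span by value. Nothing retains its address, so the
    /// result can be copied or moved freely.
    pub fn begin(name: []const u8) Span {
        _ = name; // a real backend would receive this with the begin event
        const s: Span = .{
            .id = @enumFromInt(next_id.fetchAdd(1, .monotonic)),
            .parent = current_id,
        };
        current_id = s.id;
        return s;
    }

    /// Restores the parent id as the current span.
    pub fn end(self: Span) void {
        current_id = self.parent;
    }
};
```

with this shape the chained form works, since `const work = Span.begin("scheduling work"); defer work.end();` never stores a pointer to a temporary. the cost is the one described above: code that only sees `current_id` has no access to the span's other fields.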

the SpanUserdata was something I added last minute, but it being available is making me reconsider this as an option. the traceFn could store what it needs there on begin.

Source location

I'm conflicted on whether to include SourceLocation as part of the API. For a middleground, I am tracking the @returnAddress() of the call to std.log.span(); the implementation can decide if they would like to use this to gather the SourceLocation from debug data.

@returnAddress requires DWARF and stack unwinding will be slow (or at least slower than it could have been). Strong argument for something like @callerSrc, though!

to paraphrase @MasonRemaley: the tradeoff here seems to be introducing overhead while doing performance analysis, versus potentially playing better with incremental compilation

when I originally approached this, my expectation was that @src() would be passed to args as a matter of convention, and your traceFn could extract it. the problem is that this would require all library authors to follow a convention.

there's an argument for removing format and args entirely from the span, just having log.span(.info, @src()), simplifying this API tremendously as a result, and then correlating log functions to the span ID

@tristanpemble tristanpemble marked this pull request as draft November 26, 2025 22:16
@andrewrk
Member

This pull request is not ready for review because:

  • It is explicitly marked as a draft.

Since we have moved development to Codeberg, please open your pull request there if you would like to continue these efforts.

@andrewrk andrewrk closed this Nov 27, 2025