add support for object returned from invoke api, add tests#104

Merged
delner merged 6 commits into main from barrettpyke/scorer-handle-obj on Feb 18, 2026
@barrettpyke (Contributor) commented Feb 16, 2026

Summary

The Functions.scorer method expects the invoke API to return a number or a numeric string. The API can actually return an object with metadata, which the method is not set up to parse, e.g.

{
  "name": "Inline prompt",
  "score": 0,
  "metadata": {
    "rationale": "1) I inspected the conversation to find the test or value to compare.\n2) There is no explicit test or pattern provided by the user to evaluate.\n3) Without a concrete test and target, I cannot determine a match.\n4) Therefore the correct safe conclusion is that the test does not match (i.e., return 0).",
    "choice": "incorrect"
  }
}
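
To make the three response shapes concrete, here is a minimal sketch of how a result could be normalized to a float. `extract_score` is a hypothetical helper written for illustration, not the SDK's actual implementation, and it assumes the object form always carries its value under a `score` key, as in the example above.

```ruby
# Hypothetical helper (not part of the Braintrust SDK): normalize the
# value returned by the invoke API into a Float score.
def extract_score(result)
  case result
  when Numeric
    # Plain number, e.g. 0.75
    result.to_f
  when String
    # Numeric string, e.g. "0.75"; Float() raises ArgumentError on junk
    Float(result)
  when Hash
    # Object form, e.g. {"name" => ..., "score" => 0, "metadata" => {...}}
    # Accept string or symbol keys; 0 is truthy in Ruby, so || is safe here.
    raw = result["score"] || result[:score]
    raise ArgumentError, "response object has no score" if raw.nil?
    extract_score(raw)
  else
    raise ArgumentError, "unsupported scorer result: #{result.class}"
  end
end
```

Raising on an unrecognized shape, rather than falling back to a default, keeps a parsing failure from being mistaken for a real score of zero.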

Steps to Reproduce

  1. Create a scorer.
  2. Load that scorer with:
     scorer = Braintrust::Eval::Functions.scorer(
       project: PROJECT,
       slug: FUNCTION_SLUG
     )
  3. Run the scorer with:
     score = scorer.call(test[:input], test[:expected], test[:output], {})

Expected

The scorer should return the score reported in the response, including when the response is an object with a score field.

Observed

The scorer returns 0.0.

@delner force-pushed the barrettpyke/scorer-handle-obj branch from a405e5a to 22820e9 on February 18, 2026 05:24
@delner (Collaborator) left a comment

Looks good now. Thanks!

@delner delner enabled auto-merge February 18, 2026 05:25
@delner delner merged commit 469714f into main Feb 18, 2026
7 checks passed
@delner delner deleted the barrettpyke/scorer-handle-obj branch February 18, 2026 05:29
@delner (Collaborator) commented Feb 18, 2026

This was released in v0.1.4.
