Enhance performance evaluations #1034
Conversation
Nice feature!

In the vector case, perhaps we should just parallelize with multiple threads by default. It's just serial for now. @OkonSamuel, what do you think?
src/resampling.jl (Outdated)

```julia
# multiple model evaluations:
evaluate(
    models_or_pairs::AbstractVector{<:Union{Machine,Pair{String,<:Model}}}, args...;
```
Is there any reason we are allowing machines to be passed here? I ask because of the type `<:Union{Machine,Pair{String,<:Model}}`; I thought this would have been `<:Union{Model,Pair{String,<:Model}}`.
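For reference, the two element types being compared (an illustrative snippet; it assumes `Machine` and `Model` as exported by MLJBase):

```julia
using MLJBase: Machine, Model

# Element type as written in the diff above — admits raw machines:
AbstractVector{<:Union{Machine,Pair{String,<:Model}}}

# Element type suggested in the question — models or tagged models only:
AbstractVector{<:Union{Model,Pair{String,<:Model}}}
```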
Good catch. This is a mistake. I'll fix and add a test.
If we did this (parallelize by default) we would encounter some issues. For example, we might have a data race: I don't think anything prevents someone from doing this:

```julia
mach1 = machine(ConstantClassifier(), X, y)
evaluate!(["const1" => mach1, "const2" => mach1])
```

If these ran on different threads, we would be modifying the same machine object from different threads, which could lead to a race condition. Although this shouldn't be an issue if we just use the regular …
What if we just throw an error if there are duplicate machines? Would it suffice to test with `===`?

Binding a machine to data does not duplicate the data, as far as I can tell. However, there is at least one MLJ operation that I can think of that mutates data, such as …
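A minimal sketch of the kind of guard being floated here, assuming duplicates are detected by object identity; the helper name and error message are hypothetical, not part of the PR:

```julia
# Hypothetical guard: refuse to proceed if the same machine object appears
# more than once, detecting duplicates with === as suggested above.
function check_no_duplicate_machines(machines::AbstractVector)
    n = length(machines)
    for i in 1:n, j in (i + 1):n
        machines[i] === machines[j] && throw(ArgumentError(
            "The same machine object appears more than once; " *
            "pass distinct machines to avoid concurrent mutation."))
    end
    return nothing
end
```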
@OkonSamuel I think we should just leave out the parallelism for now. Are you happy for me to merge?
Yes @ablaom, we should leave out parallelism for now. Aside from that, I'm happy for this to be merged.
@OkonSamuel Thank you for your review and helpful feedback.
This PR provides a few enhancements for the results of `evaluate` or `evaluate!`, which estimate various kinds of out-of-sample performance of MLJ models (resp., machines). These should make `evaluate` more convenient when applying it to batches of models, to be compared:

- The estimate of standard errors, which is currently only calculated for display of a returned object, is now calculated when the object is constructed, and is a new, user-accessible property, `uncertainty_radius_95`.

- Users can now "tag" their estimates, by doing `evaluate("some tag" => model, ...)` instead of `evaluate(model, ...)`, and the returned object has a new user-accessible property `tag` for storing this. Tags are auto-generated using the model name when not supplied, but for deeply wrapped models this is often inadequate, hence the addition of user tags. The tag is shown when the object is displayed.

- Users can now `evaluate` a vector of models, or tagged models, as in the illustrative sketch following this list, where the user-supplied tags appear in the output.
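An illustrative sketch of the batch form described above. The model choices, the data-generator call, and the exact structure of the returned collection are assumptions, not taken from this PR:

```julia
using MLJ

X, y = make_blobs()  # small synthetic classification dataset

# Evaluate a vector of tagged models in a single call. The tags supplied here
# ("const A", "const B") are illustrative; they appear in the display of the
# result and in the `tag` property of each evaluation.
evaluations = evaluate(
    ["const A" => ConstantClassifier(), "const B" => ConstantClassifier()],
    X, y;
    resampling=CV(nfolds=6),
    measure=log_loss,
)

# Assuming the result behaves like a collection of ordinary evaluation objects,
# each one carries the new user-accessible properties described above:
e = first(evaluations)
e.tag                     # "const A"
e.uncertainty_radius_95   # standard-error-based radius of a 95% interval
```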
Similar changes apply to the `evaluate!(::Machine, ...)` form. In the future we might add a `summarize(evaluations)` to convert the kind of information displayed here into a table.

I found a few corner-case bugs in the display of performance evaluation objects, which I have fixed. I have also added a lot more testing of the display, and added examples to the docstrings for `evaluate` and `evaluate!`.

This PR closes #1031.