[ENH] defer trailing underscore attribute assignment in fit() for imputers and discretisers by direkkakkar319-ops · Pull Request #917 · feature-engine/feature_engine

direkkakkar319-ops · 2026-03-21T19:25:03Z

Description

Closes #586

This PR ensures that all trailing underscore attributes (variables_, imputer_dict_, binner_dict_) in imputation and discretisation transformers are only assigned after all fit logic has successfully completed, following sklearn convention.

Problem

Previously, attributes like variables_ were set early in fit(), before the remaining logic ran. If an error occurred midway through fitting, the transformer was left in a partially fitted state — meaning transform() would not raise NotFittedError as expected.

For example:

def fit(self, X, y=None):
    self.variables_ = find_numerical_variables(X)  # ← set too early
    self.imputer_dict_ = X[self.variables_].mean()  # ← if this fails, variables_ already set
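
The difference between the early and the deferred assignment can be sketched without the library. This is a minimal, dependency-free illustration, not the code from this PR: `EagerFit`, `DeferredFit`, `column_mean`, and `state_after_failed_fit` are hypothetical names, and a plain dict of columns stands in for the DataFrame.

```python
def column_mean(values):
    # Stand-in for a fit step that can fail: rejects non-numeric data.
    if not all(isinstance(v, (int, float)) for v in values):
        raise TypeError("non-numeric values")
    return sum(values) / len(values)


class EagerFit:
    """Hypothetical estimator that assigns variables_ before the risky step."""

    def fit(self, X, y=None):
        self.variables_ = list(X)  # set too early
        self.imputer_dict_ = {v: column_mean(X[v]) for v in self.variables_}
        return self


class DeferredFit:
    """Hypothetical estimator that works on locals and assigns last."""

    def fit(self, X, y=None):
        variables = list(X)
        imputer_dict = {v: column_mean(X[v]) for v in variables}
        # Reached only if every step above succeeded.
        self.variables_ = variables
        self.imputer_dict_ = imputer_dict
        return self


def state_after_failed_fit(estimator, X):
    # Run fit(), swallow the expected error, and report what was left behind.
    try:
        estimator.fit(X)
    except TypeError:
        pass
    return sorted(vars(estimator))


bad_X = {"age": [20, 30], "city": ["Paris", "Rome"]}
print(state_after_failed_fit(EagerFit(), bad_X))
print(state_after_failed_fit(DeferredFit(), bad_X))
```

With the eager version, `variables_` survives the failed fit; with the deferred version, the instance keeps no trailing underscore attributes.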

Tests

Added test_raises_non_fitted_error_when_error_during_fit to tests/test_imputation/test_check_estimator_imputers.py, following the same pattern already established in tests/test_encoding/test_check_estimator_encoders.py.

First attempt — 4 test cases failed

When the test was first added, it used a numerical-only DataFrame as the failure trigger for all estimators. This worked for MeanMedianImputer, EndTailImputer, and ArbitraryNumberImputer, but 4 cases did not raise because CategoricalImputer(ignore_format=True), AddMissingIndicator, RandomSampleImputer, and DropMissingData accept all variable types and fitted successfully on that DataFrame.

Fix — different triggers per estimator

The test was updated to use the correct failure trigger per estimator type:

| Estimator | Trigger |
| --- | --- |
| `MeanMedianImputer`, `EndTailImputer`, `ArbitraryNumberImputer` | Categorical-only df → `find_numerical_variables()` raises `TypeError` |
| `CategoricalImputer` | Reset to `ignore_format=False` + numerical-only df → raises `TypeError` |
| `AddMissingIndicator`, `RandomSampleImputer`, `DropMissingData` | Empty df → `check_X()` raises `ValueError` |
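
The idea of choosing a failure trigger per estimator type can be sketched as follows. Everything here is a hypothetical stand-in rather than the test added in this PR: `NumericOnly` mimics an estimator that rejects data without numerical columns, `AnyType` mimics one that accepts every variable type but rejects empty input, and a dict of columns replaces the DataFrame.

```python
class NumericOnly:
    """Hypothetical estimator that needs at least one numerical column."""

    def fit(self, X):
        if not X:
            raise ValueError("empty input")
        numeric = [
            col for col, vals in X.items()
            if all(isinstance(v, (int, float)) for v in vals)
        ]
        if not numeric:
            raise TypeError("no numerical variables found")
        self.variables_ = numeric
        return self


class AnyType:
    """Hypothetical estimator that accepts all variable types."""

    def fit(self, X):
        if not X:
            raise ValueError("empty input")
        self.variables_ = list(X)
        return self


CATEGORICAL_ONLY = {"city": ["Paris", "Rome"]}
EMPTY = {}

# One (bad input, expected error) pair per estimator type.
TRIGGERS = {
    NumericOnly: (CATEGORICAL_ONLY, TypeError),
    AnyType: (EMPTY, ValueError),
}


def fit_fails_cleanly(estimator):
    # True only if fit() raised the expected error AND left no
    # trailing underscore attribute on the instance.
    bad_X, expected_error = TRIGGERS[type(estimator)]
    try:
        estimator.fit(bad_X)
    except expected_error:
        return not any(name.endswith("_") for name in vars(estimator))
    return False
```

A single shared trigger does not work here for the same reason as in the first attempt: `AnyType().fit({"age": [20, 30]})` succeeds, so no error is raised and nothing is tested.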

Type of Change

Bug fix
New feature (non-breaking)
Breaking change
Documentation update

All the test cases are passing locally

direkkakkar319-ops · 2026-03-22T09:56:21Z

Previously, the test cases were passing locally but the CI checks were failing.

The 3 CI checks that were failing were:

test_failure_engine_py39
test_feature-engine_py311_sklearn160
test_feature-engine_py312_pandas300

The cause was that self.variables_ was set early in fit(). On older sklearn versions, check_is_fitted() found it and considered the transformer fitted even though fit() had failed, so transform() didn't raise NotFittedError as expected.
This was the reason the CI checks were failing.

direkkakkar319-ops · 2026-03-22T10:01:57Z

After the fix, since no trailing underscore attributes are set until the very end, a failed fit() leaves the object completely clean: check_is_fitted() finds nothing and correctly raises NotFittedError.
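
The check described here can be pictured with a dependency-free stand-in. The sketch assumes that the fitted check amounts to looking for instance attributes whose names end in an underscore, which, to my understanding, is what sklearn's check_is_fitted() does by default when no attribute names are passed. `NotFittedErrorSketch`, `check_is_fitted_sketch`, and `Transformer` are hypothetical names, not sklearn's API.

```python
class NotFittedErrorSketch(ValueError, AttributeError):
    """Hypothetical stand-in for sklearn.exceptions.NotFittedError."""


def check_is_fitted_sketch(estimator):
    # Collect instance attributes that look like learned state.
    fitted = [
        name for name in vars(estimator)
        if name.endswith("_") and not name.startswith("__")
    ]
    if not fitted:
        raise NotFittedErrorSketch(
            f"This {type(estimator).__name__} instance is not fitted yet."
        )


class Transformer:
    """Hypothetical transformer whose transform() guards on fitted state."""

    def transform(self, X):
        check_is_fitted_sketch(self)
        return X


# Leftover from a fit() that assigned early and then failed.
half_fitted = Transformer()
half_fitted.variables_ = ["age"]

# A fit() that failed before assigning anything leaves the instance clean.
clean = Transformer()
```

`half_fitted.transform(...)` passes the check even though fitting never finished, while `clean.transform(...)` raises the not-fitted error.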

direkkakkar319-ops · 2026-03-22T10:02:15Z

I have also run the tests locally; all of them are passing.

codecov · 2026-03-22T10:17:23Z

Codecov Report

❌ Patch coverage is 98.73418% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 98.25%. Comparing base (f72a2b7) to head (0c681f5).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| feature_engine/transformation/log.py | 88.23% | 1 Missing and 1 partial ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #917      +/-   ##
==========================================
- Coverage   98.27%   98.25%   -0.03%     
==========================================
  Files         116      116              
  Lines        4978     5033      +55     
  Branches      795      797       +2     
==========================================
+ Hits         4892     4945      +53     
- Misses         55       56       +1     
- Partials       31       32       +1


solegalli

Hey @direkkakkar319-ops

These changes are neat! Thank you so much!

Could you finish updating all discretisers and all imputers so we can merge?

For the next PR, could you please make 1 PR per module? So, for example, 1 PR for encoders, a different PR for creation and so on?

Thanks a lot! I look forward to the changes.

solegalli · 2026-03-26T17:24:22Z

feature_engine/_base_transformers/base_numerical.py

+
+        return X, variables_
+
    def fit(self, X: pd.DataFrame) -> pd.DataFrame:


I can see that the logic that lives here, which we were calling with super().fit(), is no longer used that way; you are now passing it on to the transformer, which makes sense for the requested change.

So I think we should remove the fit method from base numerical altogether. That way, we ensure we don't use it as legacy anywhere else in the source code.
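
One way to read this suggestion: the base class keeps only a validation helper that returns the checked data and the selected variables, and each transformer owns its fit(). The sketch below is illustrative and dependency-free. `BaseNumericalSketch`, `LogSketch`, and `_check_fit_input` are hypothetical names (the helper's real name is not visible in the diff above), and a dict of columns stands in for the DataFrame.

```python
class BaseNumericalSketch:
    """Hypothetical base class: offers a helper, but no fit() of its own."""

    def __init__(self, variables=None):
        self.variables = variables

    def _check_fit_input(self, X):
        # Hypothetical helper. It returns its results instead of storing
        # them on self, so the caller decides when to assign them.
        if not X:
            raise ValueError("X is empty")
        variables_ = list(X) if self.variables is None else list(self.variables)
        missing = [v for v in variables_ if v not in X]
        if missing:
            raise KeyError(f"variables not in X: {missing}")
        return X, variables_


class LogSketch(BaseNumericalSketch):
    """Hypothetical transformer that implements fit() locally."""

    def fit(self, X, y=None):
        X, variables_ = self._check_fit_input(X)
        # Transformer-specific check; may raise before anything is stored.
        for var in variables_:
            if any(value <= 0 for value in X[var]):
                raise ValueError(f"{var} contains zero or negative values")
        # Final step: expose the learned state.
        self.variables_ = variables_
        return self
```

Because the base class has no fit(), no subclass can fall back on a legacy implementation that assigns attributes early.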

direkkakkar319-ops · 2026-03-26T19:17:31Z

Hey @solegalli, thank you for the kind words and the feedback!
I'll finish updating all the discretisers and imputers right away so we can get this merged.

direkkakkar319-ops · 2026-03-26T19:17:47Z

I'll update this PR shortly. Thanks again for the guidance.

direkkakkar319-ops · 2026-03-26T21:12:28Z

Changes made

1. Removed fit() from the Base Class - The fit() method was completely removed from base_numerical.py.
2. Deferred Trailing Underscore Attribute Assignment - Updated the fit() method in all imputer and discretiser classes to use local variables for internal calculations. Attributes like variables_, imputer_dict_, and binner_dict_ are now assigned only as the final step of the method.
3. Refactored Subclasses to Implement Local fit() - Updated several transformation and scaling classes to implement their own fit() logic instead of calling super().fit().
4. Updated Mixins and Shared Logic - Refactored the FitFromDictMixin in mixins.py to defer the assignment of self.variables_.
5. Strengthened the Test Suite - Added test_raises_non_fitted_error_when_error_during_fit to the discretisation test suite and updated existing imputer tests. I also updated the MockClass in base transformer tests to reflect the new architecture.

Files changed:

Imputation: categorical.py, missing_indicator.py, random_sample.py, drop_missing_data.py, arbitrary_number.py.
Discretisation: geometric_width.py, decision_tree.py, arbitrary.py.
Transformation: log.py, yeojohnson.py, power.py, boxcox.py, arcsin.py, arcsinh.py.
Others: mean_normalization.py, cyclical_features.py.

Tests

The test cases are passing locally.

solegalli

Hey @direkkakkar319-ops

Thanks a lot for the quick turnaround!

I wasn't sure if you finished working on this PR, but I had a look anyways.

It's looking great.

We need to add tests for all the modules included in this PR. Currently, we need to add the test to the following:

Add test for the transformers in the creation module (like this, we ensure all transformers in creation are now OK)
Add test for the transformers in the transformation module
Add test for the scaler

I also updated the issue, to keep track of which modules have been updated up to now.

Thank you!

direkkakkar319-ops · 2026-03-27T13:58:00Z

Tests added

* Added tests for the transformers in the `creation` module; they pass locally.
* Added tests for the `transformers` in the `transformation` module; they pass locally.
* Added a test for the `scaler`; it passes locally.

I also ran the tests locally and they pass.

direkkakkar319-ops · 2026-03-27T14:10:35Z

Hi @solegalli, I added the 3 tests I was asked for and would appreciate your review.

direkkakkar319-ops added 4 commits March 22, 2026 00:33

added test case `def test_raises_non_fitted_error_when_error_during_f…

7d23f87

…it(estimator):`

fixing test case test_style

e1439cb

fixing test case test_style

f33a85a

fixing test case test_style

7a0e8f5

direkkakkar319-ops changed the title ~~added test case `def test_raises_non_fitted_error_when_error_during_f…~~ fix: ensure trailing underscore attributes are set only after fit() succeeds in imputers and discretisers Mar 22, 2026

direkkakkar319-ops changed the title ~~fix: ensure trailing underscore attributes are set only after fit() succeeds in imputers and discretisers~~ fix: defer trailing underscore attribute assignment in fit() for imputers and discretisers (closes #586) Mar 22, 2026

fix: defer trailing underscore attribute assignment in fit() for impu…

59c8bd1

…ters and discretisers (closes feature-engine#586)

direkkakkar319-ops marked this pull request as ready for review March 22, 2026 10:06

solegalli reviewed Mar 26, 2026

direkkakkar319-ops added 7 commits March 27, 2026 02:39

base transformers

fef5a3f

discretisation

2d3f734

scaling

f95cb7f

creation

9b2fa4c

imputation

64c9e74

transformation

50075fb

tests

f2e944f

direkkakkar319-ops added 4 commits March 27, 2026 02:44

creation

ca9241e

verified changes for checks

d813e2d

verified changes for checks

519ca1d

value error

44f75c0

solegalli changed the title ~~fix: defer trailing underscore attribute assignment in fit() for imputers and discretisers (closes #586)~~ [ENH] defer trailing underscore attribute assignment in fit() for imputers and discretisers Mar 27, 2026

solegalli reviewed Mar 27, 2026

ADDED:test_raises_non_fitted_error_when_error_during_fit

7b3c592

direkkakkar319-ops added 3 commits March 27, 2026 19:23

added:test_raises_non_fitted_error_when_error_during_fit

6babea2

addEd:test_raises_non_fitted_error_when_error_during_fit

c945dee

tranformers

b631f63

left

0c681f5

direkkakkar319-ops requested a review from solegalli March 27, 2026 14:10

direkkakkar319-ops wants to merge 21 commits into feature-engine:main from direkkakkar319-ops:issue#586
