feat: add `errors` parameter to CategoricalImputer for multimodal variables by direkkakkar319-ops · Pull Request #908 · feature-engine/feature_engine

direkkakkar319-ops · 2026-03-08T14:33:06Z

Description

Fixes #904

Adds an errors parameter to CategoricalImputer to handle multimodal categorical variables gracefully, instead of always raising a ValueError.

Changes

Added errors parameter to CategoricalImputer.__init__() with options 'raise' (default), 'warn', and 'ignore'
Updated both single-variable and multi-variable branches in .fit() to respect the new parameter
When errors='warn', emits a UserWarning and imputes using the first most frequent category found
When errors='ignore', silently imputes using the first most frequent category found
Default errors='raise' preserves existing behaviour — no breaking changes
Updated docstring with full parameter documentation
Added tests covering all three errors values and invalid input
Added CHANGELOG.rst entry
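
The behaviour listed above can be illustrated with a minimal standalone sketch. It is not the code in feature_engine/imputation/categorical.py: the helper name `most_frequent_category`, the use of `None` as the missing marker, the sorted tie-breaking, and the shortened message text are all assumptions made for illustration, and the parameter keeps the name `errors` used in this description.

```python
import warnings
from collections import Counter


def most_frequent_category(values, errors="raise"):
    """Hypothetical helper: pick the category to impute with for one variable."""
    if errors not in ("raise", "warn", "ignore"):
        raise ValueError("errors takes only values 'raise', 'warn', or 'ignore'.")

    # Count non-missing values only; None stands in for a missing entry here.
    counts = Counter(v for v in values if v is not None)
    top = max(counts.values())
    # Sort the tied categories so that the "first" one is deterministic.
    modes = sorted(cat for cat, n in counts.items() if n == top)

    if len(modes) > 1:
        # Illustrative message, shortened from the one quoted in this PR.
        msg = "The variable contains multiple frequent categories."
        if errors == "raise":
            raise ValueError(msg)
        if errors == "warn":
            warnings.warn(msg, UserWarning)
    return modes[0]


cities = ["London", "London", "Paris", "Paris", "Berlin", None]
print(most_frequent_category(cities, errors="ignore"))  # London
```

With the default `errors="raise"`, the same call raises a ValueError, which mirrors the "no breaking changes" claim above.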

Type of Change

Bug fix
New feature (non-breaking)
Breaking change
Documentation update

Tests

All new and existing tests pass:

pytest tests/test_imputation/test_categorical_imputer.py

Notes

The existing error message ("contains multiple frequent categories") is preserved so no existing test matchers break.

codecov · 2026-03-08T15:24:46Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.27%. Comparing base (f72a2b7) to head (6f5b4da).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #908   +/-   ##
=======================================
  Coverage   98.27%   98.27%           
=======================================
  Files         116      116           
  Lines        4978     4988   +10     
  Branches      795      800    +5     
=======================================
+ Hits         4892     4902   +10     
  Misses         55       55           
  Partials       31       31

direkkakkar319-ops · 2026-03-08T15:32:01Z

@solegalli I am facing issues with the 2 checks. I will work on this and move the PR to review in some time.

direkkakkar319-ops · 2026-03-08T17:35:56Z

The test pass percentage has increased but is still below the set benchmark.
I am trying to resolve the problems.

solegalli

Hey @direkkakkar319-ops

Thank you very much for taking care of this issue and a lot of apologies for taking so long to reply.

As mentioned previously, I am travelling, currently in the last month, and I do not always have the time or internet to take care of feature-engine as I would like to.

Anyhow, I went over this PR now, the logic is there, thanks a lot for the implementation.

The tests need a bit of tidying but you covered every angle as well, thanks a lot for that.

Please see my comments below.

Would you be able to take a look at them?

PS: I'm back in Europe from April 13th so I will have normal capacity from then on :)

docs/whats_new/v_190.rst

solegalli · 2026-03-26T15:33:11Z

feature_engine/imputation/categorical.py

-    _fit_transform_docstring,
-    _transform_imputers_docstring,
-)
+    _feature_names_in_docstring, _imputer_dict_docstring,


Please restore to previous format.

restored the previous format

solegalli · 2026-03-26T15:33:24Z

feature_engine/imputation/categorical.py

-    find_all_variables,
-    find_categorical_variables,
-)
+from feature_engine.variable_handling import (check_all_variables,


please restore to previous format

restored the previous format

solegalli · 2026-03-26T15:35:35Z

feature_engine/imputation/categorical.py

        type object or categorical. If True, the imputer will select all variables or
        accept all variables entered by the user, including those cast as numeric.

+    errors : str, default='raise'


Instead of "errors", let's call this parameter "multimodal" so it is immediately obvious what it is about.

Renamed errors to multimodal.

solegalli · 2026-03-26T15:36:09Z

feature_engine/imputation/categorical.py

        accept all variables entered by the user, including those cast as numeric.

+    errors : str, default='raise'
+        Indicates what to do when the selected imputation_method='frequent'


Suggested change

-        Indicates what to do when the selected imputation_method='frequent'
+        Indicates what to do when `imputation_method='frequent'`

Applied the suggestion.

solegalli · 2026-03-26T16:39:55Z

tests/test_imputation/test_categorical_imputer.py

+    imputer = CategoricalImputer(imputation_method="missing", errors="warn")
+    # Should fit without warnings since there's no mode computation
+    with warnings.catch_warnings():
+        warnings.simplefilter("error")


I am not sure we are testing that there were no warnings. Could you pls check?

This new approach specifically checks that no warnings containing the message "multiple frequent categories" are raised when imputation_method='missing', even if errors='warn' is set.

Test cases are passing.
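
The check described in this reply can be sketched with the standard library alone. This is an assumption-laden illustration, not the test in the PR: `fit_without_mode_computation` and `fit_that_warns` are hypothetical stand-ins for calling fit() on the imputer, and `multimodal_warnings` is a made-up helper name.

```python
import warnings


def fit_without_mode_computation():
    """Hypothetical stand-in for fit() with imputation_method='missing'."""
    return "Missing"


def fit_that_warns():
    """Hypothetical stand-in for fit() on a multimodal variable with warn set."""
    warnings.warn(
        "The variable city contains multiple frequent categories.", UserWarning
    )


def multimodal_warnings(func):
    """Run func and return the recorded warnings that mention multimodality."""
    with warnings.catch_warnings(record=True) as record:
        # Record every warning, even ones that were already shown once.
        warnings.simplefilter("always")
        func()
    return [w for w in record if "multiple frequent categories" in str(w.message)]


print(len(multimodal_warnings(fit_without_mode_computation)))  # 0
print(len(multimodal_warnings(fit_that_warns)))  # 1
```

Filtering on the message text keeps the assertion focused on this feature, so an unrelated warning from another library would not make the test fail.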

solegalli · 2026-03-26T16:40:26Z

tests/test_imputation/test_categorical_imputer.py

+
+def test_errors_ignore_single_variable():
+    """errors='ignore' on single multimodal variable — silent, uses first mode."""
+    X = pd.DataFrame(


Do we need this test? the logic of the transformer is tested in previous tests

Removed test_errors_ignore_single_variable and test_errors_ignore_multiple_variables.

test cases are passing

solegalli · 2026-03-26T16:41:24Z

tests/test_imputation/test_categorical_imputer.py

+
+
+def test_errors_ignore_multiple_variables():
+    """errors='ignore' on multiple multimodal variables — silent, uses first mode."""


do we need this test? the logic of the imputation is tested in previous tests

solegalli · 2026-03-26T16:42:26Z

tests/test_imputation/test_categorical_imputer.py

+    assert imputer.imputer_dict_["country"] == X["country"].mode()[0]
+
+
+# =============================================================================


thanks for highlighting this but pls remove this commented block.

removed the commented block

solegalli · 2026-03-26T16:43:33Z

tests/test_imputation/test_categorical_imputer.py

+    """
+    Covers the warnings.warn() inside the SINGLE-VARIABLE block of fit().
+
+    The existing test_errors_warn_emits_userwarning uses multimodal_df (2 columns),


thanks for picking this up. Instead of lengthy comments, we should try and capture the essence of the test in the test name.

yes will take care of it

direkkakkar319-ops · 2026-03-26T19:07:34Z

Made formatting changes to pass the ci:test_style check, from this

            "city": ["London", "London", "Paris", "Paris", "Berlin", "Berlin", "Madrid"],
            "country": ["UK", "UK", "FR", "FR", "DE", "DE", "ES"],
            "one_mode": ["London", "London", "London", "Paris", "Paris", "Berlin", "Berlin"],

to

            "city": [
                "London", "London", "Paris", "Paris", "Berlin", "Berlin", "Madrid"
            ],
            "country": ["UK", "UK", "FR", "FR", "DE", "DE", "ES"],
            "one_mode": [
                "London", "London", "London", "Paris", "Paris", "Berlin", "Berlin"
            ],

direkkakkar319-ops · 2026-03-26T19:28:02Z

Hey @solegalli, the only failing checks are the Codecov coverage ones.
Could you guide me on which test file I should add the coverage for the new errors='warn' and errors='raise' branches?

solegalli · 2026-03-27T12:22:42Z

feature_engine/imputation/categorical.py

-    _transform_imputers_docstring,
+    _variables_attribute_docstring
 )
+from feature_engine._docstrings.methods import (_fit_transform_docstring,


please make format match other imports

feature_engine/imputation/categorical.py

solegalli · 2026-03-27T12:24:47Z

feature_engine/imputation/categorical.py

        variables: Union[None, int, str, List[Union[str, int]]] = None,
        return_object: bool = False,
        ignore_format: bool = False,
+        errors: str = "raise",


Suggested change

-        errors: str = "raise",
+        multimodal: str = "raise",

solegalli · 2026-03-27T12:25:03Z

feature_engine/imputation/categorical.py

        if not isinstance(ignore_format, bool):
            raise ValueError("ignore_format takes only booleans True and False")

+        if errors not in ["raise", "warn", "ignore"]:


we need to update all of these errors to multimodal :)

solegalli · 2026-03-27T12:28:41Z

tests/test_imputation/test_categorical_imputer.py

+    )
    imputer = CategoricalImputer(imputation_method="frequent", variables="Name")
-    with pytest.raises(ValueError) as record:
+    with pytest.raises(ValueError, match=re.escape(msg)):


instead of making the test now dependent on re, we can test that it matches just part of the error message, 1 line, that's all we need :)

solegalli · 2026-03-27T12:29:28Z

tests/test_imputation/test_categorical_imputer.py

+    imputer = CategoricalImputer(imputation_method="frequent")
+    msg = (
+        "The variable(s) city, country contain(s) multiple frequent categories. "
+        "Set errors='warn' or errors='ignore' to allow imputation "


we can remove the 2nd and 3rd line and then we don't need re
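
A short sketch of why trimming the message removes the need for re. The `match` argument of pytest.raises is applied with re.search, so the parentheses in "variable(s)" act as a regex group unless they are escaped, whereas a fragment without special characters matches unchanged. The snippet uses only the standard library and the first line of the message quoted in the diff above; the variable names are illustrative.

```python
import re

msg = "The variable(s) city, country contain(s) multiple frequent categories."

# As a regex, "(s)" is a group matching a plain "s", so the literal
# parentheses in the message never match.
unescaped = re.search("The variable(s) city", msg)
# Escaping the pattern matches the literal text, but the test module
# then has to import re.
escaped = re.search(re.escape("The variable(s) city"), msg)
# A fragment free of regex metacharacters can be passed to match= as is.
fragment = re.search("multiple frequent categories", msg)

print(unescaped is None, escaped is not None, fragment is not None)
# True True True
```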

solegalli · 2026-03-27T12:29:53Z

tests/test_imputation/test_categorical_imputer.py

+
+@pytest.mark.parametrize("errors", ["warn", "ignore"])
+def test_multimodal_imputation_result(multimodal_df, errors):
+    """Check that result is the same when errors='warn' or 'ignore'."""


pls remove comment

solegalli · 2026-03-27T12:30:31Z

tests/test_imputation/test_categorical_imputer.py

+
+@pytest.mark.parametrize("errors", ["bad_value", 1, True])
+def test_errors_invalid_value_raises(errors):
+    """Passing an unsupported value for errors should raise ValueError at init."""


pls remove comments from all tests :)

solegalli

Hey @direkkakkar319-ops

Thank you so much for the quick turnaround. Really appreciate it.

I saw you changed errors to multimodal in the docstring. We also need to change the parameter name and hence the logic throughout. Could you do that?

I think there is one part of the warn logic that is not being tested. I am not sure, but do you need to test multimodal=warn when passing 1 variable name to the variables parameter?

The logic in the transformer is split in 2: when variables gets 1 variable, or otherwise. In one of the 2, we are not testing multimodal=warn, I think.

direkkakkar319-ops · 2026-03-27T14:19:58Z

Changes made

Renamed errors to multimodal in categorical.py and test_categorical_imputer.py
Added the missing unit test test_warning_when_single_variable_in_list_is_multimodal in test_categorical_imputer.py

Test cases are passing locally.

direkkakkar319-ops · 2026-03-27T14:22:32Z

Hi @solegalli, I will make the changes requested in your comments shortly.

solegalli · 2026-03-28T12:11:09Z

Hi @solegalli, I will make the changes requested in your comments shortly.

Alright! No problem at all. Give me a shout when you are ready! Thanks a lot for your support.

direkkakkar319-ops · 2026-03-28T18:59:39Z

Hi @solegalli, the changes have been added and the PR is now ready for your review.

feat(CategoricalImputer): add errors param to handle multimodal variables (feature-engine#904)

fb230fe

direkkakkar319-ops marked this pull request as draft March 8, 2026 14:36

style: fix flake8 line length in CategoricalImputer

81be348

direkkakkar319-ops marked this pull request as ready for review March 8, 2026 15:11

direkkakkar319-ops marked this pull request as draft March 8, 2026 15:14

style: fix import order and duplicate pandas import

4fb5b7a

direkkakkar319-ops marked this pull request as ready for review March 8, 2026 16:35

direkkakkar319-ops added 2 commits March 8, 2026 22:49

test: add coverage for errors='ignore' branches

835133f

style: add missing newline at end of test file

81f31d8

direkkakkar319-ops marked this pull request as draft March 8, 2026 17:26

Changes for codedev tests

657de1f

direkkakkar319-ops marked this pull request as ready for review March 9, 2026 19:06

added space at last of test_categorical_imputer.py

a0ea71d

solegalli reviewed Mar 26, 2026

direkkakkar319-ops added 13 commits March 26, 2026 22:53

Revert docs/whats_new/v_190.rst to upstream version

0cdcf03

changes done to feature_engine/imputation/categorical.py

cf7670e

changes made to tests/test_imputation/test_categorical_imputer.py

fb2f8db

resolved comment done on R15

97d6053

reformated the error tests to match the error from within pytest

c454edd

made three tests in on test

5992d09

left change

85b1974

refaactored the multimodal tests

09429f3

refactored test_errors_invalid_value_raises

0b86cfa

changed the function `test_errors_param_ignored_when_imputation_method_is_missing`

45f4e2f

removed test_errors_ignore_single_variable `test_errors_ignore_multiple_variables`

cda93e7

emove the commented block

04be1a0

last few changes made

94643d8

solegalli reviewed Mar 27, 2026

solegalli reviewed Mar 27, 2026

solegalli reviewed Mar 27, 2026

Renamed errors to multimodal in CategoricalImputer and add missing test

6ba7fce

direkkakkar319-ops and others added 12 commits March 27, 2026 19:56

Apply suggestion from @solegalli

1a3fde2

Co-authored-by: Soledad Galli <solegalli@protonmail.com>

Apply suggestion from @solegalli

36eb1dc

Co-authored-by: Soledad Galli <solegalli@protonmail.com>

Update categorical.py

aa37d19

removed comments and added tests

3e58d8b

Merge branch 'issue-904-categorical-imputer-multimodal' of https://github.com/direkkakkar319-ops/feature_engine into issue-904-categorical-imputer-multimodal

6746429

Update .gitignore

c77e8f1

removed the spaces

a22f586

Merge branch 'issue-904-categorical-imputer-multimodal' of https://github.com/direkkakkar319-ops/feature_engine into issue-904-categorical-imputer-multimodal

51f8276

removed the spaces

7156d28

simplified the test case as asked

5d65fe8

simplified the test case as asked

a95f5e0

simplified the test case as asked

6f5b4da

direkkakkar319-ops requested a review from solegalli March 28, 2026 18:59
