Skip to content

Improve the performance when using enumeration#8395

Open
aplopez wants to merge 9 commits intoSSSD:masterfrom
aplopez:enumerate
Open

Improve the performance when using enumeration#8395
aplopez wants to merge 9 commits intoSSSD:masterfrom
aplopez:enumerate

Conversation

@aplopez
Copy link
Contributor

@aplopez aplopez commented Jan 21, 2026

This PR includes:

  • Removal of an unused function.
  • Stop logging a possibly extremely long filter.
  • Fixes a wrong condition invalidating an optimization.
  • Adds a test case for an existing test.

Enumeration, specially when there are 15,000+ users, is slow. This fix helps, but it doesn't work miracles.
In my test environment, the enumeration went from 8 minutes to about 1.

It is important to know that, with such an amount of users, many operations time out. It is necessary to increment the timeout in[nss] and for the domain, but also set large values for ldap_enumeration_refresh_timeout and ldap_search_timeout in the domain. I used these values to avoid any timeout (YMMV):

[domain/ldap.test]
ldap_enumeration_refresh_timeout = 30000
ldap_search_timeout = 6000
timeout = 6000
...

[nss]
timeout = 6000
...

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request effectively improves performance by optimizing logging, removing an unused function, and correcting a condition related to enumeration. The changes are well-aligned with the stated goals of enhancing enumeration performance, especially for large user bases. The addition of a new test case for the general enumeration scenario ensures that the modified logic is adequately covered.

@alexey-tikhonov
Copy link
Member

Mistype in the commit message: "We must look into de TS cache"

@aplopez
Copy link
Contributor Author

aplopez commented Jan 22, 2026

Mistype in the commit message: "We must look into de TS cache"

Fixed.

@alexey-tikhonov
Copy link
Member

alexey-tikhonov commented Jan 23, 2026

I think fix is correct in the sense it fixes a bug.

But I think logic of sysdb_enumpwent_filter() can and should be improved in general to avoid a case when dn_filter expands to entire db.

In particular, if addtl_filter isn't set, then sysdb_search_ts_users(enum_filter(NULL)) is expected to return entire db, right? And using this as additional filter results in the same as '*' but extremely slow.
Or do I miss something?

@aplopez aplopez marked this pull request as ready for review February 24, 2026 13:19
@alexey-tikhonov alexey-tikhonov added the coverity Trigger a coverity scan label Feb 24, 2026
@alexey-tikhonov
Copy link
Member

Note: Covsan is green so far.

@alexey-tikhonov alexey-tikhonov removed the coverity Trigger a coverity scan label Feb 24, 2026
@alexey-tikhonov
Copy link
Member

Hm,
F44:

FAILED tests/test_infopipe.py::test_infopipe__list_by_name (ldap) - AssertionError: ListByName('user-*', 0) is missing element 10002
assert '/org/freedesktop/sssd/infopipe/Users/test/10002' in ['/org/freedesktop/sssd/infopipe/Users/test/10001', '/org/freedesktop/sssd/infopipe/Users/test/10003']

Looks relevant, but why f44 only... race condition?

@aplopez
Copy link
Contributor Author

aplopez commented Feb 25, 2026

Looks relevant, but why f44 only... race condition?

I reran the tests and a different test failed. 😮‍💨
Locally, on my PC (Fedora 43, though) the test passes every time.

@aplopez
Copy link
Contributor Author

aplopez commented Feb 26, 2026

And now all the tests passed. There is some instability in F44, but not related to this PR.

@alexey-tikhonov
Copy link
Member

And now all the tests passed. There is some instability in F44, but not related to this PR.

It is very suspicious that it was test_infopipe__list_by_name that I didn't see failing before.
Can there be a race condition in the test itself that is triggered by slow runner?

@aplopez
Copy link
Contributor Author

aplopez commented Feb 26, 2026

It is very suspicious that it was test_infopipe__list_by_name that I didn't see failing before. Can there be a race condition in the test itself that is triggered by slow runner?

I thought the same until I noticed this test failed once and never again. The second time a completely different test failed. The third time, the latest, none.

@sumit-bose
Copy link
Contributor

Hi,

I think I have no further comments to the code and it looks like the enumeration performance also improved. Did you run some enumeration test with and without the latest version patches as well. If yes, can you share the result here so that we have some value for future reference, if needed? If possible, results with sssd-2.7.3 would be nice as well.

bye,
Sumit

@aplopez
Copy link
Contributor Author

aplopez commented Mar 19, 2026

Did you run some enumeration test with and without the latest version patches as well. If yes, can you share the result here so that we have some value for future reference, if needed?

The results are instantaneous when the cache is populated, that is, when the task Enumeration [id] has finished.
While the task is running, the results are variable, but still under 10 seconds.

If possible, results with sssd-2.7.3 would be nice as well.

I don't have such an environment and since I'm running the tests on my own PC, switching to 2.7.3 is not a simple task.

Using a VM with 2.7.3 might be a posibility, but I don't know to which extent those results are comparable. If you think this is a viable way, let me know and I will give it a try.

@alexey-tikhonov
Copy link
Member

The "name" attribute weas not being added to the YS cache, even thought

weas - > were
YS -> TS
thought -> though

@alexey-tikhonov
Copy link
Member

handle this case correctly but why waisting time calling them?

waisting -> wasting

aplopez added 3 commits March 20, 2026 10:15
Create the filter to retrieve only the requested entries.

Do not create a new filter and search for matches if there is
no results from the previous search. The called functions
handle this case correctly but why wasting time calling them?
Function cache_req_user_by_filter_lookup() will set or not the recent
filter depending on whether data->name.attr is set or not. As mentioned
in the comment, it should be done base on whether the refernced
attribute is name or not.
The message said that sysdb_enumpwent() had failed, but it was
actually sysdb_enumpwent_filter() which failed.
@aplopez
Copy link
Contributor Author

aplopez commented Mar 20, 2026

Fixed all the typos.

@alexey-tikhonov
Copy link
Member

/gemini review

@alexey-tikhonov alexey-tikhonov added the coverity Trigger a coverity scan label Mar 20, 2026
Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces several changes to improve user enumeration performance, particularly in environments with a large number of users. The key changes include removing an unused function, correcting a condition to enable an optimization for name-based filtering, preventing the logging of potentially very long filter strings, and updating tests accordingly. The changes appear sound and should deliver the described performance benefits. I have one suggestion to improve the accuracy of a debug log message.

@alexey-tikhonov
Copy link
Member

Note: Covscan is green (barring mistype in the latest update).

@alexey-tikhonov alexey-tikhonov removed the coverity Trigger a coverity scan label Mar 20, 2026
aplopez added 2 commits March 20, 2026 12:29
The "name" attribute was not being added to the TS cache, even though
that it is part of the DN (ldb doesn't enforce it).

This made the if-block in sysdb_enumpwent_filter() rather useless.

In addition, once this if-block is executed, the fuction leaves without
further processing.
Although ts_res.count is set to 0 when sysdb_search_ts_users()
return ERR_NO_TS, before using it we make an extra check to verify
that the returned code is EOK.
@alexey-tikhonov
Copy link
Member

Thank you, ACK.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

NSS enumerated passwd/group truncated output and performance regression since >=2.8.0

3 participants