-
Notifications
You must be signed in to change notification settings - Fork 140
soundwire: cadence: Fixes for three interrupt races #5659
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: topic/sof-dev
Are you sure you want to change the base?
soundwire: cadence: Fixes for three interrupt races #5659
Conversation
…thread Clear the CDNS_MCP_INT_RX_WL interrupt before signalling completion. This is to prevent the possible race: - completion is signalled. - waiting thread is immediately scheduled and starts a new message. - new message completes and CDNS_MCP_INT_RX_WL is asserted. - reschedule to sdw_cdns_irq() thread, which then clears the new interrupt. - interrupt is never handled. Before this change, this error message was sometimes seen on kernels that have large amounts of debugging enabled: SCP Msg trf timed out This error indicates that the completion has not been signalled after 500ms. Signed-off-by: Richard Fitzgerald <[email protected]> Fixes: 956baa1 ("soundwire: cdns: Add sdw_master_ops and IO transfer support")
Remove the masking of CDNS_MCP_INT_SLAVE_MASK while the work thread runs. This is to avoid losing interrupts for state changes that happen while the work is running. The code assumed that any status change events that happened while CDNS_MCP_INT_SLAVE_MASK is masked would be latched and would assert the main interrupt when CDNS_MCP_INT_SLAVE_MASK is unmasked. We can avoid relying on this assumption by leaving CDNS_MCP_INT_SLAVE_MASK unmasked. There is no need to mask the interrupt while waiting for the work to run. The kernel will already filter out schedule_work() for work that is already pending. The worst that can happen is that the work can be re-queued while it is running, and so will run again when there are no more status changes. But the core code already has to filter out duplicate states because sdw_handle_slave_status() is given the current status of all devices, so some of those will usually be unchanged. Signed-off-by: Richard Fitzgerald <[email protected]> Fixes: 4a98a6b ("soundwire: intel/cadence: merge Soundwire interrupt handlers/threads")
Clear interrupts before handling them in sdw_cdns_irq() to prevent loasing events that re-triggered before they can be handled. Clearing interrupts at the end of sdw_cdns_irq() is a potential race. If the event re-triggered while sdw_cdns_irq() is still running it would be cleared when sdw_cdns_irq() ends, so the irq handler would not be re-triggered to handle it. Signed-off-by: Richard Fitzgerald <[email protected]> Fixes: 2f52a51 ("soundwire: cdns: Add cadence library")
|
@bardliao @shumingfan Do you have soak/stress tests you could run on Realtek so we can get more confidence this doesn't cause any new problems? |
|
@rfvirgil There are some
That means Device 0 is attached. Not sure if it is a real issue. |
| CDNS_MCP_INT_SLAVE_MASK, 0); | ||
|
|
||
| int_status &= ~CDNS_MCP_INT_SLAVE_MASK; | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rfvirgil @shumingfan Is it possible that a Peripheral can generate a new event while handling a special event and lead to an infinite loop in a corner case?
This series is to fix three places in the cadence IRQ handling where interrupt evens could be lost.
Two are variations of the same problem that the code cleared interrupts AFTER it handled them. This is not the correct way, because any interrupt that re-triggered while the irq handler function is running would then be cleared at the end of the function without ever having been handled.
The other case is to avoid masking the PING status change interrupt while waiting for the work function to handle it. There is a potential problem here if state changes that happen while masked are lost,