Fix audio truncation by adding 20-second silent buffer #173
When converting music with basic-pitch, the end of the output is mistakenly truncated for unknown reasons. The issue becomes more noticeable as the audio length increases. Repeated testing showed that simply adding 20 seconds of silence to the input audio in the `predict` function of `basic_pitch/inference.py` prevents the truncation. Since basic-pitch trims silence before outputting, the added 20 seconds of silence does not result in a longer output.
Modified the `predict` function in `inference.py` to always append 20 seconds of silence to the input audio before running inference. This prevents the model from incorrectly truncating the tail end of the audio, which was happening on long, continuous files due to CNN edge effects.
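
For reviewers who want to see the padding step in isolation, here is a minimal sketch of appending a silent buffer to a mono signal with NumPy. It is not the code from this PR: the helper name `pad_with_silence`, the 22050 Hz sample rate, and the test tone are assumptions made for illustration, and the actual change lives inside `predict`.

```python
import numpy as np

SILENCE_SECONDS = 20.0  # buffer length given in the PR description


def pad_with_silence(
    audio: np.ndarray, sample_rate: int, seconds: float = SILENCE_SECONDS
) -> np.ndarray:
    """Return a copy of a mono signal with `seconds` of zeros appended.

    Hypothetical helper for illustration; not part of basic-pitch.
    """
    if audio.ndim != 1:
        raise ValueError("expected a 1-D mono signal")
    n_pad = int(round(seconds * sample_rate))
    # Match the input dtype so concatenation does not upcast the signal.
    silence = np.zeros(n_pad, dtype=audio.dtype)
    return np.concatenate([audio, silence])


# Example input: 3 seconds of a 440 Hz tone at an assumed 22050 Hz sample rate.
sample_rate = 22050
t = np.arange(3 * sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t).astype(np.float32)

padded = pad_with_silence(tone, sample_rate)
```

The original samples are left untouched and only zeros are appended, so any notes detected in the padded region would have to come from the model itself rather than from the input.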