
Conversation

avan06 commented Aug 7, 2025

When converting music with basic-pitch, the output is truncated at the end for unknown reasons. The issue becomes more noticeable as the audio length increases. After repeated testing, I found that simply appending 20 seconds of silence to the input audio in the `predict` function of `basic_pitch/inference.py` prevents the issue from occurring.

Since basic-pitch trims trailing silence before outputting, the added 20 seconds of silence does not make the output longer.


Modified the `predict` function in `inference.py` to always append 20 seconds of silence to the input audio before running inference.

This prevents the model from incorrectly truncating the tail end of the audio, which was happening on long, continuous files due to CNN edge effects.
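A minimal sketch of the change described above (`append_silence` is a hypothetical helper name; it assumes the audio has already been loaded inside `predict` as a 1-D float array at sample rate `sr`):

```python
import numpy as np

def append_silence(audio: np.ndarray, sr: int, seconds: float = 20.0) -> np.ndarray:
    """Append `seconds` of silence so any tail truncation falls inside the padding."""
    silence = np.zeros(int(sr * seconds), dtype=audio.dtype)
    return np.concatenate([audio, silence])
```

Zero-padding only moves the region affected by edge effects into disposable silence; the transcribed content itself is unchanged.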
@hyperc54 (Collaborator)

Hi!
Thanks for raising this issue and starting the investigation.

It's good to know that adding 20 s of silence at the end seems to mitigate the issue, although I think we'd like to find the root cause before making any changes here, and I'm happy to help with that!

Could you share more details to help reproduce the issue (e.g. the audio file tested, the command run, etc.)?

avan06 (Author) commented Aug 15, 2025

Hi,

Sure, no problem. I can provide the details to reproduce this issue.

1. The music source I used is the following YouTube video, which I downloaded and converted to FLAC. Its length is 38:11.
   https://www.youtube.com/watch?v=JkWeyX7Hquc

2. Below is the script I used for testing:

```python
from basic_pitch.inference import predict
from basic_pitch import ICASSP_2022_MODEL_PATH

audio_path = "(GB)ONI V 隠忍を継ぐ者⧸Oni V: Innin wo Tsugumono-Soundtrack [JkWeyX7Hquc].flac"

model_output, midi_data, note_events = predict(
    audio_path,
    ICASSP_2022_MODEL_PATH,
    onset_threshold=0.55,
    frame_threshold=0.25,
    minimum_note_length=100,
    minimum_frequency=50,
    maximum_frequency=3000,
)

midi_data.write("ONI V.mid")
```

3. The result after running basic_pitch `predict` has a length of 37:53.
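A quick way to confirm the gap (a sketch, assuming librosa >= 0.10 and pretty_midi are installed; paths as in the script above):

```python
import librosa
import pretty_midi

audio_path = "(GB)ONI V 隠忍を継ぐ者⧸Oni V: Innin wo Tsugumono-Soundtrack [JkWeyX7Hquc].flac"

audio_seconds = librosa.get_duration(path=audio_path)              # ~2291 s (38:11)
midi_seconds = pretty_midi.PrettyMIDI("ONI V.mid").get_end_time()  # ~2273 s (37:53)
print(f"audio ends at {audio_seconds:.1f} s; last MIDI event at {midi_seconds:.1f} s")
```

Here the MIDI ends about 18 seconds before the audio does.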

Please help confirm, thank you.


By the way, below is the log from my execution:

```
>python test.py
WARNING:root:Coremltools is not installed. If you plan to use a CoreML Saved Model, reinstall basic-pitch with `pip install 'basic-pitch[coreml]'`
WARNING:root:tflite-runtime is not installed. If you plan to use a TFLite Model, reinstall basic-pitch with `pip install 'basic-pitch tflite-runtime'` or `pip install 'basic-pitch[tf]'`
WARNING:root:Tensorflow is not installed. If you plan to use a TF Saved Model, reinstall basic-pitch with `pip install 'basic-pitch[tf]'`
Predicting MIDI for (GB)ONI V 隠忍を継ぐ者⧸Oni V: Innin wo Tsugumono-Soundtrack [JkWeyX7Hquc].flac...
```

@hyperc54 (Collaborator)

Hi @avan06,

Sorry for the delay in answering.
I was able to find the root cause of the issue you're highlighting and have tentatively fixed it in this PR. If the team approves of the fix, I would suggest we close your PR.

Let me know if you would like to co-author the commits in the other PR I created, as a thank-you for your contribution in finding the bug!

avan06 (Author) commented Oct 29, 2025

Hi @hyperc54,
No problem on my side — as long as the issue is resolved, that’s what matters. Many thanks to the team for your hard work!

avan06 closed this Oct 29, 2025