fix: correctly parse headerless _motion.tsv files #420

Open
Yehuda-Bergstein wants to merge 2 commits into bids-standard:main from Yehuda-Bergstein:fix-motion-tsv-headers

Conversation

@Yehuda-Bergstein

Fixes #394

Description

This PR fixes a bug where the Deno validator incorrectly parsed the first row of data in
_motion.tsv files as column headers, resulting in false-positive
CUSTOM_COLUMN_WITHOUT_DESCRIPTION errors.

According to BEP029, _motion.tsv files are uncompressed and explicitly headerless. However,
the standard _loadTSV function was previously hardcoded to always consume the first row as
headers.

Changes Made

  1. src/files/tsv.ts: Introduced a headerless boolean parameter (default false) to
    _loadTSV and its memoized wrappers. When headerless is true, the parser safely bypasses
    duplicate header checks, auto-generates dummy columns to preserve table structure, and retains
    the first row of numerical data.
  2. src/schema/context.ts: Inside BIDSContext.loadColumns(), if the file suffix is
    motion, it now explicitly calls the TSV parser with headerless: true. After structural
    validation, it dynamically clears the parsed columns map for motion files so that
    evalColumns ignores them (mirroring the legacy validator's behavior).
  3. src/schema/associations.ts & Tests: Updated existing data loaders and test calls to
    explicitly pass false to maintain standard behavior for all other BIDS TSVs.
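
The headerless behavior described for src/files/tsv.ts can be sketched roughly as follows. This is a standalone illustration, not the code in this PR: parseTSV, the Columns alias, and the column_N placeholder names are hypothetical and chosen only for the example.

```typescript
// Hypothetical sketch; not the validator's actual _loadTSV implementation.
type Columns = Map<string, string[]>

function parseTSV(text: string, headerless = false): Columns {
  const rows = text
    .split(/\r?\n/)
    .filter((line) => line.length > 0)
    .map((line) => line.split('\t'))
  const columns: Columns = new Map()
  const [first] = rows
  if (first === undefined) return columns

  // With a header row, the first line names the columns and is consumed.
  // Without one, generate placeholder names and keep every row as data.
  const names = headerless ? first.map((_, i) => `column_${i}`) : first
  const dataRows = headerless ? rows : rows.slice(1)

  // The duplicate-header check only makes sense when a header row exists.
  if (!headerless && new Set(names).size !== names.length) {
    throw new Error('duplicate column header')
  }

  names.forEach((name, i) => {
    columns.set(name, dataRows.map((row) => row[i] ?? ''))
  })
  return columns
}
```

With headerless set to true, an input such as '0.1\t0.1\n0.2\t0.3\n' yields two columns of two values each. With the default, the same input throws in this sketch, because the first data row contains repeated values and is mistaken for a header row with duplicate names.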

Testing

  • Ran validation locally against real-world VR tracking datasets containing _motion.tsv
    files; verified the custom column errors disappeared.
  • Successfully ran the full test suite (deno task test against the bids-examples submodule) with 349 passing tests and 0 failures.

@codecov

codecov Bot commented Jun 29, 2026

Codecov Report

❌ Patch coverage is 86.36364% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 87.41%. Comparing base (549f03c) to head (b6566ff).
⚠️ Report is 2 commits behind head on main.

Files with missing lines     Patch %   Lines
src/schema/associations.ts   0.00%     3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #420      +/-   ##
==========================================
+ Coverage   87.38%   87.41%   +0.03%     
==========================================
  Files          65       65              
  Lines        4779     4793      +14     
  Branches      782      787       +5     
==========================================
+ Hits         4176     4190      +14     
  Misses        510      510              
  Partials       93       93              

@effigies effigies left a comment

Contributor

Thanks for this! I think the cleaner patch would actually be to loadTSVGZ, immediately above, making the .pipeThrough(new DecompressionStream('gzip')) call optional.

I would suggest we give it a more general name, e.g., loadHeaderlessTSV, with a compression parameter. You'll also need to extract the headers from the associated channels.tsv file, which will involve adding a name field to:

channels: async (file: BIDSFile, options: LoadOptions): Promise<Channels> => {
  const columns = await loadTSV(file, options.maxRows)
    .catch((_e) => {
      return new Map()
    })
  return {
    path: file.path,
    type: columns.get('type'),
    short_channel: columns.get('short_channel'),
    sampling_frequency: columns.get('sampling_frequency'),
  }
},

You can then pass in this.associations.channels.name as the headers.
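
A minimal sketch of this suggestion, under several assumptions: the function takes a Blob rather than the validator's BIDSFile, the signature and the applyHeaders helper are hypothetical, and maxRows handling and memoization are left out. It only illustrates making the decompression step optional and supplying the headers from outside the file.

```typescript
// Hypothetical sketch of the suggestion above; names and signatures are assumptions.
type Columns = Map<string, string[]>

// Pair externally supplied headers (e.g. the name column of channels.tsv)
// with the columns of a headerless table.
function applyHeaders(rows: string[][], headers: string[]): Columns {
  rows.forEach((row, i) => {
    if (row.length !== headers.length) {
      throw new Error(
        `row ${i + 1}: expected ${headers.length} values, found ${row.length}`,
      )
    }
  })
  const columns: Columns = new Map()
  headers.forEach((name, i) => {
    columns.set(name, rows.map((row) => row[i] ?? ''))
  })
  return columns
}

async function loadHeaderlessTSV(
  file: Blob,
  headers: string[],
  compression?: 'gzip',
): Promise<Columns> {
  // Decompress only when asked to, so plain and gzipped files share one code path.
  const bytes = compression
    ? file.stream().pipeThrough(new DecompressionStream(compression))
    : file.stream()
  const text = await new Response(bytes).text()
  const rows = text
    .split(/\r?\n/)
    .filter((line) => line.length > 0)
    .map((line) => line.split('\t'))
  return applyHeaders(rows, headers)
}
```

For a motion file, the call would omit the compression argument and pass the channel names as headers; a gzipped headerless file would pass 'gzip'. Checking the row width against the header count is a choice made for this sketch, not something the comment above asks for.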

Comment thread src/schema/context.ts
Comment on lines +271 to +273
if (headerless) {
  this.columns = new ColumnsMap() as Record<string, string[]>
}

Contributor

Here you undo the load. That's not going to work.

Development

Successfully merging this pull request may close these issues.

TSV_COLUMN_HEADER_DUPLICATE reported for valid headerless *_motion.tsv files

2 participants