@uttam282005

Fixes #3808

Description

This PR changes the URL format for license and rule references in scan outputs from using the hardcoded develop branch to version-tagged URLs.

Changes Made

Code Changes:

  • Modified src/scancode/api.py to use v{scancode_version} in SCANCODE_DATA_BASE_URL instead of the hardcoded develop branch
  • This ensures that rule_url and license_url fields in scan outputs point to the exact version of license/rule data that matches the installed ScanCode version
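
As a rough illustration of the change described above, here is a minimal sketch of building a version-tagged URL. The function names and the `REPO_URL` constant are hypothetical and chosen only for this example; the real code in src/scancode/api.py may be organized differently. The URL shape follows the example output shown in this PR.

```python
# Hypothetical sketch; not the actual contents of src/scancode/api.py.
# REPO_URL is copied from the example output in this PR description.
REPO_URL = "https://github.com/nexB/scancode-toolkit"


def build_data_base_url(scancode_version: str) -> str:
    """Return the licensedcode data base URL pinned to a release tag."""
    # Previously the path segment after /tree/ was the literal "develop".
    return f"{REPO_URL}/tree/v{scancode_version}/src/licensedcode/data"


def build_rule_url(scancode_version: str, rule_identifier: str) -> str:
    """Return the URL of a single rule file for the given version."""
    return f"{build_data_base_url(scancode_version)}/rules/{rule_identifier}"
```

Calling `build_rule_url("32.4.1", "mit_31.RULE")` would then yield a URL that contains `tree/v32.4.1` rather than `tree/develop`.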

Test Updates:

  • Regenerated all test fixture files to reflect the new URL format
  • All changes in test files are mechanical URL updates from tree/develop to tree/v32.4.1
  • No functional changes to test logic

Documentation:

  • Documentation files in docs/ intentionally keep their develop branch links so that they stay evergreen and reflect the current state

Motivation

Using version-tagged URLs provides:

  • Reproducibility: Users can always access the exact license/rule data that was used for their scan
  • Accuracy: URLs in scan outputs match the installed version
  • Traceability: Bug reports and issues can reference the correct version of data files

Example Output Change

Before:

"rule_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/mit_31.RULE"

After:

"rule_url": "https://github.com/nexB/scancode-toolkit/tree/v32.4.1/src/licensedcode/data/rules/mit_31.RULE"

Tasks

  • Reviewed contribution guidelines
  • PR is descriptively titled and links the original issue above
  • Tests pass -- look for a green checkbox a few minutes after opening your PR
    Run tests locally to check for errors.
  • Commits are in a uniquely named feature branch and have no merge conflicts
  • Updated documentation pages (if applicable)
  • Updated CHANGELOG.rst (if applicable)

Copilot AI review requested due to automatic review settings December 10, 2025 17:39

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@uttam282005 uttam282005 requested a review from Copilot December 10, 2025 18:12

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@uttam282005
Author

Hi @AyanSinhaMahapatra, could you review this PR, please?

@stefan6419846

Having the current version in the expected data files sounds like a large amount of work to do for each new release.

@uttam282005
Author

@stefan6419846 I think we could automate generating new expected files as part of the existing release workflow.

It would run tests with SCANCODE_REGEN_TEST_FIXTURES=yes, commit the updated fixtures, and then run tests again to verify everything passes.
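
The three steps could be sketched roughly as follows. This is only an illustration under assumptions: the helper names, the `tests` path, the use of `pytest` as the runner, and the commit message are all hypothetical; only the SCANCODE_REGEN_TEST_FIXTURES=yes variable comes from the proposal itself.

```python
import os
import subprocess


def regen_steps(test_path="tests"):
    """Return (command, extra_env) pairs for the proposed release step."""
    return [
        # 1. Run the tests in regeneration mode so expected files are rewritten.
        (["pytest", test_path], {"SCANCODE_REGEN_TEST_FIXTURES": "yes"}),
        # 2. Commit the regenerated fixtures (tracked files only).
        (["git", "commit", "-am", "Regenerate test fixtures for release"], {}),
        # 3. Run the tests again without the variable to verify the fixtures.
        (["pytest", test_path], {}),
    ]


def run_steps(steps):
    """Run each step in order, stopping at the first failure."""
    for command, extra_env in steps:
        env = {**os.environ, **extra_env}
        subprocess.run(command, env=env, check=True)
```

Keeping the step list separate from its execution makes the sequence easy to inspect or dry-run before wiring it into a release workflow.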

What do you think of this approach? I'm happy to implement it if you think versioned URLs are worth the added automation step.

Alternatively, if you prefer to keep things simpler, I can close this PR and we can stick with develop branch URLs.

@stefan6419846

I honestly do not think that updating a large number of files during a release workflow is a good idea. This just bloats the diff. Thus I proposed a placeholder or automated replacement approach in the affected test implementations, while ensuring that one dynamic Python-only test still exists to check that the correct values are written.
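
For illustration only, a placeholder approach along these lines might look like the sketch below. The `{scancode_version}` marker, the regular expression, and the helper names are assumptions made for this example and are not existing ScanCode test utilities. The pattern only covers plain numeric tags such as v32.4.1.

```python
import re

# Hypothetical marker stored in expected files instead of a concrete version.
VERSION_PLACEHOLDER = "{scancode_version}"

# Matches the tag segment in URLs such as .../tree/v32.4.1/src/licensedcode/...
TAGGED_TREE = re.compile(r"(/tree/)v\d+(?:\.\d+)*(/src/licensedcode/)")


def normalize_urls(text: str) -> str:
    """Replace the release tag in data URLs with the placeholder."""
    return TAGGED_TREE.sub(
        lambda m: f"{m.group(1)}v{VERSION_PLACEHOLDER}{m.group(2)}", text
    )


def has_version_tag(text: str, version: str) -> bool:
    """Dynamic check: the output really points at the given version tag."""
    return f"/tree/v{version}/src/licensedcode/" in text
```

With this shape, expected files would store the placeholder and stay unchanged across releases, while a single test using something like `has_version_tag` would still verify the real value.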

The final decision is with the maintainers.

> Alternatively, if you prefer to keep things simpler, I can close this PR and we can stick with develop branch URLs.

I am the reporter of the original linked issue, thus I am of course still interested in a proper maintainable solution.

@uttam282005
Author

Let's wait for the maintainers before doing significant rework, but I think your approach is much cleaner than mass file updates.

Development

Successfully merging this pull request may close these issues.

Reported rule URLs refer to develop branch instead of corresponding tagged release