
Pr1 price reward #31

Merged
rbx merged 2 commits into FairRootGroup:master from enlorenz:pr1-price_reward
Mar 12, 2026

Conversation

@enlorenz (Contributor)

No description provided.

Enis Lorenz added 2 commits March 11, 2026 15:42
- Both price and efficiency rewards now use "num_used_nodes" as a proxy for work done. This allows a more direct comparison: efficiency = work quality, price = price-aware work performance.
- Simplify the function and tweak constants for stronger responsiveness.
- Price overdrive is now bounded in [0, 1].
- Efficiency signal shifted to the range [-1, 1]; at 50% efficiency the reward is 0.
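The bounds described in the commit messages can be sketched as follows. This is an illustrative assumption of the shaping, not the actual code in src/reward_calculation.py; both function names are hypothetical.

```python
# Illustrative sketch of the reward shaping described in the commit
# messages; names and formulas are assumptions, not the real code.

def efficiency_reward(efficiency: float) -> float:
    """Shift an efficiency in [0, 1] to a signal in [-1, 1].

    At 50% efficiency the reward is exactly 0, as stated above.
    """
    return 2.0 * efficiency - 1.0

def price_overdrive(raw_overdrive: float) -> float:
    """Clamp the price-overdrive term to the bounded [0, 1] range."""
    return max(0.0, min(1.0, raw_overdrive))
```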

coderabbitai bot commented Mar 11, 2026

📝 Walkthrough

The reward_calculation.py file undergoes a refactoring of its price reward logic, transitioning from job-based to node-based saturation metrics. A negative-price overdrive mechanism is introduced with configurable capped/uncapped modes, alongside updated constants and method signatures reflecting the new terminology and behavior.

Changes

Cohort: Price Reward Logic Refactoring
File(s): src/reward_calculation.py
Summary: Replaced per-job saturation with per-used-nodes saturation in the price reward calculation. Updated the _reward_price() method signature to accept num_used_nodes instead of num_processed_jobs. Introduced negative-price overdrive with uncapped/capped modes, including new constants (NEGATIVE_PRICE_NODE_TAU, NEGATIVE_PRICE_OVERDRIVE_GAIN/FLOOR/MAX_REWARD/ALLOW_ABOVE_ONE). Renamed saturation constants (PRICE_JOB_TAU_* → PRICE_NODE_TAU_*) and increased PRICE_ADVANTAGE_GAIN from 3.0 to 4.0. Energy efficiency normalization now outputs to the [-1, 1] range. Updated the calculate() method to treat _num_processed_jobs as a legacy parameter.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 1 passed | ❌ 2 failed

❌ Failed checks (1 warning, 1 inconclusive)

- Description check ⚠️ Warning: No pull request description was provided by the author. Resolution: add a clear description explaining the changes, including the shift from per-job to per-node saturation, the new negative-price overdrive mechanism, and the affected methods/parameters.
- Title check ❓ Inconclusive: The title 'Pr1 price reward' is vague and does not convey the main changes (node-based saturation, negative-price overdrive, and parameter renaming). Resolution: consider a more descriptive title such as 'Refactor price reward logic to use node-based saturation and add negative-price overdrive'.

✅ Passed checks (1 passed)

- Docstring Coverage ✅ Passed: Docstring coverage is 100.00%, above the required 80.00% threshold.




@coderabbitai bot left a comment

🧹 Nitpick comments (1)
src/reward_calculation.py (1)

36-53: Clarify implications of rewards exceeding 1.0.

The uncapped mode (NEGATIVE_PRICE_OVERDRIVE_ALLOW_ABOVE_ONE = True) allows _reward_price to return values up to 1.5. This diverges from the normalized [-1, 1] range used by other reward components (e.g., efficiency, idle penalty). Downstream consumers or weighted aggregation logic may assume bounded [-1, 1] inputs.

Consider documenting whether this asymmetry is intentional and how it interacts with the weighted sum in calculate().
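One of the resolutions suggested here, clamping _reward_price before the weighted sum, could look like this minimal sketch (the helper name is hypothetical, not part of the PR):

```python
def clamp_price_term(reward_price: float) -> float:
    # Hypothetical helper: clamp the price reward to [-1, 1] before
    # the weighted aggregation in calculate(), so uncapped overdrive
    # values (up to 1.5) cannot dominate the other bounded components.
    return max(-1.0, min(1.0, reward_price))
```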

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/reward_calculation.py` around lines 36 - 53, The uncapped overdrive flag
NEGATIVE_PRICE_OVERDRIVE_ALLOW_ABOVE_ONE lets _reward_price produce values up to
NEGATIVE_PRICE_OVERDRIVE_MAX_REWARD (1.5), which breaks the implicit [-1,1]
normalization expected by other components and calculate(); either document this
behavior clearly and its intended interaction with the weighted aggregation
(mention NEGATIVE_PRICE_OVERDRIVE_ALLOW_ABOVE_ONE,
NEGATIVE_PRICE_OVERDRIVE_MAX_REWARD, _reward_price and calculate), or change
calculate() to explicitly handle the asymmetry (e.g., clamp/scale _reward_price
to [-1,1] before summing or adjust weights/normalization to accept >1) and add a
short inline comment explaining why the chosen approach preserves expected
reward semantics.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 07763696-03b2-4003-a191-0392a5c7d46b

📥 Commits

Reviewing files that changed from the base of the PR and between 1e983e8 and bb27aa4.

📒 Files selected for processing (1)
  • src/reward_calculation.py

@rbx (Member)
rbx commented Mar 12, 2026

Looks good overall, but I would address CodeRabbit's nitpick comment to document the intentional reward value range that can reach 1.5 in that case. Just a short comment in the code that it is indeed intentional (if it is).

@enlorenz (Contributor, Author) commented:

I mean... There is a comment just above

@rbx (Member) replied:

True!

@rbx rbx merged commit 7c585e9 into FairRootGroup:master Mar 12, 2026
4 checks passed
@enlorenz enlorenz deleted the pr1-price_reward branch March 12, 2026 11:30
