[ENH] Parallelize solver input preparation using ThreadPoolExecutor#55

Open
Leguark wants to merge 4 commits into optimize_grad_vs_scalar from optimize_yield_cov
Conversation

@Leguark (Member) commented Mar 13, 2026

[ENH] Parallelize solver input preparation using ThreadPoolExecutor

  • Refactored stack operations to enable parallel preparation of solver inputs.
  • Introduced thread-safe handling of stack_structure to avoid race conditions.
  • Improved efficiency and scalability by replacing sequential loops with concurrent execution.
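The pattern described above can be sketched as follows. This is a minimal illustration, not the PR's actual code: `prepare_solver_input` and the shape of `stack_structure` are hypothetical placeholders; the real point is the lock around the shared structure and the thread pool replacing a sequential loop.

```python
# Sketch of the parallel solver-input preparation pattern (hypothetical names).
from concurrent.futures import ThreadPoolExecutor
from threading import Lock

stack_structure = {}   # shared state, guarded by a lock to avoid race conditions
_stack_lock = Lock()

def prepare_solver_input(stack_id: int) -> dict:
    """Prepare the solver input for one stack; writes to shared state safely."""
    result = {"stack": stack_id, "input": stack_id * 2}  # placeholder work
    with _stack_lock:                 # thread-safe handling of stack_structure
        stack_structure[stack_id] = result
    return result

def prepare_all(stack_ids):
    # Replaces a sequential `for` loop with concurrent execution.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(prepare_solver_input, stack_ids))

inputs = prepare_all(range(4))
```

Because the per-stack work here is placeholder arithmetic, the speedup is not visible in this sketch; threads pay off when each preparation step releases the GIL (NumPy/PyTorch ops, I/O).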

[ENH] Refactor GPU tensor handling and evaluator logic

  • Added conditionals for GPU usage to streamline tensor preparation and movement.
  • Improved contiguity enforcement and asynchronous GPU transfers in evaluator workflows.
  • Enhanced backend configuration logic by refining BackendTensor methods for modularity.
  • Fixed minor formatting and consistency issues across backend and evaluator modules.
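A hedged sketch of the conditional GPU path: contiguity is enforced before any transfer, and the device copy is made asynchronous. The `use_gpu` flag, function name, and PyTorch backend are assumptions for illustration, not the PR's actual API.

```python
# Illustrative sketch of conditional tensor preparation (hypothetical names;
# assumes a PyTorch backend on the GPU path).
import numpy as np

def prepare_tensor(arr: np.ndarray, use_gpu: bool = False):
    # Enforce C-contiguity before any transfer; a no-op if already contiguous.
    arr = np.ascontiguousarray(arr)
    if use_gpu:
        import torch
        t = torch.from_numpy(arr)
        # non_blocking=True enables an asynchronous host-to-device copy
        # (effective when the source lives in pinned memory).
        return t.to("cuda", non_blocking=True)
    return arr

# A transposed array is not C-contiguous, so the copy is exercised:
x = prepare_tensor(np.arange(6).reshape(2, 3).T)
```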

[ENH] Add optimized stack weight computation and improve GPU synchronization

  • Implemented _compute_weights_for_stacks_single_thread to support stack weight calculations.
  • Added explicit GPU synchronization for enhanced eGPU stability.
  • Resolved minor formatting inconsistencies across evaluator and stack modules.

@Leguark (Member, Author) commented Mar 13, 2026

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
This stack of pull requests is managed by Graphite.

@Leguark marked this pull request as ready for review March 18, 2026 14:27
@Leguark force-pushed the optimize_yield_cov branch from b246cfc to 93929d4 on March 18, 2026 14:45
@Leguark force-pushed the optimize_grad_vs_scalar branch from e67c8a4 to 29a2e9a on March 18, 2026 14:45
Leguark added 4 commits March 18, 2026 16:53
- Enhanced tensor handling by enabling `.to("cuda")` during tensor preparation in `symbolic_evaluator`.
- Introduced `BackendTensor.clear_gpu_memory()` to explicitly manage GPU memory.
- Added `gc.collect()` for improved resource cleanup after computations.
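The cleanup described in this commit can be sketched as a small helper in the spirit of `BackendTensor.clear_gpu_memory()`. This is an assumption-laden illustration: the guarded `torch` import stands in for the backend, so the same pattern also runs in a CPU-only environment.

```python
# Sketch of an explicit GPU memory cleanup helper (illustrative, not the
# PR's actual implementation; assumes a PyTorch backend when available).
import gc

def clear_gpu_memory() -> None:
    gc.collect()                       # drop Python-side references first
    try:
        import torch
        if torch.cuda.is_available():
            torch.cuda.synchronize()   # wait for in-flight kernels
            torch.cuda.empty_cache()   # release cached blocks to the driver
    except ImportError:
        pass                           # CPU-only environment: nothing to free

clear_gpu_memory()
```

Collecting before `empty_cache()` matters: cached blocks can only be released once the Python objects holding the tensors are gone.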
@Leguark force-pushed the optimize_grad_vs_scalar branch from 29a2e9a to b45b42e on March 18, 2026 15:53
@Leguark force-pushed the optimize_yield_cov branch from 93929d4 to 641cb34 on March 18, 2026 15:53