Add memory model support for host-resident problem data #819
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: release/26.02
Conversation
this is a cleaner version of #807
📝 Walkthrough
This PR adds remote-solve support and non-owning MPS data model views, refactors solve entry points to accept views, and introduces dual memory-model solution objects (host/device) with lazy CUDA initialization and memory-location-aware conversion/dispatch.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (8)
cpp/include/cuopt/linear_programming/utilities/callbacks_implems.hpp (2)
41-51: ⚠️ Potential issue | 🟠 Major
Guard NULL PyObject returns before DECREF or use.
`PyObject_CallMethod` can return `nullptr`; the new `Py_DECREF(res)` will crash on failure, and `numba_matrix`/`numpy_array` can also be `nullptr`. Add NULL checks and use `Py_XDECREF` to avoid segfaults and ensure error propagation.
🔧 Proposed fix (safe NULL handling)
```diff
 void get_solution(void* data, void* objective_value) override
 {
-  PyObject* numba_matrix =
-    data_on_device() ? get_numba_matrix(data, n_variables) : get_numpy_array(data, n_variables);
-  PyObject* numpy_array =
-    data_on_device() ? get_numba_matrix(objective_value, 1) : get_numpy_array(objective_value, 1);
+  const bool on_device = data_on_device();
+  PyObject* numba_matrix =
+    on_device ? get_numba_matrix(data, n_variables) : get_numpy_array(data, n_variables);
+  PyObject* numpy_array =
+    on_device ? get_numba_matrix(objective_value, 1) : get_numpy_array(objective_value, 1);
+  if (!numba_matrix || !numpy_array) {
+    Py_XDECREF(numba_matrix);
+    Py_XDECREF(numpy_array);
+    PyErr_Print();
+    return;
+  }
   PyObject* res =
     PyObject_CallMethod(this->pyCallbackClass, "get_solution", "(OO)", numba_matrix, numpy_array);
-  Py_DECREF(numba_matrix);
-  Py_DECREF(numpy_array);
-  Py_DECREF(res);
+  if (!res) { PyErr_Print(); }
+  Py_XDECREF(res);
+  Py_XDECREF(numba_matrix);
+  Py_XDECREF(numpy_array);
 }
```
80-90: ⚠️ Potential issue | 🟠 Major
Add NULL checks for `set_solution` callback objects.
Same failure mode as `get_solution`: `PyObject_CallMethod` can return `nullptr`, and `Py_DECREF` on it will crash. Guard and use `Py_XDECREF`, plus early return if inputs fail to construct.
🔧 Proposed fix (safe NULL handling)
```diff
 void set_solution(void* data, void* objective_value) override
 {
-  PyObject* numba_matrix =
-    data_on_device() ? get_numba_matrix(data, n_variables) : get_numpy_array(data, n_variables);
-  PyObject* numpy_array =
-    data_on_device() ? get_numba_matrix(objective_value, 1) : get_numpy_array(objective_value, 1);
+  const bool on_device = data_on_device();
+  PyObject* numba_matrix =
+    on_device ? get_numba_matrix(data, n_variables) : get_numpy_array(data, n_variables);
+  PyObject* numpy_array =
+    on_device ? get_numba_matrix(objective_value, 1) : get_numpy_array(objective_value, 1);
+  if (!numba_matrix || !numpy_array) {
+    Py_XDECREF(numba_matrix);
+    Py_XDECREF(numpy_array);
+    PyErr_Print();
+    return;
+  }
   PyObject* res =
     PyObject_CallMethod(this->pyCallbackClass, "set_solution", "(OO)", numba_matrix, numpy_array);
-  Py_DECREF(numba_matrix);
-  Py_DECREF(numpy_array);
-  Py_DECREF(res);
+  if (!res) { PyErr_Print(); }
+  Py_XDECREF(res);
+  Py_XDECREF(numba_matrix);
+  Py_XDECREF(numpy_array);
 }
```
cpp/src/linear_programming/cuopt_c.cpp (5)
50-73: ⚠️ Potential issue | 🟠 Major
Guard view creation to keep the C API exception-safe.
`create_view_from_mps_data_model()` is outside the try/catch; if it throws, the C boundary is crossed and `problem_and_stream` leaks. Wrap view creation and use RAII to ensure cleanup.
🛠️ Suggested fix (exception-safe view creation)
```diff
-  problem_and_stream_view_t* problem_and_stream = new problem_and_stream_view_t();
+  auto problem_and_stream = std::make_unique<problem_and_stream_view_t>();
   std::string filename_str(filename);
   bool input_mps_strict = false;
   std::unique_ptr<mps_data_t> mps_data_model_ptr;
   try {
     mps_data_model_ptr = std::make_unique<mps_data_t>(
       parse_mps<cuopt_int_t, cuopt_float_t>(filename_str, input_mps_strict));
   } catch (const std::exception& e) {
     CUOPT_LOG_INFO("Error parsing MPS file: %s", e.what());
-    delete problem_and_stream;
     *problem_ptr = nullptr;
     if (std::string(e.what()).find("Error opening MPS file") != std::string::npos) {
       return CUOPT_MPS_FILE_ERROR;
     } else {
       return CUOPT_MPS_PARSE_ERROR;
     }
   }
-  problem_and_stream->op_problem = mps_data_model_ptr.release();
-  problem_and_stream->view = create_view_from_mps_data_model(*problem_and_stream->op_problem);
-  problem_and_stream->view.set_is_device_memory(false);
-
-  *problem_ptr = static_cast<cuOptOptimizationProblem>(problem_and_stream);
+  try {
+    problem_and_stream->view = create_view_from_mps_data_model(*mps_data_model_ptr);
+    problem_and_stream->view.set_is_device_memory(false);
+    problem_and_stream->op_problem = mps_data_model_ptr.release();
+  } catch (const std::exception& e) {
+    CUOPT_LOG_INFO("Error creating view from MPS data: %s", e.what());
+    *problem_ptr = nullptr;
+    return CUOPT_INVALID_ARGUMENT;
+  }
+
+  *problem_ptr = static_cast<cuOptOptimizationProblem>(problem_and_stream.release());
```
737-744: ⚠️ Potential issue | 🟡 Minor
Add the same op_problem null check as other getters.
This keeps `cuOptIsMIP` aligned with the other accessors and avoids dereferencing invalid handles.
🔧 Suggested fix
```diff
   problem_and_stream_view_t* problem_and_stream_view =
     static_cast<problem_and_stream_view_t*>(problem);
+  if (!problem_and_stream_view->op_problem) { return CUOPT_INVALID_ARGUMENT; }
   *is_mip_ptr = static_cast<cuopt_int_t>(problem_and_stream_view->is_mip());
```
747-798: ⚠️ Potential issue | 🟠 Major
Make cuOptSolve exception-safe to avoid leaks on solve failure.
If `solve_mip`/`solve_lp` throws, the allocated solution wrapper leaks and the exception crosses the C boundary.
🛠️ Suggested fix (RAII + try/catch)
```diff
-  solution_and_stream_view_t* solution_and_stream_view =
-    new solution_and_stream_view_t(true, handle);
-  solution_and_stream_view->mip_solution_ptr = new mip_solution_t<cuopt_int_t, cuopt_float_t>(
-    solve_mip<cuopt_int_t, cuopt_float_t>(handle, view, mip_settings));
-  *solution_ptr = static_cast<cuOptSolution>(solution_and_stream_view);
+  auto solution_and_stream_view =
+    std::make_unique<solution_and_stream_view_t>(true, handle);
+  try {
+    solution_and_stream_view->mip_solution_ptr =
+      new mip_solution_t<cuopt_int_t, cuopt_float_t>(
+        solve_mip<cuopt_int_t, cuopt_float_t>(handle, view, mip_settings));
+  } catch (const std::exception& e) {
+    CUOPT_LOG_INFO("MIP solve failed: %s", e.what());
+    return CUOPT_INVALID_ARGUMENT;
+  }
+  *solution_ptr = static_cast<cuOptSolution>(solution_and_stream_view.release());
 ...
-  solution_and_stream_view_t* solution_and_stream_view =
-    new solution_and_stream_view_t(false, handle);
-  solution_and_stream_view->lp_solution_ptr =
-    new optimization_problem_solution_t<cuopt_int_t, cuopt_float_t>(
-      solve_lp<cuopt_int_t, cuopt_float_t>(handle, view, pdlp_settings));
-  *solution_ptr = static_cast<cuOptSolution>(solution_and_stream_view);
+  auto solution_and_stream_view =
+    std::make_unique<solution_and_stream_view_t>(false, handle);
+  try {
+    solution_and_stream_view->lp_solution_ptr =
+      new optimization_problem_solution_t<cuopt_int_t, cuopt_float_t>(
+        solve_lp<cuopt_int_t, cuopt_float_t>(handle, view, pdlp_settings));
+  } catch (const std::exception& e) {
+    CUOPT_LOG_INFO("LP solve failed: %s", e.what());
+    return CUOPT_INVALID_ARGUMENT;
+  }
+  *solution_ptr = static_cast<cuOptSolution>(solution_and_stream_view.release());
```
878-903: ⚠️ Potential issue | 🟠 Major
Handle host-resident solutions in cuOptGetPrimalSolution.
Remote solves can return host data; the current code always reads device vectors and will throw for host-only solutions.
🛠️ Suggested fix
```diff
 if (solution_and_stream_view->is_mip) {
   mip_solution_t<cuopt_int_t, cuopt_float_t>* mip_solution =
     static_cast<mip_solution_t<cuopt_int_t, cuopt_float_t>*>(
       solution_and_stream_view->mip_solution_ptr);
-  const rmm::device_uvector<cuopt_float_t>& solution_values = mip_solution->get_solution();
-  rmm::cuda_stream_view stream =
-    solution_and_stream_view->stream_view.value_or(rmm::cuda_stream_view{});
-  raft::copy(solution_values_ptr, solution_values.data(), solution_values.size(), stream);
-  stream.synchronize();
+  if (mip_solution->is_device_memory()) {
+    const auto& solution_values = mip_solution->get_solution();
+    rmm::cuda_stream_view stream =
+      solution_and_stream_view->stream_view.value_or(rmm::cuda_stream_view{});
+    raft::copy(solution_values_ptr, solution_values.data(), solution_values.size(), stream);
+    stream.synchronize();
+  } else {
+    const auto& solution_values = mip_solution->get_solution_host();
+    std::copy(solution_values.begin(), solution_values.end(), solution_values_ptr);
+  }
 } else {
   optimization_problem_solution_t<cuopt_int_t, cuopt_float_t>* optimization_problem_solution =
     static_cast<optimization_problem_solution_t<cuopt_int_t, cuopt_float_t>*>(
       solution_and_stream_view->lp_solution_ptr);
-  const rmm::device_uvector<cuopt_float_t>& solution_values =
-    optimization_problem_solution->get_primal_solution();
-  rmm::cuda_stream_view stream =
-    solution_and_stream_view->stream_view.value_or(rmm::cuda_stream_view{});
-  raft::copy(solution_values_ptr, solution_values.data(), solution_values.size(), stream);
-  stream.synchronize();
+  if (optimization_problem_solution->is_device_memory()) {
+    const auto& solution_values = optimization_problem_solution->get_primal_solution();
+    rmm::cuda_stream_view stream =
+      solution_and_stream_view->stream_view.value_or(rmm::cuda_stream_view{});
+    raft::copy(solution_values_ptr, solution_values.data(), solution_values.size(), stream);
+    stream.synchronize();
+  } else {
+    const auto& solution_values = optimization_problem_solution->get_primal_solution_host();
+    std::copy(solution_values.begin(), solution_values.end(), solution_values_ptr);
+  }
 }
```
981-999: ⚠️ Potential issue | 🟠 Major
Add host-path copies for dual solution and reduced costs.
These getters still assume device memory and will fail for remote/host solutions.
🛠️ Suggested fix
```diff
-  const rmm::device_uvector<cuopt_float_t>& dual_solution =
-    optimization_problem_solution->get_dual_solution();
-  rmm::cuda_stream_view stream =
-    solution_and_stream_view->stream_view.value_or(rmm::cuda_stream_view{});
-  raft::copy(dual_solution_ptr, dual_solution.data(), dual_solution.size(), stream);
-  stream.synchronize();
+  if (optimization_problem_solution->is_device_memory()) {
+    const auto& dual_solution = optimization_problem_solution->get_dual_solution();
+    rmm::cuda_stream_view stream =
+      solution_and_stream_view->stream_view.value_or(rmm::cuda_stream_view{});
+    raft::copy(dual_solution_ptr, dual_solution.data(), dual_solution.size(), stream);
+    stream.synchronize();
+  } else {
+    const auto& dual_solution = optimization_problem_solution->get_dual_solution_host();
+    std::copy(dual_solution.begin(), dual_solution.end(), dual_solution_ptr);
+  }
 ...
-  const rmm::device_uvector<cuopt_float_t>& reduced_cost =
-    optimization_problem_solution->get_reduced_cost();
-  rmm::cuda_stream_view stream =
-    solution_and_stream_view->stream_view.value_or(rmm::cuda_stream_view{});
-  raft::copy(reduced_cost_ptr, reduced_cost.data(), reduced_cost.size(), stream);
-  stream.synchronize();
+  if (optimization_problem_solution->is_device_memory()) {
+    const auto& reduced_cost = optimization_problem_solution->get_reduced_cost();
+    rmm::cuda_stream_view stream =
+      solution_and_stream_view->stream_view.value_or(rmm::cuda_stream_view{});
+    raft::copy(reduced_cost_ptr, reduced_cost.data(), reduced_cost.size(), stream);
+    stream.synchronize();
+  } else {
+    const auto& reduced_cost = optimization_problem_solution->get_reduced_cost_host();
+    std::copy(reduced_cost.begin(), reduced_cost.end(), reduced_cost_ptr);
+  }
```
Also applies to: 1021-1039
cpp/src/linear_programming/solver_solution.cu (1)
191-265: ⚠️ Potential issue | 🟠 Major
Release device buffers when copy_from switches to host data.
When `other.is_device_memory_` is false, the existing device buffers remain allocated, which can retain large GPU memory unnecessarily. As per coding guidelines: track GPU device memory allocations and deallocations to prevent memory leaks; ensure cudaMalloc/cudaFree balance and cleanup of streams/events.
🛠️ Suggested fix
```diff
   } else {
     // Copy CPU data
+    // Release device buffers when switching to host-backed data
+    primal_solution_.reset();
+    dual_solution_.reset();
+    reduced_cost_.reset();
     if (!primal_solution_host_) { primal_solution_host_ = std::make_unique<std::vector<f_t>>(); }
     if (!dual_solution_host_) { dual_solution_host_ = std::make_unique<std::vector<f_t>>(); }
     if (!reduced_cost_host_) { reduced_cost_host_ = std::make_unique<std::vector<f_t>>(); }
```
🤖 Fix all issues with AI agents
In `@cpp/libmps_parser/src/data_model_view.cpp`:
- Around line 351-392: The code in data_model_view_t::get_problem_category()
(and other span-returning getters using variable_types_.data() / spans)
dereferences host spans without checking the is_device_memory_ flag; either
guard these accesses by checking is_device_memory_ and performing a device->host
copy into a host-local span before reading, or remove the unused flag and
document that spans must be host-resident; specifically update
data_model_view_t::get_problem_category(), all span-returning getters, and any
callers that assume host memory (e.g., remove reliance on
set_is_device_memory(false) or add an explicit device-to-host copy helper
invoked when is_device_memory_ is true) so accesses to variable_types_,
column_types_, etc. never read device memory directly.
In `@cpp/src/linear_programming/solve.cu`:
- Around line 1299-1306: In solve_lp, remove the unreachable duplicate
remote_config.has_value() branch (the block that logs CUOPT_REMOTE_HOST/PORT and
calls solve_lp_remote) since remote_config was already checked earlier and
returned; specifically delete the second if (remote_config.has_value()) {
CUOPT_LOG_INFO(...); return solve_lp_remote(*remote_config, view, settings); }
block so the function flow no longer contains dead code referencing
remote_config, solve_lp_remote, view, and settings.
In `@cpp/src/mip/solve.cu`:
- Around line 338-342: The null handle_ptr check must be moved so it executes
before either data path; ensure you validate handle_ptr (the pointer used by
data_model_view_to_optimization_problem) before calling
data_model_view_to_optimization_problem so both the device and CPU branches are
protected; specifically, place the existing handle_ptr == nullptr check above
the device memory branch (i.e., before any calls into
data_model_view_to_optimization_problem/optimization_problem_t constructor),
keep the same error logging and return of mip_solution_t with
cuopt::logic_error, and avoid duplicating checks—one early guard covers both
paths.
In `@cpp/tests/linear_programming/c_api_tests/c_api_tests.cpp`:
- Around line 177-184: Replace the exact equality check on the floating-point
objective offset with an epsilon comparison: instead of comparing
orig->get_objective_offset() == rerd->get_objective_offset(), compute the
absolute difference and compare it to a small tolerance (e.g.,
std::numeric_limits<double>::epsilon() * scale or a fixed small value) so the
result uses fabs(orig->get_objective_offset() - rerd->get_objective_offset()) <=
tol; keep the integer checks for orig->get_n_constraints(),
orig->get_n_variables(), orig->get_nnz() unchanged and use these symbol names to
locate the change.
In `@python/cuopt/cuopt/__init__.py`:
- Around line 29-33: The __dir__() function currently returns __all__ +
_submodules which duplicates submodule names because __all__ already unpacks
_submodules; change __dir__ to return a deduplicated list (preferably simply
return __all__) or otherwise return an order-preserving deduplicated sequence
(e.g., using dict.fromkeys) so names from __all__ and _submodules aren’t
repeated; update the __dir__ definition accordingly (referencing __dir__,
__all__, and _submodules).
In `@python/cuopt/cuopt/tests/linear_programming/test_memory_model.py`:
- Around line 167-168: The test sets variable types using an integer array which
prevents is_mip() from recognizing MIP variables; update the test to provide
string-encoded type codes (e.g., use a one-byte string array) when calling
data_model_obj.set_variable_types so the solver's is_mip() (solver.py) will
detect the "I" code; replace the np.array([1]) usage with a string bytes array
like np.array(["I"], dtype="S1") (or equivalent single-byte string encoding) so
the test exercises the MIP path.
🧹 Nitpick comments (14)
cpp/tests/linear_programming/unit_tests/lp_solution_memory_test.cu (2)
22-39: Clarify test naming and incomplete guard implementation.
The test name `host_only_accessors_need_coderabbit_patch` and the commented-out `EXPECT_THROW` assertions suggest that device-accessor guards for host-only solutions are not yet implemented. The test comments reference a "coderabbit_changes.patch", which is an unusual convention. Consider:
- Renaming tests to reflect current behavior (e.g., `host_solution_basic_accessors`)
- Creating a tracking issue for implementing the accessor guards
- Using `// TODO(issue#123): Enable when guards are implemented` instead of referencing a patch file
70-104: Consider adding stream synchronization before accessor validation.
The test creates device vectors without explicit initialization values. While this is acceptable for testing accessor availability, you may want to add `handle.sync_stream()` before calling `EXPECT_NO_THROW` on device accessors to ensure any pending operations are complete.
```diff
   &handle, true);
+  handle.sync_stream();
   EXPECT_TRUE(solution.is_device_memory());
```
python/cuopt/cuopt/tests/linear_programming/test_memory_model.py (1)
187-250: Add infeasible/unbounded + free-variable cases to memory-model tests.
To align the new memory-model suite with test requirements, consider adding a minimal infeasible LP, an unbounded LP, and a free-variable case (host-memory path). As per coding guidelines: "Write tests validating numerical correctness of optimization results (not just 'runs without error'); test degenerate cases (infeasible, unbounded, empty, singleton problems)" and "Test with free variables, singleton problems, and extreme problem dimensions near resource limits to validate edge case handling."
cpp/tests/mip/mip_solution_memory_test.cu (1)
25-47: Remove patch-specific naming and enable guard behavior validation in tests.
The guards are implemented: `EXE_CUOPT_EXPECTS` checks on `get_solution()` and `get_solution_host()` verify pointer validity. The test names ("need_coderabbit_patch") and commented-out `EXPECT_THROW` assertions are misleading and should be removed or the assertions enabled. Either rename the tests to reflect their current behavior or uncomment the `EXPECT_THROW` for `get_solution()` on host-only solutions to validate that guards properly throw `logic_error`.
Suggested fix
```diff
-TEST(mip_solution_memory, host_only_accessors_need_coderabbit_patch)
+TEST(mip_solution_memory, host_only_accessors)
 {
-  // This test validates that EXE_CUOPT_EXPECTS guards are in place
-  // Guards are added in coderabbit_changes.patch
+  // Verify host-only solution behavior
   std::vector<double> solution{0.0};
   std::vector<std::string> var_names{"x0"};
   solver_stats_t<int, double> stats{};
   mip_solution_t<int, double> mip_solution(std::move(solution),
                                            std::move(var_names),
                                            0.0,
                                            0.0,
                                            mip_termination_status_t::Optimal,
                                            0.0,
                                            0.0,
                                            0.0,
                                            stats);
   EXPECT_FALSE(mip_solution.is_device_memory());
-  // After applying CodeRabbit patch, this should throw
-  // EXPECT_THROW(mip_solution.get_solution(), cuopt::logic_error);
+  EXPECT_THROW(mip_solution.get_solution(), cuopt::logic_error);
   EXPECT_NO_THROW(mip_solution.get_solution_host());
 }
```
cpp/tests/linear_programming/unit_tests/memory_model_infrastructure_test.cu (1)
287-309: Test uses host pointers with `is_device_memory=true` flag — intentional but potentially confusing.
The comment on line 292 notes this is "simulated with host pointers for test", but passing host pointers while marking `is_device_memory(true)` could cause undefined behavior if the conversion function actually attempts GPU memory operations. This test only validates that `EXPECT_NO_THROW` succeeds, but doesn't verify the resulting optimization problem's correctness.
Consider adding a comment clarifying that this test relies on the conversion function not dereferencing the pointers during view-to-optimization_problem conversion, or using actual device memory allocations for a more realistic test.
cpp/libmps_parser/CMakeLists.txt (2)
98-100: Static library missing compile options.
The shared library `mps_parser` has `target_compile_options` applied (lines 98-100), but `mps_parser_static` does not receive the same treatment. This could lead to inconsistent warning/error behavior between the two builds.
♻️ Proposed fix to add compile options to static library
```diff
 target_compile_options(mps_parser
   PRIVATE "$<$<COMPILE_LANGUAGE:CXX>:${MPS_PARSER_CXX_FLAGS}>"
 )
+
+target_compile_options(mps_parser_static
+  PRIVATE "$<$<COMPILE_LANGUAGE:CXX>:${MPS_PARSER_CXX_FLAGS}>"
+)
```
Also applies to: 128-135
137-143: Static library missing BZip2/ZLIB include directories.
If `MPS_PARSER_WITH_BZIP2` or `MPS_PARSER_WITH_ZLIB` are enabled, only the shared library receives the corresponding include directories. The static library should receive them as well.
♻️ Proposed fix to add compression library includes to static target
```diff
 if(MPS_PARSER_WITH_BZIP2)
   target_include_directories(mps_parser PRIVATE BZip2::BZip2)
+  target_include_directories(mps_parser_static PRIVATE BZip2::BZip2)
 endif(MPS_PARSER_WITH_BZIP2)
 if(MPS_PARSER_WITH_ZLIB)
   target_include_directories(mps_parser PRIVATE ZLIB::ZLIB)
+  target_include_directories(mps_parser_static PRIVATE ZLIB::ZLIB)
 endif(MPS_PARSER_WITH_ZLIB)
```
cpp/src/mip/solve.cu (1)
331-345: Consider consolidating duplicate conversion calls.
Lines 333 and 343 both call `data_model_view_to_optimization_problem(handle_ptr, view)`. The only difference is the null-check guard for CPU data. Consider restructuring to avoid duplication:
♻️ Proposed refactor to consolidate conversion logic
```diff
-  if (view.is_device_memory()) {
-    // Local solve: data already on GPU - convert view to optimization_problem_t and solve
-    auto op_problem = data_model_view_to_optimization_problem(handle_ptr, view);
-    return solve_mip(op_problem, settings);
-  }
-  // Local solve with CPU data: copy to GPU and solve
-  if (handle_ptr == nullptr) {
+  if (!view.is_device_memory() && handle_ptr == nullptr) {
     CUOPT_LOG_ERROR("[solve_mip] Local solve requested but handle_ptr is null.");
     return mip_solution_t<i_t, f_t>(
       cuopt::logic_error("No CUDA handle for CPU->GPU copy", cuopt::error_type_t::RuntimeError));
   }
+  auto op_problem = data_model_view_to_optimization_problem(handle_ptr, view);
   return solve_mip(op_problem, settings);
 }
```
cpp/libmps_parser/include/mps_parser/data_model_view.hpp (1)
450-454: Document that `get_problem_category()` reflects the variable types at first call.
The lazy-cached `problem_category_` uses `std::call_once` for thread-safe initialization, reading `variable_types_` to compute the category. While the documented API indicates `set_variable_types()` is called before solving (during initialization), explicitly documenting that `get_problem_category()` reflects the state at first call would clarify the intended usage pattern and prevent potential misuse where variable types are modified after category computation.
cpp/include/cuopt/linear_programming/mip/solver_solution.hpp (1)
111-116: Document precondition for host accessors.
The `get_solution_host()` methods document "Only valid when is_device_memory() returns false", but there's no runtime enforcement. Consider adding a debug assertion to catch misuse early:
```cpp
cuopt_assert(!is_device_memory_, "get_solution_host() called on device-memory solution");
```
This would help catch programming errors during development.
cpp/src/linear_programming/utilities/cython_solve.cu (1)
208-214: Consider a more explicit warm-start presence check.
The warm-start detection relies on checking if `last_restart_duality_gap_dual_solution_.data() != nullptr`. While functional, a dedicated `has_warm_start_data()` method would be clearer and less brittle.
```cpp
// Current: relies on internal pointer state
if (!is_mip && solver_settings->get_pdlp_warm_start_data_view()
                 .last_restart_duality_gap_dual_solution_.data() != nullptr) {
```
This is minor since the current approach works, but consider adding a helper method for clarity in future iterations.
cpp/src/linear_programming/optimization_problem_conversions.cu (3)
cpp/src/linear_programming/optimization_problem_conversions.cu (3)
61-76: Clarify cudaPointerGetAttributes error handling.
The error handling for `cudaPointerGetAttributes` is functionally correct but could be clearer. The pattern handles:
- Device memory: copy D2H
- Host memory: direct copy
- Unregistered memory (returns `cudaErrorInvalidValue`): treated as host memory
Consider adding a brief comment explaining why `cudaErrorInvalidValue` is acceptable:
```diff
 } else {
   // Source is on host (or unregistered) - direct copy
   if (err != cudaSuccess) { cudaGetLastError(); }  // Clear cudaPointerGetAttributes error
+  // cudaErrorInvalidValue is expected for unregistered host memory - treat as host
   if (err != cudaSuccess && err != cudaErrorInvalidValue) { RAFT_CUDA_TRY(err); }
   std::memcpy(host_var_types.data(), var_types.data(), var_types.size() * sizeof(char));
 }
```
108-154: Consider extracting memory-aware copy helper.
The `cudaPointerGetAttributes` + conditional copy pattern is duplicated between variable types (lines 61-76) and quadratic objective (lines 114-146). Consider extracting a helper template:
```cpp
template <typename T>
void copy_to_host(std::vector<T>& dest, const T* src, size_t count);
```
This would reduce duplication and centralize the error handling logic. However, this is a nice-to-have refactor that can be deferred.
241-335: Good non-owning view creation with correct memory semantics.
The function correctly creates a non-owning view pointing to `mps_data_model`'s data and sets `is_device_memory(false)`.
Important lifetime consideration: the returned view holds raw pointers to the `mps_data_model`'s internal storage. The view is only valid as long as the `mps_data_model` remains alive and unmodified. This is documented implicitly by "non-owning view", but consider adding explicit documentation:
```cpp
/**
 * @brief Create a non-owning view from an MPS data model.
 * @warning The returned view holds pointers to mps_data_model's data.
 *          The view is invalid if mps_data_model is destroyed or modified.
 */
```
```cpp
if (handle_ptr == nullptr) {
  CUOPT_LOG_ERROR("[solve_mip] Local solve requested but handle_ptr is null.");
  return mip_solution_t<i_t, f_t>(
    cuopt::logic_error("No CUDA handle for CPU->GPU copy", cuopt::error_type_t::RuntimeError));
}
```
🧩 Analysis chain
🏁 Script executed:
```shell
# First, examine the context around lines 338-342 in solve.cu
cat -n cpp/src/mip/solve.cu | sed -n '330,350p'
```
Repository: NVIDIA/cuopt
Length of output: 1199
🏁 Script executed:
```shell
# Find the implementation of data_model_view_to_optimization_problem
find . -type f \( -name "*.cu" -o -name "*.cpp" -o -name "*.cuh" -o -name "*.hpp" \) -exec grep -l "data_model_view_to_optimization_problem" {} \;
```
Repository: NVIDIA/cuopt
Length of output: 390
🏁 Script executed:
```shell
# Examine the implementation of data_model_view_to_optimization_problem
cat -n cpp/src/linear_programming/optimization_problem_conversions.cu | head -150
```
Repository: NVIDIA/cuopt
Length of output: 8214
🏁 Script executed:
```shell
# Search for the function signature and its use of handle_ptr
rg -n "data_model_view_to_optimization_problem" cpp/src/linear_programming/optimization_problem_conversions.cu -B 5 -A 40 | head -100
```
Repository: NVIDIA/cuopt
Length of output: 3895
🏁 Script executed:
```shell
# Check the optimization_problem_t constructor to see if it uses handle_ptr
rg -n "optimization_problem_t.*constructor\|explicit optimization_problem_t" cpp/src/linear_programming/ -B 2 -A 15 | head -80
```
Repository: NVIDIA/cuopt
Length of output: 38
🏁 Script executed:
```shell
# Look for the constructor definition
fd -e hpp -e h cpp/src/linear_programming/ -x grep -l "optimization_problem_t"
```
Repository: NVIDIA/cuopt
Length of output: 469
🏁 Script executed:
```shell
# Find header file for optimization_problem_t
find cpp/include -name "*.hpp" -o -name "*.h" | xargs grep -l "class optimization_problem_t\|struct optimization_problem_t"
```
Repository: NVIDIA/cuopt
Length of output: 117
🏁 Script executed:
```shell
# Search for optimization_problem_t constructor in all header files
fd . cpp/include -name "*.hpp" -o -name "*.h" | xargs grep -A 20 "optimization_problem_t.*handle_ptr"
```
Repository: NVIDIA/cuopt
Length of output: 288
🏁 Script executed:
```shell
# Read the optimization_problem.hpp file to find constructor
cat cpp/include/cuopt/linear_programming/optimization_problem.hpp | head -200
```
Repository: NVIDIA/cuopt
Length of output: 7595
🏁 Script executed:
```shell
# Search for constructor definition with handle_ptr
rg -n "optimization_problem_t.*handle_ptr\|optimization_problem_t\(" cpp/include/cuopt/linear_programming/optimization_problem.hpp -A 10 | head -60
```
Repository: NVIDIA/cuopt
Length of output: 38
🏁 Script executed:
```shell
# Find the full constructor implementation
rg -n "optimization_problem_t\(.*handle" cpp/include/cuopt/linear_programming/optimization_problem.hpp -A 5
```
Repository: NVIDIA/cuopt
Length of output: 297
🏁 Script executed:
```shell
# Look at the implementation in the .cu file
rg -n "optimization_problem_t<i_t, f_t>::optimization_problem_t.*handle_ptr" cpp/src/linear_programming/ -A 20
```
Repository: NVIDIA/cuopt
Length of output: 1924
Move the null handle check before the device memory path.
Both the device (line 333) and CPU (line 343) data paths call data_model_view_to_optimization_problem, which creates an optimization_problem_t that immediately dereferences handle_ptr in its constructor (line 41 of optimization_problem.cu). The current null check at line 338 only guards the CPU path; the device path executes without validation and will crash with a null pointer dereference in the constructor. Move the null check before line 331 to protect both paths.
🤖 Prompt for AI Agents
In `@cpp/src/mip/solve.cu` around lines 338 - 342, The null handle_ptr check must
be moved so it executes before either data path; ensure you validate handle_ptr
(the pointer used by data_model_view_to_optimization_problem) before calling
data_model_view_to_optimization_problem so both the device and CPU branches are
protected; specifically, place the existing handle_ptr == nullptr check above
the device memory branch (i.e., before any calls into
data_model_view_to_optimization_problem/optimization_problem_t constructor),
keep the same error logging and return of mip_solution_t with
cuopt::logic_error, and avoid duplicating checks—one early guard covers both
paths.
```diff
+  // Compare key problem attributes since mps_data_t doesn't have is_equivalent()
+  auto* orig = original_problem_wrapper->op_problem;
+  auto* rerd = reread_problem_wrapper->op_problem;
   result =
-    original_problem_wrapper->op_problem->is_equivalent(*reread_problem_wrapper->op_problem);
+    (orig->get_n_constraints() == rerd->get_n_constraints() &&
+     orig->get_n_variables() == rerd->get_n_variables() && orig->get_nnz() == rerd->get_nnz() &&
+     orig->get_objective_offset() == rerd->get_objective_offset());
```
Use epsilon comparison for floating-point objective_offset.
Line 184 uses exact equality (==) for get_objective_offset(), which returns a floating-point value. MPS serialization/deserialization could introduce minor precision drift, causing spurious test failures.
Additionally, this comparison validates problem dimensions but not the actual data content (constraint coefficients, variable bounds, objective coefficients). Consider whether a more comprehensive check is needed for true roundtrip validation.
Proposed fix for floating-point comparison
+#include <cmath>
+
+// Helper for approximate FP comparison
+static bool approx_equal(double a, double b, double eps = 1e-12) {
+ return std::fabs(a - b) <= eps * std::max(1.0, std::max(std::fabs(a), std::fabs(b)));
+}
+
// In test_mps_roundtrip:
result =
(orig->get_n_constraints() == rerd->get_n_constraints() &&
orig->get_n_variables() == rerd->get_n_variables() && orig->get_nnz() == rerd->get_nnz() &&
- orig->get_objective_offset() == rerd->get_objective_offset());
+ approx_equal(orig->get_objective_offset(), rerd->get_objective_offset()));As per coding guidelines: "use epsilon comparisons for floating-point equality checks."
🤖 Prompt for AI Agents
In `@cpp/tests/linear_programming/c_api_tests/c_api_tests.cpp` around lines 177 -
184, Replace the exact equality check on the floating-point objective offset
with an epsilon comparison: instead of comparing orig->get_objective_offset() ==
rerd->get_objective_offset(), compute the absolute difference and compare it to
a small tolerance (e.g., std::numeric_limits<double>::epsilon() * scale or a
fixed small value) so the result uses fabs(orig->get_objective_offset() -
rerd->get_objective_offset()) <= tol; keep the integer checks for
orig->get_n_constraints(), orig->get_n_variables(), orig->get_nnz() unchanged
and use these symbol names to locate the change.
def __dir__():
    return __all__ + _submodules

__all__ = ["__git_commit__", "__version__", "__version_major_minor__", *_submodules]
__dir__() returns duplicate entries for submodules.
__all__ already includes the submodules via *_submodules unpacking on line 33, so __dir__() returning __all__ + _submodules will list them twice in dir(cuopt) output.
🐛 Proposed fix
def __dir__():
- return __all__ + _submodules
+ return list(__all__)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def __dir__(): | |
| return __all__ + _submodules | |
| __all__ = ["__git_commit__", "__version__", "__version_major_minor__", *_submodules] | |
| def __dir__(): | |
| return list(__all__) | |
| __all__ = ["__git_commit__", "__version__", "__version_major_minor__", *_submodules] |
🤖 Prompt for AI Agents
In `@python/cuopt/cuopt/__init__.py` around lines 29 - 33, The __dir__() function
currently returns __all__ + _submodules which duplicates submodule names because
__all__ already unpacks _submodules; change __dir__ to return a deduplicated
list (preferably simply return __all__) or otherwise return an order-preserving
deduplicated sequence (e.g., using dict.fromkeys) so names from __all__ and
_submodules aren’t repeated; update the __dir__ definition accordingly
(referencing __dir__, __all__, and _submodules).
var_types = np.array([1])  # Integer
data_model_obj.set_variable_types(var_types)
🧩 Analysis chain
🏁 Script executed:

cat -n python/cuopt/cuopt/tests/linear_programming/test_memory_model.py | sed -n '160,175p'

🏁 Script executed:

# Find where variable type codes are defined
rg -n "set_variable_types" -g '*.py' -g '*.pyx' python/cuopt | head -20

🏁 Script executed:

# Find variable type constant definitions
rg -n "BINARY|INTEGER|CONTINUOUS" -g '*.py' python/cuopt | grep -E "(^[^:]+:\s*(BINARY|INTEGER|CONTINUOUS)\s*=|def.*variable_type)" | head -30

🏁 Script executed:

# Check what dtype np.array([1]) creates and look for examples in the codebase
rg -n "np\.array\(\[" -g '*.py' python/cuopt/tests | grep -i "var_type\|variable_type" -A 2 -B 2 | head -40

🏁 Script executed:

# Find the actual variable type values used in tests
rg -n "variable_types\s*=" -g '*.py' python/cuopt/tests -A 1 | head -50

🏁 Script executed:

# Look at the set_variable_types implementation
sed -n '364,390p' python/cuopt/cuopt/linear_programming/data_model/data_model.py

🏁 Script executed:

# Check the wrapper implementation
sed -n '133,160p' python/cuopt/cuopt/linear_programming/data_model/data_model_wrapper.pyx

🏁 Script executed:

# Find variable type constants
rg -n "VarType|VariableType|var_type" -g '*.py' -g '*.pyx' python/cuopt/cuopt/linear_programming --max-count=100 | head -50

🏁 Script executed:

# Look for where variable types are actually used/parsed
rg -n "dtype.*S1|dtype.*str|int.*1.*Integer|VarType\s*=" -g '*.py' -g '*.pyx' python/cuopt | head -30

🏁 Script executed:

# Check test cases for other MIP tests to see how they use variable types
rg -n "set_variable_types" -g '*.py' python/cuopt --context=3 | head -80

🏁 Script executed:

sed -n '84,96p' python/cuopt/cuopt/linear_programming/solver/solver.py

🏁 Script executed:

# Check if there are any other tests using np.array([1]) for variable types
rg -n "np\.array\(\[1\]\)" -g '*.py' python/cuopt/tests
Fix variable type encoding in test to use explicit string codes.
np.array([1]) creates an integer array, not a string array. The is_mip() function (solver.py:84-96) checks for the string "I" or bytes b"I", so integer 1 is not recognized as an integer variable type. This prevents the test from exercising the MIP path. Use np.array(["I"], dtype="S1") to match the documented requirement (data_model.py:364) and patterns used throughout the codebase.
Proposed fix
- var_types = np.array([1]) # Integer
+ var_types = np.array(["I"], dtype="S1") # Integer🤖 Prompt for AI Agents
In `@python/cuopt/cuopt/tests/linear_programming/test_memory_model.py` around
lines 167 - 168, The test sets variable types using an integer array which
prevents is_mip() from recognizing MIP variables; update the test to provide
string-encoded type codes (e.g., use a one-byte string array) when calling
data_model_obj.set_variable_types so the solver's is_mip() (solver.py) will
detect the "I" code; replace the np.array([1]) usage with a string bytes array
like np.array(["I"], dtype="S1") (or equivalent single-byte string encoding) so
the test exercises the MIP path.
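The dtype mismatch behind this fix is easy to demonstrate with NumPy alone (assuming only the documented "I" byte-code convention):

```python
import numpy as np

# An integer array: its elements never compare equal to the string code "I",
# so an is_mip()-style check looking for "I"/b"I" cannot match.
var_types = np.array([1])
print(var_types.dtype.kind)   # 'i' (integer kind, not a character type)
print(var_types[0] == b"I")   # False

# The corrected encoding: one-byte string codes.
var_types = np.array(["I"], dtype="S1")
print(var_types[0] == b"I")   # True: recognized as an integer variable
```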
NOTE: Remote solve is stubbed in this commit and returns dummy solutions. Full gRPC implementation will be added in a follow-up PR.

This change enables cuOpt to work with problem data in host (CPU) memory, supporting both local GPU solves and future remote solve capabilities.

- Add data_model_view_t: non-owning view of problem data (CPU or GPU)
- Add optimization_problem_conversions: convert between data representations
- Update solve functions to accept data_model_view_t overloads
- C API now uses mps_data_model_t (host memory) instead of optimization_problem_t (GPU)
- Solutions track memory location (is_device_memory_ flag)
- Add host-memory accessors for solutions (get_primal_solution_host, etc.)
- Callbacks track expected memory location for data exchange
- Add lazy module loading (supports CPU-only hosts with remote solve configured)
- Cython layer handles device/host memory internally
- Python API unchanged (no user-facing changes)
- Add remote_solve.hpp with solve_lp_remote/solve_mip_remote stubs
- Solve functions check CUOPT_REMOTE_HOST/CUOPT_REMOTE_PORT environment variables
- Remote solves require host memory (enforced with cuopt_expects assertion)
- cuopt_cli uses host-memory workflow (reads MPS → host data → solve)
- Add lp_solution_memory_test.cu (4 tests)
- Add memory_model_infrastructure_test.cu (4 tests)
- Add mip_solution_memory_test.cu (5 tests)
- Add test_memory_model.py (Python integration tests)
Force-pushed 39e5e08 to 8aa38fb (Compare)
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
cpp/include/cuopt/linear_programming/utilities/callbacks_implems.hpp (1)
41-52: ⚠️ Potential issue | 🟡 Minor: Add null checks before Py_DECREF to prevent crashes on Python errors.
PyObject_CallMethod can return NULL if the Python call fails (e.g., method not found, exception raised). Calling Py_DECREF(NULL) causes undefined behavior. The same issue exists in set_solution.
🛡️ Proposed fix
 void get_solution(void* data, void* objective_value) override
 {
   PyObject* numba_matrix = data_on_device() ? get_numba_matrix(data, n_variables)
                                             : get_numpy_array(data, n_variables);
   PyObject* numpy_array  = data_on_device() ? get_numba_matrix(objective_value, 1)
                                             : get_numpy_array(objective_value, 1);
+  if (numba_matrix == nullptr || numpy_array == nullptr) {
+    Py_XDECREF(numba_matrix);
+    Py_XDECREF(numpy_array);
+    return;
+  }
   PyObject* res = PyObject_CallMethod(
     this->pyCallbackClass, "get_solution", "(OO)", numba_matrix, numpy_array);
   Py_DECREF(numba_matrix);
   Py_DECREF(numpy_array);
-  Py_DECREF(res);
+  Py_XDECREF(res);
 }
cpp/libmps_parser/CMakeLists.txt (1)
70-136: ⚠️ Potential issue | 🟠 Major: mps_parser_static misses C++ standard and compile options.
The shared target sets CXX_STANDARD and MPS_PARSER_CXX_FLAGS, but the static target does not. This can break static builds if the sources rely on C++20 features or warning flags. Consider applying the same properties (and optional include/link settings) to mps_parser_static.
🔧 Proposed fix (keep targets consistent)
 add_library(mps_parser SHARED ${MPS_PARSER_SOURCES})

 # Static library for linking into libcuopt
 add_library(mps_parser_static STATIC ${MPS_PARSER_SOURCES})

 set_target_properties(mps_parser
@@
     CXX_SCAN_FOR_MODULES OFF
 )
+
+set_target_properties(mps_parser_static
+  PROPERTIES
+    CXX_STANDARD 20
+    CXX_STANDARD_REQUIRED ON
+    INTERFACE_POSITION_INDEPENDENT_CODE ON
+    CXX_SCAN_FOR_MODULES OFF
+)
@@
 target_compile_options(mps_parser
   PRIVATE "$<$<COMPILE_LANGUAGE:CXX>:${MPS_PARSER_CXX_FLAGS}>"
 )
+
+target_compile_options(mps_parser_static
+  PRIVATE "$<$<COMPILE_LANGUAGE:CXX>:${MPS_PARSER_CXX_FLAGS}>"
+)
cpp/src/linear_programming/cuopt_c.cpp (1)
878-903: ⚠️ Potential issue | 🟠 Major: Handle host‑memory solutions in C API getters.
Remote solves can return host-only solutions, but the getters always read device buffers. This will throw or return invalid data when is_device_memory() is false. Add a host branch (for MIP and LP) and use the new host accessors.
🐛 Proposed fix (example for cuOptGetPrimalSolution)
 if (solution_and_stream_view->is_mip) {
   mip_solution_t<cuopt_int_t, cuopt_float_t>* mip_solution =
     static_cast<mip_solution_t<cuopt_int_t, cuopt_float_t>*>(
       solution_and_stream_view->mip_solution_ptr);
-  const rmm::device_uvector<cuopt_float_t>& solution_values = mip_solution->get_solution();
-  rmm::cuda_stream_view stream =
-    solution_and_stream_view->stream_view.value_or(rmm::cuda_stream_view{});
-  raft::copy(solution_values_ptr, solution_values.data(), solution_values.size(), stream);
-  stream.synchronize();
+  if (mip_solution->is_device_memory()) {
+    const rmm::device_uvector<cuopt_float_t>& solution_values = mip_solution->get_solution();
+    rmm::cuda_stream_view stream =
+      solution_and_stream_view->stream_view.value_or(rmm::cuda_stream_view{});
+    raft::copy(solution_values_ptr, solution_values.data(), solution_values.size(), stream);
+    stream.synchronize();
+  } else {
+    const auto& host_solution = mip_solution->get_solution_host();
+    std::copy(host_solution.begin(), host_solution.end(), solution_values_ptr);
+  }
 } else {
   optimization_problem_solution_t<cuopt_int_t, cuopt_float_t>* optimization_problem_solution =
     static_cast<optimization_problem_solution_t<cuopt_int_t, cuopt_float_t>*>(
       solution_and_stream_view->lp_solution_ptr);
-  const rmm::device_uvector<cuopt_float_t>& solution_values =
-    optimization_problem_solution->get_primal_solution();
-  rmm::cuda_stream_view stream =
-    solution_and_stream_view->stream_view.value_or(rmm::cuda_stream_view{});
-  raft::copy(solution_values_ptr, solution_values.data(), solution_values.size(), stream);
-  stream.synchronize();
+  if (optimization_problem_solution->is_device_memory()) {
+    const rmm::device_uvector<cuopt_float_t>& solution_values =
+      optimization_problem_solution->get_primal_solution();
+    rmm::cuda_stream_view stream =
+      solution_and_stream_view->stream_view.value_or(rmm::cuda_stream_view{});
+    raft::copy(solution_values_ptr, solution_values.data(), solution_values.size(), stream);
+    stream.synchronize();
+  } else {
+    const auto& host_solution = optimization_problem_solution->get_primal_solution_host();
+    std::copy(host_solution.begin(), host_solution.end(), solution_values_ptr);
+  }
 }

Please mirror the same device/host branching in cuOptGetDualSolution and cuOptGetReducedCosts.
Also applies to: 981-999, 1021-1039
cuOptGetDualSolutionandcuOptGetReducedCosts.Also applies to: 981-999, 1021-1039
cpp/src/linear_programming/solver_solution.cu (1)
191-265: ⚠️ Potential issue | 🟠 Major: Clear stale buffers when switching memory domains in copy_from.
When copying from a different memory domain, the old buffers remain allocated and can be returned by getters, leading to stale data and wasted memory. Reset the inactive domain in each branch to preserve invariants.
🧹 Proposed fix to clear inactive-domain buffers
 if (other.is_device_memory_) {
   // Copy GPU data
@@
   if (other.reduced_cost_) {
     reduced_cost_->resize(other.reduced_cost_->size(), handle_ptr->get_stream());
     raft::copy(reduced_cost_->data(),
                other.reduced_cost_->data(),
                reduced_cost_->size(),
                handle_ptr->get_stream());
   } else {
     reduced_cost_->resize(0, handle_ptr->get_stream());
   }
+  // Clear stale host buffers when switching to device memory
+  primal_solution_host_.reset();
+  dual_solution_host_.reset();
+  reduced_cost_host_.reset();
+  handle_ptr->sync_stream();
 } else {
   // Copy CPU data
@@
   if (other.reduced_cost_host_) {
     *reduced_cost_host_ = *other.reduced_cost_host_;
   } else {
     reduced_cost_host_->clear();
   }
+
+  // Clear stale device buffers when switching to host memory
+  primal_solution_.reset();
+  dual_solution_.reset();
+  reduced_cost_.reset();
 }
🤖 Fix all issues with AI agents
In `@cpp/include/cuopt/linear_programming/mip/solver_solution.hpp`:
- Around line 157-169: The to_host(rmm::cuda_stream_view) implementation must
perform an asynchronous device-to-host copy on the provided stream and check
CUDA errors without synchronizing; update mip_solution_t<...>::to_host to call
rmm::device_uvector::copy_async (or cudaMemcpyAsync) into the host buffer when
is_device_memory() is true, pass stream_view.stream() for the stream, wrap the
copy call with CUDA_CHECK (or the project’s CUDA_CHECK macro) to validate the
operation, and remove any cudaDeviceSynchronize or stream synchronization calls
so the path remains async; ensure after scheduling the copy you update internal
state so get_solution_host() refers to the host buffer once the copy is
enqueued.
In `@cpp/include/cuopt/linear_programming/utilities/remote_solve.hpp`:
- Around line 71-110: Add a precondition check in the remote stub(s) to enforce
that input data is host-resident: in solve_lp_remote(...) (and the other remote
overload) assert or throw if the provided
cuopt::mps_parser::data_model_view_t<i_t,f_t>& view is not on the host (use the
view's host-residency accessor, e.g., view.is_host_resident() or
view.is_on_host(), or equivalent), so the stub fails fast instead of silently
proceeding; place this check at the top of the functions (before computing
n_rows/n_cols) and keep behavior consistent (assert for debug builds or throw
std::invalid_argument/std::runtime_error for public API).
In `@cpp/tests/linear_programming/unit_tests/memory_model_infrastructure_test.cu`:
- Around line 36-65: The tests mutate process-global CUOPT_REMOTE_HOST/PORT in
RemoteSolveConfigTest::SetUp/ TearDown which can race when tests run in
parallel; fix by introducing a global test-scope mutex (e.g., a static
std::mutex or a test helper ScopedEnvLock) and acquire it for the entire
duration of each test instance in SetUp and release in TearDown (or use RAII
ScopedEnvLock constructed in SetUp and destroyed in TearDown); ensure the lock
covers both the unsetenv/setenv calls and any GPU/global-state touches so
environment and global state are serialized across tests.
🧹 Nitpick comments (5)
cpp/tests/mip/mip_solution_memory_test.cu (1)
101-117: Consider adding edge case tests for empty solutions.
The termination-status-only and error constructors are well tested. For more comprehensive coverage per coding guidelines, consider adding tests for:
- Empty variable names vector
- Zero-variable solutions
- Degenerate termination statuses
These could help validate edge case handling in the solution memory paths.
cpp/tests/linear_programming/unit_tests/memory_model_infrastructure_test.cu (2)
218-331: Exercise a real device‑pointer path (or rename the test).
gpu_view_conversion marks device memory but uses host pointers, so it doesn’t validate the real GPU copy path. Consider allocating device buffers (e.g., rmm::device_uvector) when GPU tests are available, or rename to clarify it’s a “GPU‑flag only” test.
336-387: Strengthen stub assertions with value checks (zeros).
These tests assert sizes and status only. Add value checks for the returned vectors (all zeros) to confirm numerical content, not just shape.
Based on learnings: Write tests validating numerical correctness of optimization results (not just 'runs without error'); test degenerate cases (infeasible, unbounded, empty, singleton problems).
✅ Example enhancement
 EXPECT_FALSE(solution.is_device_memory());
 EXPECT_EQ(solution.get_termination_status(), pdlp_termination_status_t::Optimal);
 EXPECT_EQ(solution.get_primal_solution_host().size(), 1);
+EXPECT_EQ(solution.get_primal_solution_host(), std::vector<double>({0.0}));
 EXPECT_NO_THROW(solution.get_primal_solution_host());
 EXPECT_NO_THROW(solution.get_dual_solution_host());
cpp/include/cuopt/linear_programming/utilities/remote_solve.hpp (1)
40-56: Consider exception‑free port parsing with std::from_chars. This avoids exceptions in headers and is faster/locale‑independent.
♻️ Suggested change
+#include <charconv>
 ...
 if (host != nullptr && port != nullptr && host[0] != '\0' && port[0] != '\0') {
-  try {
-    int port_num = std::stoi(port);
-    if (port_num < 1 || port_num > 65535) { return std::nullopt; }
-    return remote_solve_config_t{std::string(host), port_num};
-  } catch (...) {
-    // Invalid port number, fall back to local solve
-    return std::nullopt;
-  }
+  int port_num = 0;
+  auto [ptr, ec] = std::from_chars(port, port + std::strlen(port), port_num);
+  if (ec != std::errc{} || ptr == port || *ptr != '\0') { return std::nullopt; }
+  if (port_num < 1 || port_num > 65535) { return std::nullopt; }
+  return remote_solve_config_t{std::string(host), port_num};
 }
cpp/src/linear_programming/solver_solution.cu (1)
733-782: Consider consolidating the repeated host-copy logic.
The device→host copy blocks are repeated in write_to_file, write_to_sol_file, and to_host. Extracting a shared helper would reduce duplication and future maintenance drift.
The device→host copy blocks are repeated inwrite_to_file,write_to_sol_file, andto_host. Extracting a shared helper would reduce duplication and future maintenance drift.Based on learnings: Refactor code duplication in solver components (3+ occurrences) into shared utilities; for GPU kernels, use templated device functions to avoid duplication.
/**
 * @brief Copy solution data from GPU to CPU memory.
 *
 * After calling this method, is_device_memory() will return false and
 * the solution can be accessed via get_solution_host().
 * This is useful for remote solve scenarios where serialization requires
 * CPU-accessible data.
 *
 * If the solution is already in CPU memory, this is a no-op.
 *
 * @param stream_view The CUDA stream to use for the copy
 */
void to_host(rmm::cuda_stream_view stream_view);
Ensure to_host uses CUDA_CHECK and async copy on the provided stream.
This path will likely call cudaMemcpyAsync/rmm::device_uvector::copy_async; please confirm CUDA errors are checked and avoid cudaDeviceSynchronize in hot paths.
As per coding guidelines: Every CUDA kernel launch and memory operation must have error checking with CUDA_CHECK or equivalent verification.
#!/bin/bash
# Inspect to_host implementation for CUDA_CHECK usage and sync patterns.
rg -n "mip_solution_t<.*>::to_host\\(" -C3🤖 Prompt for AI Agents
In `@cpp/include/cuopt/linear_programming/mip/solver_solution.hpp` around lines
157 - 169, The to_host(rmm::cuda_stream_view) implementation must perform an
asynchronous device-to-host copy on the provided stream and check CUDA errors
without synchronizing; update mip_solution_t<...>::to_host to call
rmm::device_uvector::copy_async (or cudaMemcpyAsync) into the host buffer when
is_device_memory() is true, pass stream_view.stream() for the stream, wrap the
copy call with CUDA_CHECK (or the project’s CUDA_CHECK macro) to validate the
operation, and remove any cudaDeviceSynchronize or stream synchronization calls
so the path remains async; ensure after scheduling the copy you update internal
state so get_solution_host() refers to the host buffer once the copy is
enqueued.
optimization_problem_solution_t<i_t, f_t> solve_lp_remote(
  const remote_solve_config_t&,
  const cuopt::mps_parser::data_model_view_t<i_t, f_t>& view,
  const pdlp_solver_settings_t<i_t, f_t>&)
{
  auto n_rows = view.get_constraint_matrix_offsets().size() > 0
                  ? static_cast<i_t>(view.get_constraint_matrix_offsets().size()) - 1
                  : 0;
  auto n_cols = static_cast<i_t>(view.get_objective_coefficients().size());

  std::vector<f_t> primal_solution(static_cast<size_t>(n_cols), f_t{0});
  std::vector<f_t> dual_solution(static_cast<size_t>(n_rows), f_t{0});
  std::vector<f_t> reduced_cost(static_cast<size_t>(n_cols), f_t{0});

  typename optimization_problem_solution_t<i_t, f_t>::additional_termination_information_t stats;
  stats.number_of_steps_taken           = 0;
  stats.total_number_of_attempted_steps = 0;
  stats.l2_primal_residual              = f_t{0};
  stats.l2_relative_primal_residual     = f_t{0};
  stats.l2_dual_residual                = f_t{0};
  stats.l2_relative_dual_residual       = f_t{0};
  stats.primal_objective                = f_t{0};
  stats.dual_objective                  = f_t{0};
  stats.gap                             = f_t{0};
  stats.relative_gap                    = f_t{0};
  stats.max_primal_ray_infeasibility    = f_t{0};
  stats.primal_ray_linear_objective     = f_t{0};
  stats.max_dual_ray_infeasibility      = f_t{0};
  stats.dual_ray_linear_objective       = f_t{0};
  stats.solve_time                      = 0.0;
  stats.solved_by_pdlp                  = false;

  return optimization_problem_solution_t<i_t, f_t>(std::move(primal_solution),
                                                   std::move(dual_solution),
                                                   std::move(reduced_cost),
                                                   "",
                                                   {},
                                                   {},
                                                   stats,
                                                   pdlp_termination_status_t::Optimal);
}
Enforce host‑memory precondition in remote stubs.
Remote solve is defined to require host‑resident data; the stubs should assert that to prevent silent misuse.
🛑 Proposed guard
optimization_problem_solution_t<i_t, f_t> solve_lp_remote(
const remote_solve_config_t&,
const cuopt::mps_parser::data_model_view_t<i_t, f_t>& view,
const pdlp_solver_settings_t<i_t, f_t>&)
{
+ cuopt_assert(!view.is_device_memory(), "Remote solve requires host memory view");
auto n_rows = view.get_constraint_matrix_offsets().size() > 0
? static_cast<i_t>(view.get_constraint_matrix_offsets().size()) - 1
: 0;
@@
mip_solution_t<i_t, f_t> solve_mip_remote(
const remote_solve_config_t&,
const cuopt::mps_parser::data_model_view_t<i_t, f_t>& view,
const mip_solver_settings_t<i_t, f_t>&)
{
+ cuopt_assert(!view.is_device_memory(), "Remote solve requires host memory view");
auto n_cols = static_cast<i_t>(view.get_objective_coefficients().size());Also applies to: 117-139
🤖 Prompt for AI Agents
In `@cpp/include/cuopt/linear_programming/utilities/remote_solve.hpp` around lines
71 - 110, Add a precondition check in the remote stub(s) to enforce that input
data is host-resident: in solve_lp_remote(...) (and the other remote overload)
assert or throw if the provided cuopt::mps_parser::data_model_view_t<i_t,f_t>&
view is not on the host (use the view's host-residency accessor, e.g.,
view.is_host_resident() or view.is_on_host(), or equivalent), so the stub fails
fast instead of silently proceeding; place this check at the top of the
functions (before computing n_rows/n_cols) and keep behavior consistent (assert
for debug builds or throw std::invalid_argument/std::runtime_error for public
API).
class RemoteSolveConfigTest : public ::testing::Test {
 protected:
  void SetUp() override
  {
    // Save original environment variables
    const char* host_env = std::getenv("CUOPT_REMOTE_HOST");
    const char* port_env = std::getenv("CUOPT_REMOTE_PORT");
    saved_host_ = host_env ? std::string(host_env) : std::string();
    saved_port_ = port_env ? std::string(port_env) : std::string();
    had_host_   = (host_env != nullptr);
    had_port_   = (port_env != nullptr);

    // Clean environment for test
    unsetenv("CUOPT_REMOTE_HOST");
    unsetenv("CUOPT_REMOTE_PORT");
  }

  void TearDown() override
  {
    // Restore original environment variables
    if (had_host_) {
      setenv("CUOPT_REMOTE_HOST", saved_host_.c_str(), 1);
    } else {
      unsetenv("CUOPT_REMOTE_HOST");
    }
    if (had_port_) {
      setenv("CUOPT_REMOTE_PORT", saved_port_.c_str(), 1);
    } else {
      unsetenv("CUOPT_REMOTE_PORT");
    }
Guard global env var mutations across tests.
CUOPT_REMOTE_* is process‑global; if tests run in parallel, SetUp/TearDown can interleave and cause flaky behavior. Consider holding a mutex for the full test scope.
Based on learnings: Ensure test isolation: prevent GPU state, cached memory, and global variables from leaking between test cases; verify each test independently initializes its environment.
🔒 Proposed fix: scoped env lock
+#include <mutex>
...
namespace cuopt::linear_programming::test {
+namespace {
+std::mutex env_mutex;
+}
+
class RemoteSolveConfigTest : public ::testing::Test {
protected:
void SetUp() override
{
+ env_lock_ = std::unique_lock<std::mutex>(env_mutex);
// Save original environment variables
const char* host_env = std::getenv("CUOPT_REMOTE_HOST");
const char* port_env = std::getenv("CUOPT_REMOTE_PORT");
@@
void TearDown() override
{
// Restore original environment variables
if (had_host_) {
setenv("CUOPT_REMOTE_HOST", saved_host_.c_str(), 1);
} else {
unsetenv("CUOPT_REMOTE_HOST");
}
@@
if (had_port_) {
setenv("CUOPT_REMOTE_PORT", saved_port_.c_str(), 1);
} else {
unsetenv("CUOPT_REMOTE_PORT");
}
+ env_lock_.unlock();
}
private:
std::string saved_host_;
std::string saved_port_;
bool had_host_ = false;
bool had_port_ = false;
+ std::unique_lock<std::mutex> env_lock_;
🤖 Prompt for AI Agents
In `@cpp/tests/linear_programming/unit_tests/memory_model_infrastructure_test.cu`
around lines 36 - 65, The tests mutate process-global CUOPT_REMOTE_HOST/PORT in
RemoteSolveConfigTest::SetUp/ TearDown which can race when tests run in
parallel; fix by introducing a global test-scope mutex (e.g., a static
std::mutex or a test helper ScopedEnvLock) and acquire it for the entire
duration of each test instance in SetUp and release in TearDown (or use RAII
ScopedEnvLock constructed in SetUp and destroyed in TearDown); ensure the lock
covers both the unsetenv/setenv calls and any GPU/global-state touches so
environment and global state are serialized across tests.
NOTE: Remote solve is stubbed in this commit and returns dummy solutions. Full gRPC implementation will be added in a follow-up PR.
Overview:
This change enables cuOpt to work with problem data in host (CPU) memory, supporting both local GPU solves and future remote solve capabilities.
Key Changes:
Core Infrastructure:
Solution Memory Model:
Cython/Python:
Remote Solve Support (Stubbed):
CLI:
Tests:
Description
Issue
Checklist
Summary by CodeRabbit
New Features
Improvements
Tests
Chores