@rgsl888prabhu rgsl888prabhu commented Jan 27, 2026

Description

Issue

Checklist

  • I am familiar with the Contributing Guidelines.
  • Testing
    • New or existing tests cover these changes
    • Added tests
    • Created an issue to follow-up
    • NA
  • Documentation
    • The documentation is up to date with these changes
    • Added new documentation
    • NA

Summary by CodeRabbit

  • New Features

    • Remote solve over gRPC: submit jobs, check status, stream/get results and logs, cancel/delete jobs; CLI and Python cancel_job; probe/report gRPC limits.
    • Local/remote unified solve path and CPU-host solution support for remote solves.
  • Documentation

    • Added architecture, prototype/status, build notes, serialization plugin guide, and solution memory architecture.
  • Chores

    • Enabled gRPC/Protobuf in builds, added build targets/scripts, CI install helpers, and updated Conda/dependency manifests.


tmckayus and others added 13 commits January 26, 2026 09:58
Introduce CPU/remote-safe solution handling, gRPC service definitions, and remote solve client/server plumbing to support remote execution and log streaming while retaining local solve behavior.
Include gRPC packages in dependency manifests and regenerate environment files from dependencies.yaml.
Ensure incumbent callbacks are retained for gRPC workers, switch incumbent job IDs to bytes in the proto, and harden remote solve CPU-only paths and build configuration defaults.
@rgsl888prabhu rgsl888prabhu requested review from a team as code owners January 27, 2026 17:45
@rgsl888prabhu rgsl888prabhu added non-breaking Introduces a non-breaking change improvement Improves an existing functionality labels Jan 27, 2026
@rgsl888prabhu rgsl888prabhu changed the title Remote execution 2 - testing CI [Don't review]Remote execution 2 - testing CI Jan 27, 2026

coderabbitai bot commented Jan 27, 2026

📝 Walkthrough

Adds comprehensive gRPC-based remote-solve support: protobuf/gRPC schemas, server and client implementations, pluggable protobuf serializer, remote-solve C/C++/Python APIs, hybrid GPU/CPU solution memory model, build/CI integration for gRPC/Protobuf, and developer docs/scripts. All changes are additive; no existing public declarations removed.

Changes

  • Docs — GRPC_ARCHITECTURE.md, GRPC_BUILD_STATUS.md, GRPC_PROTOTYPE_STATUS.md, docs/developer/SERIALIZATION_PLUGIN_GUIDE.md, docs/developer/SOLUTION_MEMORY_ARCHITECTURE.md
    New architecture, prototype/build-status, plugin guide, and solution-memory documentation for remote/gRPC work.
  • Protobuf & gRPC schemas — cpp/src/linear_programming/utilities/cuopt_remote.proto, cpp/src/linear_programming/utilities/cuopt_remote_service.proto
    New protobuf messages and gRPC service (submit/upload, status/result/streaming, logs, cancel/delete, incumbents, chunked uploads).
  • Server implementation — cpp/cuopt_remote_server.cpp
    New TCP/IPC-based remote solve server with job queue, workers, pipe/SHM IPC, sync/async semantics, log management (~2300 LOC).
  • gRPC client & transport — cpp/src/linear_programming/utilities/remote_solve_grpc.cpp, cpp/src/linear_programming/utilities/remote_solve_grpc.hpp
    New gRPC client with submit/upload, streaming results/logs, status/cancel/delete, chunked-upload and server-feedback handling.
  • Remote solve runtime — cpp/src/linear_programming/utilities/remote_solve.cu, cpp/include/.../remote_solve.hpp
    Remote entry points (solve_lp_remote/solve_mip_remote), cancel_job_remote, config helpers, and explicit instantiations; routes local vs remote solves.
  • Serialization framework & implementation — cpp/include/.../remote_serialization.hpp, cpp/src/.../protobuf_serializer.cu, docs/developer/SERIALIZATION_PLUGIN_GUIDE.md
    Abstract remote_serializer_t interface and Protobuf-backed implementation plus plugin/factory support for custom serializers.
  • Solution memory model — cpp/include/.../mip/solver_solution.hpp, cpp/include/.../pdlp/solver_solution.hpp, cpp/src/linear_programming/solver_solution.cu, cpp/src/mip/solver_solution.cu, docs/developer/SOLUTION_MEMORY_ARCHITECTURE.md
    Dual GPU/CPU solution storage: host buffers, is_device_memory flag, to_host(), host setters/getters, and remote-deserialization support.
  • View-based solve integration — cpp/include/.../data_model_view.hpp, cpp/include/.../solve.hpp, cpp/src/linear_programming/solve.cu, cpp/src/mip/solve.cu
    Non-owning data_model_view_t APIs and new solve overloads accepting views; unified routing for local vs remote with CPU/GPU copy helpers.
  • MPS parser & build — cpp/libmps_parser/include/mps_parser/data_model_view.hpp, cpp/libmps_parser/src/data_model_view.cpp, cpp/libmps_parser/CMakeLists.txt
    Added is_device_memory on data_model_view, adjusted explicit instantiations, and added shared/static mps_parser targets/aliases.
  • C API, CLI & tools — cpp/src/linear_programming/cuopt_c.cpp, cpp/cuopt_cli.cpp, cpp/cuopt_cancel_job.cpp, cpp/scripts/run_dev_server.sh
    Dual-path C API for CPU/GPU problems, CLI crash handlers and gRPC max probing, cancel-job utility, dev-run script.
  • Build & CI — build.sh, cpp/CMakeLists.txt, cpp/src/linear_programming/CMakeLists.txt, cpp/libmps_parser/CMakeLists.txt, ci/build_wheel_libcuopt.sh, ci/utils/install_protobuf_grpc.sh, python/libcuopt/CMakeLists.txt, conda/.../all_cuda-*.yaml, conda/recipes/libcuopt/recipe.yaml, dependencies.yaml
    Add CUOPT_ENABLE_GRPC/BUILD_REMOTE_SERVER flags and cuopt_grpc_server target; CMake/protobuf/gRPC hints; CI install helper; conda/dependency additions.
  • Python bindings & utilities — python/cuopt/cuopt/__init__.py, python/cuopt/cuopt/linear_programming/__init__.py, python/cuopt/cuopt/linear_programming/remote.py, python/cuopt/cuopt/linear_programming/solver/solver.pxd, python/cuopt/cuopt/linear_programming/solver/solver_wrapper.pyx, python/cuopt/cuopt/utilities/__init__.py, python/cuopt/cuopt/utilities/utils.py, python/cuopt/cuopt/linear_programming/internals/internals.pyx
    New Python remote.cancel API (JobStatus/CancelResult), lazy module/utility imports, Cython solver types extended for host/device buffers, ctypes-based numpy-buffer path, and solver-wrapper CPU/GPU handling.
  • Callback & internals — cpp/include/.../callbacks_implems.hpp, cpp/include/.../internals.hpp, cpp/include/.../cython_solve.hpp
    Added callback_memory_location enum and APIs; data_on_device checks to select get_numba_matrix vs get_numpy_array; cython types extended with host buffers/is_device_memory.
  • Misc — .gitignore
    Added ignore entry dev_scripts/test_c_api.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

  • Title check — ⚠️ Warning. The title '[Don't review]Remote execution 2 - testing CI' is vague and hides the main technical change behind a meta-annotation ('[Don't review]') and the generic phrase 'testing CI'; the changeset actually implements comprehensive gRPC-based remote execution infrastructure with protobuf serialization, multiple new servers, and solver support. Resolution: use a specific title reflecting the main change, e.g., 'Add gRPC remote execution infrastructure with protobuf serialization' or 'Implement remote solver server and client with async job management'.
  • Docstring Coverage — ⚠️ Warning. Docstring coverage is 26.07%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.

✅ Passed checks (1 passed)

  • Description Check — ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.



coderabbitai bot left a comment

Actionable comments posted: 7

Note

Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
python/cuopt/cuopt/utilities/utils.py (1)

45-64: Pre-existing bug: operator precedence issue in boundary checks.

Lines 53 and 59 contain a precedence bug that predates this PR:

if len(data.get_variable_lower_bounds() > 0):  # Wrong

This applies > 0 element-wise to the array first, then takes len() of the resulting boolean array. The intent appears to be:

if len(data.get_variable_lower_bounds()) > 0:  # Correct

While this is pre-existing code not modified in this PR, it may cause unexpected behavior in validation. Consider addressing in a follow-up.

cpp/include/cuopt/linear_programming/utilities/callbacks_implems.hpp (2)

41-52: Missing null check on PyObject_CallMethod result before Py_DECREF.

If PyObject_CallMethod fails (e.g., Python exception), it returns NULL. Calling Py_DECREF(res) on NULL (line 51) causes undefined behavior. The same applies to numba_matrix and numpy_array if their creation fails.

🐛 Proposed fix: Add null checks
   void get_solution(void* data, void* objective_value) override
   {
     PyObject* numba_matrix =
       data_on_device() ? get_numba_matrix(data, n_variables) : get_numpy_array(data, n_variables);
+    if (!numba_matrix) return;  // Python exception set
     PyObject* numpy_array =
       data_on_device() ? get_numba_matrix(objective_value, 1) : get_numpy_array(objective_value, 1);
+    if (!numpy_array) { Py_DECREF(numba_matrix); return; }
     PyObject* res =
       PyObject_CallMethod(this->pyCallbackClass, "get_solution", "(OO)", numba_matrix, numpy_array);
     Py_DECREF(numba_matrix);
     Py_DECREF(numpy_array);
-    Py_DECREF(res);
+    Py_XDECREF(res);  // Use XDECREF to handle NULL safely
   }

80-91: Same null check issue in set_solution.

Apply the same null-safety pattern here as recommended for get_solution.

🐛 Proposed fix
   void set_solution(void* data, void* objective_value) override
   {
     PyObject* numba_matrix =
       data_on_device() ? get_numba_matrix(data, n_variables) : get_numpy_array(data, n_variables);
+    if (!numba_matrix) return;
     PyObject* numpy_array =
       data_on_device() ? get_numba_matrix(objective_value, 1) : get_numpy_array(objective_value, 1);
+    if (!numpy_array) { Py_DECREF(numba_matrix); return; }
     PyObject* res =
       PyObject_CallMethod(this->pyCallbackClass, "set_solution", "(OO)", numba_matrix, numpy_array);
     Py_DECREF(numba_matrix);
     Py_DECREF(numpy_array);
-    Py_DECREF(res);
+    Py_XDECREF(res);
   }
cpp/libmps_parser/CMakeLists.txt (1)

137-143: Static library missing BZip2/ZLIB include directories.

When MPS_PARSER_WITH_BZIP2 or MPS_PARSER_WITH_ZLIB are enabled, only the shared library gets the include directories. The static library should also have these.

🔧 Proposed fix
 if(MPS_PARSER_WITH_BZIP2)
     target_include_directories(mps_parser PRIVATE BZip2::BZip2)
+    target_include_directories(mps_parser_static PRIVATE BZip2::BZip2)
 endif(MPS_PARSER_WITH_BZIP2)

 if(MPS_PARSER_WITH_ZLIB)
     target_include_directories(mps_parser PRIVATE ZLIB::ZLIB)
+    target_include_directories(mps_parser_static PRIVATE ZLIB::ZLIB)
 endif(MPS_PARSER_WITH_ZLIB)
🤖 Fix all issues with AI agents
In `@cpp/cuopt_remote_server.cpp`:
- Around line 303-313: generate_job_id currently declares static std::mt19937
gen and static std::uniform_int_distribution<uint64_t> dis which are shared
across threads and cause data races; modify generate_job_id to use thread-local
RNG state (e.g., make rd, gen, and dis thread_local) or protect the static
generators with a mutex so concurrent calls are safe; update the function
implementation around the symbols generate_job_id, gen, dis (and rd if present)
to apply one of these fixes and ensure generated IDs remain formatted the same.
- Around line 289-297: The signal handler signal_handler currently uses
non-async-signal-safe operations (std::cout, accessing shm_ctrl, and
result_cv.notify_all()), so change it to only set an atomic flag (e.g., a global
volatile sig_atomic_t or std::atomic<bool> shutdown_signal_received) and set
keep_running = false if that is signal-safe; remove direct access to shm_ctrl
and result_cv from signal_handler. In the main accept loop or a dedicated
monitor thread, poll or wait on shutdown_signal_received and there perform the
non-safe actions: log the shutdown message (std::cout), set
shm_ctrl->shutdown_requested, and call result_cv.notify_all(). Update references
in code to use shutdown_signal_received where signal_handler used to toggle
shutdown behavior.
- Around line 548-562: The current loop that finds a free slot in result_queue
sets fields non-atomically and races on result_queue[i].ready; replace this
linear check-and-write with an atomic claim using an atomic flag and
compare_exchange_strong (the same pattern used in the job-claiming code path) so
only one worker can win the slot before writing other fields. Concretely, change
the probe that tests result_queue[i].ready into an atomic compare-and-set (e.g.,
attempt to swap from false to true using std::atomic<bool> or an integer flag),
and only on successful compare_exchange_strong proceed to populate job_id,
status, data_size, shm_data_name, worker_index, error_message, retrieved, etc.;
apply the same fix to all other result_queue allocation sites referenced in the
review (the blocks around the given snippet and at the other line ranges noted)
to ensure atomic slot claiming.

In `@cpp/src/linear_programming/solver_solution.cu`:
- Around line 442-476: The host accessor methods of
optimization_problem_solution_t (get_primal_solution_host,
get_dual_solution_host, get_reduced_cost_host and their const overloads)
dereference unique_ptr members without checking is_device_memory_; add a runtime
guard to prevent null dereference by asserting or throwing when
is_device_memory_ is true or when the corresponding unique_ptr
(primal_solution_host_, dual_solution_host_, reduced_cost_host_) is null; update
each accessor to check is_device_memory_ (or pointer) and produce a clear error
(e.g., assert/throw with a message) before returning *primal_solution_host_,
*dual_solution_host_, or *reduced_cost_host_.
- Around line 411-421: The device accessor methods (get_primal_solution(),
get_dual_solution(), get_reduced_cost()) currently dereference the unique_ptr
members (e.g., primal_solution_) unconditionally and can dereference null when
is_device_memory_ is false; update each accessor to guard against null by
asserting or throwing a clear precondition (e.g., check is_device_memory_ or
that the unique_ptr is non-null) before dereferencing, and document the behavior
(or return/throw) so callers must call is_device_memory_ first when appropriate;
make the change inside the methods referenced above to validate
primal_solution_/dual_solution_/reduced_cost_ before returning *ptr.

In `@cpp/src/linear_programming/utilities/protobuf_serializer.cu`:
- Around line 1303-1334: lp_solution_to_proto currently calls cudaMemcpy on
device buffers (d_primal, d_dual, d_reduced_cost via
nc_solution.get_reduced_cost()) without checking for CUDA errors; update each
cudaMemcpy call to capture and check the returned cudaError_t and handle
failures (e.g., log via processLogger/return an error/throw) so serialization
reports failures instead of silently continuing. Locate the cudaMemcpy usages in
lp_solution_to_proto (references: d_primal, d_dual,
nc_solution/get_reduced_cost, pb_solution) and wrap each call by storing the
result (cudaError_t err = cudaMemcpy(...)); if err != cudaSuccess, convert the
error to a clear error path (return an error code or throw an exception or call
the existing logger and abort the proto conversion) so callers receive the
failure.

In `@docs/developer/SERIALIZATION_PLUGIN_GUIDE.md`:
- Around line 570-573: The doc incorrectly points to
`cpp/src/linear_programming/utilities/serializers/msgpack_serializer.cpp` and
its `CMakeLists.txt` as a working MsgPack C++ example; remove or replace that
reference and instead explain that MsgPack serialization is implemented in the
Python client `cuopt-sh-client` using the `msgpack` and `msgpack-numpy`
libraries, or link to an actual C++ serializer that exists under
`cpp/src/linear_programming/utilities/serializers/` if you want a C++ example;
update the sentence and any bullets to reference `cuopt-sh-client` and the
Python libraries (`msgpack`, `msgpack-numpy`) or the correct C++ serializer file
name accordingly.
🟡 Minor comments (16)
python/cuopt/cuopt/linear_programming/internals/internals.pyx-52-56 (1)

52-56: Validate typestr and shape to avoid mis-sized buffers

The current else fallback silently treats any non-float32 type as float64. If a different typestr slips through, this can create a wrong-sized view and read beyond the buffer. Consider explicit validation and a fast-fail.

Proposed fix
 def get_numpy_array(self, data, shape, typestr):
-        ctype = ctypes.c_float if typestr == "float32" else ctypes.c_double
+        dtype_map = {"float32": ctypes.c_float, "float64": ctypes.c_double}
+        if not isinstance(shape, int) or shape < 0:
+            raise ValueError("shape must be a non-negative int")
+        try:
+            ctype = dtype_map[typestr]
+        except KeyError:
+            raise ValueError(f"Unsupported typestr: {typestr}")
         buf_type = ctype * shape
         buf = buf_type.from_address(data)
         numpy_array = np.ctypeslib.as_array(buf)
         return numpy_array
GRPC_BUILD_STATUS.md-169-171 (1)

169-171: Clarify the branch reference to avoid confusion.
The doc names grpc-implementation, but this PR is remote_execution_2. Consider updating or removing the branch line to keep this status doc accurate.

🔧 Suggested edit
-**Branch**: `grpc-implementation` (3 commits, ready for review)
+**Branch**: `remote_execution_2` (this PR)
cpp/libmps_parser/CMakeLists.txt-81-85 (1)

81-85: Static library mps_parser_static missing target properties and compile options.

The static library is created but doesn't inherit the same set_target_properties and target_compile_options applied to the shared library (lines 87-100). This could cause ABI mismatches or compilation issues when linking.

🔧 Proposed fix: Apply same properties to static library
+set_target_properties(mps_parser_static
+  PROPERTIES
+  CXX_STANDARD 20
+  CXX_STANDARD_REQUIRED ON
+  POSITION_INDEPENDENT_CODE ON
+  CXX_SCAN_FOR_MODULES OFF
+)
+
+target_compile_options(mps_parser_static
+  PRIVATE "$<$<COMPILE_LANGUAGE:CXX>:${MPS_PARSER_CXX_FLAGS}>"
+)
GRPC_PROTOTYPE_STATUS.md-113-116 (1)

113-116: Remove hardcoded local development path.

The build instructions contain a developer-specific local path (/home/tmckay/repos/nvidia-cuopt). This should be replaced with a generic placeholder or relative path for public documentation.

Suggested fix
 ### Prerequisites
 ```bash
 # Ensure gRPC is available in conda environment
 conda install grpc-cpp libgrpc -c conda-forge

 # Build with gRPC enabled
-cd /home/tmckay/repos/nvidia-cuopt
+cd <path-to-cuopt-repo>
 ./build.sh  # or your build command with -DBUILD_REMOTE_SERVER=ON
docs/developer/SERIALIZATION_PLUGIN_GUIDE.md-182-197 (1)

182-197: Incomplete code example - missing return statement.

The deserialize_lp_solution example has commented-out error handling but no actual return statement for the success path. This could confuse plugin developers.

📝 Suggested fix
   optimization_problem_solution_t<i_t, f_t> deserialize_lp_solution(
     const std::vector<uint8_t>& data) override
   {
     // Parse the solution data
     // Create and return solution object
+    optimization_problem_solution_t<i_t, f_t> solution;
+    // ... populate solution from data ...
+    return solution;

     // On error, return error solution:
     // return optimization_problem_solution_t<i_t, f_t>(
     //   cuopt::logic_error("Parse error", cuopt::error_type_t::RuntimeError));
   }
cpp/cuopt_cancel_job.cpp-74-74 (1)

74-74: std::atoi silently returns 0 on invalid input.

If CUOPT_REMOTE_PORT contains non-numeric text, atoi returns 0 without indication of error, which would cause connection failure to port 0.

🔧 Suggested improvement
-  if (env_port && env_port[0]) { port = std::atoi(env_port); }
+  if (env_port && env_port[0]) {
+    try {
+      port = std::stoi(env_port);
+    } catch (const std::exception&) {
+      std::cerr << "Warning: Invalid CUOPT_REMOTE_PORT value, using default\n";
+    }
+  }
cpp/cuopt_cancel_job.cpp-82-83 (1)

82-83: std::stoi can throw on invalid input - consider adding error handling.

If a user passes an invalid port string (e.g., -p abc), std::stoi will throw std::invalid_argument or std::out_of_range, causing an unhandled exception crash.

🔧 Suggested fix
     } else if (strcmp(argv[i], "-p") == 0 && i + 1 < argc) {
-      port = std::stoi(argv[++i]);
+      try {
+        port = std::stoi(argv[++i]);
+      } catch (const std::exception& e) {
+        std::cerr << "Error: Invalid port number\n";
+        return 1;
+      }
     } else if (argv[i][0] != '-') {
python/cuopt/cuopt/linear_programming/remote.py-141-173 (1)

141-173: Socket may not be closed on all exception paths.

If an exception occurs after sock.connect() but before the explicit sock.close() calls, the socket will leak. Consider using a context manager or try/finally.

🔧 Suggested fix using context manager pattern
     try:
         # Connect to server
-        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
-        sock.settimeout(30.0)  # 30 second timeout
-        sock.connect((host, port))
-
-        # Send request (length-prefixed)
-        data = request.SerializeToString()
-        sock.sendall(struct.pack("<Q", len(data)))
-        sock.sendall(data)
-
-        # Receive response
-        size_data = b""
-        while len(size_data) < 8:
-            chunk = sock.recv(8 - len(size_data))
-            if not chunk:
-                sock.close()
-                return CancelResult(...)
-            size_data += chunk
-        ...
-        sock.close()
+        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
+            sock.settimeout(30.0)  # 30 second timeout
+            sock.connect((host, port))
+
+            # Send request (length-prefixed)
+            data = request.SerializeToString()
+            sock.sendall(struct.pack("<Q", len(data)))
+            sock.sendall(data)
+
+            # Receive response
+            size_data = b""
+            while len(size_data) < 8:
+                chunk = sock.recv(8 - len(size_data))
+                if not chunk:
+                    return CancelResult(...)
+                size_data += chunk
+            ...
cpp/src/linear_programming/utilities/cuopt_remote_service.proto-210-221 (1)

210-221: Inconsistent job_id type: bytes vs string.

The IncumbentRequest.job_id and Incumbent.job_id fields use bytes type (lines 211, 220), while all other request messages use string type for job_id (e.g., StatusRequest.job_id at line 155, GetResultRequest.job_id at line 160).

This inconsistency could lead to encoding/decoding issues or require special handling in client/server code. If job IDs are always human-readable strings (like UUIDs), consider using string consistently:

 message IncumbentRequest {
-  bytes job_id = 1;
+  string job_id = 1;
   int64 from_index = 2;        // Return incumbents starting from this index
   int32 max_count = 3;         // Optional limit (0 or negative => no limit)
 }

 message Incumbent {
   int64 index = 1;
   double objective = 2;
   repeated double assignment = 3;
-  bytes job_id = 4;
+  string job_id = 4;
 }
cpp/src/linear_programming/utilities/remote_solve.cu-197-202 (1)

197-202: Log streaming thread captures stop_logs by reference with volatile qualifier.

The volatile bool stop_logs pattern is used to signal the log thread to terminate. While volatile prevents compiler optimizations, it does not guarantee memory visibility across threads on modern architectures. Consider using std::atomic<bool> for proper cross-thread synchronization:

🔒️ Suggested fix
-  volatile bool stop_logs = false;
+  std::atomic<bool> stop_logs{false};
   std::thread log_thread;
   if (settings.log_to_console) {
     log_thread =
-      std::thread([&]() { grpc_remote::stream_logs_to_stdout(address, job_id, &stop_logs, ""); });
+      std::thread([&]() { grpc_remote::stream_logs_to_stdout(address, job_id, stop_logs, ""); });
   }

Note: This requires updating stream_logs_to_stdout signature to accept std::atomic<bool>& or std::atomic<bool>*.
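A minimal sketch of what the updated function could look like with the reference-taking variant used in the diff above (the streaming body itself is elided; only the flag type changes):

#include <atomic>
#include <string>

namespace grpc_remote {

// Sketch only: polls/streams job logs until the solve thread sets the flag.
void stream_logs_to_stdout(const std::string& address,
                           const std::string& job_id,
                           std::atomic<bool>& stop_logs,
                           const std::string& filter)
{
  while (!stop_logs.load(std::memory_order_acquire)) {
    // fetch the next log chunk for job_id from address, apply filter,
    // write it to stdout, then back off briefly between polls
  }
}

}  // namespace grpc_remote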

cpp/src/linear_programming/utilities/remote_solve.cu-38-47 (1)

38-47: Static variable logged_no_device is not thread-safe.

The static bool logged_no_device variable can cause a data race if copy_incumbent_to_device is called concurrently from multiple threads. While the race condition only affects logging (not correctness), it violates thread-safety expectations.

Consider using std::atomic<bool> or accepting duplicate log messages:

🔒️ Suggested fix
   int device_count       = 0;
   cudaError_t device_err = cudaGetDeviceCount(&device_count);
   if (device_err != cudaSuccess || device_count == 0) {
-    static bool logged_no_device = false;
+    static std::atomic<bool> logged_no_device{false};
-    if (!logged_no_device) {
+    bool expected = false;
+    if (logged_no_device.compare_exchange_strong(expected, true)) {
       CUOPT_LOG_INFO("[remote_solve] No CUDA device available; using host incumbents");
-      logged_no_device = true;
     }
     return false;
   }
cpp/include/cuopt/linear_programming/utilities/remote_solve.hpp-44-51 (1)

44-51: Missing port range validation and broad exception catch.

The port parsing has two issues:

  1. No port range validation: Valid TCP ports are 1-65535. A value like 99999 or 0 would be accepted.
  2. Broad exception catch: catch (...) silently ignores all exceptions, including std::out_of_range (for values outside int range). Consider catching specific exceptions and optionally logging.
🛠️ Suggested fix
   if (host != nullptr && port != nullptr && host[0] != '\0' && port[0] != '\0') {
     try {
       int port_num = std::stoi(port);
+      if (port_num < 1 || port_num > 65535) {
+        return std::nullopt;  // Invalid port range
+      }
       return remote_solve_config_t{std::string(host), port_num};
-    } catch (...) {
-      // Invalid port number, fall back to local solve
+    } catch (const std::invalid_argument&) {
+      // Non-numeric port value
+      return std::nullopt;
+    } catch (const std::out_of_range&) {
+      // Port value out of int range
       return std::nullopt;
     }
   }
cpp/src/mip/solve.cu-486-494 (1)

486-494: Missing CUDA error checking on cudaMemcpyAsync and cudaStreamSynchronize.

Per coding guidelines, CUDA memory operations should have error checking. The cudaMemcpyAsync and cudaStreamSynchronize calls don't verify success:

🛠️ Suggested fix
   // Variable types need special handling (char array)
   auto var_types_span = gpu_view.get_variable_types();
   if (var_types_span.size() > 0) {
     cpu_data.variable_types.resize(var_types_span.size());
-    cudaMemcpyAsync(cpu_data.variable_types.data(),
-                    var_types_span.data(),
-                    var_types_span.size() * sizeof(char),
-                    cudaMemcpyDeviceToHost,
-                    stream);
+    cudaError_t err = cudaMemcpyAsync(cpu_data.variable_types.data(),
+                                      var_types_span.data(),
+                                      var_types_span.size() * sizeof(char),
+                                      cudaMemcpyDeviceToHost,
+                                      stream);
+    if (err != cudaSuccess) {
+      CUOPT_LOG_ERROR("Failed to copy variable types from device: %s", cudaGetErrorString(err));
+    }
   }

   // Synchronize to ensure all copies are complete
-  cudaStreamSynchronize(stream);
+  cudaError_t sync_err = cudaStreamSynchronize(stream);
+  if (sync_err != cudaSuccess) {
+    CUOPT_LOG_ERROR("Stream synchronization failed: %s", cudaGetErrorString(sync_err));
+  }
cpp/src/linear_programming/utilities/cython_solve.cu-241-305 (1)

241-305: Preserve solved_by_pdlp from remote results.

The remote LP path hard-codes solved_by_pdlp_ = false. Use term_info.solved_by_pdlp so callers see the actual solver used remotely.

🛠️ Proposed fix
-      lp_ret.solved_by_pdlp_     = false;
+      lp_ret.solved_by_pdlp_     = term_info.solved_by_pdlp;
cpp/src/linear_programming/utilities/protobuf_serializer.cu-702-719 (1)

702-719: Silent failure when embedding solution in result response.

When success is true but ParseFromArray fails (lines 707, 712), the response is created with SUCCESS status but without solution data. The client will receive a success response with missing data.

🐛 Recommended fix
       if (is_mip) {
         cuopt::remote::MIPSolution mip_sol;
         if (mip_sol.ParseFromArray(result_data.data(), result_data.size())) {
           result->mutable_mip_solution()->CopyFrom(mip_sol);
+        } else {
+          result->set_status(cuopt::remote::ERROR_INTERNAL);
+          result->set_error_message("Failed to parse MIP solution");
         }
       } else {
         cuopt::remote::LPSolution lp_sol;
         if (lp_sol.ParseFromArray(result_data.data(), result_data.size())) {
           result->mutable_lp_solution()->CopyFrom(lp_sol);
+        } else {
+          result->set_status(cuopt::remote::ERROR_INTERNAL);
+          result->set_error_message("Failed to parse LP solution");
         }
       }
cpp/src/linear_programming/solver_solution.cu-576-580 (1)

576-580: Inconsistent solved_by_pdlp state management.

get_solved_by_pdlp() returns the standalone solved_by_pdlp_ member, but write_additional_termination_statistics_to_file() (line 234) uses termination_stats_.solved_by_pdlp. These can become out of sync.

🐛 Recommended fix: Use single source of truth
 template <typename i_t, typename f_t>
 bool optimization_problem_solution_t<i_t, f_t>::get_solved_by_pdlp() const
 {
-  return solved_by_pdlp_;
+  return termination_stats_.solved_by_pdlp;
 }

 template <typename i_t, typename f_t>
 void optimization_problem_solution_t<i_t, f_t>::set_solved_by_pdlp(bool value)
 {
-  solved_by_pdlp_ = value;
+  termination_stats_.solved_by_pdlp = value;
 }

Then remove the solved_by_pdlp_ member from the class.

Also applies to: 622-626

🧹 Nitpick comments (29)
python/cuopt/cuopt/utilities/__init__.py (1)

15-26: Cache lazy imports after first access.

The current implementation triggers __getattr__ on every access. Standard practice is to cache the imported symbol in the module namespace so subsequent accesses bypass __getattr__.

♻️ Suggested improvement
 def __getattr__(name):
     """Lazy import CUDA-dependent utilities."""
     if name == "type_cast":
         from cuopt.utilities.type_casting import type_cast
+        globals()[name] = type_cast
         return type_cast
     elif name == "series_from_buf":
         from cuopt.utilities.utils import series_from_buf
+        globals()[name] = series_from_buf
         return series_from_buf
     elif name == "check_solution":
         from cuopt.utilities.utils import check_solution
+        globals()[name] = check_solution
         return check_solution
     raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
python/cuopt/cuopt/__init__.py (1)

19-24: Consider caching the imported module to avoid repeated __getattr__ overhead.

Without caching, every attribute access (e.g., cuopt.linear_programming) invokes __getattr__ and importlib.import_module. While sys.modules caches the module itself, the lookup overhead remains. The standard PEP 562 pattern caches in the module namespace:

♻️ Suggested improvement
 def __getattr__(name):
     """Lazy import submodules to support CPU-only hosts with remote solve."""
     if name in _submodules:
         import importlib
-        return importlib.import_module(f"cuopt.{name}")
+        module = importlib.import_module(f"cuopt.{name}")
+        globals()[name] = module
+        return module
     raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
cpp/libmps_parser/include/mps_parser/data_model_view.hpp (1)

409-426: Update parameter docs to reflect host/device flexibility.
Now that the view can represent CPU memory, the existing setter docs still say “Device memory pointer.” Consider updating the Doxygen text to “host or device” and referencing set_is_device_memory() for clarity.
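For example, a setter's parameter doc could become (hypothetical wording):

/**
 * @param values Pointer to host or device memory holding the values;
 *               which one is indicated via set_is_device_memory().
 */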

python/libcuopt/CMakeLists.txt (1)

47-49: BUILD_SHARED_LIBS ON has global scope and may affect other dependencies.

Setting BUILD_SHARED_LIBS ON as a cache variable affects all subsequent add_library calls without explicit type, including dependencies fetched after this point. Consider scoping this more narrowly or documenting the intentional global effect.

♻️ Alternative: Use gRPC-specific variable if available

If gRPC respects gRPC_BUILD_SHARED_LIBS, use that instead:

-set(BUILD_SHARED_LIBS ON CACHE BOOL "" FORCE)
+# gRPC will build shared libs; avoid affecting other targets

Otherwise, document that this is intentional for the wheel build context.

cpp/scripts/run_dev_server.sh (1)

1-24: Add shell safety options and handle empty LD_LIBRARY_PATH.

The script would benefit from shell safety options and proper handling of an unset LD_LIBRARY_PATH:

♻️ Suggested improvements
 #!/bin/bash
 # SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION. All rights reserved.
 # SPDX-License-Identifier: Apache-2.0
 
+set -euo pipefail
+
 # Development script to run the cuopt remote server with the correct library path
 # This is only needed during development when the build directory's libcuopt.so
 # needs to take precedence over the conda-installed version.
 
 SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
 BUILD_DIR="${SCRIPT_DIR}/../build"
 
 if [ ! -f "${BUILD_DIR}/cuopt_remote_server" ]; then
     echo "Error: cuopt_remote_server not found. Please build with -DBUILD_REMOTE_SERVER=ON"
     exit 1
 fi
 
-export LD_LIBRARY_PATH="${BUILD_DIR}:${LD_LIBRARY_PATH}"
+export LD_LIBRARY_PATH="${BUILD_DIR}${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
 
 echo "Starting cuopt remote server..."
 echo "Build directory: ${BUILD_DIR}"

The ${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH} syntax appends :$LD_LIBRARY_PATH only if the variable is set and non-empty, avoiding a trailing colon.

ci/utils/install_protobuf_grpc.sh (1)

22-24: Silent failure on gRPC installation may cause confusing downstream errors.

The fallback chain ends with || true, which silently swallows installation failures. If gRPC packages aren't available, the build will fail later with a less obvious error about missing headers/libraries.

Consider adding a verification step or at least a warning:

Suggested improvement
         # gRPC C++ (headers/libs + grpc_cpp_plugin for codegen)
         # Package names can vary by repo; try the common ones first.
-        dnf install -y grpc-devel grpc-plugins || dnf install -y grpc-devel || true
+        dnf install -y grpc-devel grpc-plugins || dnf install -y grpc-devel || {
+            echo "WARNING: gRPC packages not found. gRPC-enabled builds may fail."
+        }
GRPC_ARCHITECTURE.md (2)

227-233: Clarify TLS/authentication timeline for production deployments.

The security section mentions TLS support "can enable later." For production deployments handling optimization data, consider documenting when TLS should be enabled (e.g., "TLS is recommended for any non-localhost deployment") to guide users on security posture.
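For reference, a minimal sketch of enabling TLS on the server side with the stock gRPC C++ API (the PEM file paths and the read_file helper are assumptions for illustration, not this repo's code):

#include <fstream>
#include <memory>
#include <sstream>
#include <string>

#include <grpcpp/grpcpp.h>

// Hypothetical helper: slurp a PEM file into a string.
static std::string read_file(const std::string& path)
{
  std::ifstream in(path);
  std::ostringstream ss;
  ss << in.rdbuf();
  return ss.str();
}

std::shared_ptr<grpc::ServerCredentials> make_tls_credentials()
{
  grpc::SslServerCredentialsOptions opts;
  grpc::SslServerCredentialsOptions::PemKeyCertPair pair;
  pair.private_key = read_file("server.key");  // assumed path
  pair.cert_chain  = read_file("server.crt");  // assumed path
  opts.pem_key_cert_pairs.push_back(pair);
  return grpc::SslServerCredentials(opts);
}

// Usage: builder.AddListeningPort(address, make_tls_credentials());
// in place of grpc::InsecureServerCredentials().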


183-188: Update roadmap to reflect current progress.

Phase 1 shows the branch creation as unchecked, but this PR indicates the implementation is already in progress. Consider updating the checklist to reflect actual status.

cpp/cuopt_cancel_job.cpp (1)

80-81: Option -h conflicts with common help convention.

Using -h for host conflicts with the common Unix convention where -h often means help. Consider using --host or -H instead, while keeping -h as an alias for --help.

python/cuopt/cuopt/linear_programming/remote.py (1)

22-28: Hardcoded development path is fragile.

The hardcoded relative path to cpp/build is development-specific and may not exist in production installations. This could cause import failures or silently fail to find the protobuf module.

♻️ Suggested improvement

Consider using a more robust approach or removing this entirely for production code:

-    # Add build directory for development
-    build_path = os.path.join(
-        os.path.dirname(__file__), "..", "..", "..", "..", "..", "cpp", "build"
-    )
-    if os.path.exists(build_path):
-        sys.path.insert(0, os.path.abspath(build_path))
+    # Development: set PYTHONPATH to include cpp/build if needed
     import cuopt_remote_pb2 as pb

Alternatively, wrap this in a if "CUOPT_DEV_MODE" in os.environ check.

build.sh (3)

454-495: CMake configuration duplicated from libcuopt block.

The cmake configuration (lines 461-483) is nearly identical to the libcuopt block (lines 425-446). This duplication increases maintenance burden and risks configuration drift.

Consider extracting shared cmake options into a variable or function:

# Example: Define common cmake options once
COMMON_CMAKE_OPTS=(
    -DDEFINE_ASSERT=${DEFINE_ASSERT}
    -DDEFINE_BENCHMARK="${DEFINE_BENCHMARK}"
    # ... other shared options
)

# Then use in both blocks:
cmake "${COMMON_CMAKE_OPTS[@]}" -DCUOPT_ENABLE_GRPC=ON "${REPODIR}"/cpp

517-531: SKBUILD_EXTRA_ARGS processing logic is duplicated.

The same semicolon-escaping and joining logic for SKBUILD_EXTRA_ARGS appears twice (for cuopt and cuopt_mps_parser). Consider extracting this into a shell function to reduce duplication:

♻️ Suggested refactor
# Define once before the build blocks:
function build_skbuild_cmake_args {
    local extra_args=()
    for extra_arg in "${EXTRA_CMAKE_ARGS[@]}"; do
        extra_args+=("${extra_arg//;/\\;}")
    done
    local joined=""
    if [ ${#extra_args[@]} -gt 0 ]; then
        joined="$(IFS=';'; echo "${extra_args[*]}")"
    fi
    local base="-DCMAKE_PREFIX_PATH=${INSTALL_PREFIX};-DCMAKE_LIBRARY_PATH=${LIBCUOPT_BUILD_DIR};-DCMAKE_CUDA_ARCHITECTURES=${CUOPT_CMAKE_CUDA_ARCHITECTURES}"
    if [ -n "$joined" ]; then
        echo "${base};${joined}"
    else
        echo "${base}"
    fi
}

# Then in each build block:
SKBUILD_CMAKE_ARGS="$(build_skbuild_cmake_args)"

Also applies to: 538-550


418-424: Cache invalidation on ZLIB path changes is a pragmatic workaround.

Automatically clearing CMakeCache.txt when it references system ZLIB is a reasonable fix to avoid stale cache issues. However, this could be surprising to developers who expect their cache to persist.

Consider adding a log message when the cache is cleared to improve visibility:

         if grep -Eq "/(usr/)?lib64/cmake/ZLIB" "${LIBCUOPT_BUILD_DIR}/CMakeCache.txt"; then
+            echo "CMake cache references system ZLIB; clearing cache to use conda ZLIB..."
             rm -f "${LIBCUOPT_BUILD_DIR}/CMakeCache.txt"
             rm -rf "${LIBCUOPT_BUILD_DIR}/CMakeFiles"
         fi
cpp/src/linear_programming/utilities/remote_solve.cu (1)

56-84: CUDA error handling uses raw return code checks instead of CUDA_CHECK macro.

Per coding guidelines, CUDA kernel launches and memory operations should use CUDA_CHECK or equivalent verification. The current implementation manually checks return codes, which is correct but inconsistent with the project's error handling patterns.

Consider using the project's standard CUDA error checking macros for consistency:

-  if (cudaMalloc(reinterpret_cast<void**>(d_assignment_out), n * sizeof(f_t)) != cudaSuccess) {
+  cudaError_t err = cudaMalloc(reinterpret_cast<void**>(d_assignment_out), n * sizeof(f_t));
+  if (err != cudaSuccess) {
     CUOPT_LOG_WARN("[remote_solve] Failed to cudaMalloc for incumbent assignment");
     return false;
   }

The current approach is functional, so this is a stylistic recommendation for consistency.

cpp/include/cuopt/linear_programming/utilities/remote_solve.hpp (1)

25-28: Consider using uint16_t for port field.

TCP ports are unsigned 16-bit values (1-65535). Using uint16_t would make the valid range self-documenting:

 struct remote_solve_config_t {
   std::string host;
-  int port;
+  uint16_t port;
 };

This would require including <cstdint> and adjusting the parsing logic slightly.

cpp/cuopt_remote_server.cpp (2)

170-182: Potential race condition in stdout_streamer_t destructor.

The running_ flag is set to false after fflush(stdout) but the reader thread checks running_ in a loop. The close of pipe_fds_[0] is the actual mechanism that unblocks the read() call, so the logic works, but the flag assignment order could be clearer.

Consider moving running_ = false before any cleanup to signal intent:

   ~stdout_streamer_t()
   {
     if (!enabled_) return;
 
+    running_ = false;  // Signal thread to stop first
     fflush(stdout);
     dup2(original_stdout_, STDOUT_FILENO);
     close(original_stdout_);
 
-    running_ = false;
     close(pipe_fds_[0]);
 
     if (reader_thread_.joinable()) { reader_thread_.join(); }
   }

2272-2274: Detached client threads may cause shutdown issues.

Using detach() for client handler threads means they cannot be joined during shutdown. If shutdown occurs while clients are being processed, these threads may access freed resources (e.g., shared memory after cleanup_shared_memory()).

Consider using a thread pool or tracking active threads:

// Alternative: Use a bounded thread pool or track threads
std::vector<std::thread> client_threads;
// ... in accept loop:
client_threads.emplace_back([client_fd]() { handle_client(client_fd, config.stream_logs); });
// ... during shutdown, join all threads
python/cuopt/cuopt/linear_programming/solver/solver_wrapper.pyx (1)

488-499: Empty warm-start arrays for CPU/remote solve path.

Returning empty arrays for warm-start data when data is not on GPU is a reasonable fallback. Consider adding a comment explaining that warm-start is not supported for remote solves.

             else:
-                # CPU/remote solve - no warm start data available
+                # CPU/remote solve - warm-start data is not available/supported
+                # (PDLP warm-start requires GPU-resident iterates)
                 current_primal_solution = np.array([], dtype=np.float64)
cpp/src/linear_programming/utilities/cuopt_remote.proto (1)

104-136: Important: Document iteration_limit sentinel value.

The comment on lines 107-109 about using -1 as "unset" is crucial since proto3 defaults numeric fields to 0. This convention should be consistently applied and documented in client implementations.

Consider adding a similar comment for time_limit if 0.0 should mean "no limit" vs "instant timeout".
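For illustration, the receiving side would then treat negatives as "keep the default" (a sketch; the names are placeholders, not this repo's API):

#include <cstdint>

// proto3 cannot distinguish "field not set" from 0 for scalar numerics,
// so -1 (or any negative) is the explicit "unset" marker for iteration_limit.
void apply_iteration_limit(int64_t wire_value, int64_t& solver_iteration_limit)
{
  if (wire_value >= 0) { solver_iteration_limit = wire_value; }
  // negative: leave the solver's default iteration limit untouched
}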

cpp/src/linear_programming/cuopt_c.cpp (1)

378-477: Consider extracting shared CPU/GPU population helpers.

The remote/local setup in cuOptCreateProblem, cuOptCreateRangedProblem, cuOptCreateQuadraticProblem, and cuOptCreateQuadraticRangedProblem is nearly identical. A small helper to populate problem_cpu_data_t and create the view would reduce drift and future bugs. Based on learnings, consider extracting the common population logic.
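A possible shape for such a helper (a sketch; the problem_cpu_data_t members and template parameters are assumptions):

// Builds a non-owning view over host-resident problem data so all four
// cuOptCreate* entry points can share one population code path.
template <typename i_t, typename f_t>
data_model_view_t<i_t, f_t> make_view_from_cpu_data(problem_cpu_data_t& cpu)
{
  data_model_view_t<i_t, f_t> view;
  view.set_is_device_memory(false);  // buffers live on the host
  // point the view at cpu's objective, constraint matrix, bounds, and
  // variable types without copying
  return view;
}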

cpp/include/cuopt/linear_programming/pdlp/solver_solution.hpp (2)

274-306: Host accessor methods lack null-safety documentation.

The host accessors (get_primal_solution_host(), etc.) dereference unique_ptr members that may be null when is_device_memory() returns true. The documentation states "Only valid when is_device_memory() returns false" but consider adding [[nodiscard]] and/or assertions in the implementation to help catch misuse.


397-401: Potential inconsistency: duplicate solved_by_pdlp storage.

The class has both a standalone solved_by_pdlp_ member (line 509) and termination_stats_.solved_by_pdlp (in the struct at line 93). This duplication could lead to inconsistencies. Consider using only one source of truth—either delegate to termination_stats_.solved_by_pdlp or remove the duplicate.

Also applies to: 508-509

cpp/src/linear_programming/solver_solution.cu (1)

508-527: Inconsistent state possible with individual host setters.

Each setter (set_primal_solution_host, set_dual_solution_host, set_reduced_cost_host) independently sets is_device_memory_ = false. If a caller sets only one of these, the solution ends up in a mixed state where some data is on host, some may be uninitialized, but is_device_memory_ is false.

Consider either:

  1. Requiring all three to be set together (a single setter method; see the sketch after this list)
  2. Documenting that callers must set all three
  3. Tracking initialization state per-vector
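A minimal sketch of option 1, reusing the member names from the accessors above (assuming they are std::unique_ptr<std::vector<f_t>> members):

// Installs all three host vectors together so is_device_memory_ can never
// describe a half-initialized solution.
void set_host_solution(std::vector<f_t> primal,
                       std::vector<f_t> dual,
                       std::vector<f_t> reduced_cost)
{
  primal_solution_host_ = std::make_unique<std::vector<f_t>>(std::move(primal));
  dual_solution_host_   = std::make_unique<std::vector<f_t>>(std::move(dual));
  reduced_cost_host_    = std::make_unique<std::vector<f_t>>(std::move(reduced_cost));
  is_device_memory_     = false;
}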
cpp/include/cuopt/linear_programming/utilities/remote_serialization.hpp (3)

37-40: Consider adding virtual clone() for polymorphic copying.

The abstract interface has a virtual destructor but no way to clone instances. If callers need to copy serializer instances (e.g., for use in multiple threads), consider adding a virtual clone method.
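For example (a sketch; protobuf_serializer_t is an assumed name for the concrete class):

// In the abstract interface:
virtual std::unique_ptr<remote_serializer_t<i_t, f_t>> clone() const = 0;

// In the protobuf-backed implementation:
std::unique_ptr<remote_serializer_t<i_t, f_t>> clone() const override
{
  return std::make_unique<protobuf_serializer_t<i_t, f_t>>(*this);
}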


219-237: Nested enums/structs require verbose qualification.

The job_status_t enum and result structs are nested inside the class template, requiring verbose access like remote_serializer_t<int, double>::job_status_t. Consider moving these to namespace scope for easier use.
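A sketch of the hoist (the enclosing namespace and the enumerators are assumptions):

namespace cuopt::linear_programming {

// Formerly remote_serializer_t<i_t, f_t>::job_status_t; now usable without
// spelling out the template arguments.
enum class job_status_t { queued, running, completed, failed, cancelled };

}  // namespace cuopt::linear_programming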


307-314: Use enum instead of magic integers for request types.

get_async_request_type() returns magic integers (0-6). Consider defining an enum for request types to improve type safety and readability:

♻️ Suggested improvement
enum class async_request_type_t {
  SUBMIT_JOB = 0,
  CHECK_STATUS = 1,
  GET_RESULT = 2,
  DELETE_RESULT = 3,
  GET_LOGS = 4,
  CANCEL_JOB = 5,
  WAIT_FOR_RESULT = 6,
  INVALID = -1
};

virtual async_request_type_t get_async_request_type(const std::vector<uint8_t>& data) = 0;
cpp/src/linear_programming/utilities/protobuf_serializer.cu (3)

1322-1324: Avoid const_cast - add const overload to get_reduced_cost().

The const_cast is used to work around the lack of a const overload for get_reduced_cost(). The header file (solver_solution.hpp) should add:

const rmm::device_uvector<f_t>& get_reduced_cost() const;

This would eliminate the need for the cast.


139-154: Duplicate MIP settings serialization code.

The MIP settings serialization logic is duplicated between serialize_mip_request() (lines 139-154) and serialize_async_mip_request() (lines 341-354). Extract this to a helper method to reduce duplication. Based on coding guidelines, code duplication should be refactored into shared utilities.

♻️ Suggested helper method
private:
  void serialize_mip_settings_to_proto(const mip_solver_settings_t<i_t, f_t>& settings,
                                       cuopt::remote::MIPSolverSettings* pb_settings)
  {
    pb_settings->set_time_limit(settings.time_limit);
    pb_settings->set_relative_mip_gap(settings.tolerances.relative_mip_gap);
    pb_settings->set_absolute_mip_gap(settings.tolerances.absolute_mip_gap);
    // ... rest of settings
  }

Then use it in both serialize_mip_request() and serialize_async_mip_request().

Also applies to: 341-354


194-217: Consider optimizing request type detection.

is_mip_request() potentially parses the same data multiple times (async request, MIP request, LP request). While functionally correct, this could be optimized by using a more efficient detection method or caching the parse result.

Comment on lines +289 to +297
void signal_handler(int signal)
{
  if (signal == SIGINT || signal == SIGTERM) {
    std::cout << "\n[Server] Received shutdown signal\n";
    keep_running = false;
    if (shm_ctrl) { shm_ctrl->shutdown_requested = true; }
    result_cv.notify_all();
  }
}

⚠️ Potential issue | 🟠 Major

Signal handler uses non-async-signal-safe functions.

The signal handler calls std::cout, accesses shm_ctrl, and calls result_cv.notify_all(). None of these are async-signal-safe. This can cause deadlocks or undefined behavior if the signal arrives while another thread holds a lock used by these operations.

🔒 Proposed fix using async-signal-safe pattern
+std::atomic<bool> shutdown_signal_received{false};
+
 void signal_handler(int signal)
 {
   if (signal == SIGINT || signal == SIGTERM) {
-    std::cout << "\n[Server] Received shutdown signal\n";
+    // Only use async-signal-safe operations
+    const char msg[] = "\n[Server] Received shutdown signal\n";
+    write(STDOUT_FILENO, msg, sizeof(msg) - 1);
+    shutdown_signal_received = true;
     keep_running = false;
-    if (shm_ctrl) { shm_ctrl->shutdown_requested = true; }
-    result_cv.notify_all();
+    // Let the main loop handle shm_ctrl and cv notification
   }
 }

Then in the main accept loop or a dedicated thread, check shutdown_signal_received and perform the non-safe operations there.
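A sketch of that main-loop side, reusing the names from the diff above (loop placement assumed):

// Runs on an ordinary thread, so non-async-signal-safe calls are fine here.
while (keep_running) {
  if (shutdown_signal_received.load()) {
    std::cout << "[Server] Shutting down workers\n";
    if (shm_ctrl) { shm_ctrl->shutdown_requested = true; }
    result_cv.notify_all();
    break;
  }
  // accept()/poll with a short timeout so the flag is rechecked promptly
}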

🤖 Prompt for AI Agents
In `@cpp/cuopt_remote_server.cpp` around lines 289 - 297, The signal handler
signal_handler currently uses non-async-signal-safe operations (std::cout,
accessing shm_ctrl, and result_cv.notify_all()), so change it to only set an
atomic flag (e.g., a global volatile sig_atomic_t or std::atomic<bool>
shutdown_signal_received) and set keep_running = false if that is signal-safe;
remove direct access to shm_ctrl and result_cv from signal_handler. In the main
accept loop or a dedicated monitor thread, poll or wait on
shutdown_signal_received and there perform the non-safe actions: log the
shutdown message (std::cout), set shm_ctrl->shutdown_requested, and call
result_cv.notify_all(). Update references in code to use
shutdown_signal_received where signal_handler used to toggle shutdown behavior.

Comment on lines +303 to +313
std::string generate_job_id()
{
  static std::random_device rd;
  static std::mt19937 gen(rd());
  static std::uniform_int_distribution<uint64_t> dis;

  uint64_t id = dis(gen);
  char buf[32];
  snprintf(buf, sizeof(buf), "job_%016lx", id);
  return std::string(buf);
}

⚠️ Potential issue | 🟠 Major

Thread-safety issue: generate_job_id uses static variables without synchronization.

The function uses static std::mt19937 gen and static std::uniform_int_distribution<uint64_t> dis which are not thread-safe. Multiple client handler threads calling this concurrently will cause data races.

🔒 Proposed fix using thread-local or mutex
 std::string generate_job_id()
 {
-  static std::random_device rd;
-  static std::mt19937 gen(rd());
-  static std::uniform_int_distribution<uint64_t> dis;
+  static std::mutex mtx;
+  static std::random_device rd;
+  static std::mt19937 gen(rd());
+  static std::uniform_int_distribution<uint64_t> dis;
 
+  std::lock_guard<std::mutex> lock(mtx);
   uint64_t id = dis(gen);
   char buf[32];
   snprintf(buf, sizeof(buf), "job_%016lx", id);
   return std::string(buf);
 }

Alternatively, use thread_local for better performance:

std::string generate_job_id()
{
  thread_local std::random_device rd;
  thread_local std::mt19937 gen(rd());
  thread_local std::uniform_int_distribution<uint64_t> dis;
  // ...
}
🤖 Prompt for AI Agents
In `@cpp/cuopt_remote_server.cpp` around lines 303 - 313, generate_job_id
currently declares static std::mt19937 gen and static
std::uniform_int_distribution<uint64_t> dis which are shared across threads and
cause data races; modify generate_job_id to use thread-local RNG state (e.g.,
make rd, gen, and dis thread_local) or protect the static generators with a
mutex so concurrent calls are safe; update the function implementation around
the symbols generate_job_id, gen, dis (and rd if present) to apply one of these
fixes and ensure generated IDs remain formatted the same.

Comment on lines +548 to +562
for (size_t i = 0; i < MAX_RESULTS; ++i) {
  if (!result_queue[i].ready) {
    strncpy(result_queue[i].job_id, job_id.c_str(), sizeof(result_queue[i].job_id) - 1);
    result_queue[i].status           = 2;  // Cancelled status
    result_queue[i].data_size        = 0;
    result_queue[i].shm_data_name[0] = '\0';
    result_queue[i].worker_index     = worker_id;  // For pipe mode
    strncpy(result_queue[i].error_message,
            "Job was cancelled",
            sizeof(result_queue[i].error_message) - 1);
    result_queue[i].retrieved = false;
    result_queue[i].ready     = true;
    break;
  }
}

⚠️ Potential issue | 🟠 Major

Race condition: Result queue slot allocation is not atomic.

Multiple workers can concurrently search for a free slot in result_queue. Two workers finding the same slot with !result_queue[i].ready could both attempt to write to it before either sets ready = true.

Consider using compare_exchange_strong similar to the job claiming pattern:

       for (size_t i = 0; i < MAX_RESULTS; ++i) {
-        if (!result_queue[i].ready) {
+        bool expected = false;
+        if (result_queue[i].ready.compare_exchange_strong(expected, true)) {
           strncpy(result_queue[i].job_id, job_id.c_str(), sizeof(result_queue[i].job_id) - 1);
           result_queue[i].status           = 2;  // Cancelled status
           // ... populate other fields ...
-          result_queue[i].ready     = true;
           break;
         }
       }

This same pattern should be applied to all result_queue slot allocations (lines 639-652, 719-739, 756-788, 997-1010).
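Note that this presumes the slot's flags are themselves atomic. A sketch of a slot layout that supports the claim (struct and field types are assumptions based on the snippet above):

#include <atomic>

struct result_slot_t {
  std::atomic<bool> ready{false};      // false = free; claimed via CAS to true
  std::atomic<bool> retrieved{false};
  char job_id[64];
  int  status;
  // ... data_size, shm_data_name, worker_index, error_message as today ...
};
// If the queue lives in shared memory, these atomics must be lock-free.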

🤖 Prompt for AI Agents
In `@cpp/cuopt_remote_server.cpp` around lines 548 - 562, The current loop that
finds a free slot in result_queue sets fields non-atomically and races on
result_queue[i].ready; replace this linear check-and-write with an atomic claim
using an atomic flag and compare_exchange_strong (the same pattern used in the
job-claiming code path) so only one worker can win the slot before writing other
fields. Concretely, change the probe that tests result_queue[i].ready into an
atomic compare-and-set (e.g., attempt to swap from false to true using
std::atomic<bool> or an integer flag), and only on successful
compare_exchange_strong proceed to populate job_id, status, data_size,
shm_data_name, worker_index, error_message, retrieved, etc.; apply the same fix
to all other result_queue allocation sites referenced in the review (the blocks
around the given snippet and at the other line ranges noted) to ensure atomic
slot claiming.

Comment on lines 411 to 421
rmm::device_uvector<f_t>& optimization_problem_solution_t<i_t, f_t>::get_primal_solution()
{
-  return primal_solution_;
+  return *primal_solution_;
}

template <typename i_t, typename f_t>
const rmm::device_uvector<f_t>& optimization_problem_solution_t<i_t, f_t>::get_primal_solution()
  const
{
-  return primal_solution_;
+  return *primal_solution_;
}

⚠️ Potential issue | 🟠 Major

Potential null pointer dereference in device accessors.

The device memory accessors (get_primal_solution(), get_dual_solution(), get_reduced_cost()) dereference unique_ptr members without checking if they're valid. When is_device_memory_ is false (CPU-only solution), these pointers are null, leading to undefined behavior.

🔒️ Recommended fix: Add defensive checks or assertions
 template <typename i_t, typename f_t>
 rmm::device_uvector<f_t>& optimization_problem_solution_t<i_t, f_t>::get_primal_solution()
 {
+  cuopt_assert(is_device_memory_ && primal_solution_, "Device accessor called on host-memory solution");
   return *primal_solution_;
 }

Alternatively, document the precondition clearly and ensure callers check is_device_memory() first.

🤖 Prompt for AI Agents
In `@cpp/src/linear_programming/solver_solution.cu` around lines 411 - 421, The
device accessor methods (get_primal_solution(), get_dual_solution(),
get_reduced_cost()) currently dereference the unique_ptr members (e.g.,
primal_solution_) unconditionally and can dereference null when
is_device_memory_ is false; update each accessor to guard against null by
asserting or throwing a clear precondition (e.g., check is_device_memory_ or
that the unique_ptr is non-null) before dereferencing, and document the behavior
(or return/throw) so callers must call is_device_memory_ first when appropriate;
make the change inside the methods referenced above to validate
primal_solution_/dual_solution_/reduced_cost_ before returning *ptr.

Comment on lines +442 to 476
template <typename i_t, typename f_t>
std::vector<f_t>& optimization_problem_solution_t<i_t, f_t>::get_primal_solution_host()
{
return *primal_solution_host_;
}

template <typename i_t, typename f_t>
const std::vector<f_t>& optimization_problem_solution_t<i_t, f_t>::get_primal_solution_host() const
{
return *primal_solution_host_;
}

template <typename i_t, typename f_t>
std::vector<f_t>& optimization_problem_solution_t<i_t, f_t>::get_dual_solution_host()
{
return *dual_solution_host_;
}

template <typename i_t, typename f_t>
const std::vector<f_t>& optimization_problem_solution_t<i_t, f_t>::get_dual_solution_host() const
{
return *dual_solution_host_;
}

template <typename i_t, typename f_t>
std::vector<f_t>& optimization_problem_solution_t<i_t, f_t>::get_reduced_cost_host()
{
return *reduced_cost_host_;
}

template <typename i_t, typename f_t>
const std::vector<f_t>& optimization_problem_solution_t<i_t, f_t>::get_reduced_cost_host() const
{
return *reduced_cost_host_;
}

⚠️ Potential issue | 🟠 Major

Potential null pointer dereference in host accessors.

The host memory accessors dereference unique_ptr members without null checks. When is_device_memory_ is true, these host pointers are null.

Consider adding runtime assertions or checks:

🔒️ Recommended fix
 template <typename i_t, typename f_t>
 std::vector<f_t>& optimization_problem_solution_t<i_t, f_t>::get_primal_solution_host()
 {
+  cuopt_assert(!is_device_memory_ && primal_solution_host_, "Host accessor called on device-memory solution");
   return *primal_solution_host_;
 }
🤖 Prompt for AI Agents
In `@cpp/src/linear_programming/solver_solution.cu` around lines 442 - 476, The
host accessor methods of optimization_problem_solution_t
(get_primal_solution_host, get_dual_solution_host, get_reduced_cost_host and
their const overloads) dereference unique_ptr members without checking
is_device_memory_; add a runtime guard to prevent null dereference by asserting
or throwing when is_device_memory_ is true or when the corresponding unique_ptr
(primal_solution_host_, dual_solution_host_, reduced_cost_host_) is null; update
each accessor to check is_device_memory_ (or pointer) and produce a clear error
(e.g., assert/throw with a message) before returning *primal_solution_host_,
*dual_solution_host_, or *reduced_cost_host_.

Comment on lines +1303 to +1334
if (d_primal.size() > 0) {
  std::vector<f_t> h_primal(d_primal.size());
  cudaMemcpy(
    h_primal.data(), d_primal.data(), d_primal.size() * sizeof(f_t), cudaMemcpyDeviceToHost);
  for (const auto& v : h_primal) {
    pb_solution->add_primal_solution(static_cast<double>(v));
  }
}

// Copy dual solution from device
if (d_dual.size() > 0) {
  std::vector<f_t> h_dual(d_dual.size());
  cudaMemcpy(
    h_dual.data(), d_dual.data(), d_dual.size() * sizeof(f_t), cudaMemcpyDeviceToHost);
  for (const auto& v : h_dual) {
    pb_solution->add_dual_solution(static_cast<double>(v));
  }
}

// For reduced cost, we can access via const cast since we're just reading
auto& nc_solution    = const_cast<optimization_problem_solution_t<i_t, f_t>&>(solution);
auto& d_reduced_cost = nc_solution.get_reduced_cost();
if (d_reduced_cost.size() > 0) {
  std::vector<f_t> h_reduced_cost(d_reduced_cost.size());
  cudaMemcpy(h_reduced_cost.data(),
             d_reduced_cost.data(),
             d_reduced_cost.size() * sizeof(f_t),
             cudaMemcpyDeviceToHost);
  for (const auto& v : h_reduced_cost) {
    pb_solution->add_reduced_cost(static_cast<double>(v));
  }
}

⚠️ Potential issue | 🟠 Major

Missing CUDA error checking on cudaMemcpy calls.

The cudaMemcpy calls in lp_solution_to_proto() don't check for errors. CUDA errors should be checked and handled, especially in serialization code where failures should be reported to callers.

As per coding guidelines, every CUDA operation must have error checking.

🐛 Recommended fix
-        cudaMemcpy(
-          h_primal.data(), d_primal.data(), d_primal.size() * sizeof(f_t), cudaMemcpyDeviceToHost);
+        auto err = cudaMemcpy(
+          h_primal.data(), d_primal.data(), d_primal.size() * sizeof(f_t), cudaMemcpyDeviceToHost);
+        if (err != cudaSuccess) {
+          CUOPT_LOG_ERROR(std::string("[protobuf_serializer] cudaMemcpy failed: ") + cudaGetErrorString(err));
+          return;  // or throw
+        }

Apply similar error checking to all cudaMemcpy calls in this file (lines 1305, 1315, 1327, 1399).
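
A small checked-copy helper would keep the call sites compact. A sketch, with plain fprintf standing in for CUOPT_LOG_ERROR and a hypothetical helper name:

```cpp
#include <cuda_runtime.h>

#include <cstddef>
#include <cstdio>

// Hypothetical helper; the project may prefer its own logging/CUDA-check
// macro, but the shape is the same: check, report, and signal failure.
inline bool checked_memcpy_d2h(void* dst, const void* src, std::size_t bytes)
{
  cudaError_t err = cudaMemcpy(dst, src, bytes, cudaMemcpyDeviceToHost);
  if (err != cudaSuccess) {
    std::fprintf(
      stderr, "[protobuf_serializer] cudaMemcpy failed: %s\n", cudaGetErrorString(err));
    return false;
  }
  return true;
}

// Usage, following the primal copy above:
//   std::vector<f_t> h_primal(d_primal.size());
//   if (!checked_memcpy_d2h(h_primal.data(), d_primal.data(),
//                           d_primal.size() * sizeof(f_t))) {
//     return;  // abort serialization instead of emitting a partial payload
//   }
```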

🤖 Prompt for AI Agents
In `@cpp/src/linear_programming/utilities/protobuf_serializer.cu` around lines
1303 - 1334, lp_solution_to_proto currently calls cudaMemcpy on device buffers
(d_primal, d_dual, d_reduced_cost via nc_solution.get_reduced_cost()) without
checking for CUDA errors; update each cudaMemcpy call to capture and check the
returned cudaError_t and handle failures (e.g., log via processLogger/return an
error/throw) so serialization reports failures instead of silently continuing.
Locate the cudaMemcpy usages in lp_solution_to_proto (references: d_primal,
d_dual, nc_solution/get_reduced_cost, pb_solution) and wrap each call by storing
the result (cudaError_t err = cudaMemcpy(...)); if err != cudaSuccess, convert
the error to a clear error path (return an error code or throw an exception or
call the existing logger and abort the proto conversion) so callers receive the
failure.

Comment on lines +570 to +573
See the MsgPack serializer for a complete working example:
- `cpp/src/linear_programming/utilities/serializers/msgpack_serializer.cpp`
- `cpp/src/linear_programming/utilities/serializers/CMakeLists.txt`


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for msgpack serializer references
echo "=== Searching for msgpack serializer files ==="
fd -i "msgpack" --type f

echo -e "\n=== Checking serializers directory ==="
fd -t f "serializer" -x bash -c 'echo "{}"; head -1 "{}"' 2>/dev/null || true

echo -e "\n=== List all files in linear_programming/utilities/serializers/ ==="
find cpp/src/linear_programming/utilities/serializers/ -type f 2>/dev/null | head -20

Repository: NVIDIA/cuopt

Length of output: 271


🌐 Web query:

NVIDIA cuopt msgpack serializer implementation reference

💡 Result:

For cuOpt, the MessagePack (msgpack) serialization is implemented in the Python clients, not (publicly) in the core cuOpt engine source.

Primary reference (cuOpt client behavior):

  • The cuopt-sh-client package documentation states the client compresses request data using Python msgpack first, and falls back to pickle if msgpack serialization fails (with env-var overrides like CUOPT_PREFER_PICKLE / CUOPT_USE_ZLIB). [1]
  • The self-hosted service client API shows the client’s result_type default is mime_type.MSGPACK (i.e., responses are msgpack by default unless changed). [2]

What serializer implementation it’s based on (Python libraries):

  • cuopt-sh-client depends on msgpack and msgpack-numpy (so msgpack encoding/decoding is via the standard Python msgpack implementation, with numpy support via msgpack-numpy). [3]

If you need an “implementation reference” to the actual pack/unpack code paths, the closest authoritative references are therefore:

  1. cuopt-sh-client docs for the compression/selection logic [1], and
  2. the upstream Python libraries (msgpack + msgpack-numpy) that provide the serializer/deserializer implementation used by the client [3].

Sources: [1] [2] [3]


Update documentation: msgpack serializer is not a C++ reference implementation.

The guide incorrectly references msgpack_serializer.cpp as a working example. Based on the codebase, msgpack serialization is implemented in the Python client layer (cuopt-sh-client) using the standard msgpack and msgpack-numpy libraries, not as a C++ engine serializer. Remove or replace this reference with an accurate description of msgpack's role (Python client compression) or point to a serializer that actually exists in cpp/src/linear_programming/utilities/serializers/.

🤖 Prompt for AI Agents
In `@docs/developer/SERIALIZATION_PLUGIN_GUIDE.md` around lines 570 - 573, The doc
incorrectly points to
`cpp/src/linear_programming/utilities/serializers/msgpack_serializer.cpp` and
its `CMakeLists.txt` as a working MsgPack C++ example; remove or replace that
reference and instead explain that MsgPack serialization is implemented in the
Python client `cuopt-sh-client` using the `msgpack` and `msgpack-numpy`
libraries, or link to an actual C++ serializer that exists under
`cpp/src/linear_programming/utilities/serializers/` if you want a C++ example;
update the sentence and any bullets to reference `cuopt-sh-client` and the
Python libraries (`msgpack`, `msgpack-numpy`) or the correct C++ serializer file
name accordingly.

@coderabbitai

coderabbitai bot commented Jan 27, 2026

📝 Walkthrough

Walkthrough

This pull request introduces a complete gRPC-based remote solving architecture for cuOpt, enabling LP/MIP optimization problems to be solved on remote servers. Changes include comprehensive documentation, build system integration with gRPC and Protobuf dependencies, a full-featured remote solve server with async job management, protobuf-based serialization, updated C++/Python APIs supporting both local GPU and remote CPU solving paths, and lazy loading mechanisms for CPU-only hosts.

Changes

Cohort / File(s) Summary
Documentation & Status
GRPC_ARCHITECTURE.md, GRPC_BUILD_STATUS.md, GRPC_PROTOTYPE_STATUS.md, docs/developer/SERIALIZATION_PLUGIN_GUIDE.md, docs/developer/SOLUTION_MEMORY_ARCHITECTURE.md
Added comprehensive documentation covering gRPC architecture, prototype status, build status, serialization plugin guide, and solution memory architecture for hybrid device/host storage.
Build System & Dependencies
build.sh, cpp/CMakeLists.txt, python/libcuopt/CMakeLists.txt, ci/build_wheel_libcuopt.sh, ci/utils/install_protobuf_grpc.sh, conda/environments/\*, conda/recipes/libcuopt/recipe.yaml, dependencies.yaml, .gitignore
Updated build scripts with new cuopt_grpc_server target, integrated gRPC and Protobuf as dependencies in CMake, conda, and CI configurations. Added Protobuf/gRPC installation script for Linux distributions and in-tree gRPC FetchContent.
Protocol Definitions
cpp/src/linear_programming/utilities/cuopt_remote.proto, cpp/src/linear_programming/utilities/cuopt_remote_service.proto
Introduced comprehensive protobuf schemas defining request/response messages for LP/MIP remote solving, async job management, streaming logs, result chunking, and status/cancellation operations.
Remote Server Implementation
cpp/cuopt_remote_server.cpp, cpp/cuopt_cancel_job.cpp, cpp/scripts/run_dev_server.sh
Implemented full-featured remote solve server with multi-process workers, IPC via pipes/shared memory, job lifecycle tracking, async API, streaming logs, and a CLI utility for job cancellation.
Serialization & Transport
cpp/src/linear_programming/utilities/protobuf_serializer.cu, cpp/src/linear_programming/utilities/remote_solve.cu, cpp/src/linear_programming/utilities/remote_solve_grpc.cpp, cpp/src/linear_programming/utilities/remote_solve_grpc.hpp
Implemented pluggable protobuf-based serializer for problem/solution exchange and gRPC client transport layer supporting chunked uploads, streaming results, and incumbent retrieval.
Core Solving Path Refactoring
cpp/src/linear_programming/solve.cu, cpp/src/linear_programming/cuopt_c.cpp, cpp/src/mip/solve.cu, cpp/libmps_parser/src/...
Refactored solve paths to support dual CPU/GPU memory handling via data_model_view_t, added remote vs local solve routing, and updated template instantiations for int64_t types.
Solution Object Updates
cpp/src/linear_programming/solver_solution.cu, cpp/src/mip/solver_solution.cu, cpp/include/cuopt/linear_programming/mip/solver_solution.hpp, cpp/include/cuopt/linear_programming/pdlp/solver_solution.hpp, cpp/src/mip/diversity/population.cu
Refactored solution storage to use unique_ptr-wrapped device vectors, added host memory variants, introduced is_device_memory() accessors, CPU-only constructors, and to_host() migration methods.
Public C++ API Additions
cpp/include/cuopt/linear_programming/solve.hpp, cpp/include/cuopt/linear_programming/data_model_view.hpp, cpp/include/cuopt/linear_programming/utilities/remote_solve.hpp, cpp/include/cuopt/linear_programming/utilities/remote_serialization.hpp, cpp/include/cuopt/linear_programming/utilities/internals.hpp, cpp/include/cuopt/linear_programming/utilities/cython_solve.hpp
Added new public APIs for view-based solving (solve_lp, solve_mip with data_model_view_t), remote config/status enums, abstract serializer interface, memory location callbacks, and host-side solution accessors.
mps_parser Updates
cpp/libmps_parser/CMakeLists.txt, cpp/libmps_parser/include/mps_parser/data_model_view.hpp, cpp/libmps_parser/src/data_model_view.cpp, cpp/libmps_parser/src/mps_data_model.cpp
Added static library target mps_parser_static, introduced is_device_memory() flag to data_model_view_t, and updated template instantiations to int64_t types.
CLI & Cython Updates
cpp/cuopt_cli.cpp, cpp/include/cuopt/linear_programming/utilities/callbacks_implems.hpp, cpp/src/linear_programming/utilities/cython_solve.cu, python/cuopt/cuopt/linear_programming/solver/solver_wrapper.pyx, python/cuopt/cuopt/linear_programming/internals/internals.pyx
Added --print-grpc-max flag to CLI, crash handling infrastructure, lazy CUDA imports, remote solve detection, dual GPU/CPU data path handling in Cython wrappers, and ctypes-based numpy array conversion.
Python Remote API
python/cuopt/cuopt/__init__.py, python/cuopt/cuopt/linear_programming/__init__.py, python/cuopt/cuopt/linear_programming/remote.py, python/cuopt/cuopt/linear_programming/solver/solver.pxd, python/cuopt/cuopt/utilities/__init__.py, python/cuopt/cuopt/utilities/utils.py
Introduced JobStatus enum, CancelResult dataclass, and cancel_job() function for remote job management; implemented lazy-loading for submodules to support CPU-only hosts; updated solver return structures with host memory variants.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 1 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
  • Docstring Coverage (⚠️ Warning): Docstring coverage is 26.07%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
  • Title check (❓ Inconclusive): The PR title '[Don't review]Remote execution 2 - testing CI' is vague and uses a placeholder format that doesn't clearly describe the changeset. It includes an instruction ('[Don't review]') rather than a clear summary, and 'testing CI' is generic. Resolution: replace the title with a clear summary of the main change, such as 'Add gRPC remote solve execution support with protobuf serialization', removing the '[Don't review]' directive and placeholder language.
✅ Passed checks (1 passed)
  • Description Check (✅ Passed): Check skipped - CodeRabbit's high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 17

Note

Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
python/cuopt/cuopt/linear_programming/solver/solver_wrapper.pyx (1)

617-617: Misleading list initialization.

[] * len(py_data_model_list) returns an empty list [], not a pre-allocated list. This doesn't cause a bug since append() is used, but it's misleading code.

🐛 Fix the list initialization
-    solutions = [] * len(py_data_model_list)
+    solutions = []
cpp/include/cuopt/linear_programming/utilities/internals.hpp (1)

23-46: Add Doxygen documentation for new callback memory location API and integrate into callback implementations.

The new enum callback_memory_location and accessors (set_memory_location, get_memory_location, data_on_device) lack Doxygen documentation required for public header API. Additionally, the API is not yet integrated into callback implementations—Python callbacks in callbacks_implems.hpp and the gRPC callback never call set_memory_location(), relying entirely on the unsafe DEVICE default. This is problematic since Python callbacks should likely be HOST (they call get_numpy_array() for host memory).

Add brief Doxygen comments documenting:

  • callback_memory_location enum: purpose and valid values
  • set_memory_location(): when and why to call it for HOST-backed callbacks
  • get_memory_location() / data_on_device(): usage expectations

Then update callback constructors or initialization to properly set callback_memory_location::HOST for host-backed implementations (Python callbacks, gRPC incumbent pipe callback, etc.).
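
For illustration, a sketch of what those Doxygen comments could look like (declarations abbreviated; the exact signatures in internals.hpp may differ):

```cpp
/**
 * @brief Where callback payloads (solution/objective buffers) live.
 */
enum class callback_memory_location {
  DEVICE,  ///< Buffers are GPU pointers; safe to pass to CUDA kernels.
  HOST     ///< Buffers are host pointers (e.g., Python or gRPC callbacks).
};

/**
 * @brief Declare the memory space this callback expects its buffers in.
 *
 * Host-backed implementations (Python callbacks, the gRPC incumbent pipe
 * callback) should call this with callback_memory_location::HOST during
 * construction; the default is DEVICE.
 */
void set_memory_location(callback_memory_location loc);

/** @brief Memory space previously set via set_memory_location(). */
callback_memory_location get_memory_location() const;

/** @brief Convenience: true when get_memory_location() == DEVICE. */
bool data_on_device() const;
```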

cpp/include/cuopt/linear_programming/utilities/callbacks_implems.hpp (1)

43-52: Missing NULL check on PyObject_CallMethod result can cause crash.

If PyObject_CallMethod fails (e.g., the Python callback raises an exception), it returns NULL. Calling Py_DECREF(res) on a NULL pointer will crash. The same issue exists in set_solution at lines 86-90.

🐛 Proposed fix
     PyObject* res =
       PyObject_CallMethod(this->pyCallbackClass, "get_solution", "(OO)", numba_matrix, numpy_array);
     Py_DECREF(numba_matrix);
     Py_DECREF(numpy_array);
-    Py_DECREF(res);
+    if (res != nullptr) {
+      Py_DECREF(res);
+    } else {
+      // Log or handle the Python exception
+      PyErr_Print();
+    }
🤖 Fix all issues with AI agents
In `@build.sh`:
- Around line 517-531: The SKBUILD_CMAKE_ARGS environment variable is being
built but not exported, so scikit-build-core won't see it; change the assignment
of SKBUILD_CMAKE_ARGS (the variable constructed from INSTALL_PREFIX,
LIBCUOPT_BUILD_DIR, CUOPT_CMAKE_CUDA_ARCHITECTURES and optionally
SKBUILD_EXTRA_ARGS_JOINED) to export SKBUILD_CMAKE_ARGS before invoking python
"${PYTHON_ARGS_FOR_INSTALL[@]}" . and likewise export the second
SKBUILD_CMAKE_ARGS occurrence noted in the script so the python/scikit-build
process inherits the variable.

In `@ci/utils/install_protobuf_grpc.sh`:
- Around line 22-25: The current dnf install command that ends with "|| true"
can hide failures when grpc_cpp_plugin is not installed; remove the "|| true"
and instead run the install (the existing dnf install -y grpc-devel grpc-plugins
|| dnf install -y grpc-devel) and then explicitly verify the codegen binary
exists (check for grpc_cpp_plugin on PATH or the expected plugin path) and exit
non‑zero with a clear error if it is missing; update the block around the dnf
install line and the subsequent verification to fail fast when grpc_cpp_plugin
is unavailable.

In `@cpp/CMakeLists.txt`:
- Around line 296-306: The CMake logic currently forces gRPC by calling
find_package(gRPC CONFIG REQUIRED) and unconditionally setting CUOPT_ENABLE_GRPC
via add_compile_definitions, breaking gRPC-free builds; change this by
introducing a CMake option (e.g., option(CUOPT_ENABLE_GRPC "Enable gRPC support"
OFF)) and guard the existing code that checks TARGET gRPC::grpc++, calls
find_package(gRPC ...), and sets add_compile_definitions(CUOPT_ENABLE_GRPC)
behind that option so gRPC is only required when the option is ON (or, if you
intend to make gRPC mandatory, update project docs and remove the option but
keep a clear message); ensure the message(STATUS ...) reflects the chosen path
and that proto/codegen and targets that reference gRPC::grpc++ are only
configured when CUOPT_ENABLE_GRPC is enabled.

In `@cpp/cuopt_remote_server.cpp`:
- Around line 303-313: generate_job_id currently uses shared static RNG state
(static std::mt19937 gen and static std::uniform_int_distribution<uint64_t> dis)
which is not thread-safe; make the RNG state thread-local or otherwise
synchronized: replace those static variables with thread_local instances (e.g.,
thread_local std::mt19937 and thread_local distribution) or protect access with
a mutex inside generate_job_id, ensuring unique IDs remain formatted via
snprintf and returned as before; update references in generate_job_id
accordingly to avoid data races.
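
A minimal sketch of that thread_local variant (the seeding and the 16-hex-digit format are assumptions about the existing function):

```cpp
#include <cstdint>
#include <cstdio>
#include <random>
#include <string>

std::string generate_job_id()
{
  // One engine per thread: no shared state, no lock, no data race.
  thread_local std::mt19937 gen{std::random_device{}()};
  thread_local std::uniform_int_distribution<std::uint64_t> dis;

  char buf[17];  // 16 hex digits + NUL
  std::snprintf(buf, sizeof(buf), "%016llx",
                static_cast<unsigned long long>(dis(gen)));
  return std::string(buf);
}
```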

In `@cpp/libmps_parser/CMakeLists.txt`:
- Around line 84-85: The static target mps_parser_static is missing the same
configuration as the shared target; update CMake to apply set_target_properties
on mps_parser_static to set CXX_STANDARD, CXX_STANDARD_REQUIRED,
POSITION_INDEPENDENT_CODE, and CXX_SCAN_FOR_MODULES, add the same
target_compile_options used for the shared target to mps_parser_static, and
mirror the conditional include directories for compression libs by adding
target_include_directories(mps_parser_static PRIVATE BZip2::BZip2) when
MPS_PARSER_WITH_BZIP2 and target_include_directories(mps_parser_static PRIVATE
ZLIB::ZLIB) when MPS_PARSER_WITH_ZLIB so the static library builds with the same
flags and includes.

In `@cpp/src/linear_programming/solve.cu`:
- Around line 1207-1221: The code around cudaPointerGetAttributes/cudaMemcpy in
solve.cu does not check CUDA return codes; wrap the calls
(cudaPointerGetAttributes, cudaMemcpy, any cudaMemcpyAsync,
cudaStreamSynchronize used in the nearby blocks) with the project’s CUDA
error-check helper so failures are propagated and handled instead of ignored;
specifically, replace direct calls around var_types/host_var_types with the
helper (e.g., check the return of cudaPointerGetAttributes(&attrs,
var_types.data()) and on device path call checked
cudaMemcpy(host_var_types.data(), var_types.data(),
var_types.size()*sizeof(char), cudaMemcpyDeviceToHost), and on host path avoid
cudaGetLastError() and instead check/clear errors via the helper), and apply the
same pattern to the similar blocks at the other locations (~1260-1290 and
~1406-1414) so all cudaMemcpy/cudaMemcpyAsync/cudaStreamSynchronize calls use
the centralized CUDA error-check helper and propagate failures.
- Around line 1303-1367: The cpu_problem_data_t struct and its create_view()
method do not preserve row_types and warm-start information (initial primal and
dual solutions) during GPU to CPU serialization. Add vector member variables to
cpu_problem_data_t for row_types, initial_primal_solution, and
initial_dual_solution. Update the create_view() method to conditionally set
these fields on the view using appropriate setter methods (similar to how
constraint_bounds and other fields are set). Additionally, ensure the
copy_view_to_cpu() function copies these fields from the source view to the
cpu_problem_data_t struct so they are preserved during serialization.

In `@cpp/src/linear_programming/solver_solution.cu`:
- Around line 441-476: The host accessors get_primal_solution_host,
get_dual_solution_host and get_reduced_cost_host (both const and non-const
overloads) dereference pointers (primal_solution_host_, dual_solution_host_,
reduced_cost_host_) that can be null for GPU-only constructions; add explicit
null checks at the start of each of these methods and, if the respective pointer
is null, throw a clear std::runtime_error (or appropriate exception type used in
the project) describing which host vector is unavailable (e.g.
"primal_solution_host_ is null: host data not initialized"); this ensures
callers get a clear error instead of UB — alternatively, if preferred, lazily
allocate the host vector here before dereferencing, but be explicit about which
behavior you choose and implement it consistently across all six accessor
overloads.
- Around line 410-439: The device accessor methods in
optimization_problem_solution_t (get_primal_solution, get_dual_solution,
get_reduced_cost) dereference pointers that can be null for CPU-only instances;
add defensive checks at the start of each method to validate the corresponding
pointer (primal_solution_, dual_solution_, reduced_cost_) and either throw a
clear exception (e.g., std::runtime_error with a message like "device access on
CPU-only solution") or use an assert to fail fast, so callers get a meaningful
error instead of undefined behavior when the pointer is null.

In `@cpp/src/linear_programming/utilities/cuopt_remote.proto`:
- Around line 21-24: The proto declares IndexType with INT64 and allows clients
to set index_type = INT64, but the actual index fields (A_indices, A_offsets,
Q_indices, Q_offsets) are int32 causing potential truncation; either remove
INT64 from the IndexType enum or change all those fields to 64-bit (int64) to
match; update the message definitions referencing A_indices, A_offsets,
Q_indices, Q_offsets (and any other index-related fields) to use int64_t
equivalents and ensure any code generating/consuming index_type handles the
promoted types consistently when index_type == INT64.

In `@cpp/src/linear_programming/utilities/protobuf_serializer.cu`:
- Around line 1303-1333: The cudaMemcpy calls that copy device vectors d_primal,
d_dual and d_reduced_cost into host vectors h_primal, h_dual and h_reduced_cost
must be checked for errors: wrap each cudaMemcpy call with the project's CUDA
error-check helper (or check the returned cudaError_t and handle non-success
values), and on error log/propagate the failure instead of proceeding to add
values to pb_solution; update the copy sites that handle
d_primal/d_dual/d_reduced_cost (and the corresponding host buffers
h_primal/h_dual/h_reduced_cost) as well as the analogous copy at lines 1395-1402
to use the error-checking helper and return or abort serialization on failure.
- Around line 871-905: The code in the loop inside the constraint bounds
calculation assumes arrays row_types_span and b_span have at least n_rows
elements, causing possible out-of-bounds access when their size is smaller. To
fix this, add a conditional check in the loop for each i to verify that i is
within the size of row_types_span before accessing rt_ptr[i], otherwise default
row_type to 'E'. Similarly, check that i is within b_span size before accessing
b_ptr[i], otherwise default rhs to 0. This guards indexing to prevent OOB reads
while maintaining correct defaults.

In `@cpp/src/linear_programming/utilities/remote_solve_grpc.hpp`:
- Around line 72-77: The stop flag parameter for stream_logs_to_stdout is using
volatile which is not thread-safe; change the function signature to accept
std::atomic<bool>* stop_flag, update the implementation to use stop_flag->load()
(and store() where written) for loop checks and any writers, and update all call
sites to pass a std::atomic<bool> instance instead of a volatile bool pointer so
signaling is synchronized across threads (refer to function
stream_logs_to_stdout and any callers that set/clear the flag).
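
A sketch of the suggested signature and a matching caller (the polling body is illustrative, not the real implementation):

```cpp
#include <atomic>
#include <chrono>
#include <string>
#include <thread>

void stream_logs_to_stdout(const std::string& job_id, std::atomic<bool>* stop_flag)
{
  (void)job_id;  // a real implementation would read this job's log stream
  while (!stop_flag->load(std::memory_order_acquire)) {
    // ... fetch and print the next batch of log lines ...
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
  }
}

int main()
{
  std::atomic<bool> stop{false};
  std::thread t(stream_logs_to_stdout, "job-42", &stop);
  // ... solve runs elsewhere ...
  stop.store(true, std::memory_order_release);  // signal, then join
  t.join();
}
```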

In `@cpp/src/linear_programming/utilities/remote_solve.cu`:
- Around line 41-45: The static flag logged_no_device in
copy_incumbent_to_device is written without synchronization causing a race;
replace it with a thread-safe mechanism (e.g., a std::once_flag + std::call_once
or an std::atomic<bool>) so the "[remote_solve] No CUDA device available; using
host incumbents" message is logged exactly once without data races—locate
logged_no_device and the copy_incumbent_to_device function, remove the
unsynchronized static write, and implement one of those thread-safe patterns to
perform the logging safely.
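
A sketch of the std::call_once option (the wrapper function name is hypothetical; the message text is from the review):

```cpp
#include <cstdio>
#include <mutex>

void warn_no_device_once()  // hypothetical wrapper around the log site
{
  static std::once_flag logged_no_device;
  // call_once runs the lambda exactly once, with no racy flag write,
  // however many threads reach this point concurrently.
  std::call_once(logged_no_device, [] {
    std::puts("[remote_solve] No CUDA device available; using host incumbents");
  });
}
```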
- Around line 219-227: The current status check in the polling loop does not
handle unexpected values like "UNKNOWN", leading to infinite polling. Update the
status handling in the relevant loop in the remote_solve function to treat any
unknown or unsupported status as a failure: set stop_logs to true, join the log
thread if joinable, delete the remote result, and return an error indicating the
unexpected status. Apply this fix to both code sections handling status checks
around lines 219-227 and 402-410.

In `@cpp/src/mip/solver_solution.cu`:
- Around line 159-169: mip_solution_t's get_solution_host() and its const
overload dereference solution_host_ which can be nullptr when constructed with
the GPU path; update both get_solution_host() methods to check solution_host_
before dereferencing (e.g., if nullptr, either lazily allocate/copy device data
into solution_host_ or throw a clear exception) and ensure the chosen behavior
is consistent with other accessors (refer to mip_solution_t constructor(s),
solution_device_, and any get_solution_device() logic) so callers won't
dereference a null pointer.
- Around line 148-157: The get_solution() methods currently dereference
solution_ which can be nullptr for objects constructed via the CPU-only
constructor; modify mip_solution_t<i_t, f_t>::get_solution() (both const and
non-const overloads) to guard against null solution_—e.g., add an assertion or
throw a clear exception when solution_ == nullptr (or return/obtain an
appropriate CPU-side container if that is the intended behavior) so callers no
longer dereference a null pointer; ensure the guard message references solution_
and the class/method name for easy tracing.
🟡 Minor comments (17)
python/cuopt/cuopt/linear_programming/solver/solver_wrapper.pyx-171-185 (1)

171-185: The fallback to type(obj) will fail when calling .astype() on non-array types.

The code falls back to type(obj) when the input lacks both dtype and dtypes attributes. While this comparison works fine with np.issubdtype(), line 184 calls obj.astype() which fails for Python scalars (e.g., int, float) that lack this method. For example, if a scalar 5 reaches this fallback, obj.astype() raises AttributeError: 'int' object has no attribute 'astype'.

Although current usage only passes numpy arrays or Series (which have .astype()), the function should explicitly reject unsupported types rather than silently fail at line 184.

🐛 Suggested fix
     elif hasattr(obj, 'dtypes'):
         # DataFrame-like object
         if all([np.issubdtype(dtype, np.number) for dtype in obj.dtypes]):  # noqa
             obj_type = obj.dtypes[0]
         else:
             msg = "All columns in " + name + " should be numeric"
             raise Exception(msg)
     else:
-        obj_type = type(obj)
+        raise Exception(
+            f"{name} must be a numpy array or have a dtype attribute"
+        )
GRPC_BUILD_STATUS.md-66-70 (1)

66-70: Add language specifier to fenced code block.

The code block at lines 66-70 is missing a language identifier. For environment/version output, use text or plaintext:

```diff
-```
+```text
 libgrpc: 1.73.1
 libprotobuf: 6.31.1
 protobuf (pip): 6.33.2
```

docs/developer/SOLUTION_MEMORY_ARCHITECTURE.md-73-78 (1)

73-78: Specify a language for the ASCII diagram fences.

This removes MD040 warnings and keeps formatting stable.

🔧 Suggested fix

```diff
-```
+```text
 ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
 │   Solver    │ ──► │  Solution   │ ──► │    User     │
 │   (GPU)     │     │ (GPU mem)   │     │ (GPU/CPU)   │
 └─────────────┘     └─────────────┘     └─────────────┘
```

```diff
-```
+```text
 ┌──────────┐     ┌──────────────────────────────────────────┐     ┌──────────┐
 │  Client  │ ──► │              SERVER                      │ ◄── │  Client  │
 │ (no GPU) │     │ GPU solve → to_host() → serialize        │     │(solution)│
 └──────────┘     └──────────────────────────────────────────┘     └──────────┘
```



Also applies to: 90-95

GRPC_PROTOTYPE_STATUS.md-130-144 (1)

130-144: Add a language tag to the expected-output fence.

markdownlint flags the unlabeled fence; using `text` (or similar) keeps it compliant and clearer.

🔧 Suggested fix

```diff
-```
+```text
 [Client] Job submitted successfully
 [Client] Job ID: <hex-id>
 [Client] Polling for job completion...
 [Client] Job status: QUEUED - Job queued
 [Client] Job status: PROCESSING - Job processing
 [Client] Job status: COMPLETED - Job completed
 [Client] Retrieving result...
 [Client] Result retrieved successfully
 [Client] LP Solution:
 [Client]   Status: 2  (PDLP_OPTIMAL)
 [Client]   Objective: 42.0  (dummy value)
 [Client] Test completed successfully!
```

cpp/cuopt_cancel_job.cpp-18-18 (1)

18-18: Add cuopt_cancel_job.cpp to the build target.

The file `cpp/cuopt_cancel_job.cpp` is not registered in any CMakeLists.txt. The header `<cuopt/linear_programming/utilities/remote_solve.hpp>` exists and is correctly located; however, the source file itself must be added to a build target (either `cuopt` or `cuopt_cli`) for compilation to succeed. Verify which target should own this utility and add it to the appropriate `add_executable()` or source list in `cpp/CMakeLists.txt`.

cpp/cuopt_cancel_job.cpp-70-84 (1)

70-84: Harden port parsing and range validation.

`std::stoi` can throw on invalid input, and `atoi` silently accepts bad values. Validate ports and surface a clean error.

🔧 Suggested fix

```diff
+#include <limits>
+
+bool parse_port(const char* s, int& out)
+{
+  if (!s || !*s) return false;
+  char* end = nullptr;
+  long v    = std::strtol(s, &end, 10);
+  if (*end != '\0' || v < 1 || v > 65535) return false;
+  out = static_cast<int>(v);
+  return true;
+}
+
 int main(int argc, char** argv)
 {
   // Parse arguments
   std::string job_id;
   std::string host = "localhost";
   int port         = 9090;
@@
   const char* env_host = std::getenv("CUOPT_REMOTE_HOST");
   const char* env_port = std::getenv("CUOPT_REMOTE_PORT");
   if (env_host && env_host[0]) { host = env_host; }
-  if (env_port && env_port[0]) { port = std::atoi(env_port); }
+  if (env_port && env_port[0]) {
+    if (!parse_port(env_port, port)) {
+      std::cerr << "Error: Invalid CUOPT_REMOTE_PORT value\n";
+      return 1;
+    }
+  }
@@
-    } else if (strcmp(argv[i], "-p") == 0 && i + 1 < argc) {
-      port = std::stoi(argv[++i]);
+    } else if (strcmp(argv[i], "-p") == 0 && i + 1 < argc) {
+      if (!parse_port(argv[++i], port)) {
+        std::cerr << "Error: Invalid port value\n\n";
+        print_usage(argv[0]);
+        return 1;
+      }
     } else if (argv[i][0] != '-') {
       job_id = argv[i];
     }
   }
```

cpp/libmps_parser/include/mps_parser/data_model_view.hpp-409-426 (1)

409-426: Ensure all view construction paths explicitly set is_device_memory flag for clarity and consistency.

The flag defaults to false (CPU memory), which is correct for MPS data. However, cpp/cuopt_cli.cpp's create_view_from_mps_data_model() relies on this default without explicitly calling set_is_device_memory(false), unlike the equivalent functions in cpp/src/linear_programming/solve.cu (line 1167) and cpp/src/mip/solve.cu (line 508), which explicitly set the flag. For consistency and defensive clarity, all view creation helpers should explicitly set the flag after construction rather than relying on defaults.

cpp/src/linear_programming/utilities/cuopt_remote_service.proto-210-221 (1)

210-221: Inconsistent job_id type: bytes vs string.

IncumbentRequest.job_id (line 211) and Incumbent.job_id (line 221) use bytes, while all other messages (StatusRequest, GetResultRequest, ResultChunk, DeleteRequest, CancelRequest, WaitRequest, StreamLogsRequest) use string for job_id.

This inconsistency may cause confusion and require type conversions in client code. Consider standardizing on one type (typically string for human-readable identifiers, or bytes if UUIDs are stored as raw binary).

💡 Suggested fix to use string consistently
 message IncumbentRequest {
-  bytes job_id = 1;
+  string job_id = 1;
   int64 from_index = 2;        // Return incumbents starting from this index
   int32 max_count = 3;         // Optional limit (0 or negative => no limit)
 }

 message Incumbent {
   int64 index = 1;
   double objective = 2;
   repeated double assignment = 3;
-  bytes job_id = 4;
+  string job_id = 4;
 }
cpp/cuopt_remote_server.cpp-548-562 (1)

548-562: Potential result queue slot exhaustion when storing cancelled job results.

When a cancelled job stores its result, if all result queue slots are full (all have ready == true), the loop exits without storing the result, leaving the job in an inconsistent state. Consider adding error handling or logging.

docs/developer/SERIALIZATION_PLUGIN_GUIDE.md-374-377 (1)

374-377: Add language specifier to fenced code block.

Per static analysis (MD040), use text or plaintext for console output examples.

📝 Proposed fix
```diff
-```
+```text
 [remote_solve] Loading custom serializer from: /path/to/libcuopt_my_serializer.so
 [remote_solve] Using custom serializer: my_format
```

docs/developer/SERIALIZATION_PLUGIN_GUIDE.md-13-25 (1)

13-25: Add language specifier to fenced code block.

Per static analysis (MD040), fenced code blocks should have a language specified. For ASCII diagrams, use `text` or `plaintext`.


📝 Proposed fix

```diff
-```
+```text
 ┌─────────────┐                              ┌─────────────┐
 │   Client    │                              │   Server    │
```

docs/developer/SERIALIZATION_PLUGIN_GUIDE.md-182-197 (1)

182-197: Incomplete method stubs may cause compilation errors.

The deserialize_lp_solution and deserialize_mip_solution methods lack return statements. Developers copying this template will encounter compilation errors. Add explicit return statements with placeholder solutions or // TODO comments.

📝 Proposed fix
   optimization_problem_solution_t<i_t, f_t> deserialize_lp_solution(
     const std::vector<uint8_t>& data) override
   {
     // Parse the solution data
     // Create and return solution object
+    // TODO: Implement actual deserialization logic
+    return optimization_problem_solution_t<i_t, f_t>();

     // On error, return error solution:
     // return optimization_problem_solution_t<i_t, f_t>(
     //   cuopt::logic_error("Parse error", cuopt::error_type_t::RuntimeError));
   }

   mip_solution_t<i_t, f_t> deserialize_mip_solution(
     const std::vector<uint8_t>& data) override
   {
     // Similar to LP solution
+    // TODO: Implement actual deserialization logic
+    return mip_solution_t<i_t, f_t>();
   }
conda/recipes/libcuopt/recipe.yaml-152-153 (1)

152-153: Add grpc-cpp to the run requirements for the libcuopt package.

The cuopt_grpc_server executable is installed as part of the libcuopt package and requires gRPC at runtime. Even though grpc-cpp is linked privately during the build, the installed binary still needs the library to function. This should follow the same pattern as libprotobuf, which is also in the run section despite being privately linked.

cpp/src/mip/solve.cu-482-494 (1)

482-494: Missing CUDA error checking on cudaMemcpyAsync and cudaStreamSynchronize.

Per coding guidelines, every CUDA memory operation must have error checking. The cudaMemcpyAsync and cudaStreamSynchronize calls should be wrapped with error verification.

🔧 Proposed fix
   // Variable types need special handling (char array)
   auto var_types_span = gpu_view.get_variable_types();
   if (var_types_span.size() > 0) {
     cpu_data.variable_types.resize(var_types_span.size());
-    cudaMemcpyAsync(cpu_data.variable_types.data(),
-                    var_types_span.data(),
-                    var_types_span.size() * sizeof(char),
-                    cudaMemcpyDeviceToHost,
-                    stream);
+    CUOPT_CUDA_TRY(cudaMemcpyAsync(cpu_data.variable_types.data(),
+                                    var_types_span.data(),
+                                    var_types_span.size() * sizeof(char),
+                                    cudaMemcpyDeviceToHost,
+                                    stream));
   }
 
   // Synchronize to ensure all copies are complete
-  cudaStreamSynchronize(stream);
+  CUOPT_CUDA_TRY(cudaStreamSynchronize(stream));
cpp/src/linear_programming/cuopt_c.cpp-364-375 (1)

364-375: Potential memory leak if mps_data_model_to_optimization_problem throws.

Unlike cuOptCreateProblem and similar functions which wrap GPU operations in try/catch and delete problem_and_stream on exception, this code path lacks exception handling. If the GPU allocation fails, problem_and_stream will leak.

🔧 Proposed fix
   } else {
     // Local: store in GPU memory using existing mps_data_model_to_optimization_problem
-    problem_and_stream->gpu_problem =
-      std::make_unique<optimization_problem_t<cuopt_int_t, cuopt_float_t>>(
-        mps_data_model_to_optimization_problem(problem_and_stream->get_handle_ptr(),
-                                               *mps_data_model_ptr));
-    // Create view pointing to GPU data
-    problem_and_stream->create_view_from_gpu_problem();
+    try {
+      problem_and_stream->gpu_problem =
+        std::make_unique<optimization_problem_t<cuopt_int_t, cuopt_float_t>>(
+          mps_data_model_to_optimization_problem(problem_and_stream->get_handle_ptr(),
+                                                 *mps_data_model_ptr));
+      // Create view pointing to GPU data
+      problem_and_stream->create_view_from_gpu_problem();
+    } catch (const std::exception& e) {
+      CUOPT_LOG_INFO("Error creating GPU problem: %s", e.what());
+      delete problem_and_stream;
+      *problem_ptr = nullptr;
+      return CUOPT_INVALID_ARGUMENT;
+    }
   }
python/cuopt/cuopt/linear_programming/remote.py-141-217 (1)

141-217: Socket resource leak on exception paths.

If an exception occurs after socket creation (line 143) but before sock.close() (line 173), the socket may leak. Use a context manager or try/finally to ensure proper cleanup.

🔧 Proposed fix using context manager
     try:
         # Connect to server
-        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
-        sock.settimeout(30.0)  # 30 second timeout
-        sock.connect((host, port))
-
-        # Send request (length-prefixed)
-        data = request.SerializeToString()
-        sock.sendall(struct.pack("<Q", len(data)))
-        sock.sendall(data)
-
-        # Receive response
-        size_data = b""
-        while len(size_data) < 8:
-            chunk = sock.recv(8 - len(size_data))
-            if not chunk:
-                sock.close()
-                return CancelResult(
-                    success=False,
-                    message="Failed to receive response size",
-                    job_status=JobStatus.NOT_FOUND,
-                )
-            size_data += chunk
-
-        size = struct.unpack("<Q", size_data)[0]
-        response_data = b""
-        while len(response_data) < size:
-            chunk = sock.recv(size - len(response_data))
-            if not chunk:
-                break
-            response_data += chunk
-
-        sock.close()
+        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
+            sock.settimeout(30.0)  # 30 second timeout
+            sock.connect((host, port))
+
+            # Send request (length-prefixed)
+            data = request.SerializeToString()
+            sock.sendall(struct.pack("<Q", len(data)))
+            sock.sendall(data)
+
+            # Receive response
+            size_data = b""
+            while len(size_data) < 8:
+                chunk = sock.recv(8 - len(size_data))
+                if not chunk:
+                    return CancelResult(
+                        success=False,
+                        message="Failed to receive response size",
+                        job_status=JobStatus.NOT_FOUND,
+                    )
+                size_data += chunk
+
+            size = struct.unpack("<Q", size_data)[0]
+            response_data = b""
+            while len(response_data) < size:
+                chunk = sock.recv(size - len(response_data))
+                if not chunk:
+                    break
+                response_data += chunk
cpp/src/linear_programming/utilities/protobuf_serializer.cu-702-715 (1)

702-715: Treat result_data parse failures as errors.
If parsing result_data fails, the response still reports SUCCESS but carries no solution payload, which downstream treats as a missing-solution error. Please flip the status to an error and return a message when parsing fails.

🔧 Suggested fix
       if (is_mip) {
         cuopt::remote::MIPSolution mip_sol;
-        if (mip_sol.ParseFromArray(result_data.data(), result_data.size())) {
-          result->mutable_mip_solution()->CopyFrom(mip_sol);
-        }
+        if (!mip_sol.ParseFromArray(result_data.data(), result_data.size())) {
+          result->set_status(cuopt::remote::ERROR_INTERNAL);
+          result->set_error_message("Failed to parse MIP result payload");
+        } else {
+          result->mutable_mip_solution()->CopyFrom(mip_sol);
+        }
       } else {
         cuopt::remote::LPSolution lp_sol;
-        if (lp_sol.ParseFromArray(result_data.data(), result_data.size())) {
-          result->mutable_lp_solution()->CopyFrom(lp_sol);
-        }
+        if (!lp_sol.ParseFromArray(result_data.data(), result_data.size())) {
+          result->set_status(cuopt::remote::ERROR_INTERNAL);
+          result->set_error_message("Failed to parse LP result payload");
+        } else {
+          result->mutable_lp_solution()->CopyFrom(lp_sol);
+        }
       }
🧹 Nitpick comments (33)
python/cuopt/cuopt/linear_programming/solver/solver_wrapper.pyx (5)

54-56: Unused import and module-level pyarrow.

InputValidationError is imported but doesn't appear to be used in this file. Additionally, pyarrow is imported at module level but only used for pa.float64() as a dtype specifier. Consider:

  1. Removing unused InputValidationError if not needed.
  2. Using np.float64 instead of pa.float64() since numpy is already imported and they're equivalent for dtype purposes.

59-79: Consider validating port range.

The port validation only checks if it's an integer, but valid ports are 1-65535. A port value of -1 or 99999 would pass validation but fail at connection time.

♻️ Suggested improvement
     if host and port:
         try:
-            int(port)  # Validate port is a valid integer
-            return True
+            port_num = int(port)
+            return 1 <= port_num <= 65535
         except ValueError:
             return False
     return False

82-91: Repeated lazy imports could be cached.

While Python caches imports internally, _get_cuda_imports() is called repeatedly (once per _convert_device_buffer_to_numpy call). For a solution with many arrays, this adds overhead. Consider caching at module level:

♻️ Suggested caching pattern
+_cuda_imports_cache = None
+
 def _get_cuda_imports():
     """Lazily import CUDA-dependent modules.

     Only call this when GPU operations are actually needed.
     """
+    global _cuda_imports_cache
+    if _cuda_imports_cache is not None:
+        return _cuda_imports_cache
     import cupy as cp
     import cudf
     from cuopt.utilities import series_from_buf
-    return cp, cudf, series_from_buf
+    _cuda_imports_cache = (cp, cudf, series_from_buf)
+    return _cuda_imports_cache

139-152: LGTM with a minor error message update.

The dual-path logic for CPU/GPU arrays is correct. The error message could be updated to reflect the broader support:

♻️ Suggested message update
     else:
         raise Exception(
-            "get_data_ptr must be called with cudf.Series or np.ndarray"
+            "get_data_ptr must be called with np.ndarray or CUDA array"
         )

488-498: Document the warm start limitation for remote solves.

The code silently returns empty arrays for warm start data in CPU/remote solves. Users expecting warm start functionality for remote solves may be surprised by this behavior. Consider adding a warning or documenting this limitation.

📝 Optional: Add warning for better UX
             else:
                 # CPU/remote solve - no warm start data available
+                import warnings
+                warnings.warn(
+                    "Warm start data is not available for CPU/remote solves",
+                    stacklevel=2
+                )
                 current_primal_solution = np.array([], dtype=np.float64)

Note: This may be intentional and documented elsewhere. Verify if users are expected to know about this limitation.

#!/bin/bash
# Check if warm start limitation is documented
rg -i "warm.?start" --type rst --type md -C2 | head -50
conda/environments/all_cuda-129_arch-x86_64.yaml (1)

28-28: Consider pinning compatible versions for grpc-cpp and libprotobuf.

The GRPC_BUILD_STATUS.md in this PR documents version conflicts between grpc-cpp (which requires protobuf 3.x) and libprotobuf 6.x. Without explicit version constraints here, CI builds may encounter the same conflicts.

Per the recommendation in GRPC_BUILD_STATUS.md, consider pinning compatible versions:

- grpc-cpp
+ grpc-cpp=1.62
...
- libprotobuf
+ libprotobuf=5.27

Alternatively, ensure dependencies.yaml (the source file) specifies compatible version ranges.

Also applies to: 36-36

python/cuopt/cuopt/linear_programming/internals/internals.pyx (1)

52-57: Memory lifetime responsibility should be documented.

The ctypes.from_address() approach creates a view over raw memory without ownership. The returned numpy array will become invalid if the underlying memory is deallocated.

This is appropriate for callback contexts where the memory is guaranteed valid during the callback, but consider adding a brief comment clarifying the lifetime contract:

     def get_numpy_array(self, data, shape, typestr):
+        # Note: Returned array is a view over `data` address - caller must
+        # ensure the underlying memory remains valid for the array's lifetime.
         ctype = ctypes.c_float if typestr == "float32" else ctypes.c_double
         buf_type = ctype * shape
         buf = buf_type.from_address(data)
         numpy_array = np.ctypeslib.as_array(buf)
         return numpy_array
cpp/include/cuopt/linear_programming/mip/solver_solution.hpp (1)

80-85: Consider adding a precondition check in get_solution_host() accessors.

The documentation states these are "only valid when is_device_memory() returns false." Consider adding an assertion or throwing an exception when the precondition is violated to catch misuse early rather than returning invalid/empty data.

💡 Suggested defensive check
std::vector<f_t>& get_solution_host() {
  cuopt_expects(!is_device_memory_, error_type_t::ValidationError,
                "get_solution_host() called but data is in device memory");
  return *solution_host_;
}
GRPC_ARCHITECTURE.md (1)

101-120: Minor: Python example stream_logs is incomplete.

The stream_logs method references an undefined request variable. Consider completing the example for clarity.

📝 Suggested fix
     def stream_logs(self, job_id: str):
         """Generator that yields log lines as they arrive"""
+        request = cuopt_remote_pb2.StreamLogsRequest(job_id=job_id)
         for log_msg in self.stub.StreamLogs(request):
             yield log_msg.line
cpp/src/linear_programming/utilities/cython_solve.cu (1)

249-256: Consider using std::any_of for LP/MIP detection.

The manual loop to detect MIP problems can be simplified using standard algorithms for better readability.

♻️ Suggested refactor
-    // Determine if LP or MIP based on variable types
-    bool is_mip    = false;
-    auto var_types = data_model->get_variable_types();
-    for (size_t i = 0; i < var_types.size(); ++i) {
-      if (var_types.data()[i] != 'C') {
-        is_mip = true;
-        break;
-      }
-    }
+    // Determine if LP or MIP based on variable types
+    auto var_types = data_model->get_variable_types();
+    bool is_mip = std::any_of(var_types.data(), 
+                               var_types.data() + var_types.size(),
+                               [](char c) { return c != 'C'; });
build.sh (2)

454-495: Duplicate CMake configuration violates DRY principle.

The cmake invocation for cuopt_grpc_server (lines 461-483) duplicates nearly all arguments from the libcuopt build (lines 425-446). This creates a maintenance burden where changes must be made in two places.

Consider extracting common cmake arguments into a bash array variable and reusing it in both places.

♻️ Suggested refactor approach
# Define common cmake args once
COMMON_CMAKE_ARGS=(
    -DDEFINE_ASSERT=${DEFINE_ASSERT}
    -DDEFINE_BENCHMARK="${DEFINE_BENCHMARK}"
    # ... other common args
)

# Use in libcuopt build
cmake "${COMMON_CMAKE_ARGS[@]}" "${EXTRA_CMAKE_ARGS[@]}" "${REPODIR}"/cpp

# Use in cuopt_grpc_server build (add -DCUOPT_ENABLE_GRPC=ON)
cmake "${COMMON_CMAKE_ARGS[@]}" -DCUOPT_ENABLE_GRPC=ON "${EXTRA_CMAKE_ARGS[@]}" "${REPODIR}"/cpp

489-493: Add error handling for gRPC server installation.

The install command doesn't check if the build produced the executable before attempting to copy it.

💡 Suggested fix
     if [ -z "${INSTALL_TARGET}" ]; then
         echo "Skipping install of cuopt_grpc_server (-n flag set)"
     else
+        if [ ! -f "${LIBCUOPT_BUILD_DIR}/cuopt_grpc_server" ]; then
+            echo "Error: cuopt_grpc_server binary not found in ${LIBCUOPT_BUILD_DIR}"
+            exit 1
+        fi
+        mkdir -p "${INSTALL_PREFIX}/bin/"
         install -m 755 "${LIBCUOPT_BUILD_DIR}/cuopt_grpc_server" "${INSTALL_PREFIX}/bin/"
         echo "Installed cuopt_grpc_server to ${INSTALL_PREFIX}/bin/"
     fi
cpp/cuopt_cli.cpp (2)

240-242: Potential null pointer dereference when is_remote_solve is false but handle creation fails.

If std::make_unique<raft::handle_t>() throws an exception, handle_ptr remains nullptr. While the exception will propagate, consider adding explicit error handling for CUDA initialization failures.


508-554: Silent catch block may hide parsing errors.

The try/catch block at lines 514-522 silently ignores exceptions when parsing CUOPT_GRPC_MAX_MESSAGE_MB. Consider logging a warning when the environment variable contains an invalid value.

Suggested improvement
       try {
         int64_t mb = std::stoll(env_mb);
         if (mb <= 0) {
           client_effective_bytes = -1;
         } else {
           client_effective_bytes = mb * kMiB;
         }
-      } catch (...) {
+      } catch (const std::exception& e) {
+        std::cerr << "Warning: Invalid CUOPT_GRPC_MAX_MESSAGE_MB value '" << env_mb
+                  << "', using default\n";
       }
cpp/cuopt_remote_server.cpp (3)

514-525: Race condition when claiming jobs from the queue.

The atomic compare_exchange_strong on claimed is good, but there's a TOCTOU window between checking job_queue[i].ready && !job_queue[i].claimed and the compare_exchange. Multiple workers could pass the initial check before any succeeds at the CAS, which is fine, but consider using a single atomic operation pattern for clarity.


1645-1655: Hardcoded log directory path /tmp/cuopt_logs.

Consider making the log directory configurable via command-line argument or environment variable for flexibility in deployment scenarios where /tmp may be restricted or have limited space.


2272-2273: Detached threads for client handling may outlive the server.

When the server shuts down, detached client handler threads may still be running. Consider using a thread pool or tracking active connections to ensure clean shutdown.

Consider tracking active client threads

Instead of detaching threads, consider maintaining a vector of active threads and joining them during shutdown, or using a bounded thread pool to limit concurrent connections.
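
A sketch of the join-on-shutdown alternative (names hypothetical; this assumes handlers do not spawn new handlers while join_all() holds the lock):

```cpp
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

class client_threads_t {  // hypothetical; a real server would tie this to its shutdown state
 public:
  template <typename Fn>
  void spawn(Fn&& fn)  // register-and-start instead of detach
  {
    std::lock_guard<std::mutex> lk(mtx_);
    threads_.emplace_back(std::forward<Fn>(fn));
  }

  void join_all()  // called once during server shutdown
  {
    std::lock_guard<std::mutex> lk(mtx_);
    for (auto& t : threads_) {
      if (t.joinable()) t.join();
    }
    threads_.clear();
  }

 private:
  std::mutex mtx_;
  std::vector<std::thread> threads_;
};
```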

cpp/include/cuopt/linear_programming/pdlp/solver_solution.hpp (2)

274-306: Host accessor documentation should clarify preconditions.

The documentation states "Only valid when is_device_memory() returns false" but this is only a comment, not enforced. Consider documenting the behavior when called incorrectly (undefined behavior, assertion, exception).

Enhanced documentation suggestion
   /**
    * `@brief` Returns the primal solution in host (CPU) memory.
-   * Only valid when is_device_memory() returns false.
+   * `@pre` is_device_memory() must return false, otherwise behavior is undefined.
+   * `@note` Use to_host() first if data is currently on GPU.
    *
    * `@return` std::vector<f_t>& The host memory container for the primal solution.
    */

492-509: Consider documenting thread-safety of solution objects.

Per coding guidelines for public headers, thread-safety should be documented. Add a note indicating whether solution objects are thread-safe or require external synchronization.
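
For example, a note of the kind requested (the exact guarantee stated here is an assumption to be confirmed against the implementation):

/**
 * `@note` Thread safety: concurrent calls to const accessors are safe once the
 * object is fully constructed; mutating operations (e.g. to_host()) require
 * external synchronization.
 */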

conda/environments/all_cuda-131_arch-aarch64.yaml (1)

28-36: Align/pin grpc/protobuf versions for reproducibility

Please ensure grpc-cpp and libprotobuf versions are constrained in the generator source so this environment stays reproducible and ABI-compatible with the build tooling.
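
Illustrative shape of such pins in dependencies.yaml (the version numbers below are placeholders, not a recommendation):

# Illustrative pins; choose versions matching the build toolchain.
- grpc-cpp=1.62.*
- libprotobuf=4.25.*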

conda/environments/all_cuda-131_arch-x86_64.yaml (1)

28-36: Align/pin grpc/protobuf versions for reproducibility

Same request here: please constrain grpc-cpp and libprotobuf versions in dependencies.yaml to avoid ABI drift across builds.

ci/build_wheel_libcuopt.sh (1)

20-23: Keep protobuf/gRPC versions aligned across install script and envs

Now that the wheel build always installs protobuf/gRPC and enables CUOPT_ENABLE_GRPC, please ensure install_protobuf_grpc.sh pins versions consistent with the conda environments to avoid ABI or linkage drift.

cpp/src/linear_programming/CMakeLists.txt (1)

11-12: Verify protobuf dependency is intentional for non-GRPC builds

protobuf_serializer.cu is now always compiled. If protobuf isn’t guaranteed in non-GRPC builds, consider guarding it behind a build option or ensure protobuf is always available.
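
One possible guard, assuming the existing CUOPT_ENABLE_GRPC flag is the right switch (CUOPT_SOURCES is a stand-in for whatever variable collects these sources):

if(CUOPT_ENABLE_GRPC)
  list(APPEND CUOPT_SOURCES utilities/protobuf_serializer.cu)
endif()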

cpp/include/cuopt/linear_programming/utilities/callbacks_implems.hpp (2)

43-46: Consider renaming variables to reflect conditional data source.

After the conditional change, numba_matrix may actually hold a NumPy array (host data), and numpy_array is also conditionally selected. Consider renaming to solution_array and objective_array to avoid confusion about data location.

Also applies to: 82-85


20-39: Code duplication between callback classes.

get_numba_matrix and get_numpy_array methods are identical in both default_get_solution_callback_t and default_set_solution_callback_t. Consider extracting these to a common base class or utility to reduce duplication.

Also applies to: 59-78
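
One possible shape, with simplified hypothetical signatures, hoisting the two accessors into a shared base:

// Hypothetical base; real signatures should match the existing methods.
class solution_callback_base_t {
 protected:
  void* get_numba_matrix(void* data, std::size_t n);  // formerly duplicated
  void* get_numpy_array(void* data, std::size_t n);   // formerly duplicated
};

class default_get_solution_callback_t : public solution_callback_base_t { /* ... */ };
class default_set_solution_callback_t : public solution_callback_base_t { /* ... */ };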

cpp/scripts/run_dev_server.sh (1)

1-24: Consider adding safer shell script options.

The script works correctly, but adding set -euo pipefail after the shebang would make it more robust by:

  • -e: Exit immediately on command failure
  • -u: Treat unset variables as errors
  • -o pipefail: Catch errors in pipelines
♻️ Proposed improvement
 #!/bin/bash
 # SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION. All rights reserved.
 # SPDX-License-Identifier: Apache-2.0
+set -euo pipefail

 # Development script to run the cuopt remote server with the correct library path
python/cuopt/cuopt/linear_programming/remote.py (2)

19-32: Avoid modifying sys.path with relative build directories in production code.

Inserting a relative build path (cpp/build) into sys.path is fragile and inappropriate for distributed packages. This path manipulation should be removed or guarded by a development-only flag. In production, cuopt_remote_pb2 should be installed as part of the package.

♻️ Suggested approach
 try:
-    import sys
-
-    # Add build directory for development
-    build_path = os.path.join(
-        os.path.dirname(__file__), "..", "..", "..", "..", "..", "cpp", "build"
-    )
-    if os.path.exists(build_path):
-        sys.path.insert(0, os.path.abspath(build_path))
     import cuopt_remote_pb2 as pb
 
     _HAS_PROTOBUF = True
 except ImportError:
     _HAS_PROTOBUF = False

55-72: Consider more specific type annotation.

The return type Optional[tuple] could be more descriptive as Optional[Tuple[str, int]] for better IDE support and documentation.

♻️ Suggested fix
+from typing import Optional, Tuple
+
-def get_remote_config() -> Optional[tuple]:
+def get_remote_config() -> Optional[Tuple[str, int]]:
cpp/src/mip/solve.cu (1)

383-448: Code duplication: cpu_problem_data_t is defined in multiple files.

This struct duplicates functionality from problem_cpu_data_t in cuopt_c.cpp. Consider consolidating into a shared header (e.g., in cuopt/linear_programming/utilities/) to avoid maintenance burden and divergence.

Based on learnings, refactor code duplication in solver components into shared utilities.
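
A sketch of the consolidated header (path and field names are illustrative; the real fields are whatever the two structs currently share):

// cuopt/linear_programming/utilities/cpu_problem_data.hpp (hypothetical location)
#pragma once
#include <vector>

namespace cuopt::linear_programming {

// Single definition used by both mip/solve.cu and cuopt_c.cpp.
template <typename i_t, typename f_t>
struct cpu_problem_data_t {
  std::vector<i_t> row_offsets;  // illustrative CSR fields
  std::vector<i_t> column_indices;
  std::vector<f_t> values;
  std::vector<f_t> objective;
};

}  // namespace cuopt::linear_programming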

cpp/include/cuopt/linear_programming/utilities/remote_solve.hpp (1)

39-54: Consider validating port number range.

The code catches conversion errors but doesn't validate that the port number is within the valid range (1-65535). An out-of-range port would silently fall back to local solve, which might confuse users.

♻️ Suggested improvement
   if (host != nullptr && port != nullptr && host[0] != '\0' && port[0] != '\0') {
     try {
       int port_num = std::stoi(port);
+      if (port_num < 1 || port_num > 65535) {
+        return std::nullopt;  // Invalid port range
+      }
       return remote_solve_config_t{std::string(host), port_num};
     } catch (...) {
       // Invalid port number, fall back to local solve
       return std::nullopt;
     }
   }
cpp/include/cuopt/linear_programming/utilities/remote_serialization.hpp (1)

217-237: Potential confusion: job_status_t duplicates remote_job_status_t.

The job_status_t enum inside remote_serializer_t has identical values to remote_job_status_t in remote_solve.hpp. Consider unifying these to avoid mapping overhead and potential inconsistencies.

One option is to use remote_job_status_t from remote_solve.hpp consistently throughout, or define a single shared enum in a common location.
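
For illustration, what the single shared definition could look like (member names are taken from the status strings used elsewhere in this PR; the existing remote_job_status_t may differ):

// One shared enum; remote_serializer_t would consume this directly instead of
// declaring its own job_status_t with identical values.
enum class remote_job_status_t : int {
  QUEUED,
  PROCESSING,
  COMPLETED,
  FAILED,
  CANCELLED,
  NOT_FOUND,
};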

cpp/src/linear_programming/cuopt_c.cpp (1)

33-147: Code duplication: problem_cpu_data_t is similar to struct in solve.cu.

This struct duplicates the cpu_problem_data_t in cpp/src/mip/solve.cu. Both serve the same purpose of holding CPU-side problem data. Consider consolidating into a shared utility header.

cpp/src/linear_programming/utilities/cuopt_remote.proto (1)

10-14: Remove FloatType enum and float_type field—not used in any payload.

The float_type field and FloatType enum are defined in the proto schema but never consumed by the codebase. All numeric payloads use double exclusively, making the FLOAT32 option misleading to clients expecting float32 wire format support. Remove both the enum (lines 26-29) and the field (line 13) until float32 payloads are actually implemented.

♻️ Optional clean-up
 message RequestHeader {
   uint32 version = 1;           // Protocol version (currently 1)
   ProblemType problem_type = 2; // LP or MIP
   IndexType index_type = 3;     // INT32 or INT64
-  FloatType float_type = 4;     // FLOAT32 or DOUBLE
 }
@@
-enum FloatType {
-  FLOAT32 = 0;
-  DOUBLE = 1;
-}

Comment on lines +517 to 531
# Convert EXTRA_CMAKE_ARGS into a semicolon-delimited list, escaping
# any semicolons in values so scikit-build-core treats each -D as one arg.
SKBUILD_EXTRA_ARGS=()
for extra_arg in "${EXTRA_CMAKE_ARGS[@]}"; do
    SKBUILD_EXTRA_ARGS+=("${extra_arg//;/\\;}")
done
SKBUILD_EXTRA_ARGS_JOINED=""
if [ ${#SKBUILD_EXTRA_ARGS[@]} -gt 0 ]; then
    SKBUILD_EXTRA_ARGS_JOINED="$(IFS=';'; echo "${SKBUILD_EXTRA_ARGS[*]}")"
fi
SKBUILD_CMAKE_ARGS="-DCMAKE_PREFIX_PATH=${INSTALL_PREFIX};-DCMAKE_LIBRARY_PATH=${LIBCUOPT_BUILD_DIR};-DCMAKE_CUDA_ARCHITECTURES=${CUOPT_CMAKE_CUDA_ARCHITECTURES}"
if [ -n "${SKBUILD_EXTRA_ARGS_JOINED}" ]; then
    SKBUILD_CMAKE_ARGS="${SKBUILD_CMAKE_ARGS};${SKBUILD_EXTRA_ARGS_JOINED}"
fi
python "${PYTHON_ARGS_FOR_INSTALL[@]}" .

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check all occurrences of SKBUILD_CMAKE_ARGS in build.sh
rg -n "SKBUILD_CMAKE_ARGS" build.sh

# Also check for SKBUILD pattern usage
echo "=== All SKBUILD references ==="
rg -n "SKBUILD" build.sh | head -30

Repository: NVIDIA/cuopt

Length of output: 1736


🏁 Script executed:

#!/bin/bash
# Get context around the first python command (lines 517-535)
sed -n '515,535p' build.sh | cat -n

echo "=== Second occurrence context (lines 540-560) ==="
sed -n '540,560p' build.sh | cat -n

Repository: NVIDIA/cuopt

Length of output: 2219


🏁 Script executed:

#!/bin/bash
# Find definition of PYTHON_ARGS_FOR_INSTALL
rg -n "PYTHON_ARGS_FOR_INSTALL" build.sh | head -20

echo "=== Check for export statements ==="
rg -n "export SKBUILD" build.sh

echo "=== Check broader context for python install patterns ==="
rg -n "python.*install" build.sh -A 2 -B 2 | head -50

Repository: NVIDIA/cuopt

Length of output: 438


🏁 Script executed:

#!/bin/bash
# Check scikit-build-core documentation or examples
# Look for patterns of how cmake args should be passed
rg -n "\-D[A-Z_]+=" build.sh | head -20

echo "=== Check if SKBUILD_CMAKE_ARGS should be in environment or passed differently ==="
# Search for any CMake-related env variables
rg -n "CMAKE_" build.sh | grep -E "(export|=)" | head -20

echo "=== Look for scikit-build references or documentation ==="
rg -n "scikit-build" build.sh

Repository: NVIDIA/cuopt

Length of output: 3181


🏁 Script executed:

#!/bin/bash
# Check if SKBUILD_CMAKE_ARGS should be an environment variable
# Look for any environment variable patterns in the code
sed -n '515,535p' build.sh

echo "=== Check if there's an export or env setup before the python call ==="
sed -n '500,535p' build.sh

Repository: NVIDIA/cuopt

Length of output: 2467


🏁 Script executed:

#!/bin/bash
# Search for how scikit-build-core expects to receive CMAKE args
# Check if there's documentation or comments about SKBUILD env vars
rg -n "SKBUILD\|skbuild" . --type py --type txt --type md 2>/dev/null | head -30

echo "=== Check for pyproject.toml or setup.py that might configure scikit-build ==="
fd -e "pyproject.toml" -o -e "setup.py" -o -e "setup.cfg" | head -10

Repository: NVIDIA/cuopt

Length of output: 268


🏁 Script executed:

#!/bin/bash
# Find pyproject.toml and setup files
find . -maxdepth 5 -name "pyproject.toml" -o -name "setup.py" -o -name "setup.cfg" | head -20

echo "=== Check python/cuopt directory structure ==="
ls -la python/cuopt/ | head -30

Repository: NVIDIA/cuopt

Length of output: 919


🏁 Script executed:

#!/bin/bash
# Check the pyproject.toml for python/cuopt to see scikit-build configuration
cat python/cuopt/pyproject.toml

Repository: NVIDIA/cuopt

Length of output: 3420


🏁 Script executed:

#!/bin/bash
# Check scikit-build-core documentation for how it expects CMAKE args
# Look for patterns in the codebase where SKBUILD env vars might be set
grep -r "SKBUILD_CMAKE_ARGS\|SKBUILD_BUILD" . --include="*.sh" --include="*.py" 2>/dev/null

echo "=== Check if there are any environment variable exports before python calls ==="
sed -n '515,533p' build.sh

Repository: NVIDIA/cuopt

Length of output: 2112


Export SKBUILD_CMAKE_ARGS before the python command.

The variable SKBUILD_CMAKE_ARGS is constructed (lines 527-530) but not exported, so scikit-build-core will not receive it. The CI scripts (e.g., ci/build_wheel_cuopt.sh) correctly use export SKBUILD_CMAKE_ARGS=... before invoking the build. Add export before assigning SKBUILD_CMAKE_ARGS on line 527 (and line 546 for the second occurrence).

🤖 Prompt for AI Agents
In `@build.sh` around lines 517 - 531, The SKBUILD_CMAKE_ARGS environment variable
is being built but not exported, so scikit-build-core won't see it; change the
assignment of SKBUILD_CMAKE_ARGS (the variable constructed from INSTALL_PREFIX,
LIBCUOPT_BUILD_DIR, CUOPT_CMAKE_CUDA_ARCHITECTURES and optionally
SKBUILD_EXTRA_ARGS_JOINED) to export SKBUILD_CMAKE_ARGS before invoking python
"${PYTHON_ARGS_FOR_INSTALL[@]}" . and likewise export the second
SKBUILD_CMAKE_ARGS occurrence noted in the script so the python/scikit-build
process inherits the variable.
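
For reference, the minimal shape of the fix in build.sh, reusing the assignments already in the script (export marks the variable, so the later conditional reassignment keeps the exported attribute):

export SKBUILD_CMAKE_ARGS="-DCMAKE_PREFIX_PATH=${INSTALL_PREFIX};-DCMAKE_LIBRARY_PATH=${LIBCUOPT_BUILD_DIR};-DCMAKE_CUDA_ARCHITECTURES=${CUOPT_CMAKE_CUDA_ARCHITECTURES}"
if [ -n "${SKBUILD_EXTRA_ARGS_JOINED}" ]; then
    SKBUILD_CMAKE_ARGS="${SKBUILD_CMAKE_ARGS};${SKBUILD_EXTRA_ARGS_JOINED}"
fi
python "${PYTHON_ARGS_FOR_INSTALL[@]}" .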

Comment on lines +296 to +306
# gRPC for remote solve service.
#
# We treat gRPC as required, but allow it to come from either:
# - an installed CMake package (gRPCConfig.cmake), or
# - an in-tree build (e.g. python/libcuopt uses FetchContent(grpc), which defines gRPC::grpc++).
if(NOT TARGET gRPC::grpc++)
  find_package(gRPC CONFIG REQUIRED)
endif()
set(CUOPT_ENABLE_GRPC ON)
add_compile_definitions(CUOPT_ENABLE_GRPC)
message(STATUS "gRPC enabled (target gRPC::grpc++ is available)")

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check whether CUOPT_ENABLE_GRPC is already user-configurable elsewhere
rg -n "option\(.*CUOPT_ENABLE_GRPC|CUOPT_ENABLE_GRPC" -g '*.cmake'

Repository: NVIDIA/cuopt

Length of output: 38


🏁 Script executed:

#!/bin/bash
# Also check in CMakeLists.txt files
rg -n "option\(.*CUOPT_ENABLE_GRPC|CUOPT_ENABLE_GRPC" cpp/CMakeLists.txt

Repository: NVIDIA/cuopt

Length of output: 264


🏁 Script executed:

#!/bin/bash
# Check the broader context around lines 296-306 in cpp/CMakeLists.txt
# to see if gRPC was previously optional or if there's an option() call
sed -n '280,320p' cpp/CMakeLists.txt

Repository: NVIDIA/cuopt

Length of output: 1787


🏁 Script executed:

#!/bin/bash
# Search for any option() calls that mention gRPC or enable flags
rg -n "option\(" cpp/CMakeLists.txt | head -20

Repository: NVIDIA/cuopt

Length of output: 978


Make gRPC optional or confirm it is now a mandatory dependency.

find_package(gRPC CONFIG REQUIRED) on line 302 makes gRPC unconditionally required, and CUOPT_ENABLE_GRPC is hardcoded to ON on line 304 with no corresponding user-facing option(). If gRPC-free builds were previously supported, this breaks backward compatibility. Either gate the gRPC and proto codegen logic behind a CMake option() with default OFF, or document that gRPC is now mandatory.

🤖 Prompt for AI Agents
In `@cpp/CMakeLists.txt` around lines 296 - 306, The CMake logic currently forces
gRPC by calling find_package(gRPC CONFIG REQUIRED) and unconditionally setting
CUOPT_ENABLE_GRPC via add_compile_definitions, breaking gRPC-free builds; change
this by introducing a CMake option (e.g., option(CUOPT_ENABLE_GRPC "Enable gRPC
support" OFF)) and guard the existing code that checks TARGET gRPC::grpc++,
calls find_package(gRPC ...), and sets
add_compile_definitions(CUOPT_ENABLE_GRPC) behind that option so gRPC is only
required when the option is ON (or, if you intend to make gRPC mandatory, update
project docs and remove the option but keep a clear message); ensure the
message(STATUS ...) reflects the chosen path and that proto/codegen and targets
that reference gRPC::grpc++ are only configured when CUOPT_ENABLE_GRPC is
enabled.
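
Assembled from the pieces above, the opt-in shape could look like this (message wording is illustrative):

option(CUOPT_ENABLE_GRPC "Enable gRPC remote-solve support" OFF)
if(CUOPT_ENABLE_GRPC)
  if(NOT TARGET gRPC::grpc++)
    find_package(gRPC CONFIG REQUIRED)
  endif()
  add_compile_definitions(CUOPT_ENABLE_GRPC)
  message(STATUS "gRPC enabled (target gRPC::grpc++ is available)")
else()
  message(STATUS "gRPC disabled; configure with -DCUOPT_ENABLE_GRPC=ON to enable")
endif()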

Comment on lines +303 to +313
std::string generate_job_id()
{
  static std::random_device rd;
  static std::mt19937 gen(rd());
  static std::uniform_int_distribution<uint64_t> dis;

  uint64_t id = dis(gen);
  char buf[32];
  snprintf(buf, sizeof(buf), "job_%016lx", id);
  return std::string(buf);
}

⚠️ Potential issue | 🟠 Major

Thread-unsafe static variables in generate_job_id.

The static std::mt19937 gen and std::uniform_int_distribution<uint64_t> dis are shared across threads but not protected by a mutex. This function can be called from multiple client handler threads concurrently.

🔒 Proposed fix using thread-local storage
 std::string generate_job_id()
 {
-  static std::random_device rd;
-  static std::mt19937 gen(rd());
-  static std::uniform_int_distribution<uint64_t> dis;
+  thread_local std::random_device rd;
+  thread_local std::mt19937 gen(rd());
+  thread_local std::uniform_int_distribution<uint64_t> dis;
 
   uint64_t id = dis(gen);
   char buf[32];
   snprintf(buf, sizeof(buf), "job_%016lx", id);
   return std::string(buf);
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
std::string generate_job_id()
{
  static std::random_device rd;
  static std::mt19937 gen(rd());
  static std::uniform_int_distribution<uint64_t> dis;

  uint64_t id = dis(gen);
  char buf[32];
  snprintf(buf, sizeof(buf), "job_%016lx", id);
  return std::string(buf);
}
std::string generate_job_id()
{
  thread_local std::random_device rd;
  thread_local std::mt19937 gen(rd());
  thread_local std::uniform_int_distribution<uint64_t> dis;

  uint64_t id = dis(gen);
  char buf[32];
  snprintf(buf, sizeof(buf), "job_%016lx", id);
  return std::string(buf);
}
🤖 Prompt for AI Agents
In `@cpp/cuopt_remote_server.cpp` around lines 303 - 313, generate_job_id
currently uses shared static RNG state (static std::mt19937 gen and static
std::uniform_int_distribution<uint64_t> dis) which is not thread-safe; make the
RNG state thread-local or otherwise synchronized: replace those static variables
with thread_local instances (e.g., thread_local std::mt19937 and thread_local
distribution) or protect access with a mutex inside generate_job_id, ensuring
unique IDs remain formatted via snprintf and returned as before; update
references in generate_job_id accordingly to avoid data races.

Comment on lines +84 to +85
# Static library for linking into libcuopt
add_library(mps_parser_static STATIC ${MPS_PARSER_SOURCES})

⚠️ Potential issue | 🟠 Major

Static library is missing several configurations applied to the shared library.

The mps_parser_static target lacks:

  1. set_target_properties for CXX_STANDARD, CXX_STANDARD_REQUIRED, POSITION_INDEPENDENT_CODE, and CXX_SCAN_FOR_MODULES
  2. target_compile_options for compiler flags
  3. Conditional target_include_directories for BZip2/ZLIB (lines 137-143)

This may cause build failures or inconsistent behavior when linking the static library.

Proposed fix
 # Static library for linking into libcuopt
 add_library(mps_parser_static STATIC ${MPS_PARSER_SOURCES})

+set_target_properties(mps_parser_static
+  PROPERTIES
+  CXX_STANDARD 20
+  CXX_STANDARD_REQUIRED ON
+  POSITION_INDEPENDENT_CODE ON
+  CXX_SCAN_FOR_MODULES OFF
+)
+
+target_compile_options(mps_parser_static
+  PRIVATE "$<$<COMPILE_LANGUAGE:CXX>:${MPS_PARSER_CXX_FLAGS}>"
+)

Also add BZip2/ZLIB includes for the static library after line 143:

if(MPS_PARSER_WITH_BZIP2)
    target_include_directories(mps_parser_static PRIVATE BZip2::BZip2)
endif()

if(MPS_PARSER_WITH_ZLIB)
    target_include_directories(mps_parser_static PRIVATE ZLIB::ZLIB)
endif()
🤖 Prompt for AI Agents
In `@cpp/libmps_parser/CMakeLists.txt` around lines 84 - 85, The static target
mps_parser_static is missing the same configuration as the shared target; update
CMake to apply set_target_properties on mps_parser_static to set CXX_STANDARD,
CXX_STANDARD_REQUIRED, POSITION_INDEPENDENT_CODE, and CXX_SCAN_FOR_MODULES, add
the same target_compile_options used for the shared target to mps_parser_static,
and mirror the conditional include directories for compression libs by adding
target_include_directories(mps_parser_static PRIVATE BZip2::BZip2) when
MPS_PARSER_WITH_BZIP2 and target_include_directories(mps_parser_static PRIVATE
ZLIB::ZLIB) when MPS_PARSER_WITH_ZLIB so the static library builds with the same
flags and includes.

Comment on lines +72 to +77
// Stream logs to stdout until stop_flag is true or server indicates job_complete.
// If print_prefix is non-empty, it will be printed before each log line.
void stream_logs_to_stdout(const std::string& address,
                           const std::string& job_id,
                           volatile bool* stop_flag,
                           const std::string& print_prefix);

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

find . -type f -name "*.hpp" -o -name "*.cpp" | grep -E "(remote_solve|grpc)" | head -20

Repository: NVIDIA/cuopt

Length of output: 271


🏁 Script executed:

cat -n cpp/src/linear_programming/utilities/remote_solve_grpc.hpp | head -100

Repository: NVIDIA/cuopt

Length of output: 4113


🏁 Script executed:

fd "remote_solve_grpc" --type f

Repository: NVIDIA/cuopt

Length of output: 173


🏁 Script executed:

cat -n cpp/src/linear_programming/utilities/remote_solve_grpc.cpp | head -300

Repository: NVIDIA/cuopt

Length of output: 12449


🏁 Script executed:

grep -n "stream_logs_to_stdout" cpp/src/linear_programming/utilities/remote_solve_grpc.cpp

Repository: NVIDIA/cuopt

Length of output: 114


🏁 Script executed:

sed -n '350,500p' cpp/src/linear_programming/utilities/remote_solve_grpc.cpp

Repository: NVIDIA/cuopt

Length of output: 5047


🏁 Script executed:

sed -n '597,700p' cpp/src/linear_programming/utilities/remote_solve_grpc.cpp

Repository: NVIDIA/cuopt

Length of output: 2324


🏁 Script executed:

sed -n '597,750p' cpp/src/linear_programming/utilities/remote_solve_grpc.cpp

Repository: NVIDIA/cuopt

Length of output: 2324


🏁 Script executed:

grep -rn "stream_logs_to_stdout" cpp/ --include="*.cpp" --include="*.hpp"

Repository: NVIDIA/cuopt

Length of output: 290


🏁 Script executed:

grep -n "volatile bool" cpp/src/linear_programming/utilities/remote_solve_grpc.cpp

Repository: NVIDIA/cuopt

Length of output: 112


🏁 Script executed:

grep -rn "stream_logs_to_stdout" cpp/ --include="*.cpp" -B 2 -A 2

Repository: NVIDIA/cuopt

Length of output: 535


🏁 Script executed:

grep -rn "stream_logs_to_stdout" . --include="*.py" 2>/dev/null | head -20

Repository: NVIDIA/cuopt

Length of output: 38


🏁 Script executed:

cat -n cpp/cuopt_grpc_server.cpp | head -150

Repository: NVIDIA/cuopt

Length of output: 5879


🏁 Script executed:

grep -n "stream_logs" cpp/cuopt_grpc_server.cpp

Repository: NVIDIA/cuopt

Length of output: 38


🏁 Script executed:

grep -rn "stop_flag" cpp/ --include="*.cpp" --include="*.hpp"

Repository: NVIDIA/cuopt

Length of output: 559


🏁 Script executed:

grep -rn "StreamLogs" cpp/ --include="*.cpp" --include="*.hpp"

Repository: NVIDIA/cuopt

Length of output: 429


🏁 Script executed:

sed -n '2566,2650p' cpp/cuopt_grpc_server.cpp

Repository: NVIDIA/cuopt

Length of output: 3097


🏁 Script executed:

grep -B 10 -A 30 "StreamLogs(ServerContext" cpp/cuopt_grpc_server.cpp

Repository: NVIDIA/cuopt

Length of output: 1488


Use std::atomic<bool> for the stop flag to ensure thread-safe signaling.

volatile does not synchronize between threads; it only keeps the compiler from caching or eliding accesses to the variable and provides no atomicity or memory ordering. The log streaming loop reads *stop_flag while another thread may write to it, which is a data race. Replace the parameter with std::atomic<bool>* and use load()/store() semantics, then update the implementation and all call sites.

🔧 Proposed API adjustment
+#include <atomic>
 ...
-void stream_logs_to_stdout(const std::string& address,
-                           const std::string& job_id,
-                           volatile bool* stop_flag,
-                           const std::string& print_prefix);
+void stream_logs_to_stdout(const std::string& address,
+                           const std::string& job_id,
+                           std::atomic<bool>* stop_flag,
+                           const std::string& print_prefix);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Stream logs to stdout until stop_flag is true or server indicates job_complete.
// If print_prefix is non-empty, it will be printed before each log line.
void stream_logs_to_stdout(const std::string& address,
                           const std::string& job_id,
                           volatile bool* stop_flag,
                           const std::string& print_prefix);
#include <atomic>
// ... other includes ...
// Stream logs to stdout until stop_flag is true or server indicates job_complete.
// If print_prefix is non-empty, it will be printed before each log line.
void stream_logs_to_stdout(const std::string& address,
                           const std::string& job_id,
                           std::atomic<bool>* stop_flag,
                           const std::string& print_prefix);
🤖 Prompt for AI Agents
In `@cpp/src/linear_programming/utilities/remote_solve_grpc.hpp` around lines 72 -
77, The stop flag parameter for stream_logs_to_stdout is using volatile which is
not thread-safe; change the function signature to accept std::atomic<bool>*
stop_flag, update the implementation to use stop_flag->load() (and store() where
written) for loop checks and any writers, and update all call sites to pass a
std::atomic<bool> instance instead of a volatile bool pointer so signaling is
synchronized across threads (refer to function stream_logs_to_stdout and any
callers that set/clear the flag).

Comment on lines +41 to +45
static bool logged_no_device = false;
if (!logged_no_device) {
  CUOPT_LOG_INFO("[remote_solve] No CUDA device available; using host incumbents");
  logged_no_device = true;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Protect logged_no_device against races.
static bool logged_no_device is written without synchronization, which is a data race if multiple threads call copy_incumbent_to_device. Use std::once_flag or an atomic.

🔧 Thread-safe logging
-    static bool logged_no_device = false;
-    if (!logged_no_device) {
-      CUOPT_LOG_INFO("[remote_solve] No CUDA device available; using host incumbents");
-      logged_no_device = true;
-    }
+    static std::once_flag logged_no_device;
+    std::call_once(logged_no_device, []() {
+      CUOPT_LOG_INFO("[remote_solve] No CUDA device available; using host incumbents");
+    });
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
static bool logged_no_device = false;
if (!logged_no_device) {
CUOPT_LOG_INFO("[remote_solve] No CUDA device available; using host incumbents");
logged_no_device = true;
}
static std::once_flag logged_no_device;
std::call_once(logged_no_device, []() {
CUOPT_LOG_INFO("[remote_solve] No CUDA device available; using host incumbents");
});
🤖 Prompt for AI Agents
In `@cpp/src/linear_programming/utilities/remote_solve.cu` around lines 41 - 45,
The static flag logged_no_device in copy_incumbent_to_device is written without
synchronization causing a race; replace it with a thread-safe mechanism (e.g., a
std::once_flag + std::call_once or an std::atomic<bool>) so the "[remote_solve]
No CUDA device available; using host incumbents" message is logged exactly once
without data races—locate logged_no_device and the copy_incumbent_to_device
function, remove the unsynchronized static write, and implement one of those
thread-safe patterns to perform the logging safely.

Comment on lines +219 to +227
if (status == "COMPLETED") { break; }
if (status == "FAILED" || status == "CANCELLED" || status == "NOT_FOUND") {
stop_logs = true;
if (log_thread.joinable()) { log_thread.join(); }
grpc_remote::delete_result(address, job_id);
return optimization_problem_solution_t<i_t, f_t>(
cuopt::logic_error("Remote job did not complete successfully (status=" + status + ")",
cuopt::error_type_t::RuntimeError));
}

⚠️ Potential issue | 🟠 Major

Handle unexpected status values to avoid infinite polling.
check_status can return "UNKNOWN", which currently falls through and keeps the loop alive forever. Treat unknown/unsupported status as a failure (or add a timeout) to avoid hanging remote solves.

🔧 Suggested guard
-    if (status == "COMPLETED") { break; }
-    if (status == "FAILED" || status == "CANCELLED" || status == "NOT_FOUND") {
+    if (status == "COMPLETED") { break; }
+    if (status == "QUEUED" || status == "PROCESSING") {
+      // keep waiting
+    } else if (status == "FAILED" || status == "CANCELLED" || status == "NOT_FOUND") {
+      ...
+    } else {
+      stop_logs = true;
+      if (log_thread.joinable()) { log_thread.join(); }
+      grpc_remote::delete_result(address, job_id);
+      return optimization_problem_solution_t<i_t, f_t>(
+        cuopt::logic_error("Remote job returned unexpected status: " + status,
+                           cuopt::error_type_t::RuntimeError));
     }

Also applies to: 402-410

🤖 Prompt for AI Agents
In `@cpp/src/linear_programming/utilities/remote_solve.cu` around lines 219 - 227,
The current status check in the polling loop does not handle unexpected values
like "UNKNOWN", leading to infinite polling. Update the status handling in the
relevant loop in the remote_solve function to treat any unknown or unsupported
status as a failure: set stop_logs to true, join the log thread if joinable,
delete the remote result, and return an error indicating the unexpected status.
Apply this fix to both code sections handling status checks around lines 219-227
and 402-410.

Comment on lines 148 to +157
 const rmm::device_uvector<f_t>& mip_solution_t<i_t, f_t>::get_solution() const
 {
-  return solution_;
+  return *solution_;
 }

 template <typename i_t, typename f_t>
 rmm::device_uvector<f_t>& mip_solution_t<i_t, f_t>::get_solution()
 {
-  return solution_;
+  return *solution_;
 }

⚠️ Potential issue | 🟠 Major

Potential null pointer dereference in get_solution() when using CPU-only path.

If the object was constructed with a CPU-only constructor, solution_ will be nullptr. Calling get_solution() will dereference a null pointer. Consider adding a guard or assertion.

🛡️ Suggested guard
 template <typename i_t, typename f_t>
 const rmm::device_uvector<f_t>& mip_solution_t<i_t, f_t>::get_solution() const
 {
+  cuopt_assert(is_device_memory_ && solution_, "Cannot access device solution when data is on host");
   return *solution_;
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const rmm::device_uvector<f_t>& mip_solution_t<i_t, f_t>::get_solution() const
{
  return *solution_;
}
template <typename i_t, typename f_t>
rmm::device_uvector<f_t>& mip_solution_t<i_t, f_t>::get_solution()
{
  return *solution_;
}
const rmm::device_uvector<f_t>& mip_solution_t<i_t, f_t>::get_solution() const
{
  cuopt_assert(is_device_memory_ && solution_, "Cannot access device solution when data is on host");
  return *solution_;
}
template <typename i_t, typename f_t>
rmm::device_uvector<f_t>& mip_solution_t<i_t, f_t>::get_solution()
{
  return *solution_;
}
🤖 Prompt for AI Agents
In `@cpp/src/mip/solver_solution.cu` around lines 148 - 157, The get_solution()
methods currently dereference solution_ which can be nullptr for objects
constructed via the CPU-only constructor; modify mip_solution_t<i_t,
f_t>::get_solution() (both const and non-const overloads) to guard against null
solution_—e.g., add an assertion or throw a clear exception when solution_ ==
nullptr (or return/obtain an appropriate CPU-side container if that is the
intended behavior) so callers no longer dereference a null pointer; ensure the
guard message references solution_ and the class/method name for easy tracing.

Comment on lines +159 to 169
template <typename i_t, typename f_t>
std::vector<f_t>& mip_solution_t<i_t, f_t>::get_solution_host()
{
  return *solution_host_;
}

template <typename i_t, typename f_t>
const std::vector<f_t>& mip_solution_t<i_t, f_t>::get_solution_host() const
{
  return *solution_host_;
}

⚠️ Potential issue | 🟠 Major

Similar null pointer risk in get_solution_host() when using GPU path.

If the object was constructed with a GPU constructor, solution_host_ will be nullptr. Consider adding guards to both host and device accessors.

🤖 Prompt for AI Agents
In `@cpp/src/mip/solver_solution.cu` around lines 159 - 169, mip_solution_t's
get_solution_host() and its const overload dereference solution_host_ which can
be nullptr when constructed with the GPU path; update both get_solution_host()
methods to check solution_host_ before dereferencing (e.g., if nullptr, either
lazily allocate/copy device data into solution_host_ or throw a clear exception)
and ensure the chosen behavior is consistent with other accessors (refer to
mip_solution_t constructor(s), solution_device_, and any get_solution_device()
logic) so callers won't dereference a null pointer.
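
One possible guard for the host accessors, reusing error types seen elsewhere in this PR (whether to throw, assert, or lazily copy from the device is the open design choice; to_host() here mirrors the accessor suggested for the PDLP solution type):

template <typename i_t, typename f_t>
const std::vector<f_t>& mip_solution_t<i_t, f_t>::get_solution_host() const
{
  if (!solution_host_) {
    // Fail with a clear diagnostic instead of dereferencing a null pointer.
    throw cuopt::logic_error(
      "mip_solution_t::get_solution_host(): data lives on the device; call to_host() first",
      cuopt::error_type_t::RuntimeError);
  }
  return *solution_host_;
}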

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@ci/utils/install_protobuf_grpc.sh`:
- Around line 35-40: After installing only grpc-devel in the conditional branch
in install_protobuf_grpc.sh, verify that the grpc_cpp_plugin binary is actually
present (e.g., check PATH or common install locations for grpc_cpp_plugin) and
if it is missing, treat this branch as a failure and fall back to the existing
source-build branch instead of continuing; update the if/elif/else logic so the
elif path (which echoes "Installed grpc-devel...") performs a post-install
existence check for grpc_cpp_plugin and redirects to the source build path when
absent.
🧹 Nitpick comments (2)
cpp/CMakeLists.txt (2)

740-746: Inconsistent protobuf::libprotobuf link visibility.

cuopt_remote_server links protobuf::libprotobuf as PUBLIC, while cuopt links it as PRIVATE (line 508). Unless consumers of cuopt_remote_server need direct protobuf access, consider making this PRIVATE for consistency.

♻️ Suggested fix
   target_link_libraries(cuopt_remote_server
     PUBLIC
     cuopt
     OpenMP::OpenMP_CXX
-    protobuf::libprotobuf
     PRIVATE
+    protobuf::libprotobuf
   )

773-808: Consider removing or gating dead code behind a proper option.

The if(FALSE) block (lines 773-808) is permanently disabled dead code. While the TODO explains the sources are missing, this clutters the CMakeLists. Consider either:

  1. Removing the block entirely and re-adding when sources are available
  2. Gating with a proper option(BUILD_REMOTE_TEST_CLIENTS ... OFF) if the intent is to keep as a template

Comment on lines +35 to +40
if dnf install -y grpc-devel grpc-plugins; then
    echo "Installed gRPC from dnf."
elif dnf install -y grpc-devel; then
    echo "Installed grpc-devel from dnf (grpc_cpp_plugin may be missing)."
else
    echo "grpc-devel not available. Building and installing Protobuf + gRPC from source..."

⚠️ Potential issue | 🟡 Minor

Verify grpc_cpp_plugin availability after partial install.

When only grpc-devel is installed (without grpc-plugins), the script warns that grpc_cpp_plugin may be missing but continues. This could lead to downstream CMake failures during code generation. Consider verifying the plugin exists and falling back to source build if missing.

🔧 Suggested fix
     if dnf install -y grpc-devel grpc-plugins; then
         echo "Installed gRPC from dnf."
     elif dnf install -y grpc-devel; then
-        echo "Installed grpc-devel from dnf (grpc_cpp_plugin may be missing)."
+        if ! command -v grpc_cpp_plugin >/dev/null 2>&1; then
+            echo "grpc-devel installed but grpc_cpp_plugin missing. Building from source..."
+            # Fall through to source build below
+        else
+            echo "Installed grpc-devel from dnf with grpc_cpp_plugin."
+        fi
     else
🤖 Prompt for AI Agents
In `@ci/utils/install_protobuf_grpc.sh` around lines 35 - 40, After installing
only grpc-devel in the conditional branch in install_protobuf_grpc.sh, verify
that the grpc_cpp_plugin binary is actually present (e.g., check PATH or common
install locations for grpc_cpp_plugin) and if it is missing, treat this branch
as a failure and fall back to the existing source-build branch instead of
continuing; update the if/elif/else logic so the elif path (which echoes
"Installed grpc-devel...") performs a post-install existence check for
grpc_cpp_plugin and redirects to the source build path when absent.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@cpp/CMakeLists.txt`:
- Around line 229-265: The CUOPT_ZLIB_HINT_CODE block is adding a global
CMAKE_IGNORE_PATH entry which can break downstream consumers; remove the line
that appends "/usr/lib64/cmake/ZLIB" and "/lib64/cmake/ZLIB" to
CMAKE_IGNORE_PATH inside CUOPT_ZLIB_HINT_CODE and retain only the ZLIB_DIR and
ZLIB_ROOT (and ZLIB_USE_STATIC_LIBS) hints so the
rapids_export_post_find_package_code calls (both BUILD and INSTALL for
rapids_logger) still provide targeted ZLIB guidance without modifying the global
CMake search path.
- Around line 312-377: The add_custom_command blocks for proto/grpc generation
must declare explicit target dependencies and use generator expressions
consistently: update the protoc executable selection to prefer
$<TARGET_FILE:protobuf::protoc> (matching the existing
$<TARGET_FILE:grpc_cpp_plugin> usage) and add the corresponding imported targets
(protobuf::protoc and grpc_cpp_plugin) to the DEPENDS list of their respective
add_custom_command calls so CMake will re-run the commands when those plugins
are rebuilt; adjust the GRPC custom command DEPENDS to include grpc_cpp_plugin
and the proto custom command DEPENDS to include protobuf::protoc.

Comment on lines +229 to +265
# Ensure downstream consumers prefer config packages (avoids Protobuf
# module/config mixing when gRPC also depends on Protobuf).
include(${rapids-cmake-dir}/export/detail/post_find_package_code.cmake)
set(CUOPT_DEP_PREFER_CONFIG ON)
rapids_export_post_find_package_code(BUILD rapids_logger
"set(CMAKE_FIND_PACKAGE_PREFER_CONFIG ON)"
EXPORT_SET cuopt-exports
CONDITION CUOPT_DEP_PREFER_CONFIG
)
rapids_export_post_find_package_code(INSTALL rapids_logger
"set(CMAKE_FIND_PACKAGE_PREFER_CONFIG ON)"
EXPORT_SET cuopt-exports
CONDITION CUOPT_DEP_PREFER_CONFIG
)

# Ensure ZLIB resolves to the active prefix for consumers (avoid broken system
# ZLIB config referencing missing libz.a).
set(CUOPT_ZLIB_HINT_CODE
"set(_CUOPT_PREFIX \"\${CMAKE_CURRENT_LIST_DIR}/../../..\")\n"
"if(EXISTS \"\${_CUOPT_PREFIX}/lib/cmake/ZLIB\")\n"
" set(ZLIB_DIR \"\${_CUOPT_PREFIX}/lib/cmake/ZLIB\")\n"
"endif()\n"
"set(ZLIB_ROOT \"\${_CUOPT_PREFIX}\")\n"
"set(ZLIB_USE_STATIC_LIBS OFF)\n"
"list(APPEND CMAKE_IGNORE_PATH \"/usr/lib64/cmake/ZLIB\" \"/lib64/cmake/ZLIB\")\n"
)
rapids_export_post_find_package_code(BUILD rapids_logger
"${CUOPT_ZLIB_HINT_CODE}"
EXPORT_SET cuopt-exports
CONDITION CUOPT_DEP_PREFER_CONFIG
)
rapids_export_post_find_package_code(INSTALL rapids_logger
"${CUOPT_ZLIB_HINT_CODE}"
EXPORT_SET cuopt-exports
CONDITION CUOPT_DEP_PREFER_CONFIG
)


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the full context of the CMakeLists.txt file around the ZLIB hint code
cd cpp && wc -l CMakeLists.txt

Repository: NVIDIA/cuopt

Length of output: 74


🏁 Script executed:

# Get the exact snippet and surrounding context
sed -n '229,265p' cpp/CMakeLists.txt

Repository: NVIDIA/cuopt

Length of output: 1434


🏁 Script executed:

# Check if there are any comments or documentation about CMAKE_IGNORE_PATH usage in the repo
rg "CMAKE_IGNORE_PATH" --type cmake -B 2 -A 2

Repository: NVIDIA/cuopt

Length of output: 373


🌐 Web query:

CMAKE_IGNORE_PATH scope CMake documentation persistence across packages

💡 Result:

  • What scope does CMAKE_IGNORE_PATH have?
    CMAKE_IGNORE_PATH is just a normal CMake variable unless you put it in the cache. Normal variables live in the current directory scope (and child directories created with add_subdirectory() start by copying the parent’s variables). Cache entries are separate and only changed explicitly (e.g., set(... CACHE ...) or -D...=...). [1], [2]

  • Does it “persist across packages”? It depends what you mean by “package”.

    1. Same CMake configure run (same build tree)
      If “package” means a dependency brought in via add_subdirectory()/FetchContent (i.e., configured as part of the same top-level configure), then setting CMAKE_IGNORE_PATH in a parent directory scope will generally be seen by subdirectories unless overwritten. This is standard directory-scope behavior. [2]
    2. Separate CMake configure runs (separate build trees)
      If “package” means something configured in a different CMake invocation (e.g., ExternalProject_Add()), then normal variables do not carry over. You must pass values via cache arguments / command line (e.g., -DCMAKE_IGNORE_PATH=...) or equivalent ExternalProject cache/preload mechanisms. [2], [3]
  • Does it persist across re-configures?
    Only if it’s in the CMake cache for that build directory (CMakeCache.txt). Otherwise it must be set again each configure run. [2]

  • Docs note (what it affects):
    CMAKE_IGNORE_PATH is a semicolon-separated list of directories ignored by find_*() and also affects find_package() Config mode (ignores <Name>Config.cmake found in ignored dirs). For ignoring whole prefixes, prefer CMAKE_IGNORE_PREFIX_PATH. [1], [4]

Practical takeaway: if you want it to apply “everywhere” in one build tree and survive reruns, set it as a cache variable (e.g., via -DCMAKE_IGNORE_PATH=...), and if you spawn separate CMake builds (ExternalProject), pass it through that project’s CMake cache args. [1], [2], [3]

Sources:
[1] CMake docs: CMAKE_IGNORE_PATH (latest)
[2] CMake docs: cmake-language(7) (variable scopes + cache persistence)
[3] CMake docs: ExternalProject (separate configure + cache/preload args mechanisms)
[4] CMake docs: CMAKE_IGNORE_PREFIX_PATH


Limit ZLIB hinting to avoid unintended CMAKE_IGNORE_PATH side effects on downstream consumers.

The exported CUOPT_ZLIB_HINT_CODE appends to CMAKE_IGNORE_PATH, which persists throughout the consumer's CMake configure step and affects all subsequent find_package() calls in Config mode—not just ZLIB. This can break downstream dependencies that legitimately require system CMake packages in /usr/lib64/cmake/ZLIB or /lib64/cmake/ZLIB. Remove the CMAKE_IGNORE_PATH line; ZLIB_DIR and ZLIB_ROOT alone are sufficient to guide ZLIB discovery.

🔧 Suggested fix (remove global ignore-path)
 set(CUOPT_ZLIB_HINT_CODE
   "set(_CUOPT_PREFIX \"\${CMAKE_CURRENT_LIST_DIR}/../../..\")\n"
   "if(EXISTS \"\${_CUOPT_PREFIX}/lib/cmake/ZLIB\")\n"
   "  set(ZLIB_DIR \"\${_CUOPT_PREFIX}/lib/cmake/ZLIB\")\n"
   "endif()\n"
   "set(ZLIB_ROOT \"\${_CUOPT_PREFIX}\")\n"
   "set(ZLIB_USE_STATIC_LIBS OFF)\n"
-  "list(APPEND CMAKE_IGNORE_PATH \"/usr/lib64/cmake/ZLIB\" \"/lib64/cmake/ZLIB\")\n"
 )
🤖 Prompt for AI Agents
In `@cpp/CMakeLists.txt` around lines 229 - 265, The CUOPT_ZLIB_HINT_CODE block is
adding a global CMAKE_IGNORE_PATH entry which can break downstream consumers;
remove the line that appends "/usr/lib64/cmake/ZLIB" and "/lib64/cmake/ZLIB" to
CMAKE_IGNORE_PATH inside CUOPT_ZLIB_HINT_CODE and retain only the ZLIB_DIR and
ZLIB_ROOT (and ZLIB_USE_STATIC_LIBS) hints so the
rapids_export_post_find_package_code calls (both BUILD and INSTALL for
rapids_logger) still provide targeted ZLIB guidance without modifying the global
CMake search path.

Comment on lines +312 to +377
# Generate C++ code from .proto file
set(PROTO_FILE "${CMAKE_CURRENT_SOURCE_DIR}/src/linear_programming/utilities/cuopt_remote.proto")
set(PROTO_SRCS "${CMAKE_CURRENT_BINARY_DIR}/cuopt_remote.pb.cc")
set(PROTO_HDRS "${CMAKE_CURRENT_BINARY_DIR}/cuopt_remote.pb.h")

# Find protoc compiler (provided by config package or target)
if(TARGET protobuf::protoc)
  get_target_property(_PROTOBUF_PROTOC protobuf::protoc IMPORTED_LOCATION_RELEASE)
  if(NOT _PROTOBUF_PROTOC)
    get_target_property(_PROTOBUF_PROTOC protobuf::protoc IMPORTED_LOCATION)
  endif()
else()
  set(_PROTOBUF_PROTOC ${Protobuf_PROTOC_EXECUTABLE})
endif()

if(NOT _PROTOBUF_PROTOC)
  # FindProtobuf.cmake can report libprotobuf without locating protoc.
  # Fall back to PATH lookup (e.g. /usr/local/bin/protoc or /usr/bin/protoc).
  find_program(_PROTOBUF_PROTOC protoc)
endif()

if(NOT _PROTOBUF_PROTOC)
  message(FATAL_ERROR "protoc not found (install protobuf-compiler / ensure protoc is on PATH)")
endif()

# Generate gRPC service code if gRPC is available
if(CUOPT_ENABLE_GRPC)
  if(TARGET grpc_cpp_plugin)
    set(_GRPC_CPP_PLUGIN_EXECUTABLE "$<TARGET_FILE:grpc_cpp_plugin>")
  else()
    find_program(_GRPC_CPP_PLUGIN_EXECUTABLE grpc_cpp_plugin)
    if(NOT _GRPC_CPP_PLUGIN_EXECUTABLE)
      message(FATAL_ERROR "grpc_cpp_plugin not found")
    endif()
  endif()

  set(GRPC_PROTO_FILE "${CMAKE_CURRENT_SOURCE_DIR}/src/linear_programming/utilities/cuopt_remote_service.proto")
  set(GRPC_PROTO_SRCS "${CMAKE_CURRENT_BINARY_DIR}/cuopt_remote_service.pb.cc")
  set(GRPC_PROTO_HDRS "${CMAKE_CURRENT_BINARY_DIR}/cuopt_remote_service.pb.h")
  set(GRPC_SERVICE_SRCS "${CMAKE_CURRENT_BINARY_DIR}/cuopt_remote_service.grpc.pb.cc")
  set(GRPC_SERVICE_HDRS "${CMAKE_CURRENT_BINARY_DIR}/cuopt_remote_service.grpc.pb.h")

  add_custom_command(
    OUTPUT "${GRPC_PROTO_SRCS}" "${GRPC_PROTO_HDRS}" "${GRPC_SERVICE_SRCS}" "${GRPC_SERVICE_HDRS}"
    COMMAND ${_PROTOBUF_PROTOC}
    ARGS --cpp_out ${CMAKE_CURRENT_BINARY_DIR}
         --grpc_out ${CMAKE_CURRENT_BINARY_DIR}
         --plugin=protoc-gen-grpc=${_GRPC_CPP_PLUGIN_EXECUTABLE}
         --proto_path ${CMAKE_CURRENT_SOURCE_DIR}/src/linear_programming/utilities
         ${GRPC_PROTO_FILE}
    DEPENDS ${GRPC_PROTO_FILE} ${PROTO_FILE}
    COMMENT "Generating gRPC C++ code from cuopt_remote_service.proto"
    VERBATIM
  )
endif()

add_custom_command(
  OUTPUT "${PROTO_SRCS}" "${PROTO_HDRS}"
  COMMAND ${_PROTOBUF_PROTOC}
  ARGS --cpp_out ${CMAKE_CURRENT_BINARY_DIR}
       --proto_path ${CMAKE_CURRENT_SOURCE_DIR}/src/linear_programming/utilities
       ${PROTO_FILE}
  DEPENDS ${PROTO_FILE}
  COMMENT "Generating C++ code from cuopt_remote.proto"
  VERBATIM
)

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

fd "CMakeLists.txt" --type f | head -20

Repository: NVIDIA/cuopt

Length of output: 723


🏁 Script executed:

find . -name "CMakeLists.txt" -path "*/cpp/*" -type f

Repository: NVIDIA/cuopt

Length of output: 733


🏁 Script executed:

if [ -f "cpp/CMakeLists.txt" ]; then
  echo "=== Lines 312-377 of cpp/CMakeLists.txt ==="
  sed -n '312,377p' cpp/CMakeLists.txt
  echo -e "\n=== Total file length ==="
  wc -l cpp/CMakeLists.txt
fi

Repository: NVIDIA/cuopt

Length of output: 2828


🏁 Script executed:

if [ -f "cpp/CMakeLists.txt" ]; then
  echo "=== Searching for FetchContent usage ==="
  grep -n "FetchContent\|protobuf\|grpc" cpp/CMakeLists.txt | head -30
fi

Repository: NVIDIA/cuopt

Length of output: 1979


🏁 Script executed:

if [ -f "cpp/CMakeLists.txt" ]; then
  echo "=== Lines around FetchContent declarations ==="
  sed -n '200,230p' cpp/CMakeLists.txt
  echo -e "\n=== Lines 282-310 (Protobuf/gRPC setup) ==="
  sed -n '282,310p' cpp/CMakeLists.txt
fi

Repository: NVIDIA/cuopt

Length of output: 2453


🏁 Script executed:

find . -path "*/python/libcuopt/*" -name "CMakeLists.txt" -type f | head -5

Repository: NVIDIA/cuopt

Length of output: 88


🏁 Script executed:

if [ -f "python/libcuopt/CMakeLists.txt" ]; then
  echo "=== Checking python/libcuopt CMakeLists.txt for FetchContent ==="
  grep -n "FetchContent" python/libcuopt/CMakeLists.txt | head -10
fi

Repository: NVIDIA/cuopt

Length of output: 210


🌐 Web query:

FetchContent protobuf grpc targets IMPORTED vs non-imported CMake

💡 Result:

When you consume protobuf / gRPC with FetchContent, you usually get build targets (non‑IMPORTED); when you consume them with find_package(), you usually get IMPORTED targets.

Why that happens

FetchContent_MakeAvailable(): non‑IMPORTED (built “in-tree”)

FetchContent_MakeAvailable(dep) downloads the source and (if it has a top-level CMakeLists.txt) effectively does an add_subdirectory() on it. Targets created that way are normal targets built as part of your build, so they are not IMPORTED. (cmake.org)

They may still have names like protobuf::libprotobuf / gRPC::grpc++, but in that case they’re typically ALIAS targets pointing at a real target name (because only IMPORTED and ALIAS targets may contain ::). (stackoverflow.com)

find_package(Protobuf|gRPC ...): IMPORTED (prebuilt/installed package)

When protobuf/gRPC are found via find_package(), the package’s config files (CONFIG mode) or CMake’s FindProtobuf module (MODULE mode) provide IMPORTED targets like protobuf::libprotobuf, protobuf::protoc, etc. (cmake.org)

gRPC’s installed gRPCConfig.cmake also expects certain dependency targets (protobuf, absl, etc.) to exist as imported targets, which is why errors often mention “missing imported targets …”. (stackoverflow.com)

How to tell in CMake (robustly)

get_target_property(_is_imported protobuf::libprotobuf IMPORTED)
message(STATUS "protobuf::libprotobuf IMPORTED = ${_is_imported}")
  • If it prints TRUE, it’s an IMPORTED target (from find_package).
  • If it prints FALSE (or empty), it’s a normal/alias target (typical for FetchContent/add_subdirectory).

Practical implications / common pitfall

  • Don’t mix “FetchContent builds from source” and “find_package uses installed package” for the same dependency, or you can get target-name clashes or mismatched expectations (e.g., gRPC package expecting imported protobuf targets).
  • Pick one model consistently:
    • all-in-tree (FetchContent_MakeAvailable(protobuf), FetchContent_MakeAvailable(grpc)), or
    • all-prebuilt (find_package(Protobuf CONFIG REQUIRED), find_package(gRPC CONFIG REQUIRED)).

If you share your exact error and whether you’re using find_package(... MODULE|CONFIG) (or gRPC_PROTOBUF_PACKAGE_TYPE), I can map it to the specific fix.



🏁 Script executed:

grep -rn "IMPORTED_LOCATION\|non-imported\|FetchContent" cpp/CMakeLists.txt | head -20

Repository: NVIDIA/cuopt

Length of output: 389


🌐 Web query:

CMake add_custom_command DEPENDS target executable build order required

💡 Result:

In CMake there are two different “dependency” effects you may want for a custom command:

  1. Build order only (make sure the tool target is built before the custom command runs)
    If you use an executable target as the command or reference it via generator expressions like $<TARGET_FILE:tool>, CMake adds a target-level dependency automatically (i.e., correct build order). This does not necessarily make the custom command re-run when the tool is re-linked/rebuilt. [1]

  2. Re-run when the tool binary changes (file-level dependency)
    To make the custom command re-run whenever the tool target is rebuilt, list the target name in the custom command’s DEPENDS. When the dependency is an executable/library target, CMake adds both a target-level and a file-level dependency so the rule re-runs when the target is recompiled/relinked. [1][2]

Minimal pattern that usually does what people mean by “depends on a target executable and get the build order right”:

add_executable(mytool tool.cpp)

add_custom_command(
  OUTPUT  ${CMAKE_CURRENT_BINARY_DIR}/generated.txt
  COMMAND $<TARGET_FILE:mytool> ${CMAKE_CURRENT_BINARY_DIR}/generated.txt
  DEPENDS mytool            # ensures mytool is built AND reruns if mytool changes
)

add_custom_target(generate ALL
  DEPENDS ${CMAKE_CURRENT_BINARY_DIR}/generated.txt
)

If you omit DEPENDS mytool and only use $<TARGET_FILE:mytool> in COMMAND, you typically get build order, but not a reliable “re-run when mytool changes” relationship. [1][2]

Sources: CMake add_custom_command() documentation (behavior of target references, DEPENDS on targets, target-level vs file-level deps). [1][2]


🏁 Script executed:

# Check if there are any related issues or TODOs in the file
grep -n "TODO\|FIXME\|XXX\|HACK" cpp/CMakeLists.txt | head -10

Repository: NVIDIA/cuopt

Length of output: 137


🏁 Script executed:

# Look for how other custom commands handle target dependencies in the repo
grep -B5 -A5 "add_custom_command" cpp/CMakeLists.txt | head -50

Repository: NVIDIA/cuopt

Length of output: 1306


Ensure custom command targets have explicit build dependencies and consistent generator expressions.

The custom commands for protobuf and gRPC code generation lack explicit target dependencies, so they won't re-run if protobuf::protoc or grpc_cpp_plugin are rebuilt. Additionally, protobuf::protoc uses get_target_property() for IMPORTED_LOCATION while grpc_cpp_plugin already uses the more robust $<TARGET_FILE:> generator expression—this should be made consistent.

Add explicit target dependencies to the DEPENDS clause and use generator expressions for both tools to ensure proper build ordering and re-run semantics:

Suggested adjustment
 if(TARGET protobuf::protoc)
-  get_target_property(_PROTOBUF_PROTOC protobuf::protoc IMPORTED_LOCATION_RELEASE)
-  if(NOT _PROTOBUF_PROTOC)
-    get_target_property(_PROTOBUF_PROTOC protobuf::protoc IMPORTED_LOCATION)
-  endif()
+  set(_PROTOBUF_PROTOC "$<TARGET_FILE:protobuf::protoc>")
+  set(_PROTOBUF_PROTOC_DEP protobuf::protoc)
 else()
   set(_PROTOBUF_PROTOC ${Protobuf_PROTOC_EXECUTABLE})
 endif()
 ...
 if(TARGET grpc_cpp_plugin)
   set(_GRPC_CPP_PLUGIN_EXECUTABLE "$<TARGET_FILE:grpc_cpp_plugin>")
+  set(_GRPC_CPP_PLUGIN_DEP grpc_cpp_plugin)
 else()
   find_program(_GRPC_CPP_PLUGIN_EXECUTABLE grpc_cpp_plugin)
   ...
 endif()
 ...
 add_custom_command(
   OUTPUT "${GRPC_PROTO_SRCS}" "${GRPC_PROTO_HDRS}" "${GRPC_SERVICE_SRCS}" "${GRPC_SERVICE_HDRS}"
   COMMAND ${_PROTOBUF_PROTOC}
   ARGS --cpp_out ${CMAKE_CURRENT_BINARY_DIR}
        --grpc_out ${CMAKE_CURRENT_BINARY_DIR}
        --plugin=protoc-gen-grpc=${_GRPC_CPP_PLUGIN_EXECUTABLE}
        --proto_path ${CMAKE_CURRENT_SOURCE_DIR}/src/linear_programming/utilities
        ${GRPC_PROTO_FILE}
-  DEPENDS ${GRPC_PROTO_FILE} ${PROTO_FILE}
+  DEPENDS ${GRPC_PROTO_FILE} ${PROTO_FILE} ${_PROTOBUF_PROTOC_DEP} ${_GRPC_CPP_PLUGIN_DEP}
   COMMENT "Generating gRPC C++ code from cuopt_remote_service.proto"
   VERBATIM
 )
 ...
 add_custom_command(
   OUTPUT "${PROTO_SRCS}" "${PROTO_HDRS}"
   COMMAND ${_PROTOBUF_PROTOC}
   ARGS --cpp_out ${CMAKE_CURRENT_BINARY_DIR}
        --proto_path ${CMAKE_CURRENT_SOURCE_DIR}/src/linear_programming/utilities
        ${PROTO_FILE}
-  DEPENDS ${PROTO_FILE}
+  DEPENDS ${PROTO_FILE} ${_PROTOBUF_PROTOC_DEP}
   COMMENT "Generating C++ code from cuopt_remote.proto"
   VERBATIM
 )
🤖 Prompt for AI Agents
In `@cpp/CMakeLists.txt` around lines 312 - 377, The add_custom_command blocks for
proto/grpc generation must declare explicit target dependencies and use
generator expressions consistently: update the protoc executable selection to
prefer $<TARGET_FILE:protobuf::protoc> (matching the existing
$<TARGET_FILE:grpc_cpp_plugin> usage) and add the corresponding imported targets
(protobuf::protoc and grpc_cpp_plugin) to the DEPENDS list of their respective
add_custom_command calls so CMake will re-run the commands when those plugins
are rebuilt; adjust the GRPC custom command DEPENDS to include grpc_cpp_plugin
and the proto custom command DEPENDS to include protobuf::protoc.
