About HPR-LP
HPR-LP is a GPU-accelerated solver for large-scale linear programming (LP). It is based on the Halpern–Peaceman–Rachford (HPR) method with an adaptive restart strategy for stability and speed.
Problem statement
Problem form. The general LP solved by HPR-LP is
\[
\min_{x \in \mathbb{R}^n} \; \langle c, x \rangle \quad \text{s.t.} \quad Ax \in \mathcal{K}, \;\; x \in \mathcal{C},
\]
where \(c \in \mathbb{R}^n\) is the objective vector, \(A \in \mathbb{R}^{m \times n}\) is the constraint matrix,
\(\mathcal{K} := \{ s \in \mathbb{R}^m : l_c \leq s \leq u_c \}\) with bounds \(l_c \in (\mathbb{R} \cup \{-\infty\})^m\) and \(u_c \in (\mathbb{R} \cup \{\infty\})^m\),
and \(\mathcal{C} := \{ x \in \mathbb{R}^n : l_v \leq x \leq u_v \}\) with bounds \(l_v \in (\mathbb{R} \cup \{-\infty\})^n\) and \(u_v \in (\mathbb{R} \cup \{\infty\})^n\).
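To make the roles of \(\mathcal{K}\) and \(\mathcal{C}\) concrete, here is a minimal illustrative sketch (in Python with NumPy/SciPy; the container and all names are hypothetical, not part of HPR-LP's interface) of the problem data and the two box projections, which reduce to componentwise clipping:

```python
import numpy as np
import scipy.sparse as sp

class LPData:
    """Hypothetical container for  min <c, x>  s.t.  Ax in K,  x in C.
    Infinite bounds are encoded as +/- numpy.inf."""

    def __init__(self, c, A, lc, uc, lv, uv):
        self.c = np.asarray(c, dtype=float)       # objective vector (length n)
        self.A = sp.csr_matrix(A)                 # constraint matrix (m x n)
        self.lc = np.asarray(lc, dtype=float)     # lower bounds defining K
        self.uc = np.asarray(uc, dtype=float)     # upper bounds defining K
        self.lv = np.asarray(lv, dtype=float)     # lower bounds defining C
        self.uv = np.asarray(uv, dtype=float)     # upper bounds defining C

    def proj_K(self, s):
        """Projection onto K = {s : l_c <= s <= u_c}: componentwise clipping."""
        return np.clip(s, self.lc, self.uc)

    def proj_C(self, x):
        """Projection onto C = {x : l_v <= x <= u_v}: componentwise clipping."""
        return np.clip(x, self.lv, self.uv)
```

Both projections are separable across coordinates, which is what later keeps each HPR-LP iteration cheap and GPU-friendly.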
Dual form. The corresponding dual problem is
\[
\max_{y \in \mathbb{R}^m,\, z \in \mathbb{R}^n} \; -\delta_{\mathcal{K}}^*(-y) - \delta_{\mathcal{C}}^*(-z) \quad \text{s.t.} \quad A^* y + z = c,
\]
where \(\delta_S^*(\cdot)\) denotes the convex conjugate of the indicator function \(\delta_S(\cdot)\) associated with a closed convex set \(S\).
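As a reminder (a standard fact, not quoted from the HPR-LP paper), for a box such as \(\mathcal{K}\) this conjugate is simply the support function of the box and is separable across coordinates:
\[
\delta_{\mathcal{K}}^*(v) \;=\; \sup_{l_c \le s \le u_c} \langle s, v \rangle \;=\; \sum_{i=1}^{m} \Big( (u_c)_i\,[v_i]_+ + (l_c)_i\,[v_i]_- \Big),
\]
where \([v_i]_+ := \max(v_i, 0)\), \([v_i]_- := \min(v_i, 0)\), and the convention \(0 \cdot (\pm\infty) = 0\) is used; the expression for \(\delta_{\mathcal{C}}^*\) is analogous with bounds \(l_v, u_v\).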
HPR method for LP
HPR-LP is based on the Halpern–Peaceman–Rachford (HPR) method for linear programming. The base algorithm appears below, followed by its convergence guarantees and complexity properties, which in turn motivate the algorithmic enhancements described later.
Base algorithm
For any \((y, z, x) \in \mathbb{R}^m \times \mathbb{R}^n \times \mathbb{R}^n\), the augmented Lagrangian of the dual problem is
where \(\sigma > 0\) is a penalty parameter. For notational convenience, let \(w := (y, z, x) \in \mathbb{W} := \mathbb{R}^m \times \mathbb{R}^n \times \mathbb{R}^n\). Then, an HPR method with semi-proximal terms for solving the above problems is summarized in Algorithm 1.
Remark. Steps 1–3 coincide with the Douglas–Rachford (DR) updates, Step 4 applies the Peaceman–Rachford (PR) relaxation, and Step 5 adds a Halpern anchoring step with step size \(1/(k+2)\). Together, these steps make Algorithm 1 equivalent to an accelerated, preconditioned ADMM-type method (pADMM) with \(\alpha=2\).
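To illustrate the structure, here is a minimal sketch of the Halpern anchoring alone (not the actual implementation; `pr_step` is a hypothetical stand-in for Steps 1–4 of Algorithm 1 acting on a NumPy vector \(w\)):

```python
import numpy as np

def halpern_loop(w0, pr_step, num_iters):
    """Generic Halpern iteration  w^{k+1} = 1/(k+2) * w^0 + (k+1)/(k+2) * T(w^k),
    where T(.) plays the role of the PR-relaxed update (Steps 1-4)."""
    w0 = np.asarray(w0, dtype=float)
    w = w0.copy()
    for k in range(num_iters):
        w_hat = pr_step(w)                    # Steps 1-4: DR updates + PR relaxation
        beta = 1.0 / (k + 2)                  # Halpern step size from Step 5
        w = beta * w0 + (1.0 - beta) * w_hat  # anchored averaging toward the initial point
    return w
```

Resetting the anchor \(w^0\) between outer cycles is exactly the restart mechanism described later.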
A pair \((y^*,z^*)\) is an optimal solution to the dual problem if there exists \(x^*\in\mathbb{R}^n\) such that \((y^*,z^*,x^*)\) satisfies the following KKT system:
Assumption 1. There exists a vector \((y^*, z^*, x^*) \in \mathbb{R}^m \times \mathbb{R}^n \times \mathbb{R}^n\) satisfying the KKT system above.
Under Assumption 1, the primal–dual problem is equivalent to finding \(w^*\) with \(0 \in \mathcal{T}w^*\), where the maximal monotone operator \(\mathcal{T}\) is
The global convergence of Algorithm 1 is established in the following proposition.
Proposition 2. Suppose that Assumption 1 holds. Then the sequence \(\{\bar{w}^k\} = \{(\bar{y}^k, \bar{z}^k, \bar{x}^k)\}\) generated by the HPR method with semi-proximal terms in Algorithm 1 converges to a point \(w^* = (y^*, z^*, x^*)\), where \((y^*, z^*)\) solves the dual problem and \(x^*\) solves the primal problem.
For the complexity analysis, define the self-adjoint positive semidefinite operator \(\mathcal{M}:\mathbb{R}^m \times \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^m \times \mathbb{R}^n \times \mathbb{R}^n\) as
where \(I_n\) is the \(n\times n\) identity. Two quantities are tracked: the KKT residual and the objective error. The residual map is
Let \(\{(\bar{y}^k,\bar{z}^k)\}\) be generated by Algorithm 1. Define the objective error as
where \((y^*, z^*)\) is the limit point of the sequence \(\{(\bar{y}^k, \bar{z}^k)\}\).
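In practice, termination is checked with computable residuals rather than the unobservable objective error. The sketch below is one plausible instantiation of such a check for the LP form above (relative primal/dual infeasibilities based on projections, reusing the hypothetical `LPData` container from earlier); it is an assumption for illustration, not necessarily the residual map \(\mathcal{R}\) from the analysis or HPR-LP's exact stopping test:

```python
import numpy as np

def kkt_residuals(data, x, y, z):
    """Illustrative relative KKT residuals for  min <c,x> s.t. Ax in K, x in C
    (an assumed check, not HPR-LP's exact criterion)."""
    Ax = data.A @ x
    primal = np.linalg.norm(Ax - data.proj_K(Ax)) / (1.0 + np.linalg.norm(Ax))
    bounds = np.linalg.norm(x - data.proj_C(x)) / (1.0 + np.linalg.norm(x))
    dual = np.linalg.norm(data.A.T @ y + z - data.c) / (1.0 + np.linalg.norm(data.c))
    # (complementarity / duality-gap terms are omitted in this simplified sketch)
    return max(primal, bounds, dual)
```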
The complexity of the HPR method with semi-proximal terms is summarized in the following theorem.
Theorem 3. Suppose that Assumption 1 holds. Let \(\{w^k\} = \{(y^k, z^k, x^k)\}\) and \(\{\bar{w}^k\} = \{(\bar{y}^k, \bar{z}^k, \bar{x}^k)\}\) be two sequences generated by the HPR method with semi-proximal terms in Algorithm 1, and let \(w^* = (y^*, z^*, x^*)\) be its limit point. Define \(R_0 = \|w^0 - w^*\|_{\mathcal{M}}\). Then for all \(k \geq 0\), the following iteration complexity bounds hold:
Algorithmic enhancements
To boost performance on LP (and, similarly, on convex composite quadratic programming, CCQP), the HPR method is augmented with two key enhancements: restart strategies and adaptive updates of the penalty parameter \(\sigma\), both motivated by the \(O(1/k)\) bounds in Theorem 3. For completeness, Algorithm 2 presents the full HPR-LP framework.
Restart strategy
Restarting plays a central role in Halpern iterations. The complexity bound depends on the initial anchor distance \(R_0\). As the iterates move closer to the solution, keeping a far-away anchor becomes inefficient. Resetting the anchor to the current iterate tightens the bound and improves late-stage progress.
This motivates the merit function
where \(w^*\) is any solution of the KKT system. Since \(w^*\) is unknown, the practical surrogate
is employed in defining restart rules. The following criteria are commonly adopted:
Sufficient decay:
Necessary decay + no local progress:
Long inner loop:
where \(0 < \alpha_1 < \alpha_2 < 1\) and \(0 < \alpha_3 < 1\). When any criterion is met, the inner loop restarts with \(w^{r+1,0}=\bar{w}^{r,\tau_r}\) and an updated \(\sigma_{r+1}\).
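As a concrete illustration, the following sketch shows how such a three-part test is typically organized (the specific inequalities and default thresholds are assumptions consistent with the constraints on \(\alpha_1, \alpha_2, \alpha_3\) above, not the exact rules of HPR-LP):

```python
def should_restart(R_t, R_prev, R_0, t, k_total, a1=0.2, a2=0.8, a3=0.6):
    """Illustrative restart test on the merit surrogate of the current cycle.

    R_t    -- surrogate value at inner iteration t
    R_prev -- surrogate value at the previous inner iteration
    R_0    -- surrogate value at the start of the cycle
    t      -- inner iteration counter;  k_total -- total iterations so far
    """
    sufficient_decay = R_t <= a1 * R_0                                   # criterion 1
    necessary_decay_no_progress = (R_t <= a2 * R_0) and (R_t > R_prev)   # criterion 2
    long_inner_loop = t >= a3 * k_total                                  # criterion 3
    return sufficient_decay or necessary_decay_no_progress or long_inner_loop
```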
Update rules for \(\sigma\)
The penalty parameter \(\sigma\) is updated at restart points to tighten the bound and reduce residuals. Ideally, \(\sigma\) is chosen to minimize the weighted distance to the solution:
where \(w^*\) is any solution of the KKT system.
Substituting the definition of \(\mathcal{M}\) leads to the closed-form expression
Since \((x^*,y^*)\) are unknown, observable progress is used instead:
which yields the implementable update rule
Two special cases of \(\mathcal{T}_1\) are listed below:
Case \(\mathcal{T}_1 = 0\).
This case applies when \(l_c = u_c = b\) (i.e., the constraints are the equalities \(Ax = b\)), which arises in applications where \(A\) has special structure. The \(y\)-update then reduces to solving the linear system
\[
A A^* \bar{y}^{r,t+1} = \frac{1}{\sigma_r} \big( b - A(\bar{x}^{r,t+1} + \sigma_r(\bar{z}^{r,t+1} - c)) \big),
\]
which is computationally affordable in practice. In this case, the update rule for \(\sigma\) simplifies to
\[
\sigma_{r+1} = \frac{\| \bar{x}^{r,\tau_r} - x^{r,0} \|}{\| A^*(\bar{y}^{r,\tau_r} - y^{r,0}) \|}.
\]

Case \(\mathcal{T}_1 = \lambda_A I_m - A A^*\) with \(\lambda_A \geq \|A\|_2^2\).
This choice applies when \(l_c \neq u_c\) or when solving the linear system in the first case is too expensive. The \(y\)-update takes the form
\[
\bar{y}^{r,t+1} = \frac{1}{\sigma_r \lambda_A} \Big( \Pi_{\mathcal{K}}(R_y) - R_y \Big),
\]
where \(R_y := A(2\bar{x}^{r,t+1} - x^{r,t}) - \sigma_r \lambda_A y^{r,t}\). In this setting, the update for \(\sigma\) becomes
\[\sigma_{r+1} = \frac{\| \bar{x}^{r,\tau_r} - x^{r,0} \|}{\sqrt{\lambda_A} \; \| \bar{y}^{r,\tau_r} - y^{r,0} \|}.\]
Note that the observable differences \(\Delta_x\) and \(\Delta_y\) may deviate significantly from the true (unobservable) quantities they replace, so the resulting \(\sigma_{r+1}\) is only an approximation of the ideal choice.
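Put together, the two update rules above can be sketched as follows (a simple NumPy illustration; the function name, argument layout, and the division-by-zero safeguard are assumptions):

```python
import numpy as np

def next_sigma(x_bar, x0, y_bar, y0, A=None, lambda_A=None, eps=1e-12):
    """Restart-time sigma update following the two special cases above:
    pass the sparse matrix A for the case T_1 = 0, or lambda_A >= ||A||_2^2
    for the case T_1 = lambda_A * I_m - A A^*."""
    dx = np.linalg.norm(x_bar - x0)
    if A is not None:
        dy = np.linalg.norm(A.T @ (y_bar - y0))          # ||A^*(y_bar - y_0)||
    else:
        dy = np.sqrt(lambda_A) * np.linalg.norm(y_bar - y0)
    return dx / max(dy, eps)                             # sigma_{r+1} = Delta_x / Delta_y
```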
GPU implementation
We first present the update formulas for each subproblem in HPR-LP. Specifically, for any \(r\ge 0\) and \(t\ge 0\), the update of \(z^{r,t+1}\) is:
The update of \(x^{r,t+1}\) is:
For general LP problems, set \(\mathcal{T}_1 = \lambda_A I_m - AA^*\) with \(\lambda_A \ge \lambda_1(AA^*)\) in HPR-LP. The update for \(y^{r,t+1}\) is:
where \(R_y := A\big(2x^{r, t+1} - x^{r,t}\big) - \sigma_r \lambda_A y^{r,t}\). Combining these relations shows that \(z^{r,t+1}\) need not be computed at every iteration; it is only required when checking the termination criteria. Each iteration thus reduces to sparse matrix–vector products (SpMV), vector operations, and simple projections, with per-iteration cost \(O(\mathrm{nnz}(A))\).
On GPUs, these operations are mapped to custom CUDA kernels. Matrix–vector products use cusparseSpMV() with CUSPARSE_SPMV_CSR_ALG2 for deterministic results.
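For illustration only, the \(y\)-update shown earlier for \(\mathcal{T}_1 = \lambda_A I_m - AA^*\) can be written with CuPy arrays, whose sparse matrix–vector products dispatch to cuSPARSE; the actual solver uses custom CUDA kernels and calls cusparseSpMV() directly as described above:

```python
import cupy as cp
import cupyx.scipy.sparse as cusp  # A below is assumed to be a cusp.csr_matrix on the GPU

def y_update_sketch(A, x_new, x_old, y_old, lc, uc, sigma, lam):
    """GPU sketch of the y-update: one SpMV, a few axpy-style vector
    operations, and a box projection, i.e. O(nnz(A) + m) work per call."""
    Ry = A @ (2.0 * x_new - x_old) - sigma * lam * y_old   # R_y as defined above (SpMV + vector ops)
    return (cp.clip(Ry, lc, uc) - Ry) / (sigma * lam)      # Pi_K(R_y) - R_y, then scaling
```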