Introduction to SOLNP ver 2.0.0

Introduction

The SOLNP algorithm, first proposed by Ye (1989), is a Sequential Quadratic Programming (SQP) approach designed to solve general nonlinear optimization problems with both equality and inequality constraints. In 2010, it was implemented in R as the Rsolnp package, originally to fill the need for a robust and flexible constrained optimizer required by the author’s own GARCH modeling package, rugarch. Since then, Rsolnp has remained one of the few readily available and user-friendly options for nonlinear constrained optimization in R.

Despite the growth of R’s ecosystem in statistics and data science, the development and availability of modern nonlinear constrained optimization solvers in R has lagged behind other languages, with notable exceptions being nloptr (which wraps NLopt solvers) and ipoptr (an R interface to the IPOPT solver)1. Moreover, in the nonlinear programming (NLP) space, most state-of-the-art solvers remain closed source and commercial, in contrast to other algorithmic advances that have transitioned from academia to open-source availability.

Rsolnp does not claim to be competitive with commercial grade nonlinear solvers, but it has generally served its purpose reasonably well for small to medium sized problems.

Problem Statement

The general nonlinear optimization problem addressed by SOLNP can be stated as follows:

\[ \begin{align} & \min_{x \in \mathbb{R}^n} \quad f(x) \\\\ & \text{subject to:} \\\\ & \quad l_g \leq g(x) \leq u_g \qquad \text{(general inequality constraints)} \\\\ & \quad h(x) = B \qquad \text{(equality constraints with bound vector \(B\))} \\\\ & \quad x^L \leq x \leq x^U \qquad \text{(variable bounds)} \end{align} \]

where:

Note:

Inequalities

Each inequality constraint \(l_i \leq g_i(x) \leq u_i\) is transformed into two equality constraints via slack variables. Introduce slack variable \(s_i \geq 0\), then define:

\[ h_i(x, s_i) = g_i(x) - s_i = l_i \quad \text{if } g_i(x) \geq l_i \\ h_i(x, s_i) = g_i(x) + s_i = u_i \quad \text{if } g_i(x) \leq u_i \]

This can be unified into:

\[ g_i(x) - s_i = l_i \quad \text{and} \quad g_i(x) + s_i = u_i \]

So every bounded inequality becomes an equality constraint with a slack, and the slack becomes part of the optimization variable vector.

In the actual implementation, if a constraint is double-bounded, a “penalty equality constraint” is introduced \(h_i(x, s_i) := g_i(x) - s_i = m_i\) and add a penalty term to objective if \(s_i < 0\).

Slack Variable Embedding

The augmented optimization vector becomes:

\(\tilde{x} = \begin{bmatrix} x \\ s \end{bmatrix}\)

And the original constraints are reformulated into a system of equalities only:

\[ \tilde{g}(x, s) = \begin{bmatrix} g_{\text{eq}}(x) \\ g_{\text{ineq}}(x) - s - l \\ g_{\text{ineq}}(x) + s - u \end{bmatrix} = 0 \]

Optimization via Augmented Lagrangian

Unlike traditional SQP, SOLNP does not explicitly solve for the Karush–Kuhn–Tucker (KKT) conditions. Instead, it uses a partial augmented Lagrangian:

\[ L(x, \lambda, \rho) = f(x) + \lambda^\top g(x) + \frac{\rho}{2} \|g(x)\|^2 \]

where:

No Stationarity-Based KKT Step

Unlike SQP or interior-point methods, SOLNP does not solve the full KKT system at each iteration, instead using:

The stationarity check is only implicit in terms of convergence of \(\nabla f + J^\top \lambda \to 0\), not enforced directly.

Why Higher Tolerances Are Not Preferred

Setting the tolerance in SOLNP (or augmented Lagrangian solvers) much tighter than 1e-8—such as 1e-12—can actually degrade solution quality because the algorithm becomes sensitive to numerical round-off and finite precision errors, especially in gradient and Hessian computations or matrix solves. As the tolerance approaches the limits of double-precision arithmetic, the solver may “chase noise” rather than real improvements, leading to erratic or even worse solutions, with unreliable convergence and possible violation of constraints. For most problems, practical and reliable results are achieved with tolerances in the 1e-6 to 1e-8 range.

C++ Version

The original solnp function did not make use of analytic gradients or Jacobians, and was written entirely in R, a direct translation from the Matlab code. Since version 2.0.0, a new function csolnp, written in C++2 allows the user to provide analytic gradient and/or Jacobians for the inequality and equality constraints. In the absence of analytic functions, finite differences are used using the functions from the numDeriv package.

The function signature for the csolnp function is now slightly different from that of solnp in both content and code styling:

Code
args(csolnp)
#> function (pars, fn, gr = NULL, eq_fn = NULL, eq_b = NULL, eq_jac = NULL, 
#>     ineq_fn = NULL, ineq_lower = NULL, ineq_upper = NULL, ineq_jac = NULL, 
#>     lower = NULL, upper = NULL, control = list(), use_r_version = FALSE, 
#>     ...) 
#> NULL

vs

Code
args(solnp)
#> function (pars, fun, eqfun = NULL, eqB = NULL, ineqfun = NULL, 
#>     ineqLB = NULL, ineqUB = NULL, LB = NULL, UB = NULL, control = list(), 
#>     ...) 
#> NULL

Speedup of anywhere up to 10x can be expected for some problems. Additionally, certain enhancements have been made in the C++ code in regards to criteria for early termination to avoid stalling in the line search phase.

Multi-Start Solver

A new function csolnp_ms implements a multi-start strategy, similar to the existing gosolnp function. However, it differs in the way starting candidate values are calculated, with the original function using rejection sampling to find feasible starting points, whilst the new function uses a combination of buffered box sampling and a fast interior-point feasibility projection. This approach efficiently generates initial values that strictly satisfy parameter bounds and nonlinear constraints by first sampling points slightly away from the boundaries and then rapidly projecting them into the feasible region using a penalized minimization. This results in more reliable and diverse initial candidates for multi-start optimization, especially in problems with complex or tightly constrained feasible regions.

Test Suite

As of version 2.0.0, a new test suite based on Hock and Schittkowski (1980) and Schittkowski (2012) has been included, translated from the Fortran codes here. Currently, about 60 problems of the 306 have been translated, and it is the intention of the author to eventually translate all the tests.

Each test, returns a list with the following information:

The suite can be called at once or by reference to a specific test, as illustrated below.

Code
prob <- solnp_problem_suite(number = 10)
sol <- csolnp(pars = prob$start, fn = prob$fn, gr = prob$gr, eq = prob$eq_fn, eq_b = prob$eq_b, 
              eq_jac = prob$eq_jac, ineq_fn = prob$ineq_fn, ineq_lower = prob$ineq_lower, 
              ineq_upper = prob$ineq_upper, ineq_jac = prob$ineq_jac, lower = prob$lower,
              upper = prob$upper)
#> Warning in solnp_problem_setup(pars, fn, gr, eq_fn, eq_b, eq_jac, ineq_fn, : 
#> lower inequality values violated with initial pars.
print(prob$name)
#> [1] "hs10"
print(c("convergence" = sol$convergence))
#> convergence 
#>           0
print(c("csolnp objective" = sol$objective, "best objective" = prob$best_fn))
#> csolnp objective   best objective 
#>       -0.9998931       -1.0000000
print(sol$elapsed_time)
#> Time difference of 0.005572796 secs

We also test a more specific problem based on the GARCH(1,1) model with analytic derivatives:

Code
prob <- solnp_problem_suite(suite = "Other", number = 4)
sol <- csolnp(pars = prob$start, fn = prob$fn, gr = prob$gr, eq = prob$eq_fn, eq_b = prob$eq_b, 
              eq_jac = prob$eq_jac, ineq_fn = prob$ineq_fn, ineq_lower = prob$ineq_lower, 
              ineq_upper = prob$ineq_upper, ineq_jac = prob$ineq_jac, lower = prob$lower,
              upper = prob$upper)
print(prob$name)
#> [1] "garch"
print(c("convergence" = sol$convergence))
#> convergence 
#>           0
print(c("csolnp objective" = sol$objective, "best objective" = prob$best_fn))
#> csolnp objective   best objective 
#>          1074.36          1074.36
print(sol$elapsed_time)
#> Time difference of 0.01990199 secs

Finally, we take a look at a solver-killer problem, HS55 (from the Hock-Schittkowski suite), which poses the following challenges for NLP solvers:

Instead of using csolnp which fails in this case, we try out the csolnp_ms approach.

Code
prob <- solnp_problem_suite(number = 55)
sol <- csolnp_ms(prob$fn, gr = prob$gr, eq_fn = prob$eq_fn, eq_b = prob$eq_b, 
              eq_jac = prob$eq_jac, ineq_fn = prob$ineq_fn, ineq_lower = prob$ineq_lower, 
              ineq_upper = prob$ineq_upper, ineq_jac = prob$ineq_jac, lower = prob$lower,
              upper = prob$upper, n_candidates = 200, penalty = 1e5, 
              control = list(min_iter = 1000, max_iter = 100, tol = 1e-8),
              seed = 300)

print(paste0("Solution : ", round(sol$objective,3), " | Best Objective :", round(prob$best_fn,3)))
#> [1] "Solution : 6.257 | Best Objective :6.333"
print(paste0("Equaility Violation : ", round(sol$kkt_diagnostics$eq_violation,3)))
#> [1] "Equaility Violation : 0.172"

The result is is not too bad, though the equality violation is still > 1e-6 which is what we were aiming for.

Differences with full Active-Set/SQP solvers

Below is a short summary of the key differences between the SOLNP algorithm and full Active-set/SQP methods:

Aspect SOLNP Active-Set SQP / QP
Inequality handling Slacks, penalty, all treated equally Active set, complementarity enforced
Subproblem Least squares (QR, no QP) Quadratic programming
KKT stationarity Not always enforced Enforced at every step
Complementarity Not enforced Enforced at every step
Scaling sensitivity Can be significant Generally better
Globalization Penalty parameter tuning crucial Line search/trust region, more robust
Numerical behavior Simple, robust for easy problems More complex, better for hard ones

Conclusion

The C++ rewrite of solnp, inclusion of analytic derivative options and the new test suite should hopefully elevate the quality of the Rsolnp package. Future enhancements will include expansion of the test suite and a deeper investigation into some of the underlying approaches used and ways to enhance the solver.

References

Hock, W, and K Schittkowski. 1980. “Test Examples for Nonlinear Programming Codes.” Journal of Optimization Theory and Applications 30 (1): 127–29.
Schittkowski, Klaus. 2012. “More Test Examples for Nonlinear Programming Codes.” Lecture Notes in Economics and Mathematical Systems 282.
Ye, Yinyu. 1989. SOLNP USERS’GUIDE.

  1. the latter is no longer available on CRAN due to licensing issues↩︎

  2. Using Rcpp and RcppArmadillo↩︎

mirror server hosted at Truenetwork, Russian Federation.