Abstract
We focus on the minimization of the least squares loss function under a k-sparse constraint encoded by an \(\ell _0\) pseudo-norm. This is a non-convex, non-continuous and NP-hard problem. Recently, for the penalized form (the sum of the least squares loss function and an \(\ell _0\) penalty term), a relaxation has been introduced with strong guarantees on its minimizers. This relaxation is continuous and, among other favorable properties, does not change the global minimizers. The question that drives this paper is the following: can a continuous relaxation of the k-sparse constrained problem be developed following the same idea and the same steps as for the penalized \(\ell _2-\ell _0\) problem? We compute the convex envelope of the constrained problem when the observation matrix is orthogonal and propose a continuous, non-smooth, non-convex relaxation of the k-sparse constrained functional. We establish equivalences between minimizers of the original and the relaxed problems. We compute the subgradient as well as the proximal operator of the new regularization term, and we propose an algorithm that ensures convergence to a critical point of the k-sparse constrained problem. We apply the algorithm to the problem of single-molecule localization microscopy and compare the results with well-known sparse minimization schemes. The results of the proposed algorithm are as good as the state-of-the-art results for the penalized form, while fixing the constraint constant is usually more intuitive than fixing the penalty parameter.
References
Andersson, F., Carlsson, M., Olsson, C.: Convex envelopes for fixed rank approximation. Optim. Lett. 11(8), 1783–1795 (2017)
Bechensteen, A., Blanc-Féraud, L., Aubert, G.: New \(\ell _2-\ell _0\) algorithm for single-molecule localization microscopy. Biomed. Opt. Express 11(2), 1153–1174 (2020)
Beck, A., Eldar, Y.C.: Sparsity constrained nonlinear optimization: optimality conditions and algorithms. SIAM J. Optim. 23(3), 1480–1509 (2013)
Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1), 183–202 (2009). https://doi.org/10.1137/080716542
Betzig, E., Patterson, G.H., Sougrat, R., Lindwasser, O.W., Olenych, S., Bonifacino, J.S., Davidson, M.W., Lippincott-Schwartz, J., Hess, H.F.: Imaging intracellular fluorescent proteins at nanometer resolution. Science 313(5793), 1642–1645 (2006). https://doi.org/10.1126/science.1127344
Bi, S., Liu, X., Pan, S.: Exact penalty decomposition method for zero-norm minimization based on MPEC formulation. SIAM J. Sci. Comput. 36(4), A1451–A1477 (2014)
Bolte, J., Sabach, S., Teboulle, M.: Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Math. Program. 146(1), 459–494 (2014). https://doi.org/10.1007/s10107-013-0701-9
Bourguignon, S., Ninin, J., Carfantan, H., Mongeau, M.: Exact sparse approximation problems via mixed-integer programming: formulations and computational performance. IEEE Trans. Signal Process. 64(6), 1405–1419 (2016)
Breiman, L.: Better subset regression using the nonnegative garrote. Technometrics 37(4), 373–384 (1995). https://doi.org/10.2307/1269730
Burke, J.V., Curtis, F.E., Lewis, A.S., Overton, M.L., Simões, L.E.: Gradient sampling methods for nonsmooth optimization. arXiv preprint arXiv:1804.11003 (2018)
Candes, E.J., Romberg, J., Tao, T.: Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inf. Theory 52(2), 489–509 (2006). https://doi.org/10.1109/TIT.2005.862083
Candes, E.J., Wakin, M.B., Boyd, S.P.: Enhancing sparsity by reweighted \(\ell _1\) minimization. J. Fourier Anal. Appl. 14(5–6), 877–905 (2008)
Carlsson, M.: On convexification/optimization of functionals including an l2-misfit term. arXiv preprint arXiv:1609.09378 (2016)
Carlsson, M.: On convex envelopes and regularization of non-convex functionals without moving global minima. J. Optim. Theory Appl. 183(1), 66–84 (2019)
Chahid, M.: Echantillonnage compressif appliqué à la microscopie de fluorescence et à la microscopie de super résolution. Ph.D. thesis, Bordeaux (2014)
Clarke, F.H.: Optimization and Nonsmooth Analysis, vol. 5. SIAM, Philadelphia (1990)
Combettes, P.L., Wajs, V.R.: Signal recovery by proximal forward–backward splitting. Multiscale Model. Simul. 4(4), 1168–1200 (2005)
Gazagnes, S., Soubies, E., Blanc-Féraud, L.: High density molecule localization for super-resolution microscopy using CEL0 based sparse approximation. In: 2017 IEEE 14th International Symposium on Biomedical Imaging (ISBI 2017), pp. 28–31. IEEE (2017)
Hess, S.T., Girirajan, T.P.K., Mason, M.D.: Ultra-high resolution imaging by fluorescence photoactivation localization microscopy. Biophys. J. 91(11), 4258–4272 (2006). https://doi.org/10.1529/biophysj.106.091116
Larsson, V., Olsson, C.: Convex low rank approximation. Int. J. Comput. Vis. 120(2), 194–214 (2016). https://doi.org/10.1007/s11263-016-0904-7
Li, H., Lin, Z.: Accelerated proximal gradient methods for nonconvex programming. In: Advances in Neural Information Processing Systems, pp. 379–387 (2015)
Lu, Z., Zhang, Y.: Sparse approximation via penalty decomposition methods. SIAM J. Optim. 23(4), 2448–2478 (2013)
Mallat, S.G., Zhang, Z.: Matching pursuits with time–frequency dictionaries. IEEE Trans. Signal Process. 41(12), 3397–3415 (1993). https://doi.org/10.1109/78.258082
Mordukhovich, B.S., Nam, N.M.: An easy path to convex analysis and applications. Synth. Lect. Math. Stat. 6(2), 1–218 (2013)
Nikolova, M.: Relationship between the optimal solutions of least squares regularized with \(\ell _0\)-norm and constrained by k-sparsity. Appl. Comput. Harmonic Anal. 41(1), 237–265 (2016). https://doi.org/10.1016/j.acha.2015.10.010
Pati, Y.C., Rezaiifar, R., Krishnaprasad, P.S.: Orthogonal matching pursuit: recursive function approximation with applications to wavelet decomposition. In: Proceedings of 27th Asilomar Conference on Signals, Systems and Computers, Vol. 1, pp. 40–44 (1993). https://doi.org/10.1109/ACSSC.1993.342465
Peleg, D., Meir, R.: A bilinear formulation for vector sparsity optimization. Signal Process. 88(2), 375–389 (2008). https://doi.org/10.1016/j.sigpro.2007.08.015
Pilanci, M., Wainwright, M.J., El Ghaoui, L.: Sparse learning via Boolean relaxations. Math. Program. 151(1), 63–87 (2015)
Rockafellar, R.T., Wets, R.J.B.: Variational Analysis, vol. 317. Springer, Berlin (2009)
Rust, M.J., Bates, M., Zhuang, X.: Sub-diffraction-limit imaging by stochastic optical reconstruction microscopy (STORM). Nat. Methods 3(10), 793–796 (2006). https://doi.org/10.1038/nmeth929
Sage, D., Kirshner, H., Pengo, T., Stuurman, N., Min, J., Manley, S., Unser, M.: Quantitative evaluation of software packages for single-molecule localization microscopy. Nat. Methods 12(8), 717 (2015)
Sage, D., Pham, T.A., Babcock, H., Lukes, T., Pengo, T., Chao, J., Velmurugan, R., Herbert, A., Agrawal, A., Colabrese, S., et al.: Super-resolution fight club: assessment of 2d and 3d single-molecule localization microscopy software. Nat. Methods 16(5), 387–395 (2019)
Selesnick, I.: Sparse regularization via convex analysis. IEEE Trans. Signal Process. 65(17), 4481–4494 (2017)
Simon, B.: Trace Ideals and Their Applications, vol. 120. American Mathematical Society, Providence (2005)
Soubies, E., Blanc-Féraud, L., Aubert, G.: A continuous exact \(\ell _0\) penalty (CEL0) for least squares regularized problem. SIAM J. Imaging Sci. 8(3), 1607–1639 (2015)
Soubies, E., Blanc-Féraud, L., Aubert, G.: A unified view of exact continuous penalties for \(\ell _2-\ell _0\) minimization. SIAM J. Optim. 27(3), 2034–2060 (2017)
Soussen, C., Idier, J., Brie, D., Duan, J.: From Bernoulli–Gaussian deconvolution to sparse signal restoration. IEEE Trans. Signal Process. 59(10), 4572–4584 (2011)
Tono, K., Takeda, A., Gotoh, J.: Efficient DC algorithm for constrained sparse optimization. arXiv preprint arXiv:1701.08498 (2017)
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Acknowledgements
The authors would like to thank the anonymous reviewers for their detailed comments and suggestions. This work has been supported by the French government, through a financial Ph.D. allocation from MESRI and through the 3IA Côte d’Azur Investments in the Future project managed by the National Research Agency (ANR) with the Reference Number ANR-19-P3IA-0002.
A Appendix
A.1 Preliminary Results for Lemma 1
Proposition 2 (Reminder) Let \(x\in {\mathbb {R}}^N.\) There exists \(j\in {\mathbb {N}}\) such that \(0<j\le k\) and
where the left inequality is strict if \(j\ne 1\), and where \(x^{\downarrow }_0=+\infty \). Furthermore, \(T_k(x)\) is defined as the smallest integer that satisfies the double inequality.
Proof
First, we suppose that (23) is not true for \(j\in \{1,2,\dots , k-1\}\), i.e., either
or
or both. We prove by induction that if (23) is not true \(\forall j \in \{1,2,\dots ,k-1\}\), then (24) is false and (25) is true. We first investigate the case \(j=1\):
The above inequality is obvious, and we can conclude that for \(j=1\), (24) is false, and thus, (25) must be true, i.e.,
We suppose that for some \(j\in \{1,2,\dots ,k-1\}\), (24) is false and (25) is true, and we investigate \(j+1\).
We get (28) since we have supposed that (25) is true for j. Thus, by induction, we conclude that (24) is false and (25) is true \(\forall j \in \{1,2,\dots ,k-1\}\).
Now, we investigate \(j=k\):
We use the fact that (25) is true for \(j=k-1\) to obtain the above inequality. Hence, (24) is false. By definition \(x^\downarrow _0=+\infty \), so (25) is also false. Therefore, \(T_k(x)=k\) satisfies the double inequality in (23).
To conclude, either \(T_k(x)=k\), or there exists \(j\in \{1,2,\dots ,k-1\}\) such that \(T_k(x)=j\). \(\square \)
Definition 4
Let \(P^{(x)}\in {\mathbb {R}}^{N\times N}\) be a permutation matrix such that \(P^{(x)}x=x^{\downarrow }\). The space \({\mathcal {D}}(x)\) is defined as:
\(z \in {\mathcal {D}}(x)\) means \(\langle z,x\rangle =\langle z^{\downarrow },x^{\downarrow }\rangle \).
Remark 3
\({\mathcal {D}}(x)={\mathcal {D}}(|x|)\), since we have \(|x^{\downarrow }|=|x|^\downarrow \).
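For intuition, the membership test defining \({\mathcal {D}}(x)\) is easy to check numerically. The following is a minimal sketch (assuming, consistently with Remark 3, that \(x^{\downarrow }\) reorders x by non-increasing magnitude while keeping signs; the helper names are ours):

```python
import numpy as np

def sort_abs_desc(v):
    # v^downarrow: reorder v so that |v| is non-increasing (signs kept)
    return v[np.argsort(-np.abs(v), kind="stable")]

def in_D(z, x, tol=1e-12):
    # z is in D(x)  iff  <z, x> equals <z^downarrow, x^downarrow>
    return abs(np.dot(z, x) - np.dot(sort_abs_desc(z), sort_abs_desc(x))) <= tol
```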
Proposition 4
Let \((a,b)\in {\mathbb {R}}_{\ge 0}^N\times {\mathbb {R}}_{\ge 0}^N\). Then,
\(\sum _i a_ib_i \le \sum _i a^{\downarrow }_i b^{\downarrow }_i,\)
and the inequality is strict if \(b\notin {\mathcal {D}}(a)\).
Proof
The inequality is proved in [34, Lemma 1.8], but without the strict-inequality statement.
We assume that a is not of the form \(a=t(1,1,\dots ,1)^T\), i.e., there exist \(i\ne j\) such that \(a_i\ne a_j\). Indeed, if \(a=t(1,1,\dots ,1)^T\), then \(b\in {\mathcal {D}}(a)\) and \(\sum _i a_ib_i =\sum _i a^{\downarrow }_i b^{\downarrow }_i\). Moreover, for simplicity and without loss of generality, we suppose \(a=a^{\downarrow }\). We write
As it is obvious that \(\forall \, j=1,\dots ,N\)
and since \(a_{j-1}-a_j\ge 0\, \forall \, j\), we get
The goal of Proposition 4 is to show that the inequality in (32) is strict if \(b\notin {\mathcal {D}}(a)\).
First, we remark that if \(b\notin {\mathcal {D}}(a)\), then there exists \(j_0\in \{2,3,\dots ,N\}\) such that
By contradiction, if (33) is not true, we have \(\forall \, j\in \{2,3,\dots ,N\}\)
and with (31), we get
From (34), we easily obtain \(\forall \, j,\)
which means \(b^{\downarrow }=b\), i.e., \(b\in {\mathcal {D}}(a)\), which contradicts the hypothesis \(b\notin {\mathcal {D}}(a)\). So there exists \(j_0\) such that (33) is true, and if \(a_{j_0-1}\ne a_{j_0}\)
which, with (30), implies
It remains to examine the case where \(a_{j_0-1}=a_{j_0}\). In this case, we claim there exists \(j_1\in \{1,\dots ,j_0-2\}\) such that
or \(j_1\in \{j_0,\dots ,N\}\) such that
If not, with the same proof as before we get
i.e., we have
where \((x_1,x_2)=(b_{j_0-1},b_{j_0})\) or \((b_{j_0},b_{j_0-1})\). The order does not matter since \(a_{j_0-1}=a_{j_0}\). This implies that \(b\in {\mathcal {D}}(a)\), which contradicts the hypothesis. So (35) or (36) holds, and we get, for example,
and if \(a_{j_1-1}-a_{j_1}\ne 0\) we deduce
If \(a_{j_1-1}=a_{j_1}\), we repeat the same argument as above, and we are sure to find an index \(j_w\) such that \(a_{j_w-1}-a_{j_w}\ne 0\), since we have supposed that \(a\ne t(1,1,\dots ,1)^T\). Therefore, (37) always holds, which concludes the proof. \(\square \)
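As a quick numerical illustration of Proposition 4 (a sketch, not part of the proof; it draws nonnegative vectors and checks the rearrangement inequality):

```python
import numpy as np

rng = np.random.default_rng(1)
a = np.sort(rng.random(6))[::-1]     # nonnegative and sorted: a = a^downarrow
b = rng.random(6)                    # nonnegative, arbitrary order
lhs = np.dot(a, b)                   # sum_i a_i * b_i
rhs = np.dot(a, np.sort(b)[::-1])    # sum_i a_i^downarrow * b_i^downarrow
assert lhs <= rhs + 1e-12            # Proposition 4; strict whenever b is not in D(a)
```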
Proposition 5
[38] \(g:{\mathbb {R}}^N\rightarrow {\mathbb {R}}\) defined as \( g(x)=\frac{1}{2}\sum _{i=1}^k x_i^{\downarrow 2}\) is convex. Furthermore, note that \(g(|x|)=g(x)\).
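In code, g can be evaluated directly from this definition (a sketch; since only squared magnitudes enter, the identity \(g(|x|)=g(x)\) holds by construction):

```python
import numpy as np

def g(x, k):
    # g(x) = 0.5 * (sum of the k largest squared magnitudes of x)
    return 0.5 * np.sum(np.sort(np.abs(x))[::-1][:k] ** 2)
```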
Lemma 4
Let \(f_1:{\mathbb {R}}^N\times {\mathbb {R}}^N \rightarrow {\mathbb {R}}\) be defined as
Let us consider the concave problem
Problem (38) has the following optimal arguments
where \({\hat{z}}\) is defined as
We can remark that \({\hat{z}}={\hat{z}}^\downarrow \), and \(T_k(x)\) is defined in Proposition 2. The value of the supremum problem is
Proof
Problem (38) can be written as:
We remark that finding the supremum for \(z^{\downarrow }_i\), \(i>k\), reduces to finding the supremum of the following term, knowing that \(z^{\downarrow }_i\) is upper bounded by \(z^{\downarrow }_{i-1}\):
Fix \(z^{\downarrow }_k\). The sum in (43) is nonnegative and increasing with respect to \(z^{\downarrow }_j\), and the supremum is attained when \(z^{\downarrow }_j\) reaches its upper bound, i.e., \(z^{\downarrow }_j=z^{\downarrow }_{j-1}\) for all \(j>k\) such that \(|x^{\downarrow }_j|\ne 0\). By induction, \(z^{\downarrow }_j=z^{\downarrow }_{k}\) for all \(j>k\) such that \(|x^{\downarrow }_j|\ne 0\). If there exists \(j>k\) with \(|x^{\downarrow }_j|=0\), then \(z^{\downarrow }_j\) is multiplied by zero and can take any value between its lower and upper bounds, i.e., between 0 and \(z^{\downarrow }_k\). Then, obviously, the supremum argument for (43) is
Further, from (42), we observe that for \(i<k\), the optimal argument is
By recursion, we can write this as
It remains to find the value of \(z^{\downarrow }_k\).
Inserting (44) and (46) into (42), we obtain:
To treat the term \(\max (|x^{\downarrow }_i|,z^{\downarrow }_k)\), we introduce \(j^*(k)= \sup _j \{j: z^{\downarrow }_k\le |x^{\downarrow }_j|\}\) , i.e., \(j^*(k)\) is the largest index such that \(|x^{\downarrow }_{j^*(k)}|\ge z^{\downarrow }_k\), and we define \(x^{\downarrow }_0=+\infty \). Therefore, (47) is rewritten as:
(48) is a concave problem, and the optimality condition yields
We define \(\sum _{i=j^*(k)+1}^k 1 =S\). Then, \(j^*(k)=k-S\) and
Furthermore, \(j^*(k)=k-S\) is the largest index such that \(z^{\downarrow }_k\le |x^{\downarrow }_{j}|\); hence \(|x^{\downarrow }_{k-S}|\ge z^{\downarrow }_k> |x^{\downarrow }_{k-S+1}|\). This translates to
which implies \(S=T_k(x)\) (see Proposition 2). Note that if \(j^*(k)=k\) (which is the same as saying \(T_k(x)=1\)), then the right part of the above inequality is not strict.
Now, assume \(|x^{\downarrow }_{j^*(k)}|= z^{\downarrow }_k\). Then, the max function can take either \(z^{\downarrow }_k\) or \(|x^{\downarrow }_{j^*(k)}|\). If it is the latter, then the expression above is correct. In the former case, \(\max (|x^{\downarrow }_{j^*(k)}|,z^{\downarrow }_k)=z^{\downarrow }_k\). We obtain
Furthermore, we use the fact that \(|x^{\downarrow }_{j^*(k)}|= z^{\downarrow }_k\) and \(j^*(k)=k-T_k(x)\), and develop (51) as:
The unique value of \(z^{\downarrow }_k\) is given by (50). \(\square \)
Lemma 5
Let \(x\in {\mathbb {R}}^N\) and let \(f_2:{\mathbb {R}}^N\times {\mathbb {R}}^N \rightarrow {\mathbb {R}}\) be defined as
The following concave supremum problem
is equivalent to
The optimal arguments are such that \({\hat{y}}_i^\downarrow ={{\,\mathrm{sign}\,}}^*(x_i^{\downarrow }){\hat{z}}_i^\downarrow \).
Proof
Let \({\hat{z}}\in {\mathbb {R}}^N_{\ge 0}\) be the argument of the supremum in (57), let \({\hat{y}}\) be such that \({\hat{y}}_i={{\,\mathrm{sign}\,}}^*(x_i){\hat{z}}_i\), and note that \(f_2(y,x)=-g(y)+\langle y,x\rangle \) with g defined as in Proposition 5 in Appendix A.1. First, \(f_2(y,x)\) is concave in y (see Proposition 5). Furthermore, \(-f_2(y,x)\) is coercive in y; thus, a supremum exists. Further, note that \(g({\hat{y}})=g(|{\hat{y}}|)=g({\hat{z}})\). Then, the following sequence of equalities/inequalities completes the proof:
\(\square \)
A.2 Proof of Lemma 1
Proof
Note that a similar problem has been studied in [1]. The authors, however, work with low-rank approximation and therefore do not face the question of how to permute x, since they work with matrices. First, let \({\mathcal {D}}(x)\) be as defined in Definition 4.
We are interested in
and its arguments, with \(f_2\) defined in Lemma 5. From this lemma, we know that we can instead study
Furthermore, from Lemma 4, we know the expression of \(\sup _{z\in {\mathbb {R}}^N_{\ge 0}}f_1(z,|x|)\) and its arguments. We want to show that \(\sup _{z\in {\mathbb {R}}^N_{\ge 0}} f_2(z,|x|)=\sup _{z\in {\mathbb {R}}^N_{\ge 0}} f_1(z,|x|)\), and to find a connection between the arguments of \(f_2\) and \(f_1\).
First, note that
From [34, Lemma 1.8] and Proposition 4, we have that \(\forall (y,x) \in {\mathbb {R}}_{\ge 0}^N\times {\mathbb {R}}_{\ge 0}^N\):
and the inequality is strict if \(y\notin {\mathcal {D}}(x)\), and thus
Since \({\mathcal {D}}(|x|)={\mathcal {D}}(x)\), we have, \(\forall z \in {\mathcal {D}}(x)\), \(f_2(z,|x|)=f_1(z,|x|)\) and:
Using inequalities (58) and (59) and connecting them to (60), we obtain
The supremum of \(f_2(\cdot ,|x|)\) is thus bounded above and below by the same value; hence, we have
The \(\sup _{z\in {\mathbb {R}}^N_{\ge 0}} f_1(z,|x|)\) is known from Lemma 4:
with the optimal arguments:
where \({\hat{z}}\) is such that:
Now we are interested in the optimal arguments of \(f_2\). Let \(P^{(x)}\) be such that \(P^{(x)}x=x^{\downarrow }\). We define \(z^*=P^{(x)^{-1}} {\hat{z}}\). Evidently, \(P^{(x)}z^*={\hat{z}}\), and since \({\hat{z}}\) is sorted by its absolute value, \(P^{(x)}z^*=z^{* \downarrow }\), and thus, \(z^*\in {\mathcal {D}}(x)\). Furthermore, from Lemma 4, \(z^*\) is an optimal argument of \(f_1\).
We then have \(f_2(z^*,|x|)=f_1(z^*,|x|)=\sup _{z\in {\mathbb {R}}^N_{\ge 0}} f_1(z,|x|)\). Therefore, \(z^*\) is an optimal argument of \(f_2\), since (61) shows the equality between the supremum values of \(f_1\) and \(f_2\).
We have shown that there exists \({\hat{z}}\in \mathop {{\mathrm{arg \, sup}}}\limits _{z\in {\mathbb {R}}^N_{\ge 0}} f_1(z,|x|)\) from which we can construct \(z^*\in {\mathcal {D}}(x)\), an optimal argument of \(f_2\). Now, by contradiction, we show that all optimal arguments of \(f_2\) are in \({\mathcal {D}}(x)\). Assume \({\hat{z}} \in \mathop {{\mathrm{arg \, sup}}}\limits _{z\in {\mathbb {R}}^N_{\ge 0}} f_2(z,|x|)\) and \({\hat{z}}\notin {\mathcal {D}}(x)\). We can construct \(z^*\) such that \(z^{* \downarrow }={\hat{z}}^{\downarrow }\) and \(z^*\in {\mathcal {D}}(x)\). We then have
The last equality is due to \(z^*\in {\mathcal {D}}(x)\), and the last inequality is from Proposition 4. Thus, \({\hat{z}}\) is not an optimal argument for \(f_2\), and all optimal arguments of \(f_2\) must be in \({\mathcal {D}}(x)\).
It therefore suffices to study \(\sup _{z\in {\mathbb {R}}^N_{\ge 0}\cap {\mathcal {D}}(x)}f_2(z,|x|)\), and from (60), we can instead study \(f_1\) and construct all supremum arguments of \(f_2\) from those of \(f_1\):
where \({\hat{z}}\) is defined in (64). \(\square \)
A.3 Calculation of the Proximal Operator of \(\zeta (x)\)
As preliminary results, we state and prove the following two lemmas (Lemmas 6 and 7).
Lemma 6
Let \(j:{\mathbb {R}}\rightarrow {\mathbb {R}}\) be a strictly convex and coercive function, let \(w=\mathop {{\mathrm{arg\, min}}}\limits _t j(t)\), and let us suppose that j is symmetric with respect to its minimum, i.e., \(j(w-t)=j(w+t)\, \forall t \in {\mathbb {R}}\). The problem
with a and b positive, has the following solution:
Proof
Since j is symmetric with respect to its minimum, \(j(w+t_1)\le j(w+t_2)\) \(\forall |t_1|\le |t_2|\). Assume that \(0<w\le b\). We can write \(j(b)=j(w+\alpha )\) with \(\alpha > 0\) and \(j(-b)=j(w+\beta )\) with \(\beta <0\). Since \(w>0\), we have \(|\alpha |<|\beta |\), and thus, the minimum is reached at \(t=b\) on the interval [b, a]. Similar reasoning proves the other cases. \(\square \)
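The proof suggests that the feasible set is \(\{t: b\le |t|\le a\}\) with \(b\le a\); under that assumption (ours, since the displayed problem was not reproduced here), the solution reduces to a signed clamp, sketched below:

```python
import numpy as np

def lemma6_argmin(w, a, b):
    # Minimizer of a strictly convex j, symmetric about its minimizer w,
    # over the set {t : b <= |t| <= a} (our reading of the elided constraint).
    # Symmetry reduces the problem to clamping |w| into [b, a] while
    # keeping the sign of w (with sign taken as +1 when w == 0).
    s = 1.0 if w >= 0 else -1.0
    return s * np.clip(abs(w), b, a)
```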
Lemma 7
Let \(g_i:{\mathbb {R}}\rightarrow {\mathbb {R}}\), \(i\in \{1,\dots ,N\}\), be strictly convex and coercive. Let \(w=(w_1,w_2,\dots ,w_N)^T=\mathop {{\mathrm{arg\, min}}}\limits _{t} \sum _i g_i(t_i)\), i.e., \(w_i=\mathop {{\mathrm{arg\, min}}}\limits _{t_i}g_i(t_i)\). Assume that \(|w_1|\ge |w_2|\ge \dots \ge |w_k|\) and \(|w_{k+1}|\ge |w_{k+2}|\ge \dots \ge |w_N|\). Let each \(g_i\) be symmetric with respect to its minimum. Consider the following problem:
The optimal solution is
where \(\tau \in {\mathbb {R}}\) is in \([\min (|w_k|,|w_{k+1}|),\max (|w_k|,|w_{k+1}|)]\) and is the value that minimizes \(\sum g_i(t_i(\tau ))\).
Proof
Note that this proof is inspired by [20, Theorem 2], with some modifications. First, if \(|w_k|\ge |w_{k+1}|\), then w satisfies the constraints in Problem (66), and thus, w is the optimal solution. If \(|w_k|<|w_{k+1}|\), more work is needed. In both cases, since each \(g_i\) is convex and symmetric with respect to its minimum, we can apply Lemma 6 to each \(t_i\), and the candidates can be limited to the following:
This can be rewritten in a shorter form, first in the case \(i\le k\):
This can be proved by induction. In the case \(i=1\), \(w_1\) is the optimal argument if \(|w_1|\ge |t_2|\); otherwise, \({{\,\mathrm{sign}\,}}^*(w_1)|t_2|\) is optimal. Therefore, \(t_1={{\,\mathrm{sign}\,}}^*(w_1)\max (|w_1|,|t_{2}|)\). Assume that this is true for the ith index.
But \(t_{i}={{\,\mathrm{sign}\,}}^*(w_i)\max (|w_{i}|,|t_{i+1}|)\), which yields \(|t_{i}|\ge |w_{i}|\ge |w_{i+1}|\) and thus, the third case of (70) can be ignored.
Now assume for an \(i\le k\) that \(t_i\ne w_i\). This implies that
Since \(|w_i|\) is non-increasing for \(i\le k\), the inequality \(|t_{i+1}|>|w_{i+1}|\) holds. Furthermore, \(|t_{i+1}|= \max (|w_{i+1}|,|t_{i+2}|) = |t_{i+2}|\). By induction, we have
To simplify the notation, write \(|t_k|=\tau \). The lemma is proved by inserting \(\tau \) in place of \(|t_{i+1}|\) and \(|t_k|\) in Eq. (69).
When \(i> k\), a similar induction gives:
and by adopting the notation \(\tau \), we finish the proof. \(\square \)
Remark 4
Note that if w, defined in Lemma 7, is such that \(|w_k|\ge |w_{k+1}|\), then w is the solution of (66).
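The construction of Lemma 7 translates into a short numerical routine. The sketch below assumes \({{\,\mathrm{sign}\,}}^*(0)=1\), that w is ordered as in the lemma, and that the constraint in (66) couples the two blocks through \(|t_k|\ge |t_{k+1}|\); a generic bounded scalar solver stands in for a closed-form choice of \(\tau \):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def sign_star(v):
    # sign with the convention sign*(0) = 1 (our assumption for the paper's sign*)
    return np.where(v >= 0, 1.0, -1.0)

def lemma7_argmin(g_list, w, k):
    # w: unconstrained minimizers, ordered as in Lemma 7,
    # i.e., |w_1| >= ... >= |w_k| and |w_{k+1}| >= ... >= |w_N|.
    w = np.asarray(w, dtype=float)
    if abs(w[k - 1]) >= abs(w[k]):       # Remark 4: w is already feasible
        return w.copy()
    lo, hi = abs(w[k - 1]), abs(w[k])    # tau lies in [|w_k|, |w_{k+1}|] here

    def t_of(tau):
        t = np.empty_like(w)
        t[:k] = sign_star(w[:k]) * np.maximum(np.abs(w[:k]), tau)  # raise first block to tau
        t[k:] = sign_star(w[k:]) * np.minimum(np.abs(w[k:]), tau)  # cap the tail at tau
        return t

    obj = lambda tau: sum(gi(ti) for gi, ti in zip(g_list, t_of(tau)))
    tau = minimize_scalar(obj, bounds=(lo, hi), method="bounded").x
    return t_of(tau)
```

For example, taking \(g_i(t)=(t-w_i)^2\) gives strictly convex, coercive functions that are symmetric about their minimizers \(w_i\), as required by the lemma.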
Lemma 8
Let \(y\in {\mathbb {R}}^N\). Define \(\zeta : {\mathbb {R}}^N\rightarrow {\mathbb {R}}\) as \(\zeta (x){:}{=}-(\frac{\rho -1}{\rho })\sum _{i=k+1}^N(x_i)^{\downarrow 2}\). The proximal operator of \(\zeta \) is such that
If \(|y^{\downarrow }_k|<\rho |y^{\downarrow }_{k+1}|\), then \(\tau \) is a value in the interval \([|y^{\downarrow }_k|,\rho |y^{\downarrow }_{k+1}|]\), and is defined as
where \(n_1\) and \(n_2\) are two groups of indices such that \(\forall \, i \in n_1,\, y^{\downarrow }_i<\tau \) and \(\forall \, i \in n_2,\, \tau \le \rho |y^{\downarrow }_i|\), and where \(\#n_1\) and \(\#n_2\) denote the sizes of \(n_1\) and \(n_2\). To go from \({\text {prox}_{\zeta (\cdot )}}(y)^{\downarrow y}\) to \(\text {prox}_{\zeta (\cdot )}(y)\), we apply the inverse of the permutation that sorts y into \(y^{\downarrow }\).
Note that we search
We define two functions, \(l_1: {\mathbb {R}}^{N}\times {\mathbb {R}}^{N}\rightarrow {\mathbb {R}} \) and \(l_2: {\mathbb {R}}^{N}\times {\mathbb {R}}^{N}\rightarrow {\mathbb {R}}\).
As in Lemma 1, we can establish relations between \(l_1\) and \(l_2\), and \(l_2\) can be solved using Lemma 7. We omit the proof, as it is similar to that of Lemma 1.
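As a sanity check on the closed form above, one can compare it against a brute-force evaluation of the proximal operator computed directly from the definition of \(\zeta \). The sketch below assumes the standard prox objective \(\frac{1}{2}\Vert x-y\Vert ^2+\zeta (x)\) and \(1<\rho <2\) so that this objective is coercive (this reading is ours); it uses multi-start Nelder–Mead, since the objective is nonsmooth and nonconvex:

```python
import numpy as np
from scipy.optimize import minimize

def zeta(x, k, rho):
    # zeta(x) = -((rho - 1)/rho) * sum_{i=k+1}^{N} (x_i^downarrow)^2
    tail = np.sort(np.abs(x))[::-1][k:]
    return -(rho - 1.0) / rho * np.sum(tail ** 2)

def prox_zeta_bruteforce(y, k, rho, n_starts=20, seed=0):
    # Approximates argmin_x 0.5*||x - y||^2 + zeta(x) by multi-start local search
    rng = np.random.default_rng(seed)
    obj = lambda x: 0.5 * np.sum((x - y) ** 2) + zeta(x, k, rho)
    best = None
    for _ in range(n_starts):
        x0 = y + 0.2 * rng.standard_normal(y.shape)
        res = minimize(obj, x0, method="Nelder-Mead")
        if best is None or res.fun < best.fun:
            best = res
    return best.x
```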
A.4 The Algorithm