A Review of Sparse Statistical Learning and Its Recent Research Progress

ZHANG Hongying; DONG Kezhen

doi:10.13718/j.cnki.xsxb.2023.04.001

2023 Volume 48 Issue 4

Citation:

ZHANG Hongying, DONG Kezhen. A Review of Sparse Statistical Learning and Its Recent Research Progress[J]. Journal of Southwest China Normal University (Natural Science Edition), 2023, 48(4): 1-12. doi: 10.13718/j.cnki.xsxb.2023.04.001

School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an 710049, China

Received Date: 17/06/2022
Available Online: 20/04/2023
MSC: TP181

Abstract

Sparsity means that a complex physical process in a high-dimensional space can be approximated by only a few parameters (characteristic variables) lying in a low-dimensional subspace; it is a prevalent property in practical applications. Sparse statistical learning aims to exploit the sparsity of high-dimensional data for statistical modeling and inference. This article reviews sparse statistical learning models, with a focus on regression analysis, and their recent research progress. It first introduces various sparse regression models with convex or non-convex regularization terms, in particular the algorithms and applications of the $L_{\frac{1}{2}}$-regularization framework. Over the last decade, deep learning has made revolutionary progress, and research combining traditional sparse statistical learning models with deep neural networks has gradually attracted widespread attention. The article then introduces deep learning methods based on sparse modeling, such as deep unfolding networks, and data-driven sparse statistical analysis methods, including deep hash learning and deep canonical correlation analysis. Finally, the article closes with a summary and an outlook on possible future research directions.

Keywords:

- sparsity
- regularization framework
- regularization terms
- $L_{\frac{1}{2}}$-regularization framework
- deep learning
- deep unfolding networks
