Deep Partial Semi-Supervised Learning Method Based on Consistency Regularization

ZHU Biao; LI Yan; WANG Shuo

doi:10.13718/j.cnki.xdzk.2024.05.003

2024 Volume 46 Issue 5


ZHU Biao, LI Yan, WANG Shuo. Deep Partial Semi-Supervised Learning Method Based on Consistency Regularization[J]. Journal of Southwest University Natural Science Edition, 2024, 46(5): 27-39. doi: 10.13718/j.cnki.xdzk.2024.05.003


1. College of Mathematics and Information Science, Hebei University, Baoding Hebei 071002, China
2. School of Applied Mathematics, Beijing Normal University at Zhuhai, Zhuhai Guangdong 519000, China


Corresponding author: LI Yan
Received Date: 28/05/2023
Available Online: 20/05/2024
MSC: TP18

Abstract

Most partial label learning methods assume that every training sample has a set of candidate labels, but many real applications still involve a large amount of unlabeled data. How to build a learning model from the information contained in both partially labeled and unlabeled samples is the central problem of partial semi-supervised learning. For image classification with only a small number of labeled and partially labeled samples and a large number of unlabeled samples, this paper combines consistency regularization with pseudo-labeling to build the learning model. For partially labeled and unlabeled samples, pseudo-labels are generated from the output distributions of their weak augmentations, and those of partially labeled samples are constrained to the candidate label sets. A new loss function with three terms is designed, which simultaneously uses the supervised, weakly supervised, and unsupervised information contained in the data. Pseudo-labeled samples with high confidence are selected to compute the cross-entropy loss between their two kinds of augmentations, which improves the reliability of the samples involved in training. Experimental results show that the proposed method (CR-PSSL) achieves higher accuracy and stability than the existing state-of-the-art semi-supervised learning method (FlexMatch) and representative partial label learning methods, and that its convergence speed is also significantly improved.
- partial label learning,
- semi-supervised learning,
- consistency regularization,
- pseudo labeling method,
- image classification,
- deep learning

References

[1]	COUR T, SAPP B, JORDAN C, et al. Learning from Ambiguously Labeled Images[C] //2009 IEEE Conference on Computer Vision and Pattern Recognition, USA, IEEE, 2009: 919-926.
[2]	CHEN C H, PATEL V M, CHELLAPPA R, et al. Learning from Ambiguously Labeled Face Images[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(7): 1653-1667. doi: 10.1109/TPAMI.2017.2723401
[3]	ZENG Z N, XIAO S J, JIA K, et al. Learning by Associating Ambiguously Labeled Images[J]. Computer Vision and Pattern Recognition, 2013: 708-715.
[4]	LUO J, FRANCESCO O. Learning from Candidate Labeling Sets[C] //Neural Information Processing Systems, 2010: 1504-1512.
[5]	REN X, HE W Q, QU M, et al. AFET: Automatic Fine-Grained Entity Typing by Hierarchical Partial-Label Embedding[J]. Empirical Methods in Natural Language Processing, 2016, 16, 1369-1378.
[6]	XIANG R, HE W, MENG Q, et al. Label Noise Reduction in Entity Typing by Heterogeneous Partial-Label Embedding[J]. Computing Research Repository, 2016: 1825-1834.
[7]	SUN K W, MIN Z J, WANG J. PP-PLL: Probability Propagation for Partial Label Learning[C] //European Conference on Principles of Data Mining and Knowledge Discovery, 2019: 123-137.
[8]	YU F, ZHANG M L. Maximum Margin Partial Label Learning[J]. Asian Conference on Machine Learning, 2017, 106(4): 573-593. doi: 10.1007/s10994-016-5606-4
[9]	NGUYEN N, CARUANA R. Classification with Partial Labels[C] //Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Las Vegas, Nevada, USA, 2008: 551-559.
[10]	ZHANG M L, YU F, TANG C Z. Disambiguation-Free Partial Label Learning[C] //IEEE Transactions on Knowledge and Data Engineering. IEEE, 2017: 2155-2167.
[11]	WANG H B, XIAO R X, LI Y X, et al. PiCO: Contrastive Label Disambiguation for Partial Label Learning[C] //International Conference on Learning Representations, 2022.
[12]	WEN H W, CUI J Y, HANG H Y, et al. Leveraged Weighted Loss for Partial Label Learning[C] //International Conference on Machine Learning, 2021, 139: 11091-11100.
[13]	LV J Q, XU M, FENG L, et al. Progressive Identification of True Labels for Partial-Label Learning[C] //Proceedings of the 37th International Conference on Machine Learning. ACM, 2020: 6500-6510.
[14]	FENG L, LYU J Q, HAN B, et al. Provably Consistent Partial-Label Learning[EB/OL]. (2020-10-23)[2023-04-20]. https://arxiv.org/pdf/2007.08929.pdf.
[15]	WU D D, WANG D B, ZHANG M L. Revisiting Consistency Regularization for Deep Partial Label Learning[C] //International Conference on Machine Learning, 2022: 24212-24225.
[16]	WANG Q W, LI Y F, ZHOU Z H. Partial Label Learning with Unlabeled Data[C] //International Joint Conference on Artificial Intelligence, 2019: 3755-3761.
[17]	LI Y, LIU C, ZHAO S Y, et al. Active Partial Label Learning Based on Adaptive Sample Selection[J]. International Journal of Machine Learning and Cybernetics, 2022, 13(6): 1603-1617. doi: 10.1007/s13042-021-01470-x
[18]	KIHYUK S, DAVID B, NICHOLAS C, et al. FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence[C] //Neural Information Processing Systems, 2020: 596-608.
[19]	ZHANG B W, WANG Y D, HOU W X, et al. FlexMatch: Boosting Semi-Supervised Learning with Curriculum Pseudo Labeling[C] //Neural Information Processing Systems, 2021: 18408-18419.
[20]	LEE D H. Pseudo-Label: The Simple and Efficient Semi-Supervised Learning Method for Deep Neural Networks[C] //Workshop on Challenges in Representation Learning, ICML, 2013, 3(2): 896.
[21]	MIYATO T, MAEDA S I, KOYAMA M, et al. Virtual Adversarial Training: A Regularization Method for Supervised and Semi-Supervised Learning[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, 41(8): 1979-1993. doi: 10.1109/TPAMI.2018.2858821
[22]	EKIN D C, BARRET Z, JONATHON S, et al. Randaugment: Practical Automated Data Augmentation with a Reduced Search Space[C] //Computer Vision and Pattern Recognition, 2020: 3008-3017.
[23]	EKIN D C, BARRET Z, DANDELION M, et al. AutoAugment: Learning Augmentation Strategies From Data[C] //Computer Vision and Pattern Recognition, 2019: 113-123.
[24]	LECUN Y, BOTTOU L, BENGIO Y, et al. Gradient-Based Learning Applied to Document Recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324. doi: 10.1109/5.726791
[25]	XIAO H, RASUL K, VOLLGRAF R. Fashion-MNIST: A Novel Image Dataset for Benchmarking Machine Learning Algorithms[J]. arXiv e-prints, 2017: 07747.
[26]	NETZER Y, WANG T, COATES A, et al. Reading Digits in Natural Images with Unsupervised Feature Learning[J]. NIPS Workshop on Deep Learning and Unsupervised Feature Learning, 2011: 067128.
[27]	KRIZHEVSKY A, HINTON G. Learning Multiple Layers of Features from Tiny Images[J]. Handbook of Systemic Autoimmune Diseases, 2009, 1(4): 18268744.


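Supervised learning methods build predictive models from large numbers of labeled training samples and have achieved considerable success in many fields. However, because data annotation is often very costly, it is difficult in many tasks to obtain strong supervision in the form of fully and correctly labeled data, so the annotation may be incomplete, inexact, or inaccurate; such learning tasks are called weakly supervised learning. Partial label learning^[1-2] is one type of weakly supervised learning and falls under inexact supervision. In a partial label learning task, each training sample is associated with a set of candidate labels, only one of which is the true label. Partial label problems arise in many real-world scenarios, such as image classification^[3], web mining^[4], and natural language processing^[5-7].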

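Many partial label learning strategies exist, and they fall broadly into three types: averaging-based disambiguation^[8-9], which treats all candidate labels equally during training; identification-based disambiguation, which treats the true label in the candidate label set as a latent variable^[10]; and disambiguation based on the manifold assumption, which holds that similar samples should have similar model outputs and uses this to disambiguate partially labeled data during training^[11]. Partial label learning has continued to develop in recent years, and some methods can learn not only from relational datasets with hand-crafted features but also from image datasets, such as LWS^[12], PRODEN^[13], CC^[14], and a deep partial label learning method that incorporates consistency regularization^[15]. Most of these methods, however, assume that every sample carries weak supervision in the form of a candidate label set, whereas in many practical problems obtaining partial labels for all samples is still costly and unlabeled data are comparatively easy to obtain. The setting in which some samples carry partial labels and most are unlabeled is called partial semi-supervised learning, and it has received little study so far. In 2019, Wang et al.^[16] proposed the PARM model, which handles the partial semi-supervised problem by using the model classifier to update a label confidence matrix for the unlabeled data. In 2022, Li et al.^[17] proposed active partial label learning, which uses unlabeled and partially labeled data together from an active learning perspective and uses the weak supervision of partial labels to build a strategy for selecting representative unlabeled samples. Most of this work on partial semi-supervised learning, however, targets datasets with hand-crafted features and cannot be applied to image data. Many semi-supervised learning methods for image data exist^[18-19], and deep semi-supervised models have even achieved results comparable to fully supervised learning. But the labeled samples in conventional semi-supervised methods carry exact labels, so these methods cannot yet process or exploit partial label information and do not perform well in the partial semi-supervised setting. Combining the two weakly supervised frameworks of partial label learning and semi-supervised learning, so as to learn effectively from a small number of partially labeled samples and a large number of unlabeled samples, is therefore of considerable significance and value for further reducing annotation cost and extending the range of applications of weakly supervised learning.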
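
To make the difference between the first two disambiguation strategies concrete, the following minimal Python sketch contrasts an averaging-style target (equal weight on every candidate label) with an identification-style choice (the candidate the model currently rates highest). The function names, the candidate set, and the probability values are hypothetical and chosen only for illustration; they are not taken from the cited methods.

```python
def averaging_target(candidates, num_classes):
    """Averaging-style view: spread equal weight over every candidate label."""
    weight = 1.0 / len(candidates)
    return [weight if k in candidates else 0.0 for k in range(num_classes)]


def identified_label(probs, candidates):
    """Identification-style view: pick the candidate with the highest model probability."""
    # sorted() makes tie-breaking deterministic (lowest class index wins).
    return max(sorted(candidates), key=lambda k: probs[k])


candidates = {0, 2}                # hypothetical candidate label set
probs = [0.30, 0.45, 0.20, 0.05]   # hypothetical model output over 4 classes

print(averaging_target(candidates, 4))
print(identified_label(probs, candidates))
```

In this toy case class 1 has the highest overall probability, but it lies outside the candidate set, so the identification-style choice falls on class 0.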

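Based on an objective function with three loss terms, this study combines consistency regularization with pseudo-labeling to propose a partial semi-supervised learning algorithm for image data. During learning, strong and weak augmentations are first applied to the partially labeled and unlabeled data; the pseudo-label of a partially labeled sample is generated from its weak augmentation and is restricted to the corresponding candidate label set. Consistency regularization holds that different augmentations of the same sample should yield similar model outputs; this study uses samples with high-confidence pseudo-labels to compute the cross-entropy loss between the outputs of the two augmentations, which improves the reliability of the samples involved in training. Experimental results show that the proposed method achieves higher accuracy and stability than existing semi-supervised learning methods for image data and related partial label learning methods, and that its convergence speed is also somewhat improved.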
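
The candidate-constrained pseudo-labeling step described above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the paper's implementation: the text does not say how confidence is computed for partially labeled samples, so renormalizing the weak-augmentation probabilities over the candidate set is an assumption, as are the function names and the threshold value of 0.95.

```python
import math


def constrained_pseudo_label(weak_probs, candidates=None, threshold=0.95):
    """Return (label, confidence, keep) from a weak-augmentation output.

    candidates=None stands for an unlabeled sample (all classes allowed).
    Assumption: confidence is the probability renormalized over allowed classes.
    """
    allowed = sorted(candidates) if candidates is not None else list(range(len(weak_probs)))
    mass = sum(weak_probs[k] for k in allowed)
    label = max(allowed, key=lambda k: weak_probs[k])
    confidence = weak_probs[label] / mass if mass > 0 else 0.0
    return label, confidence, confidence >= threshold


def consistency_loss(strong_probs, label, keep):
    """Cross-entropy of the strong-augmentation output against the pseudo-label."""
    # Samples below the confidence threshold contribute nothing.
    return -math.log(max(strong_probs[label], 1e-12)) if keep else 0.0


weak = [0.50, 0.02, 0.46, 0.02]     # hypothetical weak-augmentation output
strong = [0.10, 0.10, 0.70, 0.10]   # hypothetical strong-augmentation output

# Partially labeled sample: the pseudo-label must come from the candidate set {1, 2}.
label, conf, keep = constrained_pseudo_label(weak, candidates={1, 2})
print(label, round(conf, 3), keep, consistency_loss(strong, label, keep))

# Same output treated as unlabeled: class 0 wins but is not confident enough to be kept.
print(constrained_pseudo_label(weak))
```

With these toy numbers the candidate constraint changes the outcome: the partially labeled sample yields pseudo-label 2 and passes the threshold, while the same output treated as unlabeled is discarded.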

1. Related Work

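Much recent work combines partial label learning with deep learning, and deep partial label learning has become a trend. Among these methods, LWS^[12] is a deep partial label learning algorithm that can handle image data; it constructs its loss function through risk consistency for model training. Risk consistency implies that the classifier is consistent, that is, the optimal classifier obtained by partial label learning is the same as the optimal classifier obtained by fully supervised learning. PRODEN^[13] and CC^[14] are likewise recently proposed deep partial label learning algorithms that can handle image data.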

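Semi-supervised learning for image classification has been studied extensively in recent years. Lee et al.^[20] first used pseudo-labeling to assign pseudo-labels to unlabeled samples for training, and Miyato et al.^[21] subsequently proposed a consistency regularization method that achieved good results. Algorithms such as FixMatch^[18] and FlexMatch^[19] combine consistency regularization with pseudo-labeling: pseudo-labels are assigned to unlabeled samples, and the model is trained with consistency regularization based on those pseudo-labels, achieving classification performance close to that of full supervision.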
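
As a rough illustration of how pseudo-labeling and consistency regularization are combined on a batch of unlabeled samples, the sketch below computes a confidence-masked cross-entropy in the general style of FixMatch. It is a simplified sketch: the logits are hypothetical, the fixed threshold of 0.95 is an assumed value, and FlexMatch's class-wise adaptive thresholds are not modeled.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def unlabeled_consistency_loss(weak_logits, strong_logits, threshold=0.95):
    """Mean masked cross-entropy between weak-view pseudo-labels and strong-view outputs."""
    total = 0.0
    for w, s in zip(weak_logits, strong_logits):
        q = softmax(w)                                    # weak-augmentation prediction
        pseudo = max(range(len(q)), key=lambda k: q[k])   # hard pseudo-label
        if q[pseudo] >= threshold:                        # keep only confident samples
            p = softmax(s)                                # strong-augmentation prediction
            total += -math.log(max(p[pseudo], 1e-12))
    # Average over the whole batch, so masked-out samples count as zero loss.
    return total / len(weak_logits)


# Hypothetical batch of two unlabeled samples with three classes.
weak_batch = [[10.0, 0.0, 0.0], [1.0, 0.9, 0.8]]
strong_batch = [[2.0, 0.0, 0.0], [0.0, 0.0, 3.0]]
print(unlabeled_consistency_loss(weak_batch, strong_batch))
```

Only the first sample is confident enough under the weak augmentation to contribute; the second is masked out, which is how the threshold limits the influence of unreliable pseudo-labels.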

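In addition, active partial label learning is another method that handles the partial semi-supervised problem well^[17]. Its key issue is how to use weak supervision to build an effective sample selection mechanism that picks the most informative and representative unlabeled samples for manual annotation, and then trains the model on the manually annotated samples. This method, however, is not suitable when manual annotation is impossible or too costly.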

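The work above applies to partial label learning, semi-supervised learning, and active partial label learning with hand-crafted features, respectively, but the partial semi-supervised setting in image classification that this study addresses still requires further research and improvement.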

4. Conclusion

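For image classification problems with a very small number of labeled samples, a small number of partially labeled samples, and a large number of unlabeled samples, this study uses consistency regularization and pseudo-labeling to propose a new partial semi-supervised learning algorithm for image classification (CR-SSPL). CR-SSPL achieves higher classification accuracy than the other compared algorithms on the datasets under all 45 different settings, and it also improves model convergence speed. The main contributions of this study are: ① weakly supervised and unsupervised learning are combined, and a new objective function with three loss terms is designed; ② consistency regularization and pseudo-labeling are used to fully exploit the three kinds of supervision information in the samples, and a confidence threshold accounts for the reliability of the pseudo-labeled samples that take part in training; ③ CR-SSPL shows a significant advantage on fine-grained, large-scale classification problems. Future work will extend this study to the partial multi-label semi-supervised learning problem.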
