Lightweight Study of a Multi-Target Recognition Model for Tomato Leaf Diseases

HU Junjie; ZHANG Cong; TAO Zhangfa; LIANG Hongrui; ZHANG Ruite; WANG Zheng

doi:10.13718/j.cnki.xdzk.2026.05.018

2026 Volume 48 Issue 5


HU Junjie, ZHANG Cong, TAO Zhangfa, et al. Lightweight Study of a Multi-Target Recognition Model for Tomato Leaf Diseases[J]. Journal of Southwest University Natural Science Edition, 2026, 48(5): 222-234. doi: 10.13718/j.cnki.xdzk.2026.05.018


1. School of Electrical and Electronic Engineering, Wuhan Polytechnic University, Wuhan, Hubei 430048, China
2. School of Mathematics and Computer Science, Wuhan Polytechnic University, Wuhan, Hubei 430048, China
3. School of Computer Science, Wuhan University, Wuhan, Hubei 430072, China


Corresponding author: ZHANG Cong
Received Date: 2025-10-05
Available Online: 2026-05-20
CLC number: TP391.41

Abstract

To address the limited range of detection targets and the high computational complexity of existing tomato leaf disease recognition models, this paper proposes an improved target recognition model that integrates composite convolutions. In the backbone network, a Dil-FasterNet block is introduced by modifying the FasterNet block. This lightweight, multi-scale feature extraction module combines dilated convolution with depthwise separable convolution to improve the capture of local image detail while reducing computational overhead. In addition, Group-Shuffle Convolution (GSConv) replaces the original convolutional layers. In the neck, the Slim-Neck module is adopted, and the GS bottleneck and Cross Stage Partial Network (CSPNet) modules are used to control the model's time complexity while preserving the hidden connections between channels. FocalWise-IoU is designed as the loss function to strengthen the detector's predictions for medium-quality anchor boxes, reduce over-optimization of high-quality anchor boxes, and retain the information in low-quality anchor boxes. Experimental results show that, on the dataset, the improved model reduces FLOPs by 2.1 G and parameters by 1.1 M, while increasing mAP by 0.5 percentage points and reducing the weight file size by 27%.
Keywords:
- tomato leaf diseases
- GSConv
- non-monotonic dynamic focusing mechanism
- multi-target recognition
- depthwise separable convolution

References

[1] 汝刚, 刘慧, 沈桂龙. 用人工智能改造中国农业: 理论阐释与制度创新[J]. 经济学家, 2020(4): 110-118.
[2] LIU W, ANGUELOV D, ERHAN D, et al. SSD: Single Shot MultiBox Detector[C] //Computer Vision-ECCV 2016. Cham: Springer, 2016: 21-37.
[3] REDMON J, DIVVALA S, GIRSHICK R, et al. You Only Look Once: Unified, Real-Time Object Detection[C] //2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). New York: IEEE Press, 2016: 779-788.
[4] 马一鸣, 尹爽, 郭瑞, 等. 烟叶烘烤阶段不同YOLO算法模型的实时判别性能比较[J/OL]. 河南农业大学学报, (2025-11-04)[2025-12-21]. https://doi.org/10.16445/j.cnki.1000-2340.20251104.001.
[5] LI H L, LI J, WEI H B, et al. Slim-Neck by GSConv: A Lightweight-Design for Real-Time Detector Architectures[J]. Journal of Real-Time Image Processing, 2024, 21(3): 62. doi: 10.1007/s11554-024-01436-6
[6] TONG Z J, CHEN Y H, XU Z W, et al. Wise-IoU: Bounding Box Regression Loss with Dynamic Focusing Mechanism[EB/OL]. (2023-01-24)[2025-10-11]. https://arxiv.org/abs/2301.10051.
[7] ZHANG Y F, REN W Q, ZHANG Z, et al. Focal and Efficient IoU Loss for Accurate Bounding Box Regression[J]. Neurocomputing, 2022, 506: 146-157. doi: 10.1016/j.neucom.2022.07.042
[8] 刘洋, 宫志宏, 黎贞发, 等. 基于改进YOLOv5的番茄成熟度检测方法[J]. 中国农业气象, 2024, 45(12): 1521-1532.
[9] HE K M, ZHANG X Y, REN S Q, et al. Deep Residual Learning for Image Recognition[C] //2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). New York: IEEE Press, 2016: 770-778.
[10] TAN M X, LE Q V. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks[EB/OL]. (2019-05-28)[2025-10-11]. https://arxiv.org/abs/1905.11946.
[11] LIU S, QI L, QIN H F, et al. Path Aggregation Network for Instance Segmentation[C] //2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2018: 8759-8768.
[12] 刘伯红, 郝文瑞. 面向交通目标的多尺度轻量化检测模型[J]. 重庆邮电大学学报(自然科学版), 2025, 37(2): 185-195.
[13] REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149. doi: 10.1109/TPAMI.2016.2577031
[14] LI J F, WEN Y, HE L H. SCConv: Spatial and Channel Reconstruction Convolution for Feature Redundancy[C] //2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New York: IEEE Press, 2023: 6153-6162.
[15] YU F, KOLTUN V. Multi-Scale Context Aggregation by Dilated Convolutions[EB/OL]. (2015-12-23)[2025-10-11]. https://arxiv.org/abs/1511.07122.
[16] 袁泉, 杨清泉, 袁亚隆, 等. 改进YOLOv8的水下目标检测算法[J]. 重庆邮电大学学报(自然科学版), 2025, 37(5): 729-740.
[17] REZATOFIGHI H, TSOI N, GWAK J, et al. Generalized Intersection over Union: A Metric and a Loss for Bounding Box Regression[C] //2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New York: IEEE Press, 2019: 658-666.
[18] ZHENG Z H, WANG P, LIU W, et al. Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2020, 34(7): 12993-13000.
[19] 杨鹏. 离群检测及其优化算法研究[D]. 重庆: 重庆大学, 2010.
[20] 胡佳乐, 周敏, 申飞. 面向无人机小目标的RTDETR改进检测算法[J]. 计算机工程与应用, 2024, 60(20): 198-206.
[21] SAKIB S N, HAQUE N, HOSSAIN M Z, et al. PlantVillageVQA: A Visual Question Answering Dataset for Benchmarking Vision-Language Models in Plant Science[EB/OL]. (2025-08-23)[2025-10-11]. https://arxiv.org/abs/2508.17117.
[22] SELVARAJU R R, COGSWELL M, DAS A, et al. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization[J]. International Journal of Computer Vision, 2020, 128(2): 336-359. doi: 10.1007/s11263-019-01228-7

Figures(9) / Tables(7)

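In recent years, deep learning built around convolutional neural networks has achieved breakthroughs in computer vision. In disease feature extraction, classification, and target localization, it delivers accuracy and robustness far beyond traditional manual identification, offering a new technical approach to intelligent crop disease detection^[1]. With the rapid development of smart agriculture, using deep learning to detect tomato leaf diseases in real time can significantly improve tomato production efficiency.

At present, the mainstream deep-learning-based object recognition models, in China and abroad, fall into single-stage and two-stage detection algorithms. Single-stage detectors include SSD^[2] and YOLO^[3]; among them, YOLO models have attracted wide attention for their fast detection and high accuracy^[4]. However, in field scenarios of leaf disease detection, small targets commonly make up a large share of disease instances, backgrounds are complex, and existing recognition models struggle to balance detection accuracy against a lightweight design.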

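To address these problems, this study designs Dil-YOLO (YOLOv8n-DilatedConv-Slim-Neck), a tomato leaf disease recognition model, with improvements in three respects: ① A Dil-FasterNet module is designed for the YOLOv8 backbone to replace the original C2f module. It splits a standard convolution into a depthwise convolution (Depth Wise Convolution, DWConv), which extracts spatial features one channel at a time, and a pointwise convolution (Point Wise Convolution, PWConv), which fuses features across channels, greatly reducing the parameter count and floating-point operations of a single convolutional layer without disrupting the feature propagation logic. ② The efficient Slim-Neck module^[5] is introduced as the neck network; through the combined action of grouped convolution (Grouped Convolution, GConv) and channel shuffle, it lowers the parameter count and computational complexity of the convolutions while retaining feature interaction between channels. ③ A FocalWise-IoU loss function is designed, combining the dynamic focusing mechanism of Wise-IoU^[6] with the dynamic weight allocation mechanism of Focal-EIoU v1^[7], to strengthen the recognition of low-quality anchor boxes. Together, these improvements raise the accuracy and robustness of tomato leaf disease recognition.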
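The savings from the depthwise/pointwise split and the effect of channel shuffle can be illustrated with a small, framework-free sketch. It is not the authors' implementation: the function names are hypothetical, the layer sizes (64 input channels, 128 output channels, 3 x 3 kernel) are assumed purely for illustration, and bias terms are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a k x k standard convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k spatial filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes information across channels
    return depthwise + pointwise


def channel_shuffle(channels: list, groups: int) -> list:
    """Interleave channels across groups.

    Equivalent to viewing the channels as (groups, n // groups),
    transposing, and flattening again.
    """
    if len(channels) % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = len(channels) // groups
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


# Illustrative layer sizes (assumed, not taken from the paper).
std = standard_conv_params(64, 128, 3)
dws = depthwise_separable_params(64, 128, 3)
print(std, dws, round(dws / std, 4))                   # 73728 8768 0.1189
print(channel_shuffle([0, 1, 2, 3, 4, 5], groups=2))   # [0, 3, 1, 4, 2, 5]
```

For these assumed sizes the split keeps about 12% of the weights, matching the general ratio 1/c_out + 1/k². The shuffle output shows channels from the two groups interleaved, which is how a grouped convolution followed by a shuffle lets later layers see features from every group.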

3. Conclusion

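With the rapid development of smart agriculture, field environments place higher demands on object recognition models for high accuracy, low latency, and low computational cost. Current mainstream recognition models often produce recognition errors in field environments where targets are dense and vary widely in scale, and they struggle to keep model size in check at the same time. On this basis, this study proposes a YOLOv8-based tomato leaf disease recognition model. A Dil-FasterNet module is designed in the YOLOv8 backbone: dilated convolution captures multi-scale features without adding parameters, and depthwise separable convolution decomposes a standard convolution into a depthwise convolution and a pointwise convolution, substantially cutting parameters and computation. The Slim-Neck module is then introduced in the neck network, replacing standard convolution with GSConv and pairing it with the VoV-GSCSP bottleneck structure to optimize feature fusion, which greatly reduces parameters and computational overhead while strengthening the efficient interaction and representation of multi-scale features. Finally, a FocalWise-IoU loss function is designed so that the model focuses on low-IoU samples, resolving the gradient imbalance caused by anchor box quality; fine-grained constraints on the feature dimensions improve the model's ability to discriminate complex targets. Compared with the baseline model, Dil-YOLO substantially reduces computation and parameter count: Params fall by 1.1 M, floating-point operations fall by 2.1 G, and mAP rises by 0.5 percentage points.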
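The claim that dilated convolution widens the receptive field without adding parameters can be checked with a one-dimensional sketch. This is an illustration under assumptions rather than the Dil-FasterNet code: the helper names are hypothetical, the input and kernel values are made up, and the operation is the stride-1, unpadded cross-correlation that deep learning frameworks call convolution.

```python
def effective_kernel_size(k: int, dilation: int) -> int:
    """Input span covered by a k-tap kernel with the given dilation."""
    return k + (k - 1) * (dilation - 1)


def dilated_conv1d(x: list, w: list, dilation: int = 1) -> list:
    """Stride-1, unpadded 1-D cross-correlation with a dilated kernel.

    The kernel always has len(w) weights; dilation only spreads its taps
    further apart on the input.
    """
    span = effective_kernel_size(len(w), dilation)
    n_out = len(x) - span + 1
    if n_out <= 0:
        raise ValueError("input is shorter than the dilated kernel span")
    return [sum(w[j] * x[i + j * dilation] for j in range(len(w)))
            for i in range(n_out)]


kernel = [1, 0, -1]              # three weights regardless of dilation
signal = [1, 2, 3, 4, 5, 6, 7]   # made-up input values

print([effective_kernel_size(3, d) for d in (1, 2, 3)])   # [3, 5, 7]
print(dilated_conv1d(signal, kernel, dilation=1))         # [-2, -2, -2, -2, -2]
print(dilated_conv1d(signal, kernel, dilation=2))         # [-4, -4, -4]
```

With dilation 2 the same three weights compare samples four positions apart instead of two, so each output sees a wider context at no extra parameter cost. Stacking branches with different dilation rates is one common way to obtain multi-scale features.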

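Future work will further increase the number of disease categories and images in the dataset, strengthen data diversity, and compare the model against more state-of-the-art models. It will also add disease samples from complex field conditions, covering different regions, different cultivation modes (protected greenhouse and open-field cultivation), different crop growth stages, and extreme lighting, overcast or rainy weather, and severe leaf occlusion, in order to build a more complete dataset dedicated to tomato diseases.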

