Study on the Influence of Image Data Set on Identification of Silkworm Cultivar

YANG Chuang; SHI Hongkang; CHEN Yu; MA Yan; BAI Juan; JIANG Meng

doi:10.13718/j.cnki.xdzk.2023.04.011

2023 Volume 45 Issue 4


YANG Chuang, SHI Hongkang, CHEN Yu, et al. Study on the Influence of Image Data Set on Identification of Silkworm Cultivar[J]. Journal of Southwest University Natural Science Edition, 2023, 45(4): 110-118. doi: 10.13718/j.cnki.xdzk.2023.04.011


1. College of Engineering and Technology, Southwest University, Chongqing 400715, China
2. Chongqing Yubei District Rural Cooperative Economic Development Service Center, Chongqing 401120, China
3. Chongqing Yubei District Cash Crop Technology Extension Station, Chongqing 401120, China


Corresponding author: JIANG Meng
Received Date: 29/04/2022
Available Online: 20/04/2023
MSC: S882.2

Abstract

Silkworm variety identification by machine vision involves a trade-off between recognition accuracy and dataset factors such as image count, variety count, data processing method, and cost. To examine this trade-off, this paper collects growth images of 20 silkworm cultivars on day 3 of the 4th instar in a real production environment, trains the lightweight convolutional neural network GhostNet on different training sets, and analyzes how image count, variety count, image data augmentation, and transfer learning affect recognition accuracy. The results show that a dataset built from at least 400 original images per variety and 10-12 selected varieties achieves a recognition accuracy above 98%. With fewer than 100 original images, image data augmentation has no practical value for improving accuracy. With fewer than 12 varieties, transfer learning effectively improves the recognition rate; with more than 14 varieties, transfer learning reduces accuracy, and the more varieties there are, the faster the decline.
Keywords: silkworm, species recognition, dataset

References

[1] 徐安英, 钱荷英, 孙平江, 等. 家蚕抗血液型脓病新品种华康3号的育成[J]. 蚕业科学, 2019, 45(2): 201-211. doi: 10.13441/j.cnki.cykx.2019.02.007
[2] 张友洪, 肖金树, 肖文福, 等. 春用多丝量家蚕品种金·兰×铭·晖的育成[J]. 蚕业科学, 2019, 45(1): 144-148.
[3] 陈惠蓉, 杨忠生, 李俊. 浅析桑蚕种质资源的保存与利用[J]. 四川蚕业, 2016, 44(2): 42-43. doi: 10.3969/j.issn.1006-1185.2016.02.018
[4] 肖阳, 李庆荣, 邢东旭, 等. 抗BmNPV家蚕新品种粤蚕11号的育成[J]. 广东农业科学, 2020, 47(8): 118-126.
[5] 何锐敏, 郑可锋, 张俊, 等. 工厂化养蚕精准饲喂信息系统的研究与开发[J]. 浙江农业科学, 2022, 63(2): 371-374, 380.
[6] 石洪康, 田涯涯, 杨创, 等. 基于卷积神经网络的家蚕幼虫品种智能识别研究[J]. 西南大学学报(自然科学版), 2020, 42(12): 34-45.
[7] 孙红, 李松, 李民赞, 等. 农业信息成像感知与深度学习应用研究进展[J]. 农业机械学报, 2020, 51(5): 1-17.
[8] 王鹏新, 田惠仁, 张悦, 等. 基于深度学习的作物长势监测和产量估测研究进展[J]. 农业机械学报, 2022, 53(2): 1-14.
[9] KAMILARIS A, PRENAFETA-BOLDÚ F X. Deep Learning in Agriculture: a Survey[J]. Computers and Electronics in Agriculture, 2018, 147: 70-90.
[10] 王超. 基于机器视觉的蚕茧图像识别研究[D]. 柳州: 广西科技大学, 2019.
[11] 于业达, 高鹏飞, 赵一舟, 等. 基于深度卷积神经网络的蚕蛹雌雄自动识别[J]. 蚕业科学, 2020, 46(2): 197-203.
[12] 陶丹. 基于机器视觉的家蚕蛹雌雄识别研究[D]. 重庆: 西南大学, 2019.
[13] 陶丹, 王峥荣, 李光林, 等. 基于解模糊算法的蚕蛹图像恢复及雌雄识别[J]. 农业工程学报, 2016, 32(16): 168-174.
[14] 石洪康, 肖文福, 黄亮, 等. 基于卷积神经网络的家蚕病害识别研究[J]. 中国农机化学报, 2022, 43(1): 150-157.
[15] 石洪康. 基于卷积神经网络的家蚕脓病检测研究与预警软件开发[D]. 重庆: 西南大学, 2021.
[16] 樊湘鹏, 周建平, 许燕, 等. 数据集对基于深度学习的作物病害识别有效性影响[J]. 中国农机化学报, 2021, 42(1): 192-200.
[17] BARBEDO J G A. Impact of Dataset Size and Variety on the Effectiveness of Deep Learning and Transfer Learning for Plant Disease Classification[J]. Computers and Electronics in Agriculture, 2018, 153: 46-53.
[18] HAN K, WANG YH, TIAN Q, et al. GhostNet: More Features from Cheap Operations[C] //2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, WA, USA. IEEE, 2020: 1577-1586.
[19] HOWARD A, SANDLER M, CHEN B, et al. Searching for MobileNetV3[C] //2019 IEEE/CVF International Conference on Computer Vision (ICCV). Seoul, Korea (South). IEEE, 2020: 1314-1324.
[20] HU J, SHEN L, ALBANIE S, et al. Squeeze-and-Excitation Networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(8): 2011-2023.


Figures(8) / Tables(2)

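Preserving silkworm variety resources and cross-breeding are core topics in sericultural research, and both require rearing multiple varieties side by side before preservation or breeding trials begin. Varietal purity is a prerequisite for accurate hybridization trials and resource preservation^[1-2]. Silkworm movement, shared rearing equipment, and management oversights can easily cause different varieties to become mixed, which harms both hybridization and resource preservation^[3-5]. In addition, there are many silkworm varieties and the differences between them are small, so traditional manual identification is prone to confusion. An earlier study showed that identifying silkworm varieties with deep learning is highly feasible^[6]. Deep learning needs a large number of images for model training before it can recognize varieties, but building a large silkworm variety dataset is time-consuming, costly, and constrained by collection conditions. It is therefore necessary to clarify how the dataset affects silkworm variety identification.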

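In recent years, deep learning has been widely applied in agricultural vision and has become a research hotspot and mainstream trend^[7-9]. In silkworm recognition, our group previously used MobileNet to identify growth images of 10 silkworm varieties taken on day 3 of the 4th instar and day 3 of the 5th instar; the results showed that deep learning can identify silkworm varieties efficiently and accurately, with the highest accuracy on the 4th-instar dataset. WANG Chao^[10] used an SE-GoogLeNet model to sort cocoons by quality and achieved good results on 3 classes of cocoons. YU Yeda et al.^[11] and TAO Dan et al.^[12-13] used classical convolutional neural networks to determine the sex of silkworm pupae and also obtained high recognition accuracy. SHI Hongkang et al.^[14-15] used ResNet-50 to classify silkworm diseases, accurately identifying 5 common diseases of the grown-larva stage, and used YOLO v3 to detect silkworm grasserie, achieving accurate detection when healthy and diseased silkworms were mixed, which provides a basis for precise disease control.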

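Existing studies show that deep learning has broad prospects in silkworm recognition, but most of them rely on a fixed dataset, whereas changes in the number of images, the number of varieties, or the data augmentation method can each lead to different recognition results^[16-17]. To clarify how the dataset affects silkworm variety identification, this paper collects growth images of 20 silkworm varieties on day 3 of the 4th instar in a real production environment to build a dataset, trains the lightweight convolutional neural network GhostNet on different training sets, and examines the effects of image count, variety count, and data augmentation method on the recognition rate.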
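The idea of training on different training sets can be illustrated with a short Python sketch that draws a reproducible subset for a chosen number of varieties and images per variety. This is only an illustration under stated assumptions: the paper does not describe how its subsets were drawn, and the function name `build_training_subset`, the file layout, the 800 paths per variety, and the settings in `grid` are all hypothetical.

```python
import random
from typing import Dict, List, Sequence


def build_training_subset(
    images_by_variety: Dict[str, Sequence[str]],
    n_varieties: int,
    n_images_per_variety: int,
    seed: int = 0,
) -> Dict[str, List[str]]:
    """Draw a reproducible training subset.

    Takes the first `n_varieties` variety names in sorted order and samples
    `n_images_per_variety` image paths from each, without replacement.
    The selection rule is an assumption made for this sketch.
    """
    varieties = sorted(images_by_variety)
    if n_varieties > len(varieties):
        raise ValueError("not enough varieties in the dataset")
    rng = random.Random(seed)  # fixed seed so repeated runs give the same subset
    subset: Dict[str, List[str]] = {}
    for name in varieties[:n_varieties]:
        pool = list(images_by_variety[name])
        if n_images_per_variety > len(pool):
            raise ValueError(f"variety {name!r} has only {len(pool)} images")
        subset[name] = rng.sample(pool, n_images_per_variety)
    return subset


# Hypothetical file layout: 20 varieties with 800 image paths each.
dataset = {
    f"variety_{i:02d}": [f"variety_{i:02d}/img_{j:04d}.jpg" for j in range(800)]
    for i in range(1, 21)
}

# One training subset per (variety count, images per variety) setting.
grid = [(10, 100), (10, 400), (20, 400)]
subsets = {cfg: build_training_subset(dataset, *cfg) for cfg in grid}
```

Each entry of `subsets` could then be passed to a separate training run, so that accuracy can be compared across settings while everything else stays fixed.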

3. Conclusions

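China has many silkworm varieties, and data collection and dataset construction for deep-learning-based variety identification are time-consuming, costly, and constrained by collection conditions. This paper therefore studied how the dataset affects silkworm variety identification. Growth images of 20 silkworm varieties on day 3 of the 4th instar were collected under actual production conditions to build a silkworm variety image dataset, and a GhostNet-based identification model was used to study how image count, variety count, image data augmentation, and transfer learning affect identification accuracy. The results show the following: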

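1) Increasing the number of images in the dataset helps improve recognition accuracy. With 400 training images per variety, accuracy reaches 98.15%, which meets the industry standard requirement; with 800 images, accuracy reaches 99.30%. Although more images raise accuracy, they also greatly increase cost, reduce running efficiency, and raise hardware requirements. Weighing these factors, 400 images per variety are sufficient when building a dataset.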

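2) The number of varieties affects recognition accuracy. Too many or too few varieties lower accuracy below the industry standard requirement; only with 10 to 12 varieties does accuracy exceed 98% and meet the requirement. We therefore recommend including 10 to 12 varieties in the dataset.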

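3) When the dataset contains fewer than 100 original images, image data augmentation improves recognition accuracy only marginally and has no practical value. We therefore recommend collecting as many original images as possible when building a dataset.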

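4) With fewer than 12 varieties, transfer learning effectively improves recognition accuracy, and the fewer the varieties, the better it performs. With more than 14 varieties, transfer learning instead lowers recognition accuracy, and the more varieties there are, the faster the decline.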
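The four findings above can be encoded as a small checker for a planned dataset. The Python sketch below is illustrative only: `DatasetPlan` and `review_plan` are hypothetical names, the thresholds are copied from the conclusions, and the 100-image limit for augmentation is assumed here to be counted per variety, which the text does not state explicitly. The paper reports nothing for transfer learning with 12 to 14 varieties, so the sketch stays silent in that range.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DatasetPlan:
    n_varieties: int
    original_images_per_variety: int
    use_augmentation: bool = False
    use_transfer_learning: bool = False


def review_plan(plan: DatasetPlan) -> List[str]:
    """Return notes where a plan departs from the reported findings."""
    notes: List[str] = []
    # Finding 1: 400 training images per variety reached 98.15% accuracy.
    if plan.original_images_per_variety < 400:
        notes.append("below the recommended 400 images per variety")
    # Finding 2: accuracy exceeded 98% only with 10 to 12 varieties.
    if not 10 <= plan.n_varieties <= 12:
        notes.append("variety count outside the recommended 10 to 12")
    # Finding 3: augmentation adds little below 100 original images
    # (assumed per variety in this sketch).
    if plan.use_augmentation and plan.original_images_per_variety < 100:
        notes.append("augmentation has little practical value below 100 originals")
    # Finding 4: transfer learning helps below 12 varieties, hurts above 14.
    if plan.use_transfer_learning and plan.n_varieties > 14:
        notes.append("transfer learning lowered accuracy above 14 varieties")
    if not plan.use_transfer_learning and plan.n_varieties < 12:
        notes.append("transfer learning improved accuracy below 12 varieties")
    return notes


# A plan at the recommended settings raises no notes.
ok_notes = review_plan(DatasetPlan(n_varieties=12, original_images_per_variety=400))

# A plan with many varieties and few images triggers several notes.
risky_notes = review_plan(
    DatasetPlan(
        n_varieties=20,
        original_images_per_variety=50,
        use_augmentation=True,
        use_transfer_learning=True,
    )
)
```

The notes restate the study's results for its own images and model; they are not general rules for other datasets.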
