An NLP Migration Learning for Improving Cross-lingual Understanding

WANG Kun; SHENG Hongyu

doi:10.13718/j.cnki.xdzk.2024.04.015

2024 Volume 46 Issue 4


WANG Kun, SHENG Hongyu. An NLP Migration Learning for Improving Cross-lingual Understanding[J]. Journal of Southwest University Natural Science Edition, 2024, 46(4): 153-163. doi: 10.13718/j.cnki.xdzk.2024.04.015


WANG Kun^1, SHENG Hongyu^2

1. Sichuan Vocational College of Information Technology, Guangyuan Sichuan 628017, China
2. College of Robotics, Beijing Union University, Beijing 100101, China


Corresponding author: SHENG Hongyu
Received Date: 27/06/2023
Available Online: 20/04/2024
MSC: TP393

Abstract

With the growth of internet-based information, effectively representing the information contained in different languages has become an important task in Natural Language Processing (NLP). However, many traditional machine learning models rely on training in high-resource languages and cannot be used for low-resource languages. To address this issue, this paper proposes a migration (transfer) learning method that combines migration learning with a deep learning model, Multi-lingual Bidirectional Encoder Representations from Transformers (M-BERT). The method uses M-BERT as a feature extractor to transform features between the source language domain and the target language domain, thereby reducing the differences between language domains and improving the generalization ability of the target task across domains. First, the BERT model was constructed. Then, the M-BERT model was built through pre-training steps comprising data collection and processing, training setup, parameter estimation, and model training, and it was fine-tuned on the target task. Finally, migration learning was employed to apply the M-BERT model to cross-lingual text analysis. Cross-lingual migration experiments from English to French and German showed that the proposed model performed well, required little computational effort, and achieved an accuracy of 96.2% under the joint training scheme. The results indicate that the model achieved cross-lingual data migration, validating its effectiveness and innovation in the field of cross-lingual NLP.
Keywords: NLP; M-BERT; migration learning; cross-lingual; deep learning

References

[1] KHURANA D, KOLI A, KHATTER K, et al. Natural Language Processing: State of the Art, Current Trends and Challenges[J]. Multimedia Tools and Applications, 2023, 82(3): 3713-3744. doi: 10.1007/s11042-022-13428-4
[2] 赵京胜, 宋梦雪, 高祥, 等. 自然语言处理中的文本表示研究[J]. 软件学报, 2022, 33(1): 102-128.
[3] 张博, 董瑞海. 自然语言处理技术赋能教育智能发展——人工智能科学家的视角[J]. 华东师范大学学报(教育科学版), 2022, 40(9): 19-31.
[4] 江洋洋, 金伯, 张宝昌. 深度学习在自然语言处理领域的研究进展[J]. 计算机工程与应用, 2021, 57(22): 1-14. doi: 10.3778/j.issn.1002-8331.2106-0166
[5] 陆金梁, 张家俊. 基于多语言预训练语言模型的译文质量估计方法[J]. 厦门大学学报(自然科学版), 2020, 59(2): 151-158.
[6] 鲍小异, 姜晓彤, 王中卿, 等. 基于跨语言图神经网络模型的属性级情感分类[J]. 软件学报, 2023, 34(2): 676-689.
[7] SORIN V, BARASH Y, KONEN E, et al. Deep Learning for Natural Language Processing in Radiology-Fundamentals and a Systematic Review[J]. Journal of the American College of Radiology: JACR, 2020, 17(5): 639-648. doi: 10.1016/j.jacr.2019.12.026
[8] WU L F, CHEN Y, SHEN K, et al. Graph Neural Networks for Natural Language Processing: a Survey[J]. Foundations and Trends® in Machine Learning, 2023, 16(2): 119-328. doi: 10.1561/2200000096
[9] ZHANG W E, SHENG Q Z, ALHAZMI A, et al. Adversarial Attacks on Deep-Learning Models in Natural Language Processing: a Survey[J]. ACM Transactions on Intelligent Systems and Technology, 11(3): 1-41.
[10] VÁZQUEZ R, RAGANATO A, CREUTZ M, et al. A Systematic Study of Inner-Attention-Based Sentence Representations in Multilingual Neural Machine Translation[J]. Computational Linguistics, 2020, 46(2): 387-424. doi: 10.1162/coli_a_00377
[11] VERMA V K, PANDEY M, JAIN T, et al. Dissecting Word Embeddings and Language Models in Natural Language Processing[J]. Journal of Discrete Mathematical Sciences and Cryptography, 2021, 24(5): 1509-1515. doi: 10.1080/09720529.2021.1968108
[12] GU Y, TINN R, CHENG H, et al. Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing[J]. ACM Transactions on Computing for Healthcare, 2021, 3(1): 1-23.
[13] 岳增营, 叶霞, 刘睿珩. 基于语言模型的预训练技术研究综述[J]. 中文信息学报, 2021, 35(9): 15-29. doi: 10.3969/j.issn.1003-0077.2021.09.002
[14] MOON J, PARK G, JEONG J. POP-ON: Prediction of Process Using One-Way Language Model Based on NLP Approach[J]. Applied Sciences, 2021, 11(2): 864-882. doi: 10.3390/app11020864
[15] AGGARWAL A, CHAUHAN A, KUMAR D, et al. Classification of Fake News by Fine-Tuning Deep Bidirectional Transformers Based Language Model[J]. ICST Transactions on Scalable Information Systems, 2018, 27(7): 163973.
[16] PELICON A, PRANJIĆ M, MILJKOVIĆ D, et al. Zero-Shot Learning for Cross-Lingual News Sentiment Classification[J]. Applied Sciences, 2020, 10(17): 5993-6013. doi: 10.3390/app10175993
[17] KIM B, YANG Y, PARK J S, et al. Machine Learning Based Representative Spatio-Temporal Event Documents Classification[J]. Applied Sciences, 2023, 13(7): 4230-4241. doi: 10.3390/app13074230
[18] 贾明华, 王秀利. 基于BERT和互信息的金融风险逻辑关系量化方法[J]. 数据分析与知识发现, 2022, 6(10): 68-78. doi: 10.11925/infotech.2096-3467.2022.0009

Figures(7) / Tables(5)

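With the continuing advance of globalization and the rapid development of information technology, cross-lingual understanding plays an important role in the field of Natural Language Processing (NLP)^[1]. NLP is a frontier direction of data mining that integrates machine learning with statistics, mathematics, linguistics, and other disciplines, and it has developed rapidly in recent years^[2-4]. Using statistical and machine learning methods, computers can quickly process, analyze, and exploit the deep semantic information of text, understanding and generating natural language as humans do. "Natural language" refers to human languages that have formed through natural evolution, such as Chinese, English, and Latin, as distinct from programming languages such as Java and C++.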

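In the Internet era, people can easily obtain information from different languages and cultural backgrounds online, which makes cross-lingual NLP tasks especially important. For example, machine translation, sentiment analysis, and named entity recognition all require conversion and understanding across multiple languages. However, languages differ in structure, vocabulary, grammar, and culture, which poses great challenges for cross-lingual understanding.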

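Traditional machine learning methods often require extensive manual feature engineering and domain knowledge when handling cross-lingual tasks^[5-6]. These methods typically rely on hand-designed features to capture the differences and commonalities between languages, and then use a classifier or regression model for training and inference. This approach has several problems. First, manual feature engineering is time-consuming and labor-intensive, and it copes poorly with differences between languages and with data scarcity. Second, it may fail to make full use of deep semantic and contextual information, resulting in unsatisfactory performance on cross-lingual understanding tasks.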

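Neural networks and deep learning models have achieved remarkable breakthroughs in NLP^[7-9]. Compared with traditional machine learning methods, they have stronger representation and generalization abilities, can automatically learn features and patterns from large-scale data, and can handle complex linguistic structures and semantic relations. In cross-lingual NLP they have also achieved some success: through transfer learning and the use of cross-lingual data, they can effectively address the challenges posed by language differences and data scarcity. Among them, BERT (Bidirectional Encoder Representations from Transformers) is one of the most representative deep learning models. BERT is a model-based transfer learning method in cross-lingual NLP; it has achieved state-of-the-art performance on multiple cross-lingual NLP tasks and has become a baseline model for them.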

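In the area of cross-lingual learning, Vázquez et al.^[10] performed zero-shot binary sentiment classification by reusing the encoder of a multilingual neural machine translation system; they extended this encoder with a task-specific classifier component and performed text classification in a new language. Verma et al.^[11] proposed the ULMFiT model, which pre-trains a general language model on a general-domain corpus and then fine-tunes it on the target-task data using discriminative fine-tuning, so that it can be applied to any NLP task. Gu et al.^[12] used a bidirectional LSTM trained for a specific task to produce context-sensitive representations of words in word embeddings by looking at the entire sentence. Other researchers developed two Transformer-based language models, OpenAI GPT and BERT^[13]. OpenAI GPT is a unidirectional language model, whereas BERT is the first deeply bidirectional, unsupervised language representation model, pre-trained using only a plain-text corpus^[14-15]. Pelicon et al.^[16] used BERT for sentiment classification by training a classifier in Slovene and performing inference on text in other languages. Kim et al.^[17] transferred pre-trained embedding vectors to an LSTM architecture to build the DocBERT model, which achieved parameter compression while maintaining BERT's accuracy in text classification. Jia et al.^[18] qualitatively studied the contribution of each Transformer layer in BERT-Large to different NLP tasks.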
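The contrast between a unidirectional model such as OpenAI GPT and a bidirectional model such as BERT can be pictured as a difference in which positions each token may attend to. The following is a minimal illustrative sketch only: the helper name `attention_mask` is hypothetical and is not taken from the paper or from any library, and real models apply such masks inside their attention layers rather than as standalone lists.

```python
def attention_mask(seq_len, bidirectional):
    """Return a seq_len x seq_len 0/1 matrix.

    Entry [i][j] is 1 if position i may attend to position j.
    """
    return [
        [1 if bidirectional or j <= i else 0 for j in range(seq_len)]
        for i in range(seq_len)
    ]


# Unidirectional (GPT-style): each token sees only itself and earlier tokens.
causal = attention_mask(4, bidirectional=False)
# Bidirectional (BERT-style): each token sees the whole sentence.
full = attention_mask(4, bidirectional=True)

for name, mask in (("unidirectional", causal), ("bidirectional", full)):
    print(name)
    for row in mask:
        print(" ".join(str(v) for v in row))
```

In the unidirectional case the matrix is lower-triangular, so a word's representation cannot depend on words to its right; in the bidirectional case every entry is 1, so context from both sides contributes.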

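Building on the above research, this paper combines transfer learning with deep learning models and proposes an M-BERT transfer learning model to address key problems in cross-lingual NLP tasks. Experiments compared the proposed method with other advanced algorithms on multiple cross-lingual NLP tasks, and the results show that the proposed method significantly outperformed the existing algorithms on every task. The purpose of this paper is to explore an NLP transfer learning method, combined with a deep learning model (the M-BERT model), for improving the performance of cross-lingual understanding. Through transfer learning, the rich resources and knowledge of the source language can be used to improve learning in the target language, thereby addressing the problems caused by data scarcity and language differences. At the same time, the strong representation learning and contextual understanding abilities of deep learning models can further improve the accuracy and generalization of cross-lingual understanding.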

4. Conclusion

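Cross-lingual NLP tasks face the challenges of language differences and data scarcity. To address these challenges, researchers have proposed a range of methods, including approaches based on transfer learning and on neural networks. This paper combines transfer learning with deep learning models and proposes a new method for improving cross-lingual understanding. First, the BERT model is constructed. Then, the M-BERT model is built through a series of pre-training operations and fine-tuned on the target task. Finally, to apply the learned knowledge to target-language tasks and improve understanding of the target language, a transfer learning strategy is adopted in which M-BERT serves as a feature extractor for transforming features between the source language domain and the target language domain. This form of transfer learning enables knowledge sharing and transfer across languages and improves performance on target-language tasks. In the experiments, several common cross-lingual NLP tasks were selected and the method was compared with other advanced algorithms. The results show that the proposed method outperformed all of the compared algorithms on these tasks. The comparative and quantitative evaluation experiments verified the feasibility and superiority of the method, which can transfer semantic knowledge learned in the source language to the target language and thereby compensate for data scarcity in the target language. Future work will examine in depth the multi-level linguistic encoding extracted in M-BERT, in order to correctly understand and analyze how the model captures different kinds of information. In addition, other BERT architectures, such as DeeBERT, MobileBERT, SpanBERT, and ALBERT, will be evaluated to explore more advanced NLP models for cross-lingual research.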
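The idea of using a shared feature extractor so that a classifier trained on the source language can be applied to a target language can be sketched as follows. This is a toy illustration under strong assumptions, not the paper's implementation: the hand-written `LEXICON` and `CONCEPT_VECTORS` are hypothetical stand-ins for the aligned representations a multilingual encoder such as M-BERT would learn from data, and the classifier head is a plain logistic regression.

```python
import math

# Toy stand-in for a multilingual encoder (an assumption, not M-BERT itself):
# translation-equivalent words share one vector in a common feature space.
CONCEPT_VECTORS = {"pos": (1.0, 0.0), "neg": (-1.0, 0.0), "topic": (0.0, 1.0)}
LEXICON = {
    "good": "pos", "great": "pos", "bad": "neg", "awful": "neg",  # English
    "bon": "pos", "mauvais": "neg",                               # French
    "gut": "pos", "schlecht": "neg",                              # German
    "film": "topic",                                              # same word in all three
}


def encode(sentence):
    """Mean-pool the vectors of known tokens; unknown tokens are ignored."""
    vectors = [CONCEPT_VECTORS[LEXICON[t]] for t in sentence.lower().split() if t in LEXICON]
    if not vectors:
        return (0.0, 0.0)
    return tuple(sum(dim) / len(vectors) for dim in zip(*vectors))


def train_classifier(features, labels, lr=0.5, epochs=200):
    """Fit a logistic-regression head with batch gradient descent."""
    w, b = [0.0, 0.0], 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for x, y in zip(features, labels):
            p = 1.0 / (1.0 + math.exp(-(w[0] * x[0] + w[1] * x[1] + b)))
            grad_w[0] += (p - y) * x[0]
            grad_w[1] += (p - y) * x[1]
            grad_b += p - y
        w = [w[0] - lr * grad_w[0] / n, w[1] - lr * grad_w[1] / n]
        b -= lr * grad_b / n
    return w, b


def accuracy(model, sentences, labels):
    """Fraction of sentences whose predicted label matches the gold label."""
    w, b = model
    hits = 0
    for s, y in zip(sentences, labels):
        x = encode(s)
        pred = 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0
        hits += pred == y
    return hits / len(labels)


# Train the head on labeled English (source-language) data only.
source = ["good film", "great film", "bad film", "awful film"]
source_labels = [1, 1, 0, 0]
model = train_classifier([encode(s) for s in source], source_labels)

# Evaluate on French and German (target-language) sentences never seen in training.
fr_acc = accuracy(model, ["le film est bon", "le film est mauvais"], [1, 0])
de_acc = accuracy(model, ["der film ist gut", "der film ist schlecht"], [1, 0])
print(fr_acc, de_acc)
```

Because the French and German sentences land in the same feature space as the English training data, the English-trained head classifies them without any target-language labels. With a real encoder the alignment is only approximate, which is one reason fine-tuning or joint training on some target-language data may still help.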

