DDoS Attack Detection Based on Parallel Accumulation Ranker Algorithm and Active Learning

WANG Hui; ZHANG Xue-jun

doi:10.13718/j.cnki.xsxb.2021.01.005

To classify high-volume network traffic accurately and quickly in high-speed network environments, and thereby detect distributed denial of service (DDoS) attacks, this paper proposes a DDoS attack detection algorithm based on the parallel accumulation ranker algorithm and active learning. The parallel accumulation ranker algorithm accumulates and ranks the traffic features to select the best feature subset, and an expert module selects appropriate examples in an unsupervised way to train a support vector machine (SVM) binary classifier that detects DDoS attack traffic. In this way, a small number of training samples selected from the data set is enough to produce high-precision network traffic classification. Experiments show that the proposed algorithm outperforms existing methods in both classification accuracy and execution speed.


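Distributed denial of service (DDoS) attacks are hard to defend against because their attack signatures keep changing, and they pose a serious threat to all kinds of businesses and enterprises^[1-2]. Fast and effective identification and classification of network traffic can significantly improve network security. Because the volume of transmitted data keeps growing and the available applications are increasingly diverse, traffic analysis is needed for traffic prioritization and diagnostic monitoring^[3-5]. Information diversity, or information spread, is a major challenge for network traffic classification: it means that each type of traffic can have its own distinctive characteristics or statistical properties. Collective classification refers to classifying a set of interrelated objects using all available information. To perform a collective classification task, class labels must be retrieved for an initial population of traffic instances and then used in the next round of classification. Therefore, before the whole traffic is classified, a selected subset of instances must be labeled with the required information and their membership in the different classes determined. Based on this initial information, all remaining instances of network traffic can be classified in batches^[6].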

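Active learning is a special case of semi-supervised machine learning^[7-8] in which the learning algorithm can interactively query a user (or some other information source) to obtain the desired outputs at new data points; it is also known as optimal experimental design^[9]. Unlabeled data are often abundant, but labeling them manually is very expensive. A learning algorithm can instead actively query a user, teacher, or expert for labels, and this type of iterative supervised learning is called active learning. Because the learner chooses the examples, the number of examples needed to learn a concept is usually far smaller than that required in ordinary supervised learning. This paper uses active learning with few training instances to handle large volumes of network traffic.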
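
The query loop described above can be sketched as pool-based active learning with uncertainty sampling. This is only an illustration under stated assumptions: the nearest-centroid classifier is a deliberately tiny stand-in for an SVM, the `oracle` function plays the role of the user or expert, and the toy feature values and all names (`active_learn`, `NearestCentroid`, `pool`) are hypothetical rather than taken from the paper.

```python
def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[j] for p in points) / n for j in range(len(points[0])))


def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class NearestCentroid:
    """Tiny two-class stand-in classifier (an assumption, not the paper's SVM)."""

    def fit(self, X, y):
        self.centroids = {
            label: centroid([x for x, t in zip(X, y) if t == label])
            for label in set(y)
        }
        return self

    def predict(self, x):
        return min(self.centroids, key=lambda label: dist2(x, self.centroids[label]))

    def margin(self, x):
        """Gap between the two squared centroid distances; small means uncertain."""
        near, far = sorted(dist2(x, c) for c in self.centroids.values())
        return far - near


def active_learn(pool, oracle, seed_idx, budget):
    """Buy up to `budget` labels, always for the most uncertain unlabeled point.

    `seed_idx` must contain at least one example of each of the two classes.
    """
    labeled = {i: oracle(pool[i]) for i in seed_idx}  # pool index -> label
    model = NearestCentroid()
    for _ in range(budget):
        model.fit([pool[i] for i in labeled], list(labeled.values()))
        unlabeled = [i for i in range(len(pool)) if i not in labeled]
        if not unlabeled:
            break
        query = min(unlabeled, key=lambda i: model.margin(pool[i]))
        labeled[query] = oracle(pool[query])  # the only place a label is requested
    model.fit([pool[i] for i in labeled], list(labeled.values()))
    return model, labeled


# Toy pool of (packets per second, distinct ports); the values are made up.
pool = [(i, i % 3) for i in range(10)] + [(20 + i, i % 3) for i in range(10)]
oracle = lambda x: 1 if x[0] >= 15 else 0  # 1 = attack, 0 = normal
model, labeled = active_learn(pool, oracle, seed_idx=[0, 19], budget=3)
print(len(labeled), "labels used;",
      sum(model.predict(x) == oracle(x) for x in pool), "of", len(pool), "correct")
```

On this toy pool the learner requests only five labels in total (two seeds plus three queries), which is the point of the approach: labels are bought only where the current model is least certain.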

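Detecting DDoS attacks correctly and quickly is a key problem in network security, and research on DDoS attack detection systems has produced a number of results in recent years. Reference [10] proposed an efficient DDoS attack detection technique based on multi-level autoencoder feature learning: it learns multiple levels of shallow and deep autoencoders in an unsupervised manner to encode the training and test data for feature regeneration, and learns the final unified detection model by combining the multi-level features with an efficient multiple kernel learning algorithm. Reference [11] compared centralized and distributed feature selection methods; the distributed approach partitions the data set vertically or horizontally and can achieve higher classification performance while significantly reducing running time. Reference [12] proposed a fast minimum redundancy maximum relevance (mRMR) algorithm and implemented it on several platforms: the central processing unit (CPU) for sequential execution, the graphics processing unit (GPU) for parallel computing, and Apache Spark for distributed computing with big data technology.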
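
To make the minimum redundancy maximum relevance criterion concrete, here is a minimal sketch of plain greedy mRMR on discrete features, using the difference form (relevance minus mean redundancy, both measured by mutual information). It does not reproduce the optimizations or the CPU/GPU/Spark implementations of reference [12]; the feature names and toy values are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_info(xs, ys):
    """Mutual information in bits between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())


def mrmr(columns, labels, k):
    """Greedily select k feature names by relevance minus mean redundancy."""
    relevance = {name: mutual_info(col, labels) for name, col in columns.items()}
    selected = []
    while len(selected) < min(k, len(columns)):
        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(
                mutual_info(columns[name], columns[s]) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy

        best = max((name for name in columns if name not in selected), key=score)
        selected.append(best)
    return selected


# Toy discretized traffic features (made-up values); label 1 = attack.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "pkt_rate":     [0, 0, 0, 1, 1, 1, 1, 1],  # relevant
    "pkt_rate_dup": [0, 0, 0, 1, 1, 1, 1, 1],  # exact copy, fully redundant
    "syn_ratio":    [0, 1, 0, 0, 1, 0, 1, 1],  # weaker but complementary
    "noise":        [0, 1, 0, 1, 0, 1, 0, 1],  # independent of the label
}
print(mrmr(columns, labels, k=2))  # → ['pkt_rate', 'syn_ratio']
```

Ranking by relevance alone would pick `pkt_rate` and its duplicate; the redundancy penalty is what makes the second choice fall on the complementary feature instead.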

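Combining the data set partitioning approach of reference [11] with the advantages of the fast minimum redundancy maximum relevance algorithm of reference [12], this paper proposes a DDoS attack detection technique based on the parallel accumulation ranker (PCR) algorithm and active learning. The technique distributes the computational load of a massive network traffic data set across the cores of a GPU and applies the feature selection method locally on each core, so that it can handle large data sets and process them in near real time without compromising quality. The network traffic features are first ranked in an accumulate-and-rank fashion in a parallel computing environment to select the best feature subset. To handle large volumes of network traffic, an expert module then selects appropriate instances in an unsupervised way to train a support vector machine (SVM) binary classifier, so that a small batch of training samples selected from high-volume traffic yields high-precision traffic classification. Experimental results show that the proposed algorithm outperforms other methods in both execution time and classification accuracy.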
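
The general pattern of partitioning the data, computing locally, accumulating, and then ranking can be sketched as follows. This is not the paper's PCR algorithm, whose scoring function is not given in this section: the Fisher-style score is a swapped-in assumption, Python threads stand in for GPU cores, and the function names and toy feature columns are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_stats(rows, labels):
    """Per class: [count, per-feature sums, per-feature sums of squares] for one chunk."""
    n_feat = len(rows[0])
    stats = {c: [0, [0.0] * n_feat, [0.0] * n_feat] for c in (0, 1)}
    for row, c in zip(rows, labels):
        s = stats[c]
        s[0] += 1
        for j, v in enumerate(row):
            s[1][j] += v
            s[2][j] += v * v
    return stats


def accumulate(parts, n_feat):
    """Add the chunk-level statistics together."""
    total = {c: [0, [0.0] * n_feat, [0.0] * n_feat] for c in (0, 1)}
    for part in parts:
        for c in (0, 1):
            total[c][0] += part[c][0]
            for j in range(n_feat):
                total[c][1][j] += part[c][1][j]
                total[c][2][j] += part[c][2][j]
    return total


def rank_features(rows, labels, n_workers=4, eps=1e-9):
    """Return (feature indices sorted best-first, scores); both classes must occur."""
    n_feat = len(rows[0])
    size = max(1, -(-len(rows) // n_workers))  # ceiling division
    chunks = [(rows[i:i + size], labels[i:i + size]) for i in range(0, len(rows), size)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = list(pool.map(lambda chunk: partial_stats(*chunk), chunks))
    total = accumulate(parts, n_feat)
    scores = []
    for j in range(n_feat):
        mean, var = {}, {}
        for c in (0, 1):
            n = total[c][0]
            mean[c] = total[c][1][j] / n
            var[c] = total[c][2][j] / n - mean[c] ** 2
        # Fisher-style separation: squared mean gap over summed class variances.
        scores.append((mean[1] - mean[0]) ** 2 / (var[0] + var[1] + eps))
    order = sorted(range(n_feat), key=lambda j: scores[j], reverse=True)
    return order, scores


# Toy flows with columns (pkts_per_s, mean_pkt_len, ttl); values are made up.
rows = [[1, 5, 3], [2, 6, 3], [1, 4, 3], [2, 5, 3],
        [9, 6, 3], [10, 5, 3], [9, 7, 3], [10, 6, 3]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = attack
order, scores = rank_features(rows, labels, n_workers=3)
print(order)  # → [0, 1, 2]
```

Because counts, sums, and sums of squares are additive, each worker only needs its own chunk, and the final ranking does not depend on how the rows were split.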

4. Conclusion

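To classify large volumes of network traffic correctly and quickly for DDoS attack detection, this paper adopts a DDoS attack detection method based on the parallel accumulation ranker (PCR) algorithm and active learning. The method ranks the attributes of the data set with the PCR algorithm to find the best possible features and uses parallel computing to handle large volumes of network traffic. It also discusses the importance of active learning: an expert module selects appropriate instances in an unsupervised way to train the support vector machine (SVM) binary classifier that detects DDoS attack traffic, so that a small batch of training samples selected from the data set yields high-precision traffic classification. Experimental results show that, when classifying high-volume traffic data, the proposed algorithm provides better classification accuracy and higher speed with fewer training samples. Future work will extend PCR by combining soft computing and other techniques to develop fuzzy inference supported by active learning, so that features of large data spaces can be ranked and the generality of the method established.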

Figure (6) Table (1) Reference (14)


[1]	WANG Yang, WU Zhong-dong, ZHU Jing. Intrusion Detection Algorithm Based on Deep Sequence Weighted Kernel Extreme Learning [J]. Application Research of Computers, 2020, 37(3): 829-832. (in Chinese)
[2]	SINGH P K, JHA S K, NANDI S K, et al. ML-Based Approach to Detect DDoS Attack in V2I Communication Under SDN Architecture [C]//TENCON 2018-2018 IEEE Region 10 Conference. Jeju: IEEE, 2018.
[3]	PACHECO F, EXPOSITO E, GINESTE M, et al. Towards the Deployment of Machine Learning Solutions in Network Traffic Classification: a Systematic Survey [J]. IEEE Communications Surveys & Tutorials, 2019, 21(2): 1988-2014.
[4]	YAN Bing-hao, HAN Guo-dong, HUANG Ya-jing, et al. Imbalanced Network Traffic Identification Method [J]. Journal of Computer Applications, 2018, 38(1): 20-25. (in Chinese)
[5]	WANG P, YE F, CHEN X J, et al. Datanet: Deep Learning Based Encrypted Network Traffic Classification in SDN Home Gateway [J]. IEEE Access, 2018, 6: 55380-55391. doi: 10.1109/ACCESS.2018.2872430
[6]	LIU Min, TENG Hua, HE Xian-bo. Kernel Function Based Real-Time DDoS Security System for Software-Defined Networks [J]. Application Research of Computers, 2020, 37(3): 843-846, 850. (in Chinese)
[7]	BARTHOLOMEW J B, JOWERS E M, ROBERTS G, et al. Active Learning Increases Children's Physical Activity across Demographic Subgroups [J]. Translational Journal of the American College of Sports Medicine, 2018, 3(1): 1-9. doi: 10.1249/TJX.0000000000000051
[8]	SHEKHAR P, PRINCE M, FINELLI C, et al. Integrating Quantitative and Qualitative Research Methods to Examine Student Resistance to Active Learning [J]. European Journal of Engineering Education, 2019, 44(1/2): 6-18.
[9]	MELNIKOV A A, POULSEN NAUTRUP H, KRENN M, et al. Active Learning Machine Learns to Create New Quantum Experiments [J]. PNAS, 2018, 115(6): 1221-1226. doi: 10.1073/pnas.1714936115
[10]	YAN B H, HAN G D. Effective Feature Extraction via Stacked Sparse Autoencoder to Improve Intrusion Detection System [J]. IEEE Access, 2018, 6: 41238-41248. doi: 10.1109/ACCESS.2018.2858277
[11]	MORÁN-FERNÁNDEZ L, BOLÓN-CANEDO V, ALONSO-BETANZOS A. Centralized Vs. Distributed Feature Selection Methods Based on Data Complexity Measures [J]. Knowledge-Based Systems, 2017, 117: 27-45. doi: 10.1016/j.knosys.2016.09.022
[12]	RAMÍREZ-GALLEGO S, LASTRA I, MARTÍNEZ-REGO D, et al. Fast-mRMR: Fast Minimum Redundancy Maximum Relevance Algorithm for High-Dimensional Big Data [J]. International Journal of Intelligent Systems, 2017, 32(2): 134-152. doi: 10.1002/int.21833
[13]	ASHFAQ R A R, WANG X Z, HUANG J Z, et al. Fuzziness Based Semi-supervised Learning Approach for Intrusion Detection System [J]. Information Sciences, 2017, 378: 484-497. doi: 10.1016/j.ins.2016.04.019
[14]	CAO J, FANG Z, QU G, et al. An Accurate Traffic Classification Model Based on Support Vector Machines [J]. International Journal of Network Management, 2017, 27(1): 1-15.


Corresponding author: CHEN Bin, bchen63@163.com
