Research on Intertemporal Arbitrage Based on Machine Learning and Empirical Mode Decomposition

ZHOU Liang; CHEN Chen; LI Ning

doi:10.13718/j.cnki.xdzk.2022.01.014

2022 Volume 44 Issue 1


ZHOU Liang, CHEN Chen, LI Ning. Research on Intertemporal Arbitrage Based on Machine Learning and Empirical Mode Decomposition[J]. Journal of Southwest University Natural Science Edition, 2022, 44(1): 148-159. doi: 10.13718/j.cnki.xdzk.2022.01.014


1. School of Finance, Hunan University of Finance and Economics, Changsha 410205, China
2. School of Finance, Southwestern University of Finance and Economics, Chengdu 611130, China

Received Date: 04/01/2021
Available Online: 20/01/2022
MSC: F830.9

Abstract

This paper uses a rolling empirical mode decomposition (EMD) to decompose the price spread between the current-month and next-month CSI 300 stock index futures contracts, applies three machine learning models (Elman network, RF, SVM) and an ARIMA model to forecast and synthesize the signals of different frequencies, and designs intertemporal arbitrage strategies based on the forecasts. The results show that the SVM, RF, and ARIMA models forecast more accurately than the Elman network. All models achieve high arbitrage returns, and model fusion, which combines linear and nonlinear models, improves risk control. Combining machine learning forecasts with EMD greatly increases profitability without increasing risk, so that both the Sharpe ratio and the Sortino ratio rise. Subsample tests, forecasts using all IMF signals, and arbitrage analysis on the commodity futures market all confirm that machine learning models integrated with EMD achieve better arbitrage results than pure machine learning models. The conclusions help promote interdisciplinary research on artificial intelligence and finance and provide theoretical and practical references for futures investment.
- machine learning
- empirical mode decomposition
- intertemporal arbitrage
- futures investment
- artificial intelligence

References

[1] 杨云飞, 鲍玉昆, 胡忠义, 等. 基于EMD和SVMs的原油价格预测方法[J]. 管理学报, 2010, 7(12): 1884-1889. doi: 10.3969/j.issn.1672-884X.2010.12.023
[2] JACOBS H, WEBER M. On the Determinants of Pairs Trading Profitability[J]. Journal of Financial Markets, 2015, 23: 75-97. doi: 10.1016/j.finmar.2014.12.001
[3] 张波, 刘晓倩. 基于EGARCH-M模型的沪深300股指期货跨期套利研究——一种修正的协整关系[J]. 统计与信息论坛, 2017, 32(4): 34-40. doi: 10.3969/j.issn.1007-3116.2017.04.006
[4] 刘海飞, 李伟, 李冬昕, 等. 股指期货跨期套利自适应机制理论与实证——基于沪深300股指期货高频数据的证据[J]. 华东经济管理, 2018, 32(11): 102-111.
[5] KRAUSS C, DO X A, HUCK N. Deep Neural Networks, Gradient-Boosted Trees, Random Forests: Statistical Arbitrage on the S&P 500[J]. European Journal of Operational Research, 2017, 259(2): 689-702. doi: 10.1016/j.ejor.2016.10.031
[6] HAIN M, HESS J, UHRIG-HOMBURG M. Relative Value Arbitrage in European Commodity Markets[J]. Energy Economics, 2018, 69: 140-154. doi: 10.1016/j.eneco.2017.11.005
[7] 邢亚丹, 劳兰珺, 孙谦. 跨期套利收益与风险来源探究——基于沪深300股指期货高频跨期套利策略[J]. 投资研究, 2015, 34(10): 98-109.
[8] DUNIS C L, LAWS J, EVANS B. Modelling and Trading the Soybean-Oil Crush Spread with Recurrent and Higher Order Networks: a Comparative Analysis[J]. Neural Network World, 2006, 16(3): 193-213.
[9] HUCK N. Pairs Selection and Outranking: an Application to the S&P 100 Index[J]. European Journal of Operational Research, 2009, 196(2): 819-825. doi: 10.1016/j.ejor.2008.03.025
[10] WILES P S, ENKE D. Nonlinear Modeling Using Neural Networks for Trading the Soybean Complex[J]. Procedia Computer Science, 2014, 36: 234-239. doi: 10.1016/j.procs.2014.09.085
[11] 王文波, 费浦生, 羿旭明. 基于EMD与神经网络的中国股票市场预测[J]. 系统工程理论与实践, 2010, 30(6): 1027-1033.
[12] 刘建和, 梁仁方, 王玉斌, 等. 大豆期货合约均值回归套利策略和Elman神经网络套利策略对比研究[J]. 湖南财政经济学院学报, 2016(3): 8-15.
[13] 邓亚东, 王波. 基于高斯核支持向量机的商品期货市场套利研究[J]. 经济数学, 2018, 35(1): 27-30. doi: 10.3969/j.issn.1007-1660.2018.01.007
[14] 周亮. 基于价差预测的商品期货跨期套利研究[J]. 金融理论与实践, 2019(7): 84-92. doi: 10.3969/j.issn.1003-4625.2019.07.012
[15] HUCK N. Large Data Sets and Machine Learning: Applications to Statistical Arbitrage[J]. European Journal of Operational Research, 2019, 278(1): 330-342. doi: 10.1016/j.ejor.2019.04.013
[16] 熊志斌. ARIMA融合神经网络的人民币汇率预测模型研究[J]. 数量经济技术经济研究, 2011, 28(6): 64-76.
[17] 周亮. 机器学习融合ARIMA模型的离岸人民币汇率预测[J]. 统计学报, 2020, 1(2): 48-56.
[18] HUANG N E, SHEN Z, LONG S R, et al. The Empirical Mode Decomposition and the Hilbert Spectrum for Nonlinear and Non-Stationary Time Series Analysis[J]. Proceedings of the Royal Society of London Series A: Mathematical, Physical and Engineering Sciences, 1998, 454(1971): 903-995. doi: 10.1098/rspa.1998.0193
[19] ZHANG X, LAI K K, WANG S Y. A New Approach for Crude Oil Price Analysis Based on Empirical Mode Decomposition[J]. Energy Economics, 2008, 30(3): 905-918. doi: 10.1016/j.eneco.2007.02.012
[20] 杨云飞, 鲍玉昆, 胡忠义, 等. 基于EMD和SVMs的原油价格预测方法[J]. 管理学报, 2010, 7(12): 1884-1889. doi: 10.3969/j.issn.1672-884X.2010.12.023
[21] 米子川, 姜天英. 煤炭大数据指数编制及经验模态分解模型研究[J]. 统计与信息论坛, 2016, 31(8): 71-77. doi: 10.3969/j.issn.1007-3116.2016.08.013
[22] LI H T, BAI J C, CUI X, et al. A New Secondary Decomposition-Ensemble Approach with Cuckoo Search Optimization for Air Cargo Forecasting[J]. Applied Soft Computing, 2020, 90(1): 1-19.
[23] SUN S L, WANG S Y, WEI Y J. A New Multiscale Decomposition Ensemble Approach for Forecasting Exchange Rates[J]. Economic Modelling, 2019, 81: 49-58. doi: 10.1016/j.econmod.2018.12.013
[24] 吴曼曼, 徐建新. 基于EMD改进的Elman神经网络对股票的短期预测模型[J]. 计算机工程与科学, 2019, 41(6): 1119-1127. doi: 10.3969/j.issn.1007-130X.2019.06.022
[25] HUANG N E, WU M L C, LONG S R, et al. A Confidence Limit for the Empirical Mode Decomposition and Hilbert Spectral Analysis[J]. Proceedings of the Royal Society of London Series A: Mathematical, Physical and Engineering Sciences, 2003, 459(2037): 2317-2345. doi: 10.1098/rspa.2003.1123

Figures(4) / Tables(6)

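Intertemporal arbitrage exploits abnormal movements in the price spread between contracts on the same futures product with different expiration dates: the investor takes opposite positions in the two contracts and closes them for a profit once the spread reverts to its normal level. Compared with buy-and-hold strategies in stocks and other financial instruments, intertemporal arbitrage trades the spread between contracts on the same underlying and therefore carries lower risk. Compared with cross-product or cross-market arbitrage, the contract spread is more stable, so returns are steadier and risk is lower. Because positions are opened only when the spread is far from its normal value, the profit per trade is usually lower than that of a buy-and-hold trend strategy. However, the high leverage of futures markets and the T+0 trading regime, which permits a higher trading frequency, mean that the risk-adjusted returns of arbitrage trading are often higher^[1-4], and a growing number of fund companies have therefore adopted arbitrage trading in practice. In addition, arbitrage returns have very low or even negative correlation with buy-and-hold returns, which makes arbitrage an important tool for diversifying investment risk and hedging tail risk. For example, in early 2020 the COVID-19 pandemic caused sharp drawdowns in global stock, bond, and commodity markets; adding arbitrage trading to a portfolio could have controlled that tail risk very effectively.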

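Accurate prediction of the spread is the key to successful intertemporal arbitrage. The vast majority of existing studies and practitioners design strategies with the standard distance method, which rests on mean reversion of the spread: a position against the deviation is opened when the spread moves outside a reasonable range (commonly the mean ± one or more standard deviations) and closed when the spread returns to the neighbourhood of the mean^[5-7]. As machine learning models have become more widely used in financial forecasting and have shown high predictive accuracy, many scholars and investors have used them to forecast the spread and to trade once the predicted spread exceeds a given threshold, thereby earning arbitrage returns. Machine learning models commonly used for arbitrage trading include artificial neural networks^[8-12], support vector machines^[13-14], and random forests^[15].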
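
The standard distance method described above can be illustrated with a minimal Python sketch. It is not taken from the paper: the function name `distance_signals`, the rolling window length, and the band width `k` are all assumptions chosen for illustration, and the sign convention (+1 means long the spread) is arbitrary.

```python
from statistics import mean, stdev

def distance_signals(spread, window=20, k=1.0):
    """Position per time step: +1 long the spread, -1 short the spread, 0 flat.

    Hypothetical helper: the band is mean +/- k standard deviations of the
    previous `window` observations, so only past data is used at each step.
    """
    positions = []
    pos = 0
    for t in range(len(spread)):
        if t < window:                      # not enough history yet
            positions.append(0)
            continue
        hist = spread[t - window:t]
        mu, sigma = mean(hist), stdev(hist)
        s = spread[t]
        if pos == 0:
            if s > mu + k * sigma:          # spread unusually wide: short it
                pos = -1
            elif s < mu - k * sigma:        # spread unusually narrow: long it
                pos = 1
        elif pos == 1 and s >= mu:          # reverted to the mean: close
            pos = 0
        elif pos == -1 and s <= mu:
            pos = 0
        positions.append(pos)
    return positions

example = distance_signals([0.0, 1.0, 0.0, 1.0, 5.0, 5.0, 0.0], window=4, k=1.0)
```

In the toy series the jump to 5.0 opens a short position, which is closed once the spread falls back below the rolling mean.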

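However, forecasting the spread directly undoubtedly discards much detailed information. For example, Xiong Zhibin^[16] and Zhou Liang^[17], in studies of the RMB exchange rate, both found that using an ARIMA model for the linear part and a machine learning model for the nonlinear or residual part yields more accurate forecasts of the offshore RMB exchange rate. The empirical mode decomposition (EMD) model proposed by Huang et al.^[18] is widely used in engineering signal processing. It decomposes a signal into several intrinsic mode functions (IMFs) and a residual term, each with its own characteristics that lend themselves to analysis and forecasting. Since EMD was proposed, many scholars have applied it to economic problems, including crude oil price analysis^[19-20] and environmental analysis^[21-23]; compared with analysing the raw data directly, analysis based on the decomposed signals has produced more accurate and robust results.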
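
To make the idea of splitting a signal into IMFs plus a residual concrete, the following is a deliberately simplified sifting sketch in plain Python. It is an assumption-laden illustration, not the procedure of Huang et al.^[18] or of this paper: the envelopes are piecewise-linear rather than cubic splines, the number of sifting passes is fixed instead of using a stopping criterion, and the names `simplified_emd`, `max_imfs`, and `sift_iters` are hypothetical.

```python
import math

def local_extrema(x):
    """Indices of interior local maxima and minima."""
    maxima = [i for i in range(1, len(x) - 1) if x[i - 1] < x[i] >= x[i + 1]]
    minima = [i for i in range(1, len(x) - 1) if x[i - 1] > x[i] <= x[i + 1]]
    return maxima, minima

def linear_envelope(idx, x):
    """Piecewise-linear envelope through the given extrema, pinned to both ends."""
    knots = [0] + idx + [len(x) - 1]
    env = [0.0] * len(x)
    for a, b in zip(knots, knots[1:]):
        for t in range(a, b + 1):
            env[t] = x[a] + (x[b] - x[a]) * (t - a) / (b - a)
    return env

def simplified_emd(x, max_imfs=5, sift_iters=10):
    """Return (imfs, residual) such that the IMFs plus the residual rebuild x."""
    residual = list(x)
    imfs = []
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residual)
        if len(maxima) + len(minima) < 3:   # too few oscillations left
            break
        h = list(residual)
        for _ in range(sift_iters):         # fixed number of sifting passes
            maxima, minima = local_extrema(h)
            if not maxima or not minima:
                break
            upper = linear_envelope(maxima, h)
            lower = linear_envelope(minima, h)
            # subtract the local mean of the two envelopes
            h = [v - 0.5 * (u + l) for v, u, l in zip(h, upper, lower)]
        imfs.append(h)
        residual = [r - c for r, c in zip(residual, h)]
    return imfs, residual

# Toy signal: fast cycle + slow cycle + linear trend.
signal = [math.sin(2 * math.pi * t / 10) + 0.5 * math.sin(2 * math.pi * t / 50) + 0.01 * t
          for t in range(200)]
imfs, residual = simplified_emd(signal)
rebuilt = [sum(c[t] for c in imfs) + residual[t] for t in range(len(signal))]
```

By construction the components sum back to the original series, which is the property that lets each component be forecast separately and the forecasts be added together afterwards.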

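This paper applies EMD to the spread between the current-month and next-month CSI 300 stock index futures contracts, uses a neural network, a support vector machine, a random forest, and an ARIMA model to forecast the high-frequency and low-frequency signals separately, and then evaluates the models in terms of forecast accuracy and arbitrage performance. Relative to the existing literature on futures intertemporal arbitrage, the main contributions are: ① The original series of spread changes is decomposed with EMD on a rolling basis and each component is forecast with the machine learning models; compared with pure machine learning forecasts, this treats the series signal more thoroughly and completely and substantially improves forecast accuracy and arbitrage performance. ② Several machine learning models and a linear time-series model are compared and combined, which both identifies the models best suited to intertemporal arbitrage and integrates linear with nonlinear models, improving arbitrage performance while strengthening the economic interpretability of the machine learning models.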
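
The rolling decompose, forecast, and recombine workflow outlined above can be sketched as follows. Everything in the sketch is a stand-in chosen to keep it short and dependency-free: a trailing moving average replaces EMD as the low-frequency/high-frequency split, a least-squares AR(1) fit replaces the ARIMA and machine learning models, and it forecasts the level of the series rather than spread changes. The names `rolling_forecast`, `window`, and `smooth` are hypothetical. The one point it is meant to show is that the decomposition is redone inside each rolling window, so no future data leaks into a forecast.

```python
from statistics import mean

def trailing_mean(x, w):
    """Low-frequency stand-in: trailing moving average (shorter at the start)."""
    return [mean(x[max(0, i - w + 1): i + 1]) for i in range(len(x))]

def ar1_forecast(series):
    """One-step-ahead forecast from a least-squares AR(1) fit."""
    prev, curr = series[:-1], series[1:]
    mp, mc = mean(prev), mean(curr)
    var = sum((p - mp) ** 2 for p in prev)
    if var == 0:                            # constant series: repeat last value
        return series[-1]
    slope = sum((p - mp) * (c - mc) for p, c in zip(prev, curr)) / var
    return mc + slope * (series[-1] - mp)

def rolling_forecast(spread, window=60, smooth=5):
    """preds[i] is the forecast of spread[window + i] using only earlier data."""
    preds = []
    for t in range(window, len(spread)):
        hist = spread[t - window:t]         # strictly past observations
        low = trailing_mean(hist, smooth)   # decomposition redone per window
        high = [h - l for h, l in zip(hist, low)]
        # forecast each component separately, then add the forecasts
        preds.append(ar1_forecast(low) + ar1_forecast(high))
    return preds

flat_preds = rolling_forecast([2.0] * 80)
```

Swapping in a real EMD routine and real forecasting models would change the two helper functions but not the shape of the loop.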

3. Conclusions and Discussion

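Using all daily data for the IF current-month and next-month continuous contracts from April 16, 2010 to July 31, 2020, this paper forecasts the series of spread changes between the two contracts with three machine learning methods (Elman, RF, SVM) and an ARIMA model, and builds arbitrage models on the forecasts. The results show: ① The SVM and ARIMA models have relatively high forecast accuracy and the Elman model performs poorly, while the RF model, which ensembles many weak classifiers, gives comparatively robust results. ② All models earn high arbitrage returns at every threshold. The maximum drawdown of the vast majority of models stays within 20%, volatility is below 33% in all cases, and downside volatility is below 16% in all cases, which indicates good risk control. Compared with forecasting by RF or ARIMA alone, the hybrid models (which average the forecasts or use them as joint conditions) control risk better, with lower volatility, downside volatility, and maximum drawdown; this shows that combining nonlinear and linear models improves risk control. ③ Combining machine learning forecasts with EMD substantially raises returns without raising risk, so that both the Sharpe ratio and the Sortino ratio rise considerably. The best performer is the EMD-ARIMA model, with an annualized return of 96.52%, a Sharpe ratio of 2.8549, and a Sortino ratio of 8.2711. ④ Subsample tests, forecasts using all IMF signals, and arbitrage analysis on commodity futures markets all confirm that machine learning models combined with EMD achieve better arbitrage results than pure machine learning models.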
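
For readers who want to reproduce this kind of risk summary on their own return series, here is a minimal sketch of the metrics named above. The paper's exact definitions are not given in this excerpt, so the conventions below are assumptions: 252 trading periods per year, simple (arithmetic) annualization, a zero risk-free rate by default, and downside deviation measured against that risk-free target. The function name `performance` is hypothetical.

```python
import math
from statistics import mean, stdev

def performance(returns, periods=252, rf=0.0):
    """Summary statistics for a list of per-period simple returns."""
    excess = [r - rf / periods for r in returns]
    ann_return = mean(returns) * periods
    ann_excess = mean(excess) * periods
    vol = stdev(returns) * math.sqrt(periods)
    # downside deviation: only below-target returns contribute
    downside = math.sqrt(mean([min(e, 0.0) ** 2 for e in excess])) * math.sqrt(periods)
    equity, peak, max_dd = 1.0, 1.0, 0.0
    for r in returns:                       # track the worst peak-to-trough loss
        equity *= 1.0 + r
        peak = max(peak, equity)
        max_dd = max(max_dd, 1.0 - equity / peak)
    return {
        "annual_return": ann_return,
        "volatility": vol,
        "downside_volatility": downside,
        "sharpe": ann_excess / vol if vol > 0 else float("nan"),
        "sortino": ann_excess / downside if downside > 0 else float("nan"),
        "max_drawdown": max_dd,
    }

stats = performance([0.1, -0.5, 0.1])
```

Because the Sortino ratio divides by downside volatility only, a strategy whose volatility comes mostly from gains shows a Sortino ratio well above its Sharpe ratio, which is the pattern in the EMD-ARIMA figures quoted above.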

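The conclusions supplement futures investment theory and the application of artificial intelligence methods in finance, and they also have practical value: ① Intertemporal arbitrage is an effective investment strategy. Compared with strategies based on price prediction, such as buy-and-hold, arbitrage strategies carry lower risk, and with an appropriate method their returns may nevertheless be higher. Moreover, a large body of theory and practice shows that commodity futures strategies (arbitrage strategies in particular) have very low or even negative correlation with stock-market strategies; adding intertemporal arbitrage to a stock portfolio can therefore effectively reduce overall portfolio risk, improve investment returns, and protect assets under extreme market risk. ② Machine learning models forecast nonlinear financial time series well, but they are entirely data-driven and have a weak economic foundation; combining them with linear forecasting models, whose economic foundation is sounder, improves predictive power and economic interpretability at the same time. ③ Financial time series are highly complex and noisy, and forecasting with a single model undoubtedly loses much information. Decomposing the series with a signal decomposition model such as EMD and forecasting the extracted trend and fluctuation components separately gives more accurate forecasts and in turn raises the probability that intertemporal arbitrage succeeds.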
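
The two ways of fusing a linear and a nonlinear forecast mentioned in the conclusions (averaging the predicted values, or treating them as joint conditions) can be sketched as below. This is only one reading of that description: the function names, the symmetric threshold, and the sign convention (a predicted spread change above the threshold means long the spread) are assumptions made for illustration.

```python
def threshold_signal(pred, threshold):
    """+1 if the predicted spread change exceeds the threshold, -1 if below
    the negative threshold, 0 otherwise."""
    if pred > threshold:
        return 1
    if pred < -threshold:
        return -1
    return 0

def average_signal(preds, threshold):
    """Fusion by averaging: trade on the mean of the model forecasts."""
    return threshold_signal(sum(preds) / len(preds), threshold)

def joint_signal(preds, threshold):
    """Fusion by joint condition: trade only if every model agrees."""
    signals = {threshold_signal(p, threshold) for p in preds}
    return signals.pop() if len(signals) == 1 else 0

# Hypothetical forecasts from a linear and a nonlinear model.
avg = average_signal([0.5, 0.1], threshold=0.2)
joint = joint_signal([0.5, 0.1], threshold=0.2)
```

With these inputs the averaged forecast clears the threshold and trades, while the joint condition stays flat because only one model clears it. The joint rule therefore trades less often, which is consistent with the lower volatility and drawdown reported for the hybrid models.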

