Ontology Sparse Matrix Learning and Its Application in Similarity Computation

Mei-hui LAN; Quan-run FAN; Wei GAO

doi:10.13718/j.cnki.xdzk.2020.01.017

2020 Volume 42 Issue 1


Mei-hui LAN, Quan-run FAN, Wei GAO. Ontology Sparse Matrix Learning and Its Application in Similarity Computation[J]. Journal of Southwest University Natural Science Edition, 2020, 42(1): 118-123. doi: 10.13718/j.cnki.xdzk.2020.01.017


1. School of Information Engineering, Qujing Normal University, Qujing Yunnan 655011, China
2. School of Information, Yunnan Normal University, Kunming 650500, China


Received Date: 15/07/2017
Available Online: 20/01/2020
MSC: TP391

Abstract

Against the background of big data, ontologies contain more and more concepts, and their structure becomes increasingly complex. The corresponding ontology algorithms must therefore reduce the computational dimension efficiently in order to lower the computational complexity. In this paper, the original ontology sparse vector learning model is extended, and an ontology sparse matrix learning model is proposed to obtain an approximately optimal solution. An iterative algorithm based on matrix derivative computation is designed to compute this solution. Two experiments verify that the new algorithm has higher efficiency in specific ontology applications.
- ontology,
- similarity measure,
- ontology mapping,
- sparse matrix

References

[1] 兰美辉, 任友俊, 徐坚, 等. k-部排序本体相似度计算[J]. 计算机应用, 2012, 32(4): 1094-1096.
[2] 兰美辉, 甘健侯, 任友俊, 等. k-部排序学习算法的可学习性分析[J]. 西南大学学报(自然科学版), 2016, 38(3): 177-183.
[3] 张太华, 顾新建, 何二宝. 产品知识模块本体的评价指标体系[J]. 贵州师范大学学报(自然科学版), 2012, 30(1): 94-99. doi: 10.3969/j.issn.1004-5570.2012.01.021
[4] 张鹏, 王国胤, 陶春梅, 等. 基于本体粗糙集的程序代码相似度度量方法[J]. 重庆邮电大学学报(自然科学版), 2008, 20(6): 737-741.
[5] GAO W, BAIG A Q, ALI H, et al. Margin Based Ontology Sparse Vector Learning Algorithm and Applied in Biology Science[J]. Saudi Journal of Biological Sciences, 2017, 24(1): 132-138. doi: 10.1016/j.sjbs.2016.09.001
[6] GAO W, GUO Y, WANG K Y. Ontology Algorithm Using Singular Value Decomposition and Applied in Multidisciplinary[J]. Cluster Computing-The Journal of Networks Software Tools and Applications, 2016, 19(4): 2201-2210.
[7] GAO W, ZHU L L, WANG K Y. Ranking Based Ontology Scheming Using Eigenpair Computation[J]. Journal of Intelligent & Fuzzy Systems, 2016, 31(4): 2411-2419.
[8] 吴剑章, 朱林立, 高炜. 本体算法中相似度矩阵的学习[J]. 小型微型计算机系统, 2015, 36(4): 773-777. doi: 10.3969/j.issn.1000-1220.2015.04.025
[9] YAN L, LI Y J, YANG X, et al. Gradient Descent Technology for Sparse Vector Learning in Ontology Algorithms[J]. Journal of Discrete Mathematical Sciences & Cryptography, 2016, 19(3): 753-775.
[10] WU J Z, YU X, GAO W. Similarity Matrix Learning for Ontology Application[J]. International Journal of Information Technology and Management, 2016, 15(1): 1-13.
[11] 高炜, 梁立, 徐天伟. 基于正则化瑞利系数的半监督k-部排序学习算法及应用[J]. 西南师范大学学报(自然科学版), 2014, 39(4): 124-128.
[12] 高炜, 朱林立, 梁立. 基于图正则化模型的本体映射算法[J]. 西南大学学报(自然科学版), 2012, 34(3): 118-121.
[13] 朱林立, 戴国洪, 高炜. 成对排序本体学习算法[J]. 西南师范大学学报(自然科学版), 2013, 38(12): 101-106.
[14] 吴剑章, 余晓, 高炜. 基于Mahalanobis矩阵学习的本体算法[J]. 西南大学学报(自然科学版), 2015, 37(2): 117-122.
[15] CRASWELL N, HAWKING D. Overview of the TREC 2003 Web Track[C]//Proceedings of the Twelfth Text Retrieval Conference. Maryland: NIST Special Publication, 2003: 78-92.
[16] GAO W, ZHU L L, GUO Y. Multi-Dividing Infinite Push Ontology Algorithm[J]. Engineering Letters, 2015, 23(3): 132-139.
[17] GAO W, LIANG L, XU T W. New Multi-Dividing Ontology Learning Algorithm Using Special Loss Functions[J]. The Open Cybernetics & Systemics Journal, 2014(8): 259-268.
[18] GAO W, WU J Z, ZHU L L. Ontology Optimization Strategies for Sparse Vector Learning Using Gradient Descent Tricks[J]. Journal of Computational Information Systems, 2015, 11(17): 6393-6402.
[19] GAO W, ZHU L L. Gradient Learning Algorithms for Ontology Computing[J]. Computational Intelligence and Neuroscience, 2014, 2014: 1-12.


Figures(3) / Tables(2)


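As a model for storing, representing and computing with structured data, ontology has received increasing attention from researchers^[1-4]. First, as a structured model, an ontology does not store data as plain records; it stores data in structured form as a graph, whose edges represent the intrinsic relations between data items. Second, it has been shown that graph models, combined with tools from statistics and graph theory, offer certain advantages for processing the information held in an ontology.

Recently, many ontology learning algorithms have emerged that target the special framework and application background of ontologies^[5-14]. Among them, sparse vector learning algorithms have attracted wide attention and have been applied to ontology learning. Ontology algorithms based on sparse vector learning use an ontology sparse vector to extract the effective information from high-dimensional ontology vertex vectors, retain the most valuable information, and map each ontology vertex to a real number.

This paper proposes an extended ontology sparse vector computation method and applies the algorithm to two specific engineering fields to verify its effectiveness.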

1. Description of the Ontology Algorithm Based on Sparse Matrix Learning

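After the ontology graph has been modelled, the information of the concept corresponding to each vertex must be represented by a vector of a uniform dimension. Let $v=(v^{1},\dots,v^{p})$ be the $p$-dimensional vector corresponding to a vertex.

Through the ontology sparse vector, the ontology function can be expressed as

$$f(v)=\sum_{i=1}^{p} v^{i}\beta^{i}+\beta_{0}, \tag{1}$$

where $\beta=(\beta^{1},\dots,\beta^{p})$ is the ontology sparse vector, most of whose components are 0, and $\beta_{0}$ is a term representing the error. The classical learning model for the ontology sparse vector $\beta$ is

$$\min_{\beta}\; l(\beta)+Q(\beta), \tag{2}$$

where $l(\beta)$ is the loss term and $Q(\beta)$ is the term that controls the sparsity of the ontology sparse vector $\beta$.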

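Let $\{(v_i,y_i)\}_{i=1}^{n}\subset\mathbb{R}^{p}\times\mathbb{R}$ be the ontology training sample, where $v_i$ and $y_i$ denote the input and the output, respectively. Let the relation matrix be $W\in\mathbb{R}^{n\times n}$, whose entries $[W]_{i,j}=r_{ij}\ge 0$ represent the semantic relation between the ontology concepts $(v_i,y_i)$ and $(v_j,y_j)$, with $W=W^{\rm T}$ and $r_{11}=r_{22}=\dots=r_{nn}=0$ (all diagonal entries are 0). This paper considers the following computation model, in which the ontology function is obtained from extended sparse vectors:

$$y_i=v_i\beta_i^{\rm T}+e_i,\quad i=1,\dots,n, \tag{3}$$

where $\beta_i=(\beta_i^{1},\dots,\beta_i^{p})\in\mathbb{R}^{p}$ and the offset $e_i\in\mathbb{R}$ follows the $N(0,\sigma^{2})$ distribution. For each $v_i$, the corresponding $y_i$ is computed. Models (3) and (1) differ in the following two respects:

1) In (1), the inner product for every $v_i$ uses the same sparse vector $\beta$, whereas in (3) different $v_i$ correspond to different sparse vectors $\beta_i$;

2) In (1), the error term is the same for every $v_i$, whereas in (3) different $v_i$ correspond to different errors $e_i$. Hence model (3) is a generalization of model (1): when $\beta_1=\beta_2=\dots=\beta_n=\beta$ and $e_1=e_2=\dots=e_n=\beta_0$, (3) reduces to (1).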
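The reduction of model (3) to model (1) can be checked numerically. The sketch below is illustrative only: the function names `predict_shared` and `predict_per_vertex`, the NumPy layout (rows of `V` are the vertex vectors $v_i$, rows of `Omega` are the $\beta_i$) and the toy values are assumptions, not part of the original paper.

```python
import numpy as np

def predict_shared(V, beta, beta0):
    """Model (1): every vertex uses the same sparse vector beta and offset beta0."""
    return V @ beta + beta0

def predict_per_vertex(V, Omega, e):
    """Model (3): vertex i uses its own row Omega[i] and its own offset e[i]."""
    return np.einsum("ij,ij->i", V, Omega) + e

# Toy data (hypothetical): n = 5 vertices with p = 4 features each.
rng = np.random.default_rng(0)
n, p = 5, 4
V = rng.normal(size=(n, p))
beta = np.array([1.5, 0.0, 0.0, -2.0])   # sparse: most components are zero
beta0 = 0.3

# Special case beta_1 = ... = beta_n = beta and e_1 = ... = e_n = beta0.
Omega_same = np.tile(beta, (n, 1))
e_same = np.full(n, beta0)

y_shared = predict_shared(V, beta, beta0)
y_per_vertex = predict_per_vertex(V, Omega_same, e_same)
print(np.allclose(y_shared, y_per_vertex))
```

When the rows of `Omega` differ, `predict_per_vertex` lets each vertex have its own linear rule, which is the extra flexibility that model (3) adds.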

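With model (3), what must be learned is no longer a single sparse vector $\beta$ but a set of sparse vectors $\beta_1,\beta_2,\dots,\beta_n$. Stacking these sparse vectors gives the sparse matrix $\Omega=[\beta_1^{\rm T},\beta_2^{\rm T},\dots,\beta_n^{\rm T}]^{\rm T}\in\mathbb{R}^{n\times p}$, so the learning target changes from an ontology sparse vector to an ontology sparse matrix. The ontology learning algorithm considered in this paper can be expressed as

$$\min_{\Omega}\;\sum_{i=1}^{n}\left(y_i-v_i\beta_i^{\rm T}\right)^{2}+\Lambda(\Omega,W,\lambda_1,\lambda_2), \tag{4}$$

where $\sum\limits_{i=1}^{n}\left(y_i-v_i\beta_i^{\rm T}\right)^{2}$ is the error term and $\Lambda(\Omega,W,\lambda_1,\lambda_2)$ controls the sparsity of the ontology sparse matrix $\Omega$. In ontology engineering, a common choice of $\Lambda(\Omega,W,\lambda_1,\lambda_2)$ is as follows:

where $\lambda_1$ and $\lambda_2$ are balance parameters. When the value of $\|\beta_i-\beta_j\|_2$ is small, $v_i$ and $v_j$ can be regarded as belonging to the same clique of the ontology graph or, from the data point of view, to the same cluster; in this case $r_{ij}>0$. Moreover, it is easy to see that (5) is convex and has a global optimum.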
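To make the role of the two balance parameters concrete, the sketch below evaluates one candidate penalty. It assumes the form $\Lambda=\lambda_1\sum_{i,j=1}^{n} r_{ij}\|\beta_i-\beta_j\|_2+\lambda_2\sum_{i=1}^{n}\|\beta_i\|_1^{2}$, which is consistent with the quantities $r_{ij}$, $\|\beta_i-\beta_j\|_2$ and $\|\beta_i\|_1$ that appear in the text but is not confirmed by it. The function name `penalty` and the toy matrices are hypothetical.

```python
import numpy as np

def penalty(Omega, W, lam1, lam2):
    """Assumed penalty: lam1 * sum_{i,j} r_ij * ||b_i - b_j||_2
                      + lam2 * sum_i ||b_i||_1 ** 2, with b_i = Omega[i]."""
    diffs = Omega[:, None, :] - Omega[None, :, :]        # shape (n, n, p)
    pair_dist = np.linalg.norm(diffs, axis=2)            # ||b_i - b_j||_2
    fused = np.sum(W * pair_dist)                        # coupling of related vertices
    exclusive = np.sum(np.abs(Omega).sum(axis=1) ** 2)   # per-row squared l1 norm
    return lam1 * fused + lam2 * exclusive

# Vertices 0 and 1 are related (r_01 = r_10 = 1); vertex 2 is isolated.
W = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]])

# Related vertices share a sparse vector: the coupling term vanishes.
Omega_shared = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
# Related vertices disagree: the coupling term becomes positive.
Omega_split = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 2.0]])

print(penalty(Omega_shared, W, lam1=1.0, lam2=1.0))
print(penalty(Omega_split, W, lam1=1.0, lam2=1.0))
```

Under this assumed form, $\lambda_1$ pulls the sparse vectors of semantically related concepts together, while $\lambda_2$ encourages sparsity within each row of $\Omega$.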

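Let $V=[v_1^{\rm T},\dots,v_n^{\rm T}]=[u_1,\dots,u_p]^{\rm T}$ be the ontology information matrix, where $u_i\in\mathbb{R}^{n}$. Problem (4) with the penalty (5) can then be written as

$$J(\Omega)=\left\|y-\Xi\,{\rm vec}(\Omega)\right\|_2^{2}+\Lambda(\Omega,W,\lambda_1,\lambda_2), \tag{6}$$

where $y=(y_1,\dots,y_n)^{\rm T}$ is the target vector, $\Xi=[{\rm diag}(u_1)\,|\,{\rm diag}(u_2)\,|\,\dots\,|\,{\rm diag}(u_p)]\in\mathbb{R}^{n\times(pn)}$, ${\rm diag}(u_i)$ denotes the diagonal matrix whose diagonal entries are the corresponding elements of $u_i$, $i\in\{1,\dots,p\}$, and the vectorization operator stacks the columns of $\Omega$, that is, ${\rm vec}(\Omega)=(\beta_1^{1},\dots,\beta_n^{1},\beta_1^{2},\dots,\beta_n^{2},\dots,\beta_1^{p},\dots,\beta_n^{p})^{\rm T}\in\mathbb{R}^{pn}$.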
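The purpose of $\Xi$ is to turn the $n$ separate inner products $v_i\beta_i^{\rm T}$ into one matrix-vector product. The following sketch checks this identity on random data. It assumes a NumPy layout in which `V` has shape `(n, p)` with rows $v_i$ (so $u_k$ is the $k$-th column) and a column-stacking ${\rm vec}$; the helper names `build_Xi` and `vec` are hypothetical.

```python
import numpy as np

def build_Xi(V):
    """Xi = [diag(u_1) | ... | diag(u_p)], where u_k = V[:, k]; shape (n, p*n)."""
    return np.hstack([np.diag(V[:, k]) for k in range(V.shape[1])])

def vec(Omega):
    """Column-stacking vectorisation: first components of all beta_i, then second, ..."""
    return Omega.reshape(-1, order="F")

rng = np.random.default_rng(1)
n, p = 4, 3
V = rng.normal(size=(n, p))       # row i is the vertex vector v_i
Omega = rng.normal(size=(n, p))   # row i is the vector beta_i

Xi = build_Xi(V)
lhs = Xi @ vec(Omega)                     # one matrix-vector product
rhs = np.einsum("ij,ij->i", V, Omega)     # n separate inner products v_i . beta_i
print(Xi.shape, np.allclose(lhs, rhs))
```

Because $\Xi$ is made of diagonal blocks, it is very sparse; a sparse-matrix representation would be the natural choice for large $n$, although the dense form is kept here for readability.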

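Let $I_p\in\mathbb{R}^{p\times p}$ be the identity matrix and let $\otimes$ denote the Kronecker product. The indicator $I_{i,l}\in\{0,1\}$ is defined as follows: $I_{i,l}=1$ if the $l$-th element $[{\rm vec}(\Omega)]_l$ of ${\rm vec}(\Omega)$ is an element of $\beta_i$, and $I_{i,l}=0$ otherwise. The matrix $C$ is defined as:

The diagonal entries of the diagonal matrix $F_e$ are

$$[F_e]_{l,l}=\sum_{i=1}^{n}\frac{I_{i,l}\,\|\beta_i\|_1}{\left[{\rm vec}(|\Omega|)\right]_l}.$$

Let $F_g=I_p\otimes C$. Computing the derivative of $J(\Omega)$ with respect to ${\rm vec}(\Omega)$ gives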

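Since (4) is a convex optimization problem, $\Omega$ is a global optimum of the problem if and only if it satisfies condition (7). However, the matrices $F_g$ and $F_e$ depend on $\Omega$, so they cannot be computed while $\Omega$ is unknown. The ontology problem (4) is therefore solved by optimizing the following objective function:

where $F_g^{(t)}=I_p\otimes C^{(t)}$ is a block-diagonal matrix, $F_g^{(t)}\in\mathbb{R}^{pn\times pn}$.

$F_e^{(t)}\in\mathbb{R}^{pn\times pn}$ is a diagonal matrix whose diagonal entries are $[F_e^{(t)}]_{l,l}=\sum\limits_{i=1}^{n}\frac{I_{i,l}\,\|\beta_i^{(t)}\|_1}{\left[{\rm vec}(|\Omega^{(t)}|)\right]_l}$.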
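The quantities $C$, condition (7) and objective (8) can be written out under one self-consistent assumption, namely that the penalty has the form $\Lambda=\lambda_1\sum_{i,j=1}^{n} r_{ij}\|\beta_i-\beta_j\|_2+\lambda_2\sum_{i=1}^{n}\|\beta_i\|_1^{2}$. What follows is a hedged reconstruction rather than the authors' confirmed notation; it is chosen because it reproduces the $F_e$ weights and the update formula in step 4 of Algorithm A. The symbol $\tilde J^{(t)}$ and the explicit form of $H^{(t)}$ are assumptions introduced for illustration.

$$
\begin{aligned}
[C]_{i,j} &=
\begin{cases}
\sum\limits_{j'\neq i}\dfrac{r_{ij'}}{\|\beta_i-\beta_{j'}\|_2}, & i=j,\\[2ex]
-\dfrac{r_{ij}}{\|\beta_i-\beta_j\|_2}, & i\neq j,
\end{cases}\\[1ex]
\frac{\partial J(\Omega)}{\partial\,{\rm vec}(\Omega)} &= 2\,\Xi^{\rm T}\left(\Xi\,{\rm vec}(\Omega)-y\right)+2\lambda_1 F_g\,{\rm vec}(\Omega)+2\lambda_2 F_e\,{\rm vec}(\Omega)=0,\\[1ex]
\tilde J^{(t)}(\Omega) &= \left\|y-\Xi\,{\rm vec}(\Omega)\right\|_2^{2}+{\rm vec}(\Omega)^{\rm T}H^{(t)}\,{\rm vec}(\Omega),\qquad H^{(t)}=\lambda_1 F_g^{(t)}+\lambda_2 F_e^{(t)}.
\end{aligned}
$$

Under these assumptions, setting the gradient of $\tilde J^{(t)}$ to zero gives the linear system $(\Xi^{\rm T}\Xi+H^{(t)})\,{\rm vec}(\Omega)=\Xi^{\rm T}y$. For invertible $H$, the identity $(\Xi^{\rm T}\Xi+H)^{-1}\Xi^{\rm T}=H^{-1}\Xi^{\rm T}(I_n+\Xi H^{-1}\Xi^{\rm T})^{-1}$ then rewrites the solution as ${\rm vec}(\Omega)=H^{-1}\Xi^{\rm T}(I_n+\Xi H^{-1}\Xi^{\rm T})^{-1}y$, which has the same shape as the update used in Algorithm A.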

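An approximately optimal solution that minimizes (8) can be obtained with an iteratively reweighted least-squares strategy: given $F_g^{(t)}$ and $F_e^{(t)}$, the optimal $\Omega$ is obtained by solving

Let

Then

$${\rm vec}(\Omega^{(t+1)})=(H^{(t)})^{-1}\Xi^{\rm T}\left(I_n+\Xi(H^{(t)})^{-1}\Xi^{\rm T}\right)^{-1}y.$$

Once $\Omega^{(t+1)}$ has been obtained, $F_g^{(t+1)}$ and $F_e^{(t+1)}$ are updated.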
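The alternation between solving for $\Omega$ and refreshing the weight matrices can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes $H^{(t)}=\lambda_1F_g^{(t)}+\lambda_2F_e^{(t)}$, a matrix $C$ equal to the graph Laplacian with edge weights $r_{ij}/\|\beta_i-\beta_j\|_2$, an initialization $F_g^{(0)}=0$, $F_e^{(0)}=I$ (so $\lambda_2>0$ is required), and a small constant `eps` to avoid division by zero. All function and variable names are hypothetical.

```python
import numpy as np

def build_Xi(V):
    """Xi = [diag(u_1) | ... | diag(u_p)] for V of shape (n, p), u_k = V[:, k]."""
    return np.hstack([np.diag(V[:, k]) for k in range(V.shape[1])])

def fit_sparse_matrix(V, y, W, lam1, lam2, n_iter=30, tol=1e-6, eps=1e-6):
    """Iteratively reweighted least squares for the assumed penalty form."""
    n, p = V.shape
    Xi = build_Xi(V)
    F_g = np.zeros((n * p, n * p))   # assumed F_g^(0)
    F_e = np.eye(n * p)              # assumed F_e^(0)
    Omega = np.zeros((n, p))
    for _ in range(n_iter):
        # Solve step: vec(Omega) = H^-1 Xi^T (I_n + Xi H^-1 Xi^T)^-1 y.
        H = lam1 * F_g + lam2 * F_e
        H_inv_XiT = np.linalg.solve(H, Xi.T)
        omega = H_inv_XiT @ np.linalg.solve(np.eye(n) + Xi @ H_inv_XiT, y)
        Omega_new = omega.reshape(n, p, order="F")   # undo column stacking
        converged = np.max(np.abs(Omega_new - Omega)) < tol
        Omega = Omega_new
        if converged:
            break
        # Reweighting step: rebuild C, F_g and F_e from the new Omega.
        dist = np.linalg.norm(Omega[:, None, :] - Omega[None, :, :], axis=2)
        A = W / (dist + eps)                         # diagonal stays 0 because r_ii = 0
        C = np.diag(A.sum(axis=1)) - A
        F_g = np.kron(np.eye(p), C)
        row_l1 = np.abs(Omega).sum(axis=1, keepdims=True)
        F_e = np.diag(((row_l1 + eps) / (np.abs(Omega) + eps)).reshape(-1, order="F"))
    return Omega

# Toy problem (hypothetical): two groups of vertices with different sparse vectors.
rng = np.random.default_rng(0)
n, p = 12, 3
V = rng.normal(size=(n, p))
Omega_true = np.vstack([np.tile([2.0, 0.0, 0.0], (6, 1)),
                        np.tile([0.0, 0.0, -1.5], (6, 1))])
y = np.einsum("ij,ij->i", V, Omega_true) + 0.01 * rng.normal(size=n)
W = np.zeros((n, n))
W[:6, :6] = 1.0
W[6:, 6:] = 1.0
np.fill_diagonal(W, 0.0)

Omega_hat = fit_sparse_matrix(V, y, W, lam1=0.1, lam2=0.05)
residual = y - np.einsum("ij,ij->i", V, Omega_hat)
print(Omega_hat.shape, np.linalg.norm(residual), np.linalg.norm(y))
```

The solve step only inverts an $n\times n$ matrix after applying $H^{-1}$, which is the practical appeal of the update form; for large ontologies a sparse solver would likely be preferable to the dense `np.linalg.solve` used here.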

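The whole ontology learning algorithm can be summarized as follows.

Algorithm A: Ontology similarity computation and ontology mapping based on ontology sparse matrix learning

Step 1: Input the ontology graph (for ontology mapping, input the multi-ontology graph) and represent all the semantic information of the vertex corresponding to each ontology concept by a $p$-dimensional vector.

Step 2: Determine the ontology sample set $\{(v_i,y_i)\}_{i=1}^{n}$ and thereby obtain the ontology information matrix $V$, the matrix $\Xi$, the target vector $y$, the relation matrix $W$ and the two balance parameters $\lambda_1$ and $\lambda_2$.

Step 3: Initialize the counter $t=0$ and set the values of $F_g^{(0)}$ and $F_e^{(0)}$.

Step 4: Repeat the following iteration until convergence:

compute ${\rm vec}(\Omega^{(t+1)})=(H^{(t)})^{-1}\Xi^{\rm T}\left(I_n+\Xi(H^{(t)})^{-1}\Xi^{\rm T}\right)^{-1}y$;

update $F_g^{(t+1)}=I_p\otimes C^{(t+1)}$ and ${\left[F_e^{(t+1)}\right]_{l,l}}=\sum\limits_{i=1}^{n}\frac{I_{i,l}\,\|\beta_i^{(t+1)}\|_1}{\left[{\rm vec}\left(|\Omega^{(t+1)}|\right)\right]_l}$;

set $t=t+1$.

Step 5: Output the ontology sparse matrix $\Omega$ and, according to (3), compute the real number corresponding to each vertex of the ontology graph.

Step 6: Judge the similarity between the ontology concepts of two vertices by the one-dimensional distance between their real numbers: the smaller the distance, the higher the similarity; the larger the distance, the lower the similarity. For ontology mapping, only the similarities between concepts from different ontologies are computed.

Step 7: Choose a suitable strategy to assign each ontology concept a list of highly similar concepts and return it to the user.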
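Steps 5 to 7 can be illustrated with a small ranking routine. The sketch assumes that the real number of each vertex has already been computed, that "a suitable strategy" means returning the `k` nearest vertices on the real line, and that ontology membership is given as an integer label per vertex. The function name `top_k_similar` and the sample scores are hypothetical.

```python
import numpy as np

def top_k_similar(scores, k, ontology_ids=None):
    """For each vertex, list the k vertices whose scores are closest (step 6).

    If ontology_ids is given, only vertices from a different ontology are
    considered, which corresponds to the ontology mapping setting.
    """
    scores = np.asarray(scores, dtype=float)
    dist = np.abs(scores[:, None] - scores[None, :])   # one-dimensional distances
    np.fill_diagonal(dist, np.inf)                     # a vertex is not its own match
    if ontology_ids is not None:
        ids = np.asarray(ontology_ids)
        dist[ids[:, None] == ids[None, :]] = np.inf    # mask same-ontology pairs
    order = np.argsort(dist, axis=1, kind="stable")[:, :k]
    return [[int(j) for j in row if np.isfinite(dist[i, j])]
            for i, row in enumerate(order)]

# Hypothetical real numbers produced for five vertices in step 5.
scores = [0.10, 0.12, 0.90, 0.95, 0.55]

# Similarity computation inside a single ontology.
print(top_k_similar(scores, k=2))

# Ontology mapping: vertices 0-1 belong to ontology 0, vertices 2-4 to ontology 1.
print(top_k_similar(scores, k=1, ontology_ids=[0, 0, 1, 1, 1]))
```

A distance threshold could replace the fixed `k` when the number of similar concepts should vary from vertex to vertex; the text leaves this choice open.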

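3. Conclusion

This paper studies the application of ontology sparse vector learning to ontology similarity computation and ontology mapping. Unlike previous work, each concept is given its own ontology sparse vector, so the whole algorithm reduces to learning an ontology sparse matrix. Starting from matrix derivative computation, an iterative strategy is obtained for computing an approximate solution of the optimization model.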
