Learning Knowledge Embeddings by Combining Limit-based Scoring Loss: Reading Notes

The most important part of this paper is the authors' improvement of the margin-based ranking loss, together with the experiments on the two hyperparameters $\lambda$ and $\gamma$; the experimental results leave a lot worth analyzing and thinking about.

Paper download link

Problem Statement

The margin-based ranking loss function cannot ensure that the scores of correct triplets are low enough to satisfy the translation assumption $\mathbf{h}+\mathbf{r} \approx \mathbf{t}$.

Research Objective

reduce the scores of correct triplets so that they fulfill the translation assumption, by amending the margin-based ranking loss function

Contributions

  • proposing a limit-based scoring loss term to be combined with the margin-based ranking loss
  • extending TransE and TransH to TransE-RS and TransH-RS

Model

Margin-based Ranking Loss

formula:

$$L_{R}=\sum_{(h, r, t) \in \Delta} \sum_{\left(h^{\prime}, r^{\prime}, t^{\prime}\right) \in \Delta^{\prime}}\left[\gamma_{1}+f_{r}(h, t)-f_{r^{\prime}}\left(h^{\prime}, t^{\prime}\right)\right]_{+}$$

where $[x]_{+}=\max (0, x)$, $\Delta$ is the set of golden triplets, and $\Delta^{\prime}$ is the set of corrupted triplets.

  • The margin-based ranking loss function aims to make the score $f_{r^{\prime}}\left(h^{\prime}, t^{\prime}\right)$ of a corrupted triplet higher than that of the positive triplet by at least $\gamma_{1}$.
  • It cannot guarantee $f_{r}(h, t)<\varepsilon$ for a small $\varepsilon$: only the relative gap is constrained, so the scores of positive triplets may still be large (see the sketch below).
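To make this concrete, here is a minimal PyTorch-style sketch of the margin-based ranking loss with the TransE score $f_r(h,t)=\|\mathbf{h}+\mathbf{r}-\mathbf{t}\|_{1}$; the function and tensor names are illustrative, not the authors' code.

```python
import torch

def transe_score(h, r, t):
    # f_r(h, t) = ||h + r - t||_1; lower scores mean more plausible triplets
    return torch.norm(h + r - t, p=1, dim=-1)

def margin_ranking_loss(pos_score, neg_score, gamma1=1.0):
    # L_R = sum over pairs of [gamma1 + f_r(h, t) - f_r'(h', t')]_+
    # This only enforces a relative gap; positive scores may stay large.
    return torch.clamp(gamma1 + pos_score - neg_score, min=0).sum()
```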

Limit-based Scoring Loss

formula:

$$L_{S}=\sum_{(h, r, t) \in \Delta}\left[f_{r}(h, t)-\gamma_{2}\right]_{+}$$

which directly pushes the scores of positive triplets below the limit $\gamma_{2}$.

Final Loss

formula:

$$L_{RS}=L_{R}+\lambda L_{S}$$

In detail (written per positive/corrupted pair):

$$L_{RS}=\sum_{(h, r, t) \in \Delta} \sum_{\left(h^{\prime}, r^{\prime}, t^{\prime}\right) \in \Delta^{\prime}}\left(\left[\gamma_{1}+f_{r}(h, t)-f_{r^{\prime}}\left(h^{\prime}, t^{\prime}\right)\right]_{+}+\lambda\left[f_{r}(h, t)-\gamma_{2}\right]_{+}\right)$$

where the hyperparameter $\lambda$ balances the margin-based ranking term and the limit-based scoring term.
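A minimal sketch of the combined objective under the same assumptions as above (one corrupted triplet per positive; all names are illustrative):

```python
import torch

def limit_scoring_loss(pos_score, gamma2=3.0):
    # L_S = sum of [f_r(h, t) - gamma2]_+ ; pushes positive scores below gamma2
    return torch.clamp(pos_score - gamma2, min=0).sum()

def combined_loss(pos_score, neg_score, gamma1=1.0, gamma2=3.0, lam=1.0):
    # L_RS = L_R + lambda * L_S
    l_r = torch.clamp(gamma1 + pos_score - neg_score, min=0).sum()
    return l_r + lam * limit_scoring_loss(pos_score, gamma2)
```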

Experiments

Dataset

[table: statistics of the datasets]

Thoughts

The authors merely state the numbers in the tables; several questions are left unanalyzed and unexplained:

  • They do not analyze, for example, why TransE with the improved loss outperforms TransH (and TransR, TransD).
  • Why is the performance on n-to-1 relations not the best, while it is the best in all the other categories?
  • With this improvement, TransH shows no significant gain over TransE; what is the reason?

Triple Classification

  • TransE-RS and TransH-RS have the same parameter and operation complexities as TransE and TransH, which are lower than those of TransR and TransD.
  • Our models initialize the entities randomly, instead of using the embeddings learned by TransE as TransR and TransD do.
    • This means our models are much better at overcoming the problem of overfitting.

Distributions of Triplets’ Scores

Aim

analyze the difference between the $L_{R}$ loss and the $L_{RS}$ loss
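One possible way to reproduce this kind of analysis, assuming positive-triplet scores have already been collected from a model trained with each loss (matplotlib, illustrative names):

```python
import matplotlib.pyplot as plt

def plot_score_distributions(scores_lr, scores_lrs, bins=50):
    # Overlay histograms of positive-triplet scores from a model trained with
    # L_R versus one trained with L_RS; mass shifted toward lower scores
    # indicates the limit-based term is doing its job.
    plt.hist(scores_lr, bins=bins, alpha=0.5, label="trained with L_R")
    plt.hist(scores_lrs, bins=bins, alpha=0.5, label="trained with L_RS")
    plt.xlabel("triplet score f_r(h, t)")
    plt.ylabel("count")
    plt.legend()
    plt.show()
```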

Parameters

[table: parameter settings]

Thoughts:

  • For my own ongoing experiments: is the margin I am using too small?

Result

[figures: distributions of triplets' scores under the two losses]

Thoughts

  • This part of the experiments is worth borrowing: it shows fairly intuitively why the method works better.
  • For example, on the question above of why the improved TransE performs better:
    • In the final score distributions, TransE-RS is very close to TransH;
    • TransE is a simpler model, so minimizing the loss may already let it express the data fully, while the extra assumptions introduced by the other models may bring more noise;
    • It may also be that once the loss is very small, the other assumptions contribute little (at least the experimental results suggest so, though further experiments are needed to verify this).

Discussion of Parameters

Discussion on γ1 and γ2

[table: results under different γ1 and γ2 settings]

  • We find that γ2 = 3γ1 or γ2 = 4γ1 works better for link prediction, but for triple classification there is no obvious pattern in γ1 and γ2.
  • A lower γ2 is expected to enforce the golden condition $\mathbf{h}+\mathbf{r} \approx \mathbf{t}$ for positive triplets, but an entity needs to satisfy many golden conditions at the same time (a sweep skeleton follows after this list).
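A hypothetical sweep skeleton reflecting that observation, with γ2 tied to 3-4 × γ1; `train_and_eval` is a placeholder callback, not an API from the paper:

```python
import itertools

def sweep_gammas(train_and_eval):
    # Try gamma1 values with gamma2 fixed at 3x or 4x gamma1, as the paper's
    # link prediction results suggest; train_and_eval is a hypothetical
    # callback that trains a model and returns its evaluation metric.
    results = {}
    for gamma1, ratio in itertools.product([0.5, 1.0, 2.0], [3, 4]):
        gamma2 = ratio * gamma1
        results[(gamma1, gamma2)] = train_and_eval(gamma1=gamma1, gamma2=gamma2)
    return results
```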

Thoughts

  • If what the authors say holds, then in theory TransH should perform very well, yet the results say otherwise, which is again a contradiction.

Discussion on λ

[table: results under different λ settings]

Thoughts

  • λ does not seem to have much influence on the model.
  • λ around 1 gives better results.
  • Could λ interact with the margin?
