Table of Links

Abstract and 1. Introduction
2. Related Work
3. Proposed Dataset
4. SymTax Model
4.1 Prefetcher
4.2 Enricher
4.3 Reranker
5. Experiments and Results
6. Analysis
6.1 Ablation Study
6.2 Quantitative Analysis and 6.3 Qualitative Analysis
7. Conclusion
Limitations
Ethics Statement and References
Appendix

4.3 Reranker

Taxonomy Fusion. The inclusion of taxonomy fusion is an important and deliberate design choice. Intuitively, a flat taxonomy (arXiv concepts) lacks the rich semantic structure of a hierarchical taxonomy such as ACM, whose classes are related through generalisation, specialisation and containment. Mapping the flat concepts into the hierarchical taxonomy infuses structure into the flat taxonomy. It also enriches the hierarchical taxonomy, as it gains equivalent concepts from the flat taxonomy.

Each article in our proposed dataset ArSyTa includes a category feature that represents the arXiv taxonomy[7] class it belongs to. Since ArSyTa contains papers from the CS domain, its arXiv taxonomy is flat; e.g., cs.LG and cs.CV represent the Machine Learning and Computer Vision classes, respectively. We now propose the fusion of the flat arXiv taxonomy with the ACM tree taxonomy[8] to obtain rich feature representations for the category classes. We mainly utilise the subject class mapping information mentioned in the arXiv taxonomy, together with domain knowledge, to create a class taxonomy mapping from arXiv to ACM; e.g., cs.CV is mapped to ACM classes I.2.10, I.4 and I.5 (as shown in Fig. 2). We also release the mapping config file in the data release phase.

We employ two fusion strategies, namely vector-based and graph-based. In vector-based fusion, the classes are passed through the LM and their conical vector is obtained by averaging the class vectors in feature space. In graph-based fusion, we first form a graph by injecting arXiv classes into the ACM tree and creating directed edges between them. We initialise node embeddings using the LM and run a Graph Neural Network (GNN) algorithm to learn fused representations. We consider GAT (Velickovic et al., 2018) and APPNP (Gasteiger et al., 2019) as GNN algorithms and observe that they perform the same. The final representations of the cs.{} nodes are the learnt fused representations. Empirically, the fusion of concepts yields significant performance gains (as shown in Table 3).
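The two fusion strategies described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. Only the cs.CV mapping comes from the text. The function `toy_embed` is a hypothetical stand-in for the LM. Including the arXiv class itself in the average, pointing edges from ACM classes into the arXiv class, and omitting the ACM tree's own parent-child edges are all assumptions made for brevity. The propagation step is only loosely modelled on APPNP-style updates, with plain mean aggregation in place of a normalised adjacency and no learned weights.

```python
from typing import Callable, Dict, List, Tuple

Vector = List[float]

# arXiv -> ACM class mapping; the cs.CV entry is the example given in the text.
ARXIV_TO_ACM: Dict[str, List[str]] = {
    "cs.CV": ["I.2.10", "I.4", "I.5"],
}


def toy_embed(label: str, dim: int = 4) -> Vector:
    """Hypothetical stand-in for the LM: a deterministic toy vector per label."""
    codes = [ord(ch) for ch in label]
    return [sum(codes[i::dim]) / 100.0 for i in range(dim)]


def mean_vector(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def vector_fusion(
    arxiv_class: str,
    mapping: Dict[str, List[str]],
    embed: Callable[[str], Vector] = toy_embed,
) -> Vector:
    """Vector-based fusion: average the class vectors in feature space.

    Assumption: the arXiv class is averaged together with its mapped ACM classes.
    """
    members = [arxiv_class] + mapping[arxiv_class]
    return mean_vector([embed(m) for m in members])


def build_edges(mapping: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Directed (source, target) edges from each mapped ACM class to its arXiv class.

    Assumption: the edge direction is ACM -> arXiv; ACM tree edges are omitted.
    """
    return [(acm, arxiv) for arxiv, acms in mapping.items() for acm in acms]


def propagate(
    init: Dict[str, Vector],
    edges: List[Tuple[str, str]],
    alpha: float = 0.1,
    steps: int = 2,
) -> Dict[str, Vector]:
    """APPNP-like update: h <- (1 - alpha) * mean(in-neighbours + self) + alpha * h0."""
    h = {node: list(vec) for node, vec in init.items()}
    for _ in range(steps):
        nxt = {}
        for node, h0 in init.items():
            incoming = [h[src] for src, dst in edges if dst == node] + [h[node]]
            agg = mean_vector(incoming)
            nxt[node] = [(1 - alpha) * a + alpha * z for a, z in zip(agg, h0)]
        h = nxt
    return h


# Usage: fuse cs.CV with both strategies.
fused_vector = vector_fusion("cs.CV", ARXIV_TO_ACM)
nodes = ["cs.CV"] + ARXIV_TO_ACM["cs.CV"]
initial = {node: toy_embed(node) for node in nodes}
fused_graph = propagate(initial, build_edges(ARXIV_TO_ACM))
```

In this sketch, `fused_graph["cs.CV"]` plays the role of the final cs.{} node representation, while the ACM nodes keep their initial embeddings because nothing points into them. A real pipeline would replace `toy_embed` with LM embeddings and `propagate` with a trained GAT or APPNP layer.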

Authors:

(1) Karan Goyal, IIIT Delhi, India (karang@iiitd.ac.in);

(2) Mayank Goel, NSUT Delhi, India (mayank.co19@nsut.ac.in);

(3) Vikram Goyal, IIIT Delhi, India (vikram@iiitd.ac.in);

(4) Mukesh Mohania, IIIT Delhi, India (mukesh@iiitd.ac.in).

This paper is available on arXiv under the CC BY-SA 4.0 Deed (Attribution-ShareAlike 4.0 International) license.

[7] https://arxiv.org/category_taxonomy

[8] https://tinyurl.com/22t2b43v
