
Quantitative Analysis and Qualitative Analysis


Table of Links

Abstract and 1. Introduction

2. Related Work

3. Proposed Dataset

4. SymTax Model

    4.1 Prefetcher

    4.2 Enricher

    4.3 Reranker

5. Experiments and Results

6. Analysis

    6.1 Ablation Study

    6.2 Quantitative Analysis and 6.3 Qualitative Analysis

7. Conclusion

8. Limitations

9. Ethics Statement and References

Appendix

6.2 Quantitative Analysis

We consider two available LMs, i.e. SciBERT and SPECTER, and two types of taxonomy fusion, i.e. graph-based and vector-based. This yields four variants, as shown in Table 4. As the results show, SciBERT_vector and SPECTER_graph are the best-performing variants, so the combined choice of LM and taxonomy fusion plays a vital role in model performance. These observations can be attributed to SciBERT being an LM trained on plain scientific text, whereas SPECTER is an LM trained with a triplet loss that uses 1-hop neighbours of the positive sample in the citation graph as hard negatives. SPECTER therefore already embodies graph information, whereas SciBERT does not.
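To make that contrast concrete, the sketch below illustrates, under our reading of the description above and not as SPECTER's actual training code, how a triplet loss whose hard negatives are 1-hop citation-graph neighbours of the positive paper pushes graph structure into the learned embeddings. The names `citation_graph` and `sample_hard_negative` are hypothetical placeholders.

```python
# Illustrative sketch only -- not SPECTER's released training code.
# `citation_graph` is assumed to be a dict: paper id -> list of cited paper ids.
import random
import torch
import torch.nn.functional as F

def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss on L2 distances between embeddings."""
    d_pos = F.pairwise_distance(anchor, positive)  # query <-> cited paper
    d_neg = F.pairwise_distance(anchor, negative)  # query <-> hard negative
    return torch.clamp(d_pos - d_neg + margin, min=0.0).mean()

def sample_hard_negative(citation_graph, query_id, positive_id):
    """Pick a 1-hop neighbour of the positive paper that the query does not
    cite: topologically close to a true citation, hence a 'hard' negative."""
    candidates = (set(citation_graph.get(positive_id, []))
                  - set(citation_graph.get(query_id, []))
                  - {query_id})
    return random.choice(sorted(candidates)) if candidates else None
```

Because the negatives are drawn from the positive paper's graph neighbourhood, minimising this loss forces the encoder to separate papers that plain-text similarity alone would conflate, which is the graph signal SciBERT lacks.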

6.3 Qualitative Analysis

We assess the quality of recommendations produced by different algorithms on a randomly chosen example. Although chosen at random, we require the example to contain multiple citations in a single context so that the qualitative analysis can meaningfully examine the top-10 ranked predictions. As shown in Table 5, we consider an excerpt from Liu et al. (2020) that contains five citations. SymTax correctly recommends three of these citations in its top-10, whereas HAtten recommends only one correct citation (at rank 1) and BM25 suggests only one correct citation (at rank 10). The use of the title is crucial to performance: many recommendations contain the words “BERT” and “Pretraining”, which are keywords present in the title. Another observation is that the taxonomy plays a vital role in the recommendations. The taxonomy category of the query is “Computation and Language”, and most of the recommended articles belong to the same category. SymTax gives only one recommendation from a different category (Deep Residual Learning for Image Recognition, from “Computer Vision”), whereas HAtten recommends three citations from different categories: Deep Residual Learning for Image Recognition from “Computer Vision”, and Batch Normalization and Adam from “Machine Learning”.


Table 5: Top-10 citation recommendations given by various algorithms for a randomly chosen example from ArSyTa. Valid predictions are highlighted in bold. SymTax (SciBERT_vector) recommends three valid articles in the top-10, whereas HAtten and BM25 each recommend only one valid article for the given citation context. # denotes the rank of the recommended citation.
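For readers who want to reproduce the "correct in top-10" counts discussed above, the snippet below is a minimal, hypothetical helper (not part of SymTax) that checks which of a model's ranked recommendations appear among the papers actually cited in the query context; `recommendations` and `ground_truth` are placeholder inputs.

```python
# Minimal sketch, assuming `recommendations` is a ranked list of titles and
# `ground_truth` is the set of titles actually cited in the query context.
def hits_in_top_k(recommendations, ground_truth, k=10):
    """Return (rank, title) pairs for recommended items that are truly cited."""
    truth = {title.lower() for title in ground_truth}
    return [(rank, title)
            for rank, title in enumerate(recommendations[:k], start=1)
            if title.lower() in truth]

# For the excerpt above, SymTax would yield three hits in its top-10,
# while HAtten and BM25 would each yield a single hit.
```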


Authors:

(1) Karan Goyal, IIIT Delhi, India ([email protected]);

(2) Mayank Goel, NSUT Delhi, India ([email protected]);

(3) Vikram Goyal, IIIT Delhi, India ([email protected]);

(4) Mukesh Mohania, IIIT Delhi, India ([email protected]).


This paper is available on arxiv under the CC BY-SA 4.0 Deed (Attribution-ShareAlike 4.0 International) license.

