Table of Links

Abstract and 1. Introduction

2. Related work

3. HypNF Model

3.1 HypNF Model

3.2 The S1/H2 model

3.3 Assigning labels to nodes

4. HypNF benchmarking framework

5. Experiments

5.1 Parameter Space

5.2 Machine learning models

6. Results

7. Conclusion, Acknowledgments and Disclosure of Funding, and References

A. Empirical validation of HypNF

B. Degree distribution and clustering control in HypNF

C. Hyperparameters of the machine learning models

D. Fluctuations in the performance of machine learning models

E. Homophily in the synthetic networks

F. Exploring the parameters’ space

2 Related work

With the continuous evolution of graph machine learning, there is a growing need to understand and evaluate the performance of GNN architectures. In this respect, benchmarking can provide a fair and standardized way to compare different models. The Open Graph Benchmark (OGB) [15] is a versatile tool for assessing the performance of GNNs. Yet, because it relies on a limited range of real networks, it does not encompass all network characteristics and offers limited scope for parameter manipulation. This motivates benchmarking tools based on synthetic data, which allow GNNs to be assessed in a controlled environment and across a more extensive array of network properties [34, 24, 23]. One such tool is GraphWorld [27], a synthetic network generator that uses the Stochastic Block Model (SBM) to generate graphs with communities. It employs a parametrized community distribution and an edge probability matrix to randomly assign nodes to clusters and establish connections. Node features are generated using within-cluster multivariate normal distributions. However, the fixed edge probability matrix of the SBM prevents GraphWorld from faithfully replicating a predefined degree sequence and from generating graphs with true power-law degree distributions. To overcome this limitation, Ref. [38] integrates GraphWorld with two other generators: Lancichinetti-Fortunato-Radicchi (LFR) [20] and CABAM (Class-Assortative graphs via the Barabási-Albert Model) [32]. This integration broadens the coverage of the graph space, specifically for the NC task. In this paper, we propose an alternative network generator: a framework based on the geometric soft configuration model. This model’s underlying geometry straightforwardly couples the network topology with node features and labels. This capability enables independent control over the clustering coefficient in both the unipartite network of nodes and the bipartite network of nodes and features, irrespective of the degree distributions of nodes and features (see Fig. 6 in Appendix B). Table 1 presents a comparison between HypNF and several state-of-the-art benchmarking frameworks, highlighting the properties each can control.
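To make the SBM-style generation described above for GraphWorld more concrete, the following is a minimal sketch, not GraphWorld's actual implementation. The function name `sbm_with_features` and its parameters (`community_probs`, `edge_probs`, `center_scale`) are hypothetical, and the features use an identity covariance around randomly drawn cluster centers as a simplifying assumption.

```python
import numpy as np


def sbm_with_features(n_nodes, community_probs, edge_probs, feature_dim,
                      center_scale=3.0, seed=0):
    """Illustrative SBM sampler with Gaussian node features (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n_blocks = len(community_probs)

    # 1. Assign each node to a community using the community distribution.
    labels = rng.choice(n_blocks, size=n_nodes, p=community_probs)

    # 2. Look up the connection probability of every node pair in the
    #    block-level edge probability matrix.
    pair_probs = np.asarray(edge_probs)[labels[:, None], labels[None, :]]

    # 3. Sample each undirected edge once (strict upper triangle), then mirror.
    upper = np.triu(rng.random((n_nodes, n_nodes)) < pair_probs, k=1)
    adjacency = (upper | upper.T).astype(int)

    # 4. Draw features from a normal distribution centered on the node's
    #    cluster center (identity covariance is an assumption of this sketch).
    centers = rng.normal(scale=center_scale, size=(n_blocks, feature_dim))
    features = centers[labels] + rng.normal(size=(n_nodes, feature_dim))

    return adjacency, labels, features


adjacency, labels, features = sbm_with_features(
    n_nodes=200,
    community_probs=[0.5, 0.5],
    edge_probs=[[0.20, 0.01], [0.01, 0.20]],
    feature_dim=4,
)
degrees = adjacency.sum(axis=1)
print(degrees.min(), degrees.mean(), degrees.max())
```

In this sketch, the connection probability of a pair depends only on the blocks of its two endpoints, so all nodes in the same block have the same expected degree. This illustrates why a fixed edge probability matrix cannot by itself reproduce a prescribed heterogeneous degree sequence.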

Authors:

(1) Roya Aliakbarisani, Universitat de Barcelona & UBICS (roya_aliakbarisani@ub.edu); this author contributed equally;

(2) Robert Jankowski, Universitat de Barcelona & UBICS (robert.jankowski@ub.edu); this author contributed equally;

(3) M. Ángeles Serrano, Universitat de Barcelona, UBICS & ICREA (marian.serrano@ub.edu);

(4) Marián Boguñá, Universitat de Barcelona & UBICS (marian.boguna@ub.edu).

This paper is available on arXiv under the CC BY 4.0 Deed (Attribution 4.0 International) license.
