Title:
Seeing the Forest for the Trees
Abstract:
Recent papers in the graph machine learning literature have introduced a number of approaches for hyperbolic representation learning. The asserted benefits are improved performance on a variety of graph tasks, including node classification and link prediction. Claims have also been made about the geometric suitability of particular hierarchical graph datasets to representation in hyperbolic space. Despite these claims, our work makes a surprising discovery: when simple Euclidean models with comparable numbers of parameters are properly trained in the same environment, in most cases they perform as well as, if not better than, all introduced hyperbolic graph representation learning models, even on graph datasets previously claimed to be the most hyperbolic as measured by Gromov delta-hyperbolicity (i.e., perfect trees).
This observation gives rise to a simple question: how can this be? We answer this question by taking a careful look at the field of hyperbolic graph representation learning as it stands today, and find that a number of results do not diligently present baselines, make faulty modelling assumptions when constructing algorithms, and use misleading metrics to quantify the geometry of graph datasets. We take a closer look at each of these three problems, elucidate the issues, perform an analysis of methods, and introduce a parametric family of benchmark datasets to ascertain the applicability of (hyperbolic) graph neural networks.
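To make the geometric measurement concrete: Gromov delta-hyperbolicity quantifies how far a graph's shortest-path metric is from a tree metric via the four-point condition, with delta = 0 for perfect trees. The sketch below is an illustration we supply here, not code from the talk or the underlying paper; the function name, sampling scheme, and parameters are our own assumptions.

```python
# Illustrative sketch (not from the talk): estimate Gromov delta-hyperbolicity
# by applying the four-point condition to randomly sampled node quadruples.
# A perfect tree has delta = 0; larger delta means the graph is less tree-like.
import random

import networkx as nx


def sampled_delta_hyperbolicity(G, num_samples=5000, seed=0):
    """Estimate delta by maximizing over sampled 4-node subsets of a connected graph."""
    rng = random.Random(seed)
    nodes = list(G.nodes())
    # All-pairs shortest paths; fine for small graphs, too costly for very large ones.
    dist = dict(nx.all_pairs_shortest_path_length(G))
    delta = 0.0
    for _ in range(num_samples):
        x, y, u, v = rng.sample(nodes, 4)
        # The three ways of pairing up the four points.
        sums = sorted([
            dist[x][y] + dist[u][v],
            dist[x][u] + dist[y][v],
            dist[x][v] + dist[y][u],
        ])
        # Four-point condition: delta is half the gap between the two largest sums.
        delta = max(delta, (sums[2] - sums[1]) / 2.0)
    return delta


if __name__ == "__main__":
    tree = nx.balanced_tree(r=2, h=6)  # a perfect binary tree: delta should be 0
    print(sampled_delta_hyperbolicity(tree))
```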
Unfortunately, these problems are not specific to hyperbolic graph neural networks; they are indicative of more fundamental issues in graph machine learning generally. Graph learning methods rest on the assumption that the graph structure carries information beyond what the node features provide. We show, surprisingly, that node features alone are often more than sufficient for many common graph benchmarks, breaking this critical assumption. When compared against a well-tuned feature-only MLP baseline on seven of the most commonly used graph learning datasets, using the graph structure yields little benefit on five of them. We posit that these datasets do not benefit considerably from graph learning because the features themselves already contain enough graph information to obviate or substantially reduce the need for the graph.
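For intuition, a "feature-only MLP baseline" is a model that classifies nodes from the node feature matrix alone and never touches the edge set. The PyTorch sketch below is a minimal example under our own assumptions (layer sizes, hyperparameters, training loop), not the speaker's exact baseline or tuning.

```python
# Minimal sketch of a feature-only MLP baseline for node classification:
# the model sees only the feature matrix x and labels y, and ignores the graph.
import torch
import torch.nn as nn
import torch.nn.functional as F


class FeatureOnlyMLP(nn.Module):
    def __init__(self, in_dim, hidden_dim, num_classes, dropout=0.5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, x):
        return self.net(x)


def train_feature_only_baseline(x, y, train_mask, epochs=200, lr=1e-2, weight_decay=5e-4):
    # x: [num_nodes, in_dim] float features; y: [num_nodes] integer class labels.
    model = FeatureOnlyMLP(x.size(1), 64, int(y.max()) + 1)
    opt = torch.optim.Adam(model.parameters(), lr=lr, weight_decay=weight_decay)
    for _ in range(epochs):
        model.train()
        opt.zero_grad()
        loss = F.cross_entropy(model(x)[train_mask], y[train_mask])
        loss.backward()
        opt.step()
    return model
```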
Bio:
Anna Gilbert received her S.B. degree from the University of Chicago and a Ph.D. from Princeton University, both in Mathematics. In 1997, she was a postdoctoral fellow at Yale University and AT&T Labs-Research. From 1998 to 2004, she was a member of technical staff at AT&T Labs-Research in Florham Park, NJ. From 2004 to 2020, she was with the Department of Mathematics (with a secondary appointment in Electrical and Computer Engineering) at the University of Michigan, where she eventually became the Herman H. Goldstine Collegiate Professor. In 2020, she moved to Yale University as the John C. Malone Professor of Mathematics and Professor of Statistics & Data Science. In 2023, she left the Mathematics Department and is now in Statistics & Data Science. Dr. Gilbert has received several awards, including a Sloan Research Fellowship (2006), an NSF CAREER award (2006), the National Academy of Sciences Award for Initiatives in Research (2008), the Association for Computing Machinery (ACM) Douglas Engelbart Best Paper award (2008), the EURASIP Signal Processing Best Paper award (2010), and the SIAM Ralph E. Kleinman Prize (2013).
Dr. Gilbert's research interests include analysis, probability, discrete mathematics, and algorithms. She is especially interested in randomized algorithms with applications to harmonic analysis, signal and image processing, and massive datasets.
Website: https://annacgilbert.github.io/cv/