HOW POWERFUL ARE GRAPH NEURAL NETWORKS?

ABSTRACT

Graph Neural Networks are an effective framework for representation learning of graphs. Graph Neural Networks follow a neighborhood aggregation scheme, where the representation vector of a node is computed by recursively aggregating and transforming representation vectors of its neighboring nodes. Many Graph Neural Network variants have been proposed and have achieved state-of-the-art results on both node and graph classification tasks. However, despite Graph Neural Networks revolutionizing graph representation learning, there is limited understanding of their representational properties and limitations. Here, we present a theoretical framework for analyzing the expressive power of Graph Neural Networks to capture different graph structures. Our results characterize the discriminative power of popular Graph Neural Network variants, such as Graph Convolutional Networks and GraphSAGE, and show that they cannot learn to distinguish certain simple graph structures. We then develop a simple architecture that is provably the most expressive among the class of Graph Neural Networks and is as powerful as the Weisfeiler-Lehman graph isomorphism test. We empirically validate our theoretical findings on a number of graph classification benchmarks, and demonstrate that our model achieves state-of-the-art performance.

1 INTRODUCTION

Learning with graph-structured data, such as molecules and social, biological, and financial networks, requires effective representation of the graph structure. Recently, there has been a surge of interest in Graph Neural Network approaches for representation learning of graphs. Graph Neural Networks broadly follow a recursive neighborhood aggregation (or message passing) scheme, where each node aggregates feature vectors of its neighbors to compute its new feature vector. After k iterations of aggregation, a node is represented by its transformed feature vector, which captures the structural information within the node's k-hop neighborhood. The representation of an entire graph can then be obtained through pooling, for example, by summing the representation vectors of all nodes in the graph.
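The aggregation scheme described above can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions: the adjacency dictionary, the helper names `aggregate_round` and `graph_readout`, and the toy path graph are all hypothetical, and the learned transformation that a real Graph Neural Network applies after each aggregation is left out (it is the identity here).

```python
def aggregate_round(adj, feats):
    """One aggregation round: each node adds its neighbors' vectors to its own.

    Assumption: no learned transformation is applied; a real GNN layer
    would pass the combined vector through trainable weights.
    """
    new_feats = {}
    for node, neighbors in adj.items():
        combined = list(feats[node])
        for nb in neighbors:
            combined = [a + b for a, b in zip(combined, feats[nb])]
        new_feats[node] = combined
    return new_feats


def graph_readout(feats):
    """Sum pooling: add up all node vectors to get one graph-level vector."""
    dim = len(next(iter(feats.values())))
    return [sum(vec[i] for vec in feats.values()) for i in range(dim)]


# Toy path graph 0 - 1 - 2; every node starts with the feature [1.0].
adj = {0: [1], 1: [0, 2], 2: [1]}
feats = {node: [1.0] for node in adj}

k = 2  # after k rounds, a node's vector depends on its k-hop neighborhood
for _ in range(k):
    feats = aggregate_round(adj, feats)

graph_vector = graph_readout(feats)
print(feats)         # node-level representations
print(graph_vector)  # graph-level representation
```

After two rounds the end nodes and the middle node carry different values, which reflects their different 2-hop neighborhoods even though all nodes started with the same feature.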

Many Graph Neural Network variants with different neighborhood aggregation and graph-level pooling schemes have been proposed. Empirically, these Graph Neural Networks have achieved state-of-the-art performance in many tasks such as node classification, link prediction, and graph classification. However, the design of new Graph Neural Networks is mostly based on empirical intuition, heuristics, and experimental trial-and-error. There is little theoretical understanding of the properties and limitations of Graph Neural Networks, and formal analysis of Graph Neural Networks' representational capacity is limited.

Here, we present a theoretical framework for analyzing the representational power of Graph Neural Networks. We formally characterize how expressive different Graph Neural Network variants are in learning to represent and distinguish between different graph structures. Our framework is inspired by the close connection between Graph Neural Networks and the Weisfeiler-Lehman graph isomorphism test, a powerful test known to distinguish a broad class of graphs. Similar to Graph Neural Networks, the Weisfeiler-Lehman test iteratively updates a given node's feature vector by aggregating feature vectors of its network neighbors. What makes the Weisfeiler-Lehman test so powerful is its injective aggregation update, which maps different node neighborhoods to different feature vectors. Our key insight is that a Graph Neural Network can have discriminative power as large as that of the Weisfeiler-Lehman test if the Graph Neural Network's aggregation scheme is highly expressive and can model injective functions.
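As a rough illustration of the Weisfeiler-Lehman idea, the sketch below implements a basic color-refinement loop in plain Python. It is an assumption-laden toy, not the paper's formal definition: the function names `refine` and `wl_indistinguishable`, the fixed round count, and the example graphs are hypothetical. The injective update is realized by giving each distinct pair (own color, sorted multiset of neighbor colors) its own fresh color id.

```python
from collections import Counter


def refine(adj, colors, palette):
    """Relabel every node by (own color, sorted multiset of neighbor colors).

    `palette` maps each distinct signature to a compact color id and is
    shared between the two graphs so their colors stay comparable.
    """
    new_colors = {}
    for v, nbrs in adj.items():
        signature = (colors[v], tuple(sorted(colors[u] for u in nbrs)))
        new_colors[v] = palette.setdefault(signature, len(palette))
    return new_colors


def wl_indistinguishable(adj_a, adj_b, rounds=3):
    """Return True if color refinement cannot separate the two graphs."""
    colors_a = {v: 0 for v in adj_a}
    colors_b = {v: 0 for v in adj_b}
    for _ in range(rounds):
        palette = {}
        colors_a = refine(adj_a, colors_a, palette)
        colors_b = refine(adj_b, colors_b, palette)
        # Different color histograms prove the graphs are not isomorphic.
        if Counter(colors_a.values()) != Counter(colors_b.values()):
            return False
    return True


# A 4-node path versus a 4-node star: separated after one round.
path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
star4 = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}

# A 6-cycle versus two disjoint triangles: both are 2-regular, a known
# case where this refinement fails to tell non-isomorphic graphs apart.
hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1],
                 3: [4, 5], 4: [3, 5], 5: [3, 4]}

print(wl_indistinguishable(path4, star4))
print(wl_indistinguishable(hexagon, two_triangles))
```

The second pair shows that the test distinguishes a broad class of graphs but not all of them, which is why it serves as an upper bound rather than a complete isomorphism check.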

To mathematically formalize the above insight, our framework first represents the set of feature vectors of a given node's neighbors as a multiset, i.e., a set with possibly repeating elements. Then, the neighbor aggregation in Graph Neural Networks can be thought of as an aggregation function over the multiset. Hence, to have strong representational power, a Graph Neural Network must be able to aggregate different multisets into different representations. We rigorously study several variants of multiset functions and theoretically characterize their discriminative power, i.e., how well different aggregation functions can distinguish different multisets. The more discriminative the multiset function is, the greater the representational power of the underlying Graph Neural Network.
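A small numeric example may make the multiset view concrete. The sketch below compares three common aggregators on two different multisets of scalar neighbor features; the function names and the example values are hypothetical and chosen only for illustration.

```python
def agg_sum(multiset):
    return sum(multiset)


def agg_mean(multiset):
    return sum(multiset) / len(multiset)


def agg_max(multiset):
    return max(multiset)


# Two different multisets of neighbor features. The second repeats every
# element of the first, so proportions and the largest element coincide.
x1 = [1.0, 2.0]
x2 = [1.0, 1.0, 2.0, 2.0]

for name, agg in [("sum", agg_sum), ("mean", agg_mean), ("max", agg_max)]:
    print(name, agg(x1), agg(x2), "distinguishes" if agg(x1) != agg(x2) else "confuses")
```

Mean and max map both multisets to the same value, while sum separates this particular pair. Note the caveat: summing raw scalars is not injective in general (for instance, [1.0, 2.0] and [3.0] have the same sum), so a suitable feature encoding applied before the sum, such as one-hot vectors, is assumed whenever sum is described as injective.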

Our main results are summarized as follows:

1) We show that Graph Neural Networks are at most as powerful as the Weisfeiler-Lehman test in distinguishing graph structures.

2) We establish conditions on the neighbor aggregation and graph readout functions under which the resulting Graph Neural Network is as powerful as the Weisfeiler-Lehman test.

3) We identify graph structures that cannot be distinguished by popular Graph Neural Network variants, such as Graph Convolutional Networks and GraphSAGE, and we precisely characterize the kinds of graph structures such Graph Neural Network-based models can capture.

4) We develop a simple neural architecture, Graph Isomorphism Network, and show that its discriminative/representational power is equal to the power of the Weisfeiler-Lehman test.
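To give a feel for what a Graph Isomorphism Network layer computes, here is a minimal sketch in plain Python. It assumes the update form given in the full paper, where a node's new vector is an MLP applied to (1 + eps) times its own vector plus the sum of its neighbors' vectors; that formula is not part of this excerpt. The weights, the tiny two-layer MLP, the helper names, and the toy graph are hypothetical, and nothing here is trained.

```python
def mlp(x, w1, b1, w2, b2):
    """Tiny two-layer perceptron with a ReLU in between."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w2, b2)]


def gin_layer(adj, feats, eps, params):
    """One GIN-style update: MLP((1 + eps) * h_v + sum of neighbor vectors)."""
    new_feats = {}
    for v, nbrs in adj.items():
        pooled = [(1.0 + eps) * x for x in feats[v]]
        for u in nbrs:
            pooled = [p + x for p, x in zip(pooled, feats[u])]
        new_feats[v] = mlp(pooled, *params)
    return new_feats


# Hypothetical fixed weights: identity hidden layer, then a summing output.
params = ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0], [[1.0, 1.0]], [0.0])

# Star graph with center 0; one-hot features mark the center and the leaves.
adj = {0: [1, 2], 1: [0], 2: [0]}
feats = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [0.0, 1.0]}

out = gin_layer(adj, feats, eps=0.0, params=params)
print(out)
```

The sum aggregator keeps the neighbor count visible in the pooled vector, so the center and the leaves end up with different outputs. In practice the MLP weights and possibly eps would be learned.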

We validate our theory via experiments on graph classification datasets, where the expressive power of Graph Neural Networks is crucial to capture graph structures. In particular, we compare the performance of Graph Neural Networks with various aggregation functions. Our results confirm that the most powerful Graph Neural Network by our theory, i.e., Graph Isomorphism Network, also empirically has high representational power, as it almost perfectly fits the training data, whereas the less powerful Graph Neural Network variants often severely underfit the training data. In addition, the representationally more powerful Graph Neural Networks outperform the others in test set accuracy and achieve state-of-the-art performance on many graph classification benchmarks.

2 PRELIMINARIES

3 THEORETICAL FRAMEWORK: OVERVIEW

4 BUILDING POWERFUL GRAPH NEURAL NETWORKS

4.1 GRAPH ISOMORPHISM NETWORK (GIN)

4.2 GRAPH-LEVEL READOUT OF GIN

5 LESS POWERFUL BUT STILL INTERESTING GNNS

5.1 ONE-LAYER PERCEPTRONS ARE NOT SUFFICIENT

5.2 STRUCTURES THAT CONFUSE MEAN AND MAX-POOLING

5.3 MEAN LEARNS DISTRIBUTIONS

5.4 MAX-POOLING LEARNS SETS WITH DISTINCT ELEMENTS

5.5 REMARKS ON OTHER AGGREGATORS

6 OTHER RELATED WORK

7 EXPERIMENTS

7.1 RESULTS

8 CONCLUSION

A PROOF FOR LEMMA 2

B PROOF FOR THEOREM 3

C PROOF FOR LEMMA 4

D PROOF FOR LEMMA 5

E PROOF OF COROLLARY 6

F PROOF FOR LEMMA 7

G PROOF FOR COROLLARY 8

H PROOF FOR COROLLARY 9

I DETAILS OF DATASETS

Overview

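This document explores the representational strengths and limitations of Graph Neural Networks (GNNs). It establishes a theoretical framework for analyzing their expressive power and develops a model, the Graph Isomorphism Network (GIN), whose discriminative power matches that of the Weisfeiler-Lehman graph isomorphism test. The theory is validated on several graph classification benchmarks.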

Key Points

1. GNNs use a neighborhood aggregation scheme for representation learning
2. The study introduces the Graph Isomorphism Network (GIN), shown to be highly expressive
3. The discriminative power of GNNs is characterized using multiset aggregation
4. Many popular GNN variants cannot distinguish certain simple graph structures
5. Empirical results validate the theoretical framework on graph classification tasks

Details

Authors: Keyulu Xu, Weihua Hu, Jure Leskovec, Stefanie Jegelka
Category: Technology and Engineering
