Workshop
New Frontiers in Graph Learning (GLFrontiers)
Jiaxuan You · Rex Ying · Hanjun Dai · Ge Liu · Azalia Mirhoseini · Smita Krishnaswamy · Chaoran Cheng
Hall C2 (Level 1, Gate 9, south of the food court)
Overview: Graph learning has grown into an established sub-field of machine learning in recent years. Researchers have been focusing on developing novel model architectures, theoretical understanding, scalable algorithms and systems, and successful applications across industry and science. Following the success of the New Frontiers in Graph Learning (GLFrontiers) Workshop at NeurIPS 2022, we hope to continue promoting the exchange of discussions and ideas regarding the future of graph learning at NeurIPS 2023.

Challenges: Despite the success of graph learning in various applications, recent machine learning research trends, especially the research towards foundation models and large language models, have posed challenges for the graph learning field. For example, regarding model architecture, Transformer-based models have been shown to be superior to graph neural networks on certain small graph learning benchmarks. In terms of usability, with language as a generic user interface, it is still a research frontier to explore whether natural language can also interact with ubiquitous graph-structured data, and whether it is feasible to build generic foundation models for graphs. Lastly, while graph learning has achieved exciting recent results in molecule and protein design, how graph learning can accelerate scientific discoveries in other disciplines remains an open question.

Goal: The primary goal of this workshop is to expand the impact of graph learning beyond its current boundaries. We believe that graphs, or relational data, form a universal language for describing the complex world. Ultimately, we hope graph learning will become a generic tool for learning and understanding any type of (structured) data. In GLFrontiers 2023, we specifically aim to discuss the future of graph learning in the era of foundation models and envision how graph learning can contribute to scientific discoveries.
Schedule
Fri 7:00 a.m. - 7:01 a.m. | Opening remarks | Jiaxuan You
Fri 7:00 a.m. - 7:30 a.m. | Keynote talk: Retrieval Augmentation for LLMs | Mohammad Shoeybi
Fri 7:30 a.m. - 8:00 a.m. | Keynote talk: LLM Scaling: From Power Law to Sparsity | Yanqi Zhou
Fri 8:00 a.m. - 8:30 a.m. | Coffee break
Fri 8:30 a.m. - 9:30 a.m. | Contributed talks from accepted papers
Fri 9:30 a.m. - 11:30 a.m. | Poster session 1
Fri 10:00 a.m. - 11:00 a.m. | Lunch break
Fri 11:30 a.m. - 12:00 p.m. | Keynote talk: Network Construction from Massive Text: Exploring the Power of Language Models | Jiawei Han
Fri 12:00 p.m. - 12:30 p.m. | Keynote talk | Marinka Zitnik
Fri 12:30 p.m. - 1:00 p.m. | Keynote talk: Towards Knowledge Foundation Models: Reasoning via Graph Schema Induction | Manling Li
Fri 1:00 p.m. - 1:30 p.m. | Coffee break
Fri 1:30 p.m. - 2:30 p.m. | Panel discussion: Graphs in the Era of Foundation Models | Jiawei Han · Marinka Zitnik · Manling Li · Yanqi Zhou · Bryan Perozzi · Jiaxuan You · Rex Ying · Hanjun Dai
Fri 2:30 p.m. - 3:00 p.m. | Poster session 2

Learning Genomic Sequence Representations using Graph Neural Networks over De Bruijn Graphs (Poster)
The rapid expansion of genomic sequence data calls for new methods to achieve robust sequence representations. Existing techniques often neglect intricate structural details, emphasizing mainly contextual information. To address this, we developed k-mer embeddings that merge contextual and structural string information by enhancing De Bruijn graphs with structural similarity connections. Subsequently, we crafted a self-supervised method based on Contrastive Learning that employs a heterogeneous Graph Convolutional Network encoder and constructs positive pairs based on node similarities. Our embeddings consistently outperform prior techniques for Edit Distance Approximation and Closest String Retrieval tasks.
Kacper Kapusniak · Manuel Burger · Gunnar Rätsch · Amir Joudaki

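For readers unfamiliar with the underlying data structure, the sketch below builds a plain De Bruijn graph from the k-mers of a DNA string. It is a minimal illustration of the graph the paper enhances, not the authors' code; the example sequence and k are arbitrary choices.

```python
from collections import defaultdict

def de_bruijn_graph(sequence: str, k: int):
    """Build a De Bruijn graph: nodes are (k-1)-mers, edges come from k-mers."""
    edges = defaultdict(set)
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        edges[kmer[:-1]].add(kmer[1:])  # prefix (k-1)-mer -> suffix (k-1)-mer
    return edges

# Toy example; any genomic string works.
graph = de_bruijn_graph("ACGTACGGAC", k=4)
for src, dsts in sorted(graph.items()):
    print(src, "->", sorted(dsts))
```

The paper's contribution then adds structural-similarity edges on top of such a graph and trains a contrastive GNN encoder over it.
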
On the Adversarial Robustness of Graph Contrastive Learning Methods (Poster)
Contrastive learning (CL) has emerged as a powerful framework for learning representations of images and text in a self-supervised manner while enhancing model robustness against adversarial attacks. More recently, researchers have extended the principles of contrastive learning to graph-structured data, giving birth to the field of graph contrastive learning (GCL). However, whether GCL methods can deliver the same advantages in adversarial robustness as their counterparts in the image and text domains remains an open question. In this paper, we introduce a comprehensive robustness evaluation protocol tailored to assess the robustness of GCL models. We subject these models to adaptive adversarial attacks targeting the graph structure, specifically in the evasion scenario. We evaluate node and graph classification tasks using diverse real-world datasets and attack strategies. With our work, we aim to offer insights into the robustness of GCL methods and hope to open avenues for potential future research directions.
Filippo Guerranti · Zinuo Yi · Anna Starovoit · Rafiq Kamel · Simon Geisler · Stephan Günnemann

A Simple and Scalable Representation for Graph Generation (Poster)
Recently, there has been a surge of interest in employing neural networks for graph generation, a fundamental statistical learning problem with critical applications like molecule design and community analysis. However, most approaches encounter significant limitations when generating large-scale graphs. This is due to their requirement to output the full adjacency matrices, whose size grows quadratically with the number of nodes. In response to this challenge, we introduce a new, simple, and scalable graph representation named gap encoded edge list (GEEL) that has a small representation size that aligns with the number of edges. In addition, GEEL significantly reduces the vocabulary size by incorporating the gap encoding and bandwidth restriction schemes. GEEL can be autoregressively generated with the incorporation of node positional encoding, and we further extend GEEL to deal with attributed graphs by designing a new grammar. Our findings reveal that the adoption of this compact representation not only enhances scalability but also bolsters performance by simplifying the graph generation process. We conduct a comprehensive evaluation across ten non-attributed and two molecular graph generation tasks, demonstrating the effectiveness of GEEL.
Yunhui Jang · Seul Lee · Sungsoo Ahn

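As a rough illustration of the gap-encoding idea (a sketch under our own assumptions, not the paper's reference implementation), one can store each edge as differences from the previous edge's endpoints rather than as raw node indices, so that a generator only needs to emit a small vocabulary of offsets on bandwidth-limited graphs:

```python
def gap_encode(edge_list):
    """Encode a sorted edge list as gaps between successive endpoints.

    edge_list: list of (u, v) pairs with u <= v.
    Returns tokens whose values stay small when the graph has low bandwidth.
    """
    tokens, prev_u = [], 0
    for u, v in sorted(edge_list):
        tokens.append((u - prev_u, v - u))  # (gap to new source, intra-edge gap)
        prev_u = u
    return tokens

def gap_decode(tokens):
    edges, u = [], 0
    for du, dv in tokens:
        u += du
        edges.append((u, u + dv))
    return edges

edges = [(0, 1), (0, 3), (1, 2), (2, 3)]
assert gap_decode(gap_encode(edges)) == edges
print(gap_encode(edges))  # [(0, 1), (0, 3), (1, 1), (1, 1)]
```
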
Double Equivariance for Inductive Link Prediction for Both New Nodes and New Relation Types (Spotlight)
The task of inductive link prediction in discrete attributed multigraphs (e.g., knowledge graphs, multilayer networks, heterogeneous networks, etc.) generally focuses on test predictions with solely new nodes, but not both new nodes and new relation types. In this work, we formally define the task of predicting (completely) new nodes and new relation types at test time as a doubly inductive link prediction task and introduce a theoretical framework for its solution. We start by defining the concept of double permutation-equivariant representations, which are equivariant to permutations of both node identities and edge relation types. We then propose a general blueprint to design neural architectures that impose a structural representation of relations, which can inductively generalize from training nodes and relations to arbitrarily new test nodes and relations without the need for adaptation, side information, or retraining. We also introduce the concept of distributionally double equivariant positional embeddings designed to perform the same task. Finally, we empirically demonstrate the capability of the two proposed models on a set of novel real-world benchmarks, showcasing relative performance gains of up to 41.40% on predicting new relation types compared to baselines.
Jianfei Gao · Yangze Zhou · Jincheng Zhou · Bruno Ribeiro

Local Differential Privacy in Graph Neural Networks: a Reconstruction Approach (Poster)
Graph Neural Networks (GNNs) have achieved tremendous success in modeling complex graph data in a variety of applications. However, there are limited studies investigating privacy protection in GNNs. In this work, we propose a learning framework that can provide node-level privacy while incurring low utility loss. We focus on a decentralized notion of Differential Privacy, namely Local Differential Privacy (LDP), and apply randomization mechanisms to perturb both feature and label data before they are collected by a central server for model training. Specifically, we investigate the application of randomization mechanisms in high-dimensional feature settings and propose an LDP protocol with strict privacy guarantees. Based on frequency estimation in statistical analysis of randomized data, we develop reconstruction methods to approximate features and labels from perturbed data. We also formulate this learning framework to utilize frequency estimates of graph clusters to supervise the training procedure at a sub-graph level. Extensive experiments on real-world and semi-synthetic datasets demonstrate the validity of our proposed model.
Karuna Bhaila · Wen Huang · Yongkai Wu · Xintao Wu

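To make the flavor of such mechanisms concrete, here is a generic randomized-response perturbation with unbiased frequency estimation for binary labels. This is textbook local DP sketched by us, not the specific protocol proposed in the paper:

```python
import numpy as np

def randomized_response(labels: np.ndarray, eps: float, rng=None):
    """Flip each binary label with probability 1/(1 + e^eps), giving eps-LDP."""
    rng = rng or np.random.default_rng(0)
    p_keep = np.exp(eps) / (np.exp(eps) + 1.0)
    keep = rng.random(labels.shape) < p_keep
    return np.where(keep, labels, 1 - labels), p_keep

def debias_frequency(perturbed: np.ndarray, p_keep: float) -> float:
    """Unbiased estimate of the true frequency of 1s from perturbed labels.

    E[observed] = f * p_keep + (1 - f) * (1 - p_keep), solved for f.
    """
    observed = perturbed.mean()
    return (observed - (1.0 - p_keep)) / (2.0 * p_keep - 1.0)

labels = np.array([1] * 700 + [0] * 300)
noisy, p = randomized_response(labels, eps=1.0)
print("true 0.700, estimated %.3f" % debias_frequency(noisy, p))
```

The paper builds on this style of frequency estimation but extends it to high-dimensional features and cluster-level supervision.
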
GenTKG: Generative Forecasting on Temporal Knowledge Graph (Poster)
The rapid advancements in large language models (LLMs) have ignited interest in the temporal knowledge graph (TKG) domain, where conventional carefully designed embedding-based and rule-based models dominate. It remains an open question whether pre-trained LLMs can understand structured temporal relational data and replace these models as the foundation for temporal relational forecasting. Challenges also arise from the chasm between the complex graph structure of TKGs and the linear natural-language expressions LLMs can handle, as well as between the enormous data volume of TKGs and the heavy computational cost of fine-tuning LLMs. To address these challenges, we bring temporal knowledge forecasting into the generative setting and propose a novel retrieval-augmented generation framework named GenTKG, which combines a temporal logical rule-based retrieval strategy with lightweight few-shot parameter-efficient instruction tuning. Extensive experiments show that GenTKG is a simple yet effective, efficient, and generalizable approach that outperforms conventional methods on temporal relational forecasting with extremely limited computation. Our work opens a new frontier for the temporal knowledge graph domain.
Ruotong Liao · Xu Jia · Yunpu Ma · Volker Tresp

The Graph Lottery Ticket Hypothesis: Finding Sparse, Informative Graph Structure (Poster)
Graph learning methods help utilize implicit relationships among data items, thereby reducing training label requirements and improving task performance. However, determining the optimal graph structure for a particular learning task remains a challenging research problem. In this work, we introduce the Graph Lottery Ticket (GLT) Hypothesis: every graph has an extremely sparse backbone, and graph learning algorithms attain comparable performance when trained on that subgraph as on the full graph. We identify and systematically study 8 key metrics of interest that directly influence the performance of graph learning algorithms. Subsequently, we define the notion of a "winning ticket" for graph structure: an extremely sparse subset of edges that can deliver a robust approximation of the entire graph's performance. We propose a straightforward and efficient algorithm for finding these GLTs in arbitrary graphs. Empirically, we observe that the performance of different graph learning algorithms can be matched or even exceeded on graphs with an average degree as low as 5.
Anton Tsitsulin · Bryan Perozzi

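As a hypothetical illustration of finding a sparse backbone (our own toy heuristic, not the algorithm proposed in the paper), one can rank edges by a cheap importance score and keep only enough of them to hit a target average degree:

```python
import numpy as np

def sparsify_by_degree_score(edges, num_nodes, target_avg_degree=5):
    """Keep the highest-scoring edges until the average degree hits a target.

    Scores edges by 1/sqrt(deg(u)*deg(v)), which favors bridges over edges
    inside dense clusters, as a toy stand-in for a learned importance score.
    """
    deg = np.zeros(num_nodes)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    scores = [1.0 / np.sqrt(deg[u] * deg[v]) for u, v in edges]
    order = np.argsort(scores)[::-1]             # most important first
    budget = target_avg_degree * num_nodes // 2  # edges, not endpoints
    return [edges[i] for i in order[:budget]]

edges = [(i, j) for i in range(20) for j in range(i + 1, 20) if (i + j) % 3]
print(len(edges), "->", len(sparsify_by_degree_score(edges, 20)))
```
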
Node Mutual Information: Enhancing Graph Neural Networks for Heterophily (Poster)
Graph neural networks (GNNs) have achieved great success in graph analysis by leveraging homophily, where connected nodes share similar properties. However, GNNs struggle on heterophilic graphs, where connected nodes tend to differ. Some existing methods use neighborhood expansion, which is intractable for large graphs. This paper proposes utilizing node mutual information (MI) to capture dependencies between nodes in heterophilic graphs for use in GNNs. We first define a probability space associated with the graph and introduce k-th node random variables to partition the graph based on node distances. The MI between two nodes' random variables then quantifies their dependency regardless of distance, by considering both direct and indirect connections. We propose k-th MIGNN, where the k-th MI values are used as weights in the message aggregation function. Experiments on real-world datasets with varying heterophily ratios show the proposed method achieves competitive performance compared to baseline GNNs. The results demonstrate that leveraging node mutual information effectively captures complex node dependencies in heterophilic graphs.
Seongjin Choi · Gahee Kim · Se-Young Yun

Low-Width Approximations and Sparsification for Scaling Graph Transformers (Poster)
Graph Transformers have shown excellent results on a diverse set of datasets. However, memory limitations prohibit these models from scaling to larger graphs. With standard single-GPU setups, even training on medium-sized graphs is impossible for most Graph Transformers. While the $\mathcal{O}(n^2d^2)$ complexity of each layer can be reduced to $\mathcal{O}((n+m)d^2)$ using sparse attention models such as Exphormer for graphs with $n$ nodes and $m$ edges, these models are still infeasible to train on small-memory devices, even for medium-sized datasets. Here, we propose to sparsify the Exphormer model even further, by using a small "pilot" network to estimate attention scores along the graph edges, then training a larger model using only the $\mathcal{O}(n)$ edges deemed important by the small network. We show empirically that attention scores from smaller networks provide a good estimate of the attention scores in larger networks, and that this process can yield a large-width sparse model nearly as good as the large-width non-sparse model.
Hamed Shirzad · Balaji Venkatachalam · Ameya Velingker · Danica J. Sutherland · David Woodruff

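A minimal sketch of the pilot idea under our own assumptions (untrained random weights, plain dot-product attention in PyTorch; not the authors' Exphormer code): score edges with a low-width attention head, then keep only the strongest edges for the large model.

```python
import torch

def edge_attention_scores(x, edge_index, dim):
    """Dot-product attention scores along graph edges at a given width."""
    torch.manual_seed(0)
    wq = torch.randn(x.size(1), dim) / dim ** 0.5  # stand-in for trained weights
    wk = torch.randn(x.size(1), dim) / dim ** 0.5
    q, k = x @ wq, x @ wk
    src, dst = edge_index
    return (q[dst] * k[src]).sum(-1) / dim ** 0.5

x = torch.randn(100, 32)                         # node features
edge_index = torch.randint(0, 100, (2, 600))     # random directed edges
pilot = edge_attention_scores(x, edge_index, dim=4)  # cheap, low-width pass
keep = pilot.topk(100).indices                   # O(n) edges deemed important
sparse_edge_index = edge_index[:, keep]
print(sparse_edge_index.shape)                   # torch.Size([2, 100])
```
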
Knowledge Graphs are not Created Equal: Exploring the Properties and Structure of Real KGs (Poster)
Despite the recent popularity of knowledge graph (KG) related tasks and benchmarks such as KG embeddings, link prediction, entity alignment, and evaluation of the reasoning abilities of pretrained language models as KGs, the structure and properties of real KGs are not well studied. In this paper, we perform a large-scale comparative study of 29 real KG datasets from diverse domains such as the natural sciences, medicine, and NLP to analyze their properties and structural patterns. Based on our findings, we make recommendations regarding KG-based model development and evaluation. We believe that the rich structural information contained in KGs can benefit the development of better KG models across fields (e.g., ML, NLP, AI for sciences), and we hope this study will contribute to breaking the existing data silos between different areas of research.
Nedelina Teneva · Estevam Hruschka

FedGKD: Unleashing the Power of Collaboration in Federated Graph Neural Networks (Poster)
Federated training of Graph Neural Networks (GNNs) has become popular in recent years due to its ability to perform graph-related tasks under data-isolation scenarios while preserving data privacy. However, graph heterogeneity issues in federated GNN systems continue to pose challenges. Existing frameworks address the problem by representing local tasks using different statistics and relating them through a simple aggregation mechanism. However, these approaches suffer from limited efficiency in two respects: low-quality task-relatedness quantification and ineffective exploitation of the collaboration structure. To address these issues, we propose FedGKD, a novel federated GNN framework that utilizes a client-side graph dataset distillation method to extract task features that better describe task-relatedness, and introduces a server-side aggregation mechanism that is aware of the global collaboration structure. We conduct extensive experiments on six real-world datasets of different scales, demonstrating that our framework outperforms existing approaches.
Qiying Pan · Ruofan Wu · Tengfei LIU · Tianyi Zhang · Yifei Zhu · Weiqiang Wang

PAN: Expressiveness of GNNs with Paths (Poster)
Motivated by the lack of theoretical investigation into the discriminative power of paths, we characterize classes of graphs where paths are sufficient to identify every graph. We formally justify the use of paths based on finite-variable counting logic and prove the effectiveness of paths to recognize graph structural features related to cycles and connectivity. We show that paths are able to identify graphs for which higher-order models fail. Building on this, we propose PAN, a novel graph neural network that replaces topological neighborhoods with paths in the aggregation step of the message-passing procedure. This modification leads to an algorithm that is strictly more expressive than the Weisfeiler-Leman graph isomorphism test, at the cost of a polynomial-time preprocessing step for every fixed path length. We support our theoretical findings by empirically evaluating PAN on synthetic datasets.
Caterina Graziani · Tamara Drucks · Monica Bianchini · Franco Scarselli · Thomas Gärtner

Poisoning $\times$ Evasion: Symbiotic Adversarial Robustness for Graph Neural Networks (Poster)
It is well known that deep learning models are vulnerable to small input perturbations. Such perturbed instances are called adversarial examples. Adversarial examples are commonly crafted to fool a model either at training time (poisoning) or test time (evasion). In this work, we study the symbiosis of poisoning and evasion. We show that combining both threat models can substantially improve the devastating efficacy of adversarial attacks. Specifically, we study the robustness of Graph Neural Networks (GNNs) under structure perturbations and devise a memory-efficient adaptive end-to-end attack for the novel threat model using first-order optimization.
Ege Erdogan · Simon Geisler · Stephan Günnemann

Order Agnostic Autoregressive Graph Generation (Poster)
Graph generation is a fundamental problem in various domains, including chemistry and social networks. Recent work has shown that molecular graph generation using recurrent neural networks (RNNs) is advantageous compared to traditional generative approaches, which require converting continuous latent representations into graphs. One issue which arises when treating graph generation as sequential generation is the arbitrary order of the sequence, which results from a particular choice of graph flattening method. In this work we propose using RNNs while taking into account the non-sequential nature of graphs, by adding an Orderless Regularization (OLR) term that encourages the hidden state of the recurrent model to be invariant to different valid orderings present under the training distribution. We demonstrate that sequential graph generation models benefit from our proposed regularization scheme, especially when data is scarce. Our findings contribute to the growing body of research on graph generation and provide a valuable tool for various applications requiring the synthesis of realistic and diverse graph structures.
Edo Cohen-Karlik · Eyal Rozenberg · Daniel Freedman

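To give a feel for an orderless-style regularizer, here is our own heavily simplified sketch (not the paper's OLR implementation): penalize the distance between the final RNN hidden states produced by two orderings of the same graph sequence. The random permutation below is only a stand-in for what the paper restricts to valid orderings.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
rnn = nn.GRU(input_size=8, hidden_size=16, batch_first=True)

def final_hidden(seq):
    _, h = rnn(seq)       # h has shape (num_layers, batch, hidden)
    return h.squeeze(0)

# Two flattenings of the same graph (hypothetical edge-token sequences).
order_a = torch.randn(1, 10, 8)
order_b = order_a[:, torch.randperm(10), :]   # stand-in for another valid ordering

task_loss = torch.tensor(0.0)                 # placeholder for the usual NLL term
olr = (final_hidden(order_a) - final_hidden(order_b)).pow(2).mean()
loss = task_loss + 0.1 * olr                  # 0.1 is an arbitrary weight
loss.backward()
print(float(olr))
```
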
GraphRNN Revisited: An Ablation Study and Extensions for Directed Acyclic Graphs (Poster)
GraphRNN is a deep learning-based architecture proposed by You et al. for learning generative models for graphs. We replicate the results of You et al. using a reproduced implementation of the GraphRNN architecture and evaluate this against baseline models using new metrics. Through an ablation study, we find that the BFS traversal suggested by You et al. to collapse representations of isomorphic graphs contributes significantly to model performance. Additionally, we extend GraphRNN to generate directed acyclic graphs by replacing the BFS traversal with a topological sort. We demonstrate that this method improves significantly over a directed-multiclass variant of GraphRNN on a real-world dataset.
Maya Ravichandran · Mark Koch · Taniya Das · Nikhil Khatri

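The two orderings at issue are easy to state in code. Below is a generic standard-library sketch (ours, not the authors' implementation) of the BFS ordering GraphRNN uses to canonicalize graphs and the topological sort substituted for DAGs:

```python
from collections import deque

def bfs_order(adj, start=0):
    """Breadth-first node ordering, as used by GraphRNN."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return order

def topological_order(adj, n):
    """Kahn's algorithm: a valid generation order for a DAG."""
    indeg = [0] * n
    for u in adj:
        for v in adj[u]:
            indeg[v] += 1
    queue = deque(i for i in range(n) if indeg[i] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj.get(u, []):
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    return order

dag = {0: [1, 2], 1: [3], 2: [3], 3: []}
print(bfs_order(dag), topological_order(dag, 4))  # [0, 1, 2, 3] twice here
```
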
Knowledge Graph Prompting for Multi-Document Question Answering (Spotlight)
The 'pre-train, prompt, predict' paradigm of large language models (LLMs) has achieved remarkable success in open-domain question answering (OD-QA). However, few works explore this paradigm in the scenario of multi-document question answering (MD-QA), a task demanding a thorough understanding of the logical associations among the contents and structures of different documents. To fill this crucial gap, we propose a Knowledge Graph Prompting (KGP) method to formulate the right context in prompting LLMs for MD-QA, which consists of a graph construction module and a graph traversal module. For graph construction, we create a knowledge graph (KG) over multiple documents with nodes symbolizing passages or document structures (e.g., pages/tables), and edges denoting the semantic/lexical similarity between passages or intra-document structural relations. For graph traversal, we design an LM-guided graph traverser that navigates across nodes and gathers supporting passages assisting LLMs in MD-QA. The constructed graph serves as the global ruler that regulates the transitional space among passages and reduces retrieval latency. Concurrently, the LM-guided traverser acts as a local navigator that gathers pertinent context to progressively approach the question and guarantee retrieval quality. Extensive experiments underscore the efficacy of KGP for MD-QA, signifying the potential of leveraging graphs in enhancing the prompt design for LLMs. Our code will be released upon publication.
Yu Wang · Nedim Lipka · Ryan Rossi · Alexa Siu · Ruiyi Zhang · Tyler Derr

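A heavily simplified sketch of LM-guided traversal follows. Everything here, including the word-overlap scoring function standing in for the LM, the graph layout, and the passage texts, is a placeholder we made up to illustrate the control flow; the paper's traverser is learned.

```python
def traverse(graph, start, question_words, budget=3):
    """Greedy traversal: repeatedly hop to the neighbor whose passage
    shares the most words with the question (a stand-in for an LM scorer)."""
    def score(passage):
        return len(set(passage.lower().split()) & question_words)

    context, node = [], start
    for _ in range(budget):
        context.append(graph[node]["text"])
        neighbors = graph[node]["neighbors"]
        if not neighbors:
            break
        node = max(neighbors, key=lambda n: score(graph[n]["text"]))
    return context

graph = {
    "p1": {"text": "Revenue grew 10 percent in 2022", "neighbors": ["p2", "p3"]},
    "p2": {"text": "The CEO discussed revenue growth drivers", "neighbors": []},
    "p3": {"text": "Office locations and headcount", "neighbors": []},
}
print(traverse(graph, "p1", question_words={"revenue", "growth"}))
```
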
DiP-GNN: Discriminative Pre-Training of Graph Neural Networks (Poster)
Graph neural network (GNN) pre-training methods have been proposed to enhance the power of GNNs. Specifically, a GNN is first pre-trained on a large-scale unlabeled graph and then fine-tuned on a separate small labeled graph for downstream applications, such as node classification. One popular pre-training method is to mask out a proportion of the edges, and a GNN is trained to recover them. However, such a generative method suffers from graph mismatch: the masked graph input to the GNN deviates from the original graph. To alleviate this issue, we propose DiP-GNN (Discriminative Pre-training of Graph Neural Networks). Specifically, we train a generator to recover identities of the masked edges, and simultaneously, we train a discriminator to distinguish the generated edges from the original graph's edges. The discriminator is subsequently used for downstream fine-tuning. In our pre-training framework, the graph seen by the discriminator better matches the original graph because the generator can recover a proportion of the masked edges. Extensive experiments on large-scale homogeneous and heterogeneous graphs demonstrate the effectiveness of DiP-GNN. Our code will be publicly available.
Simiao Zuo · Haoming Jiang · Qingyu Yin · Xianfeng Tang · Bing Yin · Tuo Zhao

Coupling Graph Neural Networks with Non-Integer Order Dynamics: A Robustness Study (Poster)
In this work, we rigorously investigate the robustness of graph neural fractional-order differential equation (FDE) models. This framework extends beyond traditional graph neural ordinary differential equation (ODE) models by implementing the time-fractional Caputo derivative. Utilizing fractional calculus allows our model to consider long-term dependencies during the feature updating process, diverging from the Markovian updates seen in traditional graph neural ODE models. The efficacy of FDE models in surpassing ODE models has been confirmed in a different submitted work, particularly in environments free from attacks or perturbations. While traditional graph neural ODE models have been verified to possess a degree of stability and resilience in the presence of adversarial attacks in existing literature, the robustness of graph neural FDE models, especially under adversarial conditions, remains largely unexplored. This paper undertakes a detailed assessment of the robustness of graph neural FDE models. We establish a theoretical foundation outlining the robustness features of graph neural FDE models, highlighting that they maintain more stringent output perturbation bounds in the face of input and functional disturbances, relative to their integer-order counterparts. Through rigorous experimental assessments, which include graph alteration scenarios and adversarial attack contexts, we empirically validate the improved robustness of graph neural FDE models over their conventional graph neural ODE counterparts.
Qiyu Kang · Kai Zhao · Yang Song · Yihang Xie · Yanan Zhao · Sijie Wang · Rui She · Wee Peng Tay

Graph Pooling Provably Improves Expressivity (Poster)
In the domain of graph neural networks (GNNs), pooling operators are fundamental to reduce the size of the graph by simplifying graph structures and vertex features. Recent advances have shown that well-designed pooling operators, coupled with message-passing layers, can endow hierarchical GNNs with an expressive power regarding the graph isomorphism test that is equal to the Weisfeiler-Leman test. However, the ability of hierarchical GNNs to increase expressive power by utilizing graph coarsening has not yet been explored. This results in uncertainties about the benefits of pooling operators and a lack of sufficient properties to guide their design. In this work, we identify conditions for pooling operators to generate WL-distinguishable coarsened graphs from originally WL-indistinguishable but non-isomorphic graphs. Our conditions are versatile and can be tailored to specific tasks and data characteristics, offering a promising avenue for further research.
Veronica Lachi · Alice Moallemy-Oureh · Andreas Roth · Pascal Welke

Can LLMs Effectively Leverage Graph Structural Information: When and Why (Poster)
This paper studies Large Language Models (LLMs) augmented with structured data, particularly graphs, a crucial data modality that remains underexplored in the LLM literature. We aim to understand when and why the incorporation of structural information inherent in graph data can improve the prediction performance of LLMs on node classification tasks with textual features. To address the "when" question, we examine a variety of prompting methods for encoding structural information, in settings where textual node features are either rich or scarce. For the "why" question, we probe into two potential contributing factors to LLM performance: data leakage and homophily. Our exploration of these questions reveals that (i) LLMs can benefit from structural information, especially when textual node features are scarce; (ii) there is no substantial evidence indicating that the performance of LLMs is significantly attributable to data leakage; and (iii) the performance of LLMs on a target node is strongly positively related to the local homophily ratio of the node.
Jin Huang · Xingjian Zhang · Qiaozhu Mei · Jiaqi Ma

Semi-Supervised Graph Imbalanced Regression (Poster)
Data imbalance is easily found in annotated data when observations of certain continuous label values are difficult to collect for regression tasks. For molecule and polymer property prediction, annotated graph datasets are often small because labeling them requires expensive equipment and effort. To address the lack of examples of rare label values in graph regression tasks, we propose a semi-supervised framework to progressively balance training data and reduce model bias via self-training. The training data balance is achieved by (1) pseudo-labeling more graphs for under-represented labels with a novel regression confidence measurement and (2) augmenting graph examples in latent space for the remaining rare labels after data balancing with pseudo-labels. The former identifies quality examples from unlabeled data whose labels are confidently predicted and samples a subset of them with a reverse distribution from the imbalanced annotated data. The latter collaborates with the former to target a perfect balance using a novel label-anchored mixup algorithm. We perform experiments on seven regression tasks on graph datasets. Results demonstrate that the proposed framework significantly reduces the error of predicted graph properties, especially in under-represented label areas.
Gang Liu · Tong Zhao · Eric Inae · Tengfei Luo · Meng Jiang

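Generic mixup in latent space gives the flavor of the augmentation step. This is a standard technique sketched by us; the paper's label-anchored variant adds anchoring logic we do not reproduce here.

```python
import torch

def latent_mixup(z, y, alpha=0.4, seed=0):
    """Interpolate pairs of latent graph embeddings and their continuous labels."""
    g = torch.Generator().manual_seed(seed)
    lam = torch.distributions.Beta(alpha, alpha).sample()  # mixing coefficient
    perm = torch.randperm(z.size(0), generator=g)
    z_mix = lam * z + (1 - lam) * z[perm]
    y_mix = lam * y + (1 - lam) * y[perm]
    return z_mix, y_mix

z = torch.randn(32, 64)      # latent embeddings of labeled graphs
y = torch.rand(32)           # continuous property labels
z_aug, y_aug = latent_mixup(z, y)
print(z_aug.shape, y_aug.shape)
```
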
Beyond Text: A Deep Dive into Large Language Models' Ability on Understanding Graph Data (Poster)
Large language models (LLMs) have achieved impressive performance on many natural language processing tasks. However, their capabilities on graph-structured data remain relatively unexplored. In this paper, we conduct a series of experiments benchmarking leading LLMs on diverse graph prediction tasks spanning node, edge, and graph levels. We aim to assess whether LLMs can effectively process graph data and leverage topological structures to enhance performance, compared to specialized graph neural networks. Through varied prompt formatting and task/dataset selection, we analyze how well LLMs can interpret and utilize graph structures. By comparing LLMs' performance with specialized graph models, we offer insights into the strengths and limitations of employing LLMs for graph analytics, and suggest avenues for further exploration.
Yuntong Hu · Zheng Zhang · Liang Zhao

Its All Graph To Me: Single-Model Graph Representation Learning on Multiple Domains (Poster)
Graph neural networks (GNNs) have revolutionised the field of graph representation learning and play a critical role in graph-based research. Recent work explores applying GNNs to pre-training and fine-tuning, where a model is trained on a large dataset and its learnt representations are then transferred to a smaller dataset. However, current work only explores pre-training on a single domain; for example, a model pre-trained on molecular graphs is fine-tuned on other molecular graphs. This leads to poor generalisability of pre-trained models to novel domains and tasks. In this work, we curate a multi-graph-domain dataset and apply state-of-the-art Graph Adversarial Contrastive Learning (GACL) methods. We present a pre-trained graph model that may have the capability of acting as a foundational graph model. We evaluate the efficacy of its learnt representations on various downstream tasks against baseline models pre-trained on single domains. In addition, we compare our model to un-trained and non-transferred models, and show that performance using our foundational model can equal or exceed task-specific methodology.
Alex O. Davies · Riku Green · Nirav Ajmeri · Telmo Silva Filho

Long-Range Neural Atom Learning for Molecular Graphs (Poster)
Graph Neural Networks (GNNs) have been widely adopted for drug discovery with molecular graphs. Nevertheless, current GNNs are mainly good at leveraging short-range interactions (SRI) but struggle to capture long-range interactions (LRI), both of which are crucial for determining molecular properties. To tackle this issue, we propose a method that implicitly projects all original atoms into a few Neural Atoms, which abstract the collective information of atomic groups within a molecule. Specifically, we explicitly exchange the information among neural atoms and project them back to the atoms' representations as an enhancement. With this mechanism, neural atoms establish communication channels among distant nodes, effectively reducing the interaction scope of arbitrary node pairs into a single hop. To provide an inspection of our method from a physical perspective, we reveal its connection with the traditional LRI calculation method, Ewald Summation. We conduct extensive experiments on three long-range graph benchmarks, covering both graph-level and link-level tasks on molecular graphs. We empirically justify that our method can be equipped with an arbitrary GNN and help to capture LRI.
Xuan Li · Zhanke Zhou · Jiangchao Yao · Yu Rong · Lu Zhang · Bo Han

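A rough sketch of the project-exchange-project pattern, using plain dot-product attention in PyTorch. This is our own illustration under simplified assumptions (untrained weights, a single attention step), not the paper's architecture:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
n_atoms, d, n_neural = 30, 64, 4

x = torch.randn(n_atoms, d)                  # per-atom representations
neural_queries = torch.randn(n_neural, d)    # learned "neural atom" queries

# Project atoms -> neural atoms (soft assignment via attention).
assign = F.softmax(neural_queries @ x.T / d ** 0.5, dim=-1)   # (4, 30)
neural_atoms = assign @ x                                     # (4, 64)

# Exchange information among neural atoms (one self-attention step).
mix = F.softmax(neural_atoms @ neural_atoms.T / d ** 0.5, dim=-1)
neural_atoms = mix @ neural_atoms

# Project back and enhance the original atoms: every atom is now one
# hop away from every other atom through the neural atoms.
x_enhanced = x + assign.T @ neural_atoms                      # (30, 64)
print(x_enhanced.shape)
```
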
Maximally Expressive GNNs for Outerplanar Graphs (Spotlight)
We propose a linear-time graph transformation that enables the Weisfeiler-Leman (WL) test and message passing graph neural networks (MPNNs) to be maximally expressive on outerplanar graphs. Our approach is motivated by the fact that most pharmaceutical molecules correspond to outerplanar graphs. Existing research predominantly enhances the expressivity of graph neural networks without specific graph families in mind. This often leads to methods that are impractical due to their computational complexity. In contrast, the restriction to outerplanar graphs enables us to encode the Hamiltonian cycle of each biconnected component in linear time. As the main contribution of the paper, we prove that our method achieves maximum expressivity on outerplanar graphs. Experiments confirm that our graph transformation improves the predictive performance of MPNNs on molecular benchmark datasets at negligible computational overhead.
Franka Bause · Fabian Jogl · Patrick Indri · Tamara Drucks · David Penz · Nils M. Kriege · Thomas Gärtner · Pascal Welke · Maximilian Thiessen

Privacy-Utility Trade-offs in Neural Networks for Medical Population Graphs: Insights from Differential Privacy and Graph Structure (Poster)
Differential privacy (DP) is the gold standard for protecting individuals' data while enabling deep learning. It is well-established and frequently used for applications in medicine and healthcare to protect sensitive patient data. When using graph deep learning on so-called population graphs, however, the application of DP becomes more challenging compared to grid-like data structures like images or tables. In this work, we initiate an empirical investigation of differentially private graph neural networks on population graphs in the medical domain by examining privacy-utility trade-offs under different graph learning methods on both real-world and synthetic datasets. We compare two state-of-the-art methods for differentially private graph deep learning and empirically audit privacy guarantees through node membership inference and link stealing attacks. We hereby focus on the impact of the graph structure, one of the most important inherent challenges of medical population graphs. Our findings highlight the potential and the challenges of this specific DP application area. Moreover, we find evidence that the underlying graph structure constitutes a potential factor for larger performance gaps on one of the explored methods, by showing a correlation between the degree of graph homophily and the accuracy of the trained model.
Tamara Mueller · Maulik Chevli · Ameya Daigavane · Daniel Rueckert · Georgios Kaissis

Unsupervised Representation Learning of Brain Activity via Bridging Voxel Activity and Functional Connectivity (Poster)
Effective brain representation learning is a key step toward understanding cognitive processes and unlocking detection of, and potential therapeutic interventions for, neurological diseases and disorders. Existing studies have focused on either (1) voxel-level activity, where only a single beta weight for each voxel (i.e., an aggregation of voxel activity over a time window) is considered, missing their temporal dynamics, or (2) functional connectivity of the brain at the level of regions of interest, missing voxel-level activities. In this paper, we bridge this gap and design BrainMixer, an unsupervised learning framework that effectively utilizes both functional connectivity and the associated time series of voxels to learn voxel-level representations. BrainMixer employs two simple yet effective MLP-based encoders to simultaneously learn the dynamics of voxel-level signals and their functional correlations. To encode voxel activity, BrainMixer fuses information across both time and voxel dimensions via a dynamic self-attention mechanism. To learn the structure of the functional connectivity graph, BrainMixer introduces temporal graph patching and encodes each patch by combining its nodes' features via a new adaptive temporal pooling. Our experiments show that BrainMixer attains outstanding performance and outperforms 13 baselines in different downstream tasks and experimental setups.
Ali Behrouz · Parsa Delavari · Farnoosh Hashemi

HoloNets: Spectral Convolutions do extend to Directed Graphs (Spotlight)
Within the graph learning community, conventional wisdom dictates that spectral convolutional networks may only be deployed on undirected graphs: only there could the existence of a well-defined graph Fourier transform be guaranteed, so that information may be translated between spatial and spectral domains. Here we show this traditional reliance on the graph Fourier transform to be superfluous: making use of certain advanced tools from complex analysis and spectral theory, we extend spectral convolutions to directed graphs. We provide a frequency-response interpretation of newly developed filters, investigate the influence of the basis used to express filters, and discuss the interplay with characteristic operators on which networks are based. In order to thoroughly test the developed general theory, we conduct experiments in real-world settings, showcasing that directed spectral convolutional networks provide new state-of-the-art results for heterophilic node classification and, as opposed to baselines, may be rendered stable to resolution-scale varying topological perturbations.
Christian Koke · Daniel Cremers

Explaining Drug Repositioning: A Case-Based Reasoning Graph Neural Network Approach (Poster)
Drug repositioning, the identification of novel uses for existing therapies, has become an attractive strategy to accelerate drug development. Recently, knowledge graphs (KGs) have emerged as a powerful representation of interconnected data within the biomedical domain. While biomedical KGs can be used to predict new connections between compounds and diseases, most approaches only state whether two nodes are related, failing to explain why. In this project, we introduce an implementation of the semi-parametric Case-Based Reasoning over subgraphs approach (CBR-SUBG), designed to derive the underlying mechanisms for a drug query by gathering graph patterns of similar entities. We show that our adaptation outperforms existing KG link prediction models on a drug repositioning task. Furthermore, our findings demonstrate that the CBR-SUBG strategy can not only rank potential repositioning candidates but also provide interpretable biological paths, leading to more informed decisions.
Adriana Carolina Gonzalez Cavazos

Subgraphormer: Subgraph GNNs meet Graph Transformers (Poster)
In the realm of Graph Neural Networks (GNNs), two intriguing research directions have recently emerged: Subgraph GNNs and Graph Transformers. These approaches have distinct origins: Subgraph GNNs aim to address the limitations of message passing, while Graph Transformers seek to build on the success of sequential transformers in language and vision tasks. In this paper, we propose a model that integrates both approaches, dubbed Subgraphormer, which combines the message passing and global aggregation schemes from Subgraph GNNs with attention mechanisms and positional and structural encodings, arguably the most important components of Graph Transformers. Our preliminary experimental results demonstrate significant performance improvements over both Subgraph GNNs and Graph Transformers.
Guy Bar Shalom · Beatrice Bevilacqua · Haggai Maron

Everybody Needs a Little HELP: Explaining Graphs via Hierarchical Concepts (Poster)
Graph neural networks (GNNs) have led to major breakthroughs in a variety of domains such as drug discovery, social network analysis, and travel time estimation. However, they lack interpretability, which hinders human trust and thereby deployment to settings with high-stakes decisions. A line of interpretable methods approach this by discovering a small set of relevant concepts as subgraphs in the last GNN layer that together explain the prediction. This can yield oversimplified explanations, failing to explain the interaction between GNN layers. To address this oversight, we provide HELP (Hierarchical Explainable Latent Pooling), a novel, inherently interpretable graph pooling approach that reveals how concepts from different GNN layers compose to new ones in later steps. HELP is more than 1-WL expressive and is the first non-spectral, end-to-end-learnable, hierarchical graph pooling method that can learn to pool a variable number of arbitrary connected components. We empirically demonstrate that it performs on par with standard GCNs and popular pooling methods in terms of accuracy while yielding explanations that are aligned with expert knowledge in the domains of chemistry and social networks. In addition to a qualitative analysis, we employ concept completeness scores as well as concept conformity, a novel metric to measure the noise in discovered concepts, quantitatively verifying that the discovered concepts are significantly easier to fully understand than those from previous work. Our work represents a first step towards an understanding of graph neural networks that goes beyond a set of concepts from the final layer and instead explains the complex interplay of concepts on different levels.
Jonas Jürß · Lucie Charlotte Magister · Pietro Barbiero · Pietro Lió · Nikola Simidjievski

ProtoHG: Prototype-Enhanced Hypergraph Learning for Heterogeneous Information Networks (Poster)
The variety and complexity of relations in real-world data lead to Heterogeneous Information Networks (HINs). Capturing the semantics from such networks requires approaches capable of utilizing the full richness of the HINs. Existing methods for modeling HINs employ techniques originally designed for graph neural networks in combination with HIN decomposition analysis, especially using manually predefined metapaths. In this paper, we introduce a novel hypergraph learning approach for node classification in HINs. Using hypergraphs instead of graphs, our method captures higher-order relationships among heterogeneous nodes and extracts semantic information without relying on metapaths. Our method leverages the power of prototypes to improve the robustness of the hypergraph learning process, and we further discuss the advantages our method brings to scalability, an important issue for hypergraphs due to their expressiveness. Extensive experiments on three real-world HINs demonstrate the effectiveness of our method.
Shuai Wang · Jiayi Shen · Athanasios Efthymiou · Stevan Rudinac · Monika Kackovic · Nachoem Wijnberg · Marcel Worring

Learning Multiplex Embeddings on Text-rich Networks with One Text Encoder (Poster)
In real-world scenarios, texts in a network are often linked by multiple semantic relations (e.g., papers in an academic network are referenced by other publications, written by the same author, or published in the same venue). Text documents and their relations form a multiplex text-rich network. Mainstream text representation learning methods use pretrained language models (PLMs) to generate one embedding for each text unit, expecting that all types of relations between texts can be captured by these single-view embeddings. However, this presumption does not hold, particularly in multiplex text-rich networks. Along another line of work, multiplex graph neural networks (GNNs) directly initialize node attributes as a feature vector for node representation learning, but they cannot fully capture the semantics of the nodes' associated texts. In this paper, we propose METERN, a new framework for learning Multiplex Embeddings on TExt-Rich Networks. In contrast to existing methods, METERN uses one text encoder to model the shared knowledge across relations and leverages a small number of parameters per relation to derive relation-specific representations. This allows the encoder to effectively capture the multiplex structures in the network while also preserving parameter efficiency. We conduct experiments on nine downstream tasks in five networks from both academic and e-commerce domains, where METERN outperforms baselines significantly and consistently. The code is available at https://anonymous.4open.science/r/METER-submit-2C7B.
Bowen Jin · Wentao Zhang · Yu Zhang · Yu Meng · Han Zhao · Jiawei Han

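The parameter layout is easy to picture: one shared text encoder plus a small learned vector per relation. Below is a schematic sketch under our own assumptions (a frozen bag-of-embeddings stand-in for the PLM, additive relation shifts, hypothetical relation names); it is not the released METERN code:

```python
import torch
import torch.nn as nn

class MultiplexEmbedder(nn.Module):
    """One shared encoder; a small learned parameter vector per relation type."""
    def __init__(self, vocab=1000, dim=64,
                 relations=("cites", "same-author", "same-venue")):
        super().__init__()
        self.encoder = nn.EmbeddingBag(vocab, dim)   # stand-in for a PLM
        self.rel = nn.ParameterDict(
            {r: nn.Parameter(torch.zeros(dim)) for r in relations}
        )

    def forward(self, token_ids, relation):
        shared = self.encoder(token_ids)     # relation-agnostic text view
        return shared + self.rel[relation]   # cheap relation-specific shift

model = MultiplexEmbedder()
tokens = torch.randint(0, 1000, (2, 16))     # two toy "documents"
print(model(tokens, "cites").shape)          # torch.Size([2, 64])
```
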
Non-backtracking Graph Neural Networks (Spotlight)
The celebrated message-passing updates for graph neural networks allow the representation of large-scale graphs with local and computationally tractable updates. However, the local updates suffer from backtracking, i.e., a message flows through the same edge twice and revisits the previously visited node. Since the number of message flows increases exponentially with the number of updates, the redundancy in local updates prevents the graph neural network from accurately recognizing a particular message flow for downstream tasks. In this work, we propose to resolve such a redundancy via the non-backtracking graph neural network (NBA-GNN) that updates a message without incorporating the message from the previously visited node. We further investigate how NBA-GNN alleviates the over-squashing of GNNs, and establish a connection between NBA-GNN and the impressive performance of non-backtracking updates for stochastic block model recovery. We empirically verify the effectiveness of our NBA-GNN on long-range graph benchmarks and transductive node classification problems.
Seonghyun Park · Narae Ryu · Gahee Kim · Dongyeop Woo · Se-Young Yun · Sungsoo Ahn

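Non-backtracking updates are most naturally written on directed edges: the message on edge (u, v) aggregates messages on edges (w, u) with w != v. A minimal sketch of one such update (ours, with untrained weights and a naive quadratic scan over edges, in plain PyTorch):

```python
import torch

edges = [(0, 1), (1, 0), (1, 2), (2, 1), (2, 0), (0, 2)]  # directed edge list
d = 8
h = torch.randn(len(edges), d)       # one hidden state per directed edge
W = torch.randn(d, d) / d ** 0.5

def nba_step(h):
    """Edge (u, v) aggregates edges (w, u) with w != v, so no message
    immediately flows back along the edge it just arrived on."""
    new_h = torch.zeros_like(h)
    for i, (u, v) in enumerate(edges):
        incoming = [j for j, (w, x) in enumerate(edges) if x == u and w != v]
        if incoming:
            new_h[i] = torch.tanh(h[incoming].sum(0) @ W)
    return new_h

h = nba_step(h)
print(h.shape)   # torch.Size([6, 8])
```
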
Large Graph Models: A Perspective (Poster)
Large models have emerged as the most recent groundbreaking achievements in artificial intelligence, particularly in machine learning. However, when it comes to graphs, large models have not achieved the same level of success as in other fields, such as natural language processing and computer vision. In order to move the application of large models for graphs forward, we present a perspective paper to discuss the challenges and opportunities associated with developing large graph models. First, we discuss the desired characteristics of large graph models. Then, we present detailed discussions from three key perspectives: representation basis, graph data, and graph models. In each category, we provide a brief overview of recent advances and highlight the remaining challenges together with our visions. Finally, we discuss valuable applications of large graph models. We believe this perspective paper is able to encourage further investigations into large graph models, ultimately pushing us one step closer towards artificial general intelligence (AGI).
Ziwei Zhang · Haoyang Li · Zeyang Zhang · Yijian Qin · Xin Wang · Wenwu Zhu

Privacy-preserving design of graph neural networks with applications to vertical federated learning (Poster)
The paradigm of vertical federated learning (VFL), where institutions collaboratively train machine learning models by combining each other's local feature or label information, has achieved great success in applications to financial risk management (FRM). The surging developments of graph representation learning (GRL) have opened up new opportunities for FRM applications under FL via efficiently utilizing the graph-structured data generated from underlying transaction networks. Meanwhile, transaction information is often considered highly sensitive. To prevent data leakage during training, it is critical to develop FL protocols with formal privacy guarantees. In this paper, we present an end-to-end GRL framework in the VFL setting called VESPER, which is built upon a general privatization scheme termed perturbed message passing (PMP) that allows the privatization of many popular graph neural architectures. Based on PMP, we discuss the strengths and weaknesses of specific design choices of concrete graph neural architectures and provide solutions and improvements for both dense and sparse graphs. Extensive empirical evaluations over both public datasets and an industry dataset demonstrate that VESPER is capable of training high-performance GNN models over both sparse and dense graphs under reasonable privacy budgets.
Ruofan Wu · Mingyang Zhang · Lingjuan Lyu · Xiaolong Xu · Xiuquan Hao · xinyi fu · Tengfei LIU · Tianyi Zhang · Weiqiang Wang

PowerGraph: A power grid benchmark dataset for graph neural networks (Poster)
Public Graph Neural Network (GNN) benchmark datasets facilitate the use of GNNs and enhance their applicability to diverse disciplines. The community currently lacks public datasets of electrical power grids for GNN applications. Power grids are complex engineered networks that are naturally amenable to graph representations, so GNNs have the potential to capture the behaviour of power grids better than alternative machine learning techniques. To this aim, we develop a graph dataset for cascading failure events, which are the major cause of blackouts in electric power grids. Historical blackout datasets are scarce and incomplete, and the assessment of vulnerability and the identification of critical components are usually conducted via computationally expensive offline simulations of cascading failures. Instead, we propose the use of machine learning models for the online detection of cascading failures, leveraging the knowledge of the system state at the onset of the cascade. We develop PowerGraph, a graph dataset modelling cascading failures in power grids, designed for two purposes: i) training GNN models for different graph-level tasks including multi-class classification, binary classification, and regression, and ii) explaining GNN models. The dataset, generated via a physics-based cascading failure model, ensures generality of the operating and environmental conditions by spanning diverse failure scenarios. In addition, we foster the use of the dataset to benchmark GNN explainability methods by assigning ground-truth edge-level explanations. PowerGraph helps the development of better GNN models for graph-level tasks and explainability, critical in many domains ranging from chemistry to biology, where systems and processes can be described as graphs. The dataset is available at https://figshare.com/articles/dataset/PowerGraph/22820534 and the code at https://anonymous.4open.science/r/PowerGraph/.
Anna Varbella · Kenza Amara · Blazhe Gjorgiev · Giovanni Sansavini

Implicit Graph Neural Diffusion Based on Constrained Dirichlet Energy Minimization (Poster)
Implicit graph neural networks (GNNs) have emerged as a potential approach to enable GNNs to capture long-range dependencies effectively. However, poorly designed implicit GNN layers can experience over-smoothing or may have limited adaptability to learn the graph geometry, potentially hindering their performance in graph learning problems. To address these issues, we introduce a geometric framework to design implicit graph diffusion layers based on a parameterized graph Laplacian operator. Our framework allows learning the metrics of vertex and edge spaces, as well as the graph gradient operator, from data. We further show how implicit GNN layers can be viewed as the fixed-point solution of a Dirichlet energy minimization problem and give conditions under which they may suffer from over-smoothing. To overcome the over-smoothing problem, we design our implicit graph diffusion layer as the solution of a Dirichlet energy minimization problem with constraints on vertex features, enabling it to trade off smoothing with the preservation of node feature information. With an appropriate hyperparameter set to be larger than the largest eigenvalue of the parameterized graph Laplacian, our framework guarantees a unique equilibrium and quick convergence. Our models demonstrate better performance than leading implicit and explicit GNNs on benchmark datasets for node and graph classification tasks, with substantial accuracy improvements observed for some datasets.
Guoji Fu · Mohammed Haroon Dupty · Yanfei Dong · Wee Sun Lee

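For orientation, the classical (unparameterized) Dirichlet energy on a graph, of which such layers minimize constrained variants, can be written as

$$E(X) \;=\; \tfrac{1}{2} \sum_{(i,j) \in E} w_{ij}\, \lVert x_i - x_j \rVert^2 \;=\; \operatorname{tr}\!\left(X^\top L X\right),$$

where $X$ stacks the node features $x_i$, $w_{ij}$ are edge weights, and $L$ is the graph Laplacian. As we read the abstract, the paper's contribution is to parameterize $L$ by learning the vertex and edge metrics, and to add vertex-feature constraints that keep the energy minimizer from collapsing to an over-smoothed solution.
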
RL4CO: a Unified Reinforcement Learning for Combinatorial Optimization Library (Spotlight)
Deep reinforcement learning offers notable benefits in addressing combinatorial problems over traditional solvers, reducing the reliance on domain-specific knowledge and expert solutions, and improving computational efficiency. Despite the recent surge in interest in neural combinatorial optimization, practitioners often do not have access to a standardized code base. Moreover, different algorithms are frequently based on fragmented implementations that hinder reproducibility and fair comparison. To address these challenges, we introduce RL4CO, a unified Reinforcement Learning (RL) for Combinatorial Optimization (CO) library. We employ state-of-the-art software and best practices in implementation, such as modularity and configuration management, to be flexible, easily modifiable, and extensible by researchers. Thanks to our unified codebase, we benchmark baseline RL solvers with different evaluation schemes on zero-shot performance, generalization, and adaptability on diverse tasks. Notably, we find that some recent methods may fall behind their predecessors depending on the evaluation settings. We hope RL4CO will encourage the exploration of novel solutions to complex real-world tasks, allowing the community to compare with existing methods through a unified framework that decouples the science from software engineering. We open-source our library at https://anonymous.4open.science/r/rl4co-glfrontiers.
Federico Berto · Chuanbo Hua · Junyoung Park · Minsu Kim · Hyeonah Kim · Jiwoo SON · HAEYEON KIM · joungho kim · Jinkyoo Park

Multimodal Graph Learning for Generative Tasks (Poster)
Multimodal learning combines multiple data modalities, broadening the types and complexity of data our models can utilize; for example, from plain text to image-caption pairs. Most multimodal learning algorithms focus on modeling simple one-to-one pairs of data from two modalities, such as image-caption pairs or audio-text pairs. However, in most real-world settings, entities of different modalities interact with each other in more complex and multifaceted ways, going beyond one-to-one mappings. We propose to represent these complex relationships as graphs, allowing us to capture data with any number of modalities, and with complex relationships between modalities that can flexibly vary from one sample to another. Toward this goal, we propose Multimodal Graph Learning (MMGL), a general and systematic framework for capturing information from multiple multimodal neighbors with relational structures among them. In particular, we focus on MMGL for generative tasks, building upon pretrained Language Models (LMs), aiming to augment their text generation with multimodal neighbor contexts. We study three research questions raised by MMGL: (1) how can we infuse multiple neighbor information into the pretrained LMs, while avoiding scalability issues? (2) how can we infuse the graph structure information among multimodal neighbors into the LMs? and (3) how can we finetune the pretrained LMs to learn from the neighbor context parameter-efficiently? We conduct extensive experiments to answer these three questions on MMGL and analyze the empirical results to pave the way for future MMGL research.
Minji Yoon · Jing Yu Koh · Bryan Hooi · Ruslan Salakhutdinov

-
|
CuriousWalk: Enhancing Multi-Hop Reasoning in Graphs with Random Network Distillation
(
Poster
)
>
link
Structured knowledge bases in the form of graphs often suffer from incompleteness and inaccuracy in representing information. One popular method of densifying graphs involves constructing a reinforcement learning agent that learns to traverse entities and relations sequentially, starting from a query entity and following a query relation until it reaches the desired answer entity. However, these agents are often limited by the sparse reward structure of the environment, as well as their inability to find diverse paths from the question to the answer entities. In this paper, we attempt to address these issues by augmenting the agent with intrinsic rewards, which aid exploration and offer meaningful feedback at intermediate steps, pushing the agent in the right direction. |
Varun Kausika · Saurabh Jha · Adya Jha · Amy Zhang · Michael Sury 🔗 |
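The Random Network Distillation (RND) ingredient named in the title can be sketched generically. The module below (dimensions and architecture are our own placeholder choices, not the paper's) produces an intrinsic reward as the prediction error of a trained network against a frozen, randomly initialized target, so rarely visited states earn larger exploration bonuses to supplement the sparse extrinsic reward:

```python
import torch
import torch.nn as nn

class RNDBonus(nn.Module):
    """Random Network Distillation: the per-state error of `predictor`
    against the frozen random `target` serves both as the training loss
    for the predictor and as the intrinsic reward for the agent."""
    def __init__(self, state_dim, feat_dim=64):
        super().__init__()
        def mlp():
            return nn.Sequential(nn.Linear(state_dim, 128), nn.ReLU(),
                                 nn.Linear(128, feat_dim))
        self.target, self.predictor = mlp(), mlp()
        for p in self.target.parameters():   # target is never trained
            p.requires_grad_(False)

    def forward(self, states):
        with torch.no_grad():
            t = self.target(states)
        # Per-sample squared error: large for novel states, small for familiar ones.
        return ((self.predictor(states) - t) ** 2).mean(dim=-1)
```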
-
|
Sparse but Strong: Crafting Adversarially Robust Graph Lottery Tickets
(
Poster
)
>
link
Graph Lottery Tickets (GLTs), comprising a sparse adjacency matrix and a sparse graph neural network (GNN), can significantly reduce the inference latency and compute footprint compared to their dense counterparts. Despite these benefits, their performance against adversarial structure perturbations remains to be fully explored. In this work, we first investigate the resilience of GLTs against different structure perturbation attacks and observe that they are highly vulnerable and show a large drop in classification accuracy. Based on this observation, we then present an adversarially robust graph sparsification (ARGS) framework that prunes the adjacency matrix and the GNN weights by optimizing a novel loss function capturing the graph homophily property and information associated with both the true labels of the train nodes and the pseudo labels of the test nodes. By iteratively applying ARGS to prune both the perturbed graph adjacency matrix and the GNN model weights, we can find adversarially robust graph lottery tickets that are highly sparse yet achieve competitive performance under different untargeted training-time structure attacks. Thorough evaluations conducted on various benchmarks, considering different poisoning structure attacks such as PGD, MetaAttack, and Meta-PGD attack, demonstrate that the GLTs generated by ARGS can significantly improve the robustness, even when subjected to high levels of sparsity. |
Subhajit Dutta Chowdhury · Zhiyu Ni · Qingyuan Peng · Souvik Kundu · Pierluigi Nuzzo 🔗 |
-
|
ResolvNet: A Graph Convolutional Network with multi-scale Consistency
(
Spotlight
)
>
link
It is by now a well-known fact in the graph learning community that the presence of bottlenecks severely limits the ability of graph neural networks to propagate information over long distances. What has so far not been appreciated is that, counter-intuitively, the presence of strongly connected sub-graphs may also severely restrict information flow in common architectures. Motivated by this observation, we introduce the concept of multi-scale consistency. At the node level, this concept refers to the retention of a connected propagation graph even if connectivity varies over a given graph. At the graph level, multi-scale consistency refers to the fact that distinct graphs describing the same object at different resolutions should be assigned similar feature vectors. As we show, both properties are not satisfied by popular graph neural network architectures. To remedy these shortcomings, we introduce ResolvNet, a flexible graph neural network based on the mathematical concept of resolvents. We rigorously establish its multi-scale consistency theoretically and verify it in extensive experiments on real-world data: here, networks based on the ResolvNet architecture prove expressive, outperforming baselines significantly on many tasks, both in and outside the multi-scale setting. |
Christian Koke · Abhishek Saroha · Yuesong Shen · Marvin Eisenberger · Daniel Cremers 🔗 |
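For readers unfamiliar with the term: the resolvent of an operator L is (L − zI)⁻¹. A rough illustration of applying it to node features with a single linear solve (our sketch of the underlying mathematical object, not the ResolvNet layer itself):

```python
import torch

def resolvent_propagate(X, L, z=-1.0):
    # Apply the resolvent (L - z*I)^(-1) of the graph Laplacian to node
    # features X. For z < 0 the matrix is positive definite, so the solve
    # is well-posed; unlike polynomial filters, the resolvent responds
    # smoothly as connectivity strengths in the graph vary.
    n = L.shape[0]
    return torch.linalg.solve(L - z * torch.eye(n), X)
```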
-
|
On the modelling and impact of negative edges in graph convolutional networks for node classification
(
Poster
)
>
link
Signed graphs are important data structures to simultaneously express positive and negative relationships. Their application ranges from structural health monitoring to financial models, where the meaning and properties of negative relationships can play a significant role. In this paper, we provide a comprehensive examination of existing approaches for the integration of signed edges into the Graph Convolutional Network (GCN) framework for node classification. Here we use a combination of theoretical and empirical analysis to gain a deeper understanding of the strengths and limitations of different mechanisms and to identify areas for possible improvement. We compare six different approaches to the integration of negative link information within the framework of the simple GCN. In particular, we analyze sensitivity towards feature noise, negative edge noise and positive edge noise, as well as robustness towards feature scaling and translation, explaining the results obtained on the basis of individual model assumptions and biases. Our findings highlight the importance of capturing the meaning of negative links in a given domain context, and appropriately reflecting it in the choice of GCN model. |
Thu Trang Dinh · Julia Handl · Luis Ospina-Forero 🔗 |
-
|
VN-EGNN: Equivariant Graph Neural Networks with Virtual Nodes Enhance Protein Binding Site Identification
(
Spotlight
)
>
link
Being able to identify regions within or around proteins, to which ligands can potentially bind, is an essential step to develop new drugs. Binding site identification methods can now profit from the availability of large amounts of 3D structures in protein structure databases or from AlphaFold predictions. Current binding site identification methods rely on geometric deep learning, which takes geometric invariances and equivariances into account. Such methods turned out to be very beneficial for physics-related tasks like binding energy or motion trajectory prediction. However, their performance at binding site identification is still limited, which might be due to limited expressivity or oversquashing effects of E(n)-Equivariant Graph Neural Networks (EGNNs). Here, we extend EGNNs by adding virtual nodes and applying an extended message passing scheme. The virtual nodes in these graphs both improve the predictive performance and can also learn to represent binding sites. In our experiments, we show that VN-EGNN sets a new state of the art at binding site identification on three common benchmarks, COACH420, HOLO4K, and PDBbind2020. |
Florian Sestak · Lisa Schneckenreiter · Sepp Hochreiter · Andreas Mayr · Günter Klambauer 🔗 |
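The virtual-node construction in isolation amounts to simple edge-index surgery; the sketch below (ours; the paper's extended message-passing scheme and E(n)-equivariant updates are not reproduced) connects each virtual node bidirectionally to all real nodes, so any two real nodes can exchange messages in two hops:

```python
import torch

def add_virtual_nodes(edge_index, num_nodes, num_virtual=1):
    # Returns an augmented (2, E') edge index and the new node count.
    # Virtual-node embeddings can double as learned readouts, e.g. for
    # representing candidate binding sites.
    edges = [edge_index]
    for v in range(num_virtual):
        vid = num_nodes + v
        real = torch.arange(num_nodes)
        virt = torch.full((num_nodes,), vid)
        edges.append(torch.stack([real, virt]))  # real -> virtual
        edges.append(torch.stack([virt, real]))  # virtual -> real
    return torch.cat(edges, dim=1), num_nodes + num_virtual
```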
-
|
Hierarchical Relationships: A New Perspective to Enhance Scene Graph Generation
(
Poster
)
>
link
This paper presents a finding that leveraging the hierarchical structures among labels for relationships and objects can substantially improve the performance of scene graph generation systems. The focus of this work is to create an informative hierarchical structure that can divide object and relationship categories into disjoint super-categories in a systematic way. Specifically, we introduce a Bayesian prediction head to jointly predict the super-category of the relationship between a pair of object instances and the detailed relationship within that super-category, facilitating more informative predictions. The resulting model exhibits the capability to produce a more extensive set of predicates beyond the dataset annotations and to tackle the prevalent issue of low annotation quality. While our paper presents preliminary findings, experiments on the Visual Genome dataset show strong performance, particularly in predicate classification and zero-shot settings, demonstrating the promise of our approach. |
Bowen Jiang · Camillo Taylor 🔗 |
-
|
Exploring the Potential of Large Language Models (LLMs) in Learning on Graph
(
Poster
)
>
link
Learning on Graphs has attracted immense attention due to its wide real-world applications. The most popular pipeline for learning on graphs with textual node attributes primarily relies on Graph Neural Networks (GNNs), and utilizes shallow text embeddings as initial node representations, which have limitations in general knowledge and profound semantic understanding. In recent years, Large Language Models (LLMs) have been proven to possess extensive common knowledge and powerful semantic comprehension abilities that have revolutionized existing workflows to handle text data. In this paper, we aim to explore the potential of LLMs in graph machine learning, especially the node classification task, and investigate two possible pipelines: LLMs-as-Enhancers and LLMs-as-Predictors. The former leverages LLMs to enhance nodes' text attributes with their massive knowledge and then generates predictions through GNNs. The latter attempts to directly employ LLMs as standalone predictors. We conduct comprehensive and systematic studies on these two pipelines under various settings. From comprehensive empirical results, we make original observations and find new insights that open new possibilities and suggest promising directions to leverage LLMs for learning on graphs. |
Zhikai Chen · Haitao Mao · Hang Li · Wei Jin · Hongzhi Wen · Xiaochi Wei · Shuaiqiang Wang · Dawei Yin · Wenqi Fan · Hui Liu · Jiliang Tang 🔗 |
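In its simplest form, the LLMs-as-Predictors pipeline serializes a node's text attributes (optionally with neighbor context) into a prompt. A hypothetical sketch for citation-graph node classification; the `llm` callable, prompt template, and field names are our inventions, not the paper's:

```python
def classify_node_with_llm(llm, title, abstract, neighbor_titles, labels):
    # `llm` is any callable mapping a prompt string to a completion string.
    prompt = (
        f"Paper title: {title}\nAbstract: {abstract}\n"
        f"Cited papers: {'; '.join(neighbor_titles)}\n"
        f"Classify this paper into one of {labels}. Answer with the label only."
    )
    return llm(prompt).strip()
```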
-
|
One Node Per User: Node-Level Federated Learning for Graph Neural Networks
(
Poster
)
>
link
Graph Neural Networks (GNNs) training often necessitates gathering raw user data on a central server, which raises significant privacy concerns. Federated learning emerges as a solution, enabling collaborative model training without users directly sharing their raw data. However, integrating federated learning with GNNs presents unique challenges, especially when a client represents a graph node and holds merely a single feature vector. In this paper, we propose a novel framework for node-level federated graph learning. Specifically, we decouple the message-passing and feature vector transformation processes of the first GNN layer, allowing them to be executed separately on the user devices and the cloud server. Moreover, we introduce a graph Laplacian term based on the feature vector's latent representation to regulate the user-side model updates. The experiment results on multiple datasets show that our approach achieves better performance compared with baselines. |
Zhidong Gao · Yuanxiong Guo · Yanmin Gong 🔗 |
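A schematic of the decoupling, under our reading of the abstract (names and shapes are illustrative): the first layer's private feature transform runs on each device, while neighborhood aggregation over the server-known graph runs in the cloud.

```python
import torch

def client_transform(x_u, W):
    # On-device: a user transforms only their own feature vector, so raw
    # features never leave the device.
    return torch.relu(x_u @ W)

def server_aggregate(H, A_hat):
    # On the server: one round of message passing over the normalized
    # adjacency A_hat, applied to the stacked client uploads H.
    return A_hat @ H
```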
-
|
GraphRAG: Reasoning on Graphs with Retrieval-Augmented LLMs
(
Poster
)
>
link
Recently, large language models (LLMs) have made significant advancements on various tasks in natural language processing. However, LLMs often lack knowledge about data that is internal to enterprises, some of which is structured as graphs, and it remains unclear how to enable LLMs to reason on external graph-structured data. To address this, we propose a retrieval-augmented generative model, \textit{GraphRAG}, that employs a structure-aware retriever to leverage information from graph-structured data and leverages an LLM to reason on the retrieved structural data. We evaluate the model on the task of predicting product categories of an e-commerce portal that are relevant to a webpage, when the categories are organized as a collection of disjoint trees. Extensive experiments show that GraphRAG improves precision@3 by 10.22% over baseline retrieval-augmented generation models. |
Zhen Han · Anand Muralidhar · Aditya Degala 🔗 |
-
|
How does over-squashing affect the power of GNNs?
(
Spotlight
)
>
link
Graph Neural Networks (GNNs) are the state-of-the-art model for machine learning on graph-structured data. The most popular class of GNNs operate by exchanging information between adjacent nodes, and are known as Message Passing Neural Networks (MPNNs). While understanding the expressive power of MPNNs is a key question, existing results typically consider settings with uninformative node features. In this paper, we provide a rigorous analysis to determine which function classes of node features can be learned by an MPNN of a given capacity. We do so by measuring the level of \emph{pairwise interactions} between nodes that MPNNs allow for. This measure provides a novel quantitative characterization of the so-called over-squashing effect, which is observed to occur when a large volume of messages is aggregated into fixed-size vectors. Using our measure, we prove that, to guarantee sufficient communication between pairs of nodes, the capacity of the MPNN must be large enough, depending on properties of the input graph structure, such as commute times. For many relevant scenarios, our analysis results in impossibility statements in practice, showing that \emph{over-squashing hinders the expressive power of MPNNs}. Our theory also holds for geometric graphs and hence extends to equivariant MPNNs on point clouds. We validate our analysis through extensive controlled experiments and ablation studies. |
Francesco Di Giovanni · T. Konstantin Rusch · Michael Bronstein · Andreea-Ioana Deac · Marc Lackenby · Siddhartha Mishra · Petar Veličković 🔗 |
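For reference, the commute time invoked in the capacity bounds is the classical quantity expressible through effective resistance,

```latex
\tau(u,v) = 2\,|E|\, R_{\mathrm{eff}}(u,v),
\qquad
R_{\mathrm{eff}}(u,v) = (\mathbf{e}_u - \mathbf{e}_v)^{\top} L^{+} (\mathbf{e}_u - \mathbf{e}_v),
```

where $L^{+}$ is the Moore-Penrose pseudoinverse of the graph Laplacian and $\mathbf{e}_u$ is the indicator vector of node $u$; poorly connected pairs have large commute times and, per the analysis above, demand more MPNN capacity.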
-
|
Beyond Erdos-Renyi: Generalization in Algorithmic Reasoning on Graphs
(
Poster
)
>
link
Neural algorithmic reasoning excels in many graph algorithms, but assessment mainly focuses on the Erdős-Rényi (ER) graph family. This study delves into graph algorithmic models' generalization across diverse distributions. Testing a leading model exposes an overreliance on ER graphs for generalization assessment. We further investigate two scenarios: generalization to every target distribution and to single target distributions. Our results suggest that achieving the former is not as trivial as it may seem, and that achieving the latter can be aided by selecting the source distribution via a novel interpretation of the Tree Mover's Distance. |
Dobrik Georgiev · Pietro Lió · Jakub Bachurski · Junhua Chen · Tunan Shi 🔗 |
-
|
Graph Neural Networks Go Forward-Forward
(
Poster
)
>
link
We present the Graph Forward-Forward (GFF) algorithm, an extension of the Forward-Forward procedure to graphs, able to handle features distributed over a graph's nodes. This allows training graph neural networks with forward passes only, without backpropagation. Our method is agnostic to the message-passing scheme, and provides a more biologically plausible learning scheme than backpropagation, while also carrying computational advantages. With GFF, graph neural networks are trained greedily layer by layer, using both positive and negative samples. We run experiments on 11 standard graph property prediction tasks, showing how GFF provides an effective alternative to backpropagation for training graph neural networks. In particular, the procedure remains remarkably efficient despite combining per-layer training with the locality of processing in a GNN. |
Daniele Paliotta · Mathieu Alain · Bálint Máté · François Fleuret 🔗 |
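One greedy, layer-local update can be sketched as follows (a minimal adaptation of the Forward-Forward recipe to a generic propagation matrix `A_hat`; the details of GFF itself may differ). Each layer raises the "goodness" (squared activation norm) of positive samples above a threshold and pushes negative samples below it, with no gradients crossing layer boundaries:

```python
import torch

def gff_layer_step(layer, opt, A_hat, X_pos, X_neg, theta=2.0):
    # `layer` is e.g. nn.Linear; `opt` optimizes only this layer's weights.
    def goodness(X):
        H = torch.relu(A_hat @ layer(X))       # one propagation round
        return (H ** 2).sum(dim=1)
    # Softplus losses: push goodness above theta for positives, below for negatives.
    loss = torch.log1p(torch.exp(-(goodness(X_pos) - theta))).mean() \
         + torch.log1p(torch.exp(goodness(X_neg) - theta)).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    with torch.no_grad():                      # detach before the next layer
        return torch.relu(A_hat @ layer(X_pos)), torch.relu(A_hat @ layer(X_neg))
```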
-
|
Towards Particle Flow Event Reconstruction at the Future Circular Collider with GNNs
(
Poster
)
>
link
Reconstructing particle properties from raw signals measured in particle physics detectors is a challenging task due to the complex shapes of the showers and their variety in density and sparsity. Classical particle reconstruction algorithms in current detectors use a multi-step pipeline, but the increase in data complexity of future detectors will reduce their performance. We consider a geometric graph representation due to the sparsity and differences in density of particle showers. We introduce a dataset for particle-level reconstruction at the Future Circular Collider and benchmark the performance of state-of-the-art GNN architectures on this dataset. We show that our pipeline performs with high efficiency and response and discuss how this type of data can further drive the development of novel geometric GNN approaches. |
Dolores Garcia · Gregor Kržmanc · Philipp Zehetner · Jan Kieseler · Michele Selvaggi 🔗 |
-
|
EDGE++: Improved Training and Sampling of EDGE
(
Poster
)
>
link
Traditional graph-generative models like the Stochastic-Block Model (SBM) fall short in capturing complex structures inherent in large graphs. Recently developed deep learning models like NetGAN, CELL, and Variational Graph Autoencoders have made progress but face limitations in replicating key graph statistics. Diffusion-based methods such as EDGE have emerged as promising alternatives, however, they present challenges in computational efficiency and generative performance. In this paper, we propose enhancements to the EDGE model to address these issues. Specifically, we introduce a degree-specific noise schedule that optimizes the number of active nodes at each timestep, significantly reducing memory consumption. Additionally, we present an improved sampling scheme that fine-tunes the generative process, allowing for better control over the similarity between the synthesized and the true network. Our experimental results demonstrate that the proposed modifications not only improve the efficiency but also enhance the accuracy of the generated graphs, offering a robust and scalable solution for graph generation tasks. |
Xiaohui Chen · Mingyang Wu · Liping Liu 🔗 |
-
|
On the Temperature of Bayesian Graph Neural Networks for Conformal Prediction
(
Poster
)
>
link
Accurate uncertainty quantification in graph neural networks (GNNs) is essential, especially in high-stakes domains where GNNs are frequently employed. Conformal prediction (CP) offers a promising framework for quantifying uncertainty by providing \textit{valid} prediction sets for any black-box model. CP ensures formal probabilistic guarantees that a prediction set contains a true label with a desired probability. However, the size of prediction sets, known as \textit{inefficiency}, is influenced by the underlying model and the data generating process. On the other hand, Bayesian learning also provides a credible region based on the estimated posterior distribution, but this region is \textit{well-calibrated} only when the model is correctly specified. Building on recent work that introduced a scaling parameter for constructing valid credible regions from the posterior estimate, our study explores the advantages of incorporating a temperature parameter into Bayesian GNNs within the CP framework. We empirically demonstrate the existence of temperatures that result in more efficient prediction sets. Furthermore, we conduct an analysis to identify the factors contributing to inefficiency and offer valuable insights into the relationship between CP performance and model calibration. |
Seohyeon Cha · Honggu Kang · Joonhyuk Kang 🔗 |
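To make the interplay concrete, here is a minimal split-conformal routine with a temperature knob (using the standard 1 − p_true score; the paper's Bayesian GNN machinery is not reproduced). Because `T` rescales calibration and test scores consistently, coverage stays valid while the average set size, the inefficiency, varies with `T`:

```python
import torch

def conformal_sets(logits_cal, y_cal, logits_test, alpha=0.1, T=1.0):
    # Calibration scores: 1 - softmax probability of the true class.
    p_cal = torch.softmax(logits_cal / T, dim=1)
    scores = 1 - p_cal[torch.arange(len(y_cal)), y_cal]
    n = len(scores)
    # Finite-sample-corrected (1 - alpha) quantile of the scores.
    q = torch.quantile(scores, min(1.0, (n + 1) * (1 - alpha) / n))
    # Keep every class whose score falls below the threshold.
    p_test = torch.softmax(logits_test / T, dim=1)
    return (1 - p_test) <= q      # boolean mask: one prediction set per row
```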
-
|
Shedding Light on Random Dropping and Oversmoothing
(
Poster
)
>
link
Graph Neural Networks (GNNs) are widespread in graph representation learning. Random dropping approaches, notably DropEdge and DropMessage, claim to alleviate the key issues of overfitting and oversmoothing by randomly removing elements of the graph representation. However, their effectiveness is largely unverified. In this work, we find empirically that they have a limited effect in reducing oversmoothing, contrary to what is typically assumed in the literature. These approaches are also non-parametric and motivate us to question if learned dropping can alleviate the propagation of redundant or noisy edges. We propose a new information-theoretic approach, in which we learn to perform dropping on the data exchanged by nodes during message passing via optimizing an information bottleneck. Our approach is superior to previous dropping methods in oversmoothing reduction and has promising performance in the case of deep GNNs. |
Han Xuanyuan · Tianxiang Zhao · Dongsheng Luo 🔗 |
-
|
Towards Foundation Models for Knowledge Graph Reasoning
(
Poster
)
>
link
Foundation models in language and vision have the ability to run inference on any textual and visual inputs thanks to transferable representations such as a vocabulary of tokens in language. Knowledge graphs (KGs) have different entity and relation vocabularies that generally do not overlap. The key challenge of designing foundation models on KGs is to learn such transferable representations that enable inference on any graph with arbitrary entity and relation vocabularies. In this work, we take a step towards such foundation models and present ULTRA, an approach for learning universal and transferable graph representations. ULTRA builds relational representations as a function conditioned on their interactions. Such a conditioning strategy allows a pre-trained ULTRA model to inductively generalize to any unseen KG with any relation vocabulary and to be fine-tuned on any graph. Conducting link prediction experiments on 57 different KGs, we find that the zero-shot inductive inference performance of a single pre-trained ULTRA model on unseen graphs of various sizes is often on par with or better than strong baselines trained on specific graphs. Fine-tuning further boosts the performance. |
Michael Galkin · Xinyu Yuan · Hesham Mostafa · Jian Tang · Zhaocheng Zhu 🔗 |
-
|
A Multi-Task Perspective for Link Prediction with New Relation Types and Nodes
(
Poster
)
>
link
The task of inductive link prediction in (discrete) attributed multigraphs infers missing attributed links (relations) between nodes in new test multigraphs. Traditional relational learning methods face the challenge of limited generalization to test multigraphs containing both novel nodes and novel relation types not seen in training. Recently, under the only assumption that all relation types share the same structural predictive patterns (single task), Gao et al. (2023) proposed a link prediction method using the theoretical concept of double equivariance (equivariance for nodes & relation types), in contrast to the (single) equivariance (only for nodes) used to design Graph Neural Networks (GNNs). In this work we further extend the double equivariance concept to multi-task double equivariance, where we define link prediction in attributed multigraphs that can have distinct and potentially conflicting predictive patterns for different sets of relation types (multiple tasks). Our empirical results on real-world datasets demonstrate that our approach can effectively generalize to test graphs with multi-task structures without access to additional information. |
Jincheng Zhou · Beatrice Bevilacqua · Bruno Ribeiro 🔗 |
-
|
Prompt Learning Unlocked for App Promotion in the Wild
(
Poster
)
>
link
In recent times, mobile apps have increasingly incorporated app promotion ads to promote other apps, raising cybersecurity and online commerce concerns related to societal trust and recommendation systems. To effectively discover the intricate nature of the app promotion graph data, we center around the graph completion task, aiming to learn the connection patterns among diverse relations and entities. However, accurately deciphering the connection patterns in such a large and diverse graph presents significant challenges for deep learning models. To overcome these challenges, we introduce Prompt Promotion, a transformer-based framework that unlocks prompt learning capabilities by incorporating metapath- and embedding-based prompts that provide valuable hints to guide the model's predictions for undetermined connection patterns. Experimental results show that our Prompt Promotion model represents a pioneering prompt-based capability in effectively completing the app promotion graph. It not only demonstrates superior performance in heterogeneous graph completion in real-world scenarios, but also exhibits strong generalization capabilities for diverse, complex, and noisy connection patterns when paired with their respective prompts. |
Zhongyu Ouyang · Shifu Hou · Shang Ma · Chaoran Chen · Chunhui Zhang · Toby Li · Xusheng Xiao · Xiangchi Yuan · Yanfang Ye 🔗 |
-
|
Uncertainty-Aware Robust Learning on Noisy Graphs
(
Poster
)
>
link
Graph neural networks have shown impressive capabilities in solving various graph learning tasks, particularly excelling in node classification. However, their effectiveness can be hindered by the challenges arising from the widespread existence of noisy measurements associated with the topological or nodal information present in real-world graphs. These inaccuracies in observations can corrupt the crucial patterns within the graph data, ultimately resulting in undesirable performance in practical applications. To address these issues, this paper proposes a novel uncertainty-aware graph learning framework motivated by distributionally robust optimization. The framework aims to alleviate these challenges by considering the distributional uncertainty associated with the graph data. Specifically, we use a graph neural network-based encoder to embed the node features and find the optimal node embeddings by minimizing the worst-case risk through a minimax formulation. Such an uncertainty-aware learning process leads to improved node representations and a more robust graph predictive model that effectively mitigates the impact of uncertainty arising from data noise. The learned least favorable distributions (LFDs) also provide a means to quantify the predictive uncertainty, which is valuable in uncertainty-sensitive scenarios where incorrect decisions can have severe consequences. In addition, we adopt the idea of differentiable optimization and develop an end-to-end learning algorithm that seamlessly integrates graph learning and distributionally robust optimization. Our experimental results show that the proposed framework achieves superior predictive performance compared to state-of-the-art baselines under various noisy settings. |
Shuyi Chen · Kaize Ding · Shixiang Zhu 🔗 |
-
|
GAD-EBM: Graph Anomaly Detection using Energy-Based Models
(
Poster
)
>
link
Graph Anomaly Detection (GAD) is essential in fields ranging from network security and bioinformatics to finance. Previous works often adopt auto-encoders to compute reconstruction errors for anomaly detection: anomalies are hard to reconstruct. In this work, we revisit the first principle for anomaly detection, i.e., the Neyman-Pearson rule, where the optimal anomaly detector is based on the likelihood of a data point given the normal distribution of data. However, in practice, the distribution is often unknown, and estimating the distribution of graph-structured data may be hard. Moreover, the likelihood computation of a graph-structured data point may be challenging as well. In this paper, we propose a novel approach, GAD-EBM, that can estimate the distribution of graphs and compute likelihoods efficiently by using Energy-Based Models (EBMs) over graphs. GAD-EBM approximates the likelihood of a rooted subgraph of node v and can further leverage the likelihood to accurately identify whether node v is anomalous or not. Traditional score matching for training EBMs cannot be directly applied to EBMs that model the distribution of graphs because of the complicated discreteness and multi-modality of graph data. We propose a Subgraph Score Matching (SSM) approach, which is specifically designed for graph data based on a novel framework of neighborhood state-space graphs. Experiments conducted on six real-world datasets validate the effectiveness and efficiency of GAD-EBM, and the source code for GAD-EBM is openly available. |
Amit Roy · Juan Shu · Olivier Elshocht · Jeroen Smeets · Ruqi Zhang · Pan Li 🔗 |
-
|
Linear Complexity Framework for Feature-Aware Graph Coarsening via Hashing
(
Poster
)
>
link
Large-scale graphs are increasingly common in various applications, leading to significant computational challenges in data processing and analysis. To address this, coarsening algorithms are employed to reduce graph size while preserving key properties. However, existing methods for large-scale graphs are computationally intensive, undermining the coarsening goal. Additionally, real-world graphs often contain node-specific features or contexts, which current coarsening approaches overlook, focusing solely on structural information like adjacency matrices. This limitation may not suit downstream tasks reliant on node features. In this paper, we introduce a Feature-Aware graph Coarsening algorithm via Hashing, called FACH, inspired by locality sensitive hashing, to coarsen the graph based on node features. To our knowledge, this is the first method that coarsens a graph with node features in linear time. FACH is over 7× faster than the quickest existing technique and around 150× faster than others on datasets like Coauthor Physics, which has 34,493 nodes. We also demonstrate the efficacy of the proposed framework in terms of superior run-time complexity. The coarsened graph obtained by our method also preserves the spectral properties of the original graph while achieving a massive improvement in the time complexity of coarsening, which is the primary goal of our study. We showcase the effectiveness of FACH on the downstream task by evaluating the performance of scalable training of graph neural networks using coarsened data on benchmark real-world datasets. |
Mohit Kataria · Aditi Khandelwal · ROCKTIM DAS · Sandeep Kumar · Jayadeva Dr 🔗 |
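FACH's exact construction is more involved, but the locality-sensitive-hashing idea it builds on can be sketched as SimHash bucketing of node features in time linear in the number of nodes (our illustration; the hash design and merging rules are placeholders):

```python
import torch

def lsh_coarsen_assignment(X, num_planes=8, seed=0):
    # Nodes whose features fall on the same side of all random hyperplanes
    # share a bucket; each bucket becomes one super-node of the coarse graph.
    g = torch.Generator().manual_seed(seed)
    planes = torch.randn(X.shape[1], num_planes, generator=g)
    bits = (X @ planes > 0).long()                        # (n, num_planes)
    keys = (bits * (2 ** torch.arange(num_planes))).sum(dim=1)
    _, assignment = torch.unique(keys, return_inverse=True)
    return assignment     # maps each node to its super-node index
```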
-
|
SATG : Structure Aware Transformers on Graphs for Node Classification
(
Poster
)
>
link
Transformers have achieved state-of-the-art performance in the fields of Computer Vision (CV) and Natural Language Processing (NLP). Inspired by this, architectures that incorporate transformers into the domain of graph neural networks have recently emerged. Most existing Graph Transformers either take the set of all nodes as an input sequence, leading to quadratic time complexity, or take only one-hop or k-hop neighbours as the input sequence, thereby completely ignoring any long-range interactions. To this end, we propose Structure Aware Transformer on Graphs (SATG), where we capture both short-range and long-range interactions in a computationally efficient manner. When it comes to dealing with non-Euclidean spaces like graphs, positional encoding becomes an integral component to provide structural knowledge to the transformer. Upon observing the shortcomings of existing positional encodings, we introduce a new class of positional encodings trained with a Neighbourhood Contrastive Loss that effectively captures the entire topology of the graph. We also introduce a method to effectively capture long-range interactions without quadratic time complexity. Extensive experiments on five benchmark datasets show that SATG consistently outperforms GNNs by a substantial margin and also successfully outperforms other Graph Transformers. |
Sumedh B G · Sanjay Patnala · Himil Vasava · Akshay Sethi · Sonia Gupta 🔗 |
-
|
What Improves the Generalization of Graph Transformer? A Theoretical Dive into Self-attention and Positional Encoding
(
Poster
)
>
link
Graph Transformers, which incorporate self-attention and positional encoding, have recently emerged as a powerful architecture for various graph learning tasks. Despite their impressive performance, the complex non-convex interactions across layers and the recursive graph structure have made it challenging to establish a theoretical foundation for learning and generalization. This study introduces the first theoretical investigation of a shallow Graph Transformer for semi-supervised node classification, comprising a self-attention layer with relative positional encoding and a two-layer perceptron. Focusing on a graph data model with discriminative nodes that determine node labels and non-discriminative nodes that are class-irrelevant, we characterize the sample complexity required to achieve a zero generalization error by training with stochastic gradient descent (SGD). Our theoretical findings suggest that a larger fraction of discriminative nodes, a clearer-cut vote among discriminative nodes, a smaller fraction of erroneous labels, and smaller errors in the initial model and node patterns improve generalization. Furthermore, we demonstrate that self-attention and positional encoding enhance generalization by making the attention map sparse and promoting the core neighborhood during training, which explains the superior feature representation of Graph Transformers. Our theoretical results are supported by empirical experiments on synthetic and real-world benchmarks. |
Hongkang Li · Meng Wang · Tengfei Ma · Sijia Liu · ZAIXI ZHANG · Pin-Yu Chen 🔗 |
-
|
TOD-Flow: Modeling the Structure of Task-Oriented Dialogues
(
Poster
)
>
link
Task-Oriented Dialogue (TOD) systems have become crucial components in interactive artificial intelligence applications. While recent advances have capitalized on pre-trained language models (PLMs), they exhibit limitations regarding transparency and controllability. To address these challenges, we propose a novel approach focusing on inferring the TOD-flow graph from dialogue data annotated with dialog acts, uncovering the underlying task structure in the form of a graph. The inferred TOD-flow graph can be easily integrated with any dialogue model to improve its prediction performance, transparency, and controllability. Our TOD-flow graph learns what a model can, should, and should not predict, effectively reducing the search space and providing a rationale for the model's predictions. We show that the proposed TOD-flow graph better resembles human-annotated graphs compared to prior approaches. Furthermore, when combined with several dialogue policies and end-to-end dialogue models, we demonstrate that our approach significantly improves dialog act classification and end-to-end response generation performance on the MultiWOZ and SGD benchmarks. |
Sungryull Sohn · Yiwei Lyu · Anthony Liu · Lajanugen Logeswaran · Dong-Ki Kim · Dongsub Shim · Honglak Lee 🔗 |
-
|
Networked Inequality: Preferential Attachment Bias in Graph Neural Network Link Prediction
(
Spotlight
)
>
link
Graph neural network (GNN) link prediction is increasingly deployed in citation, collaboration, and online social networks to recommend academic literature, collaborators, and friends. While prior research has investigated the dyadic fairness of GNN link prediction, the within-group fairness and ``rich get richer'' dynamics of link prediction remain underexplored. However, these aspects have significant consequences for degree and power imbalances in networks. In this paper, we shed light on how degree bias in networks affects Graph Convolutional Network (GCN) link prediction. In particular, we theoretically uncover that GCNs with a symmetric normalized graph filter have a within-group preferential attachment bias. We validate our theoretical analysis on real-world citation, collaboration, and online social networks. We further bridge GCN's preferential attachment bias with unfairness in link prediction and propose a new within-group fairness metric. This metric quantifies disparities in link prediction scores between social groups, towards combating the amplification of degree and power disparities. Finally, we propose a simple training-time strategy to alleviate within-group unfairness, and we show that it is effective on citation, online social, and credit networks. |
Arjun Subramonian · Levent Sagun · Yizhou Sun 🔗 |
-
|
Bridging the Gap: Towards Flexible, Efficient, and Effective Tensor Product Networks
(
Poster
)
>
link
Geometric graph neural networks have showcased exceptional performance in modelling geometric data, a common requirement in natural science research. These models heavily rely on equivariant operations, encompassing vital techniques such as scalarization and the Clebsch-Gordan tensor product. However, tensor-product-based architectures face substantial computational challenges as the representation order increases, significantly limiting their versatility. Moreover, the interpretability of interactions between steerable components remains elusive. In contrast, scalarization methods benefit from cost-efficient invariant scalar functions while still being capable of outperforming certain tensor-product-based models. To bridge the gap between these approaches, we introduce a conceptual framework that emphasizes the potential flexibility of tensor product networks. To provide guidance for efficient framework design and gain deeper insights into steerable components, we conduct a preliminary investigation by pruning tensor product interactions. This approach enables us to directly assess the redundancy and significance of steerable components, paving the way for future design possibilities. |
Nanxiang Wang · Chen Lin · Michael Bronstein · Philip Torr 🔗 |
-
|
ChatPathway: Conversational Large Language Models for Biology Pathway Detection
(
Spotlight
)
>
link
Biological pathways, like protein-protein interactions and metabolic networks, are vital for understanding diseases and drug development. Some databases such as KEGG are designed to store and map these pathways. However, many bioinformatics methods face limitations due to database constraints, and certain deep learning models struggle with the complexities of biochemical reactions involving large molecules and diverse enzymes. Importantly, the thorough exploration of biological pathways demands a deep understanding of scientific literature and past research. Despite this, recent advancements in Large Language Models (LLMs), especially ChatGPT, show promise. We first restructured data from KEGG and augmented it with molecule structural and functional information sourced from UniProt and PubChem. Our study evaluated LLMs, particularly GPT-3.5-turbo and Galactica, in predicting biochemical reactions and pathways using our constructed data. We also assessed its ability to predict novel pathways, not covered in its training dataset, using findings from recently published studies. While GPT demonstrated strengths in pathway mapping, Galactica encountered challenges. This research emphasizes the potential of merging LLMs with biology, suggesting a harmonious blend of human expertise and AI in decoding biological systems. |
Yanjing Li · Hannan Xu · Haiteng Zhao · Hongyu Guo · Shengchao Liu 🔗 |
-
|
Talk like a Graph: Encoding Graphs for Large Language Models
(
Poster
)
>
link
Graphs are a powerful tool for representing and analyzing complex relationships in real-world applications such as social networks, recommender systems, and computational finance. Reasoning on graphs is essential for drawing inferences about the relationships between entities in a complex system, and to identify hidden patterns and trends. Despite the remarkable progress in automated reasoning with natural text, reasoning on graphs with large language models (LLMs) remains an understudied problem. In this work, we perform the first comprehensive study of encoding graph-structured data as text for consumption by LLMs. We show that LLM performance on graph reasoning tasks varies on three fundamental levels: (1) the graph encoding method, (2) the nature of the graph task itself, and (3) interestingly, the very structure of the graph considered. These novel results provide valuable insight on strategies for encoding graphs as text. Using these insights we illustrate how the correct choice of encoders can boost performance on graph reasoning tasks inside LLMs by 4.8% to 61.8%, depending on the task. |
Bahare Fatemi · Jonathan Halcrow · Bryan Perozzi 🔗 |
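For intuition, here are two of the many possible text encodings of the same graph (illustrative templates, not the paper's exact ones); the study's point is that such seemingly cosmetic choices materially change LLM accuracy on graph reasoning tasks:

```python
def encode_graph_as_text(edges, style="friendship"):
    # Flatten an edge list into a prompt fragment in one of two styles.
    if style == "adjacency":
        return "G has edges: " + ", ".join(f"({u},{v})" for u, v in edges) + "."
    if style == "friendship":
        names = ["Alice", "Bob", "Carol", "Dave", "Eve", "Frank"]
        return " ".join(f"{names[u]} and {names[v]} are friends."
                        for u, v in edges)
    raise ValueError(f"unknown style: {style}")

prompt = encode_graph_as_text([(0, 1), (1, 2), (2, 0)]) \
    + " How many friends does Alice have?"
```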
-
|
Motif-aware Attribute Masking for Molecular Graph Pre-training
(
Spotlight
)
>
link
Attribute reconstruction is used to predict node or edge features in the pre-training of graph neural networks. Given a large number of molecules, they learn to capture structural knowledge, which is transferable to various downstream property prediction tasks and vital in chemistry, biomedicine, and material science. Previous strategies that randomly select nodes for attribute masking leverage the information of local neighbors. However, over-reliance on these neighbors inhibits the model's ability to learn long-range dependencies from higher-level substructures. For example, the model would learn little from predicting three carbon atoms in a benzene ring based on the other three, but could learn more from the inter-connections between functional groups, also called chemical motifs. In this work, we propose and investigate motif-aware attribute masking strategies to capture long-range inter-motif structures by leveraging the information of atoms in neighboring motifs. Once each graph is decomposed into disjoint motifs, the features of every node within a sampled motif are masked. The graph decoder then predicts the masked features of each node within the motif for reconstruction. We evaluate our approach on eight molecular property prediction datasets and demonstrate its advantages. |
Eric Inae · Gang Liu · Meng Jiang 🔗 |
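The masking strategy itself is compact. A minimal sketch (ours), assuming a motif decomposition is given, e.g. from a chemistry-aware fragmentation: sample a fraction of motifs and mask the features of every node inside them, so reconstruction must draw on atoms in neighboring motifs:

```python
import torch

def motif_attribute_mask(X, motif_assignment, mask_value=0.0, ratio=0.15):
    # motif_assignment[i] = index of the motif containing node i.
    motifs = motif_assignment.unique()
    k = max(1, int(ratio * len(motifs)))
    chosen = motifs[torch.randperm(len(motifs))[:k]]      # motifs to mask
    mask = torch.isin(motif_assignment, chosen)           # nodes to mask
    X_masked = X.clone()
    X_masked[mask] = mask_value
    return X_masked, mask    # reconstruct X[mask] from the unmasked context
```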
-
|
Graph Neural Networks on Discriminative Graphs of Words
(
Poster
)
>
link
In light of the recent success of Graph Neural Networks (GNNs) and their ability to perform inference on complex data structures, many studies apply GNNs to the task of text classification. In most previous methods, a heterogeneous graph, containing both word and document nodes, is constructed using the entire corpus and a GNN is used to classify document nodes. In this work, we explore a new Discriminative Graph of Words Graph Neural Network (DGoW-GNN) approach encapsulating both a novel discriminative graph construction and model to classify text. In our graph construction, containing only word nodes and no document nodes, we split the training corpus into connected components according to their labels and weight edges by the pointwise mutual information of the represented words. Our graph construction, for which we provide theoretical motivation, allows us to reformulate the task of text classification as the task of walk classification. We also propose a new model for the graph-based classification of text, which combines a GNN and a sequence model. We evaluate our approach on seven benchmark datasets and find that it is outperformed by several state-of-the-art baseline models. We analyse reasons for this performance difference and hypothesise under which conditions it is likely to change. |
Yassine ABBAHADDOU · Johannes Lutzeyer · Michalis Vazirgiannis 🔗 |
-
|
Estimating Epistemic Uncertainty of Graph Neural Networks using Stochastic Centering
(
Poster
)
>
link
While graph neural networks (GNNs) are widely used for node and graph representation learning tasks, the reliability of GNN uncertainty estimates under distribution shifts remains relatively under-explored. Indeed, while \textit{post-hoc} calibration strategies can be used to improve in-distribution calibration, they need not also improve calibration under distribution shift. However, techniques which produce GNNs with better \textit{intrinsic} uncertainty estimates are particularly valuable, as they can always be combined with post-hoc strategies later. Therefore, in this work, we propose G-$\Delta$UQ, a novel training framework designed to improve intrinsic GNN uncertainty estimates. Our framework adapts the principle of stochastic data centering to graph data through novel graph anchoring strategies, and is able to support partially stochastic GNNs. While the prevalent wisdom is that fully stochastic networks are necessary to obtain reliable estimates, we find that the functional diversity induced by our anchoring strategies when sampling hypotheses renders this unnecessary, allowing us to support G-$\Delta$UQ on pretrained models. Indeed, through extensive evaluation under covariate, concept, and graph size shifts, we show that G-$\Delta$UQ leads to better calibrated GNNs for node and graph classification. Further, it also improves performance on the uncertainty-based tasks of out-of-distribution detection and generalization gap estimation. Overall, our work provides insights into uncertainty estimation for GNNs and demonstrates the utility of G-$\Delta$UQ in obtaining reliable estimates. |
Puja Trivedi · Mark Heimann · Rushil Anirudh · Danai Koutra · Jayaraman Thiagarajan 🔗 |
-
|
On the Consistency of GNN Explainability Methods
(
Poster
)
>
link
Despite the widespread utilization of post-hoc explanation methods for graph neural networks (GNNs) in high-stakes settings, there has been a lack of comprehensive evaluation regarding their quality and reliability. This evaluation is challenging primarily due to the non-Euclidean nature of the data, their arbitrary size, and their complex topological structure. In this context, we argue that the \emph{consistency} of GNN explanations, denoting the ability to produce similar explanations for input graphs with minor structural changes that do not alter their output predictions, is a key requirement for effective post-hoc GNN explanations. To fill this gap, we introduce a novel metric based on the Fused Gromov-Wasserstein distance to quantify consistency. |
Ehsan Hajiramezanali · Sepideh Maleki · Alex Tseng · Aicha BenTaieb · Gabriele Scalia · Tommaso Biancalani 🔗 |
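For reference, the distance underlying the metric (in the standard formulation of Vayer et al.): given attributed graphs with node features $a_i, b_j$, structure matrices $C_1, C_2$, and node weights $\mu, \nu$, a single transport plan couples a feature term and a structure term,

```latex
\mathrm{FGW}_{\alpha}(G_1, G_2)
= \min_{\pi \in \Pi(\mu,\nu)}
\sum_{i,j,k,l}
\Big[ (1-\alpha)\, d(a_i, b_j)^{q}
    + \alpha\, \big| C_1(i,k) - C_2(j,l) \big|^{q} \Big]\,
\pi_{ij}\, \pi_{kl},
```

with $\alpha \in [0,1]$ trading off attribute against structural discrepancy; how this is turned into a consistency score for explanations is described in the abstract above.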
-
|
GNN Predictions on k-hop Egonets Boosts Adversarial Robustness
(
Poster
)
>
link
Like many other deep learning models, Graph Neural Networks (GNNs) have been shown to be susceptible to adversarial attacks, i.e., the addition of crafted imperceptible noise to input data changes the model predictions drastically. We propose a very simple method, k-HOP-PURIFY, which makes node predictions on a k-hop egonet centered at the node instead of the entire graph, boosting adversarial accuracy. It can be used both as i) a post-processing step after applying popular defenses or ii) a standalone defense method that is comparable to many other competitors. The method is extremely lightweight and scalable (it takes 4 lines of code to implement), unlike many other defense methods which are computationally expensive or rely on heuristics. We show performance gains through extensive experimentation across various types of attacks (poison/evasion, targeted/untargeted), perturbation rates, and defenses implemented in the DeepRobust library. |
Jian Vora 🔗 |
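The abstract's claim of a four-line implementation is plausible; here is our sketch using PyTorch Geometric's `k_hop_subgraph` utility, assuming any node-classification GNN with the conventional `model(x, edge_index)` forward signature:

```python
import torch
from torch_geometric.utils import k_hop_subgraph

@torch.no_grad()
def khop_purify_predict(model, x, edge_index, node_idx, k=2):
    # Predict a node's label from its k-hop egonet only, discarding the
    # rest of the (possibly poisoned) graph.
    subset, sub_edge_index, mapping, _ = k_hop_subgraph(
        node_idx, k, edge_index, relabel_nodes=True)
    logits = model(x[subset], sub_edge_index)
    return logits[mapping].argmax(dim=-1)
```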
-
|
GRAPES: Learning to Sample Graphs for Scalable Graph Neural Networks
(
Poster
)
>
link
Graph neural networks (GNNs) learn the representation of nodes in a graph by aggregating the neighborhood information in various ways. As these networks grow in depth, their receptive field grows exponentially due to the increase in neighborhood sizes, resulting in high memory costs. Graph sampling solves memory issues in GNNs by sampling a small fraction of the nodes in the graph. This way, GNNs can scale to much larger graphs. Most sampling methods focus on fixed sampling heuristics, which may not generalize to different structures or tasks. We introduce GRAPES, an adaptive graph sampling method that learns to identify sets of influential nodes for training a GNN classifier. GRAPES uses a GFlowNet to learn node sampling probabilities given the classification objectives. We evaluate GRAPES across several small- and large-scale graph benchmarks and demonstrate its effectiveness in accuracy and scalability. In contrast to existing sampling methods, GRAPES maintains high accuracy even with small sample sizes and, therefore, can scale to very large graphs. Our code is publicly available at https://anonymous.4open.science/r/GRAPES. |
Taraneh Younesian · Thiviyan Thanapalasingam · Emile van Krieken · Daniel Daza · Peter Bloem 🔗 |
-
|
HiKER-SGG: Hierarchical Knowledge Enhanced Robust Scene Graph Generation
(
Poster
)
>
link
The ability to quickly understand scenes from visual observations via structured representations, known as Scene Graph Generation (SGG), is a crucial component of perception models. Despite recent advancements, most existing models assume perfect observations, an often-unrealistic condition in real-world scenarios. Such models can struggle with visual inputs affected by natural corruptions such as sunlight glare, extreme weather conditions, and smoke. Drawing inspiration from human hierarchical reasoning skills (i.e., from higher to lower levels) as a defense against corruption, we propose a new framework called Hierarchical Knowledge Enhanced Robust Scene Graph Generation (HiKER-SGG). First, we create a hierarchical knowledge graph, facilitating machine comprehension of this structured knowledge. Then we bridge between the constructed graph and the initial scene graph and perform message passing for hierarchical graph reasoning. Finally, we propose a hierarchical prediction head to enable the model to predict from a higher to lower level, thus enhancing robustness against corruptions that frequently impact only fine-grained details. Experiments on various settings confirm the superior performance of the proposed framework with both clean and corrupted images. |
Ce Zhang · Simon Stepputtis · Joseph Campbell · Katia Sycara · Yaqi Xie 🔗 |
-
|
Higher-Order Expander Graph Propagation
(
Poster
)
>
link
Graph neural networks operate on graph-structured data via exchanging messages along edges. One limitation of this message passing paradigm is the over-squashing problem. Over-squashing occurs when messages from a node's expanded receptive field are compressed into fixed-size vectors, potentially causing information loss. To address this issue, recent works have explored using expander graphs, which are highly-connected sparse graphs with low diameters, to perform message passing. However, current methods on expander graph propagation only consider pair-wise interactions, ignoring higher-order structures in complex data. To explore the benefits of capturing these higher-order correlations while still leveraging expander graphs, we introduce higher-order expander graph propagation. We propose two methods for constructing bipartite expanders and evaluate their performance on both synthetic and real-world datasets. |
Thomas Christie · Yu He 🔗 |
-
|
MCGC: an MLP-based supervised Contrastive learning framework for Graph Classification
(
Poster
)
>
link
Graph Neural Networks (GNNs) have been widely used for tasks involving graph-structured data. These networks create matrix representations of graphs by recursively aggregating information from neighboring nodes. Integrated with contrastive learning, graph contrastive learning has shown enhanced performance on graph-level tasks. However, the architectures of graph contrastive learning frameworks become complicated due to the sophisticated structure of the GNN-based encoder and the necessity of both an encoder and a projection head. In this paper, we propose a significantly simplified MLP-based supervised contrastive learning framework for graph classification tasks, coined MCGC, which does not incorporate any GNN layers. Experimental results on graph benchmark datasets and ablation studies indicate that, despite not utilizing GNN layers, our framework achieves comparable or even superior performance on graph classification tasks against some state-of-the-art models. |
Xiao Yue · Bo Liu · Andrew Meng · Guangzhi Qu 🔗 |