Graph neural networks (GNNs) have received increasing attention in the past several years, due to the wide applicability of graphs and networks and the superior performance of GNNs compared to traditional heuristics-driven approaches. However, most existing GNNs still focus on node-level applications, such as node classification and link prediction, while many challenging graph tasks are graph-level, such as graph search. In this talk, I will introduce our recent progress on graph-level neural operator development. In particular, we will examine three challenging tasks that are key to the success of graph search: (1) How can we conduct efficient graph similarity search by turning NP-complete problems, such as Graph Edit Distance (GED) and Maximum Common Subgraph (MCS) computation, into learning problems? We will present SimGNN [1] and GraphSim [3], which provide more efficient and effective results than state-of-the-art approximate algorithms. (2) How can we provide a neural operator that turns any graph into a low-dimensional representation vector in a way that is learnable, inductive, and unsupervised? In this line, we propose UGraphEmd [2], which leverages graph-graph interaction to produce manifold-preserving graph-level embeddings. Moreover, GHashing [5] is designed to map each graph to a discrete hash code, which enables much more efficient search (a 20x speedup) and can handle large graph databases with millions of graphs. And (3) how can we design GNNs that can directly detect the best-matching subgraphs of two graphs? A deep reinforcement learning framework, RLMCS [4], is proposed to address this issue, with the goal of learning the best strategy for picking the next matching pair for the two graphs. In the end, we will discuss open questions in the field.
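
The core idea of casting graph similarity as a learning problem can be sketched in a few lines. The following toy (not the actual SimGNN/GraphSim code; the untrained aggregation and cosine comparison are stand-ins for learned components) maps each graph to a fixed-size embedding with one message-passing step and compares the embeddings, instead of solving the NP-complete GED/MCS problem directly:

```python
def embed(adj, layers=1):
    """Toy untrained GNN layer: node features start as [degree, 1.0];
    each layer averages a node's state with its neighbours' states,
    and the node states are mean-pooled into one graph-level vector."""
    h = {v: [float(len(adj[v])), 1.0] for v in adj}
    for _ in range(layers):
        h = {v: [(h[v][i] + sum(h[u][i] for u in adj[v])) / (1 + len(adj[v]))
                 for i in range(2)]
             for v in adj}
    n = len(adj)
    return [sum(h[v][i] for v in adj) / n for i in range(2)]

def similarity(g1, g2):
    """Cosine similarity of the two graph embeddings; in SimGNN-style
    models this comparison step would itself be a learned module."""
    a, b = embed(g1), embed(g2)
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb)

triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
path = {0: [1], 1: [0, 2], 2: [1]}
print(round(similarity(triangle, path), 3))
```

The point of the design is the cost profile: the expensive per-pair combinatorial search is replaced by one embedding pass per graph plus a cheap vector comparison per pair.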

More and more knowledge graphs are becoming available on the Web, many of them accessible via queryable interfaces, such as SPARQL endpoints. These knowledge graphs are often linked to each other and in doing so form the so-called Web of Data. The great strength and potential of exploiting these links is that we can obtain answers for queries that would not yield any results over the knowledge graphs if considered individually. However, one of the fundamental problems is that Web-accessible knowledge graphs are not always available when we would like to access them. This means that the exact same query issued at different points in time might have different results or, in the worst case, no results if one of the knowledge graphs that are essential to answer the query is (temporarily) not available. Often this is caused by crashes of the server hosting the knowledge graph - a problem that is deeply rooted within the architecture of the current Web of Data, i.e., relying on data providers to keep interfaces and knowledge graphs available and accessible. Building upon the basics of federated query processing over knowledge graphs, this talk will discuss challenges as well as strategies on how to overcome the unavailability problem and keep the data available and queryable.
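
One strategy against the unavailability problem can be sketched as a simple source-selection fallback (the endpoint URLs and `run_query` callback here are hypothetical placeholders, not part of any real federation engine): when the preferred interface for a knowledge graph is down, the federated query processor tries a replica of the same knowledge graph before failing the whole query.

```python
def query_with_fallback(endpoints, run_query):
    """Try each endpoint in preference order; return the first answer."""
    for url in endpoints:
        try:
            return run_query(url)
        except ConnectionError:
            continue  # endpoint (temporarily) unavailable, try the next
    raise RuntimeError("all endpoints for this knowledge graph are down")

def fake_run_query(url):
    # stand-in for issuing a SPARQL query; the primary host "crashes"
    if url == "https://primary.example.org/sparql":
        raise ConnectionError("server down")
    return [{"s": "ex:Alice"}]

answers = query_with_fallback(
    ["https://primary.example.org/sparql",
     "https://replica.example.org/sparql"],
    fake_run_query)
print(answers)  # [{'s': 'ex:Alice'}]
```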

Graphs and graph processing have become increasingly important. This has led to an explosion in the development of graph-processing systems, each of which is evaluated relative to its predecessors. In the absence of a large corpus of real-world graphs, synthetic generators provide an easy way to construct graphs of varying sizes. The most widely used is the Kronecker generator. Unfortunately, the Kronecker generator was not designed to produce graphs for benchmarking, and when used in this fashion it is problematic in two ways. First, the fraction of zero-degree vertices grows with the graph size, dramatically reducing the effective size of the connected graph. Second, the generator produces a vertex degree distribution not found in real-world settings. We demonstrate these phenomena and present the Smooth Kronecker Generator, which remedies the vertex degree distribution problem without changing the statistical properties of the graph.
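
The zero-degree effect is easy to reproduce with a minimal stochastic-Kronecker (R-MAT-style) generator. The sketch below uses the common Graph500-style default quadrant probabilities; it is illustrative only, not the benchmark's reference implementation:

```python
import random

def rmat_edges(scale, edge_factor=8, a=0.57, b=0.19, c=0.19, seed=0):
    """For each edge, recursively pick a quadrant of the adjacency
    matrix with probabilities (a, b, c, 1 - a - b - c), one level
    per bit of the vertex id."""
    rng = random.Random(seed)
    n = 2 ** scale
    edges = []
    for _ in range(edge_factor * n):
        u = v = 0
        for _ in range(scale):
            r = rng.random()
            u, v = 2 * u, 2 * v
            if r < a:
                pass                    # top-left quadrant
            elif r < a + b:
                v += 1                  # top-right
            elif r < a + b + c:
                u += 1                  # bottom-left
            else:
                u, v = u + 1, v + 1     # bottom-right
        edges.append((u, v))
    return n, edges

def zero_degree_fraction(scale):
    n, edges = rmat_edges(scale)
    touched = {x for e in edges for x in e}
    return 1 - len(touched) / n

# The skewed quadrant probabilities leave a growing share of vertices
# with no incident edge as the graph scales up.
print(zero_degree_fraction(8), zero_degree_fraction(12))
```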

We describe the design and implementation of EdgeFrame: a graph-specialized Spark DataFrame that caches the edges of a graph in compressed form on all worker nodes of a cluster, and provides a fast and scalable Worst-Case-Optimal Join (WCOJ) that is especially useful for matching complex and cyclical patterns in large graphs. Our choice to forgo shuffle- or communication-based WCOJ is motivated by our analysis of the Shares algorithm for distributed WCOJ, which was proven communication-optimal but which, we show, quickly deteriorates to a full broadcast of all data even for moderately complex graph patterns. Our work shows that specializing WCOJ to a multi-way self-join, and leveraging compressed storage, provides a significant opportunity for better WCOJ performance. Finally, we investigate WCOJ parallelization and load-balancing strategies and show that fine-grained dynamic load-balancing with work-stealing is to be preferred, creating interesting insights and challenges for the future evolution of the Spark scheduler.
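
To make the multi-way self-join view of pattern matching concrete, here is a minimal single-machine sketch (not EdgeFrame itself) of the worst-case-optimal, variable-at-a-time strategy for the simplest cyclic pattern, the triangle: bind pattern variables one at a time, intersecting the candidate sets contributed by every edge of the pattern at once rather than joining two relations at a time.

```python
def triangles(adj):
    """Enumerate triangles (a, b, c) with a < b < c.
    adj maps each vertex to a *set* of neighbours so that the last
    variable is bound by a set intersection, as in WCOJ algorithms."""
    out = []
    for a in adj:                       # bind variable a
        for b in adj[a]:                # bind b from edge (a, b)
            if b <= a:
                continue
            # bind c by intersecting the adjacency sets of a and b,
            # satisfying both remaining join constraints simultaneously
            for c in adj[a] & adj[b]:
                if c > b:
                    out.append((a, b, c))
    return out

adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1, 3}, 3: {0, 2}}
print(sorted(triangles(adj)))  # [(0, 1, 2), (0, 2, 3)]
```

The intersection step is what a binary-join plan cannot express: it never materializes the quadratically larger intermediate result of joining two edge relations first.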

A recent study showed that the applicability of context-free path querying (CFPQ) algorithms with relational query semantics integrated with graph databases is limited because of the low performance and high memory consumption of existing solutions. In this work, we implement a matrix-based CFPQ algorithm using appropriate high-performance libraries for linear algebra and integrate it with the RedisGraph graph database. We also introduce a new CFPQ algorithm with single-path query semantics that allows us to extract one found path for each pair of nodes. Finally, we evaluate our algorithms under both semantics and show that the matrix-based CFPQ implementation for RedisGraph is performant enough for real-world data analysis.
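
The matrix-based CFPQ idea can be shown in miniature (our actual implementation relies on high-performance linear-algebra libraries; the grammar and graph below are hypothetical examples): keep one boolean adjacency matrix per grammar symbol, and for every production A -> B C, multiply the matrix of B by the matrix of C and OR the product into the matrix of A until a fixpoint is reached.

```python
def bool_mult(X, Y, n):
    """Boolean matrix product; matrices are sets of (row, col) pairs."""
    return {(i, j) for i in range(n) for j in range(n)
            if any((i, k) in X and (k, j) in Y for k in range(n))}

def cfpq(n, edges, prods, terms):
    """prods: (head, B, C) productions in Chomsky normal form;
    terms: (head, label) terminal productions."""
    nts = {h for h, _, _ in prods} | {h for h, _ in terms}
    M = {nt: set() for nt in nts}
    for h, lbl in terms:                       # initialize from edge labels
        M[h] |= {(u, v) for u, l, v in edges if l == lbl}
    changed = True
    while changed:                             # iterate to a fixpoint
        changed = False
        for h, b, c in prods:
            new = bool_mult(M[b], M[c], n) - M[h]
            if new:
                M[h] |= new
                changed = True
    return M

# Toy grammar for the language a^k b^k:  S -> A S1 | A B,  S1 -> S B
prods = [("S", "A", "S1"), ("S", "A", "B"), ("S1", "S", "B")]
terms = [("A", "a"), ("B", "b")]
edges = [(0, "a", 1), (1, "a", 2), (2, "b", 3), (3, "b", 4)]
print(sorted(cfpq(5, edges, prods, terms)["S"]))  # [(0, 4), (1, 3)]
```

Each entry (u, v) in the final matrix for S means there is a path from u to v whose label sequence is derivable from S, which is exactly the relational query semantics mentioned above.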

Vertex-centric block synchronous processing systems, exemplified by Pregel and Giraph, have received extensive attention for graph processing. These systems allow programmers to think only about operations that take place at one vertex, and provide the underlying computation framework that runs multiple iterations (supersteps) with communication between neighboring vertices between supersteps. As graphs grow to billions of vertices and trillions of edges, processing them in this model faces challenges: (1) the poor latency of supersteps dominated by the tasks performed on high-degree vertices or densely connected components; and (2) the overwhelming network communication among vertices, much of which can be shown to be highly redundant. For many applications, approximate results are acceptable, and if they can be computed rapidly, they may even be preferable. Many existing approximate solutions suffer from algorithm-specific designs that are not generic, or lack theoretical guarantees on result quality. In this paper we tackle this problem with a generic approach that can be incorporated into the graph processing platform. The approach we advocate involves communicating vertex states to only a subset of the neighbors at each superstep; we call this selective edge lookup. We show how this approach can be incorporated into two primitive graph operators, BFS and DFS, which form the basis of many graph analysis workloads. Extensive experiments over real-world and synthetic graphs validate the effectiveness and efficiency of the selective edge lookup approach.
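
The following sketch (a simplification of the paper's system, with a hypothetical cap parameter `k`) illustrates selective edge lookup inside a superstep-style BFS: a frontier vertex forwards its state to at most `k` of its neighbors instead of all of them, trading exactness of the computed distances for less communication. Because the selective run only explores a subgraph of the edges, any distance it reports can only over-estimate the exact one.

```python
import random

def bfs(adj, src, k=None, seed=0):
    """Level-synchronous BFS; with k set, each frontier vertex
    communicates to at most k sampled neighbours per superstep."""
    rng = random.Random(seed)
    dist = {src: 0}
    frontier = [src]
    step = 0
    while frontier:
        step += 1
        nxt = []
        for v in frontier:
            targets = adj[v]
            if k is not None and len(targets) > k:
                targets = rng.sample(targets, k)   # selective edge lookup
            for u in targets:
                if u not in dist:
                    dist[u] = step
                    nxt.append(u)
        frontier = nxt
    return dist

adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}
exact = bfs(adj, 0)
approx = bfs(adj, 0, k=2)
# approximate distances can only over-estimate the exact ones
print(all(approx[v] >= exact[v] for v in approx))  # True
```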

Classically, ML models trained with stochastic gradient descent (SGD) are designed to minimize the average loss per example and use a distribution of training examples that remains static over the course of training. Research in recent years has demonstrated, empirically and theoretically, that significant acceleration is possible with methods that dynamically adjust the training distribution during training so that it focuses more on examples with higher loss. We explore loss-guided training in a new domain: the node embedding methods pioneered by DeepWalk. These methods work with an implicit and large set of positive training examples that are generated using random walks on the input graph and are therefore not amenable to typical example-selection methods. We propose computationally efficient methods that allow loss-guided training in this framework. Our empirical evaluation on a rich collection of datasets shows significant acceleration over the static baseline methods, both in the total amount of training performed and in overall computation.
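
The core mechanism of loss-guided example selection, heavily simplified relative to the paper (the decay factor stands in for the model actually improving on an example after an update), can be sketched as sampling each next training example with probability proportional to its most recent loss:

```python
import random

def loss_guided_order(losses, steps, seed=0):
    """losses: current loss per example id (mutated in place);
    returns the ids chosen for `steps` SGD updates, each drawn
    proportionally to its current loss."""
    rng = random.Random(seed)
    ids = list(losses)
    picked = []
    for _ in range(steps):
        total = sum(losses[i] for i in ids)
        r = rng.random() * total
        acc = 0.0
        for i in ids:                      # roulette-wheel selection
            acc += losses[i]
            if r <= acc:
                picked.append(i)
                # in real training we would update the model here and
                # re-measure the loss; we decay it to imitate progress
                losses[i] *= 0.5
                break
    return picked

order = loss_guided_order({"easy": 0.1, "hard": 2.0}, 10)
print(order)  # the high-loss example is picked most often
```

For DeepWalk-style methods the difficulty the abstract refers to is that the positive examples are implicit in the random walks, so maintaining such per-example scores efficiently is exactly the non-trivial part the paper addresses.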

Many data processing pipelines operate on highly-connected data sets that can be efficiently modelled as graphs. These graphs are rarely static, but rather change rapidly and often exhibit dynamic, temporal, or streaming behaviour. During the last decade, numerous graph benchmarks have been proposed, which cover a significant portion of the features required in practical use cases. However, whilst these benchmarks often contain some update operations, none of them include complex deletions, which makes it challenging to test the performance of graph processing systems under such operations. To address this limitation, we have extended the LDBC Social Network Benchmark (SNB) by introducing lifespan attributes for the creation and deletion dates of its entities. We have defined constraints for selecting these dates from intervals that ensure that the graph always satisfies the cardinality constraints prescribed by the schema and other semantic constraints of the social network domain. We have implemented the proposed lifespans in the SNB generator.
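The flavor of the lifespan constraints can be illustrated with a toy interval-selection sketch (the entity types and numeric date window are hypothetical; the actual generator works over the SNB schema): a dependent entity, such as a comment on a post, must be created while its parent exists and must be deleted no later than the parent.

```python
import random

def draw_lifespan(rng, window_start, window_end):
    """Pick creation < deletion inside the simulation window."""
    creation = rng.uniform(window_start, window_end)
    deletion = rng.uniform(creation, window_end)
    return creation, deletion

def draw_dependent_lifespan(rng, parent):
    """A dependent entity must live inside its parent's lifespan,
    keeping the graph semantically valid at every point in time."""
    p_create, p_delete = parent
    creation = rng.uniform(p_create, p_delete)
    deletion = rng.uniform(creation, p_delete)
    return creation, deletion

rng = random.Random(42)
post = draw_lifespan(rng, 0.0, 100.0)
comment = draw_dependent_lifespan(rng, post)
print(post[0] <= comment[0] <= comment[1] <= post[1])  # True
```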

We present the design overview of a pattern matching engine for labeled graphs that supports interactive search: the user, based on feedback received from the search system, repeatedly revises her search template until she is satisfied with the results. To this end, we have developed a distributed memory solution that supports human-in-the-loop processing. Our solution embraces a number of design principles to offer high performance, scalability, and efficiency: (i) fast parallel processing - we adopt a vertex-parallel computation model; (ii) aggressive search space reduction - using lightweight routines, we identify and prune away the non-matching parts of the graph early; (iii) redundant work elimination - a revised query is likely to share label(s) and/or substructure(s) with its predecessor(s); therefore, whenever possible, we avoid redundant computation by reusing (partial) match information from earlier searches. Our preliminary evaluation highlights the effectiveness of the proposed approach: using a 257-billion-edge real-world webgraph on a 128-node (4,608-core) deployment, we demonstrate the advantage of our technique over a naive approach (which uses an exact matching solution to independently search the original query and each of its revisions).
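
The redundant-work-elimination principle (iii) can be illustrated in miniature (this is a toy, not the engine; the labels and caching scheme are illustrative): the first search computes, per query label, the set of graph vertices carrying that label, and a revised query that shares labels with its predecessor reuses those cached candidate sets instead of rescanning the graph.

```python
def candidates(labels_of, query_labels, cache):
    """Per-label candidate vertex sets, computed once and cached
    so that revised queries reuse their predecessors' work."""
    out = {}
    for lbl in query_labels:
        if lbl not in cache:
            cache[lbl] = {v for v, l in labels_of.items() if l == lbl}
        out[lbl] = cache[lbl]
    return out

labels_of = {0: "person", 1: "paper", 2: "person", 3: "venue"}
cache = {}
first = candidates(labels_of, {"person", "paper"}, cache)
revised = candidates(labels_of, {"person", "venue"}, cache)  # "person" reused
print(sorted(revised["person"]), sorted(revised["venue"]))  # [0, 2] [3]
```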

In this paper, we demonstrate a framework for DSL-based SQL query classification according to data-privacy directives. Based on query-log analysis, this framework automatically derives query meta-information (QMI) and provides interfaces for browsing and filtering queries based on this QMI. Domain-specific policy rules enable automatic classification of queries concerning their access to personal data. The generic policy-rule definition based on the QMI covers many syntactical SQL variations. To optimize classification performance, our framework stores the QMI both in relational and graph-based databases (DBs).

This case study compares the behavior of a relational DB with that of a graph-based DB on a particular task, namely searching for the policy rules applicable to a given query. Both solutions turned out to have their benefits, so a hybrid solution was ultimately chosen.

In this demonstration paper, we present the Graph Based Benchmark Suite (GBBS), a suite of scalable, provably-efficient implementations of over 20 fundamental graph problems for shared-memory multicore machines. Our results are obtained using a graph processing interface written in C++, extending the Ligra interface with additional functional primitives that have clearly defined cost bounds. Our approach enables writing high-level code that is simultaneously simple and high-performance by virtue of using highly optimized primitives. Another benefit is that optimizations, such as graph compression, are implemented transparently to high-level user code and can thus be utilized without changing that code. Our approach enables our implementations to scale to the largest publicly available real-world graph, containing over 200 billion edges, on a single multicore machine.

We show how to use GBBS to process and perform a variety of tasks on real-world graphs. We present the high-level C++ APIs that enable us to write concise, high-performance implementations. We also introduce a Python interface to GBBS, which lets users easily prototype algorithms and pipelines in Python that significantly outperform NetworkX, a mature Python-based graph processing solution.