"Similarity Search in Large Graph Databases: a Multi-layered Indexing Approach"
, 499 DSL
Today's highly networked world is facing computational challenges raised in particular by the abundance of complex and interconnected data, which, without loss of generality, are often modeled and interpreted as graphs. The proliferation of graph-structured data in widely varying applications has sparked a growing interest in enabling efficient access capabilities and flexible, structure-aware querying functionalities. In this talk, we consider the classical, NP-hard similarity search problem that retrieves relevant graphs from a graph database under the well-known graph edit distance (GED) constraint. Our contributions include (1) a parameterized, partition-based GED lower bound, (2) an efficient, selectivity-aware graph partitioning algorithm, and (3) a cost-effective, multi-layered indexing structure for GED lower bound evaluation and false-positive graph filtering with theoretical performance guarantees. Experimental studies in real and synthetic graph databases validate the efficiency and effectiveness of our method, which achieves up to an order of magnitude speedup over the state-of-the-art techniques for similarity search in large graph databases.