Abstract
Keyword search is widely used to query graph data. Because keyword queries lack structural support, a single query may generate an excessive number of matches, referred to as "answer graphs", that can include different relationships among the keywords. An overlooked yet important task is to group and summarize answer graphs that share similar structures and contents, for better query interpretation and result understanding. This paper studies the summarization problem for the answer graphs induced by a keyword query Q. (1) A notion of summary graph is proposed to characterize the summarization of answer graphs. Given Q and a set of answer graphs G, a summary graph preserves the relation of the keywords in Q by summarizing the paths connecting the keyword nodes in G. (2) A quality metric for summary graphs, called coverage ratio, is developed to measure the information loss of summarization. (3) Based on this metric, a set of summarization problems is formulated, each of which aims to find minimized summary graphs with a certain coverage ratio. (a) We show that the complexity of these summarization problems ranges from PTIME to NP-complete. (b) We provide exact and heuristic summarization algorithms. (4) Using real-life and synthetic graphs, we experimentally verify the effectiveness and efficiency of our techniques. © 2013 VLDB Endowment.