Master's Defense

Ranking Aggregate Query Answers with Relevance, Diversity and Coverage

Speaker: Xiaodan Zhu
xdzhu at
Date: Monday, April 18, 2016
Time: 1:00pm - 2:30pm
Location: D344 LSRC, Duke


Making top-k results more meaningful has recently attracted significant attention, since it can improve the quality and utility of the results returned in response to user queries. Previous research considered three aspects of the answers returned to the user (diversity, relevance, and coverage) but addressed only one or two of them at a time. In this work, we propose a novel, intuitive framework for ranking aggregate query results that considers all three aspects simultaneously by clustering together result tuples that share the same values for some of the attributes. In particular, given a distance parameter D and a coverage parameter L, instead of returning the top-k original result tuples according to their values, our goal is to output at most k clusters such that (a) the clusters are at least distance D from each other (diversity), (b) together they cover the top-L elements of the original answers (coverage), and (c) the total value of the clusters is maximized (relevance). We explore the complexity of this optimization problem, propose efficient algorithms to find the clusters, and provide experiments on real data to evaluate the algorithms. We also illustrate a simple user interface for the system that presents the new answers in two layers (the output clusters and the original result tuples contained in them) and helps the user interact with the system by varying the input parameters for diversity and coverage.
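
To make the three constraints concrete, here is a minimal brute-force sketch in Python. It is not the algorithm from the thesis, and several details are assumptions that the abstract does not specify: a cluster is modeled as a pattern in which some attributes are replaced by a wildcard, the distance between two clusters is taken to be the number of attribute positions where their patterns differ, and the value of a cluster is taken to be the average value of the tuples it covers. All function names are hypothetical, attribute values are assumed to be strings, and the search is exponential, so it is only suitable for tiny inputs.

```python
from itertools import combinations, product

WILDCARD = "*"  # assumed notation for "any value of this attribute"

def covers(pattern, row):
    """A pattern covers a row if every non-wildcard attribute matches."""
    return all(p == WILDCARD or p == v for p, v in zip(pattern, row))

def distance(p, q):
    """Assumed distance: number of positions where two patterns differ."""
    return sum(a != b for a, b in zip(p, q))

def candidate_patterns(rows):
    """Every pattern obtained by wildcarding any subset of a row's attributes."""
    patterns = set()
    for row in rows:
        for mask in product([False, True], repeat=len(row)):
            patterns.add(tuple(WILDCARD if m else v for v, m in zip(row, mask)))
    return patterns

def best_clusters(scored_rows, k, L, D):
    """Exhaustively pick at most k patterns; scored_rows maps tuple -> value."""
    rows = list(scored_rows)
    top_L = sorted(rows, key=scored_rows.get, reverse=True)[:L]

    def value(pattern):
        # Assumed cluster value: average value of the covered tuples.
        covered = [scored_rows[r] for r in rows if covers(pattern, r)]
        return sum(covered) / len(covered)

    patterns = sorted(candidate_patterns(rows))  # sorted for determinism
    best, best_value = None, float("-inf")
    for size in range(1, k + 1):
        for combo in combinations(patterns, size):
            # (a) diversity: every pair of clusters is at least D apart
            if any(distance(p, q) < D for p, q in combinations(combo, 2)):
                continue
            # (b) coverage: every top-L tuple lies in some chosen cluster
            if not all(any(covers(p, r) for p in combo) for r in top_L):
                continue
            # (c) relevance: keep the combination with the largest total value
            total = sum(value(p) for p in combo)
            if total > best_value:
                best, best_value = combo, total
    return best, best_value

# Toy aggregate result: (attribute1, attribute2) -> aggregate value
scores = {("a", "x"): 10, ("a", "y"): 8, ("b", "x"): 2}
print(best_clusters(scores, k=1, L=2, D=1))  # ((('a', '*'),), 9.0)
```

In the toy input, a single cluster must cover the two highest-valued tuples, so the pattern ('a', '*') is chosen over the all-wildcard pattern because its average value is higher. Raising k or changing D and L shows how the diversity and coverage parameters trade off against total value.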
Advisor(s): Sudeepa Roy
Committee: Jun Yang, Rong Ge