Query-biased summaries for tabular data

Vincent Au; Paul Thomas; Gaya K Jayasinghe

Proceedings of the Australasian Document Computing Symposium | December 2016

Published by ACM

Government, research, and academic data portals publish a large amount of public data, but present tools make discovery difficult. In particular, search results do not support a user’s decision whether or not to commit to a download of what might be a large data set.

We describe a method for producing query-biased summaries of tabular data, which aims to support a user’s download decision, or even to answer the question on the spot with no further interaction. The method infers simple types in the data and query; automatically refines queries, where that makes sense; extracts relevant subsets of the complete table; and generates both graphical and tabular summaries of what remains. A small-scale user study suggests this both helps users identify useful results (fewer false negatives) and reduces wasted downloads (fewer false positives).
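
To make the pipeline above concrete, here is a minimal Python sketch of two of its steps: inferring simple types and extracting a query-relevant subset of a table. It is an illustration under stated assumptions, not the authors' implementation. The function names (`infer_type`, `summarise`), the two-way number/text typing, and the matching heuristic are all hypothetical, and query refinement and graphical summaries are left out.

```python
import csv
import io


def infer_type(values):
    """Return 'number' if every non-empty value parses as a float, else 'text'."""
    non_empty = [v for v in values if v.strip()]
    if not non_empty:
        return "text"
    try:
        for v in non_empty:
            float(v)
    except ValueError:
        return "text"
    return "number"


def cell_matches(cell, term, kind):
    """Numeric terms must equal the cell; text terms match as substrings."""
    if kind == "number":
        return bool(cell.strip()) and float(cell) == float(term)
    return term in cell.lower()


def summarise(csv_text, query, max_rows=5):
    """Build a query-biased tabular summary (hypothetical heuristic).

    Assumes a rectangular CSV with a header row. Query terms that name a
    column select that column; every other term must match a cell in a
    column of the same inferred type for a row to be kept.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return {"types": {}, "columns": [], "rows": []}

    columns = list(rows[0].keys())
    types = {c: infer_type([r[c] for r in rows]) for c in columns}
    by_name = {c.lower(): c for c in columns}
    terms = query.lower().split()

    wanted = {by_name[t] for t in terms if t in by_name}
    filters = [(t, infer_type([t])) for t in terms if t not in by_name]

    kept_rows, hit_columns = [], set()
    for row in rows:
        row_hits = set()
        for term, kind in filters:
            hits = [c for c in columns
                    if types[c] == kind and cell_matches(row[c], term, kind)]
            if not hits:
                break  # one unmatched term rules the row out
            row_hits.update(hits)
        else:
            kept_rows.append(row)
            hit_columns.update(row_hits)

    # Show matched and requested columns; fall back to the full table.
    shown = [c for c in columns if c in wanted or c in hit_columns] or columns
    return {
        "types": types,
        "columns": shown,
        "rows": [{c: r[c] for c in shown} for r in kept_rows[:max_rows]],
    }
```

For a table with columns `state`, `year`, and `rainfall`, the query `rainfall nsw 2016` would treat `rainfall` as a column selector, `nsw` as a text filter, and `2016` as a numeric filter. The result is the kind of small table a search result page could show in place of a download link.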