Group-by Query Verification by Untrusted Clients on Outsourced Data Streams

Outsourcing data streams and the desired computations to a third party such as the cloud is practical for many companies because of the overwhelming flow of information and the excessively high resource requirements of their data stream applications. However, data outsourcing and remote computation intrinsically raise issues of trust, making it crucial to verify the results returned by third parties. In this context, we propose a novel solution to verify outsourced “Group-By, Sum” (or histogram) queries, which are common in many business applications. We consider a setting where a data owner employs an untrusted remote server to run continuous Group-By, Sum queries on a data stream that the owner forwards to the server. Untrusted clients then query the server for the computed results. Importantly, a client can efficiently verify the correctness of the results by using a small, easy-to-compute signature provided by the data owner. Our work complements previous work on authenticating remote computation of selection and aggregation queries. Moreover, unlike prior work on remote Group-By queries, we support untrusted clients (who can collude with other clients or with the server) and provide stronger cryptographic guarantees. Experimental results on real and synthetic data show that our solution is practical and efficient.