Thursday, March 13, 2014

You can have the result, but you're not getting my data!

A recent study group focused on the ZQL query language that was published in USENIX 2013 by Microsft research (http://research.microsoft.com/pubs/184348/zql.pdf). The ZQL query language was created to address a simple problem. Given a set of private client data, how can an external server learn the result of a function on that data without ever (directly) learning anything about the private data. We focus on a setting of a single client that knows all the data and does not want to give it to the server. Although it may sound trivial, devising a practical method to achieve this whilst providing the desired security guarantees is far from it. ZQL, which supports a subset of SQL functions, achieves this through the use of zero-knowledge protocols to produce code that performs the data certification, computation and verification. The final compiled code aims to provide the following security properties:
  • Correctness. For any given source inputs, the sequential composition of the cryptographic queries for the data sources, the user, and the verifier yields the same result as the source query.
  • Integrity. An adversary given the capabilities of the user cannot get the verifier to accept any other result (except with a negligible probability).
  • Privacy. An adversary given the capabilities of the verifier, able to choose any two collections of in- puts such that the source query yields the same result, and given the result of the user’s cryptographic query, cannot tell which of the two inputs was used.

One running use case in the paper is in the setting of smart metering. A household may be billed at a different rate as per the usage at different times of the day (e.g. economy7 meters but more fine-grained). The privacy concern here is that a user may not want the energy company to know the exact usage for each second of the day (possibly allowing the energy supplier to learn information about the users habits - such as when the user has a cup of tea, leaves for work in the morning or what time they watch T.V.) but the energy supplier is entitled to know what the bill for the user should be. ZQL allows for the meter to perform the bill calculations (as per the request of the supplier), certify the data and prove that the resulting bill is correct without revealing any of the data used to compute the bill.

The authors constructed ZQL to have sufficient abstraction such that using ZQL does not require a thorough understanding of the underlying crypto functions (making it appealing/practical for industrial deployment). Each time a query is constructed, the ZQL compiler will synthesize all the required zero-knowledge protocols (currently supported by RSA and Elliptic Curve primitives). The package is tied up nicely with cost metrics for evaluating and verifying the queries on the data. On the other hand - ZQL is still in it's infancy and limited to relatively simple SQL-like functions in order to remain practical and preserve the security guarantees.

I would say this is a nicely presented paper. My understanding of SQL is sparse at best, but that being said, the authors do well to present the intuition of what they are trying to achieve and what they have attained thus far. People are becoming increasingly aware of the data being given away about their personal habits and this paper does well to demonstrate the application of various crypto and computer science primitves to piece together a nice solution to this challenging problem.

No comments:

Post a Comment