Private AI Bootcamp: Microsoft researchers share knowledge on cryptography, security, and privacy with PhD students

By Wei Dai, Senior Research SDE

The fields of cryptography and machine learning (ML) are evolving rapidly, and each is complex in its own way. The Cryptography and Privacy Research Group at Microsoft Research has led the way in creating a new research area at the intersection of these two important fields. Yet training programs and mentoring in new techniques such as homomorphic encryption remain scarce, while demand is high for people with the knowledge to improve these technologies. Researchers will need not only to advance the technologies themselves, but also to find creative ways to do so while protecting the privacy of the people who use these systems.

In December 2019, Microsoft researchers took steps to solve this problem by empowering a group of young PhD students with cutting-edge privacy-preserving knowledge and experience in the Private AI Bootcamp at Microsoft Research Redmond. The goals for this event were threefold: to help more PhD students become experts on innovative technologies, to build an awareness of real-world problems, and to contribute to the greater goal of privacy protection.

The Private AI Bootcamp was a three-day tutorial on using an advanced cryptographic tool, homomorphic encryption, for privacy-preserving ML (PPML). Thirty-four outstanding PhD students, chosen from more than 100 applicants in related fields, were funded by Microsoft Research Outreach and invited to Redmond for tutorials from cryptography and ML experts at Microsoft Research. Students came from top universities around the world; their application materials, including personal visions for their research, were reviewed by Microsoft researchers during selection. Their one-minute introductions at the start of the seminar made clear that the selected students brought diverse perspectives and backgrounds.

Funny group photo, Private AI Bootcamp, Dec 2, 2019

The bootcamp exceeded our expectations. In addition to numerous speakers, group discussions, and mentoring sessions, we held a contest in which students worked in eight groups to propose novel technology ideas for PPML using homomorphic encryption. Despite the narrow topic and less than 30 hours of preparation time, all eight groups presented high-quality solutions and implementations. (Presentations are detailed at the end of this post.)

A focus on teaching privacy-preserving machine learning

The overarching concept explored at the event by researchers and students was PPML. Thanks to growing awareness of privacy and related legislation, PPML techniques have attracted increasing attention from industry, academia, and government. Among possible approaches to PPML, homomorphic encryption stands out for its post-quantum security and its extraordinary ability to process data without decrypting it. Microsoft has long been a leader in homomorphic encryption. The Cryptography and Privacy Research Group at Microsoft Research maintains one of the most widely adopted homomorphic encryption libraries, Microsoft SEAL, and together with Microsoft Research Outreach, has been devoted to building a collaborative homomorphic encryption community and advancing the standardization of homomorphic encryption.

Speakers presented a wide range of topics related to PPML

Kristin Lauter, Partner Research Manager and leader of the Cryptography and Privacy Research Group at Microsoft Research, opened the event with a keynote introducing the basic concepts of PPML and homomorphic encryption. Lauter discussed instances where people’s information could be better protected while data is still shared to drive advances in areas such as government and healthcare. One technology she pinpointed was location services: certain apps collect and use location information, sometimes without the user knowing. As cloud technology using AI and ML advances, this illustrates the need to incorporate privacy standards and technology across many contexts.

Kristin Lauter, Private AI Bootcamp, Dec 2, 2019

Sreekanth Kannepalli, head of engineering for Foundry99, a startup incubation group within Microsoft Research, illustrated the growing difficulties the industry faces in protecting privacy. He shared common interests in and concerns about PPML from Microsoft customers and partners, covering both the proper use and the protection of data. Kannepalli emphasized the need for new collaboration models and regulations to improve privacy in applications across finance, healthcare, and manufacturing.

Sreekanth Kannepalli, Private AI Bootcamp, Dec 2, 2019

Arun Gururajan, principal data scientist working on ML for cybersecurity applications at Microsoft, gave an inspirational introduction to a phishing detection application that protects users’ privacy using homomorphic encryption.

Steve Rowe, Data Science and Engineering Manager in the Core Data team, and Eric Olson, Data Science Lead on the Developer Platform team within the Azure Cosine group, gave a fantastic introduction to ML, “Lifting the Lid on ML.” The goal was to give students enough background on ML to investigate further on their own. Judging by the lively interactions during and after the presentation, the talk was well received.

Eric Olson, Private AI Bootcamp, Dec 3, 2019

The rest of the tutorials consisted of several talks given by researchers from the Cryptography and Privacy Research Group at Microsoft Research.

Wei Dai, Senior Research SDE, started the series with an introduction to homomorphic encryption. Dai detailed how homomorphic encryption goes beyond other cryptographic methods to increase privacy: it allows computation on encrypted data without decryption, and the results stay encrypted, readable only by the holder of the secret key.
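In a nutshell, the property Dai described can be stated as follows. This is a simplified sketch of the functionality only; it omits the encryption randomness and the noise management that real schemes require:

```latex
\mathrm{Dec}_{sk}\big(\mathrm{Enc}_{pk}(x) \oplus \mathrm{Enc}_{pk}(y)\big) = x + y,
\qquad
\mathrm{Dec}_{sk}\big(\mathrm{Enc}_{pk}(x) \otimes \mathrm{Enc}_{pk}(y)\big) = x \cdot y
```

Here ⊕ and ⊗ denote addition and multiplication performed directly on ciphertexts: anyone holding the public key pk can encrypt data and compute on it, but only the holder of the secret key sk can read the result.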

Wei Dai, Private AI Bootcamp, Dec 2, 2019

Kim Laine, Senior Researcher, gave a hands-on tutorial on using the Microsoft SEAL library. Microsoft SEAL, the popular open-source homomorphic encryption library, is actively developed in C++17 and C#, with updates released frequently on its GitHub page.

Learn more about homomorphic encryption and Microsoft SEAL in our webinar with Kim Laine.
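To give a flavor of the hands-on material, below is a minimal sketch of the encrypt-compute-decrypt workflow in Microsoft SEAL, modeled on the BFV basics example that ships with the library. The API shown matches recent SEAL releases; names differ slightly in older versions, so treat it as illustrative rather than version-exact.

```cpp
#include "seal/seal.h"
#include <iostream>

using namespace seal;

int main() {
    // BFV parameters for exact integer arithmetic; the values follow
    // the library's own introductory example.
    EncryptionParameters parms(scheme_type::bfv);
    size_t poly_modulus_degree = 4096;
    parms.set_poly_modulus_degree(poly_modulus_degree);
    parms.set_coeff_modulus(CoeffModulus::BFVDefault(poly_modulus_degree));
    parms.set_plain_modulus(1024);
    SEALContext context(parms);

    // Generate keys: the public key encrypts, only the secret key decrypts.
    KeyGenerator keygen(context);
    SecretKey secret_key = keygen.secret_key();
    PublicKey public_key;
    keygen.create_public_key(public_key);

    Encryptor encryptor(context, public_key);
    Evaluator evaluator(context);
    Decryptor decryptor(context, secret_key);

    // Encrypt 6 and 7, multiply the ciphertexts, then decrypt the product.
    Plaintext x("6"), y("7");
    Ciphertext enc_x, enc_y, enc_prod;
    encryptor.encrypt(x, enc_x);
    encryptor.encrypt(y, enc_y);
    evaluator.multiply(enc_x, enc_y, enc_prod);

    Plaintext result;
    decryptor.decrypt(enc_prod, result);
    // BFV plaintexts print in hexadecimal, so 42 appears as "2A".
    std::cout << "6 * 7 = 0x" << result.to_string() << std::endl;
    return 0;
}
```

The multiplication happens entirely on encrypted data; the party running the Evaluator never sees 6, 7, or 42.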

Kim Laine, Private AI Bootcamp, Dec 2, 2019

Yongsoo Song, Senior Researcher, introduced the homomorphic encryption scheme most widely used in PPML, the CKKS scheme. Unlike other schemes, CKKS natively supports approximate arithmetic on real numbers encoded in fixed-point form, making it much faster for the kind of computation that is essential to ML. As one of the inventors of CKKS, Song not only understands the scheme deeply but also has insight into its design. He explained its theoretical construction and demonstrated its usage with several examples.
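For a sense of what that looks like in practice, here is a minimal sketch of CKKS usage in Microsoft SEAL, patterned after the library’s own CKKS example: it encrypts a vector of real numbers, squares it homomorphically, and decrypts an approximate result. The parameter and scale choices are illustrative, not a recommendation.

```cpp
#include "seal/seal.h"
#include <cmath>
#include <iostream>
#include <vector>

using namespace seal;

int main() {
    // CKKS parameters; the coefficient modulus bit sizes follow the
    // pattern used in SEAL's ckks_basics example.
    EncryptionParameters parms(scheme_type::ckks);
    size_t poly_modulus_degree = 8192;
    parms.set_poly_modulus_degree(poly_modulus_degree);
    parms.set_coeff_modulus(CoeffModulus::Create(poly_modulus_degree, {60, 40, 40, 60}));
    SEALContext context(parms);

    KeyGenerator keygen(context);
    SecretKey secret_key = keygen.secret_key();
    PublicKey public_key;
    keygen.create_public_key(public_key);
    RelinKeys relin_keys;
    keygen.create_relin_keys(relin_keys);

    Encryptor encryptor(context, public_key);
    Evaluator evaluator(context);
    Decryptor decryptor(context, secret_key);
    CKKSEncoder encoder(context);

    // Encode a vector of reals at scale 2^40, then encrypt.
    double scale = std::pow(2.0, 40);
    std::vector<double> input{1.5, 2.5, 3.5};
    Plaintext plain;
    encoder.encode(input, scale, plain);
    Ciphertext encrypted;
    encryptor.encrypt(plain, encrypted);

    // Square homomorphically; relinearization and rescaling keep the
    // ciphertext size and the scale under control.
    evaluator.square_inplace(encrypted);
    evaluator.relinearize_inplace(encrypted, relin_keys);
    evaluator.rescale_to_next_inplace(encrypted);

    // Decrypt and decode; CKKS results are approximate
    // (about 2.25, 6.25, 12.25).
    decryptor.decrypt(encrypted, plain);
    std::vector<double> output;
    encoder.decode(plain, output);
    for (size_t i = 0; i < input.size(); i++) {
        std::cout << input[i] << "^2 ~ " << output[i] << std::endl;
    }
    return 0;
}
```

Unlike BFV, CKKS trades exactness for efficient arithmetic on real numbers, usually an acceptable trade in ML, where inputs and models are approximate anyway.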

Yongsoo Song, Private AI Bootcamp, Dec 2, 2019

Hao Chen, Senior Researcher, demonstrated several expert techniques commonly used in PPML and highlighted open research problems in the area. Chen walked through examples of building applications that perform machine learning over encrypted data with homomorphic encryption, covering both linear and nonlinear functions.
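Nonlinear functions are the hard case: homomorphic encryption natively supports only additions and multiplications, so PPML systems typically replace activations such as the sigmoid with low-degree polynomial approximations. The sketch below is plain C++ rather than an encrypted computation; it compares the true sigmoid against a degree-3 fit of the kind used in the HE logistic regression literature (the coefficients are one commonly quoted least-squares fit over [-8, 8]).

```cpp
#include <cmath>
#include <cstdio>

// Degree-3 polynomial stand-in for the sigmoid. Under homomorphic
// encryption this costs only two ciphertext multiplications, whereas
// the exact sigmoid (exponentiation and division) cannot be evaluated
// directly on ciphertexts.
double sigmoid_poly3(double x) {
    return 0.5 + 0.197 * x - 0.004 * x * x * x;
}

int main() {
    for (double x : {-2.0, -1.0, 0.0, 1.0, 2.0}) {
        double exact = 1.0 / (1.0 + std::exp(-x));
        std::printf("x = %5.2f   sigmoid = %.4f   poly = %.4f\n",
                    x, exact, sigmoid_poly3(x));
    }
    return 0;
}
```

The approximation is accurate near the origin and degrades toward the ends of the interval, which is one reason input normalization matters in these systems.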

Hao Chen, Private AI Bootcamp, Dec 2, 2019

PhD student group presentations: New solutions in privacy

Each student group was asked to propose a novel technology combining ML and homomorphic encryption, based on what they had learned at the bootcamp, and to present a suggested direction for shaping the idea and investigating competing solutions. Many sessions were set aside for the students to brainstorm and discuss in groups. Hao Chen, Wei Dai, Kim Laine, Kristin Lauter, and Yongsoo Song served as mentors; each researcher spoke with every group and offered help and comments on their ideas.

Birth of the winning idea, Private AI Bootcamp, Dec 3, 2019

The contest took place on the last day. Each group presented to all other attendees and was graded by the mentors on the novelty, soundness, feasibility, and impact of the proposed idea.

Team presentations included:

• “Private Outsourced Translation” by Travis Morrison, Bijeeta Pal, Sarah Scheffler, and Alexander Viand (Team 1)
• “HappyKidz: Privacy Preserving Phone Usage Tracking” by Benjamin Case, Marcella Hastings, Siam Umar Hussain, and Monika Trimoska (Team 2)
• “i-SEAL2: Identifying Spam EmAiL with SEAL” by Martha Norberg Hovd, Ioannis Demertzis, Ning Luo, and David Froelicher (Team 3)
• “Privacy-Preserving Detection of Depression and Suicidal Intent from Speech Data” by Anders Dalskov, Daniel Kales, Shabnan Khanna, Deepika Natarajan, and Jiaheng Zhang (Team 4)
• “Ensuring Trust When Trading ML Models” by Laia Amorós, Syed Hafiz, Keewoo Lee, and Caner Tol (Team 5)
• “Secure Data Aggregation for Healthcare” by Rami El Khatib, Erin Hales, Leo de Castro, and Xinlei Xu (Team 6)
• “Private Video Recommendations for Children” by Ahn Pham, Mohammad Samragh Razlighi, Sameer Wagh, and Emily Willson (Team 7)
• “Privacy-Preserving Prescription Drug Management from HE Techniques” by Aria Shahverdi, Ni Trieu, Chenkai Weng, and William Youmans (Team 8)

The students quickly digested the tutorials, were attuned to real-world privacy concerns, brought creativity to their solutions, and capably implemented their ideas in software. Recordings of the tutorials and the students’ presentations are available on the event’s webpage, and all groups’ proposals have been summarized in short tech reports.
