Wentao Wu

Principal Researcher

About

I am with the Data Systems group at Microsoft Research. I received my Ph.D. from the Database Group at the University of Wisconsin-Madison, under the supervision of Prof. Jeffrey Naughton.

I have broad interests in database systems, data mining, and machine learning. I am currently working on query optimization, query processing, database system performance tuning, big data systems, distributed systems, data stream processing, and machine learning systems. In the past, I have worked on a variety of topics, including graph data management, personal data management, knowledge base construction, social network analysis, data privacy, entity matching in data integration, and database as a service in the cloud.

Since I joined Microsoft Research, my primary focus has been the “Autonomous Index Tuning for Database Systems” project, with an emphasis on using machine learning (ML) technologies to improve the efficiency and effectiveness of index tuning. More details can be found in [SIGMOD’22], [SIGMOD’22], [VLDB’22], [SIGMOD’19], and [VLDB’18]. I have also worked with production teams on developing indexing technologies in Helios [VLDB’20] and Hyperspace [VLDB’21]. Helios is a system for inexpensive and flexible ingestion, indexing, and aggregation of large streams of real-time data at Microsoft, which combines the cloud and the edge as a single, holistic data processing platform. Helios has been featured in two entries of “the morning paper” blog series (part 1 and part 2). Hyperspace introduces an indexing subsystem for Apache Spark. It has been used by Azure Synapse Analytics and has also been open-sourced on GitHub.

Before joining Microsoft, I worked on the “Cost Modeling and Query Optimization for Database Systems” project, using sampling-based technologies. More details can be found in [SIGMOD’16], [VLDB’14], [VLDB’13], and [ICDE’13]. I also worked on developing Probase [SIGMOD’12], a probabilistic knowledge graph for text understanding, which later became the Microsoft Concept Graph.

Microsoft Research Lab – Redmond