Cloud computing changes the way we practice public speaking

Overcoming the fear of public speaking using cloud-based technology

By Vani Mandava, Senior Program Manager, Microsoft Research

People often rank public speaking as their number one fear. New cloud-based technology from researchers at the University of Rochester lets speakers practice and polish their delivery at home in front of a computer camera, with automated analysis providing instant feedback on how they are improving.

Leading this effort, known as ROC Speak, is M. Ehsan Hoque, an assistant professor of computer science and electrical and computer engineering at the University of Rochester, where he codirects the Rochester Human-Computer Interaction (ROC HCI) Lab.

Hoque has more than one motivation for helping people improve their communication. He has a brother with a severe social deficit who isn’t able to communicate well. Hoque has also received more than 2,000 emails from members of the public, some of whom wish they could practice public speaking on a computer in the privacy of their homes. Social speaking ability is valued by everyone, from academics who lecture to business leaders and students. With the donation of Microsoft Azure for Research tools, Hoque has explored and designed a new platform to help speakers.

At a conference a few years ago, a man walked up to Hoque and said he feared social stigma because his speaking style was monotonous and he had difficulty making eye contact with people. Hoque was inspired by that encounter to further develop what is now ROC Speak. It’s possible that in the future, ROC Speak will help people overcome speaking issues and might smooth social difficulties for people with Asperger’s Syndrome.

Among those testing the tool is Valentina Kutyifa, a research cardiologist and former president of the Toastmasters Club at the University of Rochester. She has helped build a collaboration between ROC Speak and the nonprofit Toastmasters International, which also helps people practice speaking. “I have used the ROC Speak product for some of my speeches, and I felt it’s very useful and helpful for preparing speeches and providing instant feedback,” she said.

Hoque explains that communication is much more complex than we realize, especially the nonverbal elements. The ROC Speak platform works by measuring many forms of nonverbal behavior simultaneously. Using the video camera and audio recording on the user’s laptop, the program measures eye gaze, word use, voice level, and hand gestures. ROC Speak uses techniques that automatically analyze these subtle human behaviors. In addition, the system provides feedback, which allows users to explore the nuances of their behavior during practice of a speech.

“The human face has 43 muscles. Using 43 muscles, we can create 10,000 unique facial expressions. To model nonverbal behavior, we need to get a lot of data and collect it in a naturalistic environment,” Hoque said. To facilitate capturing and handling that data, Microsoft Azure for Research gave the project access to cloud resources, including advanced tools in the Cortana Intelligence suite such as Azure Machine Learning and Microsoft Cognitive Services. This enabled the ROC Speak team to make the platform broadly available, capture and store participant data, and synthesize it.

Hoque and his lab used Azure-based tools to analyze user videos by automatically scoring visual features, such as smile intensity and movement, and audio features, such as pitch and loudness. After users record a 2-minute video, they receive immediate feedback from the machine-learning-based analysis. The feedback is presented in visually appealing graphs that show, for example, voice level for every few seconds of the speech. Word use, including both the speed of talking and the sophistication of language, is analyzed. Gestures are tracked as well. The user can choose to share the video with other users and receive ratings on such elements as friendliness and gestures, as well as an overall rating.
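To give a feel for what "voice level for every few seconds" and "speed of talking" might mean in practice, here is a minimal Python sketch. It is not ROC Speak's actual code, which this post does not describe. The function names, the 2-second window, and the use of RMS loudness in decibels are all assumptions chosen for illustration.

```python
import math


def windowed_loudness_db(samples, sample_rate, window_seconds=2.0):
    """Return one RMS loudness value (in dB relative to full scale) per window.

    Hypothetical helper: `samples` are floats in [-1.0, 1.0], and the
    2-second default window is an assumption, not a ROC Speak setting.
    """
    window = max(1, int(sample_rate * window_seconds))
    levels = []
    for start in range(0, len(samples), window):
        chunk = samples[start:start + window]
        rms = math.sqrt(sum(x * x for x in chunk) / len(chunk))
        # Floor near-silent windows so log10 never receives zero.
        levels.append(20 * math.log10(max(rms, 1e-9)))
    return levels


def words_per_minute(transcript, duration_seconds):
    """Estimate speaking rate from a transcript and its duration (hypothetical helper)."""
    return len(transcript.split()) / (duration_seconds / 60.0)


# Synthetic stand-in for a recording: a 4-second, 220 Hz tone at half amplitude.
rate = 8000
tone = [0.5 * math.sin(2 * math.pi * 220 * n / rate) for n in range(4 * rate)]
levels = windowed_loudness_db(tone, rate)
print([round(level, 1) for level in levels])  # one value per 2-second window
print(words_per_minute("thank you all for coming today", 3))
```

A list of per-window values like this is the kind of data a voice-level graph could plot over time. A real system would work from recorded audio and a speech-recognition transcript rather than a synthetic tone and a typed sentence.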

Azure has helped his team, Hoque said, because its intuitive user interface allowed his students to use it without prior experience in cloud computing. One of those students is Vivian Li, an undergraduate research assistant. Li was surprised by the sense of community that developed among ROC Speak users as they rated one another’s videos. She and Kutyifa were also inspired by how dramatically people improved as they practiced.

Hoque sees further developments for the ROC Speak project, especially as they gather more and more data from participants. “It is one of the largest datasets on nonverbal communication captured from people practicing public speaking in front of a computer,” he said. Because it is cloud based, there is enormous potential to grow even further and collect even more data.

One of the great advantages he sees to cloud computing is that he is able to tweak and improve his algorithms as users keep using the program. His next step is to deploy ROC Speak widely in the world. “Social skills are fundamental to who you are … So I think if there is a platform out there that helps you to be better with communication, it can change the way we communicate.”
