Speaking in public is the top fear for many people. Now, researchers from the Human-Computer Interaction Group at the University of Rochester have developed an intelligent user interface for "smart glasses" that gives real-time feedback to the speaker on volume modulation and speaking rate, while being minimally distracting.
The Rochester team describes the system, which they have called Rhema after the Greek word for "utterance," in a paper presented at the Association for Computing Machinery's Intelligent User Interfaces (IUI) conference in Atlanta.
Smart glasses with Rhema installed can record a speaker, transmit the audio to a server to automatically analyze the volume and speaking rate, and then present the data to the speaker in real time. This feedback allows a speaker to adjust the volume and speaking rate or continue as before.
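As a rough illustration of the kind of analysis the server might perform, the sketch below computes two simple measures from a stretch of speech: a loudness level and a speaking rate. It is not the Rhema implementation. The function names are hypothetical, the audio is assumed to arrive as floating-point samples between -1.0 and 1.0, and the word count is assumed to come from some separate speech recognizer that is not shown.

```python
import math
from typing import Sequence


def rms_dbfs(samples: Sequence[float]) -> float:
    """Return the RMS level of the samples in dB relative to full scale.

    Assumes samples are floats in [-1.0, 1.0]. Silence or an empty
    window yields negative infinity.
    """
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")
    # 10 * log10(mean square) equals 20 * log10(RMS).
    return 10.0 * math.log10(mean_square)


def words_per_minute(word_count: int, duration_seconds: float) -> float:
    """Return the speaking rate for a window of speech.

    word_count is assumed to come from a separate recognizer (hypothetical).
    """
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return word_count * 60.0 / duration_seconds
```

For example, 50 words spoken in a 20-second window would correspond to a rate of 150 words per minute.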
Ehsan Hoque, assistant professor of computer science and senior author of the paper, used the system himself while giving lectures last term. "My wife always tells me that I end up speaking too softly," he says. "Rhema reminded me to keep my volume up. It was a good experience." He feels the practice has helped him become more aware of his volume, even when he is not wearing the smart glasses.
In the paper, Hoque and his students M. Iftekhar Tanveer and Emy Lin explain that providing feedback in real time during a speech presents some challenges. "One challenge is to keep the speakers informed about their speaking performance without distracting them from their speech," they write. "A significant enough distraction can introduce unnatural behaviors, such as stuttering or awkward pausing. Secondly, the head mounted display is positioned near the eye, which might cause inadvertent attention shifts."
Tanveer, the lead author of the paper, explains that overcoming these challenges was their focus. To do this, they tested the system with a group of 30 native English speakers using Google Glass. They evaluated different options for delivering the feedback, experimenting with different colors (like a traffic light system), words and graphs, and no feedback at all (as a control). They also tried both a continuous, slowly changing display and a sparse feedback system, in which the speaker sees nothing on the glasses most of the time and then sees feedback for just a few seconds. In user testing, most test users judged feedback delivered every 20 seconds in the form of words ("louder," "slower," nothing if the speaker is doing a good job, etc.) to be the most successful.
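The sparse word feedback described above can be sketched as a simple rule applied once per 20-second window. This is a minimal illustration, not the researchers' method: only the 20-second interval and the words "louder" and "slower" come from the article, while the threshold values, the function names, and the choice to check volume before rate are assumptions made for the example.

```python
from typing import List, Optional, Tuple

FEEDBACK_INTERVAL_SECONDS = 20   # interval reported in the article
MIN_VOLUME_DB = -30.0            # assumed threshold, not from the paper
MAX_WORDS_PER_MINUTE = 160.0     # assumed threshold, not from the paper


def choose_feedback(volume_db: float, wpm: float) -> Optional[str]:
    """Return a one-word prompt, or None if the speaker is doing well.

    Checking volume before rate is an arbitrary choice for this sketch.
    """
    if volume_db < MIN_VOLUME_DB:
        return "louder"
    if wpm > MAX_WORDS_PER_MINUTE:
        return "slower"
    return None


def feedback_schedule(
    windows: List[Tuple[float, float]],
) -> List[Tuple[int, str]]:
    """Map per-window (volume_db, wpm) measurements to timed prompts.

    Each window covers FEEDBACK_INTERVAL_SECONDS of speech. Windows that
    need no prompt are omitted, so the display stays blank for them.
    """
    schedule = []
    for index, (volume_db, wpm) in enumerate(windows, start=1):
        word = choose_feedback(volume_db, wpm)
        if word is not None:
            schedule.append((index * FEEDBACK_INTERVAL_SECONDS, word))
    return schedule
```

With these assumed thresholds, a talk whose first window is too quiet, whose second is fine, and whose third is too fast would produce a "louder" prompt at 20 seconds, nothing at 40 seconds, and a "slower" prompt at 60 seconds.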
The researchers also highlight that users who received this word feedback felt, overall, that it helped them improve their delivery, compared with users who received continuous feedback or no feedback at all. They also assessed the system from the point of view of the audience, enlisting 10 Mechanical Turk workers.
"We wanted to check if the speaker looking at the feedback appearing on the glasses would be distracting to the audience," Hoque said. "We also wanted the audience to rate if the person appeared spontaneous, paused too much, used too many filler words and maintained good eye contact under the three conditions: word feedback, continuous feedback, and no feedback."
However, there was no statistically significant difference among the three groups on eye contact, use of filler words, being distracted, and appearing stiff, as judged by the Mechanical Turk workers. As part of their future work, the researchers want to test their system with members of Toastmasters International as a more knowledgeable audience.
The researchers also believe that live feedback displayed in a private and non-intrusive manner could be useful for people with social difficulties (e.g., Asperger syndrome), and even for people working in customer service.
Source and top image: University of Rochester