If there’s one thing the digital age has made clear to us all, it is that our experience with products is everything. Gone are the days of studying instruction manuals; as each new iteration of our cherished devices reaches our hands, we expect a leap in their intuitiveness. And at the same time, we expect them to be more pleasing: faster, more capable, quieter.
For Microsoft, leading this progression requires top sound quality in everything we hear and whenever we speak. When you hold a Surface™ tablet in your hands, every ‘bong’ that informs you a window has opened affects your perception of the device. While videoconferencing with colleagues through a Surface Hub screen, you expect clear voice transmission – wherever you are in the room. And when you use Microsoft’s speech interface Cortana®, she must respond accurately to your voice commands.
Natural language interfaces are a major part of Microsoft’s vision for the future. The company’s world-leading brains are hard at work on human/machine interfaces that feel so natural and effortless that they essentially disappear. “The most natural human communication mode is speech and language – all over the world,” says Hundraj Gopal, Principal Human Factors Engineer at Microsoft. “We are finally at an inflection point. We are on the verge of using spoken language as a real and valuable communication interface with technology.”
Bringing the world together requires a dedication to achieving sound quality in the devices we use.
Beyond machines understanding us, we humans also need to understand each other effortlessly if technology is to become an ‘invisible’ assistant. So in phones and large screen devices alike, Microsoft uses multiple microphones to home in on our voices with location algorithms. By then separating our voices from the background noise, these devices can clarify the signal we need – so we don’t strain our ears to hear or raise our voices to be heard.
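To make the idea of homing in on a voice with several microphones more concrete, here is a minimal sketch of delay-and-sum beamforming, one of the simplest textbook techniques of this kind. It is an illustration only and not a description of Microsoft's algorithms: the array geometry, sample rate, test tone and all function names are assumptions chosen for the example.

```python
import math

SPEED_OF_SOUND = 343.0        # m/s, approximate value for air at room temperature
SAMPLE_RATE = 48_000          # Hz (assumed)
MIC_X = [0.00, 0.05, 0.10, 0.15]  # hypothetical 4-microphone line array, metres
TONE_HZ = 2_000.0             # hypothetical test tone standing in for a voice
TRUE_ANGLE = 30.0             # assumed direction of the talker, degrees from broadside


def steering_delays(angle_deg):
    """Whole-sample arrival delays per microphone for a far-field source."""
    sin_a = math.sin(math.radians(angle_deg))
    raw = [x * sin_a / SPEED_OF_SOUND * SAMPLE_RATE for x in MIC_X]
    earliest = min(raw)
    return [round(d - earliest) for d in raw]


def tone(n):
    """Sample n of the test tone (defined for negative n as well)."""
    return math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE)


def delay_and_sum(channels, delays):
    """Undo each channel's delay, then average across microphones."""
    length = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[n + d] for ch, d in zip(channels, delays)) / len(channels)
        for n in range(length)
    ]


def power(signal):
    """Mean-square value of a signal."""
    return sum(v * v for v in signal) / len(signal)


# Simulate the tone reaching each microphone with its own arrival delay.
arrival = steering_delays(TRUE_ANGLE)
channels = [[tone(n - d) for n in range(4800)] for d in arrival]

# Steering at the talker aligns the channels; steering elsewhere does not.
steered_power = power(delay_and_sum(channels, steering_delays(TRUE_ANGLE)))
broadside_power = power(delay_and_sum(channels, steering_delays(0.0)))
```

When the assumed direction matches the talker, the four channels add in phase and the output keeps the full power of the tone; when the array is steered elsewhere, the channels partly cancel. Scanning the steering angle and picking the strongest output is one simple way to locate a speaker.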
Beneath such clever programming, however, the quality of any audio interface ultimately comes down to its hardware. As LeSalle Munroe, Senior Engineer, Surface Devices, says: “Good voice recognition starts with good acoustic design. Our anechoic chambers and test equipment allow us to reliably characterize our microphone and speakers to give us the best chance of meeting our voice recognition goals.”
For all hardware devices, LeSalle and his colleagues characterize microphones and speakers precisely. “In general, we test components alone, and then components in the whole system, focusing on raw acoustics like frequency response, total harmonic distortion (THD), rub and buzz, dynamic range, acoustic seal, sensitivity and noise floor,” he says. “Then we do full system qualification with added processing.”
The last step is to test voice recognition and sound quality. “This can take up more than 50% of the time, because it is a very iterative process,” says LeSalle. “We investigate the relevant aspects of audio engineering technologies, and map them to human perception, acceptance and annoyance, in order to increase user satisfaction.”
Much of Microsoft’s hardware testing takes place in Building 87, on Microsoft’s Redmond Campus. Inside, Cortana gets blasted with precise speech from a Head and Torso Simulator (HATS) or mouth simulator, which she must understand and respond to – whatever background noise the researchers add. The researchers also test the ability of beamforming algorithms to locate a speaker’s voice – again in quantified background noise. 3D spatialization technologies are tested on HATS to see how effective they are at conveying the audio cues we need to immerse us in authentic sound fields – particularly for the HoloLens augmented reality headset. The researchers also measure sounds such as keyboard and trackpad clicks, to find the most pleasing sounds for a device to confirm our interactions with it.
Whatever the test, a controlled acoustic environment is critical. Microsoft has several anechoic chambers in Building 87, but with their quietest one, they have gone beyond merely controlled. With a background noise level of –20.6 dB(A) SPL, its noise floor is closer to the absolute lowest sound level possible than that of other anechoic chambers. It even took the Guinness World Record in 2015.
One of the core reasons behind the huge effort to make this record-breaking chamber was to test for noise sources such as humming displays, singing capacitors, rattling components and structural vibrations. “Being able to capture and characterize printed circuit board noise is a huge challenge for us,” says LeSalle. Although such noise levels are often tiny and well below the levels that our ears can detect, they can add up in non-linear ways to make a total noise that is audible, annoying, and interferes with voice recognition.
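As a rough illustration of how individually tiny noise sources can add up, the sketch below uses the standard energetic sum for independent (incoherent) sources, in which the powers are added rather than the decibel values. This is only the simplest model of combined noise, and the source levels are hypothetical numbers chosen for the example, not measurements from Microsoft.

```python
import math


def combine_levels_db(levels_db):
    """Total level of incoherent sources: sum the powers, not the decibels."""
    total_power = sum(10 ** (level / 10) for level in levels_db)
    return 10 * math.log10(total_power)


# Hypothetical example: ten small sources, each at -5 dB.
sources_db = [-5.0] * 10
total_db = combine_levels_db(sources_db)

# Two equal sources always come out about 3 dB above one of them.
pair_db = combine_levels_db([-5.0, -5.0])
```

Under this model, ten equal sources sit 10 dB above any single one of them, so a set of components that each stay below a given threshold can still exceed it together.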
“We always want to have the best tools available for the job,” continues LeSalle. “Our other anechoic chambers are very good, no doubt. However, we wanted to build one with even better audio capabilities, so we could measure lower levels of sound, a higher purity of sound measurements, and increase the validity and reliability of our measurements – so we can quantify the audio performance of our products at a finer and greater level of detail. The chamber and the Brüel & Kjær microphones and preamps we use allow us to achieve the repeatability we want.”
It’s probably no surprise that Microsoft’s engineers are perfectionists. And according to Gopal, it is a prerequisite. “Top products require a long-term commitment to excellence: top-notch experts from several disciplines and high-quality equipment,” he says. With this recipe for success, Microsoft can be sure the sound performance in their devices is built on the purest data. With precise knowledge of their individual components and systems, and the sharpest algorithms and codecs, they are melting the machine/human divide.
But the record-breaking lab is about more than the finest, most reliable measurements today. It’s a stone-built commitment to developing top quality hardware in the future. Because when Microsoft’s researchers are innovating how we will interact with new devices, there’s no roadmap to follow. They must imagine and then build their visions on the best foundations possible. And whatever amazing leaps that requires, Microsoft wants the best tools to hand, ready to realize the future we all want to see and hear.
The world’s quietest room is just one chamber within Building 87. This cutting-edge complex of hardware labs houses research into acoustics, human engineering factors such as ergonomics, and the ‘Lab of the Future’, where some of the world’s leading experts in fields as diverse as psychoacoustics, industrial design and history come together to find new approaches to human/machine interfaces. Few had seen inside Building 87 before late 2015, but following the acoustic world record, Microsoft has made the whole complex accessible through an interactive tour and videos.
Microsoft’s record-holding chamber was specified by a large team at Microsoft and built by acoustic chamber specialists Eckel Industries Inc. The team paid meticulous attention to ventilation systems, sprinkler systems, lighting, vibration control, instrumentation panels, cabling, and electrical noises.
Brüel & Kjær and BlackHawk Technology Inc. measured the noise floor of Microsoft’s quietest anechoic chamber at –20.6 dB(A) SPL. The theoretical lower limit for noise is set by Brownian motion – the random movement of particles in a gas or liquid – at –23 dB. The measurement method was specified by Guinness and used a two-microphone coherent power measurement technique with two Type 4955 low-noise microphones. The acousticians obtained the same overall dB(A) level in repeated measurements.
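The reason for using two microphones can be sketched in a few lines. Both microphones pick up the same room sound, but each adds its own independent electrical self-noise; averaging the product of the two signals keeps the shared part and lets the independent parts cancel. The simulation below is a simplified time-domain illustration of that principle with made-up signal levels. It is not the procedure specified by Guinness, which is not detailed here.

```python
import math
import random

rng = random.Random(0)        # fixed seed so the sketch is repeatable
N = 200_000                   # number of simulated samples
ROOM_RMS = 0.5                # hypothetical room sound, common to both microphones
SELF_NOISE_RMS = 1.0          # hypothetical self-noise, independent per microphone

cross_sum = 0.0
single_sum = 0.0
for _ in range(N):
    room = rng.gauss(0.0, ROOM_RMS)
    mic_a = room + rng.gauss(0.0, SELF_NOISE_RMS)
    mic_b = room + rng.gauss(0.0, SELF_NOISE_RMS)
    cross_sum += mic_a * mic_b   # shared sound survives, self-noise averages out
    single_sum += mic_a * mic_a  # one microphone alone: sound plus self-noise

coherent_power = cross_sum / N
single_mic_power = single_sum / N
true_power = ROOM_RMS ** 2


def to_db(power_ratio):
    """Express a power ratio in decibels."""
    return 10 * math.log10(power_ratio)


# How far the single-microphone reading sits above the coherent estimate.
overestimate_db = to_db(single_mic_power / coherent_power)
```

In this simulation the single microphone reads several decibels too high because its own noise swamps the room sound, while the two-microphone estimate lands close to the true power. That is why a room quieter than the microphones themselves can still be measured.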