Ultrasonic hack ‘DolphinAttack’ sends inaudible voice commands to Siri, Alexa

Hackers can take control of smart devices using ultrasonic frequencies the human ear cannot detect, according to a new study. Tablets, phones or even in-car interfaces with major voice recognition platforms are all at risk.

Researchers at Zhejiang University published a white paper last Thursday detailing how they were able to attack devices like phones, tablets, and even in-car interfaces using a completely inaudible frequency. 

The DolphinAttack uses ultrasonic frequencies to access speech recognition systems such as Siri, Cortana or Alexa. Voice-controllable systems (VCS) let users control a device with simple voice commands, such as “Hey Siri, open Google.com” on Apple devices or “Alexa, what’s the weather like today?” on the Amazon Echo.

Most devices have a micro-electro-mechanical systems (MEMS) microphone, which contains a membrane: a movable plate that vibrates in response to air pressure changes caused by sound waves. The device records those vibrations and converts them into an electrical signal that can be turned back into sound waves on the receiving end.

The tiny microphones in phones and other devices can detect frequencies above 20,000 Hz, beyond the range of human hearing. The researchers modulated voice commands onto an ultrasonic carrier signal. Nonlinearities in the microphone hardware then shift the command back into the audible band inside the device, where the speech recognition system recognizes it.
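One way to picture how an inaudible signal can still deliver a voice command is amplitude modulation followed by a slightly nonlinear microphone. The sketch below is a toy model, not the researchers' code: a single 1,000 Hz tone stands in for speech, and the sample rate, the 30,000 Hz carrier, the modulation depth and the size of the microphone's squared term are all assumed values chosen for illustration.

```python
import cmath
import math

FS = 192_000        # sample rate in Hz (assumed; high enough to represent ultrasound)
F_VOICE = 1_000     # stand-in "voice" tone in Hz (assumed)
F_CARRIER = 30_000  # ultrasonic carrier in Hz (assumed)
DEPTH = 0.5         # modulation depth (assumed)
A2 = 0.1            # strength of the microphone's squared term (assumed)
N = 1_920           # 10 ms of samples, a whole number of cycles of each tone


def tone_amplitude(samples, freq_hz):
    """Estimate the amplitude of one frequency by correlating with a complex tone."""
    acc = sum(x * cmath.exp(-2j * math.pi * freq_hz * n / FS)
              for n, x in enumerate(samples))
    return 2 * abs(acc) / len(samples)


def modulate():
    """Amplitude-modulate the voice tone onto the ultrasonic carrier."""
    out = []
    for n in range(N):
        t = n / FS
        voice = math.cos(2 * math.pi * F_VOICE * t)
        carrier = math.cos(2 * math.pi * F_CARRIER * t)
        out.append((1 + DEPTH * voice) * carrier)
    return out


def nonlinear_microphone(samples):
    """Toy microphone model: linear response plus a small squared term."""
    return [x + A2 * x * x for x in samples]


transmitted = modulate()                      # what travels through the air
received = nonlinear_microphone(transmitted)  # what the device's circuitry sees

in_air = tone_amplitude(transmitted, F_VOICE)
in_mic = tone_amplitude(received, F_VOICE)
print(f"1,000 Hz amplitude in the air:    {in_air:.4f}")
print(f"1,000 Hz amplitude after the mic: {in_mic:.4f}")
```

In this model the transmitted signal contains energy only at 29,000, 30,000 and 31,000 Hz, so the first figure is effectively zero. Squaring the signal recreates the 1,000 Hz tone with amplitude A2 × DEPTH, which is why the second figure comes out at about 0.05.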

"DolphinAttack voice commands, though totally inaudible and therefore imperceptible to [a] human, can be received by the audio hardware of devices, and correctly understood by speech recognition systems," the team wrote in their paper.

The DolphinAttack was used to execute a number of commands, including “activating Siri to initiate a FaceTime call on an iPhone, activating Google Now to switch the phone to the airplane mode, and even manipulating the navigation system in an Audi automobile.”

Researchers speculated that the attack could also be used to command a device to visit a malicious website that would download a virus, initiate a phone call in order to listen in on the user or instruct the device to send messages or publish materials online under the user’s name.

PayPal even lets users send money by voice command from their smartphone, a feature that a DolphinAttack could potentially trigger.

Researchers claim they successfully tested the attack in five languages across 16 voice controllable systems including Apple iPhone, Google Nexus, Amazon Echo, and several in-car interfaces.

A separate team in the US also demonstrated that they could use inaudible voice commands to gain access to the Amazon Echo.

The team of researchers at Zhejiang University theorized how hackers could even breach a device that was trained to respond to only one person’s voice. They devised a way that hackers could use voice recordings of the device’s owner to synthesize the opening voice command using pieces of other words. Hackers could use a phrase such as “he carries cake to the city,” and slice up the phonemes to create the phrase “hey, Siri.”

They were even able to build a working portable transmitter from inexpensive parts that could be found in any electronics shop. A hacker could then perform a walk-by attack, gaining control of a device with no need to touch it, change its settings or install any malware. However, the transmitter did not work from more than five feet away.

The team did suggest ways to prevent a DolphinAttack. Users can turn off the wake-phrase function to stop hackers from reaching the voice-controllable system. However, that would mean users would have to open the voice recognition interface manually every time they wanted to use it.

The researchers also suggested that device manufacturers design microphones that do not respond to ultrasonic frequencies, or implement software that ignores commands arriving at those frequencies.
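A software check of that kind could, in principle, look at how much of an incoming signal's power sits above the range of human hearing. The sketch below is a hypothetical illustration, not the researchers' proposed defense: it assumes the device can read raw samples at a rate high enough to capture ultrasound, and the 0.5 rejection threshold is a placeholder, not a tuned value.

```python
import cmath
import math

FS = 192_000        # capture rate in Hz (assumed: raw samples, before any low-pass filter)
N = 960             # 5 ms analysis window
CUTOFF_HZ = 20_000  # upper edge of human hearing, per the article
THRESHOLD = 0.5     # hypothetical rejection threshold, not a tuned value


def ultrasonic_fraction(samples, fs=FS, cutoff_hz=CUTOFF_HZ):
    """Return the share of signal power that lies above cutoff_hz (naive DFT)."""
    n = len(samples)
    total = high = 0.0
    for k in range(1, n // 2):  # skip DC, stop below the Nyquist frequency
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        power = abs(acc) ** 2
        total += power
        if k * fs / n > cutoff_hz:
            high += power
    return high / total if total else 0.0


def should_ignore(samples):
    """Hypothetical policy: drop the command if most of its power is ultrasonic."""
    return ultrasonic_fraction(samples) > THRESHOLD


def tone(freq_hz):
    """Generate N samples of a unit-amplitude cosine at freq_hz."""
    return [math.cos(2 * math.pi * freq_hz * i / FS) for i in range(N)]


spoken = tone(1_000)     # stand-in for an ordinary audible command
carrier = tone(30_000)   # ultrasonic carrier (assumed frequency)
injected = [(1 + 0.5 * v) * c for v, c in zip(spoken, carrier)]

print(should_ignore(spoken), should_ignore(injected))  # prints: False True
```

A check like this only helps if the software can see the ultrasonic part of the signal at all. Many devices filter it out before the audio reaches the speech recognizer, which is one reason the researchers also point to hardware changes.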