

The first time Alex Acero saw Her, he watched it like a normal person. The second time, he didn't watch the movie at all. Acero, the Apple executive in charge of the tech behind Siri, sat there with his eyes closed, listening to how Scarlett Johansson voiced her artificially intelligent character Samantha. He paid attention to how she talked to Theodore Twombly, played by Joaquin Phoenix, and how Twombly talked back. Acero was trying to discern what about Samantha could make someone fall in love without ever seeing her.

When I ask Acero what he learned about why the voice worked so well, he laughs because the answer is so obvious. "It was not robotic!" This hardly counts as a revelation for Acero. Mostly, it confirmed that his team at Apple has spent the last few years on the right project: making Siri sound more human.

This fall, when iOS 11 hits millions of iPhones and iPads around the world, the new software will give Siri a new voice. It doesn't include many new features or tell better jokes, but you'll notice the difference. Siri now takes more pauses in sentences, elongates syllables right before a pause, and the speech lilts up and down as it speaks. The words sound more fluid and Siri speaks more languages, too. It's nicer to listen to, and to talk to.

A few years ago, the team at Apple, led by Acero, took control of Siri's back-end and revamped the experience. It's now based on deep learning and AI, and has improved vastly as a result. Siri's raw voice recognition rivals all its competitors, correctly identifying 95 percent of users' speech. The AI works in two distinct and critical parts of the system: speech-to-text, in which Siri tries to figure out what you said, and text-to-speech, in which Siri speaks back.

Among Siri's most important jobs is distinguishing your voice from everyone else's, especially as these systems become more personalized. The more data Siri has, and the better Apple's models become, the more it can discern between people and understand even heavy accents. It's also a security concern: Researchers recently found they could communicate with Siri at frequencies too high for humans to hear, rendering the hack invisible. Siri needs to learn to separate human speech from machine speech, and your speech from everyone else's.

One helpful way to understand how these systems work is through Apple's process of teaching Siri a new language. When bringing Siri into a new market, say Shanghai, the team first finds pre-existing databases of local speech. They supplement that by hiring local voice talent, and having them read books, newspapers, web articles, and more.

Apple's team transcribes those recordings, matching words to sounds and, more importantly, identifying phonemes, the individual sounds that make up all speech. (In English, "fourteen" is a word; the toothy "e" sound in the middle is a phoneme.) They try to capture these phonemes spoken in every imaginable way: trailing off at the end of the word, harder at the beginning, longer before a pause, rising in a question. Each utterance has a slightly different sound wave, which Apple's algorithms analyze to find the best fit for any given sentence.
