The foundations of voice-enabled technology were laid in 1952 with Audrey, the first documented speech recognition system, and built upon 14 years later with Eliza, an early natural language processing program. Innovation continued in the decades that followed. Today, we have automated assistants that can talk, understand, and perform actions based on our commands.
In 2011, Apple released the iPhone 4S and made waves by introducing Siri. Other major tech companies followed suit: Amazon launched Alexa in 2014, Microsoft introduced Cortana that same year as part of its makeover, and Google gave us Google Assistant in 2016. These AI-powered virtual assistants are more than just talking alarms. They can issue reminders throughout the day, perform queries, control smart homes, and manage our connected devices.
Even amid the ubiquity of AI-powered smart assistants and conversational bots, voice-enabled technology still offers the most personal communication experience. It preserves the human presence in the tech-driven smart world that we live in.
Now, as social distancing becomes the new normal, no-touch, voice-enabled technology is taking off in several Asian countries where consumers are conscientious and proactive about avoiding infection. The concern that mobile devices—which are touched more than 2,600 times a day, according to one study—can spread the coronavirus is one of the factors expediting this growth.
Conversational AI on the rise
“The first quarter of 2020 was one of our busiest quarters since 2017,” said Raghavendra Kumar Ravinutala, the co-founder of Yellow Messenger.
In 2018, the global market for conversational AI platforms was valued at around USD 2.364 billion, and it is expected to grow at a CAGR of 26.75% until 2025. The adoption and deployment of voice-enabled technology are on the rise all over the world, and India is not far behind. According to a recent report titled “Voice technology in India: Now and future – Consumer and business perspective,” India’s voice market is set to leap from USD 21 million at the end of 2019 to USD 58.4 million by the end of 2020, a rise the report puts at around 40.47% within a year.
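For readers who want to work through the market figures above, here is a minimal Python sketch of the arithmetic. The helper names `project` and `implied_growth` are illustrative, and the choice of seven annual compounding periods between 2018 and 2025 is an assumption, since the article does not say how the forecast was compounded.

```python
def project(start_value: float, annual_rate: float, years: int) -> float:
    """Compound start_value at annual_rate for the given number of years."""
    return start_value * (1 + annual_rate) ** years


def implied_growth(start_value: float, end_value: float) -> float:
    """Fractional change between two endpoint values."""
    return (end_value - start_value) / start_value


# Global conversational AI market: USD 2.364 billion in 2018, 26.75% CAGR.
# Assumption: seven annual compounding periods, 2018 through 2025.
global_2025 = project(2.364, 0.2675, 2025 - 2018)
print(f"Projected 2025 market: USD {global_2025:.1f} billion")

# India voice market endpoints quoted from the report (USD million).
india_change = implied_growth(21.0, 58.4)
print(f"Change implied by the two endpoints: {india_change:.1%}")
```

The same helpers can be used to check any pair of endpoint figures against a quoted percentage. How closely the two agree depends on how the source report defines its growth rate.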
Voice commerce, meaning an arrangement where consumers can use verbal commands to search for products and make purchases, is an essential service in countries like India, where about a quarter of the population is not literate and spoken language is their only medium of communication.
Since March, Yellow Messenger has seen a surge in demand for conversational AI, especially from players in e-commerce, banking, insurance, retail, healthcare, and FMCG industries, all of which are looking to digitize their call centers, automate their sales engines, and drive direct-to-consumer commerce and services.
Owing to massive business disruptions due to COVID-19, as well as the urgency to adopt solutions to stay afloat, companies are turning to conversational AI. Ravinutala said, “Deals that would normally take at least three months to close, are now being closed and implemented within three days. We are also seeing FMCG and retail companies offer ‘direct-to-consumer (D2C)’ [sales] on WhatsApp.” As one of the most popular ways to communicate, WhatsApp provides the perfect environment for conversational commerce. With WhatsApp Business, companies can integrate voice commerce features for faster, smoother, more accessible customer outreach.
According to research and consulting firm Ovum’s report titled “Digital assistant and voice AI-capable device forecast for 2016 to 2021,” there will be more than 7.5 billion voice-enabled devices in the hands of consumers around the world by 2021, and the next generation of users will expect their web-based interactions with financial institutions, retail outlets, telecom service providers, and other businesses to be voice-first. However, given how widely WhatsApp has been adopted across countries and how easy it is to build a chatbot on it, even one that responds to spoken input, many enterprises in India and elsewhere are developing these capabilities on the messaging platform before applying them to other contexts.
Yellow Messenger stands out in the current climate, in that it managed to raise USD 20 million in Series B funding at a time when investors are extra cautious and many startups are in survival mode.
The company’s key product merges voice tech, AI, and multilingual capabilities with tools that many businesses already use, like enterprise resource planning software, human resource management systems, and customer relationship management systems. “Our platform has consistently offered a scalable solution by delivering meaningful and measurable results to enterprises across the globe. We’ve seen massive demand for our conversational AI platform since the very beginning, achieving fivefold growth in bookings, year-over-year, since 2017,” said Ravinutala.
He told KrASIA that Yellow Messenger was able to woo investors for several reasons. In less than three years, the company has established itself as a leader in conversational AI, powering 30 million chatbot conversations each month for more than 100 clients. That roster of customers includes recognizable names: Accenture, Domino’s India, Flipkart, Grab, MG Motors, Royal Enfield, Schlumberger, and Xiaomi India.
Integrated voice tech systems are not merely bells and whistles. They’re necessary now.
Giving voice to single millennials
Another recent success story in the voice-enabled tech industry is Singapore-based Goodnight, which developed a voice-based dating app of the same name. Goodnight has seen significant growth in the past few months, gaining particular traction in Thailand.
Andy Huang, co-founder of Goodnight, said that there has been a 30% spike in new registrations compared to the last quarter of 2019.
Goodnight renders users anonymous; people who are on the platform don’t even need to upload profile pictures. Once two members are matched, the voice-enabled dating app allows them to chat with each other through speech, not via text. The idea is to let people have a conversation in the truest sense of the word.
Huang shared that Goodnight has observed a 120% increase in time spent on the app since partial lockdowns began, and users are browsing profiles more actively. There is also a 70% increase in the number of profiles that users speak to across all markets, and the average duration of each call has increased from 50 minutes to 75 minutes.
Another interesting observation is that people are making calls throughout the day. “It used to be mostly at night between 7:00 p.m. and 3:00 a.m., after work or school hours, but now we are seeing a surge between noon and 2:00 p.m. since late February,” said Huang.
Goodnight launched in 2015, and its user base is chiefly made up of millennials, with 85% of its users under the age of 34. Each day, the app generates 300,000 connections via voice calls that altogether span more than 50,000 hours. With a pool of 8 million registered users, Goodnight is the largest voice dating app in Asia in terms of scale and is ranked among the top 5 apps in the dating category in Taiwan on Google Play.
One of the main reasons Goodnight is so popular is that it frequently releases new versions of the app to stay relevant. For instance, as the pandemic set in, Goodnight promptly launched a feature that reports mask stock levels at convenience stores in Taiwan in real time. The developers also added a dedicated channel where users can discuss news related to COVID-19.
When probed further about the factors behind the app’s massive popularity, Huang told KrASIA that with social distancing being the new norm, people have a stronger desire to interact with others in virtual spaces, and Goodnight’s voice-based functionality provides an experience that is more intimate than text- and image-based dating apps.
Voice-enabled dating apps “add a dimension of authenticity to each profile,” Huang said. “The element of voice is appealing as users can detect sincerity and compatibility in real-time. Users can also look beyond physical appearances to truly focus on the quality of conversations.”
There is also appeal in instant gratification—users can interact with others who are online at the same time. This saves them from the agony of being ghosted, which is a prevalent problem across dating apps.
Furthermore, Goodnight serves more than just singles looking for love; the app is used by many who are interested in cross-cultural connections. For example, Chinese-speaking users who want to learn Thai or Korean users who are keen to learn Mandarin are very active within Goodnight’s community.
The challenges: Fragmentation and localization
Ravinutala of Yellow Messenger believes that one of the biggest challenges in the voice-enabled tech industry is the absence of someone who sets the bar. He said, “Globally, there are more than 1,500 companies engaged in building chatbots and helping enterprises automate their business processes. However, there is no major leader.”
Meanwhile, Huang from Goodnight said that for there to be a consistently smooth experience in voice chats, it is crucial to have stable network infrastructure with strong connectivity. Furthermore, data consumption patterns across different markets present other challenges. For example, people in Taiwan are accustomed to unlimited data plans, while Japanese users pay a subscription fee for their fixed data usage.
Huang added, “As we continue to enhance our features and user experience, we have to keep these factors in mind and localize our product to suit the tech environment of each market.”
China is yet another emerging voice tech ecosystem in Asia. The aggregate value of China’s AI-enabled voice recognition market is projected to reach USD 1.86 billion by 2023, according to IDC MarketScape. When more and more people were being infected with COVID-19 in the country, AI-enabled voice technology was adopted to automate some aspects of the effort to contain the disease’s spread. For instance, in some parts of the country, chatbots made verbal queries during calls with people who were susceptible to medical complications.
More recently, Zhejiang-based AISpeech, a voice tech unicorn that specializes in voice interaction solutions, closed a USD 58 million Series E round led by CTC Capital. It has more than 8,000 clients, including Alibaba, Xiaomi, and SF Express.
The latest shift in the industry is away from simply selling raw speech-to-text tools and toward streamlined solutions that offer a smoother user experience. Vendors are looking to provide workflows and services across a wider range of conversational and analytics experiences. Over the next five years, we can expect to see broader voice service packages presented as integrated suites that span smart homes, conversational commerce, voice search, and gaming. There is still a lot of potential for growth in the voice tech industry across Asian markets, and it will be interesting to see how these trends develop in a post-pandemic world.