The metaverse, the virtual world proposed by Meta (Facebook’s parent company), is still under construction. Its main building block is artificial intelligence (AI), the technology that will make it possible for everything to work. Meta CEO Mark Zuckerberg on Thursday gave a presentation where he showed off some of the projects that his team of AI researchers has been working on, and which he feels will be fundamental to the success of the “immersive internet.” All of them have one element in common: voice.
One of the challenges the company faces in ensuring that users can navigate this new augmented reality is the creation of a whole new generation of digital assistants. In the metaverse, which we will access with virtual reality eyewear, we will receive a lot of visual stimuli. To avoid being overwhelmed by so much information, it will be critical to improve interactions between the machine (the metaverse) and users (the avatars).
The easiest way to achieve fluid communication is to hold a conversation with the system, that is to say, with a conversational assistant that can learn from us. Although the assistant still doesn’t have a name, the neural model behind it is being referred to as Project CAIRaoke. A video showed the potential of this tool when combined with mixed reality glasses (which superimpose digital images on physical reality). In one example, a man is cooking at home and a voice tells him what to do, step by step, while the ingredients he needs light up when it is time to use them.
During his presentation, Zuckerberg also showed off BuilderBot, a bot that allows users to change their virtual surroundings with voice commands. “Put a sea there,” said Zuckerberg’s avatar, and suddenly a digital ocean showed up. “Now let’s add an island over there, and some cumulus clouds.”
But there is another project that is more ambitious than these assistants: a real-time universal speech translator. Until now, translation services have generally taken text or audio in the source language, translated it into English, and from there into the target language. This two-step method increases the likelihood of errors and lost nuance.
Meta is working on a system that will eliminate this middle step, using self-supervised machine learning to translate directly between languages without passing through English. The goal is to make meetings between avatars of different nationalities as natural as possible in the future metaverse. “The ability to communicate with anyone in any language is a superpower that people have dreamed of forever,” said Zuckerberg. In the metaverse, according to this plan, we will all have this ability thanks, once again, to AI.
The company is hoping to reach the 20% of the world’s population (around 1.5 billion people) who do not speak any of the most widely spoken languages, such as English, Spanish or Mandarin Chinese, and who are less well served by conventional translation tools.
Zuckerberg also talked about Meta’s new AI research supercomputer, which will help make all these projects possible. But he did not mention Quest, the virtual reality glasses that are the gateway to the metaverse and which, according to rumor, might already be ready to enter the production stage.
“The kinds of experiences you’re going to have in the metaverse are beyond what is possible today in the digital sphere. You’re going to feel what you see and touch, and that will require advances in a wide range of areas, hardware and software,” said Zuckerberg in his presentation. “The key to unlocking these advances is AI.”