Apple, Baidu, iFlytek: whose voice semantic technology is the strongest?

From 1945 to 2015, many beautiful stories were told about artificial intelligence. The Imitation Game is one of them. But through the cycle of springs and winters in the world of artificial intelligence, people's fantasies have been like a small ball: thrown high and destined to fall hard (the two artificial intelligence winters began in 1974 and 1987).

This time, people's fantasies about artificial intelligence have been thrown even higher. Talk of strong artificial intelligence and of machines that destroy humanity has become commonplace. But this is closer to prophecy, or to conspiracy theory. Current artificial intelligence is still built on logic and data. It has no intuition, let alone any understanding of emotion or any creativity.

A ball thrown high is bound to fall hard, and may even shatter. The people doing the actual work are more cautious, however: they tie the ball to a rope. That rope is "start from practical applications and from the user's point of view".

The current state of speech semantics gives a rough picture of how artificial intelligence is applied in the engineering community. Companies that have their own speech and semantic technologies, and whose services can be used in China, include: Mobvoi (Chumenwenwen), iFlytek (Keda Xunfei), Unisound (Yunzhisheng), AISpeech (Si Bi Chi), Baidu (Duer), Tencent (Little Whale), Google (Google Now), Microsoft (Xiaobing, and Cortana, known in China as Xiaona) and Apple (Siri).

Whose voice semantic technology is the strongest?

Microsoft Xiaobing is a chat bot. It focuses on human-machine dialogue and unlocks a new game almost every week. But Xiaobing cannot provide many substantive services, such as looking up flights, checking tickets, or hailing a car.

How can the strengths and weaknesses of these technologies be evaluated objectively? In an era of frequent information exchange, most of the artificial intelligence algorithms in use come from published academic results, and it is difficult for any one company to own technology that is ahead of its time. Take speech recognition: the recognition rate of most companies is above 90% (the question is actually more complicated, because it also involves dialects, uncommon words, and so on). When one company's technology is only 5% or 1% better than another's, users can hardly feel the difference, and the scenario in which the technology is applied becomes the important differentiator.
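The article does not say how a "recognition rate" is measured. A common convention, assumed here, is one minus the error rate obtained from the edit distance between a reference transcript and the recognizer's output. The sketch below follows that assumption; the function names and the sample sentences are made up for illustration.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or word lists)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty prefix of ref
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # r is missing from hyp (deletion)
                curr[j - 1] + 1,         # h is extra in hyp (insertion)
                prev[j - 1] + (r != h),  # substitution, or a free match
            ))
        prev = curr
    return prev[-1]


def recognition_rate(ref, hyp):
    """Assumed definition: 1 - (edit distance / reference length), floored at 0."""
    if len(ref) == 0:
        raise ValueError("reference transcript must not be empty")
    return max(0.0, 1.0 - edit_distance(ref, hyp) / len(ref))


# Hypothetical transcripts from two recognizers, scored word by word.
reference = "book two movie tickets for tonight".split()
system_a = "book two movie tickets for tonight".split()
system_b = "book to movie tickets for tonight".split()

print(recognition_rate(reference, system_a))
print(recognition_rate(reference, system_b))
```

Passing strings instead of word lists scores character by character, which is the more usual choice for Chinese. Either way, a single misheard word in a short command can change what the user gets, which is one reason a headline percentage says little about how a product feels in a given scenario.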

Basic implementation of an artificial intelligence speech semantic system

The voice semantic technologies of these companies are inseparable from the "cloud". An artificial intelligence system uses a set of logical reasoning procedures to find, within a vast amount of data, the answer it considers most likely to be correct. This means that the wider the domain an artificial intelligence system covers, the hungrier it is for data and computing power. Small offline devices, such as mobile phones and home computers, cannot meet this demand. The solution is to build a "computer cluster" with very large processing power and massive data, and to connect it to the network. This is what we call the "cloud". The cloud that hosts an AI voice semantic system is an AI voice semantic cloud, and it is the infrastructure of these AI companies.
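As a rough illustration of "finding the answer considered most correct in the data", here is a toy sketch in which the cloud side scores a handful of stored answers by keyword overlap with the recognized text. Everything in it is an assumption made for illustration: the `KNOWLEDGE` entries, the `recognize` stand-in (which simply decodes bytes as text instead of doing real speech recognition) and the scoring rule. Real systems use far larger data and statistical models.

```python
import re

# Hypothetical cloud-side data: (keywords, answer) pairs.
KNOWLEDGE = [
    ({"weather", "today"}, "Sunny, 25 degrees."),
    ({"movie", "today"}, "Three films are showing tonight."),
    ({"alarm", "set"}, "Alarm set."),
]


def recognize(audio: bytes) -> str:
    """Stand-in for speech recognition: treats the bytes as UTF-8 text."""
    return audio.decode("utf-8")


def best_answer(text, knowledge=KNOWLEDGE):
    """Return the stored answer whose keywords overlap most with the text."""
    words = set(re.findall(r"\w+", text.lower()))
    score, answer = max(
        ((len(words & keywords), answer) for keywords, answer in knowledge),
        key=lambda pair: pair[0],
    )
    return answer if score > 0 else None  # None: nothing relevant found


def cloud_handle(audio: bytes):
    """What the device would call over the network: recognize, then answer."""
    return best_answer(recognize(audio))


print(cloud_handle(b"What movie is on today?"))
print(cloud_handle(b"Tell me a secret"))
```

The device only captures audio and displays the result. The data and the scoring live on the server side, which is why broader coverage translates directly into more data and more computing power in the cloud.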

Once the smart cloud is built, it needs a suitable entry point for voice input. That entry point can be software, such as a WeChat official account or an app, or hardware, such as a speaker or a smart watch.

How should we view the well-known speech semantic systems currently on the market?

Artificial intelligence is a very broad concept, and even speech semantics on its own covers a wide area. That is why so many similar yet distinct artificial intelligence companies are each digging deep in a different direction. From the user's perspective, the differences between their products are quite large.

Good at search and information queries: Tencent's Little Whale and Mobvoi

1. Tencent has so far disclosed little about Little Whale, which is mainly embedded in the TOS system. Like Ticwear, TOS is being installed on smart watches. Little Whale's functions are similar to those of other voice assistants: you can set an alarm and look up information, but you cannot hail a car or order coffee as you can with Mobvoi.

According to reports, engineers who previously worked at Google Brain have founded a company called Scaled Inference. Its seed-round investors include Tencent's chief exploration officer David Wallerstein and Tencent itself. There are also reports that Tencent has not stopped developing other visual recognition products.

This is somewhat reminiscent of how WeChat was developed back then.

2. Mobvoi (Chumenwenwen) mainly makes 2C (consumer-facing) hardware products. Its smart watch, Ticwatch, is a fully interactive smart watch that sells well in China. Mobvoi's artificial intelligence interaction technology forms a complete system of its own, including speech recognition, speech synthesis (TTS), semantic understanding, vertical search, and intelligent push. Its biggest feature is voice access to localized everyday services: navigation, news, weather and so on are all available through the mobile app or the watch.

Recently, Mobvoi introduced "Magic Question" in a beta version. The aim is to upgrade the artificial intelligence from answering queries, such as "what movies are on today", to delivering a complete service, such as booking movie tickets for the user directly. This is something other smart systems cannot do.

"Chat robots" that are good at continuous dialogue: Turing Robot and Duer

3. Turing Robot mainly does 2B business. Its speech semantic system is very similar to Xiaobing: it is a continuous dialogue robot. Turing can be embedded in WeChat official accounts, apps, websites, or smart hardware. Users can have it tell jokes and play games, and can use it to track parcels and check the news. Recently, Turing Robot accepted investment from Aofei Animation and set out on the path of intelligent children's toys.

4. Duer is a continuous dialogue robot from Baidu, similar to Xiaobing. In addition, users can look up information, such as "Who is Turing", in the middle of a conversation, which builds on Baidu's search engine. Recently, Baidu launched "Xiaodu", a physical robot with Duer embedded.

Another type of artificial intelligence system is neither an assistant nor a chat bot, but is used to build intelligent voice interaction solutions: Xunfei Voice Cloud, Unisound and AISpeech

5. The “Xunfei Voice Cloud” of iFlytek (Keda Xunfei) includes technologies such as speech synthesis, speech recognition and search, and voice dictation. The cloud's recognition rate for dialects and uncommon words is relatively high, and the Xunfei voice input method is built on it. iFlytek's main business is 2B (serving enterprise users): it licenses its artificial intelligence services to other companies.

In recent years, iFlytek has also launched 2C (consumer-facing) hardware, such as smart speakers, the "Recording Treasure" recorder, and recording pens. As these show, iFlytek's products are mostly traditional products made intelligent, not personal assistants or chat bots. iFlytek also tends to go deeper into speech recognition rather than continuous dialogue.

6. Unisound (Yunzhisheng) mainly does 2B business, targeting the smart home and in-car markets. Like Mobvoi, it follows the idea of combining software and hardware.

Unlike voice semantic systems on wearables, mobile phones, and computers, a smart home voice system has to cope with a sound source that may be far away and surrounded by noise. For this reason, Unisound focuses on improving sound source recognition, noise suppression, echo cancellation and command recognition, rather than on understanding the user's intention (semantic understanding).

It is worth mentioning that when JD.com (Jingdong) and iFlytek announced they were "together", Alibaba and Unisound also reached a cooperation agreement. There has been no related news recently, however.

7. AISpeech (Si Bi Chi) is heading in much the same direction as Unisound. It provides voice interaction solutions for smart hardware to enterprise customers, for example switching a device on and off by voice. AISpeech has more partnerships in the automotive field, such as with Carrobot. The company recently announced that it has completed a round of financing and will go further down the road of "bringing intelligent voice into hardware".

8. Xiaobing, Cortana, Siri and Google Now

Cortana, Siri, and Google Now are chat bots and personal assistants.

Microsoft Xiaobing and Microsoft Cortana (Xiaona) are both based on Bing search and deep neural network technology, and both are 2C products. Cortana is now available on iOS, Android and Windows 10. She is more like a secretary: she can make calls, send text messages, send emails and look up Manchester United's last score. On mobile phones in China, however, Cortana has far fewer functions, and her voice responses are very slow.

As a dialogue system, Xiaobing lives inside applications such as WeChat and Weibo, as well as in Windows 10. She can also look up various kinds of information. Unlike Cortana, Xiaobing cannot invoke apps such as the phone dialer. She is more like a virtual character who can chat with the user (continuous dialogue) and play games (built on artificial intelligence technologies such as image recognition).

In general, Cortana and Xiaobing represent two directions for Microsoft. Xiaobing is strong at dialogue and highly portable, so she can be embedded in applications such as WeChat, Weibo, and Meipai. Cortana is an artificial intelligence system for Windows 10 that can invoke the applications and data in Windows 10, and is regarded as a personal assistant.

Siri, which everyone is familiar with, is a chat bot that can invoke applications. Beyond setting alarms and sending text messages, Siri can now search photos, play music, and offer suggestions. It may not be the most powerful, but it is more user-oriented.

Google Now does not work well in China: its recognition of Chinese is very poor, and the Chinese version is very limited. But as a Google product, it is born with more data and better data mining capabilities than other artificial intelligence products, which makes Google Now an assistant capable of "active push".

Google Now's "kinship" with Chrome apps, Gmail, Google Calendar, and other Google apps, as well as with the Android system, lets it implement features that many other products cannot. Examples include voice unlocking, waking it with "OK, Google" from any screen, and actively pushing the "spots" you like and the things you need. Someone once praised it like this:

“When I woke up in the morning, I was amazed to find that Google Now had told me how long it would take to get to my part-time job. But I never set that up, and it is not where I really work.”

As you can see, Google works out when and where the user does part-time work from the user's location and activity data. It is still worth stressing, however, that Google Now has few features in China, and that to get some of its instant functions, users must hand their personal data over to Google.

As products launched by the giants, Cortana, Siri and Google Now are often compared by the media, but the comparison is somewhat inappropriate. On the one hand, of these products only Siri is easy to use in China. On the other hand, an artificial intelligence system can only understand the user's intentions if it obtains data, and the more the better. So it has to be present in more software and on more platforms in order to gain higher privileges. Although Cortana has arrived on Android and iOS, she is still a Microsoft product and is more usable on Windows 10. Similarly, Google Now and Siri belong to the Android camp and the iOS camp respectively. The former is good at mining data and pushing information actively. The latter is good at chatting and at making suggestions when asked.

As this shows, these voice-oriented artificial intelligence companies choose between 2B and 2C business, between combining software with hardware and offering software only, and between going deep into the smart home or into smart wearables. There is no right or wrong among these choices. Overall, though, artificial intelligence still has no cognitive ability, and a chat between a human and an artificial intelligence involves emotional investment, which is a problem beyond the natural sciences. Users are not yet sufficiently aware of the smart home, the related products are not mature enough, and the pricing is not yet affordable for ordinary people. Seen this way, a full smart home ecosystem still looks far off.

The question, therefore, is whether artificial intelligence can take off at a single point among ordinary consumers. One example is whether an artificial intelligence assistant can be upgraded from only looking up ticket information to actually booking the tickets.
