What surprises do Apple and Qualcomm have for far-field voice interaction?

(Original title: What do Qualcomm and Apple bring to far-field voice interaction?)


The author of this article, Li Zhiyong, is a co-founder of Acoustic Intelligence Technology and runs the public account “reliefs”.

Reporters covering Apple often use this comparison: of the 199 countries tracked by the World Bank, 183 had a gross domestic product (GDP) lower than Apple's market value in 2015. Apple's market value is almost the sum of the GDPs of Iran and Austria. From this perspective, competition among giants like Apple, Google and Amazon looks very much like a war between nations. The defining feature of such a war is that, whether you like it or not, you will eventually be drawn in, even if you seem to have nothing to do with it, just as traditional retail was drawn in by the Internet.

The Ultimate Ecosystem and the Coming Competition

So far we have seen three very different types of success in the IT industry:

One type is the hard technology company, such as Intel and Qualcomm. Although such businesses have an ecosystem side, technical barriers still carry the greater weight in the business model.

Another is the tool platform company. The culmination of this type is the operating system, and the typical company is Microsoft. For an operating system, technology and engineering are critical, but the application ecosystem is even more critical. That is why Linux, although free, has not been able to shake Microsoft's position on PCs.

The third type is the Internet ecosystem company, which is more familiar to us, such as the BAT companies we often mention. The core difference from the previous category is that Windows is not data-driven at all, whereas Internet ecosystem companies mostly depend on their data and content. For a social network, the relationship chain is the core content; for a search engine, the crawled pages are the content; for e-commerce, the goods on sale are the content.

This division of labor has sufficient technical rationale built in and is unlikely to change substantially. What may change is the role a company plays within it. Google, for example, ended up controlling both an operating system and a search engine, which affects the final competition. The role a company actually plays in the final landscape determines its control, and control determines its business model and profit margin.

In terms of control, the non-data-driven OS in the middle tier has the greatest influence, followed by the Internet ecosystem companies, with hard technology companies the weakest. This was particularly clear in the recent dispute between Apple and Tencent. We can also say that before Android, Google's business model was not secure, which is why it was so motivated to break through at the OS level.

Seen this way, the focus of this war is who will control the OS for far-field voice interaction. Here the outcome is likely to differ from the past, because these companies are all so big.

Apple goes without saying: it will certainly build its own closed ecosystem. Google, Amazon and Microsoft will each shut the others out of their own spheres of influence. In the short term, neither Google nor Amazon is likely to win outright, and neither is likely to adopt the other's system. This is a very interesting situation. In the past, the giants did not all focus on the same new trend at the same time, so evenly matched competition never formed, as in the contest between Windows+Nokia and the Android camp. This time every giant in the industry is focused on the same point, and each has nearly endless resources behind it. The battle will therefore be more intense than expected.

As a result, we may see multiple OSes coexist for a relatively long period.

Where do chip companies such as Qualcomm stand?

Every change in human-computer interaction inevitably transforms the OS, which has been borne out in at least two major product transitions. The move from the command line to the graphical user interface gave rise to operating systems such as Windows, and the move from keyboard and mouse to the touch screen gave rise to iOS and Android. We therefore have reason to believe that far-field voice interaction will also cause changes at the OS level. In this context, the role and behavior of chip companies such as Qualcomm are particularly interesting.

Qualcomm recently introduced a SoC that allows the IPQ40x8/9 to support microphone-array algorithms, presumably implemented on a DSP. Rahul Patel, senior vice president, said that Echo-style voice features may be integrated into the AP. This is a new trend. As a result, some traditional vendors such as Conexant are quite unhappy: in the past they worked alongside chips from Qualcomm and others to deliver a set of functions, but now that Qualcomm does everything itself, it is hard for them to find their place.

However, whether Qualcomm is actually doing the right thing, seen in the context of the overall change in human-computer interaction, comes down to one question: is the OS suitable for being placed in a chip?

Obviously, the OS cannot be put in a chip. An algorithm can be, but the algorithm is only one part of far-field voice interaction and is not suited to being carved out on its own.

Wake-up is a good example. Waking up usually has to be tied to the lights on the final product. First, a noise reduction algorithm is needed to raise the wake-up rate. Then the trained wake-word model monitors the surrounding sound. Once the wake word is detected, the algorithm has to report a specific angle. This angle is then passed to the system so that the system knows which light on the Echo's ring should turn on. Clearly, in this scenario algorithms, messages, and hardware control are intertwined. This is the territory of the OS, not of the chip.
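The wake-up flow described above can be sketched roughly as follows. This is only an illustrative sketch, not how Echo or any Qualcomm chip actually works: the names `WakeEvent`, `led_index_for_angle`, and `handle_frame` are hypothetical, the denoiser and wake-word detector are passed in as stand-in callables, and the ring is assumed to have 12 evenly spaced lights with light 0 at 0 degrees.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class WakeEvent:
    """Message the algorithm layer hands to the system (hypothetical)."""
    angle_deg: float  # estimated direction of the speaker, in degrees


def led_index_for_angle(angle_deg: float, num_leds: int = 12) -> int:
    """Map a direction of arrival to the nearest light on the ring.

    Assumes light 0 sits at 0 degrees and lights are evenly spaced.
    """
    step = 360.0 / num_leds
    return int(round((angle_deg % 360.0) / step)) % num_leds


def handle_frame(
    frame: Sequence[float],
    denoise: Callable[[Sequence[float]], Sequence[float]],
    detect_wake_word: Callable[[Sequence[float]], Optional[float]],
    set_led: Callable[[int], None],
    num_leds: int = 12,
) -> Optional[WakeEvent]:
    """Run one audio frame through the wake-up flow.

    denoise and detect_wake_word stand in for the algorithm layer;
    set_led stands in for hardware control owned by the system.
    detect_wake_word returns an angle in degrees, or None if no wake word.
    """
    clean = denoise(frame)            # 1. noise reduction
    angle = detect_wake_word(clean)   # 2. wake-word model plus angle estimate
    if angle is None:
        return None
    event = WakeEvent(angle_deg=angle)                        # 3. message to the system
    set_led(led_index_for_angle(event.angle_deg, num_leds))   # 4. hardware control
    return event
```

Even in this toy version, only steps 1 and 2 are pure algorithms. Steps 3 and 4 depend on the product's messaging and hardware layout, which is the part the article argues belongs to the OS.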

In general, the more such messages there are, the larger the role of the OS. A system like this obviously cannot be put on a chip. Only some of the algorithms can, and in the early stages of the industry that is not very valuable. First, the algorithms are not yet stable and still need continuous improvement. Second, integrating the algorithms into the chip reduces the flexibility of the whole system. Suppose company A wants to make a product X, with a future derivative product Y. It will of course want to build both on a unified architecture, and it obviously does not want that architecture to work only with certain Qualcomm chips. It wants to be able to choose flexibly from a wider range.

Many people on WeChat were startled by Qualcomm's announcement and felt that it would shake up the industry. What they miss is that Qualcomm has simply chosen one particular roadmap. Qualcomm has not thought very deeply about voice interaction. This is not the first time it has made a voice-related announcement (there was a similar one on JAN 6, 2016), but each time it seems to come to nothing in the end.

The role of the chip company is therefore clear: it will be a beneficiary of the far-field voice interaction contest, but it obviously cannot play a leading role. This is very different from the early days of the PC or the phone. Back then, if Intel and Qualcomm had not delivered more capability with each generation, PCs and mobile phones could not have been upgraded from generation to generation. In other words, the chip company was on the critical path. Today many chips already have sufficient computing power, and the interaction mode itself is more tightly integrated with the operating system.

What do Qualcomm and Apple bring to far-field voice interaction?

Although Apple's product may not sell well right away and Qualcomm's SoC is unlikely to make any waves, they do inject more confidence into this field.

Just as the touch screen affected all devices, far-field voice interaction will also affect all devices. This gives the market plenty of new opportunities.

Driven by the giants, existing product categories will generally be upgraded, including automobiles, mobile phones, tablets, notebooks, televisions, toys, cameras, and headsets. New product categories will also continue to emerge, such as the information converters and teleconferencing systems often displayed by XFX.

Large-scale product upgrades will also create demand for solution providers. Voice is far more tightly coupled to the system than anything was in the mobile phone era, and the complexity will be far greater than with earlier phones. Some scenarios may impose extremely strict demands on power consumption; others may be more demanding on cost. It is hard for people who are not on the front line to understand where this complexity comes from, because in theory Qualcomm has combined the algorithms with the chip and all problems should be solved. In reality this does not work, because a long distance remains between an algorithm on a chip and a product that can actually ship. The number of microphones, the array geometry, the computing architecture, and the steady stream of new requirements all introduce problems of this kind.

This layer, however, is highly uncertain. Will companies like MTK emerge to offer turnkey solutions? Will there be new OSes, or new variants of existing ones? Or will multiple OSes coexist over the long term?

Summary

In the short term, the impact of Qualcomm's and Apple's entry will not be significant and will mostly be a matter of confidence. Two consequences stand out. One is highly certain: far-field voice interaction is bound to happen. The other is highly uncertain: what will happen at the traditional OS layer?
