In the business intelligence (BI) world, more and more companies are talking about machine learning (ML) being leveraged in their software. However, try to talk to them about it and there is silence. Infrastructure vendors such as IBM, NVIDIA, Intel, Oracle (remember, that’s where Sun went), and Qualcomm are talking up their chips for ML. Again, try to ask them for a real business case study, a customer who has implemented a system, and, if you get back anything at all, it’s anonymous companies described in a paragraph or even just a sentence.
On the other hand, the success of Apple Siri and Amazon Echo, the continued growth of Microsoft Cortana, and the entrance of Google Now show that voice recognition is rapidly becoming mainstream in the consumer world.
So what can we expect in the next two years?
I expect that the ML knowledge gained in consumer voice will rapidly drive enhancements in business voice systems, from CRM to SFA to pure PBX office software. The basics of the call question-and-answer structures already exist in those solutions; what ML-based voice will bring to the discussion (yes, pun intended) is two things: a more natural voice on the phone and a better understanding of the process flow of the customer on the line. The first is obvious: a less mechanical, smoother voice gives the caller a higher comfort level when interacting with the system.
The second is also critical, as the “press 1 if … , press 2 if … , ad infinitum” structure creates stress and slows down the process of solving a problem. Understanding the voice of the customer and better analyzing responses (both different aspects of ML) will more quickly bring the caller to a solution or to the appropriate human for further assistance.
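As a toy illustration of the routing idea (not any vendor's actual system), the sketch below stands in for the trained models that replace a DTMF menu: it maps a caller's free-form words to a destination. All intent names and keywords here are invented, and a real ML system would use a trained classifier rather than keyword overlap.

```python
# Toy intent router: a stand-in for the ML models that replace
# "press 1 / press 2" menus. Intents and keywords are invented
# for illustration; real systems use trained statistical models.

INTENT_KEYWORDS = {
    "billing": {"bill", "invoice", "charge", "payment"},
    "support": {"broken", "error", "help", "problem"},
    "sales": {"buy", "price", "upgrade", "quote"},
}

def route_call(utterance: str) -> str:
    """Pick the intent whose keywords best match the caller's words."""
    words = set(utterance.lower().split())
    scores = {intent: len(words & kws) for intent, kws in INTENT_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    # Fall back to a human agent when nothing matches.
    return best if scores[best] > 0 else "human_agent"

print(route_call("I have a problem with an error on my screen"))  # support
print(route_call("why was there an extra charge on my bill"))     # billing
print(route_call("tell me a joke"))                               # human_agent
```

The fallback branch mirrors the point above: when the system can't understand the caller, the right answer is the appropriate human, not another menu.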
Note, also, that this is predicated upon another advantage of many business systems: they’re still run from the cloud. That means these types of voice response systems aren’t as dependent upon mobile devices handling the ML processing. It’s easier to provide a cloud service, rapidly expanding or replacing data center resources, than to wait on a future generation of mobile devices optimized for ML through FPGAs, GPUs or other focused hardware.
ML-based voice will therefore be faster to incorporate into the call infrastructure, because both the existing infrastructures and the ML voice technology packages are already robust. I expect to see good, strong initial inroads into the business voice market in the next year.
As I mentioned in a previous article, ML is becoming a blend of BI and AI. The reason is that many of the statistical analysis tools are similar regardless of whether they’re being run as a module in an analytics system or by a deep learning engine. ML is expanding. However, if we focus on the AI portion, the deep learning side, we see a different story.
Those learning systems require a lot of specific resources to provide accurate enough results in a useful timeframe, hence the focus on GPUs and TPUs. Infrastructure companies are pushing their latest chips and servers focused on the market, with specifications to show that they will help ML expand.
The reality is slower, as I pointed out in the first paragraph. Getting those systems into the field, even into cloud data centers, takes time and can carry a significant cost. Then corporations need to understand their data, always a challenge. Alongside that resource fight, there’s the problem of building deep learning applications. Right now, the languages have only matured to the 3rd generation. Development requires deep knowledge of very new areas, meaning the cost of hiring the appropriate people is extremely high.
Python, R, and other 3rd-generation (heavy coding is involved) languages are adding libraries, but they’re still early. At the same time, containerization is slowly growing and will add some flexibility to deployment, but that’s still a step away from a real 4th-generation language (drag-and-drop and higher-level scripting) or other tools that will help a wider audience more quickly adopt the technology into their systems.
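To make the "heavy coding" point concrete, here is a deliberately minimal sketch in plain Python: even a single learned neuron requires the developer to write out the model, the loss gradient, and the training loop by hand. The libraries those languages are adding wrap exactly this kind of math; a 4th-generation layer above them is what's still missing. The data and hyperparameters are illustrative only.

```python
# Why 3rd-generation ML development is "heavy coding": training one
# neuron to learn logical OR means hand-writing the math that a
# higher-level tool would hide. Illustrative sketch, not a real system.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Training data: the logical OR function.
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

w = [0.0, 0.0]  # weights
b = 0.0         # bias
lr = 1.0        # learning rate

for _ in range(2000):
    for (x1, x2), target in data:
        pred = sigmoid(w[0] * x1 + w[1] * x2 + b)
        # Logistic-loss gradient simplifies to (prediction - target).
        grad = pred - target
        w[0] -= lr * grad * x1
        w[1] -= lr * grad * x2
        b -= lr * grad

for (x1, x2), _ in data:
    print((x1, x2), "->", round(sigmoid(w[0] * x1 + w[1] * x2 + b)))
```

Every line of that loop is something a domain expert using a drag-and-drop tool would never see, which is the adoption barrier the paragraph above describes.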
That is the reason you’ve only seen a few global-scale enterprises roll out cloud infrastructures for voice, network risk management, and fraud detection. They can afford the time and expense necessary to build the systems, but it will take longer for the ML ecosystem to develop the tools necessary to spread adoption.
The power of deep learning is clear, but the barriers to integrating it into many applications mean that 2018 won’t see major growth. The industry is still at the point of creating initial infrastructure and applications, with people building prototypes and proofs of concept. I expect the next few years to see both success in that and advances in tools that will help the technology spread further, but that is further off than voice applications are.