Traditionally, computing power has been associated with the number of CPUs and the cores per processing unit. During the 90s, when Wintel started to invade the enterprise data center, application performance and database throughput were directly proportional to the number of CPUs and the available RAM. While these factors remain critical to the performance of enterprise applications, a new processor started to gain attention – the Graphics Processing Unit, or GPU.
For many of us, GPUs bring to mind the video cards designed for graphics-intensive games. These were purely optional components that rarely influenced the buying decision of an average user investing in a PC or server. Only gaming junkies playing popular PC titles like Quake and Half-Life appreciated the power of GPUs. But in the era of Machine Learning and Artificial Intelligence, GPUs have found a new role that makes them as relevant as CPUs.
But why is the GPU getting so much attention now? The answer lies in the rise of deep learning, an advanced machine learning technique heavily used in AI and Cognitive Computing. Deep learning powers many scenarios, including autonomous cars, cancer diagnosis, computer vision, speech recognition, and many other intelligent use cases.
Like most ML algorithms, deep learning relies on sophisticated mathematical and statistical computations. Artificial Neural Networks (ANN), Convolutional Neural Networks (CNN), and Recurrent Neural Networks (RNN) are some of the modern implementations of deep learning. These neural nets are loosely inspired by the structure of the human brain. Each type of neural net suits a particular class of use case, such as classification, clustering, or prediction. For example, image recognition and face recognition use CNNs, while Natural Language Processing (NLP) relies on RNNs. The ANN, the simplest type of neural network, is often used in predictions involving numerical data.
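To make the simplest case concrete, here is a minimal sketch of an ANN predicting a single number from numeric inputs. It assumes PyTorch is installed; the layer sizes and the random input data are illustrative assumptions, not prescriptions.

```python
# A minimal sketch of an ANN for numeric prediction, assuming PyTorch.
# Layer sizes and input data are hypothetical, chosen for illustration.
import torch
import torch.nn as nn

# A tiny feed-forward network: 8 numeric features in, 1 predicted value out.
model = nn.Sequential(
    nn.Linear(8, 16),  # fully connected hidden layer
    nn.ReLU(),         # non-linear activation
    nn.Linear(16, 1),  # output layer: a single predicted number
)

x = torch.randn(32, 8)   # a batch of 32 samples, 8 features each
prediction = model(x)    # forward pass through the network
print(prediction.shape)  # torch.Size([32, 1])
```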
Irrespective of the type of neural network used, all deep learning algorithms perform complex statistical computations. Everything from stock market data to a picture of a cat to a radiology report is first decoded into a set of numbers. In the case of images, the first step is to convert them to grayscale and then assign each pixel a number based on how light or dark it is. As we can imagine, a simple selfie shot from a mobile phone translates to a few million pixels, which in turn translate to a large matrix of numbers. During the training phase of deep learning, these matrices of numbers are fed into the neural network along with the correct classification. For example, by training the neural network with thousands of cat images, we get a model that can easily recognize a cat in a photo. The training process is all about correlating pixels (numbers) to find the patterns that characterize a cat. That correlation involves millions of matrix multiplications, and to keep training times reasonable, these operations need to run in parallel.
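The image-to-numbers step is easy to see in code. Below is a short sketch using Pillow and NumPy; the file name selfie.jpg is a hypothetical placeholder.

```python
# A sketch of decoding an image into a matrix of numbers, assuming
# Pillow and NumPy are installed; "selfie.jpg" is a hypothetical file.
import numpy as np
from PIL import Image

img = Image.open("selfie.jpg").convert("L")  # "L" = 8-bit grayscale
pixels = np.asarray(img)                     # 2-D matrix of 0-255 intensities

print(pixels.shape)   # e.g. (3024, 4032) -- millions of pixels
print(pixels[0, :5])  # the first few pixel values of the top row
```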
Typical CPUs are designed to tackle calculations in sequential order, which means each mathematical operation has to wait for the previous one to complete. A CPU with multiple cores may marginally speed up the calculation by distributing operations across its cores. But CPUs with large numbers of cores are prohibitively expensive, making them a poor fit for training neural networks.
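The contrast is easy to demonstrate. The sketch below multiplies two matrices twice: once with a naive loop where every multiply-add waits for the previous one, and once with NumPy's optimized BLAS routine, which can exploit vector units and multiple cores. The matrix size is an arbitrary choice for illustration.

```python
# Sequential arithmetic vs. an optimized, parallelizable routine.
import time
import numpy as np

n = 200  # arbitrary size, large enough to show the gap
a = np.random.rand(n, n)
b = np.random.rand(n, n)

# Naive triple loop: one multiply-add at a time, strictly in order.
start = time.time()
c = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        for k in range(n):
            c[i, j] += a[i, k] * b[k, j]
print("sequential loop:", time.time() - start, "seconds")

# The same math via NumPy's BLAS-backed matmul, parallelized under the hood.
start = time.time()
c_fast = a @ b
print("np matmul:      ", time.time() - start, "seconds")
```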
Enter GPUs, and we have a processor with thousands of cores capable of performing millions of mathematical operations in parallel. Graphics rendering and deep learning share a key trait: both involve an enormous number of matrix multiplication operations per second. That's one of the reasons why laptops and desktops with high-end GPUs are preferred for deep learning. Nvidia offers a programming model for its GPUs called CUDA that lets developers write parallel programs.
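For illustration, here is a minimal sketch that offloads a matrix multiplication to a GPU. It goes through PyTorch's CUDA backend rather than raw CUDA C, and assumes a CUDA-capable GPU with PyTorch installed; without one, it simply falls back to the CPU.

```python
# A minimal sketch of GPU matrix multiplication via PyTorch's CUDA
# backend. Assumes a CUDA-capable GPU; falls back to CPU otherwise.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

a = torch.randn(4096, 4096, device=device)
b = torch.randn(4096, 4096, device=device)

c = a @ b  # on a GPU, thousands of cores work on this in parallel

if device == "cuda":
    torch.cuda.synchronize()  # wait for the asynchronous GPU work to finish
print(c.shape, "computed on", device)
```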
This use of graphics hardware for non-graphics workloads is known as GPGPU – general-purpose computing on GPUs. In advanced research and development, GPGPU computing powers deep learning experiments.
To put things in perspective, Nvidia's latest GPUs come with 3,584 cores, while Intel's top-end server CPUs max out at 28 cores.
The rise of the GPU doesn't spell the death of the CPU. We still need those beefy processors to do the heavy lifting. The combination of CPU and GPU, along with sufficient RAM, makes a perfect testbed for deep learning.
The growing importance of GPUs in computing is putting pressure on Intel. In the last couple of years, the company made two strategic acquisitions: Altera and Nervana. Altera designs a type of chip called the field-programmable gate array (FPGA), which can be reprogrammed after manufacturing for niche use cases. The Altera acquisition cost Intel $16.7 billion. FPGAs are becoming popular with cloud providers like Microsoft, which builds massive data centers hosting a diverse set of customer workloads. Microsoft has turned out to be one of Intel's largest customers for FPGA chips.
In August 2016, Intel invested a whopping $400 million to buy Nervana Systems, a startup specializing in deep learning. The acquisition gave Intel an edge in competing with its archrival in the GPU segment – Nvidia. Within a year, Nervana had become the key asset and umbrella brand for Intel's investments in Artificial Intelligence.
The growth of the GPU has brought Nvidia into the limelight. While the traditional PC and server market is in decline, the GPU market is heating up. Top public cloud providers, including Amazon, Google, IBM, and Microsoft, offer GPU-based VMs in the cloud. These virtual machines are powered by the latest Nvidia Tesla GPUs, which deliver the performance required for training deep learning models. Cloud customers take advantage of pay-by-the-hour pricing to access these specialized machines on demand.
Google, one of the pioneers of AI and deep learning, has announced the Tensor Processing Unit, or TPU – a custom chip designed to perform complex math operations with massive parallelism. A version of the TPU called Cloud TPU is available to customers of Google's IaaS offering, Google Compute Engine.
Artificial Intelligence and Machine Learning are changing the landscape of enterprise IT. The recent surge of interest in GPUs is squarely attributable to the rise of AI and ML.