Intel and Microsoft release fruits of AI collaboration in Project Brainwave

Intel has been hyperactive in AI, both at the hardware level with its dedicated Field Programmable Gate Arrays (FPGAs) and on the software front, where it is applying associative learning to support decisions in quality control and maintenance. The biggest splash came from the announcement of its close collaboration with Microsoft on applying those FPGAs in the latter’s Project Brainwave, which has just been released in preview.

Intel and Microsoft were once almost joined at the hip in the early days of personal computing in the 1980s, and recently their collaboration has intensified again around AI, with Project Brainwave woven around FPGAs to accelerate AI applications. At this stage, many ‘real-time’ AI applications will require dedicated hardware to meet performance expectations given the huge scale of the data sets involved coupled with the complexity of computation in deep learning.

Project Brainwave itself is based on deep learning, where heavy training is required to extract value from a data set through recognition of patterns and correlations between specific combinations of values across multiple dimensions – sometimes through time.

Brainwave has been designed to map the deep learning process onto a specific hardware architecture comprising three layers. At the top is a distributed layer allowing use of different hardware to combine performance and efficiency. In the middle is the neural network engine, where deep learning algorithms are embedded into the FPGAs. The third layer is the compiler and runtime system for deploying models once they have been trained.

A key aspect is that Microsoft has attached the FPGAs to its data center network so that they provide, in effect, hardware microservices to the whole AI process. In this way requests bypass the CPUs of general purpose servers and go straight to the FPGA layer, which Microsoft claims reduces latency well below that of, say, Google’s TPUs (Tensor Processing Units) – although we only have its word for it.

However, Microsoft did announce that the FPGA-based architecture can run ResNet-50, an industry-standard DNN (Deep Neural Network) benchmark requiring almost 8 billion calculations, without batching. ResNet-50 is a 50-layer so-called Residual Network, developed by Microsoft itself to make very deep networks trainable, avoiding pitfalls such as the loss of accuracy that tends to set in as plain networks get deeper. In residual learning, each block of layers has a shortcut connection that adds the block’s input to its output, so the block only has to learn a residual correction to what earlier layers have already computed. An added block can therefore fall back to simply repeating its input, and extra depth is worth having only when something extra is learned in the process.
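The idea of a residual block can be sketched in a few lines of plain Python. This is a toy illustration only, not ResNet-50 or anything Microsoft runs on Brainwave: the function names, the 3-element input and the weight values are all assumptions made for the example, and the ReLU that real residual networks apply after the addition is left out for brevity.

```python
def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in v]


def matvec(w, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in w]


def residual_block(x, w1, w2):
    """Return x + F(x), where F(x) = W2 . relu(W1 . x).

    The shortcut connection is the 'x +' term: the layers inside
    the block only have to learn the correction F(x).
    """
    fx = matvec(w2, relu(matvec(w1, x)))
    return [xi + fi for xi, fi in zip(x, fx)]


def scaled_identity(n, s):
    """Hypothetical helper: n x n matrix with s on the diagonal."""
    return [[s if i == j else 0.0 for j in range(n)] for i in range(n)]


x = [1.0, -2.0, 3.0]

# With all-zero weights F(x) is zero, so the block passes x through untouched.
passthrough = residual_block(x, scaled_identity(3, 0.0), scaled_identity(3, 0.0))

# With small weights the block nudges x by a small learned correction.
nudged = residual_block(x, scaled_identity(3, 0.1), scaled_identity(3, 0.1))

print(passthrough)
print(nudged)
```

Because a block with zero weights leaves its input unchanged, stacking more such blocks cannot by construction make the network's output worse, which is the intuition behind training networks as deep as 50 layers.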

But note that the TPU runs TensorFlow, Google Brain’s second generation neural network framework. The name TensorFlow is derived from the operations such neural networks perform on multidimensional data arrays, referred to as tensors. The mathematics of tensors, which describes operations between such arrays, also played a crucial role in Einstein’s development of general relativity, but that is another story. The advantage claimed was that TPUs outperformed general purpose GPUs, which themselves are more efficient for learning tasks than most CPUs.

That performance claim is interesting in so far as Microsoft and Intel are running with FPGAs while Google’s TPU is an ASIC (Application Specific Integrated Circuit), which at first sight might seem to have the edge. ASICs predated FPGAs and are still widely used for dedicated tasks that benefit from silicon designed specifically for the job, minimizing footprint and consequently boosting performance as on-chip latency is reduced.

The disadvantage of ASICs is their lack of adaptability, which renders them unsuitable for tasks where requirements or standards evolve over time. FPGAs emerged as a kind of compromise by allowing reprogramming, but are still usually dedicated to particular fields and cannot run general purpose software. Especially in the early days they were larger and less efficient than ASICs, particularly for high volume embedded applications, where the one-time costs of an ASIC can be more readily absorbed while the savings from smaller chips and fewer on-board components mount up. FPGAs tend to need additional components, for I/O for example. It is true this distinction has been muddied slightly by the development of reconfigurable ASICs that provide some limited programmability, but it still exists.

Microsoft and Intel decided that FPGAs are the way to go because they can now deliver latency as low as ASICs for AI tasks at very competitive costs. A key motivation is the desire to develop a general-purpose cloud-based AI platform that combines flexibility and performance at a reasonable price. After all, one of the often-touted advantages of the cloud is its ability to use affordable off-the-shelf hardware, so relying on dedicated ASICs would fly in the face of that.

It was not surprising then to hear Mark Russinovich, CTO for Microsoft’s Azure cloud computing platform, announce at its Build developers conference in Seattle that the preview of Project Brainwave marks the start of efforts to bring the power of FPGAs to customers for a variety of purposes. “I think this is a first step in making the FPGAs more of a general-purpose platform for customers,” he said.

The main benefit for data scientists and developers is the ability to apply DNNs to a variety of real-time workloads and application sectors such as manufacturing, retail and healthcare, across a large accelerated cloud. They can train a model and deploy it on Project Brainwave, exploiting the Intel FPGAs, either in the cloud or at the edge.

Intel’s other recent announcement, concerning Saffron, was not directly related to Brainwave. It involves associative learning, which is sometimes presented as complementary or even an alternative to deep learning. In its classical form, associative learning evolved in animals to link different stimuli, the most celebrated example being the Pavlovian response, where a dog learns to salivate on hearing a sound associated with the arrival of a meal. One stimulus replaces another through association.

The second form of associative learning more relevant for AI is ‘operant conditioning,’ sometimes known as ‘instrumental conditioning,’ where an association is strengthened or modified by reinforcement or punishment. In AI, punishment would equate to some form of negative feedback, but the main point is that this allows association to become relative and more sophisticated than classical conditioning through use of weights and categorization. Intel has applied this in its new Saffron AI Quality and Maintenance Decision Support Suite, which has two components.

One is the Similarity Advisor, which seeks the closest match to an issue under review, taking account of both resolved and open cases, and identifies solutions from previous cases. The second element, the Classification Advisor, automatically assigns work issues to pre-set categories, which can be either specified or self-defined, speeding up reporting and making it more accurate while improving operations planning.
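To give a feel for what "closest match" means in practice, here is a minimal sketch using ordinary bag-of-words cosine similarity. This is a generic text-matching technique swapped in for illustration, not Saffron’s associative-memory method, and the case records, field names and the closest_case function are all invented for the example.

```python
import math
from collections import Counter


def vectorize(text):
    """Turn text into a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def closest_case(issue, cases):
    """Return the stored case whose text is most similar to the new issue."""
    query = vectorize(issue)
    return max(cases, key=lambda c: cosine(query, vectorize(c["text"])))


# Hypothetical case history mixing resolved and open cases.
cases = [
    {"id": "C1", "status": "resolved",
     "text": "hydraulic pump pressure drop after restart", "fix": "replace seal"},
    {"id": "C2", "status": "open",
     "text": "login page timeout under load", "fix": None},
    {"id": "C3", "status": "resolved",
     "text": "conveyor belt motor overheating", "fix": "clean air filter"},
]

best = closest_case("pressure drop in hydraulic pump", cases)
print(best["id"], best["status"], best["fix"])
```

A word-overlap measure like this only finds matches that share vocabulary, which is exactly the limitation an associative approach is meant to get around.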

The suite has one customer so far, the worldwide services group Accenture, which uses it in its Touchless Testing Platform, launched in April 2017 for automated software testing based on analytics. Accenture claims that the incorporation of Saffron AI has reduced testing time by between 30% and 50% by cutting down on redundancy in the testing processes.

Intel Saffron AI applies associative memory learning and reasoning to probe structured and unstructured text data, looking for patterns, trends and similarities. These include insights about previous issues, how they were resolved, who resolved them and what information was needed. Such information is not always explicit in the data and therefore does not show up in normal word searches. The advantage is the ability to learn from incomplete data without statistical models that need to be trained, and to infer relationships between words.

Intel got into associative learning relatively recently through acquisition, notably of Saffron Technology, which it bought in October 2015. Saffron had developed incremental learning techniques to build up connections between entities in data, along with the context of those connections and their raw frequency counts. These are stored in an associative memory, with entities defined broadly as people, places and things. The technology then supposedly mimics human memory by recalling associations between those people, places and things, and specifically the context and frequency of association. Since each entity has its own memory of all the other entities it is associated with, the engine can apply human-like techniques to learn.
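The description above, with a memory per entity recording which other entities it has been seen with, in what context and how often, can be sketched as a simple data structure. This is only an illustration built from that description. It is not Saffron’s implementation, and the class name, method names and sample observations are all assumptions.

```python
from collections import defaultdict


class AssociativeMemory:
    """Toy per-entity memory of (other entity, context) -> frequency count."""

    def __init__(self):
        # memory[entity][(other_entity, context)] = number of co-occurrences
        self.memory = defaultdict(lambda: defaultdict(int))

    def observe(self, entities, context):
        """Incrementally record that these entities appeared together."""
        for a in entities:
            for b in entities:
                if a != b:
                    self.memory[a][(b, context)] += 1

    def recall(self, entity):
        """Return associations for an entity, most frequent first."""
        assoc = self.memory.get(entity, {})
        return sorted(assoc.items(), key=lambda kv: (-kv[1], kv[0]))


mem = AssociativeMemory()
# Hypothetical observations of people, places and things.
mem.observe(["Alice", "pump-7", "Plant A"], context="repair")
mem.observe(["Alice", "pump-7"], context="repair")
mem.observe(["Bob", "pump-7"], context="inspection")

top = mem.recall("pump-7")
for (other, context), count in top:
    print(other, context, count)
```

Each new observation simply increments counts, so nothing has to be retrained when more data arrives, which is the sense in which such learning is incremental.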

One attraction of Saffron for Intel was that the technology appeared scalable and efficient in handling large datasets, as it is doing for Accenture. Because of its abstract nature it can also work on compressed datasets, reducing the storage and CPU hardware required, which makes it more suitable for a distributed environment.