Google’s DeepMind AI wing has scored some major brownie points with its parent, after Google announced that it had deployed AI-based algorithms to optimize data center cooling systems. The system has reportedly cut Google’s cooling power bills by around 30% since it was deployed in 2016.
Resource usage optimization is one of the most promising areas for AI-based systems, thanks to IoT sensors being able to generate vast amounts of data that can be crunched by these programs. Data centers share much in common with other industrial facilities, but quicker equipment refresh cycles mean that there’s greater opportunity to install newer sensor-filled equipment.
In Google’s facilities, the DeepMind code was given control of the cooling equipment, and is apparently significantly better at this job than humans. As power is typically the largest expense in a data center, this is a huge improvement. However, because so little detail has been published, it’s not clear to what extent the DeepMind system is anything more than a glorified Nest thermostat.
When the system was first installed, in 2016, it offered suggested actions to human operators. This led to a 15% saving. Now, it appears that direct control has improved the outcome, but again, it’s not clear if this is just down to humans being bad at following the suggested instructions in a timely fashion.
If the results are purely down to the machine-learning system automating the cooling operations, then DeepMind has found a rather good demonstration of the business value of its technology. It might, however, want to ease off on the buzzwords in the marketing materials, as we get the distinct impression that the industry is souring on them.
DeepMind’s announcement says that the system is now running in multiple Google data centers. Every five minutes, it takes a snapshot of all the environmental data it can pull from the data centers. These snapshots are fed to deep neural networks (DNNs), which predict how changes will affect cooling.
That mechanism comes up with its suggestion, which it then passes to the automated control system. Crucially, these commands have to get past the data center’s own local control system, which checks that they aren’t going to break any local rules, such as drawing too much power or causing spikes in emissions.
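As a rough illustration of that loop, the Python sketch below takes one snapshot, scores a handful of candidate chilled-water setpoints with a stand-in predictor, and only passes the best one on if it clears a local rule check. Everything in it is hypothetical: `Snapshot`, `predict_cooling_power_kw`, `recommend`, the candidate setpoints and the power limit are invented for illustration, and the toy cost formula merely stands in for DeepMind’s neural networks, whose details have not been published.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Snapshot:
    """One five-minute reading of the facility (fields are hypothetical)."""
    outside_temp_c: float
    it_load_kw: float


def predict_cooling_power_kw(snap: Snapshot, chilled_water_c: float) -> float:
    """Toy stand-in for the DNN; the coefficients are made up, not measured."""
    # Chilling water below the outside temperature costs chiller power.
    lift = max(snap.outside_temp_c - chilled_water_c, 0.0)
    chiller_kw = 0.01 * snap.it_load_kw * lift
    # Warmer water cools less effectively, so fans and pumps work harder.
    airflow_kw = 0.005 * snap.it_load_kw * max(chilled_water_c - 8.0, 0.0)
    return chiller_kw + airflow_kw


def recommend(snap: Snapshot, candidates: List[float],
              power_limit_kw: float) -> Optional[float]:
    """Pick the cheapest predicted setpoint, then apply a local rule check."""
    best = min(candidates, key=lambda c: predict_cooling_power_kw(snap, c))
    # Assumed local rule: never exceed a site-specific power budget.
    if predict_cooling_power_kw(snap, best) > power_limit_kw:
        return None  # the command is rejected and nothing is sent
    return best
```

With these made-up coefficients, calling `recommend` on a 5 °C day with candidates of 8, 10, 12 and 14 °C picks the coldest water, while a 30 °C day picks the warmest. That is purely an artifact of the toy cost function, not a claim about how Google’s plants behave.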
DeepMind says that the feedback from data center operators on the suggestion-only system was that, while they had learnt some new best practices, such as balancing the cooling load across more appliances, implementing the suggestions required too much effort and supervision.
As such, they wanted to automate the process, and DeepMind has obliged, using a confidence threshold to determine which actions are likely to be good ones. If a suggestion breaches a local rule, the fail-over system moves to a neutral state, so that there’s less risk of boiling the solder in the CPUs if the AI makes a mistake. The local rules also ensure that the local operator, if in some exceptional panic mode, can retain full control over their environment. Those are lessons that are rather important for industrial environments.
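The layering described here, with operator override first, then a confidence filter, then a fail-over to neutral, can be sketched in a few lines of Python. This is a guess at the shape of the logic, not DeepMind’s implementation: the names `Suggestion` and `choose_setpoint`, the threshold of 0.9, the neutral setpoint of 12 °C, and the assumption that low-confidence suggestions are simply dropped are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

CONFIDENCE_THRESHOLD = 0.9   # hypothetical cut-off
NEUTRAL_SETPOINT_C = 12.0    # hypothetical "safe" default state


@dataclass
class Suggestion:
    setpoint_c: float
    confidence: float  # model's confidence in this action, 0.0 to 1.0


def choose_setpoint(suggestion: Suggestion,
                    current_c: float,
                    violates_local_rule: bool,
                    operator_override_c: Optional[float] = None) -> float:
    """Return the setpoint that should actually be applied."""
    # The human operator always wins if they have taken manual control.
    if operator_override_c is not None:
        return operator_override_c
    # Assumption: low-confidence suggestions are dropped and nothing changes.
    if suggestion.confidence < CONFIDENCE_THRESHOLD:
        return current_c
    # A confident suggestion that breaks a local rule triggers the fail-over.
    if violates_local_rule:
        return NEUTRAL_SETPOINT_C
    return suggestion.setpoint_c
```

Putting the override check first is the design point worth noting: no amount of model confidence can take control away from the person on site.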
DeepMind expects the system’s efficiency scores to improve further. It notes that its developers have constrained the system’s suggestion boundaries in order to ensure safety, and that widening those thresholds would offer a greater reward, with the tradeoff being higher risk. As the DNN gets more example data over time, it should get better at this specific task.
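To make the boundary tradeoff concrete, here is a minimal Python sketch of clamping a proposed action to an allowed range. The function name `clamp_action` and the temperature ranges are hypothetical, chosen only to show how a narrower range holds the optimizer back.

```python
def clamp_action(proposed_c: float, low_c: float, high_c: float) -> float:
    """Force a proposed setpoint into the range the developers allow."""
    return max(low_c, min(proposed_c, high_c))


# The model would like to go to 6.0 C, but how far it gets depends on the bounds.
narrow = clamp_action(6.0, low_c=9.0, high_c=13.0)  # held back at the boundary
wide = clamp_action(6.0, low_c=5.0, high_c=15.0)    # allowed through unchanged
```

The wider range lets the more aggressive action through, which is where both the extra saving and the extra risk would come from.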
Google data center operator Dan Fuenffinger said, “It was amazing to see the AI learn to take advantage of winter conditions and produce colder than normal water, which reduces the energy required for cooling within the data center. Rules don’t get better over time, but AI does.”