A Fortune article by Barb Darrow.

A few interesting highlights:

  • IBM used 64 of its own Power8 servers, each of which links general-purpose IBM POWER8 microprocessors (not Intel chips) with Nvidia graphics processors over a fast NVLink interconnect that speeds data flow between the two types of chips.
  • Expanding deep learning from a single eight-processor server to 64 servers with eight processors each can boost performance some 50 to 60 times.
  • The system achieved 95% scaling efficiency across 256 processors.
  • On image recognition, the IBM system claimed a 33.8% accuracy rate working with 7.5 million images over seven hours. The previous record, set by Microsoft, was 29.8% accuracy, and that effort took 10 days.
  • The Caffe deep learning framework, created at the University of California at Berkeley, was used in the technology.
  • The popular Google TensorFlow framework can likewise run atop this new technology.
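
To put the scaling and timing figures above in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the usual definition of scaling efficiency (measured speedup divided by the number of workers), which the article does not spell out, and the function names are purely illustrative. The inputs are only the numbers quoted in the highlights; the different figures may rest on different baselines, so treat the results as approximate.

```python
def scaling_efficiency(speedup: float, workers: int) -> float:
    """Fraction of ideal (linear) speedup achieved with `workers` workers.

    Assumed definition: efficiency = speedup / workers.
    """
    return speedup / workers


def implied_speedup(efficiency: float, workers: int) -> float:
    """Speedup implied by a given scaling efficiency (inverse of the above)."""
    return efficiency * workers


# "95% scaling efficiency across 256 processors"
speedup_256 = implied_speedup(0.95, 256)

# "50 to 60 times" when going from 1 server to 64 servers
eff_low = scaling_efficiency(50, 64)
eff_high = scaling_efficiency(60, 64)

# Seven hours versus the previous record's 10 days
wall_clock_ratio = (10 * 24) / 7

# Accuracy improvement, in percentage points
accuracy_gain = round(33.8 - 29.8, 1)

print(f"Implied speedup at 256 processors: about {speedup_256:.0f}x")
print(f"Server-level efficiency: {eff_low:.0%} to {eff_high:.0%}")
print(f"Training time ratio: about {wall_clock_ratio:.0f}x shorter")
print(f"Accuracy gain: {accuracy_gain} percentage points")
```

Under these assumptions, 95% efficiency at 256 processors corresponds to a speedup of roughly 243 times over a single processor, and the seven-hour run is about 34 times shorter than the 10-day effort.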