
Linley Newsletter

Turing T4 Targets AI Inference

October 16, 2018

Author: Linley Gwennap

Nvidia wants to take over the AI inference market, and its newest weapon is the Turing architecture. Whereas its predecessor, Volta, focuses on fast matrix multiplication for floating-point values, Turing adds support for more efficient integer data types that have become common in neural-network inference. As a result, the new architecture doubles Volta’s per-core throughput at roughly the same power.

Rather than unleashing a full-blown Turing design to replace the high-end V100 board, the company deployed the new architecture only in smaller configurations that mainly target PC graphics (for both gamers and professionals). The V100, which just entered production late last year, remains Nvidia’s primary data-center offering for neural-network training. For inference, the company began shipping a Tesla T4 card that uses the TU104 die and, at 70W, fits into a standard PCIe power envelope.

Most cloud-service providers run the majority of their AI inference on Intel Xeon processors. This approach treats inference like any other workload, allowing them to perform the task on any number of servers as needed to handle demand fluctuations. But the strong performance of the Tesla T4 should start to change this approach. The accelerator fits into a standard server but delivers 10x the ResNet-50 performance of a high-end Xeon Gold processor. It will continue to hold a strong lead even when Intel’s Cascade Lake becomes broadly available. Nvidia also offers a complete software stack, including the TensorRT tool, which converts trained neural networks from floating-point (FP) to integer weights and optimizes them.
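
To make the FP-to-integer conversion concrete, here is a minimal sketch of symmetric 8-bit quantization in plain Python. It is an illustration only and does not use or reproduce TensorRT. The function names (quantize_int8, dequantize, int8_dot) and the single per-tensor scale are assumptions made for this example.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale (assumed scheme)."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale works.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floating-point values from the integers."""
    return [q * scale for q in quantized]


def int8_dot(qa, scale_a, qb, scale_b):
    """Accumulate in integers, then rescale to floating point once at the end."""
    acc = sum(x * y for x, y in zip(qa, qb))
    return acc * scale_a * scale_b


if __name__ == "__main__":
    a = [0.5, -1.0, 0.25]
    b = [1.0, 2.0, -0.5]
    qa, sa = quantize_int8(a)
    qb, sb = quantize_int8(b)
    print("float dot:", sum(x * y for x, y in zip(a, b)))
    print("int8 dot: ", int8_dot(qa, sa, qb, sb))
```

The multiply-accumulate loop works only on small integers, which is the kind of arithmetic the integer data types described above are meant to speed up. The price is a small rounding error in each weight.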

Subscribers can view the full article in the Microprocessor Report.
