Nvidia's T4 makes an appearance in several of Supermicro's newest servers, and we expect these speedy AI inference GPUs to experience broad uptake in the data center. The T4 uses the same Turing architecture as Nvidia's GeForce RTX 20-series gaming graphics cards, but it is designed for neural networks that handle video, speech, search, and image workloads.
The Tesla T4 GPU comes equipped with 320 Turing Tensor cores, 2,560 CUDA cores, and 16GB of GDDR6 memory that provides up to 320GB/s of bandwidth. The T4 has 40 SMs enabled on the TU104 die to optimize for the 75W power profile.
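The headline core counts follow from the number of enabled SMs. As a rough sanity check, the sketch below assumes Turing's per-SM layout of 64 CUDA cores and 8 Tensor cores, along with a 256-bit memory bus running GDDR6 at 10Gbps per pin. None of those four figures is stated above, so treat them as assumptions; the function name is purely illustrative.

```python
# Sanity-check the T4's headline specs from per-SM and memory-bus figures.
# ASSUMPTIONS (not from the article): 64 CUDA cores and 8 Tensor cores per
# Turing SM, a 256-bit memory bus, and a 10 Gbps per-pin GDDR6 data rate.
ENABLED_SMS = 40           # from the article
CUDA_CORES_PER_SM = 64     # assumed
TENSOR_CORES_PER_SM = 8    # assumed
BUS_WIDTH_BITS = 256       # assumed
DATA_RATE_GBPS = 10        # assumed, per pin


def t4_totals():
    """Return (CUDA cores, Tensor cores, memory bandwidth in GB/s)."""
    cuda_cores = ENABLED_SMS * CUDA_CORES_PER_SM
    tensor_cores = ENABLED_SMS * TENSOR_CORES_PER_SM
    # Bits per second across the bus, divided by 8 to get bytes.
    bandwidth_gb_s = BUS_WIDTH_BITS * DATA_RATE_GBPS / 8
    return cuda_cores, tensor_cores, bandwidth_gb_s


print(t4_totals())  # (2560, 320, 320.0)
```

Under those assumptions, the totals line up with the 2,560 CUDA cores, 320 Tensor cores, and 320GB/s quoted for the card.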
The GPU supports mixed-precision computation in formats including FP32, FP16, and INT8 (performance above). The Tesla T4 also features INT4 and (experimental) INT1 precision modes, a notable advancement over its predecessor, the P4.
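To give a feel for what a reduced-precision mode like INT8 means in practice, here is a minimal sketch of symmetric quantization in plain Python. It is an illustration only, not Nvidia's or any inference library's implementation; the function names and the sample weights are hypothetical.

```python
# Illustrative symmetric INT8 quantization: floats are mapped to integers
# in [-127, 127] with one shared scale factor, then mapped back.
# Function names and sample values are hypothetical, for illustration only.

def quantize_int8(values):
    """Return (quantized integers, scale) for a list of floats."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the INT8 range used here.
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from quantized integers."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Each restored value differs from the original by at most half a step.
for original, approx in zip(weights, restored):
    print(f"{original:+.4f} -> {approx:+.4f}")
```

Each value is stored in 8 bits instead of 32, at the cost of a rounding error of up to half a quantization step. Narrower formats such as INT4 shrink storage further and coarsen that step.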