Categories
Misc

Delivering 1.5 M TPS Inference on NVIDIA GB200 NVL72, NVIDIA Accelerates OpenAI gpt-oss Models from Cloud to Edge 

Open AI and NVIDIA logos.NVIDIA and OpenAI began pushing the boundaries of AI with the launch of NVIDIA DGX back in 2016. The collaborative AI innovation continues with the OpenAI…Open AI and NVIDIA logos.

NVIDIA and OpenAI began pushing the boundaries of AI with the launch of NVIDIA DGX back in 2016. The collaborative AI innovation continues with the OpenAI gpt-oss-20b and gpt-oss-120b launch. NVIDIA has optimized both new open-weight models for accelerated inference performance on NVIDIA Blackwell architecture, delivering up to 1.5 million tokens per second (TPS) on an NVIDIA GB200 NVL72 system.

Source

Leave a Reply

Your email address will not be published. Required fields are marked *