Categories
Misc

CUTLASS 3.x: Orthogonal, Reusable, and Composable Abstractions for GEMM Kernel Design

GEMM optimization on GPUs is a modular problem. Performant implementations need to specify hyperparameters such as tile shapes, math and copy instructions, and…

Source

Leave a Reply

Your email address will not be published. Required fields are marked *