Mirror of https://github.com/intel/llvm.git (synced 2026-01-22 23:49:22 +08:00)
This work adds the `nvgpu.tma.async.load` Op, which requests a TMA load asynchronously using an mbarrier object. It also creates the `nvgpu.tma.descriptor` type. The type is supposed to be created by the `cuTensorMapEncodeTiled` CUDA driver API.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D155453
Multi-Level Intermediate Representation
See https://mlir.llvm.org/ for more information.