AITemplate (AIT) is a Python framework that translates deep neural networks into CUDA (NVIDIA GPU) / HIP (AMD GPU) C++ code for fast inference serving; a minimal usage sketch follows the highlights below. Highlights of AITemplate include:
- High performance: Approaching roofline fp16 TensorCore (NVIDIA GPU)/MatrixCore (AMD GPU) performance on major models, including ResNet, MaskRCNN, BERT, VisionTransformer, Stable Diffusion, etc.
- Unified, open, flexible: Runs fp16 deep neural network models seamlessly on NVIDIA or AMD GPUs. Fully open source, with Lego-style, easily extensible, high-performance primitives for supporting new models.
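As a quick illustration of the workflow, the sketch below builds a tiny fp16 graph, compiles it into a GPU module, and runs it on PyTorch tensors. It is condensed from the style of the project's simple-model example; the module paths (aitemplate.compiler.compile_model, aitemplate.frontend.Tensor, aitemplate.testing.detect_target) and call signatures follow the published examples but may differ between AITemplate versions, so treat it as a sketch rather than a definitive recipe.

# Minimal sketch (assumption: module paths and call signatures follow the
# project's published examples and may differ between versions).
import torch

from aitemplate.compiler import compile_model
from aitemplate.frontend import Tensor
from aitemplate.testing import detect_target

batch, hidden = 1024, 512

# Describe the graph symbolically with AIT frontend Tensors.
X = Tensor(shape=[batch, hidden], dtype="float16", name="X", is_input=True)
Y = X + X  # elementwise add captured into the AIT graph
Y._attrs["name"] = "Y"
Y._attrs["is_output"] = True

# detect_target() picks the CUDA or ROCm backend; compile_model generates,
# builds, and loads the C++/CUDA (or HIP) source under ./tmp.
target = detect_target()
module = compile_model(Y, target, "./tmp", "double_x")

# The compiled module runs directly on half-precision PyTorch GPU tensors.
x = torch.randn(batch, hidden, dtype=torch.float16, device="cuda")
y = torch.empty(batch, hidden, dtype=torch.float16, device="cuda")
module.run_with_tensors({"X": x}, {"Y": y})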
Install
Hardware Requirements:
- NVIDIA: AIT is only tested on SM80+ GPUs; not all kernels work on older SM75/SM70 (T4/V100) GPUs.
- AMD: AIT is only tested on CDNA2 (MI-210/250) GPUs; older CDNA1 (MI-100) GPUs may have compiler issues.
Clone the code
When cloning, use the following command to also fetch the submodules:
git clone --recursive https://github.com/facebookincubator/AITemplate
Docker image
We strongly recommend using AITemplate with Docker to avoid accidentally using the wrong version of NVCC or HIPCC.
- CUDA:
./docker/build.sh cuda
- ROCm:
DOCKER_BUILDKIT=1 ./docker/build.sh rocm
This will build a Docker image tagged ait:latest.