SageMaker Sample Notebooks for LLM¶
This section provides sample instructions for using LMI containers on SageMaker.
For the list of LMI containers available on DLC, click here.
For the list of available BYOC containers, click here.
For more LMI documentation on SageMaker, click here.
For all the serving.properties options you can set on DJLServing, click here.
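As a rough illustration of what a serving.properties file looks like, the sketch below renders a handful of options as `key=value` lines. The helper `render_serving_properties` and the model id `my-org/my-model` are hypothetical names introduced only for this example, and the option values are illustrative assumptions; check the option reference for the keys and values supported by your container version.

```python
from pathlib import Path


def render_serving_properties(options: dict) -> str:
    """Render options as key=value lines, one per line (hypothetical helper)."""
    lines = [f"{key}={value}" for key, value in options.items()]
    return "\n".join(lines) + "\n"


# Illustrative values only (assumptions, not recommendations).
example_options = {
    "engine": "MPI",
    "option.model_id": "my-org/my-model",  # hypothetical model id
    "option.rolling_batch": "lmi-dist",
    "option.tensor_parallel_degree": 4,
    "option.max_rolling_batch_size": 32,
}

text = render_serving_properties(example_options)

if __name__ == "__main__":
    # Write the file next to the script and show what was written.
    Path("serving.properties").write_text(text)
    print(text)
```

The notebooks listed below typically place a file like this alongside the model artifacts before deployment; each notebook shows the exact options used for its model.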
Sample notebooks¶
LMI_Dist Rolling Batch¶
- LLAMA2-70B
- LLAMA 7B rolling batch with customized model.py
- LLAMA 7B rolling batch with stop reasoning
- LLAMA 7B rolling batch with chat template
- Flan-UL2
- MPT-30B
- Octocoder
- CodeGen 2.5
- Falcon-40B
- CodeLLAMA-34B
- LLAMA2-13B-GPTQ
- LLAMA2-70B-GPTQ
- Mixtral-8x-7b
HF Acc Rolling Batch¶
- LLAMA2-70B
- LLAMA 7B rolling batch with customized model.py
- Mistral 7B
- Falcon-40B
- CodeLLAMA-34B
- LLAMA2-13B-GPTQ
DeepSpeed (deprecated)¶
FasterTransformer (deprecated)¶
VLLM Rolling Batch¶
Neuron inference¶
- OPT RollingBatch
- LLAMA-13B RollingBatch
- LLAMA-7B Int8 RollingBatch
- Mistral-7B RollingBatch
- LLAMA-70B RollingBatch
- Speculative Decoding deployment guide for Neuron
TensorRT-LLM Rolling Batch¶
- LLAMA-13B RollingBatch
- LLAMA-7B RollingBatch with Customized Processing
- Baichuan2-13B RollingBatch
- ChatGLM3-6B RollingBatch
- Qwen 7B RollingBatch
- Mixtral-8x-7b