Create a serving ready model¶
To deploy a machine learning model for inference usually involve more than the model artifacts. In most cases, developer has to handle pre-process, post-process and batching for inference. DJL introduces Translator interface to handle most of the boilerplate code and allows developer focus on their model logic.
There are many state-of-the-art models published publicly. Due to the complicity nature of the data processing, developers still need to dig into examples, original training scripts or even contact the original author to figure out how to implement the data processing. DJL provides two ways to address this gap.
ModelZoo¶
Models in the DJL ModelZoo are ready to use. End user don't need to worry about data processing. DJL's ModelZoo allows you easily organize different type of models and their versions. However, creating a custom model zoo isn't straightforward. We are still working on the tooling to make it easy for model authors to create their own model zoo.
Bundle your data processing scripts together with model artifacts¶
DJL allows model author to create a ServingTranslator
class together with the model artifacts. DJL will load the bundled ServingTranslator
and use
this class to conduct the data processing.
Step 1: Create a ServingTranslator class¶
Create a java class that implements ServingTranslator interface. See: MyTranslator as an example.
Step 2: Create a libs
folder in your model directory¶
DJL will look into libs
folder to search for Translator implementation.
Step 3: Copy Translator into libs
folder¶
DJL can load Translator from the following source:
- from jar files directly locate in
libs
folder - from compiled java .class file in
libs/classes
folder - DJL can compile .java files in
libs/classes
folder at runtime and load compiled class
Configure data processing based on standard Translator¶
DJL provides several built-in Translator for well-know ML applications, such as Image Classification
and Object Detection
. You can customize those built-in Translators' behavior by providing
configuration parameters.
There are two ways to supply configurations to the Translator
:
-
Add a
serving.properties
file in the model's folderHere is an example:
# serving.properties can be used to define model's metadata, all the arguments will be
# passed to TranslatorFactory to create proper Translator
# defines model's application
application=nlp/question_answer
# defines the model's engine, can be overrid by Criteria.optEngine()
engine=PyTorch
# defines TranslatorFactory, can be overrid by Criteria.optTranslator() or Criteria.optTranslatorFactory()
translatorFactory=ai.djl.modality.cv.translator.ImageClassificationTranslatorFactory
# Add Translator specific arguments here to customize pre-processing and post-processing
# specify image size to be cropped
width=224
height=224
# specify the input image should be treated as grayscale image
flag=GRAYSCALE
# specify if apply softmax for post-processing
softmax=true
-
Pass arguments in Criteria:
You can customize Translator's behavior with Criteria, for example:
Criteria<Image, Classifications> criteria = Criteria.builder()
.setTypes(Image.class, Classifications.class) // defines input and output data type
.optApplication(Application.CV.IMAGE_CLASSIFICATION) // spcific model's application
.optModelUrls("file:///var/models/my_resnet50") // search models in specified path
.optArgument("width", 224)
.optArgument("height", 224)
.optArgument("height", 224)
.optArgument("flag", "GRAYSCALE")
.optArgument("softmax", true)
.build();