Deploy the VGG19 model to AWS SageMaker and make inferences
Introduction
AWS SageMaker is an excellent ML platform for conducting MLOps, simplifying model deployment, seamlessly integrating with other AWS services, and enabling rapid iteration and experimentation. In this blog post, we’ll delve into the process of deploying a VGG19 model to AWS SageMaker, covering the steps involved in training the model, creating a SageMaker endpoint, and making real-time inferences.
Get the model
The model has been trained and saved at /home/sagemaker-user/model/vgg19/model_file.
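The training and export themselves are out of scope for this post. As a rough illustration only, a VGG19 backbone could be exported to that path as in the sketch below; the actual model used here is a style/content feature extractor built on VGG19, so treat this purely as a placeholder.

import tensorflow as tf

# Load the ImageNet-pretrained VGG19 backbone from Keras applications
vgg = tf.keras.applications.VGG19(include_top=False, weights='imagenet')
vgg.trainable = False

# Export in TensorFlow SavedModel format to the path used throughout this post
tf.saved_model.save(vgg, '/home/sagemaker-user/model/vgg19/model_file')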
The serving script inference.py defines how requests are deserialized for the model and how predictions are serialized back to the client:

# inference.py
import base64
import io
import json

import numpy as np
import requests
import tensorflow as tf
from PIL import Image

def input_fn(request_body, request_content_type='application/json'):
    if request_content_type == 'application/json':
        # Parse the image URL or Base64 data from the request body
        data = json.loads(request_body)
        image_data = data['image']  # Assume the JSON contains an 'image' key
        if 'http' in image_data:
            # If it's a URL, download the image
            image = Image.open(requests.get(image_data, stream=True).raw)
        else:
            # If it's Base64 encoded, decode it
            image = Image.open(io.BytesIO(base64.b64decode(image_data)))
        # Convert to a format suitable for model input
        image = image.resize((224, 224))  # Assume VGG19 input size is 224x224
        image = np.array(image) / 255.0   # Normalize to [0, 1]
        image = np.expand_dims(image, axis=0)  # Add batch dimension
        return tf.convert_to_tensor(image, dtype=tf.float32)
    else:
        raise ValueError("Unsupported content type: {}".format(request_content_type))
def output_fn(prediction, response_content_type='application/json'):
    if response_content_type == 'application/json':
        style_output = {k: v.numpy().tolist() for k, v in prediction['style'].items()}
        content_output = {k: v.numpy().tolist() for k, v in prediction['content'].items()}
        result = {
            'style': style_output,
            'content': content_output
        }
        return json.dumps(result)
    else:
        raise ValueError("Unsupported content type: {}".format(response_content_type))
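The script also needs model_fn and predict_fn. In their simplest form they could look like the sketch below, assuming the artifact is a TensorFlow SavedModel whose call returns the 'style'/'content' dictionary that output_fn expects:

def model_fn(model_dir):
    # Load the SavedModel exported earlier
    return tf.saved_model.load(model_dir)

def predict_fn(input_data, model):
    # The extractor is assumed to return {'style': {...}, 'content': {...}},
    # which output_fn then serializes to JSON
    return model(input_data)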
Package the saved model files and inference.py into a tarball:

tar -czvf model_package.tar.gz /home/sagemaker-user/model/vgg19/model_file inference.py
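The package then has to be uploaded to S3 so SageMaker can pull it. A minimal boto3 sketch, where the bucket name and key are placeholders; the resulting S3 URI is what the Model creation step will use as ModelDataUrl:

import boto3

s3 = boto3.client('s3')

# Upload the tarball; '<my_bucket>' and the key are placeholders
s3.upload_file('model_package.tar.gz', '<my_bucket>', 'vgg19/model_package.tar.gz')
model_data_url = 's3://<my_bucket>/vgg19/model_package.tar.gz'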
Write the extra pip dependencies that inference.py needs to a requirements.txt file, so the custom image can install them:

# 'packages' is a string listing the required pip packages, one per line
with open('/home/sagemaker-user/model/vgg19/requirements.txt', 'w') as f:
    f.write(packages.strip())
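For reference, packages above could simply be a newline-separated string of the extra libraries that inference.py imports; the exact list is an assumption and depends on what the base image already ships:

# Assumed list: libraries used by inference.py that may not be in the base image
packages = """
Pillow
requests
numpy
"""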
Make the Dockerfile
The Dockerfile extends the SageMaker TensorFlow inference image and installs the additional dependencies:

# Use the SageMaker TensorFlow image as the base image
# 763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-inference:2.0.0-gpu-py310
# (region: us-east-1, for reference)
FROM <docker_registry_url>.dkr.ecr.<my_aws_region>.amazonaws.com/tensorflow-inference:2.0.0-gpu-py310

# Install the additional dependencies
# (requirements.txt must be in the Docker build context, e.g. copied from
# /home/sagemaker-user/model/vgg19/requirements.txt)
COPY requirements.txt ./requirements.txt
RUN pip install -r ./requirements.txt
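Before the endpoint can be tested, the image built from this Dockerfile has to be pushed to ECR, and a SageMaker Model, Endpoint Configuration, and Endpoint have to be created (the last steps in the checklist at the end of this post). A minimal boto3 sketch of those three API calls; every name, the image URI, the S3 path, the role ARN, and the instance type are placeholders:

import boto3

sagemaker_client = boto3.client('sagemaker')

# 1. Create the Model that ties the custom image to the model artifact in S3
sagemaker_client.create_model(
    ModelName='vgg19-model',
    PrimaryContainer={
        'Image': '<account_id>.dkr.ecr.<my_aws_region>.amazonaws.com/vgg19-inference:latest',
        'ModelDataUrl': 's3://<my_bucket>/vgg19/model_package.tar.gz',
    },
    ExecutionRoleArn='arn:aws:iam::<account_id>:role/<sagemaker_execution_role>',
)

# 2. Create the Endpoint Configuration
sagemaker_client.create_endpoint_config(
    EndpointConfigName='vgg19-endpoint-config',
    ProductionVariants=[{
        'VariantName': 'AllTraffic',
        'ModelName': 'vgg19-model',
        'InstanceType': 'ml.g4dn.xlarge',
        'InitialInstanceCount': 1,
    }],
)

# 3. Create the Endpoint (provisioning takes several minutes)
sagemaker_client.create_endpoint(
    EndpointName='endpoint-name',
    EndpointConfigName='vgg19-endpoint-config',
)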
We can start invoking the endpoint to test it once its status is InService. The snippet below polls the endpoint status until it is ready (or has failed):
import time
import boto3

# Client for the SageMaker control-plane API
sagemaker_client = boto3.client('sagemaker')

# Poll every 30 seconds until the endpoint is InService or has Failed
while True:
    response = sagemaker_client.describe_endpoint(EndpointName='endpoint-name')
    status = response['EndpointStatus']
    print(f'Endpoint status: {status}')
    if status in ['InService', 'Failed']:
        break
    time.sleep(30)
Invoke the endpoint
import boto3
from PIL import Image
import io

# file_name points to the local image to send to the endpoint
with open(file_name, "rb") as f:
    payload = f.read()
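A hedged sketch of the actual invocation, assuming the endpoint name used earlier and the JSON/Base64 request format that input_fn expects ('endpoint-name' is a placeholder):

import base64
import json

runtime_client = boto3.client('sagemaker-runtime')

# Send the image as Base64 inside the JSON body that input_fn parses
response = runtime_client.invoke_endpoint(
    EndpointName='endpoint-name',
    ContentType='application/json',
    Body=json.dumps({'image': base64.b64encode(payload).decode('utf-8')}),
)

# output_fn returns JSON with 'style' and 'content' keys
result = json.loads(response['Body'].read().decode('utf-8'))
print(result.keys())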
Regarding the inference.py script, the ways model_fn and predict_fn were defined above are quite simple and lack flexibility.
Here is a better template for model_fn that sets up the pretrained model and loads configs, model weights, joblib storage banks, and transforms, improving performance, efficiency, and usability in machine learning workflows.
def model_fn(model_dir):
    """
    This function is the first to get executed upon a prediction request.
    It loads the model from disk and returns the model object, which will
    be used later for inference.
    """
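A hedged sketch of what the body of such a template could look like; the artifact file names (config.json, banks.joblib, transforms.joblib) are assumptions, not the post's actual layout:

import os
import json
import joblib
import tensorflow as tf

def model_fn(model_dir):
    """
    Load the model plus auxiliary artifacts once, at container start-up,
    and hand everything predict_fn needs back as a single object.
    """
    # Optional configuration file (assumed name)
    config = {}
    config_path = os.path.join(model_dir, 'config.json')
    if os.path.exists(config_path):
        with open(config_path) as f:
            config = json.load(f)

    # Pretrained model / weights saved in SavedModel format
    model = tf.saved_model.load(model_dir)

    # Optional joblib storage banks and preprocessing transforms (assumed names)
    banks_path = os.path.join(model_dir, 'banks.joblib')
    banks = joblib.load(banks_path) if os.path.exists(banks_path) else None
    transforms_path = os.path.join(model_dir, 'transforms.joblib')
    transforms = joblib.load(transforms_path) if os.path.exists(transforms_path) else None

    # Return everything predict_fn needs as a single object
    return {'model': model, 'config': config, 'banks': banks, 'transforms': transforms}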
In summary, the full deployment workflow is:
• Write the SageMaker model serving script (inference.py)
• Upload the model to S3
• Upload a custom Docker image to AWS ECR
• Create a Model in SageMaker
• Create an Endpoint Configuration
• Create an Endpoint
• Invoke the Endpoint