Amazon Bedrock supports custom model deployment, enabling businesses to combine proprietary models with AWS-managed foundation models. This tutorial guides beginners through deploying a custom Flan-T5 model end-to-end, including model preparation, safety configurations, and production monitoring.
Objective: Configure Bedrock model deployment permissions
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:CreateModelCustomizationJob",
        "bedrock:CreateProvisionedModelThroughput",
        "bedrock:InvokeModel",
        "s3:GetObject",
        "s3:PutObject",
        "iam:PassRole"
      ],
      "Resource": "*"
    }
  ]
}
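Once the policy JSON exists, it can be attached to the calling identity programmatically; a minimal boto3 sketch (the role and policy names here are placeholders, and the attachment call is left commented because it needs live credentials):

```python
import json

# Placeholder policy mirroring the permissions discussed above
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": [
            "bedrock:CreateModelCustomizationJob",
            "bedrock:CreateProvisionedModelThroughput",
            "bedrock:InvokeModel",
            "s3:PutObject"
        ],
        "Resource": "*"
    }]
}
policy_json = json.dumps(policy)

# Attach as an inline policy (requires iam:PutRolePolicy on the caller):
# import boto3
# boto3.client("iam").put_role_policy(
#     RoleName="BedrockModelRole",
#     PolicyName="BedrockDeployPolicy",
#     PolicyDocument=policy_json,
# )
```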
aws bedrock list-custom-models --region us-west-2
Requirements: Model artifacts in TorchScript or TensorFlow SavedModel format
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
model_name = "google/flan-t5-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name, torchscript=True)
model.eval()
# Save for Bedrock deployment: torch.jit.trace needs example tensors,
# and a seq2seq model needs decoder input IDs as well
example = tokenizer("Translate to French: Hello world", return_tensors="pt")
decoder_input_ids = torch.tensor([[model.config.decoder_start_token_id]])
traced = torch.jit.trace(
    model, (example.input_ids, example.attention_mask, decoder_input_ids)
)
torch.jit.save(traced, "model.pt")
model_card = {
    "modelName": "flan-t5-custom",
    "description": "Custom fine-tuned Flan-T5 for customer support",
    "inferenceSpec": {
        "containerImage": "123456789012.dkr.ecr.us-west-2.amazonaws.com/bedrock-custom",
        "supportedContentTypes": ["application/json"]
    }
}
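The model card must exist as a JSON file on disk before it can be copied to S3 in the upload step; one way to produce it:

```python
import json

model_card = {
    "modelName": "flan-t5-custom",
    "description": "Custom fine-tuned Flan-T5 for customer support",
    "inferenceSpec": {
        "containerImage": "123456789012.dkr.ecr.us-west-2.amazonaws.com/bedrock-custom",
        "supportedContentTypes": ["application/json"]
    }
}

# Serialize to the file referenced by the S3 upload command
with open("model_card.json", "w") as f:
    json.dump(model_card, f, indent=2)
```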
Best Practice: Use versioned buckets for model artifacts
aws s3 cp model.pt s3://your-bucket/models/v1.0.0/model.pt
aws s3 cp model_card.json s3://your-bucket/models/v1.0.0/
aws s3 ls s3://your-bucket/models/v1.0.0/ --recursive
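To keep the version scheme consistent across uploads, a tiny helper can build the S3 keys; the layout mirrors the commands above, and the function name is purely illustrative:

```python
def artifact_key(version: str, filename: str) -> str:
    """Build a versioned S3 key like models/v1.0.0/model.pt."""
    return f"models/v{version}/{filename}"

print(artifact_key("1.0.0", "model.pt"))  # models/v1.0.0/model.pt
```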
Note: Bedrock creates custom models through customization jobs, and the training data must be a JSONL dataset in S3 (a model card is not training data):
aws bedrock create-model-customization-job \
  --customization-type FINE_TUNING \
  --custom-model-name "flan-t5-custom" \
  --job-name "flan-deployment-job" \
  --role-arn "arn:aws:iam::123456789012:role/BedrockModelRole" \
  --base-model-identifier "arn:aws:bedrock:us-west-2::foundation-model/amazon.titan-text-express-v1" \
  --training-data-config '{"s3Uri": "s3://your-bucket/training/data.jsonl"}' \
  --output-data-config '{"s3Uri": "s3://your-bucket/outputs/"}' \
  --region us-west-2
aws bedrock get-model-customization-job \
  --job-identifier "flan-deployment-job" \
  --region us-west-2
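Customization jobs can take a while, so a small polling helper saves re-running the status command by hand. `fetch_status` is any callable returning the current status string; a possible boto3 wiring is sketched in the comment (treat the response field name as an assumption):

```python
import time

def wait_for_status(fetch_status, done="Completed", failed="Failed",
                    poll_seconds=30, max_attempts=60):
    """Poll fetch_status() until it reports completion or failure."""
    for _ in range(max_attempts):
        status = fetch_status()
        if status == done:
            return status
        if status == failed:
            raise RuntimeError("deployment job failed")
        time.sleep(poll_seconds)
    raise TimeoutError("job did not finish in time")

# Possible wiring against the status command above:
# import boto3
# bedrock = boto3.client("bedrock", region_name="us-west-2")
# wait_for_status(lambda: bedrock.get_model_customization_job(
#     jobIdentifier="flan-deployment-job")["status"])
```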
Bedrock serves custom models through Provisioned Throughput rather than user-managed endpoints with instance types:
aws bedrock create-provisioned-model-throughput \
  --provisioned-model-name "flan-t5-endpoint" \
  --model-id "arn:aws:bedrock:us-west-2:123456789012:custom-model/flan-t5-custom" \
  --model-units 1 \
  --region us-west-2
import json
import boto3

bedrock = boto3.client('bedrock-runtime', region_name='us-west-2')
response = bedrock.invoke_model(
    # Use the provisioned model ARN returned by create-provisioned-model-throughput
    modelId="arn:aws:bedrock:us-west-2:123456789012:provisioned-model/flan-t5-endpoint",
    body=json.dumps({
        "text_inputs": "Translate to French: Hello world",
        "max_length": 50
    })
)
print(json.loads(response['body'].read())['generated_text'])
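The response body arrives as a streaming blob, so the parsing step is easy to isolate in a helper and exercise with a canned payload; the 'generated_text' key is whatever your inference container actually returns:

```python
import io
import json

def parse_generation(body_stream):
    """Read and decode an invoke_model response body."""
    payload = json.loads(body_stream.read())
    return payload["generated_text"]

# Exercise the helper with a canned response instead of a live call
fake_body = io.BytesIO(json.dumps({"generated_text": "Bonjour le monde"}).encode())
print(parse_generation(fake_body))  # Bonjour le monde
```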
aws cloudwatch put-metric-alarm \
--alarm-name "Bedrock-High-Latency" \
--metric-name "InvocationLatency" \
--namespace "AWS/Bedrock" \
--statistic "Average" \
--period 300 \
--threshold 1000 \
--comparison-operator "GreaterThanThreshold" \
--evaluation-periods 1 \
--alarm-actions "arn:aws:sns:us-west-2:123456789012:BedrockAlerts"
aws cloudwatch get-metric-statistics \
--namespace "AWS/Bedrock" \
--metric-name "Invocations" \
--start-time $(date -u +"%Y-%m-%dT%H:%M:%SZ" -d "-7 days") \
--end-time $(date -u +"%Y-%m-%dT%H:%M:%SZ") \
--period 3600 \
--statistics "Sum"
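The get-metric-statistics output is a list of hourly datapoints; summing the `Sum` statistics gives the weekly invocation volume. A sketch using a canned response shaped like CloudWatch's JSON:

```python
def total_invocations(datapoints):
    """Sum the 'Sum' statistic across CloudWatch datapoints."""
    return sum(dp["Sum"] for dp in datapoints)

# Sample datapoints in the shape returned under the "Datapoints" key
sample = [{"Sum": 120.0}, {"Sum": 95.0}, {"Sum": 210.0}]
print(total_invocations(sample))  # 425.0
```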
To roll out a new model version, point the existing Provisioned Throughput at the new custom model (Bedrock swaps the model in place; it does not expose a blue/green traffic-routing configuration):
aws bedrock update-provisioned-model-throughput \
  --provisioned-model-id "flan-t5-endpoint" \
  --desired-model-id "arn:aws:bedrock:us-west-2:123456789012:custom-model/flan-t5-custom-v2" \
  --region us-west-2
After mastering basic deployment, explore Amazon Bedrock Guardrails for safety configurations and CloudWatch dashboards for deeper production monitoring.
Reference: AWS Bedrock Custom Models Guide