Bridging the gap between your trained models and real-world applications.
Congratulations on training your machine learning model! The next crucial step is to deploy it so that it can be used by others or integrated into applications. This tutorial will guide you through the fundamental concepts and a practical example of deploying a simple model.
ML model deployment is the process of making your trained machine learning model available for use in a production environment. This can involve various methods, such as:
A model is only valuable if it can be used to make predictions or decisions. Deployment unlocks the potential of your ML efforts, allowing them to drive business value, improve user experiences, and automate processes.
Before we dive into an example, let's touch upon some key concepts:
We'll demonstrate how to deploy a basic model using Python's Flask framework to create a simple web API.
Assume you have a trained scikit-learn model. For this example, let's pretend we have a simple linear regression model.
First, train a model (e.g., using scikit-learn) and save it using joblib:
import joblib
from sklearn.linear_model import LinearRegression
import numpy as np
# Sample data (replace with your actual training data)
X_train = np.array([[1], [2], [3], [4], [5]])
y_train = np.array([2, 4, 5, 4, 5])
# Train a simple model
model = LinearRegression()
model.fit(X_train, y_train)
# Save the model
joblib.dump(model, 'linear_regression_model.pkl')
print("Model saved as linear_regression_model.pkl")
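To know what the API should return later, it helps to work out by hand what this model learns. The sketch below reproduces the one-feature least-squares fit in plain Python, without scikit-learn, on the same sample data. The helper name `fit_line` is made up for illustration. `LinearRegression` should arrive at the same slope and intercept, up to floating-point rounding.

```python
# Closed-form simple linear regression on the tutorial's sample data.
# fit_line is a hypothetical helper, not part of scikit-learn.

def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variance terms (unnormalised; the 1/n factors cancel)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
slope, intercept = fit_line(xs, ys)
prediction_at_6 = intercept + slope * 6
# prints: slope=0.60, intercept=2.20, f(6)=5.80
print(f"slope={slope:.2f}, intercept={intercept:.2f}, f(6)={prediction_at_6:.2f}")
```

A request for the input 6 should therefore come back with a prediction of about 5.8.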
Create a Python file (e.g., app.py) for your Flask application.
from flask import Flask, request, jsonify
import joblib
import numpy as np

app = Flask(__name__)

# Load the trained model once at startup
try:
    model = joblib.load('linear_regression_model.pkl')
    print("Model loaded successfully.")
except FileNotFoundError:
    print("Error: model file 'linear_regression_model.pkl' not found.")
    model = None  # /predict reports an error until the model file exists

@app.route('/')
def home():
    return "Welcome to the ML Model Deployment API!"

@app.route('/predict', methods=['POST'])
def predict():
    if model is None:
        return jsonify({'error': 'Model not loaded'}), 500
    # silent=True returns None for a missing or malformed JSON body instead of raising
    data = request.get_json(silent=True)
    if not isinstance(data, dict) or 'features' not in data:
        return jsonify({'error': 'Invalid input format. Expecting JSON with a "features" key.'}), 400
    try:
        # features must be a list of lists, one inner list per sample: [[6]] or [[7], [8]]
        features = np.array(data['features'], dtype=float)
        predictions = model.predict(features)
    except (ValueError, TypeError) as e:
        # Wrong shape or non-numeric values are client errors
        return jsonify({'error': str(e)}), 400
    except Exception as e:
        print(f"Prediction error: {e}")
        return jsonify({'error': str(e)}), 500
    # Return predictions as JSON
    return jsonify({'predictions': predictions.tolist()})

if __name__ == '__main__':
    app.run(debug=True)  # debug=True is for development only; set it to False for production
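The endpoint expects `features` to be a list of rows, each row a list of numbers. Below is a minimal sketch of stricter input checking that does not depend on Flask. `validate_features` and its `n_features` parameter are hypothetical names, not part of Flask or scikit-learn. One possible use is to call it inside `predict()` before `model.predict` and return a 400 response with the error message when validation fails.

```python
from numbers import Real

def validate_features(payload, n_features=1):
    """Return (rows, None) if payload is valid, else (None, error_message).

    Hypothetical helper: expects {"features": [[x1, ...], ...]} with
    n_features numbers per row (1 for the tutorial's model).
    """
    if not isinstance(payload, dict) or "features" not in payload:
        return None, 'Expecting JSON with a "features" key.'
    rows = payload["features"]
    if not isinstance(rows, list) or not rows:
        return None, '"features" must be a non-empty list of rows.'
    for i, row in enumerate(rows):
        if not isinstance(row, list) or len(row) != n_features:
            return None, f"Row {i} must be a list of {n_features} value(s)."
        for value in row:
            # bool is a subclass of int, so reject it explicitly
            if isinstance(value, bool) or not isinstance(value, Real):
                return None, f"Row {i} contains a non-numeric value."
    return rows, None

print(validate_features({"features": [[6]]}))   # valid single sample
print(validate_features({"features": [6]}))     # invalid: row is not a list
```

Returning an error message instead of raising keeps the helper easy to call from a request handler, which can then pick the HTTP status code itself.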
Make sure you have Flask, scikit-learn, joblib, and NumPy installed:
pip install Flask scikit-learn joblib numpy
Run your Flask application from the terminal:
python app.py
You should see output indicating the Flask server is running, usually on http://127.0.0.1:5000/.
You can use tools like curl or Postman to send POST requests to your API.
Using curl:
curl -X POST -H "Content-Type: application/json" -d '{"features": [[6]]}' http://127.0.0.1:5000/predict
This should return a JSON response like the following (the exact digits may vary slightly with floating-point rounding):
{
  "predictions": [5.8]
}
You can also send several samples in one request for batch prediction, with one inner list per sample:
curl -X POST -H "Content-Type: application/json" -d '{"features": [[7], [8], [9]]}' http://127.0.0.1:5000/predict
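If you would rather call the API from Python than from curl, the same request can be sketched with the standard library alone. The function names `build_predict_request` and `predict_remote` are hypothetical, and the URL assumes the Flask development server is running at its default address.

```python
import json
import urllib.request

API_URL = "http://127.0.0.1:5000/predict"  # assumed default Flask dev-server address

def build_predict_request(features, url=API_URL):
    """Build a POST request carrying {"features": ...} as a JSON body."""
    body = json.dumps({"features": features}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def predict_remote(features, url=API_URL, timeout=5):
    """Send the request and return the list stored under "predictions"."""
    request = build_predict_request(features, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["predictions"]

# Usage (requires the Flask app from app.py to be running):
#     predict_remote([[7], [8], [9]])
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test without a live server.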
This is a basic introduction. For production-ready deployments, consider:
Happy deploying!