Model deployment is the process of making a trained machine learning model available for use in real-world applications. With Scikit-learn, models can be serialized and integrated into APIs, web services, or other environments for inference.
Key Characteristics
- Enables real-time predictions and integration into production systems
- Supports deployment via Flask, FastAPI, or cloud services
- Models are saved using joblib or pickle
- Deployment involves serialization, API serving, and request handling
Basic Rules
- Always serialize models after validation
- Ensure consistent preprocessing pipelines are saved
- Use lightweight frameworks like Flask for local or small-scale deployment
- For scalability, consider using Docker or cloud-based services (AWS Lambda, Azure Functions)
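The first two rules can be sketched as follows: validate the full pipeline (scaler plus estimator) before serializing it. The dataset, file name, and 0.9 score threshold below are illustrative choices, not fixed conventions:

```python
# Sketch: validate, then serialize the full pipeline (preprocessing + model).
from joblib import dump
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_regression(n_samples=200, n_features=5, noise=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Bundle scaling and the estimator so inference applies identical preprocessing.
pipe = make_pipeline(StandardScaler(), LinearRegression())
pipe.fit(X_train, y_train)

# Validate before saving -- only persist a model that meets your threshold.
score = pipe.score(X_test, y_test)
if score > 0.9:
    dump(pipe, 'pipeline.pkl')
```

Saving the whole pipeline, rather than the bare estimator, guarantees that the deployed service applies the same preprocessing as training.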
Syntax Table
SL NO | Task | Syntax Example | Description
---|---|---|---
1 | Save Model | joblib.dump(model, 'model.pkl') | Serializes the model to disk
2 | Load Model | model = joblib.load('model.pkl') | Deserializes the model from file
3 | Flask App Setup | Flask(__name__) | Initializes a basic Flask web server
4 | Define API Route | @app.route('/predict', methods=['POST']) | Creates an endpoint for predictions
5 | Return JSON Response | return jsonify({'prediction': result}) | Sends back model output as JSON
Syntax Explanation
1. Save Model
What is it?
Serialize a trained Scikit-learn model.
Syntax:
from joblib import dump
dump(model, 'model.pkl')
Explanation:
- dump() writes the trained model object to a .pkl file.
- Supports saving large NumPy arrays and Scikit-learn estimators efficiently.
- Saves preprocessing pipelines, GridSearchCV objects, or full pipelines.
- Use an absolute or relative file path.
- An essential step before deploying or sharing a model.
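Because joblib is optimized for large NumPy payloads, it also supports compression at save time. A minimal sketch; the file name and compression level are arbitrary choices:

```python
from joblib import dump
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=100, n_features=4, random_state=0)
model = RandomForestRegressor(n_estimators=10, random_state=0).fit(X, y)

# compress=3 trades a little save/load time for a smaller file on disk.
dump(model, 'model.pkl', compress=3)
```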
2. Load Model
What is it?
Load a previously saved model for prediction.
Syntax:
from joblib import load
model = load('model.pkl')
Explanation:
- Reads and reconstructs the serialized model from disk.
- Must use the same code and environment (Python/Scikit-learn version).
- Supports loading into any Python session that has compatible libraries.
- Critical step for production usage, especially with web servers.
- You can load it into a Flask or FastAPI app for real-time inference.
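A minimal save-and-load round trip, assuming the same environment on both sides; the tiny training data here is purely illustrative:

```python
import numpy as np
from joblib import dump, load
from sklearn.linear_model import LinearRegression

# Train and save a tiny model so the load step is reproducible.
model = LinearRegression().fit([[0.0], [1.0], [2.0]], [0.0, 1.0, 2.0])
dump(model, 'model.pkl')

# Later (e.g. at server startup): reconstruct the model and predict.
loaded = load('model.pkl')
pred = loaded.predict(np.array([[3.0]]))[0]  # pred is approximately 3.0
```

In a web service, the load() call typically happens once at startup, not per request, so every prediction reuses the in-memory model.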
3. Flask App Setup
What is it?
Initialize a minimal web server to host the model API.
Syntax:
from flask import Flask
app = Flask(__name__)
Explanation:
- Flask exposes Python functionality over HTTP endpoints.
- Flask(__name__) sets up the app object used to define routes.
- Supports middleware, CORS, error handling, and more.
- Can be extended to include preprocessing logic or database interaction.
- Use app.run(debug=True) for development and debugging.
4. Define API Route
What is it?
Create an endpoint that listens for client prediction requests.
Syntax:
@app.route('/predict', methods=['POST'])
def predict():
    ...
Explanation:
- Creates a /predict endpoint that handles POST requests with input data.
- The @app.route() decorator binds the URL path to a function.
- Inside predict(), parse the JSON, reshape the data, and return the prediction.
- You can create other routes for health checks, documentation, etc.
- Use tools like Postman or curl to send POST requests.
5. Return JSON Response
What is it?
Sends prediction results back in JSON format.
Syntax:
from flask import jsonify
return jsonify({'prediction': result})
Explanation:
- Converts Python dictionaries or lists into valid JSON output.
- Required to ensure the client can interpret the response.
- Can add metadata, error codes, model info, or processing time.
- jsonify() automatically sets headers and mimetype.
- Avoid returning raw Python objects without formatting.
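One caveat worth illustrating: NumPy arrays are not JSON-serializable directly, so convert them to plain Python lists (e.g. with .tolist()) before handing them to jsonify(). A standalone sketch using the standard json module to show the conversion:

```python
import json
import numpy as np

# model.predict() typically returns an ndarray; .tolist() yields plain floats.
preds = np.array([1.5, 2.5])
payload = {'prediction': preds.tolist()}
print(json.dumps(payload))  # {"prediction": [1.5, 2.5]}
```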
Real-Life Project: House Price Prediction API
Project Overview
Expose a trained regression model (e.g., one trained on the Boston Housing dataset) as a Flask API for real-time predictions.
Code Example
from flask import Flask, request, jsonify
from joblib import load
import numpy as np

app = Flask(__name__)
model = load('model.pkl')

@app.route('/predict', methods=['POST'])
def predict():
    data = request.get_json(force=True)
    features = np.array(data['features']).reshape(1, -1)
    prediction = model.predict(features)[0]
    # Cast to a built-in float so the value is always JSON-serializable
    return jsonify({'prediction': float(prediction)})

if __name__ == '__main__':
    app.run(debug=True)
Expected Output
- JSON response with predicted value
- Can be tested using curl or Postman:
curl -X POST -H "Content-Type: application/json" -d '{"features": [0.00632, 18.0, 2.31, 0.0, 0.538, 6.575, 65.2, 4.09, 1.0, 296.0, 15.3, 396.9, 4.98]}' http://127.0.0.1:5000/predict
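For automated testing without a live server or network access, Flask's built-in test client can exercise the same route. In this sketch a tiny stub model stands in for model.pkl purely for illustration:

```python
import numpy as np
from flask import Flask, request, jsonify
from sklearn.linear_model import LinearRegression

app = Flask(__name__)
# Stub model in place of load('model.pkl'): fits y = 2x on two points.
model = LinearRegression().fit(np.array([[0.0], [1.0]]), np.array([0.0, 2.0]))

@app.route('/predict', methods=['POST'])
def predict():
    data = request.get_json(force=True)
    features = np.array(data['features']).reshape(1, -1)
    return jsonify({'prediction': float(model.predict(features)[0])})

# The test client issues requests in-process, no server or network needed.
client = app.test_client()
resp = client.post('/predict', json={'features': [2.0]})
print(resp.get_json())  # prediction is approximately 4.0
```

The same pattern works inside a pytest suite, making the endpoint testable in CI before deployment.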
Common Mistakes to Avoid
- ❌ Forgetting to scale or preprocess input features consistently
- ❌ Using different library versions during deployment
- ❌ Not validating model performance before saving