- Making a directory:
  ```bash
  mkdir -p <dirname>
  mkdir -p resnet
  ```
- To download the model (it must be in SavedModel format, which is a directory):
  ```bash
  curl -L -o <path_to_save_the_model.tar.gz> \
    <link_to_the_SavedModel_on_kagglehub>
  # for example:
  curl -L -o resnet.tar.gz \
    https://www.kaggle.com/api/v1/models/tensorflow/resnet-50/tensorFlow2/classification/1/download
  ```
  - `-o`, `--output`: write to a file instead of stdout
- Extract the file:
  ```bash
  tar -xzvf resnet.tar.gz -C resnet/
  ```
  - `-x`: extract the archive.
  - `-z`: decompress the archive (since it's `.tar.gz`).
  - `-v`: verbose output (shows the files being extracted).
  - `-f`: specifies the archive file (`resnet.tar.gz`).
  - `-C resnet/`: specifies the directory (`resnet/`) where the files should be extracted.
- Check the extracted model content:
  ```bash
  ls resnet
  ```
  The output should look like this:
  ```
  saved_model.pb  variables
  ```
- Making a directory for the models that will be served, with subfolders for the different versions of the model:
  ```bash
  mkdir -p models
  # create 3 subfolders inside the parent folder models
  # (you can skip the previous line and run only this; it works on its own)
  mkdir -p models/{1..3}
  ```
- For the purpose of testing, let's copy the same model `resnet` to the three subfolders in `models/`:
  ```bash
  sudo cp -rf resnet/* models/1/
  sudo cp -rf resnet/* models/2/
  sudo cp -rf resnet/* models/3/
  ```
- We need to use a volume so that the models on our VM are mapped into the container and persist:
  ```bash
  # this pulls the image and opens the container's terminal
  docker run -it -v $(pwd)/models/:/models -p 8501:8501 --entrypoint /bin/bash tensorflow/serving
  ```
- To delete the containers:
  ```bash
  docker rm $(docker container ps -aq)
  ```
- Inside the container, start the server:
  ```bash
  tensorflow_model_server --rest_api_port=8501 --model_name=resnet --model_base_path=/models/
  ```
  You should see something like this:

- Download this cat image (or any other image from the 1000 ImageNet classes):
  ```bash
  wget https://raw.githubusercontent.com/hossamAhmedSalah/devops_depi/refs/heads/main/cat.jpeg
  ```
- Create the file `request_payload.json`; it will be used to send the data to the model for predictions. Use this command:
  ```bash
  touch request_payload.json
  ```
- You need to pass the image to the TensorFlow Serving server, and ResNet expects images with a certain shape and dimensions, so I made a utility script in Python: `image_preprocessing.py`.
- To use this script, run (make sure you have the required libraries installed):
  ```bash
  python3 image_preprocessing.py <image_path> <mode>
  ```
- `<mode>` can be `append` or `overwrite`, since this script can be used sequentially to preprocess images before passing them to the server; it saves each image into the `request_payload.json` file.
  - `append`: adds the new preprocessed image to the JSON file.
  - `overwrite`: deletes any previous content and writes only the new processed image.
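The internals of `image_preprocessing.py` are not shown here, so the following is only a hedged sketch of what such a script might do, assuming Pillow and NumPy are installed; the function name `preprocess` and the pixel scaling are illustrative assumptions, not the script's actual code:

```python
import json
import sys

import numpy as np
from PIL import Image


def preprocess(image_path, mode="overwrite", out_path="request_payload.json"):
    """Resize an image for ResNet-50 and store it in the request payload.

    mode="append" keeps earlier images in the payload; "overwrite" starts fresh.
    """
    # ResNet-50 expects 224x224 RGB inputs (an assumption about this export)
    img = Image.open(image_path).convert("RGB").resize((224, 224))
    instance = (np.asarray(img) / 255.0).tolist()  # scale pixels to [0, 1]

    if mode == "append":
        try:
            with open(out_path) as f:
                payload = json.load(f)
        except (FileNotFoundError, json.JSONDecodeError):
            payload = {"instances": []}
    else:  # overwrite
        payload = {"instances": []}

    # TF Serving's REST predict API reads inputs from the "instances" list
    payload["instances"].append(instance)
    with open(out_path, "w") as f:
        json.dump(payload, f)


if __name__ == "__main__" and len(sys.argv) > 1:
    preprocess(sys.argv[1], sys.argv[2] if len(sys.argv) > 2 else "overwrite")
```

Running it twice with `append` would leave two entries under `"instances"`, which is what lets the script be used sequentially.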
- Great, let's send the image to the server now that we have preprocessed it:
  ```bash
  curl -d @request_payload.json -H "Content-Type: application/json" \
    -X POST http://localhost:8501/v1/models/resnet:predict
  ```
- You will see a terrifying matrix of predictions:

- Let's postprocess the predictions to make them readable. I hope the model guesses the image correctly after all this... anyway.
- Here are the 1000 classes of ImageNet.
- I used another script, `predict_and_map.py`, that sends the request and maps the result to the classes.
- Let's run it:
  ```bash
  python3 predict_and_map.py
  ```
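The mapping step itself is simple: take the argmax of each prediction vector and look it up in the class list. A hedged sketch of that step (the real `predict_and_map.py` may differ; `imagenet_classes` is assumed to be a list of class names index-aligned with the model's outputs, and note that some ResNet exports add a background class at index 0, which would shift the mapping by one):

```python
def map_predictions(response_json, imagenet_classes):
    """Map each prediction vector in a TF Serving response to (label, score)."""
    results = []
    for scores in response_json["predictions"]:
        # argmax over the score vector for this image
        best = max(range(len(scores)), key=lambda i: scores[i])
        results.append((imagenet_classes[best], scores[best]))
    return results
```

For example, a response `{"predictions": [[0.1, 0.7, 0.2]]}` against a three-class list would map to the label at index 1.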
- Let's make it simpler and package everything in one script: `preprocess_predict_map.py`.

- The currently served version:
  ```bash
  curl http://localhost:8501/v1/models/resnet
  ```
- By default the server goes to the `models/` directory and selects the highest-numbered subfolder as the servable version. Let's change this and serve multiple versions at the same time.
- Pause the server inside the container by pressing `Ctrl+C`.
- The config file `model.config.a`:
  ```
  model_config_list: {
    config: {
      name: "resnet",
      base_path: "/models/",
      model_platform: "tensorflow",
      model_version_policy: {
        all: { }
      }
    }
  }
  ```
- It should be like this:
- This is on the host:

- This is in the container:

- Run this command in the container to serve the models following the configuration we made:
  ```bash
  tensorflow_model_server --rest_api_port=8501 --model_config_file=/models/model.config.a
  ```
- Now let's see the versions by running this command:
  ```bash
  curl http://localhost:8501/v1/models/resnet
  ```
  ```json
  {
    "model_version_status": [
      { "version": "3", "state": "AVAILABLE", "status": { "error_code": "OK", "error_message": "" } },
      { "version": "2", "state": "AVAILABLE", "status": { "error_code": "OK", "error_message": "" } },
      { "version": "1", "state": "AVAILABLE", "status": { "error_code": "OK", "error_message": "" } }
    ]
  }
  ```
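With all versions available, TensorFlow Serving's standard REST routes also let a client target one specific version instead of the default (latest) one. These are assumed to run against the local server started above:

```bash
# status of a single version
curl http://localhost:8501/v1/models/resnet/versions/2

# predict against a single version
curl -d @request_payload.json -H "Content-Type: application/json" \
  -X POST http://localhost:8501/v1/models/resnet/versions/2:predict
```

This is useful for A/B-testing versions side by side while they are all served.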
- We need to create a new file with the batching parameters, `config_batching`:
  ```
  max_batch_size { value: 128 }
  batch_timeout_micros { value: 0 }
  max_enqueued_batches { value: 1000000 }
  num_batch_threads { value: 8 }
  ```
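The parameters above configure server-side batching, where TensorFlow Serving groups concurrent requests before running the model. Independently of that, the REST API also accepts several inputs in one request; a minimal sketch with a hypothetical helper name:

```python
import json


def build_batch_payload(instances, out_path="request_payload.json"):
    """Write several preprocessed images into a single predict request.

    Every entry in the "instances" list is evaluated in one request,
    so the client can batch inputs even before server-side batching kicks in.
    """
    with open(out_path, "w") as f:
        json.dump({"instances": instances}, f)
```

Sending such a payload with the same `curl ... :predict` call as before returns one prediction vector per instance.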
- Run the server with batching enabled:
  ```bash
  tensorflow_model_server --rest_api_port=8501 --model_config_file=/models/model.config.a \
    --batching_parameters_file=/models/config_batching --enable_batching=true
  ```
- Building a Dockerfile for a custom tensorflow/serving image
- build & push
- Define Kubernetes Manifests
- deployment & service
- Monitoring and visualization
- Create `run_server.sh` and put this in it:
  ```bash
  #!/bin/bash
  # Run TensorFlow Serving with the specified model and batching configurations
  tensorflow_model_server --rest_api_port=8501 --model_config_file=/models/model.config.a --batching_parameters_file=/models/config_batching --enable_batching=true
  # Keep the container running
  tail -f /dev/null
  ```
- The `Dockerfile`:
  ```dockerfile
  # Use TensorFlow Serving as base image
  FROM tensorflow/serving:latest

  # Copy the model and configuration files into the container
  COPY models/ /models/

  # Copy the script that runs TensorFlow Serving and keeps the container running
  COPY run_server.sh /models/run_server.sh
  RUN chmod +x /models/run_server.sh

  # Set the entrypoint to run the script
  ENTRYPOINT ["/models/run_server.sh"]
  ```
- Building:
  ```bash
  sudo docker build -t hossamahmedsalah/tf-serving:resnet .
  ```
- The command `tail -f /dev/null` is a clever trick to keep a container running indefinitely.
- What is `tail`? `tail` is a Unix command that displays the last few lines of a file; by default, it shows the last 10. The `-f` option stands for "follow": `tail` keeps displaying new lines as they are added to the file.
- What is `/dev/null`? `/dev/null` is a special file in Unix-like systems that represents the null device. It's a "black hole": any data written to it is discarded, and it always returns end-of-file (EOF) when read. In other words, `/dev/null` never contains any data and always appears empty.
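Both behaviours are easy to verify from any shell:

```shell
# writes to /dev/null are discarded
echo "this text disappears" > /dev/null

# reads from /dev/null hit EOF immediately, so it counts as 0 bytes;
# that is why `tail -f /dev/null` simply waits forever without output
wc -c < /dev/null
```

Since `tail -f` never sees EOF as a reason to exit and `/dev/null` never produces data, the combination blocks indefinitely, which keeps the container's main process alive.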
- Push the image to Docker Hub (you need to log in first):
  ```bash
  docker push hossamahmedsalah/tf-serving:resnet
  ```
- To pull it, you would use this command:
  ```bash
  docker pull hossamahmedsalah/tf-serving:resnet
  ```
- `tf-serving-deployment.yaml`:
  ```yaml
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: tf-serving-deployment
    labels:
      app: tf-serving
  spec:
    replicas: 3 # Number of replicas
    selector:
      matchLabels:
        app: tf-serving
    template:
      metadata:
        labels:
          app: tf-serving
      spec:
        containers:
          - name: tf-serving
            image: hossamahmedsalah/tf-serving:resnet
            ports:
              - containerPort: 8501 # HTTP/REST
              - containerPort: 8500 # gRPC
  ```
- `tf-serving-service.yaml`:
  ```yaml
  apiVersion: v1
  kind: Service
  metadata:
    name: tf-serving-service
    labels:
      app: tf-serving
  spec:
    type: LoadBalancer # Exposes the service externally
    ports:
      - name: grpc
        port: 8500 # Port for gRPC
        targetPort: 8500 # Port on the container
      - name: restapi
        port: 8501 # Port for HTTP/REST
        targetPort: 8501 # Port on the container
    selector:
      app: tf-serving # Selects the pods with this label
  ```
- Let's check the nodes before applying:
  ```bash
  kubectl get nodes
  ```
- Let's apply `tf-serving-deployment.yaml`:
  ```bash
  kubectl apply -f tf-serving-deployment.yaml
  ```
- Let's check:
  ```bash
  kubectl get deployment
  ```
- Let's apply the service that will act as a load balancer:
  ```bash
  kubectl apply -f tf-serving-service.yaml
  ```
- Let's check for the external IP:
  ```bash
  kubectl get svc tf-serving-service
  ```
- Create `monitoring.config` inside `models/` to enable monitoring in TensorFlow Serving:
  ```
  prometheus_config {
    enable: true,
    path: "/monitoring/prometheus/metrics"
  }
  ```
- Modify `run_server.sh` by adding a new flag, `--monitoring_config_file=/models/monitoring.config`:
  ```bash
  #!/bin/bash
  # Run TensorFlow Serving with the specified model and batching configurations
  tensorflow_model_server --rest_api_port=8501 --model_config_file=/models/model.config.a --batching_parameters_file=/models/config_batching --enable_batching=true --monitoring_config_file=/models/monitoring.config
  # Keep the container running
  tail -f /dev/null
  ```
- Build a new version of the Docker image:
  ```bash
  sudo docker build -t hossamahmedsalah/tf-serving:resnet_monitoring .
  ```
- Pushing:
  ```bash
  docker push hossamahmedsalah/tf-serving:resnet_monitoring
  ```
- Modify the image in `tf-serving-deployment.yaml`:
  ```yaml
  image: hossamahmedsalah/tf-serving:resnet_monitoring
  ```
- Apply the changes:
  ```bash
  kubectl apply -f tf-serving-deployment.yaml
  ```
- Let's check it:
  ```
  http://{ip}:8501/monitoring/prometheus/metrics
  ```
- Create a Prometheus ServiceMonitor, `prom_service.yaml`:
  ```yaml
  apiVersion: monitoring.coreos.com/v1
  kind: ServiceMonitor
  metadata:
    name: prometheus-self
    labels:
      app: prometheus
  spec:
    endpoints:
      - interval: 30s
        port: web
    selector:
      matchLabels:
        app: prometheus
  ```
- Create the deployment, `prom_deployment.yaml`:
  ```yaml
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: prometheus
    labels:
      app: prometheus
  spec:
    replicas: 2
    selector:
      matchLabels:
        app: prometheus
    template:
      metadata:
        labels:
          app: prometheus
      spec:
        containers:
          - name: prometheus
            image: quay.io/prometheus/prometheus:v2.22.1
            ports:
              - containerPort: 9090
  ```
- Apply:
  ```bash
  kubectl apply -f prom_deployment.yaml
  ```
- Install the Prometheus Operator with Helm:
  ```bash
  helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
  helm repo update
  helm install prometheus prometheus-community/prometheus-operator
  ```
- The one that worked:
  ```bash
  kubectl apply -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/main/bundle.yaml
  kubectl apply -f prom_service.yaml
  kubectl logs deployment/prometheus
  ```
- Access the Prometheus UI: since Prometheus is set up as a ClusterIP service, we need to port-forward to reach the web interface:
  ```bash
  kubectl port-forward deployment/prometheus 9090
  ```
- Let's create a new service to load-balance and expose Prometheus, `prom_service_balance.yaml`:
  ```yaml
  apiVersion: v1
  kind: Service
  metadata:
    name: prometheus-service
    labels:
      app: prometheus
  spec:
    type: LoadBalancer
    ports:
      - name: web
        port: 9090 # Port exposed externally
        targetPort: 9090 # Port on the container (Prometheus listens on 9090)
        protocol: TCP
    selector:
      app: prometheus
  ```
  ```bash
  kubectl apply -f prom_service_balance.yaml
  ```
- Create a ServiceMonitor to watch tf-serving:
  `tf-serving-monitoring.yaml`:
  ```yaml
  apiVersion: monitoring.coreos.com/v1
  kind: ServiceMonitor
  metadata:
    name: tf-serving-monitor
    labels:
      app: tf-serving
  spec:
    selector:
      matchLabels:
        app: tf-serving
    endpoints:
      - port: "8501" # Port where TensorFlow Serving is exposed
        interval: 30s # Scrape interval
  ```
  ```bash
  kubectl apply -f tf-serving-monitoring.yaml
  ```
- A modification on `tf-serving-monitoring.yaml`:
  ```yaml
  apiVersion: monitoring.coreos.com/v1
  kind: ServiceMonitor
  metadata:
    name: tf-serving-monitor
    labels:
      app: tf-serving
  spec:
    selector:
      matchLabels:
        app: tf-serving
    endpoints:
      - port: "8501" # Port where TensorFlow Serving is exposed
        interval: 30s # Scrape interval
        path: http://35.189.228.88:8501/monitoring/prometheus/metrics
  ```
  It didn't work as planned, but... enough for now.
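One likely reason it didn't work: in a ServiceMonitor, `path` expects a URL path relative to the scraped target (Prometheus builds the full URL itself), and `port` must reference a named port on the Service rather than a quoted number. A hedged sketch of a corrected endpoint block, assuming the `restapi` port name defined in `tf-serving-service.yaml`:

```yaml
  endpoints:
    - port: restapi # name of the Service port, not a number
      interval: 30s
      path: /monitoring/prometheus/metrics
```

The ServiceMonitor's `selector` must also match the labels on the Service (not the pods), which is worth double-checking here.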
- **rollout**: the process of updating or deploying an application in Kubernetes, typically managed by a Deployment resource. It can involve changing the container image, updating environment variables, or modifying resource requests and limits.
- The `kubectl rollout pause` command temporarily halts the rollout of a deployment. This is particularly useful during updates, when you want to prevent new replicas from being created or old replicas from being terminated while you are making adjustments.
- To reverse it, use `kubectl rollout resume`:
  ```bash
  kubectl rollout resume deployment/prometheus
  kubectl rollout resume deployment/prometheus-operator
  kubectl rollout resume deployment/tf-serving-deployment
  ```
- Scale down:
  ```bash
  kubectl scale deployment/prometheus --replicas=0
  kubectl scale deployment/prometheus-operator --replicas=0
  kubectl scale deployment/tf-serving-deployment --replicas=0
  ```
  Now we don't have any pods running, so no services either.
- Scale back up:
  ```bash
  kubectl scale deployment/prometheus --replicas=2
  kubectl scale deployment/prometheus-operator --replicas=1
  kubectl scale deployment/tf-serving-deployment --replicas=3
  ```