devops_depi: TensorFlow Serving

Download the model

  • Make a directory

    mkdir -p <dirname>
    mkdir -p resnet
  • Download the model (it must be in SavedModel format, which is a directory)

    curl -L -o <path_to_save_the_model.tar.gz> \
    link_to_the_Saved_model_on_kagglehub
    
    
    #!/bin/bash
    curl -L -o resnet.tar.gz \
    https://www.kaggle.com/api/v1/models/tensorflow/resnet-50/tensorFlow2/classification/1/download
    • -o, --output Write to file instead of stdout
  • Extract the file

    tar -xzvf resnet.tar.gz -C resnet/
    
    • -x: Extract the archive.
    • -z: Decompress the archive (since it’s .tar.gz).
    • -v: Verbose output (shows the files being extracted).
    • -f: Specifies the archive file (resnet.tar.gz).
    • -C resnet/: Specifies the directory (resnet/) where the files should be extracted.
  • Check the extracted model content

    ls resnet 
    • the output should look like this
    saved_model.pb  variables
    
  • Make a directory for the models that will be served

    mkdir -p models
    • Create the different versions of the model:
      # create 3 subfolders inside the parent folder models
      # you can skip the previous mkdir and run only this; it works on its own
      mkdir -p models/{1..3}
    • For testing purposes, let's copy the same resnet model into the three subfolders of models/:
    sudo cp -rf resnet/* models/1/
    sudo cp -rf resnet/* models/2/
    sudo cp -rf resnet/* models/3/
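Before serving, it can help to confirm that every version folder really contains a valid SavedModel layout. A minimal sketch (the helper names are illustrative, not part of the repo):

```python
import os

def is_saved_model_dir(path):
    """A SavedModel directory must contain saved_model.pb and a variables/ subdirectory."""
    return (os.path.isfile(os.path.join(path, "saved_model.pb"))
            and os.path.isdir(os.path.join(path, "variables")))

def check_model_versions(base="models"):
    """Return {version: ok} for every numeric subdirectory of `base`."""
    results = {}
    for entry in sorted(os.listdir(base)):
        full = os.path.join(base, entry)
        if entry.isdigit() and os.path.isdir(full):
            results[entry] = is_saved_model_dir(full)
    return results
```

Running `check_model_versions()` from the project root should report `True` for versions 1 through 3 after the copies above.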

Running the docker image

  • We need to use a volume so that the models on our VM are mapped into the container and persist.
# this pulls the image and opens a shell in the container
docker run -it -v $(pwd)/models/:/models -p 8501:8501 --entrypoint /bin/bash tensorflow/serving
  • To delete the containers
  docker rm $(docker container ps -aq)

After running the tensorflow/serving container, we need to serve the models

tensorflow_model_server --rest_api_port=8501 --model_name=resnet --model_base_path=/models/

You should see the server start up and begin serving the model.

  • Download this cat image, or any other image belonging to the 1000 ImageNet classes:
wget https://raw.githubusercontent.com/hossamAhmedSalah/devops_depi/refs/heads/main/cat.jpeg
  • Create the file request_payload.json; it will hold the data sent to the model for prediction. Create it with touch request_payload.json

  • You need to pass the image to the TensorFlow Serving server, and resnet expects images in a certain shape and dimensions, so there is a utility script in Python: image_preprocessing.py

  • To use this script, run python3 image_preprocessing.py <image_path> <mode> (make sure you have the required libraries installed).

    • mode can be append or overwrite; the script can be run repeatedly to preprocess images before passing them to the server, and it saves the result in the request_payload.json file.
    • append adds the newly preprocessed image to the JSON file.
    • overwrite deletes any previous content and writes only the new processed image.
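The script itself lives in the repo; purely as an illustration of the append/overwrite behaviour described above, the JSON-handling part might look roughly like this (the function name and signature are assumptions, not the actual script):

```python
import json
import os

def save_to_payload(instance, path="request_payload.json", mode="overwrite"):
    """Write a preprocessed image (a nested list of pixel values) into the
    {"instances": [...]} payload that TensorFlow Serving's REST API expects.

    mode="overwrite" replaces the file; mode="append" adds the image to the
    existing "instances" list so several images go out in one request.
    """
    if mode == "append" and os.path.exists(path):
        with open(path) as f:
            payload = json.load(f)
    else:
        payload = {"instances": []}
    payload["instances"].append(instance)
    with open(path, "w") as f:
        json.dump(payload, f)
    return payload
```

The `{"instances": [...]}` envelope is the row format TensorFlow Serving's REST predict API accepts.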
  • Great, now let's send the image to the server after preprocessing it.

curl -d @request_payload.json -H "Content-Type: application/json" -X POST http://localhost:8501/v1/models/resnet:predict
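The same request can be sent from Python with only the standard library; a sketch equivalent to the curl command above (the helper names are illustrative):

```python
import json
import urllib.request

PREDICT_URL = "http://localhost:8501/v1/models/resnet:predict"

def build_request(payload_bytes, url=PREDICT_URL):
    """Build the same POST request the curl command sends."""
    return urllib.request.Request(
        url, data=payload_bytes,
        headers={"Content-Type": "application/json"}, method="POST")

def predict(payload_path="request_payload.json"):
    """Send request_payload.json to the predict endpoint and return the parsed JSON."""
    with open(payload_path, "rb") as f:
        req = build_request(f.read())
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Calling `predict()` while the serving container is up returns the same `{"predictions": [...]}` body that curl prints.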
  • You will see a rather terrifying matrix of predictions.
  • Let's postprocess the predictions to make them readable; I hope the model guesses the image correctly after all this... anyway.
  • Here are the 1000 classes of ImageNet.
  • Another script sends the request and maps the result to the classes: predict_and_map.py
  • Run it with python3 predict_and_map.py <image>
  • To make it simpler still, everything is packaged in one script: python3 preprocess_predict_map.py <image>
  • Check the currently running version
    curl http://localhost:8501/v1/models/resnet

    • By default, the server scans the models/ directory and serves the highest-numbered version as the servable one. Let's change this and serve multiple versions at the same time.
      • Stop the server inside the container by pressing Ctrl+C.
      • Create the config file model.config.a:
      model_config_list: {
        config: {
          name: "resnet",
          base_path: "/models/",
          model_platform: "tensorflow",
          model_version_policy: {
            all: { }                                          
          }
        }
      }
      • It should look like this both on the host and inside the container (via the volume mount).
      • Run this command in the container to serve the models according to the configuration we made:
        tensorflow_model_server --rest_api_port=8501 --model_config_file=/models/model.config.a
      • Now let's list the versions by running curl http://localhost:8501/v1/models/resnet
      {
        "model_version_status": [
         {
          "version": "3",
          "state": "AVAILABLE",
          "status": {
           "error_code": "OK",
           "error_message": ""
          }
         },
         {
          "version": "2",
          "state": "AVAILABLE",
          "status": {
           "error_code": "OK",
           "error_message": ""
          }
         },
         {
          "version": "1",
          "state": "AVAILABLE",
          "status": {
           "error_code": "OK",
           "error_message": ""
          }
         }
        ]
       }
      
      • To simplify things further, there is another script that can take the version:
        python3 preprocess_predict_map_v.py <image> <mode> [<version>]
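The mapping that predict_and_map.py performs comes down to an argmax over each prediction vector. A minimal sketch (the function name and the labels argument are illustrative, not the actual script):

```python
def map_predictions(response, labels):
    """Map each prediction vector in a TensorFlow Serving REST response
    to its highest-scoring class label.

    `response` has the shape {"predictions": [[score, score, ...], ...]};
    `labels` is the list of ImageNet class names, index-aligned with the scores.
    """
    results = []
    for scores in response["predictions"]:
        best = max(range(len(scores)), key=lambda i: scores[i])
        results.append((labels[best], scores[best]))
    return results
```

Fed the server's response and the 1000-class ImageNet label list, this turns the "terrifying matrix" into readable (label, score) pairs.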

Batching

  • We need to create a new config file with the batching parameters: config_batching
max_batch_size { value: 128 }
batch_timeout_micros { value: 0 }
max_enqueued_batches { value: 1000000 }
num_batch_threads { value: 8 }
     
tensorflow_model_server --rest_api_port=8501 --model_config_file=/models/model.config.a --batching_parameters_file=/models/config_batching --enable_batching=true
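Conceptually, the server collects queued requests into batches of at most max_batch_size, waiting up to batch_timeout_micros for a batch to fill. A toy illustration of the grouping (not TensorFlow Serving's actual implementation):

```python
def group_into_batches(requests, max_batch_size=128):
    """Toy illustration of server-side batching: split a stream of queued
    requests into batches of at most `max_batch_size`. With
    batch_timeout_micros=0 the server never waits for a batch to fill,
    so a real batch is simply whatever is queued when a worker frees up."""
    return [requests[i:i + max_batch_size]
            for i in range(0, len(requests), max_batch_size)]
```

With num_batch_threads { value: 8 }, up to 8 such batches are processed concurrently.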

Kubernetes

  1. Building a Dockerfile for a custom tensorflow/serving image
  • build & push
  2. Define Kubernetes Manifests
  • deployment & service
  3. Monitoring and visualization

1. Building a Dockerfile for a custom tensorflow/serving image

  • Create run_server.sh with the following content:
#!/bin/bash

# Run TensorFlow Serving with the specified model and batching configurations
tensorflow_model_server --rest_api_port=8501 --model_config_file=/models/model.config.a --batching_parameters_file=/models/config_batching --enable_batching=true

# Keep the container running
tail -f /dev/null

  • Dockerfile

# Use TensorFlow Serving as base image
FROM tensorflow/serving:latest

# Copy the model and configuration files into the container
COPY models/ /models/

# Create a script to run TensorFlow Serving and keep the container running
COPY run_server.sh /models/run_server.sh
RUN chmod +x /models/run_server.sh

# Set the entrypoint to run the script
ENTRYPOINT ["/models/run_server.sh"]




  • Building the image
sudo docker build -t hossamahmedsalah/tf-serving:resnet .

The command tail -f /dev/null is a clever trick to keep a container running indefinitely.

  • What is tail? tail is a Unix command that displays the last few lines of a file. By default, it shows the last 10 lines of a file. The -f option stands for "follow," which means that tail will continue to display new lines as they are added to the file.
  • What is /dev/null? /dev/null is a special file in Unix-like systems that represents a null device. It's a "black hole" where any data written to it is discarded, and it always returns an end-of-file (EOF) when read from. In other words, /dev/null is a file that never contains any data and always appears empty.
  • Pushing the image to docker hub (you need to login)
    docker push hossamahmedsalah/tf-serving:resnet
    
  • to pull it you would use this command
    docker pull hossamahmedsalah/tf-serving:resnet
    

2. Define Kubernetes Manifests

  • tf-serving-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tf-serving-deployment
  labels:
    app: tf-serving
spec:
  replicas: 3  # Number of replicas
  selector:
    matchLabels:
      app: tf-serving
  template:
    metadata:
      labels:
        app: tf-serving
    spec:
      containers:
      - name: tf-serving
        image: hossamahmedsalah/tf-serving:resnet 
        ports:
        - containerPort: 8501  # HTTP/REST
        - containerPort: 8500  # gRPC
  • tf-serving-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: tf-serving-service
  labels:
    app: tf-serving
spec:
  type: LoadBalancer  # Exposes the service externally
  ports:
    - name: grpc
      port: 8500  # Port for gRPC
      targetPort: 8500  # Port on the container
    - name: restapi
      port: 8501  # Port for HTTP/REST
      targetPort: 8501  # Port on the container
  selector:
    app: tf-serving  # Selects the pods with this label
  • Let's check the nodes before applying
kubectl get nodes


  • Let's apply tf-serving-deployment.yaml
kubectl apply -f tf-serving-deployment.yaml


  • Let's check
kubectl get deployment


  • Let's apply the service that will act as a load balancer
kubectl apply -f tf-serving-service.yaml


  • Let's check for the external address
kubectl get svc tf-serving-service


  • Check the external address from the browser.

3. Monitoring and visualization

  • Create monitoring.config inside models/ to enable monitoring in TensorFlow Serving:
prometheus_config {
  enable: true,
  path: "/monitoring/prometheus/metrics"
}
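The endpoint serves metrics in Prometheus's plain-text exposition format. A small parser sketch of that format (illustrative only; the metric names used below are made up, not TensorFlow Serving's actual metric names):

```python
def parse_metrics(text):
    """Parse Prometheus text exposition format into {metric_line: value},
    skipping # HELP / # TYPE comment lines and blank lines."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # each sample line is "<name>{labels} <value>"; split on the last space
        name, _, value = line.rpartition(" ")
        try:
            metrics[name] = float(value)
        except ValueError:
            pass  # ignore malformed lines in this sketch
    return metrics
```

Fetching /monitoring/prometheus/metrics and feeding the body through this shows exactly what Prometheus will scrape.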
  • Modify run_server.sh by adding a new flag: --monitoring_config_file=/models/monitoring.config
#!/bin/bash

# Run TensorFlow Serving with the specified model and batching configurations
tensorflow_model_server --rest_api_port=8501 --model_config_file=/models/model.config.a --batching_parameters_file=/models/config_batching --enable_batching=true --monitoring_config_file=/models/monitoring.config

# Keep the container running
tail -f /dev/null
  • Build a new version of the docker image
sudo docker build -t hossamahmedsalah/tf-serving:resnet_monitoring .
  • Pushing
docker push hossamahmedsalah/tf-serving:resnet_monitoring
  • Modifying the tf-serving-deployment.yaml
image: hossamahmedsalah/tf-serving:resnet_monitoring 
  • Apply the changes
kubectl apply -f tf-serving-deployment.yaml
  • Let's check it
http://{ip}:8501/monitoring/prometheus/metrics


  • Create a Prometheus ServiceMonitor, prom_service.yaml (it monitors Prometheus itself):
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: prometheus-self
  labels:
    app: prometheus
spec:
  endpoints:
  - interval: 30s
    port: web
  selector:
    matchLabels:
      app: prometheus
  • Create the Prometheus deployment, prom_deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
  labels:
    app: prometheus
spec:
  replicas: 2
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      containers:
      - name: prometheus
        image: quay.io/prometheus/prometheus:v2.22.1
        ports:
        - containerPort: 9090
  • Apply
 kubectl apply -f prom_deployment.yaml

Attempting to install the Prometheus Operator via Helm:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts

helm repo update

helm install prometheus prometheus-community/prometheus-operator


  • The approach that actually worked: applying the operator bundle directly, then the ServiceMonitor
kubectl apply -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/main/bundle.yaml
kubectl apply -f prom_service.yaml


kubectl logs deployment/prometheus


  • Access Prometheus UI: Since Prometheus is set up as a ClusterIP service, we will need to port-forward to access the web interface.
kubectl port-forward deployment/prometheus 9090


  • Let's create a new service to load-balance and expose Prometheus, prom_service_balance.yaml:
apiVersion: v1
kind: Service
metadata:
  name: prometheus-service  
  labels:
    app: prometheus
spec:
  type: LoadBalancer
  ports:
    - name: web
      port: 9090  # Port exposed externally
      targetPort: 9090  # Port on the container (Prometheus listens on 9090)
      protocol: TCP
  selector:
    app: prometheus 

kubectl apply -f prom_service_balance.yaml


  • Let's see it.


  • Create a ServiceMonitor to watch tf-serving, tf-serving-monitoring.yaml:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: tf-serving-monitor
  labels:
    app: tf-serving
spec:
  selector:
    matchLabels:
      app: tf-serving
  endpoints:
    - port: "8501"  # Port where TensorFlow Serving is exposed
      interval: 30s  # Scrape interval

kubectl apply -f tf-serving-monitoring.yaml
  • A modification to tf-serving-monitoring.yaml:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: tf-serving-monitor
  labels:
    app: tf-serving
spec:
  selector:
    matchLabels:
      app: tf-serving
  endpoints:
    - port: "8501"  # Port where TensorFlow Serving is exposed
      interval: 30s  # Scrape interval
      path: http://35.189.228.88:8501/monitoring/prometheus/metrics

It didn't work as planned. A likely reason: a ServiceMonitor's port must reference a *named* port on the Service (e.g. restapi from tf-serving-service.yaml, not the raw number "8501"), and path must be a URL path such as /monitoring/prometheus/metrics, not a full URL. But that's enough for now.

Scale down and rollout

A rollout is the process of updating or deploying an application in Kubernetes, typically managed by a Deployment resource. It can involve changing the container image, updating environment variables, or modifying resource requests and limits. The kubectl rollout pause command temporarily halts the rollout of a deployment; this is useful during updates when you want to prevent new replicas from being created or old replicas from being terminated while you are making adjustments. To reverse it, use kubectl rollout resume:

kubectl rollout resume deployment/prometheus
kubectl rollout resume deployment/prometheus-operator
kubectl rollout resume deployment/tf-serving-deployment

Scale down:

kubectl scale deployment/prometheus --replicas=0
kubectl scale deployment/prometheus-operator --replicas=0
kubectl scale deployment/tf-serving-deployment --replicas=0

Now we don't have any pods running, so there are no services to reach.

Scale up

kubectl scale deployment/prometheus --replicas=2
kubectl scale deployment/prometheus-operator --replicas=1
kubectl scale deployment/tf-serving-deployment --replicas=3
