Load balancing at a connection level #110

@o-alexandre-felipe

I have an application with two pods, and some clients inside the same cluster connect to them through a Service. As far as I know, a Service balances load at the connection level, not per request.

There is no reason for the workload to be consistently higher on one pod than on the other, yet one of the pods received nearly 3 times more load than the other over a period of 3 hours.

image

The pod with more load was already running when the other pod started.

My first hypothesis was session stickiness, but a quick test shows that new connections are balanced:

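# Send 300 requests, each on a new connection, and count the distinct nv_gpu_utilization series seen.
for _ in $(seq 300); do
    curl -b cookies.txt -c cookies.txt -s riva-api.riva:8002/metrics | grep '^nv_gpu_utilization'
    sleep 0.1
done | awk '{print $1}' | sort | uniq -c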

My new hypothesis is that the Python Riva client is reusing connections. Does that make sense, or are we guaranteed to open a new connection every time we call riva.client.ASRService(auth)?
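To illustrate why reused connections would explain the imbalance even though the curl test looks balanced, here is a minimal simulation. It is only a sketch under stated assumptions: the Service picks one ready pod at random each time a connection is opened, and every request on that connection stays on that pod. The function `simulate`, the pod names `pod-a` and `pod-b`, and all parameters are hypothetical and not part of the Riva client or Kubernetes.

```python
import random


def simulate(num_clients, requests_per_client, reuse_connection,
             clients_before_second_pod=0, seed=0):
    """Count requests per pod under connection-level balancing.

    Assumed model: a backend is chosen at random when a connection is
    opened, and all requests sent on that connection go to that backend.
    """
    rng = random.Random(seed)
    load = {"pod-a": 0, "pod-b": 0}
    for client in range(num_clients):
        # Clients that connected before pod-b started could only reach pod-a.
        if client < clients_before_second_pod:
            ready = ["pod-a"]
        else:
            ready = ["pod-a", "pod-b"]
        pinned = rng.choice(ready)  # backend chosen when the connection opens
        for _ in range(requests_per_client):
            if reuse_connection:
                target = pinned  # long-lived connection: same pod every time
            else:
                # New connection per request (like the curl loop), both pods up.
                target = rng.choice(["pod-a", "pod-b"])
            load[target] += 1
    return load


if __name__ == "__main__":
    print("new connection per request:",
          simulate(1, 300, reuse_connection=False))
    print("reused connections, 2 of 4 clients older than pod-b:",
          simulate(4, 75, reuse_connection=True, clients_before_second_pod=2))
```

In this model the curl-style loop spreads requests over both pods, while clients that opened their connection before the second pod started keep sending everything to the first pod until they reconnect.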

Here are some snippets of the configuration:

riva-api Deployment (partial definition)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: riva-api
  namespace: riva
  labels:
    app: riva-api
    release: riva-api
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: riva-api
      release: riva-api
  template:
    metadata:
      labels:
        app: riva-api
        release: riva-api
    ...
    spec:
      ...
      containers:
        - name: riva-api
          image: nvcr.io/nvidia/riva/riva-speech:2.14.0
          ...

riva-api Service definition

apiVersion: v1
kind: Service
metadata:
  name: riva-api
  namespace: riva
spec:
  ports:
      ...
  selector:
    app: riva-api
    release: riva-api
