[Bug] Kyuubi Server might not delete the driver pod after completed #7293

@miaht94

Search before asking

  • I have searched in the issues and found no similar issues.

Describe the bug

Hi Kyuubi Team,

I'm running Kyuubi on Kubernetes with kyuubi.kubernetes.spark.cleanupTerminatedDriverPod.kind=ALL set on the Kyuubi Server, but it does not seem to work as expected. After reviewing the source code of version 1.10.2, specifically the Kubernetes informer initialization logic, I noticed that the informer is only initialized when configurations like the following are provided:

kyuubi.kubernetes.client.initialize.list=context1:namespace1
kyuubi.kubernetes.authenticate.context1.oauthToken=
kyuubi.kubernetes.authenticate.context1.caCertFile=
...

According to the documentation, kyuubi.kubernetes.client.initialize.list is only documented as available starting from 1.11.0, yet the corresponding initialization code already exists in 1.10.2.
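To make the format above concrete, here is a minimal Python sketch of how a value such as context1:namespace1 could be split into (context, namespace) pairs. It is not Kyuubi's actual code: the comma separator for multiple entries and the function names are assumptions for illustration. It shows the point of this report: when the key is absent, the list of informer targets is empty.

```python
INIT_LIST_KEY = "kyuubi.kubernetes.client.initialize.list"


def parse_initialize_list(value):
    """Split 'ctx1:ns1,ctx2:ns2' into [(context, namespace), ...].

    Assumption: entries are comma-separated. The issue only shows a
    single 'context1:namespace1' entry.
    """
    pairs = []
    for item in (value or "").split(","):
        item = item.strip()
        if not item:
            continue
        context, sep, namespace = item.partition(":")
        context, namespace = context.strip(), namespace.strip()
        if not sep or not context or not namespace:
            raise ValueError(f"expected context:namespace, got {item!r}")
        pairs.append((context, namespace))
    return pairs


def informer_targets(conf):
    """Return the (context, namespace) pairs an informer would be built for."""
    return parse_initialize_list(conf.get(INIT_LIST_KEY))
```

With an empty configuration, informer_targets({}) returns an empty list, which matches the observation that no informer is created unless the key is set.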

Additionally, this setup feels unnecessarily complex when Kyuubi runs inside Kubernetes and uses the ServiceAccount credentials attached to its Pod. In such scenarios, users would expect Kyuubi to initialize the Kubernetes informer automatically from the in-cluster configuration. However, without explicitly setting kyuubi.kubernetes.client.initialize.list, no informer is created, which prevents features such as terminated driver pod cleanup from working.
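For reference, the in-cluster configuration mentioned above is conventionally discovered from the ServiceAccount files mounted into the Pod and from the KUBERNETES_SERVICE_HOST / KUBERNETES_SERVICE_PORT environment variables. The following Python sketch illustrates that detection. It is only an illustration of the fallback I would expect, not how Kyuubi or its Kubernetes client implements it, and the function name and returned dictionary keys are hypothetical.

```python
import os
from pathlib import Path

# Standard mount point for ServiceAccount credentials inside a Pod.
SA_DIR = Path("/var/run/secrets/kubernetes.io/serviceaccount")


def in_cluster_credentials(sa_dir=SA_DIR, env=os.environ):
    """Return in-cluster connection details, or None when not running in a Pod.

    Hypothetical helper: the returned keys are illustrative only.
    """
    host = env.get("KUBERNETES_SERVICE_HOST")
    port = env.get("KUBERNETES_SERVICE_PORT")
    token = Path(sa_dir) / "token"
    ca_cert = Path(sa_dir) / "ca.crt"
    namespace = Path(sa_dir) / "namespace"

    # Require the API server address plus both credential files.
    if not (host and port and token.is_file() and ca_cert.is_file()):
        return None

    return {
        "server": f"https://{host}:{port}",
        "token_file": str(token),
        "ca_cert_file": str(ca_cert),
        "namespace": namespace.read_text().strip() if namespace.is_file() else None,
    }
```

If this returns a non-None result and no explicit client list is configured, a default informer for the Pod's own namespace would cover the common single-cluster case.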

Could you please clarify:

  • Whether this behavior is intended in 1.10.2?
  • Whether Kyuubi is expected to work without explicitly setting kyuubi.kubernetes.client.initialize.list (using the CA cert and token from the Pod filesystem)?

Affects Version(s)

1.10.2

Kyuubi Server Log Output

Kyuubi Engine Log Output

Kyuubi Server Configurations

kyuubi.server.administrators=admin
# Kyuubi Metrics
kyuubi.metrics.enabled=true
kyuubi.metrics.reporters=PROMETHEUS
kyuubi.metrics.prometheus.port=10019

## User provided Kyuubi configurations
kyuubi.kubernetes.namespace=kyuubi
kyuubi.kubernetes.spark.cleanupTerminatedDriverPod.kind=ALL
kyuubi.kubernetes.terminatedApplicationRetainPeriod=PT1M
kyuubi.ha.addresses=zookeeper.zookeeper.svc:2181
kyuubi.ha.namespace=kyuubi
kyuubi.ha.zookeeper.auth.digest=admin:password
kyuubi.frontend.rest.bind.host=0.0.0.0
kyuubi.frontend.connection.url.use.hostname=false
kyuubi.frontend.thrift.binary.bind.port=10009
kyuubi.frontend.thrift.http.bind.port=10010
kyuubi.frontend.rest.bind.port=10099
kyuubi.frontend.mysql.bind.port=3309
kyuubi.frontend.protocols=MYSQL,REST,THRIFT_BINARY

## User provided Kyuubi configurations
kyuubi.engine.pool.size=-1
kyuubi.engine.share.level=CONNECTION
kyuubi.session.close.on.disconnect=true
kyuubi.engine.kubernetes.submit.timeout=PT90S
kyuubi.session.engine.spark.main.resource=local:///opt/kyuubi/externals/engines/spark/kyuubi-spark-sql-engine_2.12-1.10.2.jar
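As a quick check of the configuration above, the following Python sketch parses the properties text and flags the combination this report is about: a cleanup kind is requested but kyuubi.kubernetes.client.initialize.list is not set. It encodes only my reading of the 1.10.2 behavior, not confirmed semantics. Treating NONE as the "disabled" value and the function names are assumptions.

```python
CLEANUP_KEY = "kyuubi.kubernetes.spark.cleanupTerminatedDriverPod.kind"
INIT_LIST_KEY = "kyuubi.kubernetes.client.initialize.list"


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            conf[key.strip()] = value.strip()
    return conf


def cleanup_without_informer(conf):
    """True when cleanup is requested but no client list is configured.

    Assumption: 'NONE' (or an absent key) means cleanup is disabled.
    """
    kind = conf.get(CLEANUP_KEY, "NONE").upper()
    return kind != "NONE" and not conf.get(INIT_LIST_KEY)
```

For the server configuration in this issue, cleanup_without_informer(parse_properties(text)) evaluates to True, since the cleanup kind is ALL and the client list is absent.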

Kyuubi Engine Configurations

Additional context

No response

Are you willing to submit PR?

  • Yes. I would be willing to submit a PR with guidance from the Kyuubi community to fix.
  • No. I cannot submit a PR at this time.
