This repository was archived by the owner on Feb 1, 2022. It is now read-only.

cannot work in namespace #121

@daniel985

When we submit a job and specify a namespace, the job does not work.
We submit it like this:
"
kubectl create -f xgboost-operator/config/samples/xgboost-dist/xgboostjob_v1_iris_train.yaml -n aisys
"

The error message looks like this:
"
starting the train job
starting to extract system env
extract the Rabit env from cluster : xgboost-dist-iris-test-train-master-0, port: 9991, rank: 0, word_size: 3
start the master node
start listen on 0.0.0.0:9991

RabitTracker Setup Finished
Rabit rank setup with below envs

DMLC_NUM_WORKER=3
DMLC_TRACKER_URI=xgboost-dist-iris-test-train-master-0
DMLC_TRACKER_PORT=9991
DMLC_TASK_ID=0
retry connect to ip(retry time 1): [xgboost-dist-iris-test-train-master-0]
retry connect to ip(retry time 2): [xgboost-dist-iris-test-train-master-0]
retry connect to ip(retry time 3): [xgboost-dist-iris-test-train-master-0]
retry connect to ip(retry time 4): [xgboost-dist-iris-test-train-master-0]
connect to (failed): [xgboost-dist-iris-test-train-master-0]
Socket Connect Error:Connection refused
"
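
To narrow down where the connection fails, a small probe can be run from inside one of the pods in the `aisys` namespace. The following is a minimal sketch, not part of xgboost-operator: the function names `tracker_address_from_env` and `can_reach_tracker` are hypothetical, and the fallback values are assumptions taken from the log above. Only the `DMLC_TRACKER_URI` and `DMLC_TRACKER_PORT` variable names come from the log itself.

```python
import os
import socket
import time


def tracker_address_from_env(env=None):
    """Read the tracker host and port from the DMLC_* variables seen in the log.

    The fallback values are assumptions copied from the log output.
    """
    env = os.environ if env is None else env
    host = env.get("DMLC_TRACKER_URI", "xgboost-dist-iris-test-train-master-0")
    port = int(env.get("DMLC_TRACKER_PORT", "9991"))
    return host, port


def can_reach_tracker(host, port, retries=4, delay=1.0, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `retries` attempts."""
    for attempt in range(1, retries + 1):
        try:
            # create_connection resolves the name and opens a TCP connection,
            # so both DNS failures and refused connections surface as OSError.
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError as exc:
            print(f"attempt {attempt} to {host}:{port} failed: {exc}")
            if attempt < retries:
                time.sleep(delay)
    return False
```

Calling `can_reach_tracker(*tracker_address_from_env())` in a worker pod would show whether the failure is a name-resolution error or a refused connection, which may help tell a namespace-related DNS problem apart from a tracker that is not listening.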
