This repository was archived by the owner on Feb 1, 2022. It is now read-only.

extract_xgbooost_cluster_env() and xgb.rabit.get_rank() get different rank number #106

@wulikai1993

Description


I ran distributed training on k8s.

I obtained the rank number with extract_xgbooost_cluster_env(), as in https://github.com/kubeflow/xgboost-operator/blob/master/config/samples/xgboost-dist/train.py#L29

However, xgb.rabit.get_rank() returned a different rank number, as in https://github.com/kubeflow/xgboost-operator/blob/master/config/samples/xgboost-dist/train.py#L57.

Two things confuse me:

  1. Since extract_xgbooost_cluster_env() has already returned the rank number, why call xgb.rabit.get_rank() to get the rank number again?
  2. Why are the two rank numbers different?
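
To make the mismatch easier to reproduce, here is a minimal sketch that reports both ranks side by side on each worker. It assumes the environment-derived rank is stored in a variable named RANK; that name, and the helpers read_env_rank and describe_ranks, are hypothetical and not taken from train.py. The rabit rank is passed in as a plain integer so the sketch runs without a cluster.

```python
import os


def read_env_rank(env=None, var_name="RANK"):
    """Return the rank stored in an environment variable, or None if unset.

    The variable name "RANK" is an assumption made for illustration.
    """
    env = os.environ if env is None else env
    value = env.get(var_name)
    return int(value) if value is not None else None


def describe_ranks(env_rank, rabit_rank):
    """Build a one-line report comparing the two rank sources."""
    status = "match" if env_rank == rabit_rank else "differ"
    return f"env rank={env_rank}, rabit rank={rabit_rank} ({status})"


# In a real worker, the second argument would be the value returned by
# xgb.rabit.get_rank() after rabit has been initialised; a literal stands
# in for it here so the example has no external dependencies.
print(describe_ranks(read_env_rank({"RANK": "2"}), 0))
```

Logging this line from every pod would show which workers disagree and by how much, which may help narrow down where the two numbers come from.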
