This repository was archived by the owner on Aug 17, 2023. It is now read-only.

Fairing with Azure - TrainJob fails when provided with argument "pod_spec_mutators" - TypeError: 'NoneType' object is not subscriptable #562

@pshah16


/kind bug

What steps did you take and what happened:
I am running Kubeflow Fairing on Microsoft Azure.

When I create a TrainJob with the 'pod_spec_mutators' argument in the following notebook and call job.submit(), it fails with the error message below:

Notebook:
https://github.com/kubeflow/fairing/blob/master/examples/train_job_api/main.ipynb

TypeError: 'NoneType' object is not subscriptable

TypeError                                 Traceback (most recent call last)
<ipython-input-24-a50f7ea4d549> in <module>
      1 job = TrainJob(train, docker_registry=DOCKER_REGISTRY, input_files=["requirements.txt"],base_docker_image = BASE_DOCKER_IMAGE, backend=BackendClass(build_context_source=BuildContext), 
      2               pod_spec_mutators=[get_resource_mutator(cpu=1, memory=2)])
----> 3 job.submit()

/opt/conda/lib/python3.7/site-packages/kubeflow/fairing/ml_tasks/tasks.py in submit(self)
     82         deployer = self._backend.get_training_deployer(
     83             pod_spec_mutators=self._pod_spec_mutators)
---> 84         return deployer.deploy(self.pod_spec)
     85 
     86 

/opt/conda/lib/python3.7/site-packages/kubeflow/fairing/deployers/job/job.py in deploy(self, pod_spec)
     88         self.labels['fairing-id'] = self.job_id
     89         for fn in self.pod_spec_mutators:
---> 90             fn(self.backend, pod_spec, self.namespace)
     91         pod_template_spec = self.generate_pod_template_spec(pod_spec)
     92         pod_template_spec.spec.restart_policy = 'Never'

/opt/conda/lib/python3.7/site-packages/kubeflow/fairing/cloud/azure.py in add_azure_files(kube_manager, pod_spec, namespace)
    207 # Mount Azure Files shared folder so the pod can access its files with a local path
    208 def add_azure_files(kube_manager, pod_spec, namespace):
--> 209     context_hash = pod_spec.containers[0].args[1].split(':')[-1]
    210     secret_name = constants.AZURE_STORAGE_CREDS_SECRET_NAME_PREFIX + context_hash.lower()
    211     if not kube_manager.secret_exists(secret_name, namespace):

TypeError: 'NoneType' object is not subscriptable
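
For anyone triaging this: the last frame is in `add_azure_files`, not in the mutator passed by the user, and the failing expression is `pod_spec.containers[0].args[1]`. The error message means that either `containers` or the first container's `args` was `None` at that point. Below is a minimal sketch of the failure and of a defensive lookup. It does not import Fairing; `SimpleNamespace` stands in for the Kubernetes pod spec object, `extract_context_hash` is a hypothetical helper name, and the sample `args` values are made up for illustration. It is not the library's actual fix.

```python
from types import SimpleNamespace


def extract_context_hash(pod_spec):
    """Return the text after the last ':' in the second container arg.

    Returns None instead of raising when containers or args are missing.
    Hypothetical helper; it mirrors the indexing done in add_azure_files.
    """
    containers = getattr(pod_spec, "containers", None)
    if not containers:
        return None
    args = getattr(containers[0], "args", None)
    if not args or len(args) < 2:
        return None
    return args[1].split(":")[-1]


# Stand-in pod specs (assumed shapes, not real Fairing output).
pod_without_args = SimpleNamespace(containers=[SimpleNamespace(args=None)])
pod_with_args = SimpleNamespace(
    containers=[SimpleNamespace(args=["--context", "example-scheme:ABC123"])]
)

# Unguarded indexing, as in the traceback, raises when args is None.
try:
    pod_without_args.containers[0].args[1]
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
print(extract_context_hash(pod_without_args))
print(extract_context_hash(pod_with_args))
```

With a guard like this, a mutator could skip the Azure Files mount or raise a clearer error when the container has no args, rather than failing on the index operation.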

What did you expect to happen:
job.submit() should have completed without error when 'pod_spec_mutators' is provided.

Environment:

Fairing version: kubeflow-fairing==1.0.2
Kubeflow version: Build: dev_local | dashboard: v.0.0.2- | Isolation-mode: multi-user
Kubernetes version: (use kubectl version): v1.21.2
OS (e.g. from /etc/os-release): Ubuntu 20.04 LTS (Focal Fossa)
