-
Notifications
You must be signed in to change notification settings - Fork 2k
Central driver implementation #13171
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from all commits
019975f
48b65a3
81d4665
285f94d
458b202
e54cabe
bfa3940
15d185b
623a48b
8de6d93
e85dac2
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,42 @@ | ||
| apiVersion: v1 | ||
| kind: ConfigMap | ||
| metadata: | ||
| name: ml-pipeline-driver-agent | ||
| data: | ||
| sidecar.container: | | ||
| name: driver-plugin | ||
| image: kind-registry:5000/driver:ci | ||
| imagePullPolicy: IfNotPresent | ||
| env: | ||
| - name: LOG_ACCESS_KEY | ||
| valueFrom: | ||
| secretKeyRef: | ||
| name: mlpipeline-minio-artifact | ||
| key: accesskey | ||
| - name: LOG_SECRET_KEY | ||
| valueFrom: | ||
| secretKeyRef: | ||
| name: mlpipeline-minio-artifact | ||
| key: secretkey | ||
| ports: | ||
| - containerPort: 8080 | ||
| resources: | ||
| requests: | ||
| cpu: "0.1" | ||
| memory: "64Mi" | ||
| limits: | ||
| cpu: "0.5" | ||
| memory: "0.5Gi" | ||
| securityContext: | ||
| runAsNonRoot: true | ||
| runAsUser: 65534 | ||
| allowPrivilegeEscalation: false | ||
| capabilities: | ||
| drop: | ||
| - ALL | ||
| seccompProfile: | ||
| type: RuntimeDefault | ||
| volumeMounts: | ||
| - name: var-run-argo | ||
| mountPath: /kfp/log | ||
| readOnly: false |
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. should this patch not live as patch in the argo folder or so? The goal is to not patch it everywhere but have it somewhere in the common base. |
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. should this patch not live as patch in the argo folder or so? The goal is to not patch it everywhere but have it somewhere in the common base. |
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. should this patch not live as patch in the argo folder or so? The goal is to not patch it everywhere but have it somewhere in the common base. |
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. should this patch not live as patch in the argo folder or so? The goal is to not patch it everywhere but have it somewhere in the common base.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. should this patch not live as patch in the argo folder or so? The goal is to not patch it everywhere but have it somewhere in the common base. |
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. should this patch not live as patch in the argo folder or so? The goal is to not patch it everywhere but have it somewhere in the common base. |
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. should this patch not live as patch in the argo folder or so? The goal is to not patch it everywhere but have it somewhere in the common base. |
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. should this patch not live as patch in the argo folder or so? The goal is to not patch it everywhere but have it somewhere in the common base. |
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. should this patch not live as patch in the argo folder or so? The goal is to not patch it everywhere but have it somewhere in the common base. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,45 @@ | ||
| apiVersion: v1 | ||
| kind: ConfigMap | ||
| metadata: | ||
| name: ml-pipeline-driver-agent | ||
| data: | ||
| sidecar.container: | | ||
| name: driver-plugin | ||
| image: kind-registry:5000/driver:ci | ||
| imagePullPolicy: IfNotPresent | ||
| ports: | ||
| - containerPort: 8080 | ||
| env: | ||
| - name: LOG_ACCESS_KEY | ||
| valueFrom: | ||
| secretKeyRef: | ||
| name: mlpipeline-minio-artifact | ||
| key: accesskey | ||
| - name: LOG_SECRET_KEY | ||
| valueFrom: | ||
| secretKeyRef: | ||
| name: mlpipeline-minio-artifact | ||
| key: secretkey | ||
| resources: | ||
| requests: | ||
| cpu: "0.1" | ||
| memory: "64Mi" | ||
| limits: | ||
| cpu: "0.5" | ||
| memory: "0.5Gi" | ||
| securityContext: | ||
| runAsNonRoot: true | ||
| runAsUser: 65534 | ||
| allowPrivilegeEscalation: false | ||
| capabilities: | ||
| drop: | ||
| - ALL | ||
| seccompProfile: | ||
| type: RuntimeDefault | ||
| volumeMounts: | ||
| - name: argo-workflows-agent-ca-certificates | ||
| mountPath: /kfp/certs | ||
| readOnly: true | ||
| - name: var-run-argo | ||
| mountPath: /kfp/log | ||
| readOnly: false |
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. should this patch not live as patch in the argo folder or so? The goal is to not patch it everywhere but have it somewhere in the common base. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -29,6 +29,36 @@ function check_namespace { | |
| return 0 | ||
| } | ||
|
|
||
| function describe_argo_workflows { | ||
| local NAMESPACE=$1 | ||
| echo "===== Argo Workflows Inspection =====" | ||
| for wf in $(kubectl get wf -n "$NAMESPACE" -o json | jq -r '.items[] | select(.status.phase=="Failed" or .status.phase=="Running") | .metadata.name'); do | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. i recommend proper long expressive human readable and pronounceable variable names |
||
| echo "Inspected workflow: $wf" | ||
| kubectl get wf "$wf" -n "$NAMESPACE" || true | ||
| pods=$(kubectl get po -n "$NAMESPACE" -l "workflows.argoproj.io/workflow=$wf" -o jsonpath='{.items[*].metadata.name}') | ||
| for pod in $pods; do | ||
| phase=$(kubectl get po "$pod" -n "$NAMESPACE" -o jsonpath='{.status.phase}') | ||
| echo "Inspect Pod: $pod, Status: $phase" | ||
| if [[ "$phase" != "Pending" && "$phase" != "Succeeded" ]]; then | ||
| echo " ---> $pod Logs:" | ||
| if [[ "$pod" == *-agent ]]; then | ||
| kubectl logs "$pod" -n "$NAMESPACE" -c driver-plugin || true | ||
| else | ||
| kubectl logs "$pod" -n "$NAMESPACE" || true | ||
| fi | ||
| fi | ||
| echo " ---> Describe $pod:" | ||
| if [[ "$phase" != "Succeeded" ]]; then | ||
| echo " ---> Describe:" | ||
| kubectl describe po "$pod" -n "$NAMESPACE" | ||
| fi | ||
| done | ||
| done | ||
| echo "===== Argo Workflows data =====" | ||
| kubectl get events -n "${NAMESPACE}" --field-selector involvedObject.kind=Workflow --sort-by='.metadata.creationTimestamp' | ||
| echo "===============================" | ||
| } | ||
|
|
||
| function display_pod_info { | ||
| local NAMESPACE=$1 | ||
|
|
||
|
|
@@ -52,7 +82,13 @@ function display_pod_info { | |
| kubectl describe pod "${POD_NAME}" -n "${NAMESPACE}" | grep -A 100 Events || echo "No events found for pod ${POD_NAME}." | ||
|
|
||
| echo "----- LOGS -----" | ||
| kubectl logs "${POD_NAME}" -n "${NAMESPACE}" || echo "No logs found for pod ${POD_NAME}." | ||
| if [[ "${POD_NAME}" == *-agent* ]]; then | ||
| kubectl logs "${POD_NAME}" -n "${NAMESPACE}" -c driver-plugin || \ | ||
| echo "No logs found for pod ${POD_NAME}." | ||
| else | ||
| kubectl logs "${POD_NAME}" -n "${NAMESPACE}" || \ | ||
| echo "No logs found for pod ${POD_NAME}." | ||
| fi | ||
|
|
||
| echo "===========================" | ||
| echo "" | ||
|
|
@@ -64,6 +100,7 @@ function display_pod_info { | |
|
|
||
| if check_namespace "$NS"; then | ||
| display_pod_info "$NS" | ||
| describe_argo_workflows "$NS" | ||
| else | ||
| exit 0 | ||
| fi | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,73 @@ | ||
| package util | ||
|
|
||
| import ( | ||
| "context" | ||
| "fmt" | ||
| "io" | ||
| "os" | ||
|
|
||
| "github.com/sirupsen/logrus" | ||
| ) | ||
|
|
||
| type CtxKey string | ||
|
|
||
| const ( | ||
| contextLoggerKey CtxKey = "driver_log_key" | ||
| ) | ||
|
|
||
| func newFileLogger(logFile string) (*logrus.Logger, io.Closer, error) { | ||
| f, err := os.Create(logFile) | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. i recommend proper long expressive human readable and pronounceable variable names |
||
| if err != nil { | ||
| return nil, nil, err | ||
| } | ||
|
|
||
| logger := logrus.New() | ||
| logger.Out = io.MultiWriter(os.Stdout, f) | ||
| logger.Formatter = &logrus.TextFormatter{} | ||
| return logger, f, nil | ||
| } | ||
|
|
||
| // WithExistingLogger For testing only | ||
| func WithExistingLogger(ctx context.Context, logger *logrus.Logger) context.Context { | ||
| return context.WithValue(ctx, contextLoggerKey, logger) | ||
| } | ||
|
|
||
| func WithLogger(ctx context.Context, logFile string) (context.Context, io.Closer, error) { | ||
| if ctx == nil { | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. i recommend proper long expressive human readable and pronounceable variable names |
||
| return nil, nil, fmt.Errorf( | ||
| "error during creation of the logger for logId: %v. ctx can not be nil", | ||
| logFile, | ||
| ) | ||
| } | ||
|
|
||
| if GetLoggerFrom(ctx) != nil { | ||
| return nil, nil, fmt.Errorf("logger already exists in context") | ||
| } | ||
|
|
||
| logger, f, err := newFileLogger(logFile) | ||
| if err != nil { | ||
| return nil, nil, fmt.Errorf( | ||
| "error during creation of the logger for logId: %v details: %w", | ||
| logFile, | ||
| err, | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. i recommend proper long expressive human readable and pronounceable variable names |
||
| ) | ||
| } | ||
|
|
||
| ctx = context.WithValue(ctx, contextLoggerKey, logger) | ||
|
|
||
| return ctx, f, nil | ||
| } | ||
|
|
||
| func GetLoggerFrom(ctx context.Context) *logrus.Logger { | ||
| v := ctx.Value(contextLoggerKey) | ||
| if v == nil { | ||
| return nil | ||
| } | ||
|
|
||
| logger, ok := v.(*logrus.Logger) | ||
| if !ok { | ||
| return nil | ||
| } | ||
|
|
||
| return logger | ||
| } | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
THis is the fith instance of this file, please use a kustomize component, for example in the argo folder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this patch not live as patch in the argo folder or so? The goal is to not patch it everywhere 10 times but have it somewhere in the common base.