Error in data-engineering\01-spark\count_trips.py #1

@dlgldgldgld

Description

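The line shown under Source below raises an error.
Changing it to lines.repartition(2).first() makes it run normally.

This looks like a system spec issue on my machine, but I would like to know the exact cause.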

>> Source
header = lines.first()
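
For anyone trying to reproduce this, the failing call and the reported workaround can be put side by side as below. This is only a minimal sketch: it assumes `lines` is an RDD created with `sc.textFile(...)`, which the report does not show, and the helper names `first_line` and `demo` and the path `trips.csv` are hypothetical. Note that `repartition` shuffles the data, so the element returned by the workaround is not guaranteed to be the file's actual first line.

```python
def first_line(lines, num_partitions=None):
    """Return one element of an RDD-like object via first().

    With num_partitions set, the data is repartitioned first, mirroring
    the lines.repartition(2).first() workaround from the report.
    """
    if num_partitions is not None:
        # repartition() shuffles, so element order is no longer guaranteed.
        lines = lines.repartition(num_partitions)
    return lines.first()


def demo(path="trips.csv", num_partitions=None):
    """Run first_line against a local Spark context (path is hypothetical)."""
    # Imported lazily so first_line can be used without pyspark installed.
    from pyspark import SparkConf, SparkContext

    conf = SparkConf().setMaster("local").setAppName("first-line-demo")
    sc = SparkContext(conf=conf)
    try:
        lines = sc.textFile(path)
        return first_line(lines, num_partitions=num_partitions)
    finally:
        sc.stop()
```

Calling `demo()` corresponds to the line that fails in the report, and `demo(num_partitions=2)` to the workaround.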

Error Log
Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, host.docker.internal, executor driver): java.net.SocketException: Connection reset by peer: socket write error
