fix(resolver):Fix classloader issue for Spark/isolated environments#9

Open
ShahimSharafudeen wants to merge 1 commit into
prestodb:masterfrom
ShahimSharafudeen:resolver_1.7_spark_issue_fix

Conversation

@ShahimSharafudeen

@ShahimSharafudeen ShahimSharafudeen commented Apr 10, 2026

Description

This change improves the reliability of Maven Resolver / Plexus initialization in environments with isolated or non-standard classloader hierarchies (e.g., Spark executors, Docker containers, or shaded runtimes).

Previously, the resolver initialization relied solely on the Thread Context ClassLoader (TCCL). In certain distributed or containerized environments, the TCCL may not have visibility into required Maven components, resulting in runtime failures such as:

  • RepositorySystem initialization errors

  • ServiceLoader lookup failures

  • PlexusContainer component discovery failures
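The visibility problem behind these failures can be reproduced in isolation. The following is a minimal sketch, not code from this PR: `IsolatedLoaderDemo`, `isVisible`, and `isolatedLoader` are hypothetical names, and the loader with a `null` parent merely stands in for whatever isolated classloader a Spark executor or shaded runtime might install.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical demo class (not part of the PR): shows that a classloader
// without the right parent cannot see classes outside the JDK.
final class IsolatedLoaderDemo {
    private IsolatedLoaderDemo() {}

    // Returns true if the given loader can resolve the named class.
    static boolean isVisible(String className, ClassLoader loader) {
        try {
            // initialize=false: only check visibility, do not run static initializers.
            Class.forName(className, false, loader);
            return true;
        }
        catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // A loader with no URLs and a null parent delegates only to the bootstrap loader,
    // so it sees core JDK classes but nothing from the application classpath.
    static ClassLoader isolatedLoader() {
        return new URLClassLoader(new URL[0], null);
    }
}
```

If such a loader were set as the TCCL, a lookup that depends on it would fail for any application class even though that class is present on the regular classpath.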

This PR introduces a defensive classloader selection mechanism that evaluates multiple candidate classloaders and selects the first one capable of loading the required Maven components.
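A rough sketch of that selection idea, under assumptions: `ClassLoaderSelection`, `firstCapable`, and `defaultCandidates` are illustrative names rather than the methods in this PR, and the capability check is passed in as a predicate so the sketch stays independent of Maven.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical helper (names are illustrative, not taken from the PR).
final class ClassLoaderSelection {
    private ClassLoaderSelection() {}

    // Returns the first non-null candidate accepted by the capability check.
    static Optional<ClassLoader> firstCapable(List<ClassLoader> candidates, Predicate<ClassLoader> canLoad) {
        for (ClassLoader candidate : candidates) {
            if (candidate != null && canLoad.test(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    // Candidate order: thread context loader, then the loader that defined the
    // given anchor class, then the system loader. Entries may be null.
    static List<ClassLoader> defaultCandidates(Class<?> anchor) {
        return Arrays.asList(
                Thread.currentThread().getContextClassLoader(),
                anchor.getClassLoader(),
                ClassLoader.getSystemClassLoader());
    }
}
```

Returning an `Optional` leaves the "no candidate passed" decision to the caller, which is one way to keep the fallback policy explicit.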

Motivation and Context

In distributed and containerized environments (such as Spark), classloader isolation can prevent the Thread Context ClassLoader from accessing required Maven components. This leads to runtime failures even when the dependencies are present on the classpath. The following error was observed in Spark Docker containers during Maven Resolver initialization:

Caused by: org.codehaus.plexus.component.repository.exception.ComponentLookupException: java.util.NoSuchElementException
      role: org.apache.maven.repository.RepositorySystem
  roleHint: 
	at org.codehaus.plexus.DefaultPlexusContainer.lookup(DefaultPlexusContainer.java:267)
	at org.codehaus.plexus.DefaultPlexusContainer.lookup(DefaultPlexusContainer.java:255)
	at org.codehaus.plexus.DefaultPlexusContainer.lookup(DefaultPlexusContainer.java:249)
	at com.facebook.airlift.resolver.ArtifactResolver.buildProjectBuilder(ArtifactResolver.java:218)
	... 40 more
Caused by: java.util.NoSuchElementException
	at org.eclipse.sisu.inject.LocatedBeans$Itr.next(LocatedBeans.java:141)
	at org.eclipse.sisu.inject.LocatedBeans$Itr.next(LocatedBeans.java:1)
	at org.eclipse.sisu.plexus.DefaultPlexusBeans$Itr.next(DefaultPlexusBeans.java:76)
	at org.eclipse.sisu.plexus.DefaultPlexusBeans$Itr.next(DefaultPlexusBeans.java:1)
	at org.codehaus.plexus.DefaultPlexusContainer.lookup(DefaultPlexusContainer.java:263)
	... 43 more

This issue was introduced by the changes in the Airlift Resolver 1.7 PR: #2

Testing

Tested against the OSS Presto PR prestodb/presto#25295, using a JitPack build of this branch.

@linux-foundation-easycla

linux-foundation-easycla Bot commented Apr 10, 2026

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: ShahimSharafudeen / name: Shahim Sharafudeen (b70193a)

@sourcery-ai

sourcery-ai Bot commented Apr 10, 2026

Reviewer's Guide

Refactors Plexus/Maven Resolver initialization in ArtifactResolver to use a defensively selected classloader (instead of blindly using the thread context classloader), validating candidate classloaders by checking for the presence of key Maven and Plexus components to improve reliability in isolated environments like Spark.

Sequence diagram for defensive classloader selection in ArtifactResolver

sequenceDiagram
    participant Caller
    participant ArtifactResolver
    participant Thread
    participant ContextClassLoader
    participant ThisClassLoader
    participant SystemClassLoader
    participant MavenComponents

    Caller->>ArtifactResolver: container()
    activate ArtifactResolver
    ArtifactResolver->>ArtifactResolver: getEffectiveClassLoader()

    ArtifactResolver->>Thread: getContextClassLoader()
    Thread-->>ArtifactResolver: contextClassLoader

    alt contextClassLoader not null
        ArtifactResolver->>ArtifactResolver: canLoadMavenComponents(contextClassLoader)
        ArtifactResolver->>MavenComponents: try load RepositorySystem, DefaultPlexusContainer, RepositorySystem(aether)
        alt all components load
            MavenComponents-->>ArtifactResolver: success
            ArtifactResolver-->>ArtifactResolver: return contextClassLoader
        else some component missing
            MavenComponents-->>ArtifactResolver: ClassNotFoundException
            ArtifactResolver-->>ArtifactResolver: continue selection
        end
    end

    alt no valid contextClassLoader
        ArtifactResolver->>ArtifactResolver: get class loader of ArtifactResolver
        ArtifactResolver-->>ArtifactResolver: thisClassLoader
        alt thisClassLoader not null
            ArtifactResolver->>ArtifactResolver: canLoadMavenComponents(thisClassLoader)
            ArtifactResolver->>MavenComponents: try load components
            alt all components load
                MavenComponents-->>ArtifactResolver: success
                ArtifactResolver-->>ArtifactResolver: return thisClassLoader
            else some component missing
                MavenComponents-->>ArtifactResolver: ClassNotFoundException
                ArtifactResolver-->>ArtifactResolver: continue selection
            end
        end

        ArtifactResolver->>SystemClassLoader: getSystemClassLoader()
        SystemClassLoader-->>ArtifactResolver: systemClassLoader
        alt systemClassLoader not null
            ArtifactResolver->>ArtifactResolver: canLoadMavenComponents(systemClassLoader)
            ArtifactResolver->>MavenComponents: try load components
            alt all components load
                MavenComponents-->>ArtifactResolver: success
                ArtifactResolver-->>ArtifactResolver: return systemClassLoader
            else some component missing
                MavenComponents-->>ArtifactResolver: ClassNotFoundException
                ArtifactResolver-->>ArtifactResolver: continue
            end
        end

        ArtifactResolver-->>ArtifactResolver: fallback to contextClassLoader or thisClassLoader
    end

    ArtifactResolver-->>Caller: PlexusContainer initialized with selected ClassLoader
    deactivate ArtifactResolver

Class diagram for updated ArtifactResolver classloader logic

classDiagram
    class ArtifactResolver {
        +static PlexusContainer container()
        -static ClassLoader getEffectiveClassLoader()
        -static boolean canLoadMavenComponents(ClassLoader classLoader)
    }

File-Level Changes

Change: Introduce defensive classloader selection for Plexus/Maven Resolver initialization instead of relying solely on the thread context classloader.

Details:
  • Replace direct use of Thread.currentThread().getContextClassLoader() when creating the ClassWorld with a call to a new getEffectiveClassLoader() helper.
  • Add getEffectiveClassLoader() that evaluates the context, defining-class, and system classloaders in priority order, selecting the first that can load required Maven and Plexus components.
  • Add canLoadMavenComponents(ClassLoader) helper that verifies visibility of core Maven, Plexus, and Maven Resolver classes via Class.forName checks, returning a boolean result.
  • Define fallback behavior in getEffectiveClassLoader(): if no candidate passes validation, return the context classloader when it is non-null, otherwise the defining classloader.

Files: resolver/src/main/java/com/facebook/airlift/resolver/ArtifactResolver.java
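The Class.forName-based check described above might look roughly like the sketch below. It is an assumption-laden illustration, not the PR's code: `ComponentVisibility` and `canLoadAll` are hypothetical names, and the entries in `REQUIRED_CLASSES` are guesses based on the stack trace and the sequence diagram (the Aether class name in particular is assumed).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper (not the PR's implementation).
final class ComponentVisibility {
    // Assumed class names; the actual list checked by the PR may differ.
    static final List<String> REQUIRED_CLASSES = Collections.unmodifiableList(Arrays.asList(
            "org.apache.maven.repository.RepositorySystem",
            "org.codehaus.plexus.DefaultPlexusContainer",
            "org.eclipse.aether.RepositorySystem"));

    private ComponentVisibility() {}

    // Returns true only if every named class is visible to the given loader.
    static boolean canLoadAll(ClassLoader classLoader, List<String> classNames) {
        if (classLoader == null) {
            return false;
        }
        for (String name : classNames) {
            try {
                // initialize=false avoids running static initializers during the probe.
                Class.forName(name, false, classLoader);
            }
            catch (ClassNotFoundException | LinkageError e) {
                return false;
            }
        }
        return true;
    }
}
```

A call such as `ComponentVisibility.canLoadAll(loader, ComponentVisibility.REQUIRED_CLASSES)` would then serve as the capability check for each candidate loader.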


@sourcery-ai sourcery-ai Bot left a comment


Hey - I've left some high level feedback:

  • Consider caching the result of getEffectiveClassLoader() (e.g., in a static field) to avoid repeated Class.forName checks on every container() call, which may be unnecessarily expensive in hot paths.
  • In getEffectiveClassLoader(), the final return contextClassLoader != null ? contextClassLoader : thisClassLoader; can still return null if both are null; it may be safer to always fall back to systemClassLoader or throw a clear exception when no non-null candidate is available.
  • The hard-coded class checks in canLoadMavenComponents tightly couple the selection logic to specific Maven/Plexus types; consider narrowing the set or centralizing these class names as constants so future Maven/Resolver upgrades can be adjusted in one place.
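One possible shape for the first two suggestions (caching and a non-null fallback) is sketched here. It is only an illustration under assumptions: `CachedLoaderHolder`, `get`, and `nonNullFallback` are hypothetical names, and the selection logic is passed in as a supplier rather than taken from the PR.

```java
import java.util.function.Supplier;

// Hypothetical holder (not the PR's code): caches the selected loader and
// guarantees that a non-null loader is returned.
final class CachedLoaderHolder {
    private static volatile ClassLoader cached;

    private CachedLoaderHolder() {}

    // Runs the selector at most once; later calls return the cached loader.
    static ClassLoader get(Supplier<ClassLoader> selector) {
        ClassLoader result = cached;
        if (result == null) {
            synchronized (CachedLoaderHolder.class) {
                result = cached;
                if (result == null) {
                    ClassLoader selected = selector.get();
                    result = selected != null ? selected : nonNullFallback();
                    cached = result;
                }
            }
        }
        return result;
    }

    // Prefers the system loader, then this class's own loader; fails loudly otherwise.
    static ClassLoader nonNullFallback() {
        ClassLoader system = ClassLoader.getSystemClassLoader();
        ClassLoader own = CachedLoaderHolder.class.getClassLoader();
        ClassLoader fallback = system != null ? system : own;
        if (fallback == null) {
            throw new IllegalStateException("No usable ClassLoader available");
        }
        return fallback;
    }
}
```

One trade-off worth noting: the thread context classloader can differ between threads, so a process-wide cache fixes whichever loader the first caller selected.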

@czentgr czentgr left a comment


Didn't know there are multiple types of class loaders. This looks good to me.
My only question: I assume the checks are ordered by classloader preference? I don't know the difference between these.
So the thread context classloader is the basic one, followed by the ArtifactResolver classloader and then the system classloader, in that order?

@ShahimSharafudeen
Author

Didn't know there are multiple types of class loaders. This looks good to me.
My only question: I assume the checks are ordered by classloader preference? I don't know the difference between these.
So the thread context classloader is the basic one, followed by the ArtifactResolver classloader and then the system classloader, in that order?

Yes, the checks are intentionally ordered based on the preferred classloader selection priority.

The method attempts to use the Thread Context ClassLoader first, as this is the standard and expected mechanism in typical JVM and Maven environments. If that classloader does not have visibility into the required Maven components (which can happen in isolated environments such as Apache Spark containers), the logic falls back to the classloader that loaded the ArtifactResolver class. Finally, the System ClassLoader is used as a last fallback.

The method returns the first classloader that successfully loads the required Maven components, ensuring compatibility across both standard and containerized runtime environments.

Member

@imjalpreet imjalpreet left a comment


@ShahimSharafudeen, can you please have a look at the comments by Sourcery here: #9 (review)?

@ShahimSharafudeen ShahimSharafudeen force-pushed the resolver_1.7_spark_issue_fix branch from 58816a9 to b70193a Compare April 13, 2026 07:10
@ShahimSharafudeen
Author

@ShahimSharafudeen, can you please have a look at the comments by Sourcery here: #9 (review)?

@imjalpreet, the code has been updated to address the comments from Sourcery.

return cachedEffectiveClassLoader;
}

ClassLoader fallback = systemClassLoader != null ? systemClassLoader : thisClassLoader;
Member


I don't have much idea on this, so I would like to understand why the systemClassLoader is given the highest priority here, and why we are ignoring contextClassLoader?

Author


As per the code, the order of checks is contextClassLoader > thisClassLoader > systemClassLoader. The systemClassLoader is used as a fallback because it typically has broader class visibility when all validation checks fail. This is not strictly about priority; rather, it is about choosing the best available option when all other options have failed.

@ShahimSharafudeen
Author

@shrinidhijoshi - Could you please review this PR when you have time? This change depends on resolving the test failure in the Spark Integration CI in Presto.

@ShahimSharafudeen
Author

@czentgr @imjalpreet - Could you please do one round of review from your side?
