fix(resolver):Fix classloader issue for Spark/isolated environments#9
Conversation
|
|
Reviewer's GuideRefactors Plexus/Maven Resolver initialization in ArtifactResolver to use a defensively selected classloader (instead of blindly using the thread context classloader), validating candidate classloaders by checking for the presence of key Maven and Plexus components to improve reliability in isolated environments like Spark. Sequence diagram for defensive classloader selection in ArtifactResolversequenceDiagram
participant Caller
participant ArtifactResolver
participant Thread
participant ContextClassLoader
participant ThisClassLoader
participant SystemClassLoader
participant MavenComponents
Caller->>ArtifactResolver: container()
activate ArtifactResolver
ArtifactResolver->>ArtifactResolver: getEffectiveClassLoader()
ArtifactResolver->>Thread: getContextClassLoader()
Thread-->>ArtifactResolver: contextClassLoader
alt contextClassLoader not null
ArtifactResolver->>ArtifactResolver: canLoadMavenComponents(contextClassLoader)
ArtifactResolver->>MavenComponents: try load RepositorySystem, DefaultPlexusContainer, RepositorySystem(aether)
alt all components load
MavenComponents-->>ArtifactResolver: success
ArtifactResolver-->>ArtifactResolver: return contextClassLoader
else some component missing
MavenComponents-->>ArtifactResolver: ClassNotFoundException
ArtifactResolver-->>ArtifactResolver: continue selection
end
end
alt no valid contextClassLoader
ArtifactResolver->>ArtifactResolver: get class loader of ArtifactResolver
ArtifactResolver-->>ArtifactResolver: thisClassLoader
alt thisClassLoader not null
ArtifactResolver->>ArtifactResolver: canLoadMavenComponents(thisClassLoader)
ArtifactResolver->>MavenComponents: try load components
alt all components load
MavenComponents-->>ArtifactResolver: success
ArtifactResolver-->>ArtifactResolver: return thisClassLoader
else some component missing
MavenComponents-->>ArtifactResolver: ClassNotFoundException
ArtifactResolver-->>ArtifactResolver: continue selection
end
end
ArtifactResolver->>SystemClassLoader: getSystemClassLoader()
SystemClassLoader-->>ArtifactResolver: systemClassLoader
alt systemClassLoader not null
ArtifactResolver->>ArtifactResolver: canLoadMavenComponents(systemClassLoader)
ArtifactResolver->>MavenComponents: try load components
alt all components load
MavenComponents-->>ArtifactResolver: success
ArtifactResolver-->>ArtifactResolver: return systemClassLoader
else some component missing
MavenComponents-->>ArtifactResolver: ClassNotFoundException
ArtifactResolver-->>ArtifactResolver: continue
end
end
ArtifactResolver-->>ArtifactResolver: fallback to contextClassLoader or thisClassLoader
end
ArtifactResolver-->>Caller: PlexusContainer initialized with selected ClassLoader
deactivate ArtifactResolver
Class diagram for updated ArtifactResolver classloader logicclassDiagram
class ArtifactResolver {
+static PlexusContainer container()
-static ClassLoader getEffectiveClassLoader()
-static boolean canLoadMavenComponents(ClassLoader classLoader)
}
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've left some high level feedback:
- Consider caching the result of
getEffectiveClassLoader()(e.g., in a static field) to avoid repeatedClass.forNamechecks on everycontainer()call, which may be unnecessarily expensive in hot paths. - In
getEffectiveClassLoader(), the finalreturn contextClassLoader != null ? contextClassLoader : thisClassLoader;can still returnnullif both arenull; it may be safer to always fall back tosystemClassLoaderor throw a clear exception when no non-null candidate is available. - The hard-coded class checks in
canLoadMavenComponentstightly couple the selection logic to specific Maven/Plexus types; consider narrowing the set or centralizing these class names as constants so future Maven/Resolver upgrades can be adjusted in one place.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Consider caching the result of `getEffectiveClassLoader()` (e.g., in a static field) to avoid repeated `Class.forName` checks on every `container()` call, which may be unnecessarily expensive in hot paths.
- In `getEffectiveClassLoader()`, the final `return contextClassLoader != null ? contextClassLoader : thisClassLoader;` can still return `null` if both are `null`; it may be safer to always fall back to `systemClassLoader` or throw a clear exception when no non-null candidate is available.
- The hard-coded class checks in `canLoadMavenComponents` tightly couple the selection logic to specific Maven/Plexus types; consider narrowing the set or centralizing these class names as constants so future Maven/Resolver upgrades can be adjusted in one place.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
1349d7f to
58816a9
Compare
czentgr
left a comment
There was a problem hiding this comment.
Didn't know there are multiplee types of class loaders. This looks good to me.
Only question is, I assume the checks are in order of the preference of the classloader used? I don't know the difference of these.
So ThreadClassLoader is the basic one followed by Artifact, and System Class loader in that order?
Yes, the checks are intentionally ordered based on the preferred classloader selection priority. The method attempts to use the Thread Context ClassLoader first, as this is the standard and expected mechanism in typical JVM and Maven environments. If that classloader does not have visibility into the required Maven components (which can happen in isolated environments such as Apache Spark containers), the logic falls back to the classloader that loaded the ArtifactResolver class. Finally, the System ClassLoader is used as a last fallback. The method returns the first classloader that successfully loads the required Maven components, ensuring compatibility across both standard and containerized runtime environments. |
imjalpreet
left a comment
There was a problem hiding this comment.
@ShahimSharafudeen, can you please have a look at the comments by Sourcery here: #9 (review)?
58816a9 to
b70193a
Compare
@imjalpreet — The codebase has been updated based on the comments from Sourcery. |
| return cachedEffectiveClassLoader; | ||
| } | ||
|
|
||
| ClassLoader fallback = systemClassLoader != null ? systemClassLoader : thisClassLoader; |
There was a problem hiding this comment.
I don't have much idea on this, so I would like to understand why the systemClassLoader is given the highest priority here, and why we are ignoring contextClassLoader?
There was a problem hiding this comment.
As per the code, the order of checks is contextClassLoader > thisClassLoader > systemClassLoader. The systemClassLoader is used as a fallback because it typically has broader class visibility when all validation checks fail. This is not strictly about priority; rather, it is about choosing the best available option when all other options have failed.
|
@shrinidhijoshi - Could you please review this PR when you have time? This change depends on resolving the test failure in the Spark Integration CI in Presto. |
|
@czentgr @imjalpreet - Could you please do one round of review from your side? |
Description
This change improves the reliability of Maven Resolver / Plexus initialization in environments with isolated or non-standard classloader hierarchies (e.g., Spark executors, Docker containers, or shaded runtimes).
Previously, the resolver initialization relied solely on the Thread Context ClassLoader (TCCL). In certain distributed or containerized environments, the TCCL may not have visibility into required Maven components, resulting in runtime failures such as:
RepositorySystem initialization errors
ServiceLoader lookup failures
PlexusContainer component discovery failures
This PR introduces a defensive classloader selection mechanism that evaluates multiple candidate classloaders and selects the first one capable of loading the required Maven components.
Motivation and Context
In distributed and containerized environments (such as Spark), classloader isolation can prevent the Thread Context ClassLoader from accessing required Maven components. This leads to runtime failures even when dependencies are present on the classpath, resulting in errors observed in Spark Docker containers during Maven Resolver initialization.
This issue is caused by the changes introduced in the Airlift Resolver 1.7 PR : #2
Testing
Tested on OSS Presto PR by the help of jitpack : prestodb/presto#25295