[WIP / experiment] Specialization of std.foldl(std.mergePatch, arr, target)#254
Closed
JoshRosen wants to merge 21 commits intodatabricks:masterfrom
Closed
[WIP / experiment] Specialization of std.foldl(std.mergePatch, arr, target)#254JoshRosen wants to merge 21 commits intodatabricks:masterfrom
std.foldl(std.mergePatch, arr, target)#254JoshRosen wants to merge 21 commits intodatabricks:masterfrom
Conversation
This saves some extra allocations.
…ergePatch; add new tests
Still need to add more tests for plus field handling; currently lack full mutation coverage for the added code there. Also need more cases for order preservation during field merges.
Contributor
Author
|
On further reflection, I think this ends up changing lazy evaluation semantics for But I think there's still substantial room to speed this up via other potential optimizations that I spotted while digging into this, including optimizations to speed up field name checks. I also spotted a pre-existing bug in our I'll fix both of those in separate PRs. |
std.foldl(std.mergePatch, arr, target)std.foldl(std.mergePatch, arr, target)
Contributor
Author
|
Closing this, as I managed to gain even larger speedups via the optimizations in #258 and those don't change semantics. I may end up repurposing parts of the specialization testing bits in a future PR. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Note
I have marked this PR as a draft because I want to do one more round of self-review on the added test cases and test coverage before merging, plus possibly clean up some of the comments. I'm opening the PR now for feedback.
Overview
This PR adds a specialization for
std.foldl(std.mergePatch, patches, target), improving performance when applying large numbers of patches to an object.Motivation
This pattern is sometimes used for flattening or reshaping large lists of configurations.
As a simplified toy example:
yields
{ "aws": { "us-west-1": { "auth": { "servers": 2 }, "webapp": { "servers": 1 } } }, "azure": { "us-west-2": { "auth": { "servers": 2 } } } }In some cases we use this pattern to flatten very large lists of objects. In profiling, I noticed significant time spent in
mergePatchand began looking for optimizations.This PR's optimized implementation
This PR adds a specialized
FoldlMergePatchfunction to optimize this pattern. This function is not directly invokable by end users; instead, it's automatically injected using the specialization framework from #119 / 0bd255a.In psuedocode, the optimized implementation does the following:
Objpatches overwrite the target, hence special handling here.Nullthen it removes the field, so we can ignore the target and all earlier patch values when we see a null.+:fields present anywhere in the object or its children).mergePatchtests added in Fix a bug in hidden field handling in std.mergePatch #250 for details.This optimized implementation has the same O(n²) worst-case asymptotic complexity as the unoptimized
std.foldl(std.mergePatch, ..., ...)approach, but can be significantly faster in practice because it produces much less garbage: it avoids building and discarding intermediate merge results and therefore avoids having to repeatedly recompute the intermediate merge target's visible fields.I also play several tricks to reduce object allocations, including:
value0in the resultingVal.Obj.Array[Val]for holding patch values when computing the set of values that will participate in a sub-merge at a given ancestor path: I pass around asizefield to denote the "valid" size of this array, avoiding copies for trimming it to size.cleanObject, which is a perhaps-overly-complex optimized equivalent of the regular megePatch'srecSinglehelper function: this is used for recursively removing non-visible or explicitly-null keys frompatchobjects when they don't merge with the target, but it also has the side effect of dropping+:modifiers. My implementation optimizes for the common scenario where patches are purely additive by avoiding new object allocation when the cleaning would be a no-op.Correctness testing
I added a new dedicated test suite for this optimization.
I noticed that our existing unit tests don't meaningfully exercise specialization because the StaticOptimizer usually ends up constant/static-folding the test cases. To address this, I added a new internal
disableStaticApplyForBuiltinFunctionssetting which we can set in specific tests.I also added an internal
disableBuiltinSpecializationsetting for disabling specialization.Combining these together, I added a test helper which compares specialized and non-specialized execution and asserts that the answers are equal with field ordering preserved.
Warning
As we saw in #250, merge patch's implicit behaviors can be subtle and I'm still not 100% sure that I've faithfully covered all cases, which is why I've kept this marked as a draft to give me a chance to revisit the tests with fresh eyes. In particular, I'm not 100% certain that I've handled standard vs.
unhidevisibility correctly. For some reason, not yet clear to me,sjsonnet's mergePatch creates fields withUnhidevisibility. This might constrain my ability to do thecleanObjectoptimizations, but it might also be a behavior difference w.r.t. the official jsonnet implementation and thus perhaps something we could change to more faithfully match their behavior.Performance testing
🚧 I'll attach some of my microbenchmarks later.
In end-to-end tests on a very complex real-world input, this PR's patch cut ~20% of overall wallclock time.