-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Introduce PrimitiveArrayBuilder::build(), avoid use of ArrayData
#9305
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
b3bbc1c to
12525b3
Compare
PrimitiveArrayBuilder::build(), avoid use of ArrayData
|
run benchmark builder |
|
🤖 |
|
🤖: Benchmark completed Details
|
|
run benchmark builder |
|
🤖 |
|
🤖: Benchmark completed Details
|
|
🤔 seems to get slower. Will investigate |
|
run benchmark builder |
|
🤖 |
| values: ScalarBuffer<T::Native>, | ||
| nulls: Option<NullBuffer>, | ||
| ) -> Self { | ||
| Self::assert_compatible(&data_type); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does an extra check that wasn't there before when using build_unchecked?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably not relevant if we removed all build
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, I think I have a way to remove this
|
🤖: Benchmark completed Details
|
|
And it's the other way around (so noise confirmed) |
Yeah. I thought of some way to make this faster though, so working on that now |
fcebc95 to
11d7c80
Compare
6b8125c to
6895106
Compare
| values_builder: Vec<T::Native>, | ||
| null_buffer_builder: NullBufferBuilder, | ||
| data_type: DataType, | ||
| /// Optional data type override (e.g. to add timezone or precision/scale) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This time I tried switching to use Option so we only pay the extra type check when there is actually a datatype override
| /// not [PrimitiveArray::is_compatible] with the builder's primitive type | ||
| /// `T`. | ||
| pub fn with_data_type(self, data_type: DataType) -> Self { | ||
| assert!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this assert is done as part of the PrimitiveArray::with_type later on
I also updated the tests to show that
|
run benchmark builder |
|
🤖 |
|
🤖: Benchmark completed Details
|
This reverts commit 6895106.
|
run benchmark builder |
|
🤖 |
|
(testing without changes to the benchmark) |
|
🤖: Benchmark completed Details
|

Which issue does this PR close?
ArrayDatawith direct Array construction, when possible #9298Note if this looks good I will file a ticket about adding
buildto the other array buildersRationale for this change
While reviewing #9303 from @Dandandan I noticed that using the primitive builders to create arrays was non ideal for two reasons:
DataType(which while not super expensive is total overhead)Vecunecessairly)If this approach is accepted, I will make a ticket to track adding
buildto the other buildersWhat changes are included in this PR?
finishandfinish_clonedto avoid using ArrayDatabuildwhich consumes the builderThis is similar to the build methods added to the other builders here
{Null,Boolean,}BufferBuilder#9155Are these changes tested?
Yes by CI and new doc tests
I will also run benchmarks
Are there any user-facing changes?