-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Preserve input field nullability in ArrayAgg return field #19868
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Preserve input field nullability in ArrayAgg return field #19868
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR attempts to preserve the input field's nullability in the ArrayAgg aggregate function's return field. When the input field is non-nullable, the result field would be marked as non-nullable; when the input is nullable, the result field would remain nullable.
Changes:
- Added a
return_fieldmethod override to theArrayAggimplementation that sets the result field's nullability to match the input field's nullability - Added a unit test to verify that the schema metadata correctly reflects the input field's nullability
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
668ce8c to
b833445
Compare
114f5bc to
2b2a41b
Compare
- Override return_field() to preserve input field nullability in list elements - Add input_nullable field to all ArrayAgg accumulators - List itself remains nullable (NULL for empty groups) - Inner elements nullability now matches input field nullability
024785b to
24e57cb
Compare
- Separate state() from evaluate() in all three accumulators - state() always uses nullable=true for inner elements (matches state_fields()) - evaluate() uses input_nullable to preserve nullability for final output - Fix clippy: use std::slice::from_ref() instead of .clone() on FieldRef
- Fix array_sort return_type() to preserve input inner field nullability - Fix clippy: use Arc::clone() instead of .clone() on FieldRef - Update unnest_with_redundant_columns snapshot for non-nullable inner
|
|
||
| fn state(&mut self) -> Result<Vec<ScalarValue>> { | ||
| Ok(vec![self.evaluate()?]) | ||
| // State uses nullable inner elements to match state_fields() schema |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like something we should be fixing in state_fields() instead of having this handling code
| impl Accumulator for DistinctArrayAggAccumulator { | ||
| fn state(&mut self) -> Result<Vec<ScalarValue>> { | ||
| Ok(vec![self.evaluate()?]) | ||
| // State uses nullable inner elements to match state_fields() schema |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
| } | ||
|
|
||
| let mut result = vec![self.evaluate()?]; | ||
| // State uses nullable inner elements to match state_fields() schema |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here too
- Make return_type() return error since return_field() is implemented - Simplify comments in return_field() - Revert array_sort changes (unrelated to original issue) - Revert spark/collect.rs changes (UDAF return_field not fixed)
84c066f to
402927a
Compare
|
ok ill update this all |
|
@GaneshPatil7517 Please refrain from closing a discussion when no action has been taken or no comment was left. My comments regarding the state related functions don't seem to have been addressed. |
Which issue does this PR close?
Fixes #19768
Rationale for this change
Currently,
ArrayAggalways produces an output with data type: