
Use partition createTime in the event listener path#193

Merged
haroldjimenez merged 11 commits into main from feature/use-partition-create-time
Jul 1, 2025

Conversation

Contributor

@HamzaJugon HamzaJugon commented Jun 9, 2025

In version 3.6.1 we started using Hive's createTime when scheduling partitions on the table discovery path. That is, when a tag is added to a table, we use the partition's createTime to set creationTimestamp, which in turn is used to compute cleanupTimestamp.

This PR adds the same behaviour to the event path: when partitions are scheduled based on events, we also use the Hive createTime for creationTimestamp.
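
As a rough illustration of the relationship described above, here is a minimal sketch of deriving creationTimestamp and cleanupTimestamp from a Hive createTime. It assumes createTime is expressed in seconds since the Unix epoch, that timestamps are interpreted in UTC, and that the cleanup delay is available as a java.time.Duration. The class and method names are hypothetical and are not taken from the Beekeeper code base.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

// Hypothetical helper, for illustration only.
class CreateTimeSketch {

    // Assumption: createTime is seconds since the Unix epoch, read as UTC.
    static LocalDateTime toCreationTimestamp(int createTimeSeconds) {
        return LocalDateTime.ofInstant(Instant.ofEpochSecond(createTimeSeconds), ZoneOffset.UTC);
    }

    // The cleanup timestamp is the creation timestamp plus the configured delay.
    static LocalDateTime toCleanupTimestamp(LocalDateTime creationTimestamp, Duration cleanupDelay) {
        return creationTimestamp.plus(cleanupDelay);
    }
}
```

With this shape, a partition created some time before it is scheduled keeps its original age, so its cleanupTimestamp is earlier than it would be if the scheduling time were used.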

@HamzaJugon HamzaJugon changed the title Feature/use partition create time Use partition createTime in the event listener path Jun 9, 2025
@HamzaJugon HamzaJugon marked this pull request as ready for review June 10, 2025 15:29

private LocalDateTime getPartitionCreationTime(String databaseName, String tableName, String partitionName) {
  try (HiveClient hiveClient = hiveClientFactory.newInstance()) {
    PartitionInfo partitionInfo = hiveClient.getSinglePartitionInfo(databaseName, tableName, partitionName);
Contributor

If partitionName is null, I assume we get a NullPointerException somewhere, which we catch before falling back to now.
So we probably don't need an explicit check like: https://github.com/ExpediaGroup/beekeeper/pull/193/files#diff-2f5698d75962cb0b22b55102d7a0c93fd475ed3f7e9540c2b0148ca80cc2cae4R148
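
The fallback behaviour discussed here can be sketched roughly as follows. This is not the PR's implementation: the class name, the method name, and the use of a Supplier and a Clock are assumptions made so the example is self-contained. It shows a lookup whose failure (including a NullPointerException caused by a null partitionName) results in the current time being used.

```java
import java.time.Clock;
import java.time.LocalDateTime;
import java.util.function.Supplier;

// Hypothetical helper, for illustration only.
class CreationTimeFallbackSketch {

    // Returns the looked-up creation time, or "now" if the lookup fails or yields null.
    static LocalDateTime creationTimeOrNow(Supplier<LocalDateTime> lookup, Clock clock) {
        try {
            LocalDateTime createTime = lookup.get();
            if (createTime != null) {
                return createTime;
            }
        } catch (RuntimeException e) {
            // Any runtime failure, such as a NullPointerException, falls through to the fallback.
        }
        return LocalDateTime.now(clock);
    }
}
```

Passing a Clock keeps the fallback testable; whether a broad catch like this is preferable to an explicit null check is the judgment call raised in this comment.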

}
}

public PartitionInfo getSinglePartitionInfo(String databaseName, String tableName, String partitionName) {
Contributor

What's partitionName?

Contributor Author

partitionName is a string that BK (Beekeeper) constructs locally from raw event data. I added a getSinglePartitionInfo method that takes this partitionName and queries just one partition for event-driven scheduling, rather than fetching all partitions as the table discovery path does.
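
To make the shape of such a string concrete, here is a minimal sketch of building a partition name from ordered key/value pairs. It assumes the conventional Hive layout of key=value segments joined by '/', and it deliberately ignores the escaping Hive applies to special characters in values. The class name, method name, and partition keys are hypothetical; the actual construction in Beekeeper may differ.

```java
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, for illustration only.
class PartitionNameSketch {

    // Joins entries as key=value segments separated by '/'.
    // The map must preserve partition key order (for example a LinkedHashMap).
    static String toPartitionName(Map<String, String> partitionKeyValues) {
        return partitionKeyValues.entrySet().stream()
            .map(entry -> entry.getKey() + "=" + entry.getValue())
            .collect(Collectors.joining("/"));
    }
}
```

For example, the ordered pairs event_date -> 2025-06-09 and event_hour -> 12 would yield event_date=2025-06-09/event_hour=12.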

Contributor

Please add a line of Javadoc describing the format, with an example, so anyone calling this knows what to expect.
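
A Javadoc comment along these lines might look like the sketch below. The wording, the format it documents (key=value segments separated by '/'), and the small validation helper it is attached to are all assumptions for illustration; the Javadoc that was actually added in the PR is not shown in this conversation.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class PartitionNameJavadocSketch {

    // Assumed format: one or more key=value segments separated by '/'.
    private static final Pattern PARTITION_NAME =
        Pattern.compile("[^/=]+=[^/=]+(/[^/=]+=[^/=]+)*");

    /**
     * Checks whether a string looks like a partition name.
     *
     * @param partitionName partition name as {@code key=value} segments joined by {@code /},
     *     in partition key order, e.g. {@code event_date=2025-06-09/event_hour=12}
     * @return {@code true} if the name is non-null and matches the assumed format
     */
    static boolean looksLikePartitionName(String partitionName) {
        return partitionName != null && PARTITION_NAME.matcher(partitionName).matches();
    }
}
```

The same @param wording could be reused on any method that accepts a partitionName.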

Contributor

@patduin patduin left a comment


Added comments, please have a look.

@haroldjimenez haroldjimenez merged commit 38f906a into main Jul 1, 2025
2 checks passed
@haroldjimenez haroldjimenez deleted the feature/use-partition-create-time branch July 1, 2025 21:10
@haroldjimenez haroldjimenez restored the feature/use-partition-create-time branch July 2, 2025 00:31

4 participants