Use partition createTime in the event listener path #193
  private LocalDateTime getPartitionCreationTime(String databaseName, String tableName, String partitionName) {
    try (HiveClient hiveClient = hiveClientFactory.newInstance()) {
      PartitionInfo partitionInfo = hiveClient.getSinglePartitionInfo(databaseName, tableName, partitionName);
If partitionName is null, I assume we get a NullPointerException somewhere, which we catch and then fall back to now.
So we probably don't need a check like: https://github.com/ExpediaGroup/beekeeper/pull/193/files#diff-2f5698d75962cb0b22b55102d7a0c93fd475ed3f7e9540c2b0148ca80cc2cae4R148
    }
  }
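To make the fallback discussed in this thread concrete, here is a minimal Java sketch, not the PR's actual code. The PartitionLookup interface, the creationTimeOrNow helper, and the injected clock are all hypothetical names introduced for illustration. It assumes that any failure in the lookup, including a NullPointerException caused by a null partitionName, should result in the current time being used.

```java
import java.time.LocalDateTime;
import java.util.function.Supplier;

class CreationTimeFallback {

  // Hypothetical stand-in for the metastore lookup; not Beekeeper's real API.
  interface PartitionLookup {
    LocalDateTime creationTime(String databaseName, String tableName, String partitionName) throws Exception;
  }

  // Returns the partition's creation time, or the clock's "now" if the lookup
  // fails for any reason or yields no value.
  static LocalDateTime creationTimeOrNow(
      PartitionLookup lookup,
      String databaseName,
      String tableName,
      String partitionName,
      Supplier<LocalDateTime> clock) {
    try {
      LocalDateTime created = lookup.creationTime(databaseName, tableName, partitionName);
      return created != null ? created : clock.get();
    } catch (Exception e) {
      // Covers a NullPointerException from a null partitionName as well as
      // metastore errors, so no separate null check is needed up front.
      return clock.get();
    }
  }

  public static void main(String[] args) {
    PartitionLookup failing = (db, table, partition) -> {
      throw new NullPointerException("partitionName is null");
    };
    // The failing lookup falls back to the current time.
    System.out.println(creationTimeOrNow(failing, "db", "tbl", null, LocalDateTime::now));
  }
}
```

Passing the clock in as a Supplier is only a convenience for testing the fallback deterministically; the same idea works with a direct call to LocalDateTime.now().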
  public PartitionInfo getSinglePartitionInfo(String databaseName, String tableName, String partitionName) {
partitionName is constructed locally by Beekeeper as a string from the raw event data. I added a getSinglePartitionInfo method that takes this partitionName and queries just one partition for event-driven scheduling, rather than fetching all partitions as the table discovery path does.
Please add a line of Javadoc describing the expected format of partitionName, with an example, so anyone calling this method knows what to expect.
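As an illustration of what such Javadoc might describe, here is a hedged Java sketch. The thread does not state the format, so this assumes the conventional Hive partition name layout of key=value pairs joined by "/". The PartitionNames class and toPartitionName helper are hypothetical and are not part of the PR. The sketch also skips the escaping of special characters that Hive applies to partition values.

```java
import java.util.ArrayList;
import java.util.List;

class PartitionNames {

  /**
   * Builds a partition name from partition keys and values.
   *
   * <p>Assumed format: {@code key1=value1/key2=value2}, for example
   * {@code year=2024/month=01}. Special characters are not escaped here.
   *
   * @param keys partition column names, in table order
   * @param values partition values, one per key
   * @return the partition name
   */
  static String toPartitionName(List<String> keys, List<String> values) {
    if (keys.isEmpty() || keys.size() != values.size()) {
      throw new IllegalArgumentException("keys and values must be non-empty and of equal size");
    }
    List<String> parts = new ArrayList<>();
    for (int i = 0; i < keys.size(); i++) {
      parts.add(keys.get(i) + "=" + values.get(i));
    }
    return String.join("/", parts);
  }
}
```

A call such as toPartitionName(Arrays.asList("year", "month"), Arrays.asList("2024", "01")) would produce year=2024/month=01 under this assumption.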
patduin left a comment
Added comments, please have a look.
In version 3.6.1 we implemented the use of Hive's createTime when scheduling partitions for the table discovery path. This means that when a tag is added to a table, we use the createTime when setting creationTimestamp, which in turn is used to set the cleanupTimestamp.

In this PR we are adding the same behaviour for the event path. So when partitions are scheduled based on events, we also use the Hive createTime for creationTimestamp.
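
The chain from createTime to creationTimestamp to cleanupTimestamp can be sketched roughly as below. This is a minimal illustration under stated assumptions, not Beekeeper's implementation: the CleanupTimestamps class, its method names, the use of UTC, and the cleanupDelay parameter are all hypothetical. The only external fact relied on is that the Hive metastore reports a partition's createTime as seconds since the Unix epoch.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

class CleanupTimestamps {

  // Converts Hive's createTime (seconds since the Unix epoch) to a
  // LocalDateTime. UTC is an assumption made for this sketch.
  static LocalDateTime toCreationTimestamp(int hiveCreateTimeSeconds) {
    return LocalDateTime.ofInstant(Instant.ofEpochSecond(hiveCreateTimeSeconds), ZoneOffset.UTC);
  }

  // Derives the cleanup time by adding a delay to the creation time.
  // The delay parameter is a hypothetical input for illustration.
  static LocalDateTime toCleanupTimestamp(LocalDateTime creationTimestamp, Duration cleanupDelay) {
    return creationTimestamp.plus(cleanupDelay);
  }

  public static void main(String[] args) {
    LocalDateTime creation = toCreationTimestamp(1_700_000_000);
    LocalDateTime cleanup = toCleanupTimestamp(creation, Duration.ofDays(3));
    System.out.println(creation + " -> " + cleanup);
  }
}
```

Basing the cleanup time on createTime rather than on the moment of scheduling means a partition that already existed for a while before being scheduled is not granted a fresh full delay.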