Rebase twitter's commit onto prestodb master #248
Open
beinan wants to merge 129 commits into
When SORTED_WRITE_TO_TEMP_PATH_ENABLED is true, a temporary path is required for sorted writes.
Soft memory limits are default memory limits given to each query. They can be overridden using session properties, up to the hard limit set by the existing configuration properties. Soft limits make it easier to migrate a workload to lower memory limits: only the queries that require higher limits need to specify them, while all other queries default to the lower limits.
Available soft memory limit configuration properties:
- query.soft-max-memory-per-node
- query.soft-max-total-memory-per-node
- query.soft-max-total-memory
- query.soft-max-memory
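The interaction between the soft default, the session override, and the hard limit can be sketched as follows. This is a minimal illustration, not the code in this PR: the function name and parameters are hypothetical, and clamping an oversized override to the hard limit (rather than rejecting it) is an assumption.

```python
from typing import Optional


def effective_memory_limit(
    soft_limit: int,
    hard_limit: int,
    session_override: Optional[int] = None,
) -> int:
    """Return the memory limit (in bytes) that applies to one query.

    Hypothetical helper: the query gets the soft limit by default, a
    session property may request a different value, and the hard limit
    is the ceiling in every case (assumed here to clamp, not to fail).
    """
    requested = soft_limit if session_override is None else session_override
    return min(requested, hard_limit)
```

With a soft limit of 10 and a hard limit of 40, a query with no override is limited to 10, a query that requests 20 gets 20, and a query that requests 100 is held to 40.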
Adds a configuration property for the compression codec used by the orc and dwrf storage formats. Use hive.orc_compression_codec to override the generic compression codec for these two formats. The extra property is needed because not every compression codec is supported uniformly across all storage formats. For example, the ZSTD compression codec is only available for the orc and dwrf storage formats.
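The override described above amounts to a per-format lookup. The sketch below is only an illustration under stated assumptions: the function and parameter names are hypothetical, and falling back to the generic codec when no override is set is assumed rather than taken from the PR.

```python
from typing import Optional

# Storage formats that honor the ORC-specific codec override.
ORC_FAMILY = {"ORC", "DWRF"}


def select_compression_codec(
    storage_format: str,
    generic_codec: str,
    orc_codec: Optional[str] = None,
) -> str:
    """Pick the codec for a write (hypothetical helper).

    orc and dwrf use the override when one is configured; every other
    format, or orc/dwrf without an override, uses the generic codec.
    """
    if storage_format.upper() in ORC_FAMILY and orc_codec is not None:
        return orc_codec
    return generic_codec
```

This keeps a codec such as ZSTD confined to the formats that support it, while other formats continue to use the generic setting.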
We have need for this function in several places, and it is purely geometric.
Adds a parent abstract class to PrestoS3FileSystemMetricsCollector so that other SDK clients can share the metrics collector support. Adds reporting for client retry pause time, which indicates how long the thread was asleep between request retries in the client itself. Fixes the reporting of client timings. Previously, when the client retried a request, only the timings of the first request were recorded in the stats. Now, all request timings are reported individually.
Previously, an instance of PrestoS3FileSystemStats was created in PrestoS3ClientFactory, which meant it would not report S3 client stats to the instance registered with JMX. This would only have affected PrestoS3Select clients. Now the same metric instance is shared with PrestoS3FileSystem.
In SHOW FUNCTIONS results, list the built-in functions first, and then the SQL functions, in alphabetical order of the qualified function names.
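The ordering can be expressed as a two-part sort key. This is a sketch only: the FunctionEntry type and the example function names are hypothetical, and it assumes alphabetical order applies within both groups.

```python
from typing import List, NamedTuple


class FunctionEntry(NamedTuple):
    """Hypothetical row of a SHOW FUNCTIONS result."""
    qualified_name: str
    built_in: bool


def order_for_show_functions(functions: List[FunctionEntry]) -> List[FunctionEntry]:
    # False sorts before True, so built-in functions (not built_in == False)
    # come first; ties are broken alphabetically by qualified name.
    return sorted(functions, key=lambda f: (not f.built_in, f.qualified_name))


rows = [
    FunctionEntry("example.schema.tangent", built_in=False),
    FunctionEntry("presto.default.sqrt", built_in=True),
    FunctionEntry("example.schema.add_two", built_in=False),
    FunctionEntry("presto.default.abs", built_in=True),
]
ordered = [f.qualified_name for f in order_for_show_functions(rows)]
```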
Minor variable renames
The page sink commit mechanism is a general connector capability and is not restricted to partition commits.
It is not limited to committing lifespans or physical partitions; it can be used to commit any page sink write.
Co-authored-by: Andrii Rosa <andriirosa@fb.com>
Tasks in Spark are often retried and run speculatively, so a commit protocol is required for table writes to avoid data corruption.
Co-authored-by: Andrii Rosa <andriirosa@fb.com>
A footer consists of two parts:
- offset of each stripe's start location.
- footer's total size in bytes.
TestRowBasedSerialization sometimes fails when calling createRandomLongDecimalsBlock with fewer than 10 positions. Blocks with fewer than 10 positions should be allowed when they are needed. This commit removes the positionCount check and adds comments suggesting that callers use a larger positionCount when the desired nullRate > 0.
We skip the index files.
…ld failure on java 11
… Parquet schema mismatch checking (twitter-forks#245)
* Compare type by (name,type) pair rather than (index,type) pair during Parquet schema mismatch checking
* add unit test for parquet schema mismatch checker
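The difference between matching by name and matching by index shows up when a file stores the same columns in a different order. The sketch below illustrates the name-based comparison only: the function name and the tuple representation of a schema are hypothetical, and skipping columns that are absent from the file is an assumption.

```python
from typing import List, Tuple

Column = Tuple[str, str]  # (column name, type name)


def find_type_mismatches(
    table_columns: List[Column],
    file_columns: List[Column],
) -> List[Tuple[str, str, str]]:
    """Return (name, expected_type, actual_type) for each conflicting column.

    Columns are matched by name, so a file whose columns are merely
    reordered produces no mismatch. Columns missing from the file are
    skipped (assumed, not taken from the PR).
    """
    file_types = dict(file_columns)
    mismatches = []
    for name, expected in table_columns:
        actual = file_types.get(name)
        if actual is not None and actual != expected:
            mismatches.append((name, expected, actual))
    return mismatches
```

An index-based check would flag a file with columns (name, id) against a table declared as (id, name), even though every column has the expected type; the name-based check accepts it.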
cf7de87 to 3ff4f01
No description provided.