
DO NOT MERGE: CI-3898: Add rules, build and allow-failure tests for adb-8.x#1455

Open
Daniil290620 wants to merge 32 commits into adb-8.x from ADBDEV-7257-ci

Conversation

@Daniil290620

Here are some reminders before you submit the pull request

  • Add tests for the change
  • Document changes
  • Communicate in the mailing list if needed
  • Pass make installcheck
  • Review a PR in return to support the community

higuoxing and others added 27 commits April 2, 2025 19:31
This is the last patch for replacing pygresql with psycopg2 in Greenplum. This patch mainly targets the gpMgmt tools.

Benefits of replacing pygresql with psycopg2:
- Psycopg2 is actively maintained; with pygresql we have encountered bugs that haven't been fixed by the upstream yet, e.g., https://github.com/greenplum-db/gpdb/pull/13953.
- Psycopg2 is provided by Rocky Linux and Ubuntu. That is to say, we don't need to vendor it ourselves.
- Last but not least, we got a chance to clean up legacy code during the removal process, e.g., https://github.com/greenplum-db/gpdb/pull/15983.

After this patch, we need to do the following things.
- Add psycopg2 as a dependency of the rpm/deb package.
- Remove the pygresql source code tarball from the gpdb repo.
- Tidy up READMEs and requirements.txt files.

---------

Co-authored-by: Chen Mulong <chenmulong@gmail.com>
Co-authored-by: Xiaoxiao He <hxiaoxiao@vmware.com>
Co-authored-by: zhrt123 <hzhang2@vmware.com>
Co-authored-by: Piyush Chandwadkar <pchandwadkar@vmware.com>
Co-authored-by: Praveen Kumar <36772398+kpraveen457@users.noreply.github.com>
* Fully support Rocky9 platform for GP7
* Align the format of exttable outfile to Rocky9
* Fix compile_gpdb_clients_windows failure
* Fix gpMgmt test failures to support rocky9/oel9/rhel9
* Fix gpMgmt/bin/gpload_test/gpload2 tests on Rocky9

Authored-by: Ning Wu <ningw@vmware.com>
The PX layer in pgcrypto is handling digest padding on its own uniformly
for all backend implementations. Starting with OpenSSL 3.0.0, DecryptUpdate
doesn't flush the last block in case padding is enabled, so explicitly
disable it as we don't use it.

This will be backpatched to all supported versions once there is sufficient
testing in the buildfarm of OpenSSL 3.

Reviewed-by: Peter Eisentraut, Michael Paquier
Discussion: https://postgr.es/m/FEF81714-D479-4512-839B-C769D2605F8A@yesql.se
Backpatch-through: 9.6
OpenSSL 3 introduced the concept of providers to support modularization,
and moved the outdated ciphers to the new legacy provider. In case it's
not loaded in the user's openssl.cnf file there will be a lot of regress
test failures, so add alternative outputs covering those.

Also document the need to load the legacy provider in order to use older
ciphers with OpenSSL-enabled pgcrypto.
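
As a minimal illustration (assuming the pgcrypto extension is installed; the plaintext and key here are arbitrary), Blowfish is one of the ciphers moved to the legacy provider:

```
CREATE EXTENSION IF NOT EXISTS pgcrypto;
-- 'bf' (Blowfish) lives in OpenSSL 3's legacy provider; without that provider
-- loaded in openssl.cnf, this fails at cipher initialization instead of
-- returning the ciphertext.
SELECT encrypt('secret data'::bytea, 'my key'::bytea, 'bf');
```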

This will be backpatched to all supported versions once there is sufficient
testing in the buildfarm of OpenSSL 3.

Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/FEF81714-D479-4512-839B-C769D2605F8A@yesql.se
Backpatch-through: 9.6
The following routines are called within pgcrypto when handling digests
but there were no checks for failures:
- EVP_MD_CTX_size (can fail with -1 as of 3.0.0)
- EVP_MD_CTX_block_size (can fail with -1 as of 3.0.0)
- EVP_DigestInit_ex
- EVP_DigestUpdate
- EVP_DigestFinal_ex

A set of elog(ERROR) is added by this commit to detect such failures,
that should never happen except in the event of a processing failure
internal to OpenSSL.

Note that it would be possible to use ERR_reason_error_string() to get
more context about such errors, but these refer mainly to the internals
of OpenSSL, so it is not really obvious how useful that would be.  This
is left out for simplicity.

Per report from Coverity.  Thanks to Tom Lane for the discussion.

Backpatch-through: 9.5
This has previously not been a problem (that anyone ever reported),
but in future OpenSSL versions (3.0.0), where legacy ciphers are/can
be disabled, this is the place where this is reported.  So we need to
catch the error here, otherwise the higher-level functions would
return garbage.  The nearby encryption code already handled errors
similarly.

Author: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/9e9c431c-0adc-7a6d-9b1a-915de1ba3fe7@enterprisedb.com
Backpatch-through: 9.6
When the fault `checkpoint_after_redo_calculated` or
`checkpoint_control_file_updated` is injected in a newly promoted mirror
node, a connection failure occurs due to the fatal log 'mirror is being
promoted'. That is, when the connection state is MIRROR_READY but the
role changes to primary during promotion, the connection is declined to
avoid confusion.

Wait for segment promotion to finish and for connections to be accepted
before injecting the fault.

Co-authored-by: Xing Guo higuoxing@gmail.com
This issue is environment-sensitive, where the SSL tests could fail in
various ways by feeding on defaults provided by sslcert, sslkey,
sslrootkey, sslrootcert, sslcrl and sslcrldir coming from a local setup,
as of ~/.postgresql/ by default.  Horiguchi-san has reported two
failures, but more advanced testing from me (aka inclusion of garbage
SSL configuration in ~/.postgresql/ for all the configuration
parameters) has shown dozens of failures that can be triggered in the
whole test suite.

History has shown that we are not good at addressing such
issues, fixing them locally like in dd87799, and such problems keep
appearing.  This commit strengthens the entire test suite to put an end
to this set of problems by embedding invalid default values in all the
connection strings used in the tests.  The invalid values are prefixed
in each connection string, relying on the follow-up values passed in the
connection string to override any invalid value previously set.  Note
that two tests related to CRLs are required to fail with certain pre-set
configurations, but we can rely on enforcing an empty value instead
after the invalid set of values.

Reported-by: Kyotaro Horiguchi
Reviewed-by: Andrew Dunstan, Daniel Gustafsson, Kyotaro Horiguchi
Discussion: https://postgr.es/m/20220316.163658.1122740600489097632.horikyota.ntt@gmail.com
Backpatch-through: 10
Backport of commit f0d2c65 to releases 11 and 12

This means the SSL tests will fail on machines with extremely old
versions of OpenSSL, but we don't know of anything trying to run such
tests. The ability to build is not affected.

Discussion: https://postgr.es/m/73e646d3-8653-1a1c-0a39-739872b591b0@dunslane.net
…250)

In the previous PR, https://github.com/greenplum-db/gpdb/pull/14144, we finished the support of compressed transmission for readable gpfdist external tables. In this PR, I developed the code for zstd-compressed transmission for writable gpfdist external tables.

The PR involves three kinds of changes. The first part adds decompression code to gpfdist. The second part adds compression code to gpdb. Other modifications mainly involve the regression tests for the writable gpfdist external table.

Finally, to use zstd compression to transfer data for a writable external table, we need to use the flag --compress to turn on compressed transmission for gpfdist, as sketched below.
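
A hedged sketch of the workflow (table, host, and file names are hypothetical): gpfdist is started with the new flag, e.g. `gpfdist -d /data -p 8081 --compress`, and the writable external table is then used as usual:

```
-- Data written through this external table is zstd-compressed on the wire.
CREATE WRITABLE EXTERNAL TABLE sales_out (LIKE sales)
    LOCATION ('gpfdist://etlhost:8081/sales.out')
    FORMAT 'TEXT' (DELIMITER '|');

INSERT INTO sales_out SELECT * FROM sales;
```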
fix a compiler warning.
rename master to coordinator in gpfdist

Signed-off-by: Yongtao Huang <yongtaoh@vmware.com>
In an extreme case, two gpfdist processes can occupy one port (IPv4/IPv6). There are two reasons for this problem. The first is that 'bind' is not mutually exclusive. The second is that when listening on a port fails, gpfdist tries the same port with a different protocol (IPv4 or IPv6).

So this PR fixes the problem by changing the handling of failed listen attempts.
Previously, header_callback(), used to get and parse HTTP headers, took only
the first status header from all header blocks and showed it to the user. The
problem is that we can have multiple messages with multiple header blocks. In
case of an error, only the first (successful) status is shown to the user,
which is not informative. This patch fixes header_callback() so that it now
processes all status headers.
…#14698)

Compression is very useful for data transmission over low-bandwidth links, but it is a computationally intensive task and may add time in the compression stage. To improve the efficiency of compressed data transmission, multi-threaded data compression and transmission are introduced in this PR.

The feature has each segment served by a single thread. In the main thread, requests are read and parsed into specific data structures. The data is read into a buffer, then a sub-thread is created to compress the data and send the compressed result.

To avoid conflicts between threads, each request can only be served by one sub-thread. What's more, all shared resources are accessed by the main thread to avoid waiting caused by thread synchronization. To reduce the extra cost incurred by thread switching, we limit the maximum number of threads according to the number of CPU cores.

A benchmark in a local development environment showed two improvements:

    gpfdist saves about 60%~70% of bandwidth occupation when compression is enabled.
    multi-threaded compression transmits data more efficiently.

If you want multi-threaded transmission with compression, both the '--compress' and '--multi_thread' flags are required. To gain more efficiency, we recommend setting the max-length flag (-m) to 1M; see the sketch below.
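
A hypothetical end-to-end sketch: gpfdist is started as `gpfdist -p 8081 -m 1048576 --compress --multi_thread`, and a readable external table pointing at it is queried as usual (table and host names are made up):

```
CREATE EXTERNAL TABLE lineitem_ext (LIKE lineitem)
    LOCATION ('gpfdist://etlhost:8081/lineitem.tbl')
    FORMAT 'TEXT' (DELIMITER '|');

-- Each segment's request is served by its own compressing sub-thread.
SELECT count(*) FROM lineitem_ext;
```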

Co-authored by: @water32
This commit fixes some typos in gpfdist, such as:
- a request just start -> a request just start**s**
- ret will be a error number -> ret will be **an** error number
- It is for compress data in buffer. -> It is for compress**ing** data in buffer.
It also deletes an outdated comment.

Signed-off-by: Yongtao Huang <yongtaoh@vmware.com>
…7X) (#15724)

The default value for the session clean-up timeout is 300s, but under heavy
workload this default is not long enough to wait for the segment's next POST
request to arrive. It then causes a "400 invalid request due to wrong
sequence number" failure.

So this commit adds an option '-k' to set this value for gpfdist.
The default is 300, and the valid range is (300..86400).
…fication. (#15980)

To simplify the SSL authentication of gpfdist, we provide one more configuration option to gpfdist and change the gpdb-side SSL identity authentication behavior when verify_gpfdists_cert=off.

The new command flag is ssl_verify_peer. If the flag is 'on', gpfdist is forced to check the identity of clients, and the CA certificate file (root.crt) must be placed in the SSL certificate directory.

If the flag is 'off', gpfdist will not check the identity of clients even if the GUC verify_gpfdists_cert=on, and it doesn't matter whether the CA certificate file exists on the gpfdist side. SSL certificate files and SSL private key files on the gpdb segments' side are also no longer necessary. This greatly simplifies the configuration burden when using the gpfdists protocol for data transfer.

Note that the default value of ssl_verify_peer is 'on'.

Another change concerns gpdb identity authentication when verify_gpfdists_cert=off: in that case, the CA certificate file is unnecessary on the gpdb segments.

Finally, the existence checks for the certificate files and private key files on the gpdb segments are no longer mandatory. This means we can only determine whether the files are configured correctly from the error message received during the SSL connection.

The table below summarizes which SSL files are required for each combination of the two settings:

| verify_gpfdists_cert | ssl_verify_peer | gpfdist side | gpdb segment side |
| --- | --- | --- | --- |
| on | on | certificate, CA certificate, private key | certificate, CA certificate, private key |
| on | off | certificate, private key | CA certificate |
| off | on | certificate, CA certificate, private key | certificate, private key (no CA certificate) |
| off | off | certificate, private key | none |
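
For example, the most relaxed row of the table might look like this on the database side (a hedged sketch; host and table names are hypothetical, and gpfdist itself is started with ssl_verify_peer=off):

```
-- With both checks off, no SSL files are needed on the gpdb segments.
SET verify_gpfdists_cert = off;

CREATE EXTERNAL TABLE secure_sales (LIKE sales)
    LOCATION ('gpfdists://etlhost:8082/sales.txt')
    FORMAT 'TEXT' (DELIMITER '|');
```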
Gpdb is going to support rocky9/rhel9/oel9. On these operating systems, the default OpenSSL version is 3.*, which doesn't support TLS 1.0 and TLS 1.1. Thus, we have to make some adjustments to the gpfdist regression tests.
OpenSSL 3.0 does not allow using self-signed certificates with an IP address as
the host. Adding an alternative IP address or hostname requires generating the
certificates via a config file (the '-config' option). All additional IPs or
hosts should be listed in the "alt_names" section.

To avoid generating a config file for OpenSSL, the script uses "localhost"
instead of 127.0.0.1.

The patch changes the host "127.0.0.1" to "localhost" in gpfdist's SSL test and
in the generate_certs.sh script. Such changes also work with OpenSSL 1.1.1.
Attach gpdb cluster logs to allure report

This is a partial cherry-pick from commit 85efddc; other parts were ported in
commit 1e1b6c8.
This is needed since gp-concourse-cluster-provisioner has changed its code
to set up a cluster with cdw instead of mdw and to change standby_master to
standby_coordinator; it also changed its MASTER_DIRECTORY from
/data/gpdata/master to /data/gpdata/coordination.

[GPR-1052]

Authored-by: Shaoqi Bai <bshaoqi@vmware.com>
Some gpexpand tests check whether the expansion is interrupted after a set
time, but sometimes gpexpand may expand faster than the set time and the test
will fail.

This patch adds a 5-second sleep after the expansion of the first table to
simulate a long expansion without increasing the amount of data. To add the
sleep, a fault injector at reindex_relation is used, which is reached during
the expansion of the table; see the sketch below.
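
A sketch of that fault injection (assuming the gp_inject_fault extension; the exact argument list and fault-point name may differ between Greenplum versions):

```
CREATE EXTENSION IF NOT EXISTS gp_inject_fault;

-- Sleep 5 seconds the first time reindex_relation is reached on each primary:
SELECT gp_inject_fault('reindex_relation', 'sleep', '', '', '', 1, 1, 5, dbid)
FROM gp_segment_configuration WHERE role = 'p';

-- ... run the gpexpand scenario, then clear the fault:
SELECT gp_inject_fault('reindex_relation', 'reset', dbid)
FROM gp_segment_configuration WHERE role = 'p';
```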
…ch) (#14978)

This PR fixes the bug reported in [issue 14644](https://github.com/greenplum-db/gpdb/issues/14644).

**Bug Detail:**
After analyzing a table whose subclass has been dropped, the value of relhassubclass on the coordinator may differ from that on the segments. See the buggy SQL result below:
```
gpadmin=# CREATE TYPE test_type2 AS (a int, b text);
CREATE TYPE
gpadmin=# CREATE TABLE test_tbl2 OF test_type2;
CREATE TABLE
gpadmin=# CREATE TABLE test_tbl2_subclass () INHERITS (test_tbl2);
CREATE TABLE
gpadmin=# DROP TABLE test_tbl2_subclass;
DROP TABLE
gpadmin=# analyze;
ANALYZE
gpadmin=# select gp_segment_id, relhassubclass from pg_class where relname = 'test_tbl2';        
 gp_segment_id | relhassubclass
---------------+----------------
            -1 | t
(1 row)
 
gpadmin=# select gp_segment_id, relhassubclass from gp_dist_random('pg_class') where relname = 'test_tbl2';
 gp_segment_id | relhassubclass
---------------+----------------
             2 | f
             1 | f
             0 | f
(3 rows)
```

**Correct Behavior:**
The value of `relhassubclass` on the coordinator should be the same as on the segments:
```
gpadmin=# DROP TABLE test_tbl2_subclass;
DROP TABLE
gpadmin=# analyze;
ANALYZE
gpadmin=# select gp_segment_id, relhassubclass from pg_class where relname = 'test_tbl2';        
 gp_segment_id | relhassubclass
---------------+----------------
            -1 | f
(1 row)
 
gpadmin=# select gp_segment_id, relhassubclass from gp_dist_random('pg_class') where relname = 'test_tbl2';
 gp_segment_id | relhassubclass
---------------+----------------
             2 | f
             1 | f
             0 | f
(3 rows)
```
Signed-off-by: Yongtao Huang <yongtaoh@vmware.com>
@Daniil290620

bender build

@Daniil290620

bender build

Tao-T and others added 5 commits April 22, 2025 11:08
* Added GUCs to support JIT using ORCA costing model

1. CREATING NEW GUCs

The following GUCs were added for ORCA, specifically for JIT compilation. Now JIT flags can
be set using ORCA specific JIT GUCs or Planner specific JIT GUCs.

            1. optimizer_jit
            2. optimizer_jit_above_cost
            3. optimizer_jit_inline_above_cost
            4. optimizer_jit_optimize_above_cost

JIT compilation is triggered based on the cost of a plan. Earlier, the cost of plan
generated by ORCA/planner was compared with the following
JIT specific costing GUCs (defined for planner):

            1.  jit_above_cost
            2.  jit_optimize_above_cost
            3.  jit_inline_above_cost

Based on certain conditions, JIT flags were set in the planned statement.
Thus, we did not have individual control for ORCA, as the GUCs were common to both.

Since the costing models of ORCA and Planner are different (Planner costs are usually higher),
setting the JIT flags based on the common JIT costing GUCs could lead to false triggering of JIT.
To prevent such situations and gain control over triggering JIT compilation for ORCA,
separate GUCs based on ORCA costing were created.

2. Setting default values of GUCs

Based on TPC-H and other runs, for queries which perform better with JIT,
the following default values were finalized for the GUCs.
The complete analysis is available at the JIRA link (GPQP-204).

            GUC NAME                               DEFAULT VALUE(plan cost)
            1. optimizer_jit_above_cost                  7500
            2. optimizer_jit_inline_above_cost           37500
            3. optimizer_jit_optimize_above_cost         37500

* combining compute_Jit_flag() for optimizer and planner

* Made format changes and addressed review comments

* Addressing review comments- reverted changes in pg_jit_available()

* Addressing review comments- variable name change

Updated variable name from "optimizer_jit"  to "optimizer_jit_enabled".

* Updating default values of JIT GUCs for ORCA

Estimated and assigned default values to JIT GUCs for ORCA. Based on TPC-H and other runs,
for queries which perform better with JIT, the following default values were finalized for the GUCs.
The complete analysis is available at the JIRA link (GPQP-204).

            GUC NAME                               DEFAULT VALUE(plan cost)
            1. optimizer_jit_above_cost                  7500
            2. optimizer_jit_inline_above_cost           37500
            3. optimizer_jit_optimize_above_cost         37500
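
A minimal usage sketch with the GUCs introduced above (using the renamed optimizer_jit_enabled switch; big_table is a hypothetical table):

```
SET optimizer = on;                          -- plan with ORCA
SET optimizer_jit_enabled = on;              -- ORCA-specific JIT switch
SET optimizer_jit_above_cost = 7500;
SET optimizer_jit_inline_above_cost = 37500;
SET optimizer_jit_optimize_above_cost = 37500;

-- JIT flags are set in the planned statement once the ORCA plan cost
-- crosses the thresholds above.
EXPLAIN SELECT count(*) FROM big_table;
```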

---------

Co-authored-by: nishant sharma <nishants4@nishants46M6RY.vmware.com>
GPDB 7 will only support Python 3, so we also update
gpMgmt/requirements-dev.txt to remove the pinning of modules to the Python 2-
compatible versions introduced by commit 68a8237.

[GPR-1153]

Co-authored-by: Anusha Shakarad <shakarada@vmware.com>
Co-authored-by: Ning Wu <ningw@vmware.com>
In the scenario "gprecoverseg should not give warning if pg_basebackup is
running for the up segments", after stopping the primary segments there is no
wait for the switch to mirrors, and therefore the step "the user suspend the
walsender on the primary on content 2" fails.

This patch adds the "user can start transactions" step after stopping the
primary segments to wait for the switch to mirrors, as is done in other scenarios.
Minor fixes for running tests on commit 32fba77
The Dockerfile and scripts for running behave tests are taken from adb-7.2.0.
The README.Ubuntu.bash required for the container build has also been updated.
Additional packages and configuration flags have been added to Dockerfile.ubuntu.
SSH key verification is disabled, which increases the stability of behave tests.
The inclusion of dependencies during the build in compile_gpdb.bash is disabled
because we don't use them.
In the script for running ORCA unit tests, the xerces build is disabled because
it is not required.
The use of an old Python version in mirrors_mgmt_utils is fixed.
In explain_format, the test with ORCA in tests with JIT is fixed to pass; the
changes are taken from 7e86a6b.
@Daniil290620

bender build

8 similar comments