I ran the pay-later workload on four versions of the database; on the 25.2 beta, I ran the tests both with and without write buffering. Each test ran for 30 minutes, and I repeated each test four times.
"Mean Latency" (the latency of one cycle of the workload, averaged over the 30-minute run) consistently decreased from version to version, while "Workload Executions/s" consistently increased.
| Version (Advanced Clusters) | Cluster Size (AWS us-east-2) | Total vCPUs | Concurrency | Average "Mean Latency" per Workload Execution (ms) | Average "p99 Latency" per Workload Execution (ms) | Average "Mean Latency" for Writes in the Workload (ms) | Average "Workload Executions / Second" | SQL Queries / Second | SQL Queries/s Increase over Baseline (24.1.18) |
|---|---|---|---|---|---|---|---|---|---|
| 24.1.18 | 6 nodes × 4 vCPUs, 800 GiB per node | 24 | 40 | 267 | 534 | 141 | 149 | 2834 | Baseline |
| 24.3.12 | 6 nodes × 4 vCPUs, 800 GiB per node | 24 | 40 | 250 | 451 | 133 | 160 | 3035 | 7.09% |
| 25.1.5 | 6 nodes × 4 vCPUs, 800 GiB per node | 24 | 40 | 231 | 290 | 126 | 173 | 3285 | 15.91% |
| v25.2.0-beta.3 | 6 nodes × 4 vCPUs, 800 GiB per node | 24 | 40 | 197 | 359 | 115 | 202 | 3846 | 35.71% |
| v25.2.0-beta.3 (write buffering) | 6 nodes × 4 vCPUs, 800 GiB per node | 24 | 40 | 188 | 335 | 110 | 212 | 4033 | 42.30% |
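For the final row, write buffering, new in the 25.2 beta, was enabled for the workload's sessions. As a minimal sketch, assuming the `kv_transaction_buffered_writes_enabled` session variable described in the v25.2 release notes (verify the exact name against your build), enabling it looks like this:

```sql
-- Assumption: write buffering is toggled via this session variable, per
-- the v25.2 release notes; confirm the exact name on your build.
SET kv_transaction_buffered_writes_enabled = true;

-- Or make it the default for all new sessions:
ALTER ROLE ALL SET kv_transaction_buffered_writes_enabled = true;
```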
The data set consists of 4 tables and is approximately 2.5 TiB.
| Table | Number of Rows | Notes |
|---|---|---|
| "user" | 100,000,000 | Unique index on "billing_email_id" |
| account | 100,000,000 | One-to-one relationship with "user" |
| accountevent | 1,000,000,000 | One-to-many relationship with account |
| ledger | 1,000,000,000 | One-to-many relationship with account |
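The actual DDL ships with the workload. Purely for orientation, here is a hypothetical sketch of the two core tables consistent with the notes above; every column except `billing_email_id` is a placeholder of mine:

```sql
-- Hypothetical sketch inferred from the row counts and notes above. Only
-- "billing_email_id", its unique index, and the relationships come from
-- the source; all other columns are placeholders.
CREATE TABLE "user" (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    billing_email_id STRING NOT NULL,
    UNIQUE INDEX user_billing_email_id_idx (billing_email_id)
);

CREATE TABLE account (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    -- One-to-one with "user"
    user_id UUID NOT NULL UNIQUE REFERENCES "user" (id)
);
```

accountevent and ledger would follow the same pattern, each with a non-unique `account_id` foreign key back to account.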
To populate the database, you can either restore a backup or import the data; in both cases, the source data is stored in S3.
There are two versions of the backup available for restore:
| Database Version | Backup Location | Restore Command |
|---|---|---|
| 24.1 | s3://nollen-bucket/workloads/backup-pay-later-database-24.1/ | RESTORE DATABASE workload FROM LATEST IN 's3://nollen-bucket/workloads/backup-pay-later-database-24.1/?AWS_ACCESS_KEY_ID={Key ID}&AWS_SECRET_ACCESS_KEY={Key}'; |
| 24.3 | s3://nollen-bucket/workloads/backup-pay-later-database-24.3/ | RESTORE DATABASE workload FROM LATEST IN 's3://nollen-bucket/workloads/backup-pay-later-database-24.3/?AWS_ACCESS_KEY_ID={Key ID}&AWS_SECRET_ACCESS_KEY={Key}'; |
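Restoring roughly 2.5 TiB takes a while, and the restore runs as a background job. As a convenience (my addition, not part of the original steps), you can watch its progress from a SQL shell:

```sql
-- Show the restore job's status and completion fraction
SELECT job_id, status, fraction_completed
FROM [SHOW JOBS]
WHERE job_type = 'RESTORE';
```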
If you would rather import the data than restore a backup, start by creating the table DDL, then run the import command for each table:
| Table | Import Command |
|---|---|
| "user" | IMPORT INTO "user" |
| account | IMPORT INTO account |
| accountevent | IMPORT INTO accountevent |
| ledger | IMPORT INTO ledger |
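The commands above are truncated; the source file locations are not listed here. For shape only, a complete `IMPORT INTO` statement would look roughly like the following, with a hypothetical S3 path and file format:

```sql
-- Hypothetical example: the real S3 path and file format for the source
-- data are not given above; substitute the actual location.
IMPORT INTO "user"
    CSV DATA (
        's3://{bucket}/{path}/user/*.csv?AWS_ACCESS_KEY_ID={Key ID}&AWS_SECRET_ACCESS_KEY={Key}'
    );
```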
**Before running the workload**, the following table must be created in the workload database:
```sql
CREATE TABLE public.dbworkload_syncup (
    thread INT8 NOT NULL,
    thread_starting_loop BOOL NULL DEFAULT false,
    CONSTRAINT dbworkload_syncup_pkey PRIMARY KEY (thread ASC)
);
```
Run the workload from a server that has access to the database. You'll need dbworkload installed on that server.
```shell
nohup dbworkload run -w pay_later.py --uri $CRDB_ADVANCED_URL -c 40 -d 1800 > dbworkload-run.log 2>&1 &
```
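To confirm the workload is actually reaching the cluster (a quick verification step I add here, not part of the original runbook), you can list in-flight queries from a SQL shell:

```sql
-- Queries currently executing anywhere in the cluster; the dbworkload
-- statements should appear here while the run is active.
SHOW CLUSTER STATEMENTS;
```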
Due to the way that dbworkload runs, the lengthy "setup" step is included in the final averaged output, so that final average cannot be used. Instead, I collect the individual outputs manually and compute the averages myself.
The pay-later workload is designed to simulate a small portion of the load test run by a customer. It performs the following 19 statements (9 select, 10 insert); a hypothetical sketch of the first lookup follows the list:
- get_customer_by_email
- get_account_events
- get_account_status
- get_account_and_events_1yr
- get_ledger_details
- get_ledger_total
- get_merchant_info
- get_chargebacks
- get_ledger_balances_by_year
- put_user_and_details (inserts 1 user record, 1 account record, 5 accountevent records and 3 ledger records)
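For orientation, here is a hypothetical sketch of the first statement; the real query is defined in pay_later.py and its shape may differ:

```sql
-- Hypothetical sketch of get_customer_by_email; the actual statement
-- lives in pay_later.py. The unique index on "billing_email_id" makes
-- this a single-row index lookup.
SELECT u.*
FROM "user" AS u
WHERE u.billing_email_id = $1;
```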
