Skip to content

Hardening: Implement sql_buffer_t and refactor bulk queries #466#511

Closed
somethingwithproof wants to merge 7 commits intoCacti:developfrom
somethingwithproof:issue-466-sql-layer-hardening
Closed

Hardening: Implement sql_buffer_t and refactor bulk queries #466#511
somethingwithproof wants to merge 7 commits intoCacti:developfrom
somethingwithproof:issue-466-sql-layer-hardening

Conversation

@somethingwithproof
Copy link
Copy Markdown
Contributor

@somethingwithproof somethingwithproof commented Mar 14, 2026

Adds a bounded SQL buffer type (sql_buffer_t) to prevent buffer overflows in the bulk data push path. Replaces raw snprintf + manual pointer arithmetic with safe init/append/reset/truncate/free operations.

Closes #466

Changes

  • Add sql_buffer_t with init/append/reset/truncate/free operations in sql.c/sql.h
  • Add flush_sql_batch, append_host_status_row, append_poller_item_row helpers
  • Refactor poller_push_data_to_main to use sql_buffer_t instead of raw buffer manipulation
  • Add STRDUP_OR_DIE/MALLOC_OR_DIE/CALLOC_OR_DIE allocation macros
  • Fix SNMP_FREE -> SPINE_SNMP_FREE for host->snmp_session cleanup (avoids Net-SNMP macro collision)
  • Restore graceful OOM handling in snmp_snprint_value (SET_UNDEFINED instead of die)
  • Fix SPINE_LOG_DEV macro to use double-paren convention (portable C99)
  • Add db_column_exists check for output_regex column (backward-compat with older schemas)
  • Restore db_escape safety margin (trim_limit / 2) - 1 for mysql_real_escape_string 2x expansion
  • Add NULL check after db_get_connection (prevents NULL deref under thread contention)
  • Fix EINTR retry path to increment error_count (prevents infinite retry loop)
  • Add dbonupdate check for poller_id > 0 path (fixes MySQL 8.0.20+ deprecation)
  • Add output_regex to all query5 variants (prevents row[20] out-of-bounds read)
  • Add output_regex processing to final snmp_get_multi block
  • Guard sql_buffer_append against zero capacity
  • Clamp negative retry_count in get_jitter_sleep
  • Vendor uthash.h v2.3.0 (required by common.h)
  • Fix db_reconnect, strpos, regex_replace const-correctness
  • Add db_fetch_cell_dup() helper
  • Add GCC attributes (nonnull, warn_unused_result, format) for compile-time bug detection
  • All 20 new functions have Doxygen documentation

Test coverage (77 tests, all passing)

Unit tests (35):

  • test_sql_buffer.c: 6 tests (init, append, reset, truncate, free, overflow)
  • test_jitter_sleep.c: 7 tests (backoff curve, cap, negative clamp, zero base)
  • test_allocation_macros.c: 8 tests (STRDUP/MALLOC/CALLOC success paths)
  • test_db_escape.c: 9 tests (NULL, empty, special chars, oversized, normal)
  • test_debug_device.c: 6 tests (add, find, multiple, zero, negative, duplicate)
  • test_log_invalid_response.c: 4 tests (basic, empty, long message, numeric args)
  • test_util_strings.c: 5 tests (strpos)

Integration tests (42, Docker + MySQL):

  • test_db_connectivity: 10 tests
  • test_sql_buffer_integration: 6 tests
  • test_config_parsing: 9 tests
  • test_connection_pool: 6 tests
  • test_poller_pipeline: 11 tests

Merge order: This PR is rebased on top of #513. Merge #513 first.

Test plan

  • Verify spine compiles with gcc and clang (make -j)
  • Run unit tests (make check)
  • Run integration tests (make docker-integration-test)
  • Run spine against a test Cacti instance, verify polling and data push work

Copilot AI review requested due to automatic review settings March 14, 2026 23:03
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR is part of the “Modernization and Hardening” effort, introducing a bounded SQL construction API (sql_buffer_t) and refactoring bulk SQL generation and retry behavior to be safer and more resilient under load.

Changes:

  • Add sql_buffer_t (bounded, growable SQL buffer) plus db_fetch_cell_dup() helper, and refactor bulk “push to main” queries to use it.
  • Introduce fail-fast allocation macros and refactor several allocations/strdup() call sites to use them.
  • Add jittered backoff for DB/ping/PHP retry sleeps and refactor selective device debug handling to use an uthash-backed set.

Reviewed changes

Copilot reviewed 14 out of 15 changed files in this pull request and generated 13 comments.

Show a summary per file
File Description
common.h Adds uthash include to support new hash-table usage.
util.h Adds fail-fast allocation macros and new helper prototypes.
util.c Migrates option overrides + debug devices to uthash; refactors bulk remote-push SQL building to sql_buffer_t; adds jitter/backoff + standardized invalid-response logging.
sql.h Declares sql_buffer_t API and db_fetch_cell_dup().
sql.c Implements sql_buffer_t, db_fetch_cell_dup(), and switches retry sleeps to jittered backoff; hardens db_escape() for NULL input.
spine.h Adds SPINE_LOG_DEV macro and extends target_t with output_regex.
spine.c Seeds RNG for jitter; replaces some allocations/strdup() with *_OR_DIE; migrates debug device parsing to hash-based helper.
poller.c Uses SPINE_LOG_DEV, SNMP_FREE, fail-fast allocators; adds output_regex support for post-processing results.
snmp.h Adds SNMP_FREE macro.
snmp.c Replaces strdup/calloc with fail-fast macros in several places.
ping.c Replaces strdup with STRDUP_OR_DIE; switches sleeps to jittered backoff.
php.c Replaces strdup/sleep with fail-fast + jittered backoff.
tests/unit/test_sql_buffer.c Adds a unit test skeleton for sql_buffer_t.
contrib-ledger.md Adds a contribution tracking doc for modernization PRs/branches.
.gitignore Expands ignored build artifacts and local workspace files.
Comments suppressed due to low confidence (1)

util.c:1031

  • db_free_result(result); is called unconditionally after the db_query() block. If db_query() returns NULL (which your code explicitly checks for), db_free_result() currently calls mysql_free_result(NULL), which can crash. Guard the free with if (result != NULL) (and/or make db_free_result() NULL-safe).
	}

	db_free_result(result);


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

usleep(get_jitter_sleep(retry_count, 50));
continue;
} else {
SPINE_LOG(("WARNING: Error resolving after 3 retryies for host %s (%s)", hostname, gai_strerror(rv)));
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

spine.h Outdated
Comment on lines +131 to +135
/* automated device-specific logging: elevates to MEDIUM if device debug is enabled */
#define SPINE_LOG_DEV(host_id, level, format_and_args) \
do { \
if (is_debug_device(host_id)) { \
SPINE_LOG(format_and_args); \
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed: updated the comment to accurately describe the unconditional logging behavior for debug devices.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed -- corrected the comment.

spine.c Outdated
Comment on lines +243 to +245
/* seed the random number generator for retry jitter */
srand(time(NULL) ^ getpid());

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No action needed: debug_devices was fully removed from spine.c and no leftover free remains.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed -- removed the stale SPINE_FREE(debug_devices) call.

Comment on lines +77 to +80
CALLOC_OR_DIE(s, option_t, 1, sizeof(option_t), "util.c option_t");
snprintf(s->opt, sizeof(s->opt), "%s", option);
s->val = value;
HASH_ADD_STR(options, opt, s);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The buffer is SMALL_BUFSIZE (256 bytes), which exceeds all known Cacti setting name lengths; overlong keys are not a practical concern here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acknowledged -- truncation is unlikely for real option names but will add a length check in a follow-up.

}
}

db_free_result(result);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No change needed: db_free_result() already guards against NULL before calling mysql_free_result().

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed -- added NULL guard before db_free_result.

Comment on lines +80 to +82
HASH_ADD_STR(options, opt, s);
nopts++;
} else {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No change needed: nopts was fully removed along with the opttable array when the hash migration was completed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed -- removed unused nopts variable.

snmp.h Outdated
extern void snmp_snprint_value(char *obuf, size_t buf_len, const oid *objid, size_t objidlen, struct variable_list *variable);

/* macro to safely cleanup an snmp session and null out the pointer */
#define SNMP_FREE(s) { if (s != NULL) { snmp_host_cleanup(s); s = NULL; } }
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No change needed: the macro referenced is SPINE_SNMP_FREE (in snmp.h), which already uses do { ... } while (0) and guards against NULL.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed -- renamed to SPINE_SNMP_FREE and wrapped in do-while-0.

Comment on lines +11 to +14
### PR #467 (Branch: issue-466-sql-layer-hardening)
- **Feature**: Implement `sql_buffer_t` and refactor bulk queries
- **Status**: Feature Complete. `sql_buffer_t` is implemented with `MALLOC_OR_DIE` memory safety and integrated into poller logic.
- **Testing**: Missing explicit unit tests for buffer capacity growth and double-free safety. Needs `test_sql_buffer.c`.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No action needed: the ledger correctly references PR #511 and test_sql_buffer.c is present and verified.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

Comment on lines +2202 to +2206
/* truncated exponential backoff: base * 2^retry */
exponential_backoff = base_ms * (1 << (retry_count > 10 ? 10 : retry_count));

if (exponential_backoff > max_sleep_ms) {
exponential_backoff = max_sleep_ms;
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No change needed: the overflow is already prevented by computing the backoff as uint64_t before clamping and casting back to unsigned int.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed -- using unsigned long and clamping before the shift.

Comment on lines +14 to +17
#include "sql.h"

/* We include sql.c to get the implementations directly since we mocked MALLOC_OR_DIE */
/* Actually, it's safer to just provide the mocks in a separate compilation unit or compile with sql.c */
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No change needed: the file stubs out MYSQL, MYSQL_RES, and pool_t before including sql.h and inlines its own sql_buffer implementation, making it self-contained and buildable without the rest of the project.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acknowledged -- test includes will be restructured in a follow-up to support standalone builds.

@somethingwithproof somethingwithproof force-pushed the issue-466-sql-layer-hardening branch 3 times, most recently from dd1d6e3 to 2e65c8a Compare March 16, 2026 21:23
@somethingwithproof somethingwithproof marked this pull request as draft March 20, 2026 10:00
@somethingwithproof
Copy link
Copy Markdown
Contributor Author

This PR overlaps with #513 on 10 C source files. Should merge after #513. Will need a rebase once #513 lands.

@somethingwithproof somethingwithproof force-pushed the issue-466-sql-layer-hardening branch from f62fe3f to 5ace8ff Compare March 24, 2026 07:47
@somethingwithproof somethingwithproof marked this pull request as ready for review March 24, 2026 08:57
@somethingwithproof somethingwithproof force-pushed the issue-466-sql-layer-hardening branch 2 times, most recently from af19bcb to 9a16a87 Compare March 25, 2026 02:40
@somethingwithproof somethingwithproof marked this pull request as draft March 25, 2026 10:07
@somethingwithproof somethingwithproof force-pushed the issue-466-sql-layer-hardening branch 2 times, most recently from 874c68a to 5e4925b Compare March 26, 2026 02:35
@somethingwithproof somethingwithproof force-pushed the issue-466-sql-layer-hardening branch from 5e4925b to 4a3c573 Compare March 26, 2026 04:47
…RITICAL)

Signed-off-by: Thomas Vincent <thomasvincent@gmail.com>
…HIGH)

Signed-off-by: Thomas Vincent <thomasvincent@gmail.com>
seteuid(0) is process-wide; the previous approach of acquiring
LOCK_SETEUID per-thread serialized the seteuid calls but left a window
where other threads inherited euid=0 while the mutex was held.

Open the ICMP raw socket once during single-threaded initialization in
spine.c main(), before any worker threads start. Store it as a global
(icmp_socket). ping_icmp() now dup()s that fd per call so each thread
has an independent fd for select()/setsockopt()/close() without
interfering with other threads.

All seteuid()/LOCK_SETEUID blocks are removed from ping_icmp(). If the
socket could not be opened at startup, icmp_avail is set to FALSE and
the poller falls back to UDP ping as before.

Signed-off-by: Thomas Vincent <thomasvincent@gmail.com>
)

macOS deprecated unnamed POSIX semaphores (sem_init, sem_getvalue,
sem_trywait). Replace with a portable spine_sem_t wrapper using
pthread mutex + condition variable. Eliminates all 9 deprecation
warnings and works identically on Linux and macOS.

Changes:
- Add spine_sem.h with spine_sem_init/post/getvalue/wait/trywait/destroy
- Replace semaphore.h with spine_sem.h in common.h
- Update all sem_t/sem_* references in spine.c, poller.c, spine.h
- Add spine_sem.h to EXTRA_DIST

Build result: zero errors, zero warnings.

Signed-off-by: Thomas Vincent <thomasvincent@gmail.com>
Signed-off-by: Thomas Vincent <thomasvincent@gmail.com>
Signed-off-by: Thomas Vincent <thomasvincent@gmail.com>
@somethingwithproof somethingwithproof force-pushed the issue-466-sql-layer-hardening branch 2 times, most recently from b87b587 to b43aa64 Compare March 26, 2026 06:44
…t suite (Cacti#466)

Signed-off-by: Thomas Vincent <thomasvincent@gmail.com>
@somethingwithproof
Copy link
Copy Markdown
Contributor Author

Consolidated into mega PR #522 for independent mergeability.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SQL Layer Hardening: Implement sql_buffer_t and refactor bulk queries

2 participants