PT-2237 - fixed PostgreSQL DB logs collection (Reopen)#1099

Open
BON4 wants to merge 24 commits into 3.x from PT-2237_fix_pg_log_collector

Conversation

Contributor

@BON4 commented Mar 31, 2026

What was added:

  • To keep changes minimal, the existing test TestIndividualFiles was modified to cover the new case.
    The test was refactored because the previous implementation couldn't handle files with variable names:
    in /pgdata/<cluster_name>/pg_log, logs are named by day of the week, so filenames are not known until runtime.

  • A new function getAllFilesFromDirectory was added to dumper.go.
    Its purpose is the same as the existing getIndividualFiles, but it supports extracting multiple files from a directory. This is useful because pg log filenames vary and there can be multiple log files.

  • The contributed code is licensed under GPL v2.0
  • Contributor Licence Agreement (CLA) is signed
  • util/update-modules has been run
    (/lib has not been changed)
  • Documentation updated
  • Test suite updated

BON4 added 18 commits January 6, 2026 16:15
This fix ensures `pt-k8s-debug-collector` collects PostgreSQL database
logs, which are stored in `/pgdata/<cluster_name>/pg_log`.
The `TestIndividualFiles` test was refactored and extended to cover the
new case introduced by this feature.
This refactor replaces all of the `kubectl` CLI calls
with the Go SDK for Kubernetes. Additionally, the dumper now has a new
structure, a new logger, tar file path control, and a multithreaded
approach for downloading and exporting files from multiple pods.
…cona-toolkit into PT-2237_fix_pg_log_collector
Collaborator

@svetasmirnova left a comment


pgo test fails:

=== RUN   TestCollectorRunner/Operator_pgo/TestIndividualFiles/Resource_pgo
    main_test.go:401: 
        	Error Trace:	/home/sveta/src/percona/percona-toolkit/src/go/pt-k8s-debug-collector/main_test.go:401
        	            				/home/sveta/go/pkg/mod/github.com/stretchr/testify@v1.11.1/suite/suite.go:115
        	Error:      	Received unexpected error:
        	            	exit status 2
        	Test:       	TestCollectorRunner/Operator_pgo/TestIndividualFiles/Resource_pgo
    main_test.go:403: 
        	Error Trace:	/home/sveta/src/percona/percona-toolkit/src/go/pt-k8s-debug-collector/main_test.go:403
        	            				/home/sveta/go/pkg/mod/github.com/stretchr/testify@v1.11.1/suite/suite.go:115
        	Error:      	Preprocessor Check
        	Test:       	TestCollectorRunner/Operator_pgo/TestIndividualFiles/Resource_pgo
        	Messages:   	test pgo_pg_logs_exist
        	            	resource:pgo
        	            	namespace: pgo
        	            	output is not as expected
        	            	Output: 
        	            	Wanted: [.log]
=== RUN   TestCollectorRunner/Operator_pgo/TestIndividualFiles/Resource_auto
    main_test.go:401: 
        	Error Trace:	/home/sveta/src/percona/percona-toolkit/src/go/pt-k8s-debug-collector/main_test.go:401
        	            				/home/sveta/go/pkg/mod/github.com/stretchr/testify@v1.11.1/suite/suite.go:115
        	Error:      	Received unexpected error:
        	            	exit status 2
        	Test:       	TestCollectorRunner/Operator_pgo/TestIndividualFiles/Resource_auto
    main_test.go:403: 
        	Error Trace:	/home/sveta/src/percona/percona-toolkit/src/go/pt-k8s-debug-collector/main_test.go:403
        	            				/home/sveta/go/pkg/mod/github.com/stretchr/testify@v1.11.1/suite/suite.go:115
        	Error:      	Preprocessor Check
        	Test:       	TestCollectorRunner/Operator_pgo/TestIndividualFiles/Resource_auto
        	Messages:   	test pgo_pg_logs_exist
        	            	resource:auto
        	            	namespace: pgo
        	            	output is not as expected
        	            	Output: 
        	            	Wanted: [.log]

Since we will deprecate pgo soon as per EOL note here: https://docs.percona.com/percona-operator-for-postgresql/1.6.0/, you can simply remove this test.


Copilot AI left a comment


Pull request overview

This PR fixes PostgreSQL log collection in the Kubernetes debug collector by enabling extraction of all files from a log directory (where filenames vary at runtime), and updates the integration test to validate that at least one PostgreSQL .log file is included in the produced dump archive.

Changes:

  • Added support for dumping an entire directory of “individual files” (used for PostgreSQL pg_log) instead of only known, fixed filenames.
  • Added pg log directory mappings for PG v1 (“pgo”) and PG v2 (“pgv2”) resources.
  • Refactored TestIndividualFiles to run namespace-specific assertions and added checks for PostgreSQL log presence.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

Summary per file:

  • src/go/pt-k8s-debug-collector/main_test.go: Refactors individual-file tests and adds checks that PG .log files exist in the archive.
  • src/go/pt-k8s-debug-collector/dumper/resources.go: Registers PostgreSQL log directories (pg_log) for PG v1 and PG v2 dump collection.
  • src/go/pt-k8s-debug-collector/dumper/kube_utils.go: Introduces streaming tar execution helpers used for directory/file extraction.
  • src/go/pt-k8s-debug-collector/dumper/individual_files.go: Reworks individual file dumping to support env substitution and directory extraction via tar streaming.
  • src/go/pt-k8s-debug-collector/dumper/dumper.go: Normalizes CR type handling when matching pods and dumping summaries/files.


Comment on lines +194 to +197
    func (d *Dumper) executeInPodStream(ctx context.Context, command []string, pod corev1.Pod, container string, stdin io.Reader) (io.ReadCloser, bytes.Buffer, error) {
    	stdinFlag := stdin != nil
    	var stderr bytes.Buffer


Copilot AI Apr 2, 2026


executeInPodStream returns stderr as a bytes.Buffer value while the goroutine passed &stderr to exec.StreamWithContext may still be writing to it. Copying/returning a bytes.Buffer concurrently with writes is a data race and can also cause callers to see incomplete/empty stderr. Consider allocating stderr as *bytes.Buffer (or io.Writer) and returning that pointer instead of copying the buffer.

Contributor Author


I've removed the stderr buffer completely. Error logging inside the streaming goroutine will be enough.

Comment on lines +220 to +229
    defer pw.Close()

    if err := exec.StreamWithContext(ctx, remotecommand.StreamOptions{
    	Stdin:  stdin,
    	Stdout: pw,
    	Stderr: &stderr,
    	Tty:    false,
    }); err != nil && !errors.Is(err, context.Canceled) {
    	log.Errorf("error while streaming files from pod: %s", err.Error())
    }

Copilot AI Apr 2, 2026


In the streaming goroutine, exec.StreamWithContext errors are only logged and the pipe is closed with pw.Close(), so callers may see a clean EOF and treat failures (e.g., missing dir/file, tar exit != 0) as success. Consider propagating the stream error via pw.CloseWithError(err) and/or returning an error channel / waiting for the goroutine so tarFromPod/callers can reliably detect failures.

Suggested change

Before:

    defer pw.Close()

    if err := exec.StreamWithContext(ctx, remotecommand.StreamOptions{
    	Stdin:  stdin,
    	Stdout: pw,
    	Stderr: &stderr,
    	Tty:    false,
    }); err != nil && !errors.Is(err, context.Canceled) {
    	log.Errorf("error while streaming files from pod: %s", err.Error())
    }

After:

    err := exec.StreamWithContext(ctx, remotecommand.StreamOptions{
    	Stdin:  stdin,
    	Stdout: pw,
    	Stderr: &stderr,
    	Tty:    false,
    })
    if err != nil && !errors.Is(err, context.Canceled) {
    	log.Errorf("error while streaming files from pod: %s", err.Error())
    	_ = pw.CloseWithError(err)
    	return
    }
    _ = pw.Close()

Contributor Author


Done.

Comment on lines +17 to +21
    func (d *Dumper) getContainerEnvMap(ctx context.Context, namespace, podName, containerName string) (map[string]string, error) {
    	pod, err := d.clientSet.CoreV1().Pods(namespace).Get(ctx, podName, metav1.GetOptions{})
    	if err != nil {
    		return nil, err
    	}

Copilot AI Apr 2, 2026


getContainerEnvMap does a live Pods().Get(...) call per pod/container even though exportJob already includes a full corev1.Pod (with Spec.Containers[].Env) from the informer/lister. This adds avoidable API-server load during dumps; consider parsing env vars from job.Pod.Spec instead of re-fetching the Pod.

Contributor Author


Done

Comment on lines +146 to +149
    dst := d.PodIndividualFilesPath(
    	job.Pod.Namespace,
    	job.Pod.Name,
    	path.Join(tarFolder, path.Base(hdr.Name)),

Copilot AI Apr 2, 2026


processDir flattens all extracted files into tarFolder using only path.Base(hdr.Name). If the directory contains nested subdirectories or duplicate basenames, files can be overwritten and structure is lost. Consider preserving the relative path from the tar header (e.g., clean hdr.Name and join it under tarFolder) to avoid collisions.

Suggested change

Before:

    dst := d.PodIndividualFilesPath(
    	job.Pod.Namespace,
    	job.Pod.Name,
    	path.Join(tarFolder, path.Base(hdr.Name)),

After:

    // Preserve the relative path from the tar header while ensuring it
    // cannot escape the intended destination directory.
    relPath := path.Clean(hdr.Name)
    // Normalize common tar prefixes like "./"
    relPath = strings.TrimPrefix(relPath, "./")
    // Prevent path traversal outside tarFolder by stripping leading "../"
    for strings.HasPrefix(relPath, "../") {
    	relPath = strings.TrimPrefix(relPath, "../")
    }
    // Skip entries that do not resolve to a meaningful relative path
    if relPath == "" || relPath == "." {
    	continue
    }
    dst := d.PodIndividualFilesPath(
    	job.Pod.Namespace,
    	job.Pod.Name,
    	path.Join(tarFolder, relPath),

Contributor Author


Done

Comment on lines +121 to +129
func (d *Dumper) processDir(
ctx context.Context,
job exportJob,
container, tarFolder, dir string,
) error {

tr, rc, _, err := d.tarFromPod(ctx, job.Pod, container, "-C", dir, ".")
if err != nil {
return err

Copilot AI Apr 2, 2026


PR description mentions adding a getAllFilesFromDirectory function in dumper.go, but the implementation here introduces processDir/tarFromPod instead. Consider updating the PR description (or naming) to match the actual approach so future maintainers can find the relevant code quickly.

