FluentBit 4.2.2 Release #228

Merged
nr-rkallempudi merged 4 commits into main from NR-517619-rkallempudi on Feb 16, 2026

Contributor

@nr-rkallempudi nr-rkallempudi commented Feb 4, 2026

PR Summary: E2E Test Optimization & Fluent Bit 4.2.2 Upgrade

🎯 Overview

This PR optimizes E2E test execution by parallelizing tests across OS families and upgrades Fluent Bit to version 4.2.2. The changes significantly improve test performance and reliability while adding support for newer SLES versions and fixing critical issues with Amazon Linux 2023.


📦 Key Changes

1. E2E Test Performance Optimization

Files: .github/workflows/run_e2e_tests.yml, .github/workflows/run_prerelease.yml, ansible/provision-and-execute-tests/Makefile

  • Parallelized provisioning and testing by OS family (APT, YUM, SLES) instead of sequential execution
  • Split into 3 parallel provision jobs → 3 parallel test jobs → merge results
  • Performance impact: Tests now run concurrently instead of sequentially, significantly reducing total execution time
  • Added new Makefile targets for parallel execution:
    • prerelease-linux-provision-{apt,yum,sles}
    • prerelease-linux-test-{apt,yum,sles}
    • prerelease-linux-merge-all
  • Each OS family (debian/ubuntu, amazonlinux/centos/rocky, sles) provisions and tests independently

2. Fluent Bit Version Upgrade

File: versions/common.yml

  • Upgraded from 4.2.0 → 4.2.2
  • Added New Relic Fluent Bit Output Plugin configuration:
    • nrFbOutputPluginVersion: 3.4.0
    • Optional nrFbOutputPluginTag for custom/beta release tags

3. New SLES Distribution Support

Files: versions/sles_15.6.yml, versions/sles_15.7.yml, versions/sles_15.4.yml, versions/sles_15.5.yml

  • Added: SLES 15.6 (ami-06efc45c7e3e16e77) and SLES 15.7 (ami-0f409b75e8c70d0e4)
  • Updated AMIs: SLES 15.4 and 15.5 with new AMI IDs
  • Renamed: All version files from .ymlx → .yml extension

4. Amazon Linux 2023 Critical Fixes

File: ansible/provision-and-execute-tests/playbook-run-tests.yml

Multiple fixes specific to AL2023's stricter security policies:

  • SELinux Permissive Mode: Runs setenforce 0 so SELinux does not block Fluent Bit's journal access
  • systemd-journal Group Access: Adds fluent-bit user to systemd-journal group (mandatory on AL2023)
  • Journal Directory Permissions: Recursively sets /var/log/journal to mode 02755
  • Service Restarts: Restarts both systemd-journald and fluent-bit to apply permissions
  • Increased SSM Timeout: From 600s → 900s for ARM instances which can be slower
  • Added Retry Logic: 10 retries for SSM connections with extended timeouts

5. Port Conflict Prevention

File: ansible/provision-and-execute-tests/playbook-run-tests.yml:146-192

Added critical cleanup block to prevent "bind: Address already in use" errors:

  • Stops Fluent Bit service before configuration
  • Force kills any remaining fluent-bit processes
  • Waits for TCP ports (5140, 5170) to be released (handles TCP TIME_WAIT)
  • Added port initialization wait with 120s timeout
  • Includes debug logging for service status and journal logs on failure
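
The playbook does this cleanup in Ansible. As an illustration of the same idea from the test side, here is a minimal JavaScript sketch that waits until a TCP port such as 5140 or 5170 can be bound again. The helper name `waitForPortFree` and its options are hypothetical and are not part of this PR.

```javascript
const net = require('net');

// Hypothetical helper (not from this PR): resolves once `port` can be bound
// on `host`, rejects if it is still busy after `timeoutMs`.
function waitForPortFree(port, { host = '127.0.0.1', timeoutMs = 120000, intervalMs = 500 } = {}) {
  const deadline = Date.now() + timeoutMs;
  return new Promise((resolve, reject) => {
    const attempt = () => {
      const probe = net.createServer();
      probe.once('error', (err) => {
        // Anything other than "address in use" is unexpected: fail fast.
        if (err.code !== 'EADDRINUSE') return reject(err);
        if (Date.now() >= deadline) {
          return reject(new Error(`port ${port} still in use after ${timeoutMs} ms`));
        }
        setTimeout(attempt, intervalMs);
      });
      // Binding worked, so the port is free: release it and report success.
      probe.once('listening', () => probe.close(() => resolve(port)));
      probe.listen(port, host);
    };
    attempt();
  });
}
```

The 120000 ms default mirrors the 120s timeout mentioned above. The probe only checks TCP binds on one address, so it is a rough signal rather than a replacement for the playbook's checks.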

6. Integration Test Improvements

Files: integration-tests/test-suite/tcp.test.js, integration-tests/test-suite/syslog.test.js

  • Promisified socket operations: TCP and UDP writes now return promises
  • Proper error handling: Socket errors now reject promises and fail tests immediately
  • Explicit port parsing: parseInt(port) to prevent connection issues
  • Awaited writes: Tests now await socket operations before checking logs
  • Error visibility: Console logs socket errors with context
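
A minimal sketch of what promisified writes along these lines could look like, using Node's built-in `net` and `dgram` modules. The helper names `sendTcp` and `sendUdp` are assumptions for illustration; the actual code in tcp.test.js and syslog.test.js may be structured differently.

```javascript
const net = require('net');
const dgram = require('dgram');

// Hypothetical helper: write one message over TCP and resolve once it is flushed.
function sendTcp(host, port, message) {
  return new Promise((resolve, reject) => {
    // Ports read from config or env are often strings, so parse explicitly.
    const socket = net.createConnection({ host, port: parseInt(port, 10) });
    socket.once('error', (err) => {
      console.error(`TCP write to ${host}:${port} failed:`, err.message);
      reject(err);
    });
    socket.once('connect', () => {
      // end() sends the payload and closes our side; the callback runs on 'finish'.
      socket.end(message, () => resolve());
    });
  });
}

// Hypothetical helper: send one UDP datagram and settle from the send callback.
function sendUdp(host, port, message) {
  return new Promise((resolve, reject) => {
    const socket = dgram.createSocket('udp4');
    socket.send(Buffer.from(message), parseInt(port, 10), host, (err) => {
      socket.close();
      if (err) {
        console.error(`UDP write to ${host}:${port} failed:`, err.message);
        return reject(err);
      }
      resolve();
    });
  });
}
```

A test would then `await sendTcp(host, port, line)` before querying for the log, so a refused connection fails the test at the write instead of surfacing later as a missing log entry.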

7. Results Processor Enhancements

File: integration-tests/controller-scripts/resultsProcessor.js

  • XML validation: Checks for empty files and valid XML structure before processing
  • Partial report handling: Skips partial merged reports (apt.xml, yum.xml, sles.xml)
  • Enhanced error handling: Try-catch blocks with detailed error messages
  • Better logging: Progress logs showing test counts and failures
  • Exit codes: Process exits with code 1 on fatal errors
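
The pre-processing checks listed above could look roughly like the following. This is a sketch under assumptions: the function name `checkReport` and the return shape are hypothetical, and the structural check is a lightweight regex test rather than whatever parser resultsProcessor.js actually uses.

```javascript
const path = require('path');

// Partial report names mentioned in this PR; they are merged in a separate step.
const PARTIAL_REPORTS = new Set(['apt.xml', 'yum.xml', 'sles.xml']);

// Hypothetical pre-check: decide whether a report file should be processed.
// Returns { ok: true } or { ok: false, reason }.
function checkReport(fileName, content) {
  if (PARTIAL_REPORTS.has(path.basename(fileName))) {
    return { ok: false, reason: 'partial report, merged separately' };
  }
  const xml = (content || '').trim();
  if (xml.length === 0) {
    return { ok: false, reason: 'empty file' };
  }
  // Lightweight structural check only; a real XML parser would be stricter.
  const hasRoot = /<testsuites?\b/.test(xml);
  const hasClose = /(<\/testsuites?>|\/>)\s*$/.test(xml);
  if (!xml.startsWith('<') || !hasRoot || !hasClose) {
    return { ok: false, reason: 'not a JUnit-style XML document' };
  }
  return { ok: true };
}
```

Returning a reason instead of throwing lets the caller log why a file was skipped and keep processing the remaining reports.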

8. New Playbook for Merging Partial Results

File: ansible/provision-and-execute-tests/playbook-merge-partial-results.yml

  • Downloads partial test reports from GitHub release (apt.xml, yum.xml, sles.xml)
  • Merges them into final combined report
  • Uploads final report back to GitHub release
  • Enables parallel test execution with centralized result aggregation
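
The merge step itself lives in an Ansible playbook. To illustrate the idea of combining partial JUnit-style reports, here is a deliberately simple JavaScript sketch. `mergeReports` is a hypothetical name, it assumes every partial report has a `<testsuites>` root, and it drops any totals stored as root attributes.

```javascript
// Hypothetical merge (not the playbook's implementation): concatenates the
// contents of each report's <testsuites> root under one new <testsuites> root.
function mergeReports(xmlReports) {
  const suites = [];
  for (const xml of xmlReports) {
    // Capture everything between the opening and the last closing root tag.
    const match = /<testsuites\b[^>]*>([\s\S]*)<\/testsuites>/.exec(xml);
    if (match) suites.push(match[1].trim());
  }
  return `<?xml version="1.0" encoding="UTF-8"?>\n<testsuites>\n${suites.join('\n')}\n</testsuites>\n`;
}
```

Reports without a `<testsuites>` root are skipped silently here; a production merge would more likely use an XML parser and recompute the aggregate counts.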

9. GitHub Workflow Improvements

Files: .github/workflows/pull_request.yml, .github/workflows/merge_to_main.yml, .github/workflows/new_relic.yml

  • Added permissions: Explicit contents: write and pull-requests: read
  • Plugin version tracking: Extracts and passes nrFbOutputPluginVersion and nrFbOutputPluginTag from matrix
  • Default tag logic: Falls back to v{version} if custom tag not specified
  • Conditional SLES execution: SLES jobs only run if sles_matrix != '[]'
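
The default tag logic is implemented in the workflows. Expressed as a small JavaScript function for clarity (the name `resolvePluginTag` is hypothetical):

```javascript
// Hypothetical helper mirroring the fallback described above:
// use the custom tag when one is given, otherwise fall back to v{version}.
function resolvePluginTag(version, tag) {
  const custom = (tag || '').trim();
  return custom !== '' ? custom : `v${version}`;
}
```

With nrFbOutputPluginVersion set to 3.4.0 and no nrFbOutputPluginTag, this yields v3.4.0.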

10. AWS Configuration

File: ansible/provision-and-execute-tests/Makefile:4-5

  • Adaptive retry mode: AWS_RETRY_MODE=adaptive
  • Max attempts: AWS_MAX_ATTEMPTS=10
  • Handles SSM throttling gracefully during parallel execution

🔧 Technical Details

Parallel Execution Flow

┌─────────────────────────────────────────────────────────────┐
│                    Spin Up Test Executors                   │
└──────────────────────┬──────────────────────────────────────┘
                       │
        ┌──────────────┼──────────────┐
        ▼              ▼              ▼
┌──────────────┐ ┌──────────┐ ┌──────────────┐
│ Provision    │ │ Provision│ │ Provision    │
│ APT          │ │ YUM      │ │ SLES         │
│ (deb, ubu)   │ │ (al,cent)│ │ (sles)       │
└──────┬───────┘ └────┬─────┘ └──────┬───────┘
       │              │              │
       ▼              ▼              ▼
┌──────────────┐ ┌──────────┐ ┌──────────────┐
│ Test APT     │ │ Test YUM │ │ Test SLES    │
│ Distros      │ │ Distros  │ │ Distros      │
└──────┬───────┘ └────┬─────┘ └──────┬───────┘
       │              │              │
       └──────────────┼──────────────┘
                      ▼
              ┌──────────────┐
              │ Merge All    │
              │ Results      │
              └──────┬───────┘
                     ▼
              ┌──────────────┐
              │ Report       │
              │ Results      │
              └──────────────┘
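
The flow above is a fan-out/fan-in. A minimal JavaScript sketch of that shape, assuming hypothetical `provision`, `test`, and `merge` callbacks (the real pipeline consists of GitHub Actions jobs and Makefile targets, not Node code):

```javascript
// Hypothetical orchestration sketch: each OS family provisions and then tests
// on its own, and the merge step waits for every family to finish.
async function runPipeline(families, { provision, test, merge }) {
  const reports = await Promise.all(
    families.map(async (family) => {
      await provision(family); // a family's tests start only after its own provisioning
      return test(family);     // resolves to that family's partial report
    })
  );
  return merge(reports);
}
```

Note that `Promise.all` rejects as soon as one family fails. If partial results should still be merged when one family fails, `Promise.allSettled` would be the closer analogue.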

AL2023 Permission Requirements

Amazon Linux 2023 enforces stricter security policies compared to AL2:

  1. SELinux: Enabled by default; when enforcing, it can block journal access
  2. systemd-journal group: Membership is mandatory (not optional)
  3. Journal persistence: Requires explicit directory setup

Without these fixes, Fluent Bit cannot read systemd logs on AL2023.


📝 Migration Notes

If you need to run tests manually:

# Old way (sequential)
make ansible/provision-and-execute-tests/prerelease-linux PRE_RELEASE_NAME=test-release

# New way (parallel)
make ansible/provision-and-execute-tests/prerelease-linux-provision-apt PRE_RELEASE_NAME=test-release
make ansible/provision-and-execute-tests/prerelease-linux-provision-yum PRE_RELEASE_NAME=test-release
make ansible/provision-and-execute-tests/prerelease-linux-provision-sles PRE_RELEASE_NAME=test-release

# Then run tests
make ansible/provision-and-execute-tests/prerelease-linux-test-apt PRE_RELEASE_NAME=test-release TEST_REPORT_NAME=report.xml
make ansible/provision-and-execute-tests/prerelease-linux-test-yum PRE_RELEASE_NAME=test-release TEST_REPORT_NAME=report.xml
make ansible/provision-and-execute-tests/prerelease-linux-test-sles PRE_RELEASE_NAME=test-release TEST_REPORT_NAME=report.xml

# Finally merge
make ansible/provision-and-execute-tests/prerelease-linux-merge-all PRE_RELEASE_NAME=test-release TEST_REPORT_NAME=report.xml

@nr-rkallempudi nr-rkallempudi marked this pull request as ready for review February 4, 2026 17:30
@nr-rkallempudi nr-rkallempudi force-pushed the NR-517619-rkallempudi branch 8 times, most recently from 2531f97 to a385651 Compare February 5, 2026 10:38
@nr-rkallempudi nr-rkallempudi changed the title from "Test PR: triggering E2E tests with fluentbit output plugin: newrelic-fluent-bi…" to "FluentBit 4.2.2 Release" Feb 5, 2026
Contributor

maya-jha commented Feb 5, 2026

@copilot

Copilot AI commented Feb 5, 2026

@maya-jha I've opened a new pull request, #229, to work on those changes. Once the pull request is ready, I'll request review from you.

Contributor

maya-jha commented Feb 5, 2026

@copilot:review

Copilot AI commented Feb 5, 2026

@maya-jha I've opened a new pull request, #230, to work on those changes. Once the pull request is ready, I'll request review from you.

Comment thread .github/workflows/pull_request.yml
Comment thread versions/common.yml Outdated
@nr-rkallempudi nr-rkallempudi force-pushed the NR-517619-rkallempudi branch 2 times, most recently from 012146a to 1ba3b24 Compare February 6, 2026 02:54
Comment thread versions/sles_15.4.yml
  packages:
    - arch: x86_64
-     ami: ami-054bba390120adf1d
+     ami: ami-0c00d482c766145ab

Contributor

you were able to find an AMI for this?

Contributor Author

it is available in us-east-2

Name: suse-sles-15-sp4-sapcal-v20260124-hvm-ssd-x86_64
Description: SUSE Linux Enterprise Server 15 SP4 for SAP CAL (HVM, 64-bit, SSD Backed)
State: available
Architecture: x86_64
Platform: SUSE Linux
Owner: amazon (013907871322)
Public: Yes
Created: 2026-01-27
Deprecation Date: 2028-01-27
Root Volume: 10 GB gp3 EBS
Boot Mode: uefi-preferred

maya-jha previously approved these changes Feb 6, 2026

github-actions Bot commented Feb 7, 2026

Test Results

 1 files  ± 0   2 suites   - 2   1m 37s ⏱️ - 2m 9s
 7 tests ± 0   5 ✅ ± 0  2 💤 ±0  0 ❌ ±0 
14 runs   - 14  10 ✅  - 10  4 💤  - 4  0 ❌ ±0 

Results for commit 41316b1. ± Comparison against base commit 8245dd1.

♻️ This comment has been updated with latest results.

…t-output-3.3.1-beta-rhel8-test

triggering E2E tests with fluentbit output plugin: newrelic-fluent-bit-output-3.3.1-beta-rhel8-test

Update common.yml

update amis for sles

amis

removing tag as code already handled plugin version in case of missing tag
Comment thread .github/workflows/run_e2e_tests.yml Fixed
Comment thread .github/workflows/run_e2e_tests.yml Fixed
Comment thread .github/workflows/run_e2e_tests.yml Fixed
Comment thread .github/workflows/run_e2e_tests.yml Fixed
Comment thread .github/workflows/run_e2e_tests.yml Fixed
Comment thread .github/workflows/run_prerelease.yml Fixed
Comment thread .github/workflows/run_prerelease.yml Fixed
Comment thread .github/workflows/run_prerelease.yml Fixed
Comment thread .github/workflows/run_prerelease.yml Fixed
Comment thread .github/workflows/run_prerelease.yml Fixed
@nr-rkallempudi nr-rkallempudi force-pushed the NR-517619-rkallempudi branch 3 times, most recently from f1ac538 to 11eb62e Compare February 9, 2026 02:21
@nr-rkallempudi nr-rkallempudi force-pushed the NR-517619-rkallempudi branch 7 times, most recently from 5fb2d62 to b41c0f0 Compare February 11, 2026 04:45
Comment thread .github/workflows/run_prerelease.yml Fixed
@nr-rkallempudi nr-rkallempudi force-pushed the NR-517619-rkallempudi branch 3 times, most recently from 41316b1 to 3e63082 Compare February 11, 2026 14:18
Comment thread .github/workflows/run_prerelease.yml Outdated
  # Parallel provisioning by OS family - all 3 run simultaneously
  provision_linux_apt:
    name: Provision APT distros (debian, ubuntu)
    needs: [ spin_up_test_executor_instances, sign_suse_packages ]

Contributor

You may not need sign_suse_packages here.

Contributor

One more improvement would be to also split the test-instance spin-up per distro family, e.g. separate spin_up_test_executor_instances jobs for apt and yum, but that doesn't have to be part of this PR.

Contributor Author

Done

Comment thread .github/workflows/run_prerelease.yml Outdated

  provision_linux_yum:
    name: Provision YUM distros (amazonlinux, centos, rocky)
    needs: [ spin_up_test_executor_instances, sign_suse_packages ]

Contributor

You may not need sign_suse_packages here.

Contributor Author

Done

Comment on lines +4 to +5
contents: write
pull-requests: read

Contributor

is this needed?

      chdir: "{{ fluent_bit_path }}/build"
-   async: 1200 # timeout 20m
-   poll: 60 # poll every 60s
+   async: 1800 # timeout 30m

Contributor

does it take more than 20 minutes?

Contributor Author

It sometimes times out and breaks the entire pipeline. Increasing the timeout helped.

  ansible.builtin.get_url:
    url: "{{ nr_fb_output_plugin_url }}"
    dest: "{{ nr_fb_output_plugin_download_file_path }}"
  register: download_fluentbit_output_plugin

Contributor

Should we rename this and put yum in the name if it's for yum?

Contributor Author

This linux.yml task file is used by ALL Linux distros:

  • APT-based (Debian, Ubuntu)
  • YUM-based (CentOS, RHEL, Rocky, Amazon Linux)
  • Zypper-based (SLES, openSUSE)

# strategy: free
# Using free strategy for parallel execution across hosts
# Note: any_errors_fatal is incompatible with free strategy
strategy: free

Contributor

Will it continue to run even if there are failures?

Contributor Author

With strategy: free:

  • Parallel execution: Each host executes tasks independently without waiting for others
  • Failure isolation: If a host fails, it's removed from the play, but other hosts continue
  • No per-task barrier: Unlike the default linear strategy, hosts do not wait for each other at each task, so one slow host doesn't hold up the rest
  • Incompatible with any_errors_fatal: As noted in line 4, you cannot use any_errors_fatal: true with strategy: free

Comment thread .github/workflows/run_prerelease.yml Outdated

  merge_linux_test_results:
    name: Merge all test results (Linux + Windows)
    needs: [ test_linux_apt_distros, test_linux_yum_distros, test_linux_sles_distros, provision_and_execute_tests_windows ]

Contributor

It may be better to report Windows results separately, as before.

Contributor Author

Done

      container_make_target: "ansible/provision-and-execute-tests/${{ inputs.infra_agent_env }}-linux-test-sles PRE_RELEASE_NAME=${{ inputs.gh_release_name }} TEST_REPORT_NAME=${{ inputs.linux_test_report_name }}"
    secrets: inherit

  merge_linux_test_results:

Contributor

still says merge linux only.

Contributor Author

Done

@nr-rkallempudi nr-rkallempudi force-pushed the NR-517619-rkallempudi branch 4 times, most recently from 45ca27e to 60ab01d Compare February 13, 2026 07:48
Comment thread .github/workflows/run_prerelease.yml Outdated
      - name: Upload signed asset
-       run:
-         gh release upload ${{ inputs.pre_release_name }} packages/* --clobber
+       uses: nick-fields/retry@v3

Contributor

Can we use a commit SHA here, like we did for the Fluent Bit output?

Comment thread .github/workflows/run_prerelease.yml Outdated
          reporter: jest-junit

      - name: Windows Tests Report Summary
        uses: EnricoMi/publish-unit-test-result-action@v2

Contributor

commit sha.

maya-jha previously approved these changes Feb 13, 2026
@nr-rkallempudi nr-rkallempudi force-pushed the NR-517619-rkallempudi branch 2 times, most recently from 32f3951 to 5c48ab3 Compare February 13, 2026 11:15
maya-jha previously approved these changes Feb 13, 2026
@nr-rkallempudi nr-rkallempudi merged commit abee8fa into main Feb 16, 2026
164 of 166 checks passed
4 participants