[MchLogToolkitGo] - V3: unified backend with selection between file and Graylog UDP #16

Merged: mateusmetzker merged 18 commits into dev from feat/v3-graylog-udp on May 7, 2026
@JVVeiga commented May 5, 2026

Description

Introduces V3 of MchLogToolkit as a unified backend. The service chooses the log destination at runtime via mchlogcorev3.DestinationConfig.Protocol:

  • ProtocolFile (default): keeps the same <basePath>/<service>/<level>/<level>.log layout and the same JSON shape as V2. Behavior is bit-for-bit equivalent: the current rsyslog pipeline tailing /applog/... keeps working unchanged.
  • ProtocolGraylogUDP: sends GELF 1.1 over UDP directly to a Graylog input. Intended for dev/qa environments that want to centralize logs in Graylog without depending on rsyslog.

Strategy adopted:

  1. A new mchlogcore.Transport interface (plus an optional Closer) defines the contract shared by V1/V2/V3.
  2. The mchlogcore.LogType facade is refactored to dispatch through the current Transport instead of an if/else per version.
  3. A new mchlogcorev3 package with DestinationConfig (Protocol/Addr/Source/DisableGZIP), an internal strategy (the destination interface), graylogUDP (built on github.com/Graylog2/go-gelf, with stderr warnings on send failures rate-limited to 1 line per 60s), and fileDestination (a thin wrapper over mchlogcorev2, with zero duplication until V1/V2 are retired).
  4. V1 and V2 were not touched (git diff main -- mchlogcorev1/ mchlogcorev2/ is empty). The default remains V1, so callers that do not call SetVersion/Configure see no change.
  5. The public Logger API is unchanged (NewLogger, Initialize, Info, Debug, Warn, Error, Fatal, Test).

Adding future protocols (graylog-tcp, syslog-udp, splunk-hec, …) only requires extending Configure and creating the internal implementation; the LogVersion enum does not need to be bumped.

Type of Change

  • Bug Fix: This change resolves an existing problem in the project.
  • New Feature: Adds new functionality that enriches the project.
  • Code Improvement: Refactoring that improves performance or readability without changing functionality.
  • Breaking Change: A significant change that affects compatibility with earlier versions and requires special attention during integration.
  • Internationalization and Localization: Adds or improves support for multiple languages or regional adaptations.
  • Performance: Improvements that increase the efficiency and speed of the system.
  • Security: Fixes or improvements that increase the security of the project.
  • Dependencies: Updates or additions to external libraries and packages.
  • Tests: Adds or improves the project's tests, including unit tests, BDD, etc.
  • Infrastructure/DevOps: Changes related to infrastructure configuration, CI/CD, development tooling, config changes, etc.

Checklist

  • I tested the code locally
  • I reviewed the code (self-review)
  • I commented my code, especially in hard-to-understand areas
  • I made corresponding changes to the documentation
  • My changes generate no new warnings
  • New and existing unit tests pass locally with my changes
  • I checked that the PR meets the acceptance criteria of the related issue

How to test the changes

Prerequisites: Go 1.22.1+, access to serv-graylog.gaudium.lan:12201 (GELF UDP input "Log dos Ambientes", already in production).

git fetch origin && git checkout feat/v3-graylog-udp

# build + vet + fmt + tests + coverage
go mod tidy
go build ./...
go vet ./...
gofmt -l .
go test -cover ./...

Expected result: clean build, no vet/fmt warnings, 84 passing tests, V3 coverage 91.6%.

End-to-end test via a consumer (MchSyncherCache, branch feat/log-v3-graylog):

LOG_DESTINATION=graylog-udp \
LOG_ADDR=serv-graylog.gaudium.lan:12201 \
LOG_SOURCE=mchsynchercache-<env>-<host> \
go run ./cmd/mchsynchercache

Then, in the Graylog UI, search for application_name:mchsynchercache. Messages should appear with log_id=mchsynchercache-mchlog-info, level_name=info, source=<LOG_SOURCE>. Already validated on kappaqa during development.

For testing without a real Graylog, the repo includes an integration scaffold:

docker run -d -p 12201:12201/udp graylog/graylog:5.0
GRAYLOG_TEST_ADDR=localhost:12201 go test -tags=integration -v ./mchlogcorev3/...

Deploy Impact

No direct impact from the library itself (it is a Go lib consumed via go get). For consuming services:

  • The default remains V1: services that only update the dependency without calling mchlogcorev3.Configure keep writing to file in the V1 format.
  • Services migrating to V3 with Protocol: ProtocolFile keep the same on-disk layout and JSON shape as V2; the rsyslog → Graylog path needs no change.
  • Services migrating to V3 with Protocol: ProtocolGraylogUDP need the service host to have a UDP route to serv-graylog.gaudium.lan:12201 (already in place for the hosts where apiproxy runs today).

Dependencies

  • New: github.com/Graylog2/go-gelf v0.0.0-20170811154226-7ebf4f536d8f (pseudo-version from master; official GELF/UDP library for Go).
  • Existing, kept: github.com/rs/zerolog v1.33.0 and indirect dependencies.

Issues

N/A: internal feature.

Additional Notes

  • Security audit performed before the PR (security-auditor): zero HIGH/MEDIUM findings within the project's trust model.
  • Code review performed before the PR (code-reviewer): 4 important issues identified, all addressed (resource leak on Initialize re-entry, edge cases in serviceFromPath, aliasing in contentToMap, godocs for ascendStackFrame and fileDestination.Close).
  • V1 and V2 remain available for backward compatibility. The plan is to retire them in a future PR once services have migrated to V3 with ProtocolFile.
  • README updated with the new V3 section (usage, Graylog field mapping, send failures, guidance).

JVVeiga added 16 commits May 4, 2026 19:37
Define a Transport strategy interface so V1, V2 and the upcoming V3
(network) share one contract for the methods they all implement
(LogSubject and GetFileNameFromStreamName).

Resource cleanup lives in a separate optional Closer interface.
Backends that hold resources requiring explicit release (e.g. UDP
sockets in V3) implement Close; backends that rely on process exit
(V1, V2) do not need to be modified.

Both interfaces are declared in mchlogcore. Backends satisfy them
implicitly via Go structural typing, avoiding any import cycle.
Replace the if/else dispatch in LogType with a package-level
current Transport populated by SetVersion. Adding a new version
(V3 network) now means extending one switch in transportFor and
one in InitializeMchLog instead of touching every method.

GetIP keeps its V1-only semantics via type assertion against
*mchlogcorev1.LogType. The init log message ("MchLogToolkit
initialized version=...") is preserved.

Adds a Close method on the facade that delegates to the active
transport only when it implements the optional Closer interface
(future V3 will). For V1 and V2 it is a no-op, so the existing
file backends are not modified.

Behavior is unchanged for existing services that do not call
SetVersion: V1 stays the default. All previous tests stay green.

New tests in mchlogcore lock in:
  - SetVersion(V1) selects *mchlogcorev1.LogType
  - SetVersion(V2) selects *mchlogcorev2.LogType
  - GetIP returns "" when not running V1
  - Close is a no-op for V1 and V2 (no Closer impl)
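
The optional-Closer delegation described in the two commits above can be sketched roughly as follows. This is a minimal illustration, not the toolkit's code: the method signatures, fileTransport, udpTransport and closeTransport are hypothetical stand-ins, since the real signatures in mchlogcore are not shown here.

```go
package main

import "fmt"

// Transport is a simplified stand-in for the shared contract.
// The real method signatures are an assumption.
type Transport interface {
	LogSubject(subject, message string)
	GetFileNameFromStreamName(stream string) string
}

// Closer is the optional interface for backends that hold resources.
type Closer interface {
	Close() error
}

// fileTransport stands in for V1/V2: it has no Close method.
type fileTransport struct{}

func (fileTransport) LogSubject(subject, message string) {}
func (fileTransport) GetFileNameFromStreamName(stream string) string {
	return "/applog/" + stream + ".log"
}

// udpTransport stands in for a backend that owns a socket.
type udpTransport struct{ closed bool }

func (u *udpTransport) LogSubject(subject, message string) {}
func (u *udpTransport) GetFileNameFromStreamName(stream string) string {
	return "udp://" + stream
}
func (u *udpTransport) Close() error { u.closed = true; return nil }

// closeTransport delegates to Close only when the active transport
// implements Closer; for every other transport it is a no-op.
func closeTransport(t Transport) error {
	if c, ok := t.(Closer); ok {
		return c.Close()
	}
	return nil
}

func main() {
	udp := &udpTransport{}
	fmt.Println(closeTransport(fileTransport{}), closeTransport(udp), udp.closed)
}
```

Because the check is a type assertion at call time, file backends never need to grow a Close method just to satisfy the facade.
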
Introduces the V3 network backend skeleton without any wire I/O yet.
Public surface:

  * Protocol type with ProtocolGraylogUDP as the only value (others
    will plug in by extending the dispatch in Configure).
  * NetworkConfig{Protocol, Addr, Source, DisableGZIP}. Addr and
    Source are required; the toolkit deliberately does not auto-detect
    Source so the consuming service controls how its logs identify
    themselves in Graylog (pod name, env-aware composition, etc.).
  * Configure(cfg) validates required fields, applies the Protocol
    default, rejects unknown protocols, and stores the result.
  * ActiveConfig / IsConfigured for the transport layer (and tests).
  * DefaultSource() helper returning os.Hostname() or "unknown" for
    callers that prefer the hostname behavior without composing
    Source manually.

DisableGZIP uses the inverse-flag pattern so the zero value preserves
the documented default of GZIP-on without ambiguity between "false
explicit" and "not set".

Tests cover defaults, mandatory-field rejection, unknown-protocol
rejection, DisableGZIP=true honored, and ActiveConfig before Configure.
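
A small sketch of the inverse-flag pattern mentioned above, assuming a stripped-down config struct (the config type and gzipEnabled helper are hypothetical; only the DisableGZIP field name comes from the commit message):

```go
package main

import "fmt"

// config keeps only the field relevant to the pattern.
type config struct {
	DisableGZIP bool
}

// gzipEnabled reports whether compression is on. Because the flag is
// inverted, the zero value of config means "GZIP on".
func gzipEnabled(c config) bool {
	return !c.DisableGZIP
}

func main() {
	var zero config // caller set nothing
	fmt.Println(gzipEnabled(zero))
	fmt.Println(gzipEnabled(config{DisableGZIP: true}))
}
```

With a positive flag such as a hypothetical EnableGZIP, the zero value would mean "off", and the library could not tell an explicit false from an unset field without a pointer or an extra "set" marker.
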
Adds the github.com/Graylog2/go-gelf dependency (the master tree;
this fork has no /v2/ module path despite the GitHub branch).

Introduces:

  * levelToSyslog: maps the toolkit's log levels to GELF/syslog
    severities (fatal=2, error=3, warn=4, info=6, debug/test=7).
    Unknown levels default to INFO so a typo never silences output.

  * buildGELFMessage: builds a *gelf.Message from the payload that
    transports receive (the []byte JSON produced by formatLog, plus
    map[string]any / map[string]string variants for the init log).

Field naming follows the agreed Graylog mapping:
  - host = cfg.Source (controlled by the consuming service)
  - short_message = payload "message"
  - level = syslog severity
  - _application_name = service name from NewLogger
  - _log_id = "<service>-mchlog-<level>"   (mirrors V1/V2 directory
    layout so old filter habits keep working)
  - _level_name = textual level
  - _file/_line/_trace = renamed from payload "source"/"line"/"trace"
    to avoid colliding with Graylog's "source" column (which comes
    from host)
  - any other payload key becomes "_<key>" in Extra
  - _error = errLog.Error() when errLog != nil

Tests cover level mapping (all six levels + unknown), required GELF
1.1 fields, custom field composition, error promotion, map content
type, missing message, invalid JSON, and JSON serialization sanity.
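
The level and field-name mapping listed above could look roughly like this. It is a sketch under assumptions: the lowercase level strings and the helper names extraKey and logID are illustrative, and the real buildGELFMessage does more than this.

```go
package main

import "fmt"

// levelToSyslog maps toolkit levels to syslog severities as listed in
// the commit message. Unknown levels fall back to INFO (6).
func levelToSyslog(level string) int {
	switch level {
	case "fatal":
		return 2
	case "error":
		return 3
	case "warn":
		return 4
	case "info":
		return 6
	case "debug", "test":
		return 7
	default:
		return 6
	}
}

// extraKey renames payload keys into GELF additional-field names.
// "source", "line" and "trace" get dedicated names so they do not
// collide with Graylog's own "source" column.
func extraKey(key string) string {
	switch key {
	case "source":
		return "_file"
	case "line":
		return "_line"
	case "trace":
		return "_trace"
	default:
		return "_" + key
	}
}

// logID mirrors the "<service>-mchlog-<level>" convention.
func logID(service, level string) string {
	return service + "-mchlog-" + level
}

func main() {
	for _, lvl := range []string{"fatal", "error", "warn", "info", "debug", "test", "verbose"} {
		fmt.Printf("%-7s -> %d\n", lvl, levelToSyslog(lvl))
	}
	fmt.Println(logID("mchsynchercache", "info"), extraKey("source"), extraKey("request_id"))
}
```
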
Implements the V3 backend on top of github.com/Graylog2/go-gelf:

  * graylogUDP holds the gelf.Writer, the active NetworkConfig and the
    service name. Satisfies mchlogcore.Transport (LogSubject and
    GetFileNameFromStreamName) and mchlogcore.Closer (Close).
  * Initialize(path) extracts the service from the path (matching the
    "<basePath>/<service>/" convention V1/V2 already use), dials the
    GELF UDP writer, applies CompressionType from cfg.DisableGZIP, and
    publishes the global MchLog. Configure must run first.
  * LogSubject builds the GELF Message via buildGELFMessage and sends
    via WriteMessage. Send errors do not propagate to the caller; they
    are reported by warnOnce.
  * GetFileNameFromStreamName returns "udp://<addr>/<subject>" purely
    as a logical descriptor for tests and observability — no real file.
  * Close is idempotent; closes the writer once and short-circuits on
    subsequent calls.
  * warnOnce rate-limits stderr warnings to one line per 60s so a
    Graylog outage cannot flood logs.

Tests cover the happy path against a local UDP listener (asserting
GELF 1.1 fields including _application_name and _log_id), the logical
descriptor format, idempotent Close, and the Configure-first guard.
Adds the V3 LogVersion constant alongside V1 and V2 and extends the
internal switch in transportFor and InitializeMchLog. V3 routes calls
to mchlogcorev3.MchLog (which was prepared by mchlogcorev3.Configure
and mchlogcorev3.Initialize).

Initialization order is the subtle point: SetVersion(V3) marks intent
but mchlogcorev3.MchLog is still nil until Initialize succeeds, so
InitializeMchLog now re-binds the package-level current Transport
after the backend init returns. The transport methods on *graylogUDP
already tolerate a nil receiver, so a misordered SetVersion alone
cannot panic the caller.

If V3 Initialize fails (Configure not called, dial failure, bad path),
the error is reported once on stderr and the boot info message is
skipped — V1 and V2 paths are unaffected.

A new test in mchlogcore drives the full chain: Configure →
SetVersion(V3) → InitializeMchLog → MchLog.LogSubject, asserting the
GELF datagram on a local UDP listener and verifying that the boot
info message identifies V3.
Pushes V3 coverage from 74% to 87.8%, well above the 80% target.

End-to-end (UDP listener + JSON decode) tests:
  - All six toolkit levels round-trip with the right syslog severity
    and the right _level_name and _log_id in the datagram.
  - DisableGZIP=true emits raw JSON datagrams (no gzip magic).
  - DisableGZIP zero-value emits gzipped datagrams (default).
  - LogSubject with empty subject does not produce any datagram.

Internal helper tests:
  - Nil receiver tolerated by LogSubject, GetFileNameFromStreamName
    and Close (defensive guard against pre-Initialize dispatch).
  - contentToMap covers string-as-JSON, reflect fallback for
    map[string]int, unsupported type rejection, and nil rejection.
  - serviceFromPath handles trailing slash, ./ prefix, empty input,
    and bare separator.
  - stringify handles non-string and nil.

warnOnce stays uncovered until T11 (failure-mode tests intentionally
exercise the stderr rate limiter).
V3 coverage rises to 95.4%. warnOnce reaches 100%.

Tests force buildGELFMessage to fail (sending an int as content,
which contentToMap rejects) since UDP writes themselves are
fire-and-forget and rarely surface errors to the caller.

  - LogSubject does not panic on a build failure.
  - Stderr captured via os.Pipe: 100 consecutive failures inside
    the warn window produce exactly one "GELF UDP send failed" line.
  - After the warn window expires (test resets lastWarn directly,
    same package access), a new failure produces a new line.
  - Initialize without prior Configure returns an error mentioning
    Configure so callers know what to fix.
  - Initialize with an empty path errors out instead of running with
    a blank service name.
Adds a //go:build integration test that sends a GELF datagram to a
real Graylog instance. It is excluded from the default test suite so
CI does not require external infrastructure. To run locally:

  docker run -d --name graylog-test -p 12201:12201/udp graylog/graylog:5.0
  GRAYLOG_TEST_ADDR=localhost:12201 go test -tags=integration -v ./mchlogcorev3/...

The test skips itself when GRAYLOG_TEST_ADDR is unset so the tagged
build still passes on machines without Graylog running.
Documents the new network backend, including:
  - Code example for switching to V3 (Configure + SetVersion).
  - NetworkConfig field reference (Addr, Source, Protocol, DisableGZIP)
    with the rule that Source is required and caller-provided.
  - Field mapping table from toolkit payload to GELF wire format to
    Graylog UI columns (host/source, _application_name, _log_id,
    _level_name, _file, _line, _error).
  - Example Graylog searches that exercise application_name, log_id,
    and source.
  - Failure handling note: silent drop + rate-limited stderr warn,
    no automatic file fallback.
  - Guidance on when NOT to use V3 (production should keep V1/V2).
Aligns whitespace per gofmt -l on the helper functions and struct
field declarations introduced in the coverage tests.
…structure MchLog

V3 was scoped too narrowly to network. Extends it to a unified backend
that selects between protocols at Configure time. This is the
foundation for adding ProtocolFile in the next commit and eventually
retiring V1/V2 once services migrate.

Public surface changes:
  * NetworkConfig -> BackendConfig (still in mchlogcorev3 package).
  * MchLog: was *graylogUDP global; now LogType (struct) holding an
    internal backend strategy populated by Initialize. The methods on
    *LogType (LogSubject, GetFileNameFromStreamName, Close) delegate
    via a RWMutex-guarded interface field, so the public API remains
    identical for callers.
  * mchlogcore: V3 dispatch returns &mchlogcorev3.MchLog (struct
    address) instead of the old *graylogUDP pointer.

Validation in Configure now switches on Protocol:
  * ProtocolFile (new default) requires nothing.
  * ProtocolGraylogUDP requires Addr and Source.
  * Unknown Protocol returns an explicit error.

graylogUDP becomes an internal type (still nil-receiver safe).
warnOnce, mu, lastWarn are now reachable via type assertion in tests.

All existing tests updated to pass Protocol: ProtocolGraylogUDP
explicitly. config_test.go rewritten around the new defaults.

BREAKING CHANGE: mchlogcorev3.NetworkConfig renamed to BackendConfig
(unreleased — only affects in-flight V3 callers on this branch).
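
The per-protocol validation described in this commit could be sketched as follows. The struct is shown under its final name in this PR (DestinationConfig); the string values of the Protocol constants, the validate helper and the error messages are assumptions, not the package's actual code.

```go
package main

import (
	"errors"
	"fmt"
)

type Protocol string

// The string values are illustrative assumptions.
const (
	ProtocolFile       Protocol = "file"
	ProtocolGraylogUDP Protocol = "graylog-udp"
)

type DestinationConfig struct {
	Protocol    Protocol
	Addr        string
	Source      string
	DisableGZIP bool
}

// validate applies the Protocol default and checks the fields each
// protocol requires. It returns the normalized config.
func validate(cfg DestinationConfig) (DestinationConfig, error) {
	if cfg.Protocol == "" {
		cfg.Protocol = ProtocolFile // new default: nothing else required
	}
	switch cfg.Protocol {
	case ProtocolFile:
		return cfg, nil
	case ProtocolGraylogUDP:
		if cfg.Addr == "" {
			return cfg, errors.New("Addr is required for ProtocolGraylogUDP")
		}
		if cfg.Source == "" {
			return cfg, errors.New("Source is required for ProtocolGraylogUDP")
		}
		return cfg, nil
	default:
		return cfg, fmt.Errorf("unknown protocol %q", cfg.Protocol)
	}
}

func main() {
	cfg, err := validate(DestinationConfig{})
	fmt.Println(cfg.Protocol, err)

	_, err = validate(DestinationConfig{Protocol: ProtocolGraylogUDP, Addr: "localhost:12201"})
	fmt.Println(err)
}
```

Keeping the switch in one place is what lets a future protocol be added by extending a single case plus its internal implementation.
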
Confirms V3 with ProtocolFile produces files identical to V2:

  * Layout: <basePath>/<service>/<level>/<level>.log.
  * JSON shape: {message, level, source, line, trace, timestamp}.
  * Errors prefixed with err_, written to err_<level>/err_<level>.log.
  * GetFileNameFromStreamName returns the real on-disk path (delegated).
  * Close is a no-op and idempotent (V2 underneath has no Close).

These are the contract a service migrating from V2 to V3 with
ProtocolFile depends on; the tests pin them explicitly so future
internal changes (e.g. inlining V2 logic into V3) cannot regress
the on-disk format silently.
V3 is now framed as the unified backend, not a network-only family.
The section now documents:

  * Both protocols side by side: ProtocolFile (V2-equivalent layout
    and JSON) and ProtocolGraylogUDP (GELF over UDP).
  * BackendConfig field reference clarifying which fields apply to
    which protocol (Addr/Source/DisableGZIP only for GraylogUDP).
  * Graylog field mapping table kept for the network mode.
  * "When to use each mode" guidance: ProtocolFile for production,
    ProtocolGraylogUDP for dev/qa.
  * Migration note from V1/V2 to V3 with ProtocolFile (no behavior
    diff). V1/V2 stay available for now.
Aligns terminology with the convention already used by services
consuming the toolkit (see mch-log-graylog-go MCH_LOG_DESTINATION
and the LOG_DESTINATION env var introduced in MchSyncherCache).
"Destination" describes what the field selects (where logs go)
better than "Backend".

Renames:
  * BackendConfig    -> DestinationConfig (struct)
  * fileBackend      -> fileDestination (internal struct)
  * newFileBackend   -> newFileDestination
  * internal "backend" interface -> "destination" interface
  * LogType.impl is typed against the new interface name
  * README references and test files updated to match

No behavior change. All tests still pass.

BREAKING CHANGE: BackendConfig renamed to DestinationConfig
(unreleased — only affects in-flight V3 callers on this branch).
Plugs three concrete issues raised in review and tightens defensive
guards in two places:

  * Initialize is now safe to call repeatedly. The previous active
    destination is closed (outside the write lock) before the new one
    is installed, preventing the GELF UDP socket from leaking on
    re-initialization.

  * serviceFromPath rejects degenerate paths ("", ".", "..", "C:")
    so Initialize fails explicitly instead of producing nonsensical
    service names like "_log_id=.-mchlog-info". Backslashes are now
    normalized manually because filepath.ToSlash is a no-op on Unix.

  * contentToMap copies map[string]any into a fresh map on entry so
    the builder cannot accidentally mutate (or be mutated by) the
    caller's map.

  * graylogUDP.LogSubject now documents that ascendStackFrame is a
    no-op for this destination — _file/_line are taken from the
    payload (where logger.go already populated them via runtime.Caller).

  * fileDestination.Close godoc clarifies that no flush happens; the
    file backend never buffers writes.

  * Adds compile-time assertions confirming *mchlogcorev3.LogType
    satisfies both Transport and Closer.

  * Drops the redundant time.Now().UTC() call in buildGELFMessage:
    UnixNano returns the same value regardless of timezone.

Tests cover the new path-normalization cases and the re-entry close.

No public API changes.
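
One way the path handling described above might work is sketched here. The function name serviceFromPathSketch and the exact rejection rules are assumptions based only on the cases named in the commit message; the real serviceFromPath may differ.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// serviceFromPathSketch extracts the last path element of a
// "<basePath>/<service>/" style path and rejects degenerate inputs.
func serviceFromPathSketch(p string) (string, error) {
	// Normalize backslashes by hand: filepath.ToSlash only converts
	// the OS separator, so it leaves "\" untouched on Unix.
	normalized := strings.ReplaceAll(p, `\`, "/")
	base := path.Base(path.Clean(normalized))

	// path.Clean("") is ".", and path.Base("/") is "/".
	if base == "." || base == ".." || base == "/" || strings.HasSuffix(base, ":") {
		return "", fmt.Errorf("cannot derive a service name from path %q", p)
	}
	return base, nil
}

func main() {
	for _, p := range []string{"/applog/mchsynchercache/", `C:\applog\svc\`, "./applog/svc", "", "..", "C:"} {
		svc, err := serviceFromPathSketch(p)
		fmt.Printf("%q -> %q, err=%v\n", p, svc, err)
	}
}
```
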
@JVVeiga changed the title from "feat(v3): unified backend selectable between file and Graylog UDP" to "[MchLogToolkitGo] - V3: unified backend with selection between file and Graylog UDP" May 5, 2026
@JVVeiga self-assigned this May 5, 2026
JVVeiga added 2 commits May 5, 2026 14:35
Renames "backend" -> "destino" / "destinos" across godoc and README
to match the public API rename (DestinationConfig, fileDestination,
internal destination interface). No code change.
Adds three tests prompted by the QA review:

  * TestInitializeDialFailureReturnsError exercises the dial-time
    error path in Initialize (Addr without colon → ResolveUDPAddr
    fails) and asserts the wrapped "dial GELF UDP" message.
  * TestWriterWriteMessageFailureWarns reproduces a runtime UDP
    write failure (closes the underlying gelf.Writer socket and
    sends after) so warnOnce is exercised on the post-build path,
    not only the build path.
  * TestLogTypeLogSubjectBeforeInitialize confirms that the public
    facade (mchlogcorev3.MchLog) is safe to use before Initialize:
    LogSubject is a no-op, GetFileNameFromStreamName returns "",
    Close returns nil, no panic.

Also updates the docstring of TestRateLimitedWarnOneLinePerWindow to
clarify which failure path is exercised (buildGELFMessage), and adds
a comment near resetConfig explaining why the package's tests cannot
use t.Parallel (shared globals: activeCfg, MchLog.impl, mchlogcorev2).

V3 coverage rises from 91.6% to 93.8%.
@mateusmetzker merged commit d92c117 into dev May 7, 2026
1 check passed