OpenAI filter sends 3 consecutive system messages → 400 on strict chat templates (e.g. Qwen3 via llama.cpp), silently dropping all messages

## Summary

`OpenAIFilterer.Filter` builds its request with **three consecutive `system` messages** followed by the user message. Many chat templates allow only a **single, leading** system message and raise on anything else. When the backend enforces this (e.g. Qwen3 served by `llama.cpp` with `--jinja`), every request returns **HTTP 400** before the model runs. Because the filter returns `FilterOnFailure` on any error, and `FilterOnFailure` defaults to `true`, **every message is silently filtered out.** In an OpenAI-filter → Discord pipeline, the integration appears to have simply stopped forwarding anything.

Version: **v1.3.2** (latest). Backend: `llama.cpp` server (`ghcr.io/ggml-org/llama.cpp:server-*`) with `--jinja`, Qwen3-family GGUF.

## Error returned by the backend

```
POST ".../chat/completions": 400 Bad Request
{"error":{"code":400,"message":"Unable to generate parser for this template. Automatic parser generation failed:
------------
While executing CallExpression at line 85, column 32 in source:
...first %}\n            {{- raise_exception('System message must be at the beginnin...
                                           ^
Error: Jinja Exception: System message must be at the beginning.","type":"invalid_request_error"}}
```

Followed by:

```
WARN  error filtering with OpenAIFilterer in step number 5: ... 400 Bad Request ...
INFO  message ... was filtered in step 5 by OpenAIFilterer
```

## Root cause

In [`filter_openai.go`](https://github.com/tyzbit/acars-processor/blob/v1.3.2/filter_openai.go), the request is assembled as:

```go
chatCompletion, err := client.Chat.Completions.New(context.TODO(),
    openai.ChatCompletionNewParams{
        Messages: openai.F([]openai.ChatCompletionMessageParamUnion{
            openai.SystemMessage(OpenAISystemPrompt),      // system #1
            openai.SystemMessage(o.UserPrompt),            // system #2
            openai.SystemMessage(OpenAIFinalInstructions), // system #3
            openai.UserMessage(ms),
        }),
        ...
```

That is three `system` messages in a row. Strict templates (Qwen3 and others) reject repeated or non-leading system messages via `raise_exception(...)`, so the backend responds with HTTP 400. On that error the function returns `o.FilterOnFailure`, which defaults to `true`, so the message is dropped.

## Suggested fix

The sibling **annotator** already does this correctly: it concatenates everything into a **single** system message; see [`annotator_openai.go:125`](https://github.com/tyzbit/acars-processor/blob/v1.3.2/annotator_openai.go#L125):

```go
openai.SystemMessage(OpenAIAnnotatorFirstInstructions + a.UserPrompt + OpenAIAnnotatorFinalInstructions),
openai.UserMessage("Here is the message to evaluate:\n" + msg),
```

The filter should mirror this: collapse `OpenAISystemPrompt + o.UserPrompt + OpenAIFinalInstructions` into one `SystemMessage` (or move the instructions into the user turn). That keeps a single leading system message and works across both lenient and strict templates.
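A rough sketch of the collapsed shape, written against stand-in types so it runs without the `openai-go` dependency. `message` and `buildFilterMessages` are hypothetical names, and joining the parts with a blank line is an assumption made here for readability (the annotator uses plain `+` concatenation).

```go
package main

import (
	"fmt"
	"strings"
)

// message is a simplified stand-in for a chat message
// (hypothetical type, not the openai-go one).
type message struct {
	Role    string
	Content string
}

// buildFilterMessages merges the three prompt parts into one leading
// system message, followed by the user message carrying the text.
// Empty parts are skipped so no stray separators end up in the prompt.
func buildFilterMessages(systemPrompt, userPrompt, finalInstructions, text string) []message {
	var parts []string
	for _, p := range []string{systemPrompt, userPrompt, finalInstructions} {
		if s := strings.TrimSpace(p); s != "" {
			parts = append(parts, s)
		}
	}
	return []message{
		{Role: "system", Content: strings.Join(parts, "\n\n")},
		{Role: "user", Content: text},
	}
}

func main() {
	msgs := buildFilterMessages(
		"base prompt",
		"user-configured prompt",
		"final instructions",
		"message text",
	)
	for _, m := range msgs {
		fmt.Printf("%s: %q\n", m.Role, m.Content)
	}
}
```

In the real filter the two resulting entries would map onto one `openai.SystemMessage(...)` and one `openai.UserMessage(...)`, matching what the annotator already sends.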

## Impact / severity

- Silent: with `FilterOnFailure: true` (the **default**), nothing surfaces to the user beyond a `WARN` log line; messages just stop flowing, which is easily mistaken for "no matching traffic."
- Affects any OpenAI-compatible backend that enforces single-leading-system templates; notably `llama.cpp --jinja` with Qwen3 models, a common self-hosted setup that the README's `http://llama-server:8080/v1` examples point at.

## Repro

1. Run `llama.cpp` server with a Qwen3 GGUF and `--jinja`.
2. Configure an `OpenAI` filter step pointing at it (`URL: http://.../v1`, any `UserPrompt`).
3. Send any message with text. Backend returns 400 (`System message must be at the beginning`); the message is filtered out.

## Workaround (until fixed)

Two stopgaps exist. Override the model's chat template with a lenient one (e.g. `--chat-template chatml`) so repeated system messages are accepted, at the cost of the model's native (thinking/tool-call) template. Or set `FilterOnFailure: false` to fail open, so messages pass through when the backend errors. Neither is a real fix; the message construction above is the bug.
