Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
34 changes: 34 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,11 +1,11 @@
<div align="center">

# KERNO

Check warning on line 3 in README.md

View workflow job for this annotation

GitHub Actions / Spell check

Unknown word (KERNO)

### The production incident diagnosis engine for Kubernetes

**Your cluster broke. Your dashboards are green. Users are paging.**
**Run `kerno doctor`. 30 seconds. Root cause. Plain English.**

Check warning on line 8 in README.md

View workflow job for this annotation

GitHub Actions / Spell check

Unknown word (kerno)

<sub>Same single binary runs on bare metal, VMs, EC2, GCE - wherever Linux lives.</sub>

Expand All @@ -18,30 +18,30 @@

[**Quick Start**](#quick-start) · [**How It Works**](#how-it-works) · [**Features**](#features) · [**Kubernetes**](#kubernetes-deployment) · [**Docs**](docs/architecture.md)

<img src="demo.gif" alt="kerno doctor demo" width="900" />

Check warning on line 21 in README.md

View workflow job for this annotation

GitHub Actions / Spell check

Unknown word (kerno)

</div>

---

## What is Kerno?

Check warning on line 27 in README.md

View workflow job for this annotation

GitHub Actions / Spell check

Unknown word (Kerno)

Kerno is a **Kubernetes-native incident diagnosis engine** built on eBPF.

Check warning on line 29 in README.md

View workflow job for this annotation

GitHub Actions / Spell check

Unknown word (Kerno)
It runs as a DaemonSet on every node, watches the kernel - not your app - and answers a single question on demand:

> *Why is production broken right now?*

```bash
kubectl -n kerno-system exec ds/kerno -- kerno doctor

Check warning on line 35 in README.md

View workflow job for this annotation

GitHub Actions / Spell check

Unknown word (kerno)

Check warning on line 35 in README.md

View workflow job for this annotation

GitHub Actions / Spell check

Unknown word (kerno)

Check warning on line 35 in README.md

View workflow job for this annotation

GitHub Actions / Spell check

Unknown word (kerno)
```

30 seconds later you get a ranked diagnostic report with **plain-English causes, evidence, ETAs, and copy-paste fix steps** - no dashboards to wire, no query language to learn, no agents in your app.

The kernel knows minutes before your APM. Hours before your users. Kerno makes that visible.

Check warning on line 40 in README.md

View workflow job for this annotation

GitHub Actions / Spell check

Unknown word (Kerno)

**Same binary outside Kubernetes too.** `curl | bash` it onto any bare-metal box, EC2 instance, or systemd VM and `sudo kerno doctor` works exactly the same.

## Why Kerno?

Check warning on line 44 in README.md

View workflow job for this annotation

GitHub Actions / Spell check

Unknown word (Kerno)

It's 3am. PagerDuty fires. Latency is up, error budget is burning, and every dashboard you own is **green**.

Expand Down Expand Up @@ -530,6 +530,40 @@

---

### Enterprise Deployments

Kerno supports enterprise proxy environments and custom CA certificates for AI providers.

Example configuration:

```yaml
ai:
proxy: http://corp-proxy.internal:8080
ca_cert_file: /etc/kerno/corp-ca.crt
insecure_skip_verify: false
timeout: 30s
```

#### Proxy Support

If `ai.proxy` is not configured, Kerno automatically honors:

- `HTTPS_PROXY`
- `HTTP_PROXY`
- `NO_PROXY`

#### Custom CA Certificates

Use `ai.ca_cert_file` to append additional CA certificates without replacing the system trust store.

This is commonly required in enterprise environments using TLS-inspecting corporate proxies.

#### TLS Verification Errors

Kerno returns actionable TLS verification errors including hostname and certificate verification details to simplify debugging enterprise proxy and CA configurations.

---

## Configuration

Kerno works with **zero configuration**. For custom setups, mount a `config.yaml` or use `KERNO_*` env vars:
Expand Down
9 changes: 7 additions & 2 deletions internal/ai/anthropic.go
Original file line number Diff line number Diff line change
Expand Up @@ -54,7 +54,12 @@ func NewAnthropicProvider(cfg ProviderConfig) *AnthropicProvider {
model: model,
maxTokens: maxTokens,
temperature: temp,
client: &http.Client{},
client: NewHTTPClient(
cfg.Timeout,
cfg.Proxy,
cfg.CACertFile,
cfg.InsecureSkipVerify,
),
}
}

Expand Down Expand Up @@ -96,7 +101,7 @@ func (p *AnthropicProvider) Complete(ctx context.Context, req CompletionRequest)

resp, err := p.client.Do(httpReq)
if err != nil {
return nil, fmt.Errorf("anthropic API call failed: %w", err)
return nil, fmt.Errorf("anthropic request failed: %w", formatHTTPError(err))
}
defer resp.Body.Close()

Expand Down
90 changes: 90 additions & 0 deletions internal/ai/http_client.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,90 @@
// Copyright 2026 Optiqor contributors
// SPDX-License-Identifier: Apache-2.0

package ai

import (
"crypto/tls"
"crypto/x509"
"errors"
"fmt"
"net/http"
"net/url"
"os"
"time"
)

func NewHTTPClient(
timeout time.Duration,
proxy string,
caCertFile string,
insecureSkipVerify bool,
) *http.Client {
//nolint:gosec // InsecureSkipVerify is intentionally configurable for local/dev and air-gapped environments.
tlsConfig := &tls.Config{
InsecureSkipVerify: insecureSkipVerify,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#47 asked for proxy and custom CA, not a TLS-verification bypass. insecure_skip_verify is a footgun in a tool that runs as root, and it's not in the issue's scope. the nolint and the "local/dev" comment don't change that someone sets it in prod and forgets. drop it, or raise it on #47 first.

}

if caCertFile != "" {
certPool, err := x509.SystemCertPool()
if err != nil || certPool == nil {
certPool = x509.NewCertPool()
}
//nolint:gosec // CA certificate path is intentionally user-configurable via trusted config.
caCert, err := os.ReadFile(caCertFile)
if err == nil {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CA load failures are swallowed here. if os.ReadFile errors or AppendCertsFromPEM returns false (bad path, wrong perms, malformed PEM), the function silently keeps the system pool and returns a client as if nothing happened. the operator then gets a confusing TLS failure with no hint the CA was never loaded. Validate os.Stats the file but doesn't parse it. since NewHTTPClient can't return an error today, either change the signature to return one, or validate the PEM in config so a bad cert fails fast.

ok := certPool.AppendCertsFromPEM(caCert)
if ok {
tlsConfig.RootCAs = certPool
}
}
}

transport := &http.Transport{
Proxy: http.ProxyFromEnvironment,
TLSClientConfig: tlsConfig,
}

if proxy != "" {
proxyURL, err := url.Parse(proxy)
if err == nil {
transport.Proxy = http.ProxyURL(proxyURL)
}
}

return &http.Client{
Timeout: timeout,
Transport: transport,
}
}

func formatHTTPError(err error) error {
if err == nil {
return nil
}

var unknownAuthErr x509.UnknownAuthorityError
if errors.As(err, &unknownAuthErr) {
return fmt.Errorf(
"TLS verification failed for certificate subject %q: %w. "+
"If your environment uses a corporate MITM proxy, configure ai.ca_cert_file",
unknownAuthErr.Cert.Subject,
err,
)
}

var hostnameErr x509.HostnameError
if errors.As(err, &hostnameErr) {
return fmt.Errorf(
"TLS hostname verification failed for host %q: %w",
hostnameErr.Host,
err,
)
}

return fmt.Errorf(
"HTTP request failed: %w. "+
"If using a corporate proxy or custom CA, configure ai.ca_cert_file or ai.proxy",
err,
)
}
178 changes: 178 additions & 0 deletions internal/ai/http_client_test.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,178 @@
// Copyright 2026 Optiqor contributors
// SPDX-License-Identifier: Apache-2.0

package ai

import (
"context"
"encoding/pem"
"net/http"
"net/http/httptest"
"net/url"
"os"
"testing"
"time"
)

func TestHTTPClient_TLSVerificationFails(t *testing.T) {
server := httptest.NewTLSServer(
http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
_, _ = w.Write([]byte("ok"))
}),
)
defer server.Close()

client := NewHTTPClient(
5*time.Second,
"",
"",
false,
)

req, err := http.NewRequestWithContext(
context.Background(),
http.MethodGet,
server.URL,
nil,
)
if err != nil {
t.Fatalf("creating request: %v", err)
}

resp, err := client.Do(req)
if resp != nil {
defer resp.Body.Close()
}
if err == nil {
t.Fatal("expected TLS verification error, got nil")
}
}

func TestHTTPClient_InsecureSkipVerify(t *testing.T) {
server := httptest.NewTLSServer(
http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
_, _ = w.Write([]byte("ok"))
}),
)
defer server.Close()

client := NewHTTPClient(
5*time.Second,
"",
"",
true,
)
req, err := http.NewRequestWithContext(
context.Background(),
http.MethodGet,
server.URL,
nil,
)
if err != nil {
t.Fatalf("creating request: %v", err)
}
resp, err := client.Do(req)
if err != nil {
t.Fatalf("expected successful request, got error : %v", err)
}
defer resp.Body.Close()

if resp.StatusCode != http.StatusOK {
t.Fatalf("expected status 200, got %d", resp.StatusCode)
}
}

func TestHTTPClient_CustomCA(t *testing.T) {
server := httptest.NewTLSServer(
http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
_, _ = w.Write([]byte("ok"))
}),
)
defer server.Close()

// Export server cert as PEM
cert := server.Certificate()

pemData := pem.EncodeToMemory(&pem.Block{
Type: "CERTIFICATE",
Bytes: cert.Raw,
})

tmpFile, err := os.CreateTemp("", "kerno-ca-*.crt")
if err != nil {
t.Fatalf("creating temp cert file: %v", err)
}
defer os.Remove(tmpFile.Name())

if _, err := tmpFile.Write(pemData); err != nil {
t.Fatalf("writing cert file: %v", err)
}

if err := tmpFile.Close(); err != nil {
t.Fatalf("closing cert file: %v", err)
}

client := NewHTTPClient(
5*time.Second,
"",
tmpFile.Name(),
false,
)

req, err := http.NewRequestWithContext(
context.Background(),
http.MethodGet,
server.URL,
nil,
)
if err != nil {
t.Fatalf("creating request: %v", err)
}

resp, err := client.Do(req)
if err != nil {
t.Fatalf("expected successful request with custom CA, got : %v", err)
}
defer resp.Body.Close()

if resp.StatusCode != http.StatusOK {
t.Fatalf("expected status 200, got %d", resp.StatusCode)
}
}

func TestHTTPClient_CustomProxy(t *testing.T) {
client := NewHTTPClient(
5*time.Second,
"http://localhost:8080",
"",
false,
)

transport, ok := client.Transport.(*http.Transport)
if !ok {
t.Fatal("expected *http.Transport")
}

req := &http.Request{
URL: &url.URL{
Scheme: "https",
Host: "example.com",
},
}

proxyURL, err := transport.Proxy(req)
if err != nil {
t.Fatalf("proxy function returned error: %v", err)
}

if proxyURL == nil {
t.Fatal("expected proxy URL, got nil")
}

if proxyURL.String() != "http://localhost:8080" {
t.Fatalf("unexpected proxy URL: %s", proxyURL.String())
}
}
9 changes: 7 additions & 2 deletions internal/ai/ollama.go
Original file line number Diff line number Diff line change
Expand Up @@ -51,7 +51,12 @@ func NewOllamaProvider(cfg ProviderConfig) *OllamaProvider {
model: model,
maxTokens: maxTokens,
temperature: temp,
client: &http.Client{},
client: NewHTTPClient(
cfg.Timeout,
cfg.Proxy,
cfg.CACertFile,
cfg.InsecureSkipVerify,
),
}
}

Expand Down Expand Up @@ -91,7 +96,7 @@ func (p *OllamaProvider) Complete(ctx context.Context, req CompletionRequest) (*

resp, err := p.client.Do(httpReq)
if err != nil {
return nil, fmt.Errorf("ollama API call failed (is Ollama running at %s?): %w", p.endpoint, err)
return nil, fmt.Errorf("ollama request failed: %w", formatHTTPError(err))
}
defer resp.Body.Close()

Expand Down
9 changes: 7 additions & 2 deletions internal/ai/openai.go
Original file line number Diff line number Diff line change
Expand Up @@ -53,7 +53,12 @@ func NewOpenAIProvider(cfg ProviderConfig) *OpenAIProvider {
model: model,
maxTokens: maxTokens,
temperature: temp,
client: &http.Client{},
client: NewHTTPClient(
cfg.Timeout,
cfg.Proxy,
cfg.CACertFile,
cfg.InsecureSkipVerify,
),
}
}

Expand Down Expand Up @@ -96,7 +101,7 @@ func (p *OpenAIProvider) Complete(ctx context.Context, req CompletionRequest) (*

resp, err := p.client.Do(httpReq)
if err != nil {
return nil, fmt.Errorf("openai API call failed: %w", err)
return nil, fmt.Errorf("openai request failed: %w", formatHTTPError(err))
}
defer resp.Body.Close()

Expand Down
Loading