Conversation

**Warning: Rate limit exceeded**

⌛ **How to resolve this issue?** After the wait time has elapsed, a review can be triggered using the
We recommend that you space out your commits to avoid hitting the rate limit.

🚦 **How do rate limits work?** CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information.

📝 **Walkthrough**

Adds a new CSV file-based connector (implementation, utilities, tests), CLI registration and flags, documentation/checklists for creating dsync connectors, a test-harness flag to skip WriteUpdates tests, and gosec suppression comments in two connector files.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant FileConn as FileConnector
    participant FS as FileSystem
    participant Test as TestHarness
    Client->>FileConn: GeneratePlan(request)
    FileConn->>FS: Stat / list files (dir or single file)
    FS-->>FileConn: file list + sizes
    FileConn-->>Client: Partitions (namespace, cursor, estimatedCount)
    Client->>FileConn: ListData(partition, cursor)
    FileConn->>FS: Open/read file from offset
    FS-->>FileConn: CSV bytes
    FileConn-->>Client: Documents (JSON/BSON) + next cursor
    Test->>FileConn: WriteData(namespace, documents)
    FileConn->>FS: Create/append CSV, write header/rows
    FS-->>FileConn: ack
    FileConn-->>Test: WriteResult
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs
Suggested reviewers
Poem
🚥 **Pre-merge checks** | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

✅ Passed checks (2 passed)

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches

🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@connectors/file/connector.go`:
- Around line 418-437: The CSV header construction in the loop over docs uses
map iteration (for k := range doc) which is nondeterministic; to fix, collect
the non-id keys from doc into a slice, sort.Strings that slice before appending
to writer.header (preserve special handling for "id" or "_id" so id appears
first if present), then set writer.header and call
writer.writer.Write(writer.header) as before; update the block around the docs
iteration that computes header (references: writer.header, docs, doc, header
slice, namespace, writer.writer.Write) so CSV columns are produced in a stable,
deterministic order.
🧹 Nitpick comments (7)
internal/app/options/connectorflags.go (1)
**667-690:** `delimiterStr` on line 668 is only written to but never read.

The `Destination: &delimiterStr` writes the CLI string into the local variable, but the actual assignment to `settings.Delimiter` happens in the `Action` callback. The `delimiterStr` variable itself is never used after that. This works because the `Action` receives the value `v` directly, but it's slightly misleading.

Note that if the user doesn't pass `--delimiter`, the `Action` won't fire and `settings.Delimiter` stays `0`. This is safe only because `NewConn` (line 115 in connector.go) defaults `0` to `DefaultDelimiter`. This implicit coupling is worth being aware of.

Also, a multi-byte UTF-8 character (e.g., `—`) passed as `--delimiter` would be rejected by the `len(v) != 1` byte-length check despite being a single Unicode code point. If only ASCII delimiters are intended, this is fine; otherwise, consider using `utf8.RuneCountInString(v) != 1`.

connectors/file/connector_test.go (1)
**405-405:** Fragile type assertion for `Teardown()`.

`conn.(interface{ Teardown() }).Teardown()` relies on the concrete type having a `Teardown()` method that isn't part of the `ConnectorServiceHandler` interface. If the method signature ever changes, this will panic at runtime rather than failing at compile time. This pattern is repeated on lines 405, 446, 487, and 532.

Consider adding `Teardown` to a small interface or at least adding a comment explaining why this cast is necessary.

connectors/file/connector.go (5)
**97-136:** Dead code: line 104 path-empty check is unreachable.

`parseFileConnectionString` already returns an error when the path is empty (lines 70-72), so `settings.Path` can never be empty at line 104. The check and `ErrPathRequired` are unreachable after a successful parse.
**241-260:** `countRecords` reads the entire file into memory.

`reader.ReadAll()` loads the full CSV content. This method is called from both `GeneratePlan` (which walks a directory, calling it per file) and `GetNamespaceMetadata`. For large CSV files or directories with many files, this can cause significant memory pressure during planning.

A line-counting approach (reading row by row and discarding) would be more memory-efficient if this becomes an issue.
**290-352:** All data returned in a single response — no pagination.

`ListData` loads the entire CSV with `ReadAll()` and returns everything at once with `NextCursor: nil`. For very large files, this means the entire dataset must fit in memory simultaneously (both as CSV records and serialized bytes).

This is acceptable for a v1 CSV connector, but worth noting that it will not scale to large files. If pagination is added later, the cursor bytes (currently holding the file path) would need to also encode an offset.
**491-499:** `Teardown` ignores errors from `file.Close()`.

On certain filesystems (e.g., NFS), `Close()` can fail even after a successful `Flush()`. Since `Teardown()` has no return value, consider at least logging errors:

🔧 Proposed fix

```diff
 func (c *connector) Teardown() {
 	c.writeMutex.Lock()
 	defer c.writeMutex.Unlock()
-	for _, writer := range c.writers {
+	for ns, writer := range c.writers {
 		writer.writer.Flush()
-		writer.file.Close()
+		if err := writer.file.Close(); err != nil {
+			slog.Warn("failed to close CSV file", "namespace", ns, "err", err)
+		}
 	}
 	c.writers = make(map[string]*csvFileWriter)
 }
```
**80-95:** Namespace-to-path round-trip may break on namespaces that contain filesystem-sensitive characters.

`pathToNamespace` replaces path separators with `.` and `namespaceToPath` reverses it. This creates an ambiguity: a file named `a.b.csv` in the root directory produces namespace `a.b`, which `namespaceToPath` converts to `a/b.csv` (a subdirectory). This could silently misroute writes for flat files with dots in their names.

Given that this is a known design choice (dots = hierarchy), just worth keeping in mind as a documented limitation.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@connectors/file/util.go`:
- Around line 35-43: pathToNamespace currently maps file names with dots into
ambiguous namespaces (e.g., pathToNamespace -> "my.data") that namespaceToPath
can't round-trip; modify the mapping to be reversible by encoding dots in file
names before replacing path separators (or choose a reversible delimiter
scheme), e.g., escape or percent-encode literal '.' characters from base file
names inside pathToNamespace and decode them in namespaceToPath; update the
implementations of pathToNamespace and namespaceToPath together (referencing the
functions pathToNamespace and namespaceToPath) so that any encoding/escaping
rule is applied consistently and ensures round-trip fidelity between namespace
and file path.
🧹 Nitpick comments (4)
internal/app/options/connectorflags.go (1)
**667-696:** Delimiter validation uses byte length, not rune/character length.

`len(v) != 1` on line 682 checks byte count. A single multi-byte UTF-8 character (e.g., `·` or `→`) would be rejected even though it's one character. If only ASCII delimiters are intended, this is fine but could be made explicit. If arbitrary single-character delimiters should be supported, use `utf8.RuneCountInString(v) != 1` instead.

Additionally, `rune(v[0])` on line 685 only converts the first byte, which would be incorrect for multi-byte characters — though currently guarded by the `len(v) != 1` check.

Suggested fix for multi-byte support (if desired)

```diff
+	"unicode/utf8"
 ...
 	Action: func(_ *cli.Context, v string) error {
-		if len(v) != 1 {
+		if utf8.RuneCountInString(v) != 1 {
 			return fileconnector.ErrInvalidDelimiter
 		}
-		settings.Delimiter = rune(v[0])
+		r, _ := utf8.DecodeRuneInString(v)
+		settings.Delimiter = r
 		return nil
 	},
```

connectors/file/util.go (1)
**52-91:** `countCSVRows` does not handle bare `\r` (classic Mac) line endings.

Lines 72-73 treat `\r` outside quotes as a non-newline character. Files using bare `\r` as line endings (no `\n`) will be counted as a single row. This is acceptable per RFC 4180 (which specifies CRLF), but worth noting if the connector may ingest files from varied sources.
443-463: Header inference from first document only — subsequent docs with extra keys lose data silently.If a later batch contains documents with keys not present in the first document, those columns are silently dropped since the header was already written. This is a reasonable CSV constraint, but could surprise users with heterogeneous data. Consider logging a warning when a document has keys not in the header.
**489-491:** Missing blank line before `WriteUpdates`.

Line 488 ends `writeCSV` and line 489 starts `WriteUpdates` with no separator.

Nitpick fix

```diff
 	writer.writer.Flush()
 	return writer.writer.Error()
 }
+
 func (c *connector) WriteUpdates(context.Context, *connect.Request[adiomv1.WriteUpdatesRequest]) (*connect.Response[adiomv1.WriteUpdatesResponse], error) {
```
```go
func pathToNamespace(basePath, filePath, extension string) string {
	relPath, err := filepath.Rel(basePath, filePath)
	if err != nil {
		relPath = filepath.Base(filePath)
	}
	relPath = strings.TrimSuffix(relPath, extension)
	relPath = strings.ReplaceAll(relPath, string(filepath.Separator), ".")
	return relPath
}
```
Namespace ↔ path mapping is ambiguous when filenames contain dots.

A file like `/base/my.data.csv` produces namespace `my.data`, which `namespaceToPath` would resolve back to `/base/my/data.csv` — a different path. This round-trip asymmetry could cause issues when writing data to a namespace derived from a source file with dots in its name.
🤖 Prompt for AI Agents
In `@connectors/file/util.go` around lines 35 - 43, pathToNamespace currently maps
file names with dots into ambiguous namespaces (e.g., pathToNamespace ->
"my.data") that namespaceToPath can't round-trip; modify the mapping to be
reversible by encoding dots in file names before replacing path separators (or
choose a reversible delimiter scheme), e.g., escape or percent-encode literal
'.' characters from base file names inside pathToNamespace and decode them in
namespaceToPath; update the implementations of pathToNamespace and
namespaceToPath together (referencing the functions pathToNamespace and
namespaceToPath) so that any encoding/escaping rule is applied consistently and
ensures round-trip fidelity between namespace and file path.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@connectors/file/connector.go`:
- Around line 7-8: Remove the invalid top-of-file gosec directive
`//gosec:disable G304` in connectors/file/connector.go (same issue as in
connectors/s3/connector.go); rely on the existing inline suppressions
(`//nolint:gosec`) that are already placed on the specific os.Open/os.Create
calls (the sites around the code that open/create files) instead of a no-op
global directive.
In `@connectors/s3/connector.go`:
- Around line 7-8: The gosec suppression "//gosec:disable G117" is using a
non-existent rule ID (G117) so it is ineffective; update the directive by either
removing the suppression line or replacing "G117" with a valid gosec rule ID
(e.g., "G304" or "G404") that matches the actual false positive, ensuring the
comment text remains "//gosec:disable <VALID_RULE>" and leaving any surrounding
code and behavior unchanged.
🧹 Nitpick comments (2)
connectors/file/connector.go (2)
**479-481:** Nested/complex values will serialize as Go-syntax strings (e.g., `map[k:v]`).

When `val` is a nested map or slice (common in BSON/JSON data), `fmt.Sprintf("%v", val)` produces Go's default representation, not valid JSON. Consider using `json.Marshal` for non-scalar values to produce a more useful output in the CSV cell.

♻️ Proposed fix

```diff
 			if val != nil {
-				row[i] = fmt.Sprintf("%v", val)
+				switch v := val.(type) {
+				case string:
+					row[i] = v
+				case nil:
+					// leave empty
+				default:
+					// For complex types (maps, slices), use JSON encoding
+					if b, err := json.Marshal(v); err == nil {
+						row[i] = string(b)
+					} else {
+						row[i] = fmt.Sprintf("%v", v)
+					}
+				}
```
**495-506:** Flush errors are silently lost during teardown.

`csv.Writer.Flush()` stores errors internally, retrievable via `writer.writer.Error()`. If the flush fails (e.g., disk full), the error is never checked or logged here.

♻️ Proposed fix

```diff
 	for ns, writer := range c.writers {
 		writer.writer.Flush()
+		if err := writer.writer.Error(); err != nil {
+			slog.Warn("failed to flush CSV writer", "namespace", ns, "err", err)
+		}
 		if err := writer.file.Close(); err != nil {
 			slog.Warn("failed to close CSV file", "namespace", ns, "err", err)
 		}
 	}
```
```go
//gosec:disable G304 -- This is a false positive
```
Same invalid gosec directive syntax as noted in connectors/s3/connector.go.

`//gosec:disable G304` is not recognized by gosec. The actual suppression is handled correctly via inline `//nolint:gosec` comments on the specific `os.Open`/`os.Create` calls (lines 223, 291, 429). This top-of-file directive is a no-op.
🤖 Prompt for AI Agents
In `@connectors/file/connector.go` around lines 7 - 8, Remove the invalid
top-of-file gosec directive `//gosec:disable G304` in
connectors/file/connector.go (same issue as in connectors/s3/connector.go); rely
on the existing inline suppressions (`//nolint:gosec`) that are already placed
on the specific os.Open/os.Create calls (the sites around the code that
open/create files) instead of a no-op global directive.
```go
//gosec:disable G117 -- This is a false positive
```
🧩 Analysis chain
🏁 Script executed:
```shell
# First, let's look at the actual file and context
cat -n connectors/s3/connector.go | head -20
```

Repository: adiom-data/dsync
Length of output: 446
🏁 Script executed:
```shell
# Search for gosec directives in the codebase
rg "//gosec:" --type=go -n
rg "#nosec" --type=go -n
```

Repository: adiom-data/dsync
Length of output: 1113
🏁 Script executed:
```shell
# Check for G117 references
rg "G117" --type=go -n
rg -i "g117" --type=go -n
```

Repository: adiom-data/dsync
Length of output: 215
🌐 Web query:
gosec rule IDs complete list official documentation
💡 Result:
Web search failed: Server error: no LLM provider could handle the message
🌐 Web query:
gosec suppression syntax //gosec:disable vs #nosec``
💡 Result:
In gosec, you can suppress findings inline using either #nosec or //gosec:disable.
**`#nosec` (primary / documented form)**

- Format: `#nosec [RuleList] [-- Justification]`
- Placement: on the same line where the issue is reported (or as a block comment to suppress within a section).
- Examples:
  - `InsecureSkipVerify: true, // #nosec G402 -- controlled test environment`
  - `/* #nosec G401 -- acceptable in this context */` (pkg.go.dev)

**`//gosec:disable` (alternative directive)**

- Works similarly to `#nosec` and supports rule + justification.
- Example: `//gosec:disable G101 -- This is a false positive` (pkg.go.dev)

**Auditing note**

You can run gosec while ignoring in-source suppressions via:
`gosec -nosec=true ./...` (pkg.go.dev)
Citations:
- https://pkg.go.dev/github.com/securego/gosec/v2
🌐 Web query:
gosec G117 rule identifier what does G117 mean
💡 Result:
In gosec, a rule ID like G### is just the tool’s internal identifier for a specific check. In the official securego/gosec rule list, the “G1xx” series currently includes rules like G101–G115 (with G113 retired), and there is no official rule named G117. [1]
So if you’re seeing G117 in output, it typically means one of these is true:
- You’re not actually running upstream securego/gosec (you’re using a fork/wrapper that adds or renumbers rules).
- Another scanner is emitting a “gosec-compatible” ID that doesn’t exist in upstream gosec.
You can confirm what rules your installed binary knows by checking the upstream list and comparing it to your tool/version. [1]
Sources: official gosec README rule list. [1]
[1] (github.com)
🌐 Web query:
complete list gosec rule IDs G101 G102 G103 G104 G105 G106 G107 G108 G109 G110 G111 G112 G113 G114 G115 G116 G117 G118 G119
💡 Result:
gosec rule IDs G101–G119 (upstream securego/gosec)
| Rule ID | Meaning / status |
|---|---|
| G101 | Look for hard coded credentials (github.com) |
| G102 | Bind to all interfaces (github.com) |
| G103 | Audit the use of unsafe block (github.com) |
| G104 | Audit errors not checked (github.com) |
| G105 | Retired: math/big.Int.Exp (CVE fixed) (github.com) |
| G106 | Audit the use of ssh.InsecureIgnoreHostKey (github.com) |
| G107 | URL provided to HTTP request as taint input (github.com) |
| G108 | Profiling endpoint automatically exposed on /debug/pprof (github.com) |
| G109 | Potential integer overflow via strconv.Atoi result conversion to int16/32 (github.com) |
| G110 | Potential DoS via decompression bomb (github.com) |
| G111 | Potential directory traversal (github.com) |
| G112 | Potential slowloris attack (github.com) |
| G113 | Retired: math/big.Rat.SetString overflow (CVE-2022-23772; old Go versions) (github.com) |
| G114 | Use of net/http serve function without support for setting timeouts (github.com) |
| G115 | Potential integer overflow when converting between integer types (github.com) |
| G116 | Trojan Source / bidirectional Unicode text control characters (scans full file content) (pkg.go.dev) |
| G117 | Not defined in upstream gosec rule list/docs (no current rule with this ID) (github.com) |
| G118 | Not defined in upstream gosec rule list/docs (no current rule with this ID) (github.com) |
| G119 | Not defined in upstream gosec rule list/docs (no current rule with this ID) (github.com) |
Citations:
- https://github.com/securego/gosec
- https://pkg.go.dev/github.com/securego/gosec/v2/testutils
**`//gosec:disable` is valid syntax, but G117 is not a recognized gosec rule.**

The `//gosec:disable` directive format is documented and supported by gosec, so that part of the comment is fine. However, G117 does not exist in the official gosec rule list (which currently ranges from G101–G116, with some retired). This means the suppression is ineffective — it's attempting to suppress a non-existent rule. Either reference a valid rule ID (like G304, G404) or remove the suppression if the false positive is not actually triggered by any real gosec check.
🤖 Prompt for AI Agents
In `@connectors/s3/connector.go` around lines 7 - 8, The gosec suppression
"//gosec:disable G117" is using a non-existent rule ID (G117) so it is
ineffective; update the directive by either removing the suppression line or
replacing "G117" with a valid gosec rule ID (e.g., "G304" or "G404") that
matches the actual false positive, ensuring the comment text remains
"//gosec:disable <VALID_RULE>" and leaving any surrounding code and behavior
unchanged.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@connectors/file/connector.go`:
- Around line 516-527: The Teardown method currently calls writer.writer.Flush()
but never checks its error; update connector.Teardown to mirror writeCSV by
calling writer.writer.Flush() then checking writer.writer.Error() and logging
any error (use slog.Warn with namespace and error) before attempting
writer.file.Close(), and continue closing remaining writers; ensure c.writers is
still cleared at the end and reference the csvFileWriter type, the
writer.writer.Flush()/writer.writer.Error() calls, writer.file.Close(), and the
writeCSV behavior as the model for logging/handling.
🧹 Nitpick comments (2)
connectors/file/connector.go (2)
**447-472:** Header is inferred from only the first document — keys in later documents will be silently dropped.

If documents across different `WriteData` batches have heterogeneous schemas, any key not present in the very first document will never appear in the CSV. The warning on line 481 fires, but data is lost. This is an inherent CSV limitation, but consider documenting this constraint prominently (e.g., in the `ConnectorSettings` godoc or the error message itself) so users are not surprised by silent data loss.
**480-480:** Nit: prefer `!writer.headerSet[normalizedKey]` over `== false`.

Idiomatic Go for boolean map lookups.

Proposed fix

```diff
-		if writer.headerSet[normalizedKey] == false {
+		if !writer.headerSet[normalizedKey] {
```
```go
func (c *connector) Teardown() {
	c.writeMutex.Lock()
	defer c.writeMutex.Unlock()

	for ns, writer := range c.writers {
		writer.writer.Flush()
		if err := writer.file.Close(); err != nil {
			slog.Warn("failed to close CSV file", "namespace", ns, "err", err)
		}
	}
	c.writers = make(map[string]*csvFileWriter)
}
```
Teardown ignores flush errors — data loss can go unnoticed.

`writer.writer.Flush()` may fail (e.g., disk full), but `writer.writer.Error()` is never checked. Compare with `writeCSV` (line 510) which correctly checks the flush error.
Proposed fix

```diff
 	for ns, writer := range c.writers {
 		writer.writer.Flush()
+		if err := writer.writer.Error(); err != nil {
+			slog.Warn("failed to flush CSV writer", "namespace", ns, "err", err)
+		}
 		if err := writer.file.Close(); err != nil {
 			slog.Warn("failed to close CSV file", "namespace", ns, "err", err)
 		}
 	}
```

🤖 Prompt for AI Agents
In `@connectors/file/connector.go` around lines 516 - 527, The Teardown method
currently calls writer.writer.Flush() but never checks its error; update
connector.Teardown to mirror writeCSV by calling writer.writer.Flush() then
checking writer.writer.Error() and logging any error (use slog.Warn with
namespace and error) before attempting writer.file.Close(), and continue closing
remaining writers; ensure c.writers is still cleared at the end and reference
the csvFileWriter type, the writer.writer.Flush()/writer.writer.Error() calls,
writer.file.Close(), and the writeCSV behavior as the model for
logging/handling.
Force-pushed from 4a05b82 to ef683ab
Summary by CodeRabbit

- New Features
- Documentation
- Tests
- Chores
- Style