[gnoi] file: implement Stat in-house using /mnt/host #697

Open
hdwhdw wants to merge 6 commits into sonic-net:master from hdwhdw:daweihuang/gnoi-file-stat-inhouse

hdwhdw commented Jun 3, 2026

Why I did it

File.Stat currently delegates to a SONiC host service over D-Bus
(sonic_service_client.GetFileStat). That introduces a runtime
dependency on the host service for what is fundamentally a metadata
read against the host filesystem — metadata that is already reachable
from inside the gNMI container through the existing /mnt/host bind
mount used by File.Put, File.Remove, and File.TransferToRemote.

Bringing Stat in-house removes the D-Bus hop, lets the RPC be served
entirely from the gNMI binary, and aligns Stat with how the rest of
the file RPCs already access the host filesystem. It also unblocks
directory Stat: per the gNOI proto comment ("Stat will list files at
the provided path") a directory input should return one entry per
child, which the D-Bus path did not implement.

How I did it

  • Added a pure-Go handler pkg/gnoi/file.HandleStat that reads
    metadata via os.Stat / os.ReadDir and reuses the existing
    translatePathForContainer helper to bridge container-vs-host
    paths.
  • The handler now supports both inputs:
    • regular file → single StatInfo for that file
    • directory → one StatInfo per immediate child (non-recursive)
  • gnmi_server.FileServer.Stat is now a thin wrapper: authenticate,
    then delegate to HandleStat. The readFileStat helper and the
    D-Bus dependency for Stat are removed.
  • Permissions are encoded octal-as-decimal as the proto requires
    (e.g. mode 0644 → Permissions=644).
  • umask is reported as a constant 0022 (defaultUmask).
    Rationale: the proto field is "default file creation mask" — a
    process-level attribute, not a per-file one — and it can't be
    derived from os.Stat. syscall.Umask is racy across goroutines
    (it both reads and writes in a single call). The previous D-Bus
    implementation also returned a process umask, so callers that
    treated this as informational see no change.
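
The file-versus-directory behavior described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the PR's code: `entry`, `statPath`, and `makeFixture` are hypothetical names, `entry` stands in for the gNOI StatInfo message, and zeroing the size for directories is inferred from the sample output rather than taken from the implementation.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// entry is a hypothetical stand-in for the gNOI StatInfo message.
type entry struct {
	Path  string
	Size  uint64
	IsDir bool
}

// statPath returns one entry for a regular file, or one entry per
// immediate child (non-recursive) when path is a directory.
func statPath(path string) ([]entry, error) {
	info, err := os.Stat(path)
	if err != nil {
		return nil, err
	}
	if !info.IsDir() {
		return []entry{{Path: path, Size: uint64(info.Size())}}, nil
	}
	children, err := os.ReadDir(path) // os.ReadDir sorts by filename
	if err != nil {
		return nil, err
	}
	out := make([]entry, 0, len(children))
	for _, c := range children {
		ci, err := c.Info()
		if err != nil {
			return nil, err
		}
		e := entry{Path: filepath.Join(path, c.Name()), IsDir: ci.IsDir()}
		if !ci.IsDir() {
			e.Size = uint64(ci.Size()) // assumption: directories report size 0
		}
		out = append(out, e)
	}
	return out, nil
}

// makeFixture builds a temp directory shaped like the stat-smoke example:
// empty.bin, file.txt (10 bytes), and a subdirectory named sub.
func makeFixture() (string, error) {
	dir, err := os.MkdirTemp("", "stat-smoke")
	if err != nil {
		return "", err
	}
	if err := os.WriteFile(filepath.Join(dir, "file.txt"), []byte("hello stat"), 0o644); err != nil {
		return "", err
	}
	if err := os.WriteFile(filepath.Join(dir, "empty.bin"), nil, 0o644); err != nil {
		return "", err
	}
	if err := os.Mkdir(filepath.Join(dir, "sub"), 0o755); err != nil {
		return "", err
	}
	return dir, nil
}

func main() {
	dir, err := makeFixture()
	if err != nil {
		fmt.Println("fixture:", err)
		return
	}
	defer os.RemoveAll(dir)
	entries, err := statPath(dir)
	if err != nil {
		fmt.Println("stat:", err)
		return
	}
	for _, e := range entries {
		fmt.Printf("%s dir=%v size=%d\n", e.Path, e.IsDir, e.Size)
	}
}
```

Calling `statPath` on the fixture's `file.txt` instead of the directory yields a single entry, which mirrors the regular-file case.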

Corner cases worth flagging

Not blockers, but reviewers may want to confirm these match intent:

  1. No path-traversal whitelist on Stat. Cleaned paths like
    /tmp/../etc/hostname resolve to /etc/hostname and Stat returns
    it. This matches the previous D-Bus implementation (which also
    had no whitelist) and is consistent with Stat being a read-only
    metadata RPC. The write/remove RPCs (Put, TransferToRemote,
    Remove) continue to enforce the existing /tmp/, /var/tmp/,
    /host/ whitelist. Easy to add a Stat whitelist later if security
    review wants one.
  2. Absolute symlinks point inside the container, not the host.
    os.Stat follows symlinks; if the link target is host-absolute,
    the kernel resolves it against the gNMI container's filesystem
    root, not the host's, and Stat returns NotFound. This is a
    pre-existing limitation of every /mnt/host-based code path
    (internal/diskspace, pkg/gnoi/file.Put, Remove,
    TransferToRemote); it is not introduced by this change.
    Relative symlinks resolve correctly.
  3. Asymmetric symlink handling within Stat. Stat on a single
    symlink uses os.Stat (follows). Listing a directory uses
    os.ReadDir + entry.Info(), which is lstat-equivalent (does
    not follow). The proto does not specify either way; ls -l does
    not follow either.
  4. PermissionDenied is not exercised by the live test. The
    gNMI container in a vanilla KVM testbed runs as uid=0 with
    CAP_DAC_OVERRIDE, so no filesystem permission can deny it.
    The branch is exercised by the unit tests; it will fire in
    hardened production deployments where the container is
    unprivileged.
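
Corner case 3 can be reproduced outside the gNMI code base. The sketch below is a standalone illustration (the `symlinkViews` helper is hypothetical): it creates a relative symlink in a temp directory, then compares what `os.Stat` reports for the link with what `os.ReadDir` plus `entry.Info()` reports for the same name.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// symlinkViews reports whether os.Stat and os.ReadDir+Info() each see
// a symlink as a symlink (true) or as its target (false).
func symlinkViews() (bool, bool, error) {
	dir, err := os.MkdirTemp("", "symlink-demo")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	if err := os.WriteFile(filepath.Join(dir, "target.txt"), []byte("x"), 0o644); err != nil {
		return false, false, err
	}
	link := filepath.Join(dir, "link")
	// Relative link target, so it resolves the same way on any root.
	if err := os.Symlink("target.txt", link); err != nil {
		return false, false, err
	}

	// os.Stat follows the link and describes target.txt.
	info, err := os.Stat(link)
	if err != nil {
		return false, false, err
	}
	statSeesLink := info.Mode()&os.ModeSymlink != 0

	// DirEntry.Info() describes the link itself (lstat-equivalent).
	entries, err := os.ReadDir(dir)
	if err != nil {
		return false, false, err
	}
	readDirSeesLink := false
	for _, e := range entries {
		if e.Name() != "link" {
			continue
		}
		li, err := e.Info()
		if err != nil {
			return false, false, err
		}
		readDirSeesLink = li.Mode()&os.ModeSymlink != 0
	}
	return statSeesLink, readDirSeesLink, nil
}

func main() {
	s, r, err := symlinkViews()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("os.Stat sees symlink: %v, ReadDir+Info sees symlink: %v\n", s, r)
}
```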

How to verify it

Unit tests
  • 8 new tests in pkg/gnoi/file/stat_test.go covering nil request,
    empty path, relative path, NotFound, regular-file success,
    directory listing (with a subdir), empty directory, and the
    octal-as-decimal permissions encoding for several modes.
  • Existing gnmi_server Stat tests rewritten to drive the new
    handler against real temp files (no D-Bus mocks).
  • Pure-test count: 628 → 636 (+8 new). No regressions.

Live end-to-end test on a sonic-mgmt KVM testbed (vlab-01)

Built the deb from the branch, installed it inside the running gNMI
container via the host bind mount, restarted the supervised gNMI
process, and exercised File.Stat over the gNMI container's local
Unix-domain socket.

Reproducible example:

# 1. Build the deb (any sonic-gnmi build path works; e.g.
#    dpkg-buildpackage in the sonic-slave-trixie container).
dpkg-buildpackage -rfakeroot -b -us -uc

# 2. Push it to the DUT and install it inside the gNMI container.
scp ../sonic-gnmi_0.1_amd64.deb admin@vlab-01:/tmp/
ssh admin@vlab-01 \
  'docker exec gnmi dpkg -i /mnt/host/tmp/sonic-gnmi_0.1_amd64.deb && \
   docker exec gnmi supervisorctl restart gnmi-native'

# 3. Make a fixture and stat the directory through the gNMI UDS.
ssh admin@vlab-01 'mkdir -p /tmp/stat-smoke && \
  printf "hello stat" > /tmp/stat-smoke/file.txt && \
  touch /tmp/stat-smoke/empty.bin && mkdir /tmp/stat-smoke/sub'

ssh admin@vlab-01 \
  "sudo grpcurl -plaintext -d '{\"path\":\"/tmp/stat-smoke\"}' \
       unix:///var/run/gnmi/gnmi.sock gnoi.file.File/Stat"

Real output:

{
  "stats": [
    {
      "path": "/tmp/stat-smoke/empty.bin",
      "lastModified": "1780514631773086705",
      "permissions": 664,
      "umask": 18
    },
    {
      "path": "/tmp/stat-smoke/file.txt",
      "lastModified": "1780514631769086705",
      "permissions": 664,
      "size": "10",
      "umask": 18
    },
    {
      "path": "/tmp/stat-smoke/sub",
      "lastModified": "1780514631773086705",
      "permissions": 775,
      "umask": 18
    }
  ]
}

Notes on the example:

  • unix:///var/run/gnmi/gnmi.sock is the gNMI container's local
    UDS. It bypasses the TLS / mTLS chain that the TCP listeners on
    :50051 / :50052 require, so it is the practical way to
    exercise the server interactively on a test DUT.
  • sudo is required because the socket is mode 0660 root:root.
  • permissions: 664 is octal 0664 encoded as decimal per the
    gNOI proto. umask: 18 is decimal for 0022 octal — the
    constant defaultUmask discussed above.
  • size is omitted (proto default 0) for directories.
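
The two encodings in the notes above differ, which is easy to miss. A minimal sketch, assuming a hypothetical helper `octalAsDecimal` (not the PR's function): permissions have their octal digits re-read as a decimal number, while the umask in the sample output is the plain numeric value of 0022.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// octalAsDecimal renders permission bits the way the sample output shows
// them: the octal digits are re-read as a decimal number (0o664 becomes 664).
func octalAsDecimal(mode os.FileMode) uint32 {
	digits := strconv.FormatUint(uint64(mode.Perm()), 8)
	// Octal digits are always valid decimal digits, so parsing cannot fail.
	n, _ := strconv.ParseUint(digits, 10, 32)
	return uint32(n)
}

func main() {
	fmt.Println("permissions:", octalAsDecimal(0o664)) // octal-as-decimal
	fmt.Println("umask:", uint32(0o022))               // plain numeric value
}
```

This prints 664 for the permissions and 18 for the umask, matching the fields in the captured response.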

Test matrix (all exercised against the deployed binary using the
same grpcurl / UDS flow):

| Path | Type | Expected | Got |
| --- | --- | --- | --- |
| /tmp/stat-smoke/file.txt | regular file (10 B, 0664) | 1 StatInfo with size=10, permissions=664 | |
| /tmp/stat-smoke | directory with 3 children | 3 StatInfo entries, dirs reported with no size | |
| /tmp/this-does-not-exist-xyz | absent | NotFound | |
| "" | | InvalidArgument: path cannot be empty | |
| relative/foo | | InvalidArgument: path must be absolute | |
| /etc/hostname | regular file outside /tmp | success; confirms /mnt/host translation is general | |
| /var/log | non-/tmp directory | full child listing | |
| /tmp/stat-extra/../stat-extra/target.txt | path needing filepath.Clean | resolves to canonical path; response carries the cleaned form | |
| /host/reboot-cause | real SONiC dir with mixed entries | lists history/, platform/, reboot-cause.txt, and the previous-reboot-cause.json symlink | |

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111

Description for the changelog

[gnoi] file: implement Stat in-house using /mnt/host; add directory support

Link to config_db schema for YANG module changes

N/A — no YANG changes.

Replace the DBus-backed File.Stat handler with a pure-Go implementation
that reads metadata directly via os.Stat / os.ReadDir on the host
filesystem, bind-mounted into the gnmi container at /mnt/host. The
existing translatePathForContainer helper (already used by Put,
TransferToRemote, and Remove) handles the host-vs-container path
prefix.

Per the gNOI proto ("Stat will list files at the provided path") the
handler now supports both cases:

  - regular file -> single StatInfo for that file
  - directory   -> one StatInfo per immediate child (non-recursive)

Permissions are encoded octal-as-decimal as the proto requires (e.g.
0644 -> Permissions=644). The umask field is reported as a constant
0022 (defaultUmask); per-file umask is not derivable from os.Stat and
the previous DBus implementation also returned a process umask, so
behavior is unchanged for callers that don't rely on the exact value.

Tests:
  - 8 new pure-Go unit tests in pkg/gnoi/file/stat_test.go covering
    nil/empty/relative path, NotFound, regular file, directory listing
    (incl. subdirs), empty directory, and the octal-as-decimal
    permissions encoding.
  - Existing gnmi_server Stat tests rewritten to drive the new handler
    (no DBus mock); removes the dependency on
    sonic_service_client.GetFileStat for Stat.
  - Local pure-test run: 636 tests pass (was 628; +8 new).
  - Live verified end-to-end on a sonic-mgmt KVM testbed against the
    gnmi container's /var/run/gnmi/gnmi.sock UDS for /tmp/, /etc/,
    /var/log/, and /host/reboot-cause; see PR description for the full
    test matrix and a reproducible grpcurl example.

Signed-off-by: Dawei Huang <daweihuang@microsoft.com>
Copilot AI review requested due to automatic review settings June 3, 2026 19:34

@mssonicbld

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Copilot AI left a comment

Pull request overview

This PR updates the gNOI File.Stat implementation to remove the D-Bus dependency and instead serve Stat requests directly from the gNMI binary by reading host filesystem metadata via the existing /mnt/host bind-mount translation logic.

Changes:

  • Added a pure-Go pkg/gnoi/file.HandleStat that uses os.Stat / os.ReadDir and emits StatInfo (including octal-as-decimal permissions and a constant umask).
  • Updated gnmi_server.FileServer.Stat to authenticate and delegate to the new handler; removed the legacy D-Bus-based Stat path.
  • Added/updated unit tests to cover Stat behavior for files and directories without D-Bus mocking.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| pkg/gnoi/file/file.go | Adds HandleStat, statInfoFromFileInfo, and defaultUmask; switches Stat to host filesystem reads via translated paths. |
| pkg/gnoi/file/stat_test.go | Adds new unit tests for HandleStat and permissions encoding. |
| gnmi_server/gnoi_file.go | Replaces the old D-Bus Stat logic with a thin wrapper delegating to gnoifile.HandleStat. |
| gnmi_server/gnoi_file_test.go | Reworks Stat tests to exercise the new handler behavior (real temp files/dirs). |
| gnmi_server/server_test.go | Updates the gNOI Stat integration-style tests to validate the new filesystem-based behavior. |

Comment thread pkg/gnoi/file/file.go
Comment thread pkg/gnoi/file/file.go
Comment thread pkg/gnoi/file/stat_test.go Outdated
Comment thread gnmi_server/gnoi_file_test.go Outdated

Behavior coverage for File.Stat lives in pkg/gnoi/file/stat_test.go
(NotFound, regular file, directory listing, empty directory, the
octal-as-decimal permissions encoding, nil/empty/relative request
validation). Re-asserting all of that through the gnmi_server gRPC
client just duplicated the pure-package coverage and forced fragile
/mnt/host skip guards into the gnmi_server suite.

Reduce gnmi_server's TestGnoiFile Stat coverage to two minimal
sub-tests that only verify server wiring:

  - the authenticate hook runs before the handler (Unauthenticated
    surfaces as gRPC Unauthenticated)
  - an authenticated request reaches HandleStat and a handler-level
    error (empty path -> InvalidArgument) propagates out as the
    matching gRPC status code

Drop the FileStatSuccess and FileStatFailure sub-tests from
gnmi_server/server_test.go for the same reason; nothing they
asserted was specific to the gnmi_server stack.

Signed-off-by: Dawei Huang <daweihuang@microsoft.com>

@mssonicbld

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Three fixes to the new HandleStat from PR review:

1. Map os.ReadDir NotExist to NotFound. The directory may be removed
   between the os.Stat above the dir branch and the ReadDir inside it;
   that race used to surface as Internal, hiding a benign condition
   behind a server-error code.

2. Reject /mnt/host-prefixed input paths with InvalidArgument.
   translatePathForContainer unconditionally prepends /mnt/host when
   that directory exists, so a client that already sent a
   container-internal path would otherwise hit
   /mnt/host/mnt/host/... and get a confusing NotFound. Tell the
   client to drop the prefix instead.

3. Replace the test-level skipIfMntHost helper with a statTestRoot
   helper that creates the fixture under the *physical* root
   (/mnt/host/tmp/... when /mnt/host is present) while feeding the
   matching *logical* path to HandleStat. This restores Stat test
   coverage on containerized hosts where /mnt/host exists. Also
   adds TestHandleStat_RejectsMntHostPrefix to lock in the new
   InvalidArgument path.
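
The request-validation rules implied by fix 2 and by the test matrix can be sketched as follows. The names `validateStatPath` and `toContainerPath` are hypothetical (the PR uses translatePathForContainer, whose body is not shown here), the sentinel `errInvalid` stands in for gRPC's InvalidArgument to keep the example stdlib-only, and the order of cleaning versus prefix checking is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// hostMount is the bind mount named in the PR description.
const hostMount = "/mnt/host"

// errInvalid is a stand-in for the gRPC InvalidArgument status.
var errInvalid = errors.New("invalid argument")

// validateStatPath rejects empty, relative, and /mnt/host-prefixed paths
// and returns the cleaned form of everything else.
func validateStatPath(p string) (string, error) {
	if p == "" {
		return "", fmt.Errorf("%w: path cannot be empty", errInvalid)
	}
	if !filepath.IsAbs(p) {
		return "", fmt.Errorf("%w: path must be absolute", errInvalid)
	}
	clean := filepath.Clean(p)
	if clean == hostMount || strings.HasPrefix(clean, hostMount+"/") {
		return "", fmt.Errorf("%w: drop the %s prefix", errInvalid, hostMount)
	}
	return clean, nil
}

// toContainerPath prepends the bind mount when it exists. The caller
// passes mountExists so the sketch stays testable without /mnt/host.
func toContainerPath(hostPath string, mountExists bool) string {
	if !mountExists {
		return hostPath
	}
	return filepath.Join(hostMount, hostPath)
}

func main() {
	for _, p := range []string{"", "relative/foo", "/mnt/host/tmp/x", "/tmp/../etc/hostname"} {
		clean, err := validateStatPath(p)
		if err != nil {
			fmt.Printf("%q rejected: %v\n", p, err)
			continue
		}
		fmt.Printf("%q accepted as %s, read from %s\n", p, clean, toContainerPath(clean, true))
	}
}
```

Rejecting the prefix up front is what prevents the doubled /mnt/host/mnt/host/... lookup described in fix 2.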

Signed-off-by: Dawei Huang <daweihuang@microsoft.com>

@mssonicbld

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Integration build (Test stage on the ADO pipeline) failed with:
  gnmi_server/gnoi_file_test.go:8:2: "path/filepath" imported and not used
  gnmi_server/server_test.go:68:2: "github.com/openconfig/gnoi/file" imported as gnoi_file_pb and not used

Pure tests didn't catch this because they don't compile gnmi_server. Remove
the now-orphaned imports left over after the test trim in 8b66d3a.

Signed-off-by: Dawei Huang <daweihuang@microsoft.com>

@mssonicbld

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Diff coverage on the PR was 75% (43/57), below the 80% threshold. The
uncovered lines were all error paths in HandleStat that the test
container can't reach with real os.* calls: as root on tmpfs we don't
hit PermissionDenied (DAC bypassed), and we can't race a TOCTOU between
os.Stat and os.ReadDir.

Avoid gomonkey by introducing a tiny package-internal seam:

  var (
      fsStat    = os.Stat
      fsReadDir = os.ReadDir
  )

HandleStat calls those instead of os.Stat / os.ReadDir directly. Tests
swap them in for the duration of a single test (see withFsStat /
withFsReadDir helpers in stat_test.go) to inject specific errors:

  - TestHandleStat_StatPermissionDenied   (line 660-661)
  - TestHandleStat_StatGenericError       (line 663)
  - TestHandleStat_ReadDirNotExistRace    (line 683-684)
  - TestHandleStat_ReadDirPermissionDenied(line 686-687)
  - TestHandleStat_ReadDirGenericError    (line 689)

Production behavior is unchanged: fsStat and fsReadDir keep their os.*
defaults, the original os.Is{NotExist,Permission} mapping logic is the
same, and no new package is introduced.
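
A rough sketch of how such a seam lets a test force an error path that real filesystem calls cannot reach as root. The `fsStat` variable matches the commit message; the `withFsStat` signature, `statCode`, and `denyAll` are assumptions for illustration, string names stand in for gRPC status codes, and `errors.Is` is used where the commit mentions the os.Is* helpers.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// fsStat is the package-internal seam; production code keeps the default.
var fsStat = os.Stat

// statCode maps the stat result to the name of a status code.
func statCode(path string) string {
	_, err := fsStat(path)
	switch {
	case err == nil:
		return "OK"
	case errors.Is(err, fs.ErrNotExist):
		return "NotFound"
	case errors.Is(err, fs.ErrPermission):
		return "PermissionDenied"
	default:
		return "Internal"
	}
}

// withFsStat swaps the seam and returns a function that restores it.
// Hypothetical shape; the real helper lives in stat_test.go.
func withFsStat(fake func(string) (os.FileInfo, error)) func() {
	orig := fsStat
	fsStat = fake
	return func() { fsStat = orig }
}

// denyAll simulates a permission failure for any path.
func denyAll(name string) (os.FileInfo, error) {
	return nil, &fs.PathError{Op: "stat", Path: name, Err: fs.ErrPermission}
}

func main() {
	restore := withFsStat(denyAll)
	fmt.Println("with fake:", statCode("/tmp"))
	restore()
	fmt.Println("restored: ", statCode("/this-does-not-exist-xyz"))
}
```

In a real test the restore function would typically be registered with `t.Cleanup` so the default is reinstated even when the test fails.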

HandleStat statement coverage: ~78% -> 89.1%; diff coverage clears the
80% gate.

Signed-off-by: Dawei Huang <daweihuang@microsoft.com>

@mssonicbld

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Signed-off-by: Dawei Huang <daweihuang@microsoft.com>

@mssonicbld

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

hdwhdw requested review from ryanzhu706 and sneelam20 June 4, 2026 13:56
hdwhdw added a commit to hdwhdw/sonic-gnmi that referenced this pull request Jun 4, 2026
Replace the Unimplemented stub for File.Get with a pure-Go handler
that streams a host file directly to the caller via the existing
/mnt/host bind mount, the same translation pattern File.Put,
File.TransferToRemote, File.Remove, and (after sonic-net#697) File.Stat
already use. No D-Bus hop, no host service.

Per the gNOI proto: "Get reads and streams the contents of a file
from the target. The file is streamed by sequential messages, each
containing up to 64KB of data. A final message is sent prior to
closing the stream that contains the hash of the data sent."

HandleGet implements that:
  - validate request (non-nil, absolute path, not /mnt/host-prefixed)
  - reject directories and non-regular files with FailedPrecondition
  - cap file size at the package-wide maxFileSize (4 GiB)
  - stream 64 KiB chunks while updating a running MD5
  - send a final HashType{MD5, sum} message
  - check stream context between chunks so cancelled clients abort
    promptly

MD5 matches the convention HandlePut and HandleTransferToRemote
already use in this package; it's integrity-only, not security
critical.
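
The chunk-and-hash loop described above can be sketched without gRPC. This is an illustration only: `streamChunks` and `demo` are hypothetical names, the `send` callback stands in for the stream's Send method, and the request validation, size cap, and context check from the list are left out.

```go
package main

import (
	"bytes"
	"crypto/md5"
	"fmt"
	"io"
)

const chunkSize = 64 * 1024 // 64 KiB per message

// streamChunks reads r in pieces of up to chunkSize, passes each piece to
// send, and returns the MD5 of everything sent. The buffer is reused, so
// send must consume or copy the slice before returning.
func streamChunks(r io.Reader, send func([]byte) error) ([]byte, error) {
	h := md5.New()
	buf := make([]byte, chunkSize)
	for {
		n, err := io.ReadFull(r, buf)
		if n > 0 {
			h.Write(buf[:n])
			if sendErr := send(buf[:n]); sendErr != nil {
				return nil, sendErr
			}
		}
		if err == io.EOF || err == io.ErrUnexpectedEOF {
			break // input exhausted, possibly after a short final chunk
		}
		if err != nil {
			return nil, err
		}
	}
	return h.Sum(nil), nil
}

// demo streams size bytes and reports the chunk count and whether both the
// reassembled payload and the MD5 match the input.
func demo(size int) (int, bool, error) {
	data := bytes.Repeat([]byte{0xAB}, size)
	var got []byte
	chunks := 0
	sum, err := streamChunks(bytes.NewReader(data), func(b []byte) error {
		chunks++
		got = append(got, b...) // copy, because the buffer is reused
		return nil
	})
	if err != nil {
		return 0, false, err
	}
	want := md5.Sum(data)
	return chunks, bytes.Equal(sum, want[:]) && bytes.Equal(got, data), nil
}

func main() {
	chunks, ok, err := demo(200 * 1024)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("200 KiB: %d chunks, round-trip ok=%v\n", chunks, ok)
}
```

A 200 KiB input splits into three full 64 KiB chunks plus one 8 KiB remainder, which is the four-chunk case the tests mention.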

Tests:
  - pkg/gnoi/file/get_test.go covers nil request, empty path,
    relative path, /mnt/host prefix rejection, NotFound, directory
    rejection, empty file (just the hash), small file (single chunk
    + hash), 200 KiB file (forces 4 chunks + hash, validates payload
    and MD5 round-trip), Send error propagation, and cancelled
    context. Uses a fake File_GetServer that copies Contents on Send
    to mirror real gRPC's synchronous-marshal contract.
  - gnmi_server/gnoi_file_test.go: replace the old
    Get_Fails_With_Unimplemented_Error sub-test with a
    Get_Delegates_To_Handler smoke test that confirms an
    authenticated request reaches HandleGet and a handler-level
    error (empty remote_file -> InvalidArgument) propagates out
    through the server stack as the matching gRPC status code.
    Auth-error sub-test kept as-is.
  - Local pure suite: 628 -> 639 (+11). Live verified end-to-end on
    a sonic-mgmt KVM testbed via grpcurl over /var/run/gnmi/gnmi.sock
    for /tmp/, /etc/, and /host/ paths; see PR description.

Signed-off-by: Dawei Huang <daweihuang@microsoft.com>