
Add reusable React components: file uploader and file tree view #898

Draft
ErykKul wants to merge 109 commits into develop from feature/standalone-file-uploader

Contributor

@ErykKul ErykKul commented Dec 5, 2025

What this PR does / why we need it:

This PR ships two reusable frontend components that work in both the React SPA and the legacy JSF UI:

  • dv-uploader — the React file uploader, built as a standalone bundle that JSF dataset-edit pages can mount via <script type="module">. Replaces the PrimeFaces upload widget for direct-upload S3 storage when the matching feature flag is on. The SPA upload page uses the same shared core.
  • dv-tree-view — a virtualised, lazily-loaded folder tree for the dataset Files tab. Replaces the classic PrimeFaces tree on JSF when its feature flag is on; in the SPA it lives next to the existing files table behind a Table/Tree toggle. Tri-state per-row checkboxes with a header select-all, full WAI-ARIA tree keyboard navigation (Up/Down/Left/Right/Home/End/Space/Enter), URL-bookmarkable folder paths via ?view=tree&path=…, and client-side streaming-zip download of the user's selection — multi-file selections are zipped in the browser via client-zip (~3 KB gzip) without any server-side ZIP endpoint.
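
The bookmarkable folder path can be illustrated with a small sketch. Only the `view` and `path` query parameters come from this PR; the helper names, the `table` default, and the path normalisation are assumptions made for illustration and may differ from the real FilesViewToggle code.

```typescript
// Hypothetical helpers: only the `view` and `path` parameters come from the PR.
type FilesView = 'table' | 'tree'

interface FilesViewState {
  view: FilesView
  path: string // folder path relative to the dataset root, '' for the root
}

function parseFilesViewState(search: string): FilesViewState {
  const params = new URLSearchParams(search)
  const view: FilesView = params.get('view') === 'tree' ? 'tree' : 'table'
  // Drop empty segments so "/foo//bar/" and "foo/bar" map to the same bookmark.
  const path = (params.get('path') ?? '')
    .split('/')
    .filter((segment) => segment.length > 0)
    .join('/')
  // A folder path is only meaningful while the tree view is active.
  return { view, path: view === 'tree' ? path : '' }
}

function toFilesViewQuery(state: FilesViewState): string {
  const params = new URLSearchParams()
  if (state.view === 'tree') {
    params.set('view', 'tree')
    if (state.path) params.set('path', state.path)
  }
  const query = params.toString()
  return query ? `?${query}` : ''
}
```

Because the state round-trips through the query string, a page refresh can restore the same folder by feeding `window.location.search` back into `parseFilesViewState`.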

Both bundles render the same shared-core React tree under src/sections/...Core.tsx; each adds a thin src/standalone-<component>/ wrapper that reads window.<componentConfig> and mounts the React tree into a Shadow DOM root. CSS rules are isolated in both directions; only CSS custom properties inherit across the shadow boundary.
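
A minimal sketch of the "read window config" step follows, using the uploader as the example. The field names `siteUrl`, `datasetPid`, `rootElementId` (default `dv-uploader`) and `localesPath` appear in this PR's commit messages; the function name, the result shape, and the `<siteUrl>/locales` default are assumptions, not the wrapper's actual code.

```typescript
interface StandaloneUploaderConfig {
  siteUrl: string
  datasetPid: string
  rootElementId: string
  localesPath: string
}

type ConfigResult =
  | { kind: 'ok'; config: StandaloneUploaderConfig }
  | { kind: 'error'; missing: string[] }

// Hypothetical helper: validates whatever the host page put on window.
function readUploaderConfig(raw: unknown): ConfigResult {
  const source = (typeof raw === 'object' && raw !== null ? raw : {}) as Record<string, unknown>
  const text = (key: string): string | undefined => {
    const value = source[key]
    return typeof value === 'string' && value.trim() !== '' ? value.trim() : undefined
  }

  const siteUrl = text('siteUrl')?.replace(/\/+$/, '') // tolerate a trailing slash
  const datasetPid = text('datasetPid')
  if (siteUrl === undefined || datasetPid === undefined) {
    const missing: string[] = []
    if (siteUrl === undefined) missing.push('siteUrl')
    if (datasetPid === undefined) missing.push('datasetPid')
    return { kind: 'error', missing } // caller can render an inline error
  }

  return {
    kind: 'ok',
    config: {
      siteUrl,
      datasetPid,
      rootElementId: text('rootElementId') ?? 'dv-uploader',
      // Assumed default location; the real default may differ.
      localesPath: text('localesPath') ?? `${siteUrl}/locales`
    }
  }
}
```

In a browser the wrapper would call something like `readUploaderConfig((globalThis as { dvUploaderConfig?: unknown }).dvUploaderConfig)` before mounting, and show the inline error when `kind` is `'error'`.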

This is the dual-mode pattern the project has been moving toward: one React component, two host environments, no fork.

Which issue(s) this PR closes:

Special notes for your reviewer:

Cross-repo coupling — read together

This PR is one of three that ship together:

| Repo | PR | Branch |
|---|---|---|
| dataverse-frontend | this PR (#898) | feature/standalone-file-uploader |
| dataverse-client-javascript | IQSS/dataverse-client-javascript#403 | feature/configurable-uploads |
| IQSS/dataverse | IQSS/dataverse#12382 | 6691_reusable_components |

package.json pins the SDK to a GitHub Packages prerelease (2.2.0-pr403.3d6f638). Before this PR can merge to develop, the SDK needs a stable 2.2.0 release. Reviewers can ignore the prerelease pin during review; we'll flip to the stable version as soon as IQSS/dataverse-client-javascript#403 lands.

There is also a co-landing dependency on IQSS/dataverse#12188 (session-cookie API hardening): the standalone bundles call the API with the user's session cookie via dataverse.feature.api-session-auth. #12188 adds the matching dataverse.feature.api-session-auth-hardening track (Origin/Referer checks plus X-Dataverse-CSRF-Token), and a small follow-up PR on @iqss/dataverse-client-javascript will add the matching CSRF-token wiring on the SDK side (TODO: after IQSS/dataverse#12188 is merged). All four PRs target the same Dataverse release.

Reviewer's guide

The diff is ~13.6k LOC across 122 files; here's how to walk it efficiently.

Start here (~30 min for a thorough read):

  1. src/sections/dataset/dataset-files/files-tree/FilesTree.tsx — the React tree component, ~700 lines, all the keyboard-navigation, virtualisation, selection, and download logic.
  2. src/sections/dataset/dataset-files/files-tree/useFileTree.ts — the lazy-loading hook with cursor pagination + reset-on-version-change.
  3. src/sections/dataset/dataset-files/files-tree/useStreamingZipDownload.ts — the client-side zip engine, including the pause-on-fail / Skip / two-pass UI flow.
  4. src/standalone-tree-view/index.tsx and src/standalone-uploader/index.tsx — both ~150 lines, identical structure: read window config → init i18n + ApiConfig → mount in Shadow DOM with a MutationObserver guard for JSF partial-update remounts.
  5. src/standalone-shared/shadow-mount.ts — the Shadow DOM mounting helper used by both standalone bundles.
  6. docs/reusable-components.md — the contract; explains the dual-mode pattern, Shadow DOM CSS isolation, and the JSF-integration gotchas we hit.
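
For orientation before opening FilesTree.tsx, the virtualisation idea (render only the rows inside the viewport, derived from `scrollTop` and `clientHeight`) can be sketched as below. The function name, the fixed row height, and the overscan margin are assumptions for illustration; the component itself may compute its slice differently.

```typescript
interface VisibleWindow {
  start: number // index of the first row to render (inclusive)
  end: number // index after the last row to render (exclusive)
  offsetTop: number // pixel offset at which the rendered slice is positioned
  totalHeight: number // height of the full list, used to size the scroll area
}

// Assumes every row has the same height; `overscan` rows are rendered
// above and below the viewport to avoid blank flashes while scrolling.
function computeVisibleWindow(
  scrollTop: number,
  clientHeight: number,
  rowCount: number,
  rowHeight: number,
  overscan = 4
): VisibleWindow {
  if (rowCount <= 0 || rowHeight <= 0) {
    return { start: 0, end: 0, offsetTop: 0, totalHeight: 0 }
  }
  const top = Math.max(0, scrollTop)
  const first = Math.min(rowCount, Math.floor(top / rowHeight))
  // Round up so a partially visible row at the bottom edge is included.
  const lastExclusive = Math.ceil((top + Math.max(0, clientHeight)) / rowHeight)
  const start = Math.max(0, first - overscan)
  const end = Math.min(rowCount, lastExclusive + overscan)
  return { start, end, offsetTop: start * rowHeight, totalHeight: rowCount * rowHeight }
}
```

A list would then render `rows.slice(start, end)` inside a container of `totalHeight` pixels, translated down by `offsetTop`.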

Skim or skip (auto-correlated with the source files above):

  • tests/component/sections/dataset/dataset-files/files-tree/*.spec.tsx — spec files that mirror the source. All green; coverage gate (95% branches in .nycrc.json) is cleared.
  • tests/component/sections/shared/file-uploader/*.spec.tsx — uploader hook tests.
  • The locale JSON additions under public/locales/{en,es}/files.json and public/locales/{en,es}/shared.json.

Heads-up:

  • src/sections/shared/file-uploader/FileUploaderPanel.tsx and FileUploaderPanelCore.tsx had a parent/child effect-ordering race introduced when the panel was split for the standalone bundle: the child's success effect called navigate() before the parent's useBlocker had registered the new (false) predicate, so React Router's blocker latched state='blocked' and the "Discard Uploaded Files?" leave modal showed even though the save had succeeded. The fix colocates the navigate effect with useBlocker in the same component (parents own their navigation; PanelCore only toasts on success). See CHANGELOG.md under Fixed.

Lessons paid for during the pilot — captured in docs/reusable-components.md

The first end-to-end pilot turned up several JSF-integration footguns that aren't visible from the SPA side. They're documented in docs/reusable-components.md so the next reusable component doesn't pay for them again:

  • PrimeFaces partial updates re-insert the host <div> without re-running the module script. Already-loaded ES modules don't re-execute on DOM mutation, so the React Root ends up orphaned on a removed div and the freshly-inserted div sits empty. Fix: a MutationObserver in each standalone wrapper that detects host-element identity change and re-mounts.
  • credentials: 'include' + redirect to S3 = browser CORS rejection. When download-redirect=true storage returns a 302 to a presigned S3 URL, browsers carry the credentials mode through the redirect. S3's Access-Control-Allow-Origin: * combined with a credentialed request is rejected. Fix: credentials: 'same-origin' so cookies travel only on the same-origin Dataverse hop.
  • var(--bs-*) references need hardcoded fallbacks when the bundle is hosted in JSF (Bootstrap-3 or no Bootstrap). CSS custom properties penetrate Shadow DOM; without a fallback, border: 1px solid var(--bs-border-color) becomes invalid at computed-value time and no border is drawn.
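
The remount fix from the first bullet can be sketched independently of the DOM. This is an illustration of the host-identity check only; `createRemountGuard`, `MountHandle`, and `mountIntoShadowRoot` are hypothetical names, and the real wrappers may structure this differently.

```typescript
interface MountHandle {
  unmount(): void
}

// Tracks which host element the React root is attached to and re-mounts
// whenever a lookup returns a different element (or none at all).
function createRemountGuard<Host extends object>(
  findHost: () => Host | null,
  mount: (host: Host) => MountHandle
): { check(): void; dispose(): void } {
  let currentHost: Host | null = null
  let handle: MountHandle | null = null

  const check = (): void => {
    const host = findHost()
    if (host === currentHost) return // same element instance: nothing to do
    handle?.unmount() // the old root is attached to a removed element
    handle = host ? mount(host) : null
    currentHost = host
  }

  const dispose = (): void => {
    handle?.unmount()
    handle = null
    currentHost = null
  }

  check() // initial mount
  return { check, dispose }
}

// Browser wiring (not executed here; mountIntoShadowRoot is hypothetical):
// const guard = createRemountGuard(() => document.getElementById('dv-uploader'), mountIntoShadowRoot)
// new MutationObserver(() => guard.check()).observe(document.body, { childList: true, subtree: true })
```

Comparing element identity rather than element id is the point: after a PrimeFaces partial update the id is unchanged, but the element is a new object.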

Known open ends

  • Bundle distribution. Currently the IQSS/dataverse PR commits the prebuilt bundles into webapp/reusable-components/ (WAR-build). The longer-term option (WAR-extract: pull the bundle from a dataverse-frontend-published image at WAR-build time) is not implemented in this PR. Open for the team to decide; nothing in this PR forecloses it.

Suggestions on how to test this:

Two paths.

A. SPA-only (fastest)

npm install
npm run dev        # SPA on :5173, proxies API to :8080
  • Navigate to a dataset, switch to the Tree view via the toggle.
  • Expand folders, select a mix of files and folders, click Download — multi-file selection should produce a streaming zip; single-file should be a direct download.
  • Refresh on a deep folder URL (?view=tree&path=foo/bar) — the tree should restore to that folder.
  • Keyboard nav: tab into the tree, then ↓ ↑ → ← Home End Space Enter should all work.
  • Test the failure UI by stopping the dev backend mid-download — the tray should pause with Retry / Skip / Skip & retry at end / Skip all options.

B. Full JSF integration

Requires the matching IQSS/dataverse PR IQSS/dataverse#12382 and SDK PR IQSS/dataverse-client-javascript#403 deployed. See those PRs' test plans for the dev stack setup. Then:

  • On the dataset Edit Files page, the React uploader should replace the PrimeFaces upload widget when dataverse.feature.react-uploader=true and storage has upload-redirect=true.
  • On the dataset Files tab, with dataverse.feature.react-tree-view=true and the user toggled to Tree, the React tree should load.
  • Verify the tree loads on every Table↔Tree toggle (the regression that motivated the MutationObserver fix).

Per-PR backend images. The IQSS/dataverse#12382 CI publishes preview images so reviewers don't have to build the server PR locally:

  • ghcr.io/gdcc/dataverse:6691-reusable-components
  • ghcr.io/gdcc/configbaker:6691-reusable-components

To exercise this PR's frontend against that server PR's code (new tree endpoint, the feature-flag mounts, the tagging field on the upload-destination response), set REGISTRY=ghcr.io in dev-env/.env and start the stack with ./run-env.sh 6691-reusable-components. The dev-env compose currently hardcodes gdcc/configbaker:unstable — for full PR-image testing override that locally too, or accept that the configbaker bootstrap step runs against unstable while the Dataverse server runs the PR image.

C. Coverage / lint

npx nyc check-coverage   # branches >= 95%
npm run lint
npm run typecheck
# Full unit suite (~4 min on a 6-core box):
/path/to/dataverse-context/scripts/fe-test-parallel.sh
# Authoritative sequential run for the coverage gate (~18 min):
/path/to/dataverse-context/scripts/fe-test-coverage.sh

Does this PR introduce a user interface change?

Yes:

  • New "Tree" entry in the Files-tab view toggle, with its own header (select-all column + Name/Size/Files columns).
  • New bottom-sheet download tray that surfaces per-file failures inline.
  • Per-row checkbox in a dedicated thin column.
  • The standalone uploader replaces the PrimeFaces upload widget on JSF when its feature flag is on.

No mockup links to add; the visual is a direct evolution of the existing SPA FilesTable patterns.

Is there a release notes or changelog update needed for this change?:

Yes — CHANGELOG.md is updated for the full PR scope: original feature surface (uploader, tree view, streaming-zip, SDK bump, domain layer), demo-prep iteration polish (Shadow DOM mount, header select-all, MutationObserver-based remount, credentials: 'same-origin', BS5 color fallbacks), and the late-cycle fixes (typed storage-driver capability checks, Shadow-DOM-aware focus follow, leave-modal stale-blocker race).

Additional documentation:

  • docs/reusable-components.md (in this PR) — the frontend half of the dual-mode contract: build pipeline, Shadow DOM, JSF remount, credentials, fallback-colors rule.
  • IQSS/dataverse PR IQSS/dataverse#12382's release notes — the operator-facing half (feature flags, JVM settings, dev-stack notes, LocalStack/MinIO setup).

AI-assistance disclosure

Some parts of this work were developed with the help of Claude (Anthropic) via Claude Code.

Reviewer attention is still required: AI-assisted code is still author-owned, and we've reviewed every diff that landed. Flagging this so reviewers can apply whatever scrutiny they reserve for AI-touched changes.

coveralls commented Dec 5, 2025

Coverage Status

coverage: 97.184% (-0.4%) from 97.535% — feature/standalone-file-uploader into develop

… handling

- Add fallback to event.dataTransfer.files when webkitGetAsEntry() returns null
  (fixes Firefox drag-and-drop regression)
- Set webkitRelativePath on files obtained via FileSystemFileEntry for consistency
- Strip leading slash from entry.fullPath in both addFromDir and handleDroppedItems
- Add folder selection button with webkitdirectory input for cross-browser folder uploads
ErykKul added 12 commits May 4, 2026 13:12
…lopment

Adds DATAVERSE_FEATURE_API_SESSION_AUTH, DATAVERSE_FEATURE_API_SESSION_AUTH_HARDENING,
and dataverse.siteUrl=http://localhost:8000 so the dev environment supports session-based
API authentication (required for the standalone uploader integration).

Also fixes pre-existing issues blocking the pre-commit hook: missing @types/turndown,
vite.config.uploader.ts not included in tsconfig, and two lint errors in
useFileUploadOperations.spec.tsx.
…ing config

- index.tsx: replace API_KEY auth with SESSION_COOKIE; remove FilesConfig
  import and FilesConfig.init() call
- config.ts: remove useS3Tagging, maxRetries, uploadTimeoutMs, and apiKey
  from window config and StandaloneUploaderConfig; key URL param no longer
  required
- HTML templates: remove stale dvWebloaderConfig properties
- DatasetJSDataverseRepository: fix import alias for getTemplatesByCollectionId
  (getDatasetTemplates did not exist; alias corrects pre-existing type error)

S3 tagging is now server-driven via the tagging field in the upload
destination response; no client-side flag is needed.
The uploader no longer requires an iframe or a host HTML page. JSF (or any
page) can embed it with a single script tag:

  <div id="dv-uploader"></div>
  <script>
    window.dvUploaderConfig = { siteUrl: '...', datasetPid: '...' }
  </script>
  <script src=".../dvwebloader-v2.js"></script>

Changes:
- config.ts: replace URL-param parsing with window.dvUploaderConfig;
  add rootElementId and localesPath options; remove unused
  getDatasetIdentifier helper
- index.tsx: mount to config.rootElementId (default 'dv-uploader');
  derive i18n locales path from config.localesPath or siteUrl default;
  show inline error if required config is missing
- dvwebloaderV2.html: simplified demo page showing window config usage;
  falls back to URL params for siteUrl/datasetPid for easy testing
- embeddedDvWebloader.html: removed (iframe resize messaging no longer needed)
- package.json: remove embeddedDvWebloader.html from build-uploader copy

Auth: session cookie (JSESSIONID) — no API key required.
Requires DATAVERSE_FEATURE_API_SESSION_AUTH on the Dataverse instance.
Embedded styles can no longer leak page-level resets into the host JSF
page: html/body/#root rules and the page-only container layout move into
a separate standalone-page.scss that ships alongside the demo HTML but
is not included in the bundle. The React tree now mounts inside a
dv-uploader-root wrapper so future CSS scoping (PostCSS prefix or
Shadow DOM) has a stable hook.

Also tightens the manualChunks regex so the react chunk holds only
react/react-dom/scheduler instead of every node_modules path that
contains the substring "react" (react-bootstrap, react-router-dom,
react-toastify, etc.). The result is better cache reuse across future
reusable components.
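
A minimal sketch of a tightened chunking rule, assuming a function-style `manualChunks`: the regex below is an illustration rather than the one committed in this PR, and the chunk name `react` is taken from the description above.

```typescript
// Matches only the react, react-dom and scheduler packages: the package
// name must be a complete path segment directly under node_modules.
const REACT_CORE = /[\\/]node_modules[\\/](react|react-dom|scheduler)[\\/]/

function manualChunks(id: string): string | undefined {
  if (REACT_CORE.test(id)) return 'react'
  return undefined // let the bundler decide for everything else
}
```

Requiring path separators on both sides of the package name keeps `react-bootstrap` or `react-router-dom` out of the chunk, which a plain substring test on "react" would not. In a Vite config such a function would go under `build.rollupOptions.output.manualChunks`.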

run-env.sh now builds dist-uploader on demand so nginx never mounts an
empty directory.
# Conflicts:
#	CHANGELOG.md
#	package-lock.json
#	package.json
#	src/dataset/infrastructure/repositories/DatasetJSDataverseRepository.ts
#	src/sections/shared/file-uploader/FileUploader.tsx
#	src/sections/shared/file-uploader/FileUploaderPanel.tsx
#	src/sections/shared/file-uploader/file-upload-input/FileUploadInput.tsx
The merge of develop (42f5533) accidentally rolled back the v0.3.1
release entry in CHANGELOG.md and the corresponding package.json
version bump. Those should not have been part of this branch's diff.

Also reverts two unrelated edits the merge brought in:
- public/config.js: banner URL changed from /modern/ to /spa/.
- @types/turndown was added as a devDependency; not needed (turndown
  is already a runtime dep without typings, and tsc passes without
  the @types package).

Adds a TODO comment in src/standalone-uploader/index.tsx noting that
DataverseApiAuthMechanism is currently deep-imported from the SDK's
internal path; once the SDK prerelease that re-exports it from
core/index.ts is published, the import can move to the package's
public surface.
DDD domain layer for the new tree view (#6691). Lives next to the
existing FileRepository and follows the same layout conventions:

- models: FileTreeItem (FileTreeFolder | FileTreeFile + type guards),
  FileTreePage (response shape with opaque nextCursor, effective
  order/include, optional approximateCount).
- repositories: FileTreeRepository.getNode(...) — single-page lookup.
- use cases:
  - getFileTreeNode: thin error-mapping wrapper.
  - enumerateFileTreeFiles: recursive enumerator used at download
    time. Re-pages each folder via the same repository contract,
    returns a deduplicated flat list of descendant files. Lazy by
    design — no pre-fetch on mount.

Plus FileTreeItemMother / FileTreePageMother and a unit spec for the
enumerator covering paginated walking and overlapping-paths dedup.
Two FileTreeRepository implementations on the infrastructure side:

- FileTreeJSDataverseRepository: calls the new dataset-version tree
  endpoint (GET /api/datasets/{id}/versions/{versionId}/tree) via
  axios. Carries a TODO marker noting the planned switch to the SDK
  helper (listDatasetTreeNode) once the matching SDK prerelease is
  published; the wire format is already aligned.
  On 404/405/501 the repository transparently falls back to the
  in-memory previews adapter so the SPA stays usable in mixed-version
  deployments.

- FileTreeFromPreviewsRepository: synthesises a tree from the existing
  FilePreview[] returned by FileRepository.getAllByDatasetPersistentIdWithCount.
  Groups previews by directoryLabel, applies include/order, paginates
  with an opaque "mem:<offset>" cursor in memory. Cached per
  (persistentId, version). Folder counts track distinct subfolder
  names rather than per-file occurrences.

Tests cover root grouping, immediate children, include filter, cursor
pagination stability, descending order, invalid-cursor rejection,
caching, and the corrected folder-count semantics.
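
The previews-based fallback can be pictured with the following sketch, which groups flat previews by `directoryLabel` and pages the result with an opaque `mem:<offset>` cursor. The types and function names are simplified stand-ins (not the repository's real signatures), and the include/order handling and caching described above are left out.

```typescript
interface PreviewLike {
  name: string
  directoryLabel?: string // e.g. "data/raw"; undefined or '' means the dataset root
}

interface TreeItem {
  type: 'folder' | 'file'
  name: string
  path: string
}

interface TreePage {
  items: TreeItem[]
  nextCursor?: string
}

// Immediate children of `folderPath`: each subfolder once, then the files.
function listChildren(previews: PreviewLike[], folderPath: string): TreeItem[] {
  const prefix = folderPath ? folderPath + '/' : ''
  const folderNames = new Set<string>()
  const files: TreeItem[] = []
  for (const preview of previews) {
    const dir = preview.directoryLabel ?? ''
    if (dir === folderPath) {
      files.push({ type: 'file', name: preview.name, path: prefix + preview.name })
    } else if (dir.startsWith(prefix)) {
      const [firstSegment] = dir.slice(prefix.length).split('/')
      if (firstSegment) folderNames.add(firstSegment)
    }
  }
  const byName = (a: TreeItem, b: TreeItem) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0)
  const folders = Array.from(folderNames).map(
    (name): TreeItem => ({ type: 'folder', name, path: prefix + name })
  )
  // Assumed ordering: folders first, each group sorted by name.
  return [...folders.sort(byName), ...files.sort(byName)]
}

function getPage(all: TreeItem[], limit: number, cursor?: string): TreePage {
  let offset = 0
  if (cursor !== undefined) {
    const match = /^mem:(\d+)$/.exec(cursor)
    if (!match) throw new Error(`Invalid cursor: ${cursor}`)
    offset = Number(match[1])
  }
  const items = all.slice(offset, offset + limit)
  const next = offset + items.length
  return next < all.length ? { items, nextCursor: `mem:${next}` } : { items }
}
```

Callers treat the cursor as opaque and simply pass `nextCursor` back until it is absent, which is the same loop they would run against the server endpoint.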
SPA presentation layer for the files tree view (#6691,
dataverse-frontend#622, dataverse-frontend#117).

New section under src/sections/dataset/dataset-files/files-tree/:

- FilesTree: virtualised lazy tree. Computes visible rows from
  scrollTop/clientHeight; only the slice in the viewport renders.
  Falls back to a fixed height when ResizeObserver is missing
  (test envs).
- FilesTreeRow / FilesTreeHeader / FilesTreeCheckbox: row primitives
  with custom tri-state checkbox (none/partial/all).
- format.ts: byte/count formatters for the row size + count cells.
- icons/FilesTreeIcons.tsx: small inline SVG glyphs (no new deps).
- useFileTree: per-folder fetch + cache + nextCursor paging.
- useFileTreeSelection: path-keyed three-set selection model.
  Folder selection is logical — descendants are not enumerated until
  download time. A deselect-override set captures per-file unchecks
  inside a logically selected folder.
- useFileTreeFlatten: turns the per-folder cache into the visible row
  list (incl. inline loading/error/load-more rows).
- useFileTreeDownload: at download time uses enumerateFileTreeFiles
  to expand selected folders into concrete file IDs, then delegates
  to the existing requestSignedDownloadUrlFromAccessApi flow. No new
  server contract.
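
One way to picture the three-set selection model is sketched below. The commit message only states that selection is path-keyed, that folder selection is logical, and that a deselect-override set records unchecks inside a selected folder; the class, its method names, and the exact toggle rules are assumptions made for illustration.

```typescript
type CheckState = 'none' | 'partial' | 'all'

const isUnder = (path: string, folder: string): boolean => path.startsWith(folder + '/')

class TreeSelection {
  private readonly files = new Set<string>() // individually checked files
  private readonly folders = new Set<string>() // logically selected folders
  private readonly deselected = new Set<string>() // unchecks inside a selected folder

  private hasSelectedAncestor(path: string): boolean {
    return Array.from(this.folders).some((folder) => isUnder(path, folder))
  }

  private dropUnder(set: Set<string>, folder: string): void {
    Array.from(set).forEach((path) => {
      if (isUnder(path, folder)) set.delete(path)
    })
  }

  isFileSelected(path: string): boolean {
    if (this.files.has(path)) return true
    if (this.deselected.has(path)) return false
    return this.hasSelectedAncestor(path)
  }

  toggleFile(path: string): void {
    const inSelectedFolder = this.hasSelectedAncestor(path)
    if (this.isFileSelected(path)) {
      this.files.delete(path)
      if (inSelectedFolder) this.deselected.add(path) // record the override
    } else {
      this.deselected.delete(path)
      if (!inSelectedFolder) this.files.add(path)
    }
  }

  // Selecting or clearing a folder never enumerates its descendants.
  toggleFolder(path: string): void {
    if (this.folders.has(path)) this.folders.delete(path)
    else this.folders.add(path)
    this.dropUnder(this.files, path)
    this.dropUnder(this.deselected, path)
  }

  folderState(path: string): CheckState {
    const anyUnder = (set: Set<string>): boolean =>
      Array.from(set).some((entry) => isUnder(entry, path))
    if (this.folders.has(path) || this.hasSelectedAncestor(path)) {
      return anyUnder(this.deselected) ? 'partial' : 'all'
    }
    return anyUnder(this.files) || anyUnder(this.folders) ? 'partial' : 'none'
  }
}
```

The sketch keeps memory proportional to the number of user actions rather than the number of files, which is what makes "select a folder with thousands of files" cheap until download time. It does not handle toggling a nested folder inside an already selected ancestor.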

New section under src/sections/dataset/dataset-files/files-view-toggle/:

- FilesViewToggle: Table ↔ Tree toggle backed by the ?view=tree URL
  query parameter (bookmarkable).

DatasetFiles.tsx and DatasetFilesScrollable.tsx render the toggle
above the table/tree and switch between FilesTable / FilesTree based
on the URL state. CSS-Modules class for the toggle layout (no inline
styles).

Adds tree.* and view.toggle.* keys to public/locales/en/files.json.

Cypress component tests cover: tri-state selection logic, lazy expand,
visible-row flattening (incl. load-more/loading/error rows), the
toggle's URL-driven state, and FilesTree mounting against a fake
repository for loading / error / empty / populated states.
…act)

New developer guide for building React components that run in BOTH
the SPA and the legacy JSF UI. Covers:

- Why dual-mode (avoid two implementations during the JSF→SPA
  migration).
- The contract every reusable component must follow: standalone
  entry, typed window config, shared core component, repository
  adapter, session-cookie auth, single-file CSS injection.
- Build pipeline (vite.config.uploader.ts → reusable-components/
  + shared chunks) and how to add a new entry.
- Authentication / CSRF prerequisites.
- CSS isolation strategy and the known Bootstrap 3 vs 5 caveat.
- Adding a new reusable component (greenfield) and extracting one
  from an existing SPA section.
- Currently shipped: dv-uploader (and tree-view planned).
- Test conventions and versioning rules.

Cross-links to the backend half in dataverse/doc/Architecture/
reusable_frontend_components.md.
@ErykKul ErykKul force-pushed the feature/standalone-file-uploader branch from 42f5533 to bd2185b Compare May 5, 2026 10:28
ErykKul added 2 commits May 5, 2026 12:33
Pulls the latest prerelease of the SDK published from PR #403 after
its CI went green following the IQSS/dataverse#12182 storage-driver
endpoint move. This version ships:

- The tree node listing helpers (listDatasetTreeNode +
  iterateDatasetTreeNode) that the tree-view track will consume.
- The public re-export of DataverseApiAuthMechanism, replacing the
  current deep import in src/standalone-uploader/index.tsx.
- The server-driven S3 tagging (FileUploadDestination.tagging) that
  removes the duplicate client-side flag.

Lockfile updated; no behaviour change in this commit (consumer code
still uses the existing deep import + inline axios; follow-up commits
will wire in the new public surfaces).
ErykKul added 30 commits May 10, 2026 01:25

Development

Successfully merging this pull request may close these issues.

  • Feature Request/Idea: making the tree view on dataset page bookmarkable
  • Allow selecting of files in Tree View to Edit or Download
