Merged
19 changes: 17 additions & 2 deletions release/release_guide.md
@@ -425,8 +425,23 @@ Set up a few environment variables to simplify Maven commands that follow. This
module. See [checklist](#checklist-to-proceed-to-the-next-step).
2. Continue with Java 17 build for Spark 4 bundle, run `export JAVA_HOME=$(/usr/libexec/java_home -v 17)` and
`./scripts/release/deploy_staging_jars_java17.sh 2>&1 | tee -a "/tmp/${RELEASE_VERSION}-${RC_NUM}.deploy2.log"`
5. Note that the artifacts from Java 17 build are uploaded to a separate staging repo. You need to manually
download those artifacts and upload them to the first staging repo so that all artifacts stay in the same repo.
5. Note that the artifacts from Java 17 build are uploaded to a separate staging repo. Use the
`copy_staging_repo.sh` script to copy all artifacts from the Java 17 staging repo into the Java 11 staging repo
so that all artifacts stay in the same repo.
1. Identify both staging repo IDs from [Apache Nexus Staging Repositories](https://repository.apache.org/#stagingRepositories)
Contributor

🤖 Worth confirming: does the script require the target (Java 11) staging repo to be in the "open" state as well? The note says "Make sure both repos are still in the 'open' state (not closed)" — if Nexus rejects uploads to a closed repo, it might be useful to mention what error the user would see, or whether the script surfaces a clear message in that case.

- AI-generated; verify before applying. React 👍/👎 to flag quality.

(e.g., `orgapachehudi-1177` for Java 17, `orgapachehudi-1176` for Java 11). Make sure both repos are still in
the "open" state (not closed).
2. First do a dry-run to verify the list of artifacts to be copied:
```shell
./scripts/release/copy_staging_repo.sh --dry-run <java17-repo-id> <java11-repo-id>
```
Contributor

🤖 Since both invocations use a relative path (./scripts/release/copy_staging_repo.sh), it could clarify that the commands should be run from the repository root, consistent with the convention used elsewhere in the guide.

- AI-generated; verify before applying. React 👍/👎 to flag quality.
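
One possible way to make the working-directory requirement explicit is a small guard run before either invocation. This is only a sketch: `require_repo_root` is a hypothetical helper that is not part of this PR, and it assumes the script lives at the relative path used in the guide.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the PR): fail early when the current directory
# is not the repository root, i.e. the relative script path does not resolve.
require_repo_root() {
  local script="./scripts/release/copy_staging_repo.sh"
  if [[ ! -f "$script" ]]; then
    echo "ERROR: $script not found; run this from the repository root" >&2
    return 1
  fi
}
```

Calling `require_repo_root || exit 1` ahead of the dry-run would turn a confusing "No such file or directory" into a message that names the fix.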

3. Then run the actual copy:
```shell
./scripts/release/copy_staging_repo.sh <java17-repo-id> <java11-repo-id> 2>&1 | tee -a "/tmp/${RELEASE_VERSION}-${RC_NUM}.copy_staging.log"
```
Contributor

🤖 It might help to clarify what the script considers an "artifact" — specifically, does it also copy the accompanying .asc signature files and .md5/.sha1 checksum files that Apache releases require? If not, the resulting Java 11 staging repo would fail signature verification during the vote. Worth calling out explicitly in the doc (or confirming the script handles them).

- AI-generated; verify before applying. React 👍/👎 to flag quality.
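
Whether signatures are included can be checked offline against a listing of paths. The sketch below assumes a plain text file with one repository-relative path per line (for instance the dry-run listing with leading whitespace trimmed). `missing_signatures` is a hypothetical name, not something the PR provides.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the PR): print every .jar/.pom path in the
# listing that has no matching "<path>.asc" entry in the same listing.
missing_signatures() {
  local list_file="$1"
  local path
  while IFS= read -r path; do
    case "$path" in
      *.jar|*.pom)
        # -x: match the whole line, -F: fixed string, -q: no output
        grep -qxF -- "${path}.asc" "$list_file" || echo "$path"
        ;;
    esac
  done < "$list_file"
}
```

Empty output means every jar and pom in the listing has a signature entry next to it. The same pattern could be extended to `.md5` and `.sha1` if those are expected in the listing.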

4. The script reads Nexus credentials from `~/.m2/settings.xml` (server id `apache.releases.https`), downloads
every artifact from the source repo, and re-uploads them to the target repo. After it finishes, drop the
Contributor

🤖 It might help to explicitly instruct the release manager to verify that all artifacts have landed in the Java 11 staging repo (e.g., by spot-checking in the Nexus UI or relying on the end-of-run success summary) before dropping the Java 17 staging repo. As written, the step sequences "run the copy" → "drop the Java 17 repo" → "review all staged artifacts" (step 6), so if the copy silently missed something, the source artifacts would already be gone when the review happens.

- AI-generated; verify before applying. React 👍/👎 to flag quality.

Java 17 staging repo on Apache Nexus.
Contributor

🤖 Dropping the Java 17 staging repo right after the copy script finishes is irreversible — if any artifact failed to upload (or if signatures/checksums are missing), the original source is gone. This step could direct the release manager to first complete the verification in step 6 (confirming the Java 11 repo contains all expected artifacts, including .asc/.md5/.sha1 files for each jar/pom) and only then drop the Java 17 repo.

- AI-generated; verify before applying. React 👍/👎 to flag quality.
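
A verification pass before the drop could be as simple as diffing two path listings, one per staging repo. This is a sketch under assumptions: `missing_in_target` is a hypothetical helper, and how the two listings are produced is left open (the script in this PR only crawls the source repo).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the PR): print paths that appear in the source
# listing but not in the target listing. Both files hold one path per line.
missing_in_target() {
  local src_list="$1" tgt_list="$2" sorted_tgt
  sorted_tgt=$(mktemp)
  # comm needs both inputs sorted the same way, so pin the locale.
  LC_ALL=C sort -u "$tgt_list" > "$sorted_tgt"
  # -23 keeps only lines unique to the first input (the source listing).
  LC_ALL=C sort -u "$src_list" | LC_ALL=C comm -23 - "$sorted_tgt"
  rm -f "$sorted_tgt"
}
```

If the function prints nothing, every source path is present in the target listing. Any output is a reason to keep the Java 17 staging repo until the gap is explained.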

6. Review all staged artifacts by logging into Apache Nexus and clicking the "Staging Repositories" link in the left pane.
Then find an "open" entry for apachehudi
7. Ensure it contains all 2 (2.12 and 2.13) artifacts, mainly hudi-spark-bundle-2.12/2.13,
160 changes: 160 additions & 0 deletions scripts/release/copy_staging_repo.sh
@@ -0,0 +1,160 @@
#!/usr/bin/env bash
#
# Copies all artifacts from one Nexus staging repository to another.
#
# Usage:
# ./copy_staging_repo.sh [--dry-run] <source-repo-id> <target-repo-id>
#
# Example:
# ./copy_staging_repo.sh --dry-run orgapachehudi-1177 orgapachehudi-1176
# ./copy_staging_repo.sh orgapachehudi-1177 orgapachehudi-1176
#

set -euo pipefail

DRY_RUN=false
if [[ "${1:-}" == "--dry-run" ]]; then
  DRY_RUN=true
  shift
fi

if [[ $# -ne 2 ]]; then
  echo "Usage: $0 [--dry-run] <source-repo-id> <target-repo-id>"
  echo "Example: $0 --dry-run orgapachehudi-1177 orgapachehudi-1176"
  exit 1
fi

SOURCE_REPO="$1"
TARGET_REPO="$2"
NEXUS_BASE="https://repository.apache.org"
SETTINGS_XML="$HOME/.m2/settings.xml"
WORK_DIR="./staging-copy-${SOURCE_REPO}-to-${TARGET_REPO}"
mkdir -p "$WORK_DIR"
CONTENT_BASE="${NEXUS_BASE}/service/local/repositories/${SOURCE_REPO}/content"

echo "==> Work directory: $WORK_DIR"

# ---------------------------------------------------------------------------
# Extract credentials from ~/.m2/settings.xml for apache.releases.https
# ---------------------------------------------------------------------------
if [[ ! -f "$SETTINGS_XML" ]]; then
  echo "ERROR: $SETTINGS_XML not found"
  exit 1
fi

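if command -v xmllint &>/dev/null; then
  # Match on local-name() so the lookup also works when settings.xml declares
  # the default Maven namespace (plain //server would then match nothing).
  NEXUS_USER=$(xmllint --xpath \
    "string(//*[local-name()='server'][*[local-name()='id']='apache.releases.https']/*[local-name()='username'])" "$SETTINGS_XML")
  NEXUS_PASS=$(xmllint --xpath \
    "string(//*[local-name()='server'][*[local-name()='id']='apache.releases.https']/*[local-name()='password'])" "$SETTINGS_XML")
else
  # Fallback without xmllint: line-based sed match inside the <server> block.
  NEXUS_USER=$(sed -n '/<server>/,/<\/server>/{ /<id>apache.releases.https<\/id>/,/<\/server>/{ s/.*<username>\(.*\)<\/username>.*/\1/p; }; }' "$SETTINGS_XML" | head -1 | xargs)
  NEXUS_PASS=$(sed -n '/<server>/,/<\/server>/{ /<id>apache.releases.https<\/id>/,/<\/server>/{ s/.*<password>\(.*\)<\/password>.*/\1/p; }; }' "$SETTINGS_XML" | head -1 | xargs)
fi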

if [[ -z "$NEXUS_USER" || -z "$NEXUS_PASS" ]]; then
  echo "ERROR: Could not extract credentials for 'apache.releases.https' from $SETTINGS_XML"
  exit 1
fi

echo "==> Credentials loaded for user: $NEXUS_USER"

# ---------------------------------------------------------------------------
# Crawl the Nexus content XML API to discover all artifact paths
# ---------------------------------------------------------------------------
# Nexus returns XML with <content-item> elements; <leaf>true</leaf> means file.
# We recursively crawl directories to collect every file's relativePath.
# ---------------------------------------------------------------------------
ARTIFACT_LIST_FILE="$WORK_DIR/.artifact_list"
: > "$ARTIFACT_LIST_FILE"

crawl_nexus_dir() {
  local dir_url="$1"
  local xml
  xml=$(curl --silent --fail "$dir_url") || {
    echo " WARN: Failed to list $dir_url" >&2
    return
  }

  # Parse <relativePath> and <leaf> from each <content-item> block.
  # They appear in matching order, one per block.
  echo "$xml" | awk '
    /<relativePath>/ { gsub(/.*<relativePath>/, ""); gsub(/<\/relativePath>.*/, ""); path=$0 }
    /<leaf>/ { gsub(/.*<leaf>/, ""); gsub(/<\/leaf>.*/, ""); print $0 "\t" path }
  ' | while IFS=$'\t' read -r is_leaf rel_path; do
    if [[ "$is_leaf" == "true" ]]; then
      echo "$rel_path" >> "$ARTIFACT_LIST_FILE"
    else
      crawl_nexus_dir "${CONTENT_BASE}${rel_path}/"
    fi
  done
}

echo "==> Crawling $SOURCE_REPO for artifacts ..."
crawl_nexus_dir "${CONTENT_BASE}/org/apache/hudi/"

# Sort the crawled paths. Nothing is filtered here, so any checksum, signature,
# or maven-metadata.xml entries returned by the crawl stay in the list.
ARTIFACT_LIST=$(sort "$ARTIFACT_LIST_FILE")

TOTAL=$(echo "$ARTIFACT_LIST" | grep -c . || true)

echo "==> Found $TOTAL artifacts."
echo ""

# ---------------------------------------------------------------------------
# Dry-run mode: list files and exit
# ---------------------------------------------------------------------------
if [[ "$DRY_RUN" == true ]]; then
  echo "$ARTIFACT_LIST" | while read -r path; do
    echo " $path"
  done
  echo ""
  echo "==> [DRY RUN] No files were downloaded or uploaded."
  rm -rf "$WORK_DIR"
  exit 0
fi

# ---------------------------------------------------------------------------
# Download all artifacts
# ---------------------------------------------------------------------------
echo "==> Downloading $TOTAL artifacts from $SOURCE_REPO ..."

echo "$ARTIFACT_LIST" | while read -r rel_path; do
  local_path="${WORK_DIR}${rel_path}"
  mkdir -p "$(dirname "$local_path")"
  echo " Downloading: $rel_path"
  curl --silent --fail --output "$local_path" "${CONTENT_BASE}${rel_path}"
done

echo "==> Download complete."

# ---------------------------------------------------------------------------
# Upload each artifact to the target staging repo
# ---------------------------------------------------------------------------
echo "==> Uploading $TOTAL artifacts to $TARGET_REPO ..."

UPLOAD_BASE="${NEXUS_BASE}/service/local/staging/deployByRepositoryId/${TARGET_REPO}"

SUCCESS=0
FAIL=0

# Feed the loop from a here-string instead of a pipe so it runs in the current
# shell; with a pipe, SUCCESS and FAIL would be updated in a subshell and the
# summary below would always report 0.
while read -r rel_path; do
  local_path="${WORK_DIR}${rel_path}"
  echo " Uploading: $rel_path"

  HTTP_CODE=$(curl --silent --output /dev/null --write-out "%{http_code}" \
    -u "${NEXUS_USER}:${NEXUS_PASS}" \
    --upload-file "$local_path" \
    "${UPLOAD_BASE}${rel_path}" 2>&1) || true

  if [[ "$HTTP_CODE" =~ ^2 ]]; then
    SUCCESS=$((SUCCESS + 1))
  else
    FAIL=$((FAIL + 1))
    echo " FAILED (HTTP $HTTP_CODE): $rel_path"
  fi
done <<< "$ARTIFACT_LIST"

echo ""
echo "==> Done. Total: $TOTAL | Success: $SUCCESS | Failed: $FAIL"
echo "==> Artifacts are in: $WORK_DIR (delete when no longer needed)"
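
A note on the final summary line: counters updated inside a `while read` loop are only visible afterwards when the loop runs in the current shell. The sketch below isolates that behavior with two hypothetical functions, `count_piped` and `count_in_current_shell`. Neither is part of the script. It assumes bash with default options, where `lastpipe` is off.

```shell
#!/usr/bin/env bash
# Loop fed by a pipe: the loop body runs in a subshell, so the increment
# is lost when the pipeline ends.
count_piped() {
  local items="$1" n=0 line
  echo "$items" | while read -r line; do
    n=$((n + 1))
  done
  echo "$n"
}

# Loop fed by a here-string: the loop runs in the current shell, so the
# counter keeps its value after the loop.
count_in_current_shell() {
  local items="$1" n=0 line
  while read -r line; do
    n=$((n + 1))
  done <<< "$items"
  echo "$n"
}

items=$(printf 'a\nb\nc')
count_piped "$items"             # prints 0
count_in_current_shell "$items"  # prints 3
```

The same applies to `SUCCESS` and `FAIL` in the upload loop: they reflect the real counts only if that loop is not on the receiving end of a pipe.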