How to find your video IDs and delete those you no longer use #68
Hello, I have a question.
Replies: 1 comment
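Hi @Gapuccino,
Yes, it’s absolutely possible to obtain the IDs of all videos published on your Arc XP sites and identify which ones are no longer in use.
To make this easier, we’ve prepared a Bash script that connects to the Arc Content API, lists all type:video items, and checks how many stories reference each video.
The script generates two CSV reports:
- videos_audit.csv: all videos with their usage counts
- videos_unused.csv: only videos that are not referenced anywhere
You can run it locally on macOS, Linux, or Windows (WSL).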
Prerequisites
Make sure you have curl and jq installed:
sudo apt install jq curl -y # Debian/Ubuntu
# or
brew install jq # macOS
⚙️ Steps to Run
Copy and save the script locally as arcxp_video_audit.sh, then execute the steps above.
#!/usr/bin/env bash
set -euo pipefail
# -------- CONFIG --------
# Your Arc org shortname, e.g. "sandbox.myorg"
ORG="${ORG:-<env>.<org>}"
# Website handle exactly as defined in Arc (e.g. "the-herald")
WEBSITE="${WEBSITE:-<site>}"
# Arc Bearer token with Content API access
TOKEN="${TOKEN:-<token>}"
# Page size for search (max sensible ~100)
SIZE="${SIZE:-50}"
# Base API host (switch to sandbox if needed)
API_HOST="https://api.${ORG}.arcpublishing.com"
# Output files
ALL_CSV="videos_audit.csv"
UNUSED_CSV="videos_unused.csv"
# -------- INIT --------
echo "video_id,title,display_date,usage_count" > "$ALL_CSV"
echo "video_id,title,display_date,usage_count" > "$UNUSED_CSV"
# Track pagination
FROM=0
TOTAL=-1
echo "Enumerating videos for website: ${WEBSITE}"
while : ; do
  SEARCH_URL="${API_HOST}/content/v4/search/published?website=${WEBSITE}&q=type:video&sort=display_date:desc&size=${SIZE}&from=${FROM}&_sourceInclude=_id,headlines.basic,display_date"
  # -f makes curl exit non-zero on HTTP errors (e.g. an expired token) instead of passing an error body to jq
  RESP="$(curl -fsS -H "Authorization: Bearer ${TOKEN}" "$SEARCH_URL")"
  # On first page, capture expected total hits (if available).
  # Assumption: the search response reports the total in .count; the .hits.total paths are kept as fallbacks.
  if [[ "$TOTAL" -lt 0 ]]; then
    TOTAL="$(echo "$RESP" | jq -r '.count // .hits.total.value // .hits.total // 0')"
    echo "Estimated total videos: ${TOTAL}"
  fi
  # A missing content_elements array is treated as an empty page
  COUNT="$(echo "$RESP" | jq -r '(.content_elements // []) | length')"
  if [[ "$COUNT" -eq 0 ]]; then
    break
  fi
  # Iterate videos on this page
  echo "$RESP" | jq -c '.content_elements[]' | while read -r vid; do
    VID_ID="$(echo "$vid" | jq -r '._id')"
    # Flatten newlines and double any quotes so the title is safe inside a quoted CSV field
    TITLE="$(echo "$vid" | jq -r '.headlines.basic // ""' | tr '\n' ' ' | tr -d '\r' | sed 's/"/""/g')"
    DISP_DATE="$(echo "$vid" | jq -r '.display_date // ""')"
    # Count how many stories reference this video ID.
    # Query: stories whose body (content_elements) contains a video element with a matching _id.
    # Videos referenced only outside the story body are not matched by this query.
    # Using track_total_hits to get an accurate count.
    REF_Q="type:story AND content_elements.type:video AND content_elements._id:\"${VID_ID}\""
    # URL-encode with jq (already a prerequisite, so python3 is not needed).
    # The query uses plain spaces: a literal "+" would be encoded as %2B and break the AND operators.
    REF_Q_ENC="$(jq -rn --arg q "$REF_Q" '$q | @uri')"
    REF_URL="${API_HOST}/content/v4/search/published?website=${WEBSITE}&q=${REF_Q_ENC}&size=1&track_total_hits=true&_sourceInclude=_id"
    REF_RESP="$(curl -fsS -H "Authorization: Bearer ${TOKEN}" "$REF_URL")"
    # Assumption: the total is reported in .count; the .hits.total paths are kept as fallbacks
    USAGE_COUNT="$(echo "$REF_RESP" | jq -r '.count // .hits.total.value // .hits.total // 0')"
    # Write to CSV (quote the title)
    echo "${VID_ID},\"${TITLE}\",${DISP_DATE},${USAGE_COUNT}" >> "$ALL_CSV"
    if [[ "$USAGE_COUNT" -eq 0 ]]; then
      echo "${VID_ID},\"${TITLE}\",${DISP_DATE},${USAGE_COUNT}" >> "$UNUSED_CSV"
    fi
  done
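  # Next page
  FROM=$((FROM + SIZE))
  # Stop once the known total is reached. If no total was reported (TOTAL is 0),
  # keep paging and rely on the empty-page check above to end the loop.
  if [[ "$TOTAL" -gt 0 && "$FROM" -ge "$TOTAL" ]]; then
    break
  fi
done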
echo "Done."
echo "All videos CSV: $(pwd)/${ALL_CSV}"
echo "Unused videos CSV: $(pwd)/${UNUSED_CSV}"
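Once the script has finished, the first column of videos_unused.csv holds the IDs worth reviewing before you delete anything. The snippet below is a minimal sketch, not part of the script above, that pulls those IDs into a plain list. The function name `list_unused_ids` and the output file `unused_ids.txt` are hypothetical names chosen for illustration. It assumes the video_id column comes first and contains no commas or quotes, which matches the CSV layout the script writes.

```shell
# Print the video_id column (first field) of an audit CSV, skipping the header row.
# Assumes IDs contain no commas or quotes, so a plain cut on "," is safe for field 1
# even when the quoted title field that follows contains commas.
list_unused_ids() {
  local csv="$1"
  tail -n +2 "$csv" | cut -d',' -f1
}

# Usage: only runs if the audit script has already produced its report.
# unused_ids.txt is a hypothetical output file name.
if [[ -f videos_unused.csv ]]; then
  list_unused_ids videos_unused.csv > unused_ids.txt
  echo "Unused video IDs written: $(wc -l < unused_ids.txt | tr -d ' ')"
fi
```

It is worth checking a few of the listed IDs by hand before removing any video, because a usage count of 0 only means the search query found no matching story.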
Hope this helps!