Pipeline for analysing audio files for bird identification using BirdNET
In partnership with the Ngarrindjeri Aboriginal Corporation and the Raukkan Rangers, Flinders University (Global Ecology Laboratory), under the auspices of the Australian Research Council Centre of Excellence for Indigenous and Environmental Histories and Futures (CIEHF), has set up an initial array of 5 passive acoustic recorders to document the change in bird diversity in recently restored wetlands within the Teringie Wetlands complex in South Australia. We are comparing these records with those from existing wetland complexes and from control salt ponds devoid of most birdlife. These data belong to the Ngarrindjeri Nation.
The audio-file repository is available at EcoSounds (but not currently open-access).
R-based BirdNET workflow for:
- processing a single audio file
- processing a large local .tar.zst archive one .flac at a time OR downloading and processing original recordings directly from an authenticated EcoSounds project
- converting non-.wav source audio to .wav
- filtering BirdNET predictions with a repository-local species list
- writing per-file prediction summaries and rolling progress reports
- post-processing existing summary CSVs into plots and aggregate tables
birdnetRpredict/
├── README.md
├── data/
│ └── species_lists/
│ ├── reference/
│ └── regional/
│ └── lower_murray/
├── scripts/
│ ├── analyse_birdnet_output.R
│ ├── birdnet_helpers.R
│ ├── birdnetID.R
│ ├── downloading_user_options.R
│ ├── process_download_common.R
│ ├── process_download_pipeline.R
│ ├── process_ecosounds.R
│ └── process_tar_archive.R
└── www
scripts/birdnetID.R:
- loads the shared helper functions
- points to a single .wav file
- loads a species CSV from this repository
- builds BirdNET models
- extracts date/time and coordinates from the file name where available
- uses the BirdNET range model to narrow candidate species for that file
- runs BirdNET on the audio
- writes:
- a filtered prediction CSV
- a cleaned species summary CSV
The source-processing pipeline is now split across four scripts:
- scripts/downloading_user_options.R
  - holds the user-editable download settings in one place
  - includes source_mode <- "archive" or source_mode <- "ecosounds"
  - includes the archive path, EcoSounds credentials/settings, and other user-adjustable processing settings
- scripts/process_download_pipeline.R
  - sources downloading_user_options.R
  - dispatches automatically to the correct source-specific entrypoint based on source_mode
- scripts/process_tar_archive.R
  - archive-only entrypoint
  - opens a source .tar.zst
  - streams the archive sequentially instead of doing a full pre-scan
  - starts processing as soon as the next .flac is encountered
  - extracts a single .flac while preserving the internal archive path
  - converts that .flac to .wav with ffmpeg
- scripts/process_ecosounds.R
  - EcoSounds-only entrypoint
  - authenticates against the EcoSounds / Acoustic Workbench API
  - lists only the recordings accessible in the chosen project and matching the selected recorder/site filter
  - when a site ID is set, uses the site-specific EcoSounds filter endpoint and a larger page size to obtain the file list faster
  - can optionally restrict processing to a single recorder/site in the user-defined settings
  - downloads each original recording file into a temporary local workspace one file at a time
  - if original-file download is not permitted for your account, falls back to chunked media.wav downloads and concatenates them locally
  - if EcoSounds denies direct access to the original file (403), falls back to the standard media.wav route for that recording
  - can also fall back to the EcoSounds-generated PowerShell downloader script for single-recording downloads
  - processes .wav recordings directly and converts other source formats to .wav when needed
  - deletes the downloaded local audio immediately after that one file is analysed, before downloading the next file
The shared source-processing logic then:
- runs the same BirdNET summary workflow used by the single-file script
- writes per-file CSV outputs
- deletes temporary downloaded/extracted audio files
- moves to the next source item until the run is complete
Archive mode still avoids unpacking the entire archive at once and avoids waiting for a full member enumeration before processing starts. macOS sidecar entries such as ._*.flac and __MACOSX/ metadata are skipped during archive streaming.
scripts/analyse_birdnet_output.R:
- searches recursively under out/ for existing *_birdnet_species_summary.csv files
- combines the summary CSVs that are already present and readable
- filters detections by a user-defined minimum confidence threshold
- bins detections into a user-defined time step (default 60 minutes)
- writes aggregate CSV tables plus plots for:
- identifications over time
- cumulative new species over time
- identifications per species
- temporal autocorrelation, partial autocorrelation, spectral periodicity, and Ljung-Box periodicity tests for both detections-per-bin and unique-species-per-bin
This script is intended to work while archive processing is still incomplete. You can rerun it at any time and it will analyse whatever summary CSVs currently exist in out/.
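The confidence filter and time binning described above can be sketched in base R as follows. This is a minimal illustration, not the code in scripts/analyse_birdnet_output.R: the function name bin_detections, the placeholder species labels, and the example rows are all hypothetical, and only the column names date_time, common_name, and confidence are taken from the summary CSV described in this README.

```r
# Hypothetical example rows; real input would come from *_birdnet_species_summary.csv files.
detections <- data.frame(
  date_time = as.POSIXct(
    c("2025-11-23 08:05:00", "2025-11-23 08:40:00",
      "2025-11-23 09:10:00", "2025-11-23 09:15:00"),
    tz = "UTC"
  ),
  common_name = c("species A", "species B", "species A", "species C"),
  confidence = c(0.91, 0.42, 0.77, 0.66),
  stringsAsFactors = FALSE
)

# Keep detections at or above min_confidence, then count detections and
# unique species per time bin of bin_minutes.
bin_detections <- function(detections, min_confidence, bin_minutes) {
  kept <- detections[detections$confidence >= min_confidence, , drop = FALSE]
  bin_seconds <- bin_minutes * 60
  # Floor each timestamp to the start of its bin (seconds since the epoch).
  bin_key <- floor(as.numeric(kept$date_time) / bin_seconds) * bin_seconds
  bins <- split(kept$common_name, bin_key)
  data.frame(
    bin_start = as.POSIXct(as.numeric(names(bins)), origin = "1970-01-01", tz = "UTC"),
    n_identifications = vapply(bins, length, integer(1)),
    n_unique_species = vapply(bins, function(x) length(unique(x)), integer(1)),
    row.names = NULL
  )
}

binned <- bin_detections(detections, min_confidence = 0.5, bin_minutes = 60)
print(binned)
```

Because the function only reads whatever rows it is given, it can be rerun on a partial set of summary CSVs in the same way the analysis script is described as doing.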
The pipeline first tries to parse metadata from the audio file name.
For a name like:
20251123T080000+0930_REC_-35.52235+139.10576.flac
the scripts derive:
- date/time: 2025-11-23 08:00:00 +0930
- latitude: -35.52235
- longitude: 139.10576
The archive pipeline extracts .flac, converts it to .wav, and keeps the same basename, so the parsed coordinates and timestamp continue to apply during BirdNET processing.
If a file name cannot be parsed, the scripts can fall back to user-defined default latitude, longitude, and date values.
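As a rough sketch of this kind of file-name parsing, the base R function below extracts the timestamp, UTC offset, and signed coordinates with a regular expression. The function name parse_recording_name and the exact pattern are assumptions made for illustration; the helper in scripts/birdnet_helpers.R may be written differently and may accept other name layouts.

```r
# Parse "<YYYYMMDD>T<HHMMSS><+-HHMM>_<label>_<lat><+-lon>.<ext>" style names.
# Returns NULL when the name does not match, so a caller can apply fallbacks.
parse_recording_name <- function(file_name) {
  pattern <- paste0(
    "^([0-9]{8}T[0-9]{6})([+-][0-9]{4})_.*_",
    "([+-]?[0-9]+\\.[0-9]+)([+-][0-9]+\\.[0-9]+)\\.[A-Za-z0-9]+$"
  )
  base <- basename(file_name)
  m <- regmatches(base, regexec(pattern, base))[[1]]
  if (length(m) == 0) {
    return(NULL)
  }
  list(
    # %z consumes the +0930 offset, so the result is the correct instant in UTC.
    date_time = as.POSIXct(paste0(m[2], m[3]), format = "%Y%m%dT%H%M%S%z", tz = "UTC"),
    utc_offset = m[3],
    latitude = as.numeric(m[4]),
    longitude = as.numeric(m[5])
  )
}

parsed <- parse_recording_name("20251123T080000+0930_REC_-35.52235+139.10576.flac")
str(parsed)

# A name that cannot be parsed yields NULL; the fallback values here are hypothetical.
unparsed <- parse_recording_name("recording_001.wav")
latitude <- if (is.null(unparsed)) -35.5 else unparsed$latitude
```

Because the conversion step keeps the basename, the same function would give the same result for the .flac and the converted .wav.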
Species list files are stored in this repository under:
- data/species_lists/reference/
- data/species_lists/regional/lower_murray/
The current scripts use:
data/species_lists/regional/lower_murray/BirdNet_SA_LowerMurray_Tolderol_matches.csv
That file is combined with the BirdNET location/week range model to reduce false positives before prediction summaries are written.
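The combination step can be pictured as a set intersection between the repository species list and the species the range model considers plausible for a given location and week. The sketch below uses plain data frames with a hypothetical scientific_name column and placeholder species labels; it does not call birdnetR, and the real column names in the CSV and in the range-model output may differ.

```r
# Species present in BOTH the repository list and the range-model output.
candidate_species <- function(species_list, range_species) {
  keep <- intersect(species_list$scientific_name, range_species$scientific_name)
  if (length(keep) == 0) {
    # Mirrors the documented behaviour of stopping with an explicit error.
    stop("Species list and range-model output do not overlap for this file.", call. = FALSE)
  }
  keep
}

# Drop predictions for species outside the candidate set.
filter_predictions <- function(predictions, candidates) {
  predictions[predictions$scientific_name %in% candidates, , drop = FALSE]
}

# Placeholder names, for illustration only.
species_list <- data.frame(scientific_name = c("Genus alpha", "Genus beta", "Genus gamma"))
range_species <- data.frame(scientific_name = c("Genus beta", "Genus gamma", "Genus delta"))
predictions <- data.frame(
  scientific_name = c("Genus alpha", "Genus beta", "Genus delta", "Genus gamma"),
  confidence = c(0.80, 0.65, 0.90, 0.55)
)

candidates <- candidate_species(species_list, range_species)
filtered <- filter_predictions(predictions, candidates)
print(filtered)
```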
- R packages: birdnetR, processx, callr, jsonlite
- casks: ffmpeg, tar with --zstd support, zstd
The current environment also expects BirdNET's Python dependencies to be installable through birdnetR.
Main user-editable settings are at the top of the scripts.
In scripts/birdnetID.R, edit:
- audio_file
- species_csv
- fallback_latitude
- fallback_longitude
- prediction_min_confidence
- summary_confidence_threshold
In scripts/downloading_user_options.R, edit:
- source_mode
- archive_file
- ecosounds_workbench_url
- ecosounds_project_id
- ecosounds_recorder_id
- ecosounds_recorder_name
- ecosounds_download_method
- ecosounds_powershell_script
- ecosounds_refresh_powershell_script
- ecosounds_listing_page_size
- ecosounds_auth_token
- ecosounds_user_name
- ecosounds_password
- species_csv
- pipeline_timezone
- fallback_latitude
- fallback_longitude
- prediction_min_confidence
- summary_confidence_threshold
- use_arrow
- stage_heartbeat_seconds
- stage_timeout_seconds
These control which source pipeline is used, whether the run processes a local archive or an authenticated EcoSounds project, which single EcoSounds recorder/site is included (if any), which species filter is used, how strict the prediction summaries are, and how often the source-processing pipeline emits heartbeat updates while waiting on extraction/download or BirdNET subprocess stages.
For EcoSounds access, prefer supplying credentials through environment variables rather than storing secrets in the script:
- ECOSOUNDS_AUTH_TOKEN
- ECOSOUNDS_USERNAME
- ECOSOUNDS_PASSWORD
If ECOSOUNDS_AUTH_TOKEN is supplied, the script uses it directly. Otherwise it logs in with ECOSOUNDS_USERNAME + ECOSOUNDS_PASSWORD and then downloads recordings from the selected project.
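The precedence just described (token first, then username plus password) could be expressed as below. This is a sketch under assumptions: the function name resolve_ecosounds_credentials and the shape of the returned list are hypothetical, and the actual login request is deliberately left out because its details are not documented here.

```r
# Decide which EcoSounds credential route to use from environment variables.
resolve_ecosounds_credentials <- function() {
  token <- Sys.getenv("ECOSOUNDS_AUTH_TOKEN", unset = "")
  if (nzchar(token)) {
    return(list(method = "token", token = token))
  }
  user <- Sys.getenv("ECOSOUNDS_USERNAME", unset = "")
  password <- Sys.getenv("ECOSOUNDS_PASSWORD", unset = "")
  if (nzchar(user) && nzchar(password)) {
    return(list(method = "login", user_name = user, password = password))
  }
  stop(
    "No EcoSounds credentials found: set ECOSOUNDS_AUTH_TOKEN, ",
    "or both ECOSOUNDS_USERNAME and ECOSOUNDS_PASSWORD.",
    call. = FALSE
  )
}
```

Reading secrets with Sys.getenv() at run time keeps them out of the script and out of version control.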
To process only one EcoSounds recorder at a time, set one of these near the top of scripts/downloading_user_options.R:
- ecosounds_recorder_id <- 7238L for the GEL_A EcoSounds site in project 1281
- ecosounds_recorder_name <- "GEL_A" for an exact recorder/site name match when you know the API site name matches that label
The EcoSounds listing request itself is restricted to that recorder/site before files are queued for download. Leave both empty if you want the whole project. Do not set both at once.
The default EcoSounds settings now also include:
- ecosounds_download_method <- "api_then_powershell" to try the direct API first and then fall back to the EcoSounds-generated PowerShell downloader
- ecosounds_powershell_script <- "/Users/brad0317/Downloads/download_audio_files.ps1" as an optional local script path to reuse when refresh is disabled
- ecosounds_refresh_powershell_script <- TRUE to regenerate the downloader script from EcoSounds for the current authenticated session before PowerShell fallback downloads
- ecosounds_listing_page_size <- 500L to reduce the number of listing pages needed for large sites
In scripts/analyse_birdnet_output.R, edit the user-defined settings directly near the top of the script:
- summary_root
- output_root
- analysis_timezone
- bin_minutes
- diversity_window_days
- top_species_time_bin_minutes
- rolling_mean_window_days
- min_confidence
- periodicity_max_lag_bins
- show_plots_in_session
These control which existing summary CSVs are included, where the analysis outputs are written, the temporal bin size used by the plots, the diversity-analysis window length, and the minimum confidence required for a detection to be counted.
To run the single-file workflow:
Rscript scripts/birdnetID.R
To run the source-processing pipeline:
Rscript scripts/process_download_pipeline.R
With source_mode <- "ecosounds", make sure your EcoSounds credentials are available first, for example:
export ECOSOUNDS_USERNAME="your_username"
export ECOSOUNDS_PASSWORD="your_password"
Rscript scripts/process_download_pipeline.R
Or set ecosounds_auth_token, ecosounds_user_name, and ecosounds_password directly in scripts/downloading_user_options.R before running the pipeline.
If you want to force a specific source entrypoint directly, you can run:
Rscript scripts/process_tar_archive.R
Rscript scripts/process_ecosounds.R
Those entrypoints still read scripts/downloading_user_options.R, but they now stop if source_mode does not match the script you ran.
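The dispatch on source_mode and the mismatch guard can be sketched in a few lines of R. Both function names are hypothetical and the error wording is invented for illustration; only the two source_mode values and the script paths come from this README.

```r
# Map a source_mode value to the entrypoint script that handles it.
entrypoint_for <- function(source_mode) {
  switch(
    source_mode,
    archive = "scripts/process_tar_archive.R",
    ecosounds = "scripts/process_ecosounds.R",
    stop("Unknown source_mode: ", source_mode, call. = FALSE)
  )
}

# Guard placed at the top of a source-specific entrypoint.
require_source_mode <- function(source_mode, expected) {
  if (!identical(source_mode, expected)) {
    stop(
      sprintf("source_mode is \"%s\" but this script requires \"%s\".", source_mode, expected),
      call. = FALSE
    )
  }
  invisible(TRUE)
}

source_mode <- "archive"
entrypoint_for(source_mode)
require_source_mode(source_mode, expected = "archive")
```

A dispatcher would then call source() on the returned path; that step is omitted here so the sketch runs outside the repository.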
Open scripts/analyse_birdnet_output.R in RStudio, VS Code, or another R editor, adjust the settings block if needed, then run the script inside R.
The script is intended to be run as a standalone analysis file rather than driven by command-line arguments.
It uses ggplot2 for all figures.
By default, the figures are shown in the active R graphics session and also saved as .png files.
For each processed recording, the pipeline writes:
- *_birdnet_predictions.csv
- *_birdnet_species_summary.csv
The summary CSV contains:
- date_time
- scientific_name
- common_name
- confidence
- cumulative_number_of_new_species_detected
- total_number_of_species_identified
For source-processing runs, output is written under:
out/<source_name>_birdnet_output/
The source-processing workflow writes:
- *_processing_manifest.csv
  machine-readable log of file-by-file outcomes
- *_file_results.txt
  continually updated text summary by file, including status, timing, coordinates, outputs, and any errors
- *_summary_of_summaries.txt
  continually updated overall run summary, including progress, current file, current phase, elapsed time, ETA, and cumulative species count
All source-processing outputs are written to the local repository drive, not back to the source archive drive or remote EcoSounds repository. In EcoSounds mode, the workflow does not build up a local cache of all recordings first; it downloads one source audio file, analyses it locally, deletes the temporary local audio, and only then moves to the next recording.
Post-processing outputs are written under:
out/analysis/confidence_<threshold>_bin_<minutes>min/
The analysis workflow writes:
- birdnet_analysis_summary.txt
  text summary of the current analysis run, including skipped or incomplete input files
- birdnet_analysis_input_files.csv
  one row per discovered summary CSV, showing whether it was loaded successfully
- birdnet_analysis_filtered_detections.csv
  all retained detections after applying the chosen minimum confidence threshold, now including local light-phase classification where timestamps and coordinates can be recovered
- birdnet_identifications_by_time_bin.csv
  identifications per time bin, plus unique species count per bin
- birdnet_identifications_by_time_bin_by_recorder.csv
  identifications per time bin for each recorder separately
- birdnet_light_phase_calendar.csv
  local civil-dawn, sunrise, sunset, and civil-dusk times by date/location, plus total daylight, twilight, and darkness hours
- birdnet_light_phase_calendar_long.csv
  the same local light-phase calendar in long format, one row per date/location/phase
- birdnet_light_phase_sampling_effort.csv
  total sampled recording hours falling within daylight, twilight, and darkness across all currently analysed recordings
- birdnet_light_phase_sampling_effort_by_recorder.csv
  the same sampled light-phase effort summarized separately for each recorder
- birdnet_diel_activity_by_species.csv
  per-species daylight, twilight, and darkness detections, normalized detections-per-hour, dominant light phase, and normalized night-versus-day rate ratio
- birdnet_diel_activity_by_species_by_recorder.csv
  recorder-specific diel activity summaries by species using the same normalized light-phase metrics
- birdnet_top_10_species_detections_through_time.csv
  detections through time for the 10 most detected species across all currently analysed recorders
- birdnet_top_10_species_detections_through_time_by_recorder.csv
  detections through time for the 10 most detected species within each recorder
- birdnet_cumulative_new_species_by_time_bin.csv
  newly detected species per time bin and cumulative species richness through time
- birdnet_cumulative_new_species_by_time_bin_by_recorder.csv
  cumulative new-species summaries calculated separately for each recorder
- birdnet_identifications_by_species.csv
  species ranked from most frequently identified to least frequently identified
- birdnet_identifications_by_species_by_recorder.csv
  species-frequency summaries calculated separately for each recorder
- birdnet_identifications_by_species_by_month.csv
  species ranked by identification frequency within each month of the year
- birdnet_identifications_by_species_by_month_by_recorder.csv
  month-by-species summaries calculated separately for each recorder
- birdnet_monthly_diversity_metrics.csv
  recorder-level diversity metrics calculated from detections-as-abundance across user-defined diversity windows, including Shannon index, Simpson index, and Hill numbers for q = 1 and q = 2
- birdnet_monthly_diversity_metrics_overall.csv
  diversity metrics after combining detections across all currently analysed recorders within each user-defined diversity window
- birdnet_monthly_diversity_metrics_daily_incidence.csv
  recorder-level diversity metrics calculated from daily species incidence within each user-defined diversity window
- birdnet_monthly_diversity_metrics_daily_incidence_overall.csv
  diversity metrics after combining daily species incidence across all currently analysed recorders within each user-defined diversity window
- birdnet_raw_species_richness_by_diversity_window.csv
  raw species richness calculated as the number of unique species detected within each user-defined diversity window
- birdnet_identification_acf.csv
  detection-count autocorrelation values by temporal lag, retained for backward compatibility
- birdnet_identification_spectrum.csv
  detection-count spectral-density summary for inspecting periodicity, retained for backward compatibility
- birdnet_identification_periodicity_by_recorder.csv
  recorder-specific detection-count temporal diagnostics used by the recorder comparison figure
- birdnet_temporal_diagnostics.csv
  combined temporal diagnostics for both detections per bin and unique species identified per bin, including autocorrelation (ACF), partial autocorrelation (PACF), and spectral-density curves
- birdnet_temporal_periodicity_tests.csv
  Ljung-Box test summaries at key lags for both detections per bin and unique species identified per bin
- birdnet_temporal_spectral_peaks.csv
  ranked dominant spectral peaks, including the strongest candidate periods and their relative power
- birdnet_temporal_diagnostics_by_recorder.csv
  recorder-specific ACF, PACF, and spectral-density values for both detections per bin and unique species identified per bin
- birdnet_temporal_periodicity_tests_by_recorder.csv
  recorder-specific Ljung-Box test summaries at key lags
- birdnet_temporal_spectral_peaks_by_recorder.csv
  recorder-specific dominant spectral peaks for easier interpretation of likely recurring periods
- birdnet_identifications_over_time.png
- birdnet_identifications_over_time_linear.png
- birdnet_top_10_species_detections_through_time.png
- birdnet_diel_activity_by_species.png
- birdnet_day_night_calling_bias_by_species.png
- birdnet_cumulative_new_species.png
- birdnet_identifications_by_species.png
- birdnet_identifications_by_species_by_month.png
- birdnet_monthly_diversity_metrics.png
- birdnet_monthly_diversity_metrics_daily_incidence.png
- birdnet_raw_species_richness_by_diversity_window.png
- birdnet_periodicity.png
  overall temporal-diagnostics figure combining all recorders currently present in the analysis, with ACF, PACF, and spectral-density panels for detections and unique species
- birdnet_identifications_over_time_by_recorder.png
- birdnet_identifications_over_time_by_recorder_linear.png
- birdnet_top_10_species_detections_through_time_by_recorder.png
- birdnet_cumulative_new_species_by_recorder.png
- birdnet_identifications_by_species_by_recorder.png
- birdnet_identifications_by_species_by_month_by_recorder.png
- birdnet_monthly_diversity_metrics_by_recorder.png
- birdnet_periodicity_by_recorder.png
  multi-panel recorder-comparison temporal-diagnostics figure with detection and species-richness periodicity panels
- recorders/<RECORDER_ID>/...
  recorder-specific figures written for each recorder that currently has usable detections (for example recorders/GEL_A/)
In the species-frequency plots, the identification axis is shown on a log10 scale, and common names are displayed in lowercase except where proper nouns remain capitalised. Latin names are italicised in the species-axis labels.
The root-level analysis figures are the combined overall results across all recorders currently present in out/. Additional recorder-comparison figures are written as multi-panel plots, and recorder-specific figures are written into the recorders/ subdirectory as each recorder becomes available.
The detections-over-time plots now include two versions: a log10-scale plot with black bars and a red trailing running mean controlled by rolling_mean_window_days, and a separate linear-scale plot with black bars only and no running mean. The linear-scale versions include low-alpha background bands showing local morning twilight (pink), daylight (yellow), evening twilight (pink), and night (dark blue).
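A trailing running mean of the kind drawn on the log10-scale plot can be computed in base R with stats::filter() and sides = 1, which averages each value with the values before it. The sketch assumes one count per day and a window measured in days; how the analysis script maps rolling_mean_window_days onto its time bins is not stated here, and the counts are invented for illustration.

```r
# Trailing (backward-looking) mean over `window` consecutive values.
# The first window - 1 positions are NA because the window is incomplete.
trailing_mean <- function(x, window) {
  as.numeric(stats::filter(x, rep(1 / window, window), sides = 1))
}

daily_counts <- c(4, 8, 6, 10, 2, 12)  # hypothetical detections per day
smoothed <- trailing_mean(daily_counts, window = 3)
print(smoothed)
```

A trailing window uses only past values, so the smoothed line for a given day does not change when later recordings are added to out/.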
The count-based diversity metrics treat the number of detections per species as the abundance proxy for Shannon, Simpson, and Hill-number calculations, and are produced across user-defined diversity windows set with diversity_window_days. A parallel daily-incidence diversity summary is also written, and raw species richness is summarized across those same diversity windows in a separate plot/table.
The top-species time-series plots default to 24-hour bins through top_species_time_bin_minutes <- 24 * 60, but that bin size can be changed directly in scripts/analyse_birdnet_output.R.
In the recorder-comparison diversity figure, each recorder-by-metric panel now uses its own y-axis range so Shannon, Simpson, and Hill-number panels are scaled to their local maxima.
The temporal periodicity figures now show both detections per time bin and unique species identified per time bin. Autocorrelation function (ACF) and partial autocorrelation function (PACF) panels include approximate 95% bounds under a white-noise null (± 1.96/√N, where N is the number of time bins), spectral-density panels mark the strongest candidate periods, and the companion Ljung-Box CSV outputs provide a compact test summary at key lags for easier interpretation of recurring temporal structure.
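The same family of diagnostics can be reproduced on any per-bin count series with functions from the stats package. The sketch below runs them on a simulated hourly series with a built-in 24-hour cycle; the data are synthetic and the lag choices are assumptions, so this illustrates the techniques rather than the exact calls made in scripts/analyse_birdnet_output.R.

```r
set.seed(1)

# Synthetic hourly detection counts over 14 days with a 24-hour cycle.
n_bins <- 24 * 14
hour <- seq_len(n_bins) - 1
counts <- rpois(n_bins, lambda = 5 + 4 * sin(2 * pi * hour / 24))

# Autocorrelation and partial autocorrelation up to 48 lags (two days).
acf_out <- stats::acf(counts, lag.max = 48, plot = FALSE)
pacf_out <- stats::pacf(counts, lag.max = 48, plot = FALSE)

# Approximate 95% bounds under a white-noise null: +/- 1.96 / sqrt(N).
band <- stats::qnorm(0.975) / sqrt(n_bins)

# Ljung-Box test for autocorrelation up to lag 24.
lb <- stats::Box.test(counts, lag = 24, type = "Ljung-Box")

# Raw periodogram; frequencies are in cycles per bin, so period = 1 / frequency.
spec <- stats::spec.pgram(counts, taper = 0, fast = FALSE, detrend = TRUE, plot = FALSE)
peak_period_bins <- 1 / spec$freq[which.max(spec$spec)]

cat("white-noise band:", round(band, 3), "\n")
cat("Ljung-Box p-value at lag 24:", format.pval(lb$p.value), "\n")
cat("dominant period (bins):", round(peak_period_bins, 1), "\n")
```

ACF or PACF values outside the band, a small Ljung-Box p-value, and a spectral peak near 24 bins would all point to a daily rhythm in an hourly series.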
The diel activity analysis reconstructs local daylight, civil twilight, and darkness from each recording's timestamp and recovered coordinates, summarizes the sampled hours available in each light phase, and reports normalized detections-per-hour so day-versus-night comparisons are not biased by unequal photoperiod or recording effort.
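The effort normalization can be illustrated with simple arithmetic: divide the detections in each light phase by the recording hours sampled in that phase, then compare the rates. The function name diel_rates and the numbers are hypothetical, and the exact phase labels used by the analysis script may differ.

```r
# Detections per sampled hour in each light phase, plus a night-versus-day rate ratio.
diel_rates <- function(detections, sampled_hours) {
  stopifnot(identical(names(detections), names(sampled_hours)))
  rate <- ifelse(sampled_hours > 0, detections / sampled_hours, NA_real_)
  names(rate) <- names(detections)
  list(
    rate_per_hour = rate,
    night_day_ratio = unname(rate["darkness"] / rate["daylight"])
  )
}

# Invented counts and effort for one species.
example <- diel_rates(
  detections = c(daylight = 120, twilight = 15, darkness = 30),
  sampled_hours = c(daylight = 60, twilight = 10, darkness = 30)
)
print(example)
```

In this made-up case the raw counts suggest four times as many daytime as night-time detections (120 versus 30), but once the unequal sampled hours are accounted for the daytime rate is only twice the night-time rate.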
For each recorder and each calendar month, the analysis script:
- filters detections to those meeting min_confidence
- groups the remaining detections by recorder and month
- counts the number of detections for each species within that recorder-month
- treats those species-level detection counts as the abundance vector
- converts counts to relative abundance: $p_i = n_i / N$
where:
- $n_i$ = number of detections for species $i$
- $N$ = total number of detections across all species in that recorder-month
- $p_i$ = relative abundance of species $i$
The diversity metrics are then calculated as follows:
Shannon index:
$$H' = -\sum_i p_i \ln p_i$$
Simpson index. The script reports the Gini-Simpson form:
$$1 - \sum_i p_i^2$$
Hill number for q = 1. This is the exponential of Shannon diversity:
$${}^1D = e^{H'}$$
Hill number for q = 2. This is the inverse Simpson concentration:
$${}^2D = \frac{1}{\sum_i p_i^2}$$
Interpretation in this workflow:
- larger Shannon and Hill q = 1 indicate greater effective diversity with sensitivity to both common and less-common species
- larger Simpson and Hill q = 2 indicate greater diversity with stronger weighting toward the most frequently detected species
- because the pipeline uses detections rather than direct counts of individuals, these are diversity estimates based on the assumption that detection frequency is a reasonable proxy for relative abundance
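The four metrics can be computed from a vector of per-species detection counts in a few lines of base R. The function name diversity_metrics and the example counts are hypothetical; the formulas are the ones given above.

```r
# Shannon, Gini-Simpson, and Hill numbers (q = 1, q = 2) from detection counts.
diversity_metrics <- function(counts) {
  counts <- counts[counts > 0]          # species with zero detections contribute nothing
  p <- counts / sum(counts)             # relative abundance p_i = n_i / N
  shannon <- -sum(p * log(p))           # H' with natural logarithm
  simpson_concentration <- sum(p^2)
  c(
    shannon = shannon,
    gini_simpson = 1 - simpson_concentration,
    hill_q1 = exp(shannon),
    hill_q2 = 1 / simpson_concentration
  )
}

# Invented detection counts for three species in one recorder-month.
uneven <- diversity_metrics(c(50, 30, 20))
# Four equally detected species: both Hill numbers equal the species count.
even <- diversity_metrics(c(10, 10, 10, 10))

print(round(uneven, 4))
print(round(even, 4))
```

With perfectly even detections both Hill numbers equal the number of species, which makes them easier to read as an effective number of species than the raw Shannon or Simpson values.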
When Rscript scripts/process_download_pipeline.R is running, the console reports:
- current file index and percent complete
- current per-file stage percent
- current source item being processed
- archive streaming progress or EcoSounds listing progress
- extraction/download step
- .flac to .wav conversion step
- BirdNET range-filter step
- BirdNET prediction step
- output-writing step
- cleanup step
- per-file elapsed time
- estimated time remaining
Archive streaming, EcoSounds downloads, extraction, and conversion stages are monitored through processx, so they emit recurring heartbeat updates instead of staying silent until the subprocess returns.
BirdNET analysis is also run in a monitored child R process through callr, so TensorFlow/TFLite warnings should no longer make the main console progress appear frozen.
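A heartbeat loop around a processx subprocess might look like the sketch below. The function name run_with_heartbeat and the message wording are hypothetical, and the arguments only mirror the stage_heartbeat_seconds and stage_timeout_seconds settings by analogy; the pipeline's own monitoring code may be structured differently.

```r
# Run an external command, printing a heartbeat while it is still alive.
# Requires the processx package; returns the process exit status.
run_with_heartbeat <- function(command, args = character(),
                               heartbeat_seconds = 5, timeout_seconds = Inf) {
  proc <- processx::process$new(command, args)
  started <- Sys.time()
  while (proc$is_alive()) {
    # wait() takes milliseconds and returns early if the process exits.
    proc$wait(timeout = heartbeat_seconds * 1000)
    elapsed <- as.numeric(difftime(Sys.time(), started, units = "secs"))
    if (proc$is_alive()) {
      message(sprintf("still running: %s (%.0f s elapsed)", basename(command), elapsed))
      if (elapsed > timeout_seconds) {
        proc$kill()
        stop("Stage exceeded timeout and was terminated.", call. = FALSE)
      }
    }
  }
  proc$get_exit_status()
}
```

For example, `run_with_heartbeat(file.path(R.home("bin"), "Rscript"), c("-e", "Sys.sleep(3)"), heartbeat_seconds = 1)` should print a heartbeat roughly once per second before returning the exit status of the child process.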
The source-processing script is restart-friendly.
If a file's summary CSV already exists, that file is skipped and logged as:
skipped_existing
This allows rerunning the script after interruption without reprocessing every file, regardless of whether the source is a local archive or EcoSounds. When switching between archive mode and EcoSounds mode, skip detection now also uses an origin-agnostic recording key derived from the underlying timestamped audio filename, so recordings already processed from one source are skipped when encountered later from the other source. In EcoSounds mode, the stable recording-ID path is still used for the local EcoSounds output tree itself, so already processed recordings are also skipped on rerun before any fresh download is attempted.
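One way to picture the origin-agnostic skip check is to derive a key from the audio basename and look for a matching summary CSV anywhere under the output root. Both function names are hypothetical, and treating the basename without its .flac or .wav extension as the key is an assumption; the README only states that the key is derived from the timestamped audio file name.

```r
# Key shared by the archive and EcoSounds copies of the same recording (assumed form).
recording_key <- function(path) {
  sub("\\.(flac|wav)$", "", basename(path), ignore.case = TRUE)
}

# TRUE when a summary CSV for this recording already exists under output_root.
already_processed <- function(path, output_root) {
  existing <- list.files(
    output_root,
    pattern = "_birdnet_species_summary\\.csv$",
    recursive = TRUE
  )
  paste0(recording_key(path), "_birdnet_species_summary.csv") %in% basename(existing)
}

# Demonstration in a temporary directory.
demo_root <- file.path(tempdir(), "birdnet_skip_demo")
dir.create(file.path(demo_root, "GEL_A"), recursive = TRUE, showWarnings = FALSE)
invisible(file.create(file.path(
  demo_root, "GEL_A",
  "20251123T080000+0930_REC_-35.52235+139.10576_birdnet_species_summary.csv"
)))

already_processed("archive/site/20251123T080000+0930_REC_-35.52235+139.10576.flac", demo_root)
already_processed("20251124T080000+0930_REC_-35.52235+139.10576.wav", demo_root)
```

Because the check ignores the directory part of the source path, a recording processed from the archive would also be recognised when it is later offered by EcoSounds under the same file name.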
If the file name contains no parsable coordinates, the script uses the fallback latitude/longitude if configured. If neither filename coordinates nor fallback coordinates are available, processing stops for that file.
If the file name contains no parsable date, the script uses the fallback date if supplied. If not, processing stops for that file.
If the repository species CSV and BirdNET range-model output do not overlap, processing stops for that file with an explicit error.
If BirdNET returns no usable detections after cleaning, the script writes empty summary outputs and records the file as:
no_usable_detections
If BirdNET runs successfully but no predictions meet summary_confidence_threshold, the script writes empty summary outputs and records the file as:
no_summary_detections
If ffmpeg fails to convert a .flac, the file is recorded as:
error
and processing continues to the next source item.
If tar --zstd fails for a specific member, or an authenticated EcoSounds download fails for a specific recording, that file is recorded as:
error
and processing continues.
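The record-and-continue behaviour can be sketched with tryCatch(): each item is processed inside its own handler, so a failure becomes a manifest row instead of ending the run. The function names, the "ok" status, and the simulated failure are hypothetical; only the error status comes from this README.

```r
# Process every item, converting failures into manifest rows with status "error".
process_all <- function(items, process_one) {
  rows <- lapply(items, function(item) {
    result <- tryCatch(
      list(status = process_one(item), message = ""),
      error = function(e) list(status = "error", message = conditionMessage(e))
    )
    data.frame(
      item = item,
      status = result$status,
      message = result$message,
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

# Stand-in for the per-file workflow: fails on one item to simulate a conversion error.
demo_process_one <- function(item) {
  if (grepl("bad", item)) {
    stop("ffmpeg conversion failed")
  }
  "ok"
}

manifest <- process_all(c("a.flac", "bad.flac", "c.flac"), demo_process_one)
print(manifest)
```

The third item is still processed after the second one fails, which is the behaviour the status codes above describe.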
This workflow extracts one archive member at a time in archive mode, and downloads one recording at a time in EcoSounds mode. For compressed .tar.zst archives, extraction can still be slow because tar may need to scan or decompress a large portion of the archive to reach a later member.
That means a file can legitimately spend a long time in the extraction/download stage even when it is not frozen. The script emits heartbeat updates during that stage so you can tell the process is still alive.
BirdNET model startup and inference can also take a long time, especially on the first files of a run while Python/model dependencies initialize. The source-processing runner polls that stage from a child R process and keeps updating console and text progress during analysis.
If the process is interrupted, rerun:
Rscript scripts/process_download_pipeline.R
Files with existing summary outputs are skipped automatically.
- temporary extracted .flac and converted .wav files are deleted after each file is processed
- in EcoSounds mode, each downloaded recording is stored in a per-recording temporary workspace that is removed before the next download begins
- helper functions live in scripts/birdnet_helpers.R
- archive mode mirrors the archive subdirectory structure in the output folder when writing per-file CSV results
- EcoSounds mode writes outputs under stable recorder-based paths such as GEL_A/<canonical_file_name>_... so interrupted runs can resume cleanly and skip already processed recordings




