# Basic usage - process all planets
python3 comprehensive_exomoon_search.py
# Process with more workers (faster)
python3 comprehensive_exomoon_search.py --workers 8
# Process a subset (e.g., first 100 planets)
python3 comprehensive_exomoon_search.py --start 0 --end 100
# Resume from where you left off (automatically skips processed planets)
python3 comprehensive_exomoon_search.py
Command-line options:
--input FILE Input CSV file (default: ranked_transiting_planets.csv)
--output FILE Output CSV file (default: comprehensive_exomoon_results.csv)
--workers N Number of parallel workers (default: 4)
--start N Start index for processing (default: 0)
--end N End index (default: None = all)
--shard-id N Shard ID for parallel execution
--num-shards N Total number of shards
--cadence POLICY Cadence policy (short, long, short_then_any, any)
--cache-dir DIR Cache directory for preprocessed light curves
--cache Enable persistent light-curve caching (disabled by default)
To process in 8 parallel shards:
# Terminal 1
python3 comprehensive_exomoon_search.py --shard-id 0 --num-shards 8 &
# Terminal 2
python3 comprehensive_exomoon_search.py --shard-id 1 --num-shards 8 &
# ... etc for shards 2-7
# Or use the provided script:
bash run_all_shards.sh # Modify for 8 shards if needed
The script implements complementary methods over shared transit features:
- Skew Detection - Analyzes asymmetry between ingress/egress areas
- TTV Detection - Detects Transit Timing Variations
- Shoulder Detection - Looks for anomalies in transit shoulders
- Variability Detection - Compares variability in transit regions vs baseline
- Duration Periodicity - Tracks periodic transit-duration changes
✅ Fixed bugs - Proper duration validation, error handling
✅ Dual hypothesis scoring - Single-moon and multi-moon interpretations
✅ Statistical significance - FAP (False Alarm Probability) for all detections
✅ Data quality metrics - SNR, completeness, number of transits
✅ Combined scoring - Two weighted scores, plus legacy alias
✅ Robust error handling - Continues processing even if one method fails
The output CSV contains:
- planet_name - Planet identifier
- tic_id - TIC ID for light curve access
- P_expected - Expected orbital period
- Status - OK or Error
- Reason - Error message if failed
- num_transits_observed - Number of transits in light curve
- data_quality_score - Overall data quality (0-1)
Core methods include:
- {method}_P_refined - Refined orbital period
- {method}_T0 - Transit epoch
- {method}_dur - Transit duration
- {method}_power - Detection signal strength
- {method}_fap - False Alarm Probability
- {method}_amplitude - Signal amplitude
Additional dual-model outputs include:
- single_* fields and combined_single_moon_score
- multi_* fields and combined_multi_moon_score
- combined_single_moon_score - Single-moon weighted score (0-1)
- combined_multi_moon_score - Multi-moon weighted score (0-1)
- combined_exomoon_score - Backward-compatible alias to combined_single_moon_score
- single_skew_dist_score / skew_dist_score - consistency with the single-moon skew occupancy hypothesis (high = more consistent)
- single_skew_dist_inconsistency_score / skew_dist_inconsistency_score - mismatch strength for that same hypothesis (high = less consistent)
Single-moon score components (scan):
- Skew periodicity: 0.22
- Skew distribution consistency: 0.06 (single-moon GOF consistency)
- TTV periodicity: 0.38
- Shoulder: 0.12
- Variability: 0.12
- Duration periodicity: 0.11
- Data quality bonus: 0.05
Multi-moon score components (scan):
- Multi-peak skew structure: 0.26
- Multi-frequency TTV structure: 0.28
- Multi-skew distribution shape: 0.10
- Shoulder: 0.10
- Variability: 0.08
- Duration structure: 0.13
- Data quality bonus: 0.05
Transit-count reliability modulation:
- Both models apply a reliability factor to timing/periodicity-heavy terms:
transit_reliability = clip(log10(num_transits_observed + 1) / 2, 0, 1)
- This prevents low-transit outliers from saturating skew/TTV scores.
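The weighting and reliability modulation above can be sketched in a few lines. This is an illustrative reading, not the script's actual code: the function names, the component keys, and the choice of which terms count as "timing/periodicity-heavy" are all assumptions. The listed single-moon weights sum to 1.06, so the sketch clips the total to 1 to keep the score in the documented 0-1 range.

```python
import math

# Single-moon weights as listed above; the dictionary keys are hypothetical names.
SINGLE_MOON_WEIGHTS = {
    "skew_periodicity": 0.22,
    "skew_distribution": 0.06,
    "ttv_periodicity": 0.38,
    "shoulder": 0.12,
    "variability": 0.12,
    "duration_periodicity": 0.11,
    "data_quality": 0.05,
}

# ASSUMPTION: these are the terms treated as timing/periodicity-heavy.
RELIABILITY_SCALED = {"skew_periodicity", "ttv_periodicity", "duration_periodicity"}


def transit_reliability(num_transits_observed):
    """clip(log10(n + 1) / 2, 0, 1) for a non-negative transit count."""
    return min(1.0, max(0.0, math.log10(num_transits_observed + 1) / 2))


def combined_score(components, num_transits_observed, weights=SINGLE_MOON_WEIGHTS):
    """Weighted sum of per-method scores, each expected in [0, 1]."""
    reliability = transit_reliability(num_transits_observed)
    total = 0.0
    for name, weight in weights.items():
        # Missing components contribute nothing; out-of-range values are clipped.
        value = min(1.0, max(0.0, components.get(name, 0.0)))
        if name in RELIABILITY_SCALED:
            value *= reliability
        total += weight * value
    return min(1.0, total)
```

Under this formula the reliability factor reaches 0.5 at 9 transits and saturates at 1 from 99 transits onward, so a planet with only a handful of transits cannot earn full credit from its TTV or skew terms.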
Look for planets with:
- combined_single_moon_score > 0.5 or combined_multi_moon_score > 0.5
- ttv_fap < 0.01 (1% false alarm probability)
- skew_fap < 0.01
- num_transits_observed >= 10 (more transits = better)
- data_quality_score > 0.5
import pandas as pd
df = pd.read_csv('comprehensive_exomoon_results.csv')
# Top candidates per model
top_single = df.nlargest(20, 'combined_single_moon_score')
top_multi = df.nlargest(20, 'combined_multi_moon_score')
# High confidence (low FAP)
high_conf = df[(df['ttv_fap'] < 0.01) | (df['skew_fap'] < 0.01)]
high_conf = high_conf.nlargest(20, 'combined_single_moon_score')
# Good data quality
good_data = df[df['data_quality_score'] > 0.5]
good_data = good_data.nlargest(20, 'combined_multi_moon_score')
# Follow-up-oriented ranking with transit floor
followup = df[df['num_transits_observed'] >= 40].copy()
followup['single_followup_score'] = (
    followup['combined_single_moon_score'] *
    (followup['num_transits_observed'].clip(upper=100) / 100.0)
)
followup['multi_followup_score'] = (
    followup['combined_multi_moon_score'] *
    (followup['num_transits_observed'].clip(upper=100) / 100.0)
)
top_single_followup = followup.nlargest(20, 'single_followup_score')
top_multi_followup = followup.nlargest(20, 'multi_followup_score')
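The individual filters can also be combined into one pass that applies every criterion from the "Look for planets with" list at once. The sketch below assumes the strictest reading, in which a planet must pass all thresholds (either combined score above 0.5, both FAPs below 0.01, at least 10 transits, data quality above 0.5). The function name and keyword arguments are illustrative, not part of the script.

```python
import pandas as pd


def select_candidates(df, score_min=0.5, fap_max=0.01, min_transits=10, quality_min=0.5):
    """Return rows meeting every threshold, best single-moon score first."""
    strong_score = (
        (df["combined_single_moon_score"] > score_min)
        | (df["combined_multi_moon_score"] > score_min)
    )
    significant = (df["ttv_fap"] < fap_max) & (df["skew_fap"] < fap_max)
    enough_data = (
        (df["num_transits_observed"] >= min_transits)
        & (df["data_quality_score"] > quality_min)
    )
    # Comparisons against NaN evaluate to False, so rows with missing
    # values (for example failed planets) drop out automatically.
    selected = df[strong_score & significant & enough_data]
    return selected.sort_values("combined_single_moon_score", ascending=False)
```

Loosening `fap_max` or replacing the `&` between the two FAP tests with `|` gives a more permissive shortlist, closer to the `high_conf` filter shown earlier.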
- Use more workers if you have CPU cores available
python3 comprehensive_exomoon_search.py --workers 8
- Process in chunks to monitor progress
python3 comprehensive_exomoon_search.py --start 0 --end 100
python3 comprehensive_exomoon_search.py --start 100 --end 200
- Use sharding for very large datasets
# Process 8 shards in parallel
for i in {0..7}; do
  python3 comprehensive_exomoon_search.py --shard-id $i --num-shards 8 &
done
wait
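The commands above do not show how planets are divided among shards. One common scheme, assumed here purely for illustration and not confirmed as what the script does, is round-robin assignment by row index: shard k takes rows k, k + num_shards, k + 2 * num_shards, and so on. The helper name below is hypothetical.

```python
def shard_indices(num_items, shard_id, num_shards):
    """Row indices a given shard would process under round-robin assignment."""
    if num_shards <= 0 or not 0 <= shard_id < num_shards:
        raise ValueError("shard_id must satisfy 0 <= shard_id < num_shards")
    return list(range(shard_id, num_items, num_shards))
```

Whatever scheme is used, the property worth checking is that every row lands in exactly one shard, so that the merged output has neither gaps nor duplicates.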
No light curve data found:
- Some planets may not have TESS data
- Check that the TIC ID is correct
- Try a different mission (Kepler/K2) if available
Out of memory:
- Reduce number of workers
- Process in smaller chunks (--start/--end)
Slow processing:
- Increase workers (if CPU allows)
- Some planets take longer (more transits = more processing)
Segfault at exit:
- Rarely, native-library teardown may emit a segfault after all rows are already written.
- If the script printed completion and row counts look correct, outputs are usually usable.
- Sanity check with:
  - Status counts
  - expected number of rows
  - quick parse of top candidates
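The three sanity checks can be run together on the output CSV. The sketch below is one way to do it; the function name and return layout are illustrative, and `expected_rows` is whatever count you expect from your input file.

```python
import pandas as pd


def sanity_check(df, expected_rows, score_column="combined_single_moon_score", top_n=5):
    """Summarise Status values, verify the row count, and pull the top candidates."""
    status_counts = df["Status"].value_counts().to_dict()
    top = df.nlargest(top_n, score_column)[["planet_name", score_column]]
    return {
        "status_counts": status_counts,
        "row_count_ok": len(df) == expected_rows,
        "top_candidates": top,
    }
```

A typical call would be `sanity_check(pd.read_csv('comprehensive_exomoon_results.csv'), expected_rows=N)`. If the file parses, the row count matches, and the top candidates look plausible, a segfault during teardown most likely did not affect the results.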
The script automatically resumes - it skips planets already in the output file. To restart completely, delete or rename the output file.
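The resume behaviour can be pictured as a simple set difference between the input list and the names already written. The sketch below is an assumption about the mechanism, not the script's actual code: it keys on the `planet_name` column, and the function names are hypothetical.

```python
import csv
import os


def already_processed(output_path, key="planet_name"):
    """Names present in the output CSV; empty set if the file does not exist yet."""
    if not os.path.exists(output_path):
        return set()
    with open(output_path, newline="") as handle:
        return {row[key] for row in csv.DictReader(handle) if row.get(key)}


def pending_planets(planet_names, output_path):
    """Planets from the input list that still need processing, in input order."""
    done = already_processed(output_path)
    return [name for name in planet_names if name not in done]
```

This also shows why deleting or renaming the output file forces a full restart: with no file present, nothing counts as processed.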
- Review top candidates - Check highest scoring planets
- Visual inspection - Plot light curves for promising candidates
- Follow-up analysis - Use photodynamical modeling for best candidates
- Publication - Report detections with proper statistical significance
- comprehensive_exomoon_results.csv - Main results file
- comprehensive_exomoon_results_shard{N}.csv - Shard-specific results (if using sharding)
If you used sharding, combine results:
import pandas as pd
import glob
shard_files = sorted(glob.glob("comprehensive_exomoon_results_shard*.csv"))
dfs = [pd.read_csv(f) for f in shard_files]
combined = pd.concat(dfs, ignore_index=True)
combined = combined.drop_duplicates(subset=['planet_name'])
combined.to_csv("comprehensive_exomoon_results_combined.csv", index=False)
Or use the existing merge_shards.py script (modify for your file pattern).