ht55/Iwate-Bear-Incident

🐻 Bear Incident Predictive Modeling: Final Report

1. Project Motivation: Addressing the Escalating Risk

Escalating Wildlife Conflicts: The Gravity of Bear Attacks and the Motivation for Analysis

In recent years, human-wildlife conflicts involving bears (both Higuma, the brown bear, and Tsukinowaguma, the Asian black bear), including crop damage and attacks on people, have increased rapidly across Japan. A particularly concerning trend is the continued rise in sightings in urban and downtown areas, something that was previously almost unheard of.

Specifically, Iwate Prefecture has recorded one of the highest numbers of incidents nationwide in the 2025 fiscal year. As a native of Iwate, I was compelled by the tragic news from my hometown to move beyond daily media reports and carry out a deeper, data-driven analysis myself. The desire to understand the root causes is what started this project.

The Changing Perception of Tsukinowaguma

Tsukinowaguma (Ursus thibetanus japonicus), which predominantly inhabits Honshu, was traditionally believed to be "generally docile, afraid of humans, and rarely attacking people or livestock." During my childhood, summer camps and light hiking in Iwate were common, and I never heard of anyone encountering a bear. Fatal bear attacks were known only as extremely rare historical events, such as the Sankebetsu Higuma Incident (1915) and the Fukuoka University Mountaineering Club Incident (1970), both of which occurred in Hokkaido. That perception has since changed drastically.

An Unpredictable Threat and its Unique Brutality

This year alone, six fatal bear attacks have already occurred in Iwate Prefecture, showing that the threat is now immediate. Although these bears may not appear large, their thick hide, great physical strength, and large, sharp claws make them highly dangerous, and even a single swipe can cause life-threatening injury. Furthermore, bears are known to sometimes begin consuming their prey while it is still alive, which adds to their reputation for brutality (e.g., the Timothy Treadwell Incident).

Project Objectives and Structure

To help prevent further tragedies and inform concrete conservation and safety measures, this project analyzes past incident data in detail and attempts to predict future human-bear conflicts. The analysis centers on Machine Learning (ML) and eXplainable AI (XAI). This report presents model-driven predictions followed by visualizations of the results. By making the underlying mechanisms explicit, I aim to offer new insights into bear sighting patterns and the factors that contribute to human injury and fatality.

2. Project Overview

This project successfully developed a robust, statistically sound predictive model for monthly bear incident counts. I address the complex challenges posed by ecological data—specifically, the non-negative, over-dispersed nature of count data—to deliver reliable forecasts for wildlife management and public safety planning.

The project successfully transitioned from unstable linear time-series models (SARIMAX) to a specialized Negative Binomial (NB) Regression model, which proved to be the correct statistical framework for this data.
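For reference, the mean-variance relationship that makes a Negative Binomial model suitable for over-dispersed counts can be written as below. This assumes the NB2 parameterization with a log link (the statsmodels default); the report does not state which parameterization was used, and the symbols $Y_t$, $\mu_t$, $x_{tj}$, and $\beta_j$ are introduced here only for illustration.

$$
\log \mu_t = \beta_0 + \sum_j \beta_j x_{tj}, \qquad \mathrm{Var}(Y_t) = \mu_t + \alpha\,\mu_t^2
$$

When $\alpha \to 0$ the variance collapses to the mean, which is the Poisson case; a clearly positive $\alpha$ indicates over-dispersion.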

3. Key Findings & Conclusion

  1. Model Stabilization: The traditional time-series model (SARIMAX) failed due to numerical instability and the data's inherent over-dispersion (variance > mean). The final Negative Binomial model converged and confirmed this over-dispersion (significant $\alpha$ term), supporting the model choice.

[Image of Negative Binomial probability distribution]

  2. Dominant Drivers: The most statistically significant predictors of bear incidents are Dynamic Weather (Avg Temperature) and Temporal Dependence (Incident_Count_Lag_1). Static geospatial features (Elevation, Distance to Forest) were found to be largely redundant.

  3. Performance: The final model achieved a Mean Absolute Error (MAE) of 2.39 incidents per month, giving accurate forecasts for most of the seasonal cycle.

  4. Forecast: The model predicts a strong seasonal pattern, with activity peaking in the summer months (highest in July).
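The two quantities behind findings 1 and 3 (over-dispersion and MAE) can be checked with a few lines of NumPy. The sketch below is illustrative only: the counts are made-up numbers, not project data, and the "previous month" forecast is a hypothetical baseline rather than the report's model.

```python
import numpy as np

def dispersion_ratio(counts):
    """Sample variance divided by sample mean (roughly 1 for Poisson-like data)."""
    counts = np.asarray(counts, dtype=float)
    return counts.var(ddof=1) / counts.mean()

def mean_absolute_error(actual, predicted):
    """Average absolute difference between observed and predicted counts."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(actual - predicted)))

# Hypothetical monthly incident counts for one year (NOT project data).
monthly_counts = [0, 1, 0, 3, 8, 15, 22, 12, 30, 25, 4, 0]

# Hypothetical baseline: forecast each month with the previous month's count.
naive_forecast = [monthly_counts[0]] + monthly_counts[:-1]

ratio = dispersion_ratio(monthly_counts)
mae = mean_absolute_error(monthly_counts, naive_forecast)
print(f"variance/mean ratio: {ratio:.2f}")
print(f"baseline MAE: {mae:.2f}")
```

A ratio well above 1 is the informal signal of over-dispersion; the significance of the fitted $\alpha$ term is the formal counterpart reported above.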

4. Methodology

4.1. Feature Engineering

I integrated and standardized time-series and geospatial data to create a final set of 7 non-collinear features, essential for stable modeling:

  • Dynamic/Time-Series: Avg Temperature, Incident_Count_Lag_1, Month, Avg Precipitation, Precip_Lag_1.

  • Static/Geospatial: Elevation_m, Distance_to_Forest_m.
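A minimal sketch of how the lagged and standardized features listed above might be derived with pandas and scikit-learn. The input column names (`date`, `incident_count`, `avg_temperature`, `avg_precipitation`) and the function `build_features` are assumptions for illustration; the repository's actual schema and code may differ. The static geospatial features are omitted here.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

def build_features(monthly: pd.DataFrame) -> pd.DataFrame:
    """Add Month and one-month lag features, then standardize numeric columns."""
    out = monthly.sort_values("date").copy()
    out["Month"] = out["date"].dt.month
    # Lag by one row, which equals one month for a gap-free monthly series.
    out["Incident_Count_Lag_1"] = out["incident_count"].shift(1)
    out["Precip_Lag_1"] = out["avg_precipitation"].shift(1)
    # The first month has no lagged value, so it is dropped.
    out = out.dropna().reset_index(drop=True)
    numeric = ["avg_temperature", "avg_precipitation",
               "Incident_Count_Lag_1", "Precip_Lag_1"]
    out[numeric] = StandardScaler().fit_transform(out[numeric])
    return out

# Hypothetical six-month sample (NOT project data).
sample = pd.DataFrame({
    "date": pd.date_range("2024-01-01", periods=6, freq="MS"),
    "incident_count": [0, 1, 3, 9, 14, 6],
    "avg_temperature": [-2.0, -1.0, 3.0, 9.0, 15.0, 19.0],
    "avg_precipitation": [60.0, 55.0, 80.0, 95.0, 110.0, 130.0],
})
features = build_features(sample)
print(features.head())
```

For a real train/test split, the scaler would normally be fitted on the training rows only so that test-period statistics do not leak into the features.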

4.2. Modeling Pipeline

  • Initial Approach: SARIMAX (failed due to data distribution mismatch).

  • Final Solution: Negative Binomial (NB) Regression (Best Stable Fit for count data).

4.3. Future Work (Advanced Techniques)

To improve prediction accuracy for rare, high-magnitude outlier months (like September and October peaks), the following advanced techniques are recommended:

  • Zero-Inflated Negative Binomial (ZINB) Regression: To explicitly model the excess of zero counts and improve fit.

  • XGBoost / Tree-Based Models: For superior handling of non-linear relationships and extreme value prediction.
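Before moving to a ZINB model, one simple diagnostic is to compare the observed share of zero-count months with the share a fitted NB model would imply. The sketch below assumes the NB2 parameterization, for which $P(Y=0) = (1 + \alpha\mu)^{-1/\alpha}$; the function names, counts, fitted means, and `alpha` value are hypothetical and not taken from the project.

```python
import numpy as np

def nb2_zero_probability(mu, alpha):
    """Probability of a zero count under NB2 with mean mu and dispersion alpha."""
    mu = np.asarray(mu, dtype=float)
    return (1.0 + alpha * mu) ** (-1.0 / alpha)

def excess_zero_gap(counts, mu, alpha):
    """Observed zero fraction minus the average NB2-implied zero probability."""
    observed = float(np.mean(np.asarray(counts) == 0))
    expected = float(np.mean(nb2_zero_probability(mu, alpha)))
    return observed - expected

# Hypothetical counts and fitted means (NOT project data).
counts = [0, 0, 0, 0, 5, 12, 0, 20]
fitted_mu = [5.0] * len(counts)
gap = excess_zero_gap(counts, fitted_mu, alpha=1.0)
print(f"excess zero fraction: {gap:.3f}")
```

A gap close to zero suggests the plain NB model already accounts for the zeros; a clearly positive gap is the situation a zero-inflated model is designed for.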

5. Visualizations

The project includes geospatial analysis showing precise incident locations, injury levels, and feature distributions for roughly the past 10 years, from January 2016 to November 2025.

6. Usage and Dependencies

This project was built using a standard Python data science stack.

Technologies

  • Python 3

  • statsmodels: Used for the final Negative Binomial Regression model.

  • pandas / numpy: Data handling and transformation.

  • geopandas: Geospatial feature engineering and mapping.

  • scikit-learn: Data scaling (StandardScaler) and evaluation.
