zaydaanjahangir/GeoScope

GeoScope

GeoScope is a deep learning framework for image geolocalization that employs CNN transfer learning and a CLIP-inspired vision-language model to classify street view images into geographic regions.

The project combines two complementary approaches:

  • CNN-Based Geolocalization: GeoScope fine-tunes pretrained CNN architectures such as ResNet-18 and WideResNet via transfer learning to classify geotagged street view images into discrete geographic classes (e.g., regions or continents).

  • Lite StreetCLIP-Inspired Model: A lightweight adaptation of the CLIP model trained with contrastive learning on image–caption pairs, including synthetic captions derived from geographic labels. It is intended to support zero-shot or few-shot prediction and to improve generalization to unseen geographies.
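
The contrastive objective behind the second approach can be illustrated with a small, dependency-free sketch. It assumes a CLIP-style symmetric cross-entropy loss over a batch in which image `i` is paired with caption `i`. The caption template, the function names, and the temperature of 0.07 are assumptions made for illustration and are not taken from the GeoScope code. Embeddings are plain Python lists here; a real model would produce them with image and text encoders.

```python
import math


def make_caption(region):
    """Build a synthetic caption from a geographic label (hypothetical template)."""
    return f"A street view photo taken in {region}."


def _normalize(vec):
    # Scale a vector to unit length so dot products become cosine similarities.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def _cross_entropy(logits, target):
    # Numerically stable -log softmax(logits)[target].
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]


def clip_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss: image i should match caption i and vice versa."""
    imgs = [_normalize(v) for v in image_embs]
    txts = [_normalize(v) for v in text_embs]
    n = len(imgs)
    # logits[i][j] = cosine similarity of image i and caption j, scaled by temperature.
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in txts]
        for img in imgs
    ]
    image_to_text = sum(_cross_entropy(logits[i], i) for i in range(n)) / n
    text_to_image = sum(
        _cross_entropy([logits[r][j] for r in range(n)], j) for j in range(n)
    ) / n
    return (image_to_text + text_to_image) / 2


# Toy batch: embeddings that line up with their captions yield a lower loss.
aligned = clip_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = clip_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(make_caption("Europe"), aligned < shuffled)
```

One caveat with label-derived captions is that several images in a batch may share the same region, and therefore the same caption, which this sketch would treat as negatives for one another.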

Overview

GeoScope addresses image geolocalization, a problem with applications in photo tagging, search, and open-source intelligence, by mapping images to predefined geographic regions. By combining transfer learning with vision-language models, it aims to mitigate data scarcity and improve robustness to distribution shift.

Key Features

  • Dual-Model Approach: Combines CNN-based classification with CLIP-inspired zero-shot learning.
  • Transfer Learning: Fine-tunes pretrained CNNs on geotagged datasets for robust feature extraction.
  • Zero-Shot Generalization: Leverages pretrained embeddings to predict geographic regions for unseen data.
  • Scalable & Diverse: Designed to work across environments ranging from urban to rural scenes.
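
As a rough illustration of zero-shot generalization, a CLIP-style model can label an image by comparing its embedding with the embedding of one caption per candidate region and picking the closest match. The sketch below assumes cosine similarity as the score and uses hand-written toy vectors in place of real encoder outputs. The function names and the example regions are hypothetical and not part of GeoScope.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_predict(image_emb, region_text_embs):
    """Return the best-matching region and the per-region similarity scores."""
    scores = {
        region: cosine(image_emb, text_emb)
        for region, text_emb in region_text_embs.items()
    }
    return max(scores, key=scores.get), scores


# Toy caption embeddings, one per candidate region (stand-ins for a text encoder).
region_text_embs = {
    "Europe": [0.9, 0.1, 0.0],
    "Asia": [0.1, 0.9, 0.0],
    "Africa": [0.0, 0.1, 0.9],
}

# Toy image embedding (stand-in for an image encoder output).
region, scores = zero_shot_predict([0.2, 0.8, 0.1], region_text_embs)
print(region)
```

Because the candidate regions are supplied only as captions at inference time, new regions can be added by embedding new captions, without retraining a classification head.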

References

  • He, K., Zhang, X., Ren, S., & Sun, J. (2015). Deep residual learning for image recognition.
  • Radford, A., et al. (2021). Learning transferable visual models from natural language supervision.
  • Haas, L., Alberti, S., & Skreta, M. (2023). Learning generalized zero-shot learners for open-domain image geolocalization.
  • Weyand, T., Kostrikov, I., & Philbin, J. (2016). PlaNet: Photo geolocation with convolutional neural networks.
