This project presents a critical, evidence-based analysis of cutting-edge multimodal AI systems, including CLIP, Flamingo, BLIP-2, LLaVA, and MiniGPT-4, evaluating their architectures, technical evolution, societal implications, and ethical challenges.
Additionally, the project includes an AI system design analysis of an Autonomous Search-and-Rescue Drone, applying intelligent agent theory and the PEAS framework to model autonomous decision-making in disaster environments.
- CLIP (Contrastive Language-Image Pretraining)
- Flamingo (Few-shot Visual-Language Model)
- BLIP-2 (Frozen Vision Encoder + LLM Bridge Architecture)
- LLaVA (Visual Instruction Tuning)
- MiniGPT-4 (Lightweight Multimodal Alignment)
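The contrastive objective that underpins CLIP can be illustrated with a minimal sketch. This is not CLIP's actual implementation, just the symmetric cross-entropy over image-text similarity logits that the paper describes, written with NumPy and illustrative names:

```python
import numpy as np

def clip_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE-style loss over a batch of paired embeddings.

    image_emb, text_emb: (batch, dim) arrays; row i of each is a matched pair.
    """
    # L2-normalize so dot products become cosine similarities
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)

    # Pairwise similarity logits, scaled by temperature
    logits = image_emb @ text_emb.T / temperature

    # The correct pairings lie on the diagonal
    n = logits.shape[0]
    labels = np.arange(n)

    def cross_entropy(l, y):
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[np.arange(len(y)), y].mean()

    # Average the image-to-text and text-to-image directions
    return 0.5 * (cross_entropy(logits, labels) + cross_entropy(logits.T, labels))
```

Matched pairs pull together and mismatched pairs push apart; a lower temperature sharpens the distinction.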
- Architecture design patterns in multimodal AI
- Modular alignment of frozen LLMs and vision encoders
- Instruction tuning and synthetic data trends
- Ethical risks (hallucination, bias, privacy concerns)
- Computational and scalability trade-offs
- Open research gaps in visual reasoning and safety
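The "frozen vision encoder plus trainable bridge" pattern noted above (used, in far more elaborate form, by BLIP-2 and related models) can be sketched schematically. All class names, shapes, and the linear bridge are illustrative assumptions, not any specific model's API:

```python
import numpy as np

class FrozenBridgeModel:
    """Toy sketch of modular alignment: a frozen vision encoder and a frozen
    LLM embedding space, connected by a small trainable projection."""

    def __init__(self, vision_dim=32, llm_dim=64, seed=0):
        rng = np.random.default_rng(seed)
        # Frozen component: weights fixed, never updated during alignment
        self.vision_encoder = rng.normal(size=(vision_dim, vision_dim))
        # Only this projection ("bridge") would be trained
        self.bridge = rng.normal(size=(vision_dim, llm_dim)) * 0.01

    def encode_image(self, pixels):
        # pixels: (num_patches, vision_dim) -> frozen visual features
        return np.tanh(pixels @ self.vision_encoder)

    def visual_tokens_for_llm(self, pixels):
        # Project frozen features into the LLM's embedding space; these act
        # like soft prompt tokens prepended to the text embeddings.
        return self.encode_image(pixels) @ self.bridge
```

Because only the bridge is trained, alignment is cheap relative to end-to-end training, which is the computational trade-off the report discusses.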
- Full Report: AI_RESEARCH_PAPERS.pdf
This section analyzes an autonomous drone operating in disaster-affected, GPS-denied environments.
- PEAS Framework (Performance, Environment, Actuators, Sensors)
- Six AI Environment Classifications:
- Partially Observable
- Stochastic
- Sequential
- Dynamic
- Continuous
- Multi-Agent
- Hybrid Model-Based + Learning Agent design
- Planning under uncertainty (POMDP-style reasoning)
- Sensor fusion (RGB + Thermal integration)
- Human-in-the-loop fallback mechanisms
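As a worked illustration, the PEAS description of the rescue drone can be captured as a simple structure. The field values below are illustrative assumptions consistent with the agent described above, not a definitive specification:

```python
from dataclasses import dataclass

@dataclass
class PEAS:
    """Performance, Environment, Actuators, Sensors description of an agent."""
    performance: list
    environment: list
    actuators: list
    sensors: list

rescue_drone = PEAS(
    performance=["survivors located", "area coverage", "battery safety margin",
                 "time to first detection"],
    environment=["collapsed structures", "GPS-denied zones",
                 "smoke and low visibility", "other drones and ground teams"],
    actuators=["rotors", "camera gimbal", "payload-drop mechanism",
               "radio beacon"],
    sensors=["RGB camera", "thermal camera", "IMU",
             "LiDAR/ultrasonic rangefinder"],
)
```

Writing the PEAS tuple down this explicitly makes the six environment classifications easier to justify: for example, smoke and rubble in the environment list directly motivate the "partially observable" label.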
Designing a robust AI agent capable of safe and reliable autonomous decision-making in:
- Partially observable environments
- Unpredictable disaster conditions
- Real-time dynamic scenarios
- Multi-agent rescue coordination settings
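One concrete piece of the design above, RGB + thermal sensor fusion, can be sketched as a simple late-fusion rule: combine per-pixel detection confidences from the two modalities, weighting the thermal channel more heavily as visibility drops. The weighting scheme here is an illustrative assumption, not the report's specific method:

```python
import numpy as np

def fuse_detections(rgb_conf, thermal_conf, visibility=1.0):
    """Late fusion of per-pixel detection confidence maps.

    rgb_conf, thermal_conf: arrays in [0, 1] of equal shape.
    visibility: 1.0 = clear air; as it drops toward 0 (smoke, night),
    the thermal channel dominates the fused estimate.
    """
    w_rgb = np.clip(visibility, 0.0, 1.0)
    w_thermal = 2.0 - w_rgb  # thermal always contributes at least as much
    return (w_rgb * rgb_conf + w_thermal * thermal_conf) / (w_rgb + w_thermal)
```

At full visibility the two channels are averaged; at zero visibility the fused map reduces to the thermal confidence alone, which matches the intuition that thermal imaging is the reliable signal in smoke-filled disaster zones.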
- AI architecture analysis
- Critical evaluation of academic research
- Ethical AI reasoning and governance awareness
- Intelligent agent modeling
- AI environment classification
- Autonomous system design under uncertainty
- Research synthesis and comparative analysis
All academic sources are cited in APA format within the full report.
Primary sources include recent publications from arXiv, ICML, NeurIPS, and leading AI research institutions.
Fathima Safva Ovinakath Kammukkakath
BSc Computer Science (Level 5)
University of West London