seigneurvador/DataCampProject

Studying the effect of French socio-economic factors on high school education

This is the Codabench challenge of Group 36

Team members (GitHub accounts)

  • seigneurvador
  • xiaomaju11
  • Malcolm-ZHANG
  • taiwei-wu

Links to the original data are given below. The preprocessing script is in the Origin_and_merge_data folder, which also contains the original .csv files.

Context

The analysis of educational outcomes in relation to socio-economic factors is essential for addressing inequalities in the education system. By predicting the rate of honors based on social background and local educational levels, we can identify patterns of segregation and highlight schools that succeed in providing quality education regardless of students' socio-economic status. This challenge not only contributes to academic research but also informs policy decisions aimed at promoting equity and excellence in education across different regions.

Objectives

  • Understand how socio-economic factors influence high school outcomes in France.
  • Identify schools that reduce inequalities effectively.
  • Build a predictive model for the rate of honors (Taux de mentions - Toutes séries) based on the diploma levels of the local population, the IPS (the school's social position index), and median income.

Visualizations

1. Distribution Analysis

Bivariate & Univariate Distributions

[Figures: bivariate distribution; univariate distribution]

2. High School Rate of Honors (Target Variable)

Rate of Honors - All Series

[Figures: rate of honors in France; rate of honors in Île-de-France]

3. Social Position Index (IPS)

IPS Distribution

[Figures: IPS in France; IPS in Île-de-France]

4. Bac+5 or Higher Rate

Share of the Population with a Bac+5 Diploma or Higher

[Figures: Bac+5 rate in France; Bac+5 rate in Île-de-France]

5. Median Income

Median Income Distribution

[Figures: median income in France; median income in Île-de-France]


Data description

Our dataset was created by merging multiple open data sources:

🧑‍🎓 Population diploma level (2022)

  • Provider: INSEE
  • Description: Distribution of the population by highest diploma level at the municipal scale.
  • Purpose: Highlight unequal access to educational capital, strongly correlated with social and territorial determinants.

Download: base-cc-diplomes-formation-2022.csv

🏫 High school Social Position Index (IPS) — 2016–2022

  • Provider: data.gouv.fr
  • Description: Composite indicator reflecting students’ social background at the high school level.
  • Purpose: Objectify school segregation and challenge the myth of equal opportunity.

Download: fr-en-ips_lycees.csv

📈 High school value added indicator

  • Provider: data.gouv.fr
  • Description: Measures school performance while accounting for students’ social and academic profiles.
  • Purpose: Move beyond raw rankings and recognize institutions that actively reduce inequalities.

Download: fr-en-indicateurs-de-resultat-des-lycees-gt_v2.csv

🌍 Geographic coordinates of French high schools

  • Provider: Ministry of Higher Education and Research
  • Description: National geographic reference for French high schools (coordinates and labels).

Download: fr-en-adresse-et-geolocalisation-etablissements-premier-et-second-degre.csv

🏘️ Municipal reference table — INSEE codes (2022)

  • Provider: INSEE
  • Description: Official list of French municipalities with INSEE codes (2022 reference).

Download: commune_2022.csv

🏘️ Median income table for each city (2021)

  • Provider: INSEE
  • Description: Median income for each city in France.

Download: revenu-des-francais-a-la-commune-1765372688826.csv
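
To illustrate how sources like these can be combined, here is a minimal sketch of a left join on the municipal INSEE code, using only the Python standard library. It is not the script from Origin_and_merge_data: the column names (`school_id`, `insee_code`, `ips`, `median_income`) and the toy rows are hypothetical stand-ins, and the real files use their own headers.

```python
import csv
import io

# Toy stand-ins for two source files. Column names and values are
# hypothetical and only serve to demonstrate the join.
SCHOOLS_CSV = """school_id,insee_code,ips
A001,75101,130.5
B002,13055,98.2
C003,99999,110.0
"""

INCOME_CSV = """insee_code,median_income
75101,32000
13055,21000
"""


def left_join_on_insee(schools_text, income_text):
    """Attach the municipal median income to each school row.

    INSEE codes are kept as strings so that leading zeros and the
    Corsican prefixes 2A/2B are preserved.
    """
    income_by_code = {
        row["insee_code"]: float(row["median_income"])
        for row in csv.DictReader(io.StringIO(income_text))
    }
    merged = []
    for row in csv.DictReader(io.StringIO(schools_text)):
        merged.append({
            "school_id": row["school_id"],
            "insee_code": row["insee_code"],
            "ips": float(row["ips"]),
            # None marks a school whose municipality has no income record.
            "median_income": income_by_code.get(row["insee_code"]),
        })
    return merged


merged_rows = left_join_on_insee(SCHOOLS_CSV, INCOME_CSV)
for row in merged_rows:
    print(row)
```

A left join keeps every school even when a municipality is missing from the income table, which makes unmatched rows visible instead of silently dropping them.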


Task

The goal of this data challenge is to build a model that predicts the Taux de mentions - Toutes séries (rate of honors, all series) of French high schools from the social background of their students (IPS), the educational level of the population in their municipality, and the municipality's median income.
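
As a point of reference, the simplest possible model for this task is a one-feature least-squares line that predicts the rate of honors from the IPS alone. The sketch below is not the challenge's official baseline; the function name `fit_line` and the toy numbers are made up for illustration, and a real submission would use all available features.

```python
def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy, exactly linear data (NOT taken from the challenge dataset).
ips_values = [80.0, 100.0, 120.0, 140.0]
honors_rates = [30.0, 40.0, 50.0, 60.0]

slope, intercept = fit_line(ips_values, honors_rates)
predictions = [slope * v + intercept for v in ips_values]
print(slope, intercept)
```

On real data the relationship will not be exactly linear, so the gap between such a line and the observed rates gives a first sense of how much signal the remaining features need to explain.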


Metric

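This is a regression task: the target is a continuous variable. Submissions are scored with the Mean Squared Error (MSE), where $n$ is the number of high schools evaluated, $y^{(i)}$ is the observed rate of honors for school $i$, and $\hat y^{(i)}$ is the predicted rate. Lower is better: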

$$\displaystyle\text{MSE} = \frac{1}{n}\sum_{i=1}^{n}(y^{(i)}-\hat y^{(i)})^2$$
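
The formula can be checked by hand with a few lines of Python. This is a minimal sketch of the metric using only built-ins; the function name and the toy values are illustrative, and the challenge's scoring program may be implemented differently.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy values: squared errors are 4, 4 and 9, so the MSE is 17 / 3.
y_true = [10.0, 20.0, 30.0]
y_pred = [12.0, 18.0, 33.0]
mse = mean_squared_error(y_true, y_pred)
print(mse)
```

Because the errors are squared, a single large miss weighs more than several small ones, so the metric rewards models that avoid badly wrong predictions for individual schools.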

