1. Import the Dataset & Explore Basic Info

    import pandas as pd

    df = pd.read_csv('titanic.csv')  # Adjust filename if different
    df.info()  # prints its own summary and returns None, so no print() needed
    print(df.describe())
    print(df.isnull().sum())

2. Handle Missing Values

Age: Use mean or median.
Cabin: Drop the column or fill missing values with "Unknown".
Embarked: Fill with mode (most common port).
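
To make the trade-offs in the list above concrete, here is a minimal sketch on a made-up four-row frame (the values are illustrative, not taken from the Titanic data). It compares mean and median filling for Age and shows the "drop" option for Cabin. The names `toy`, `missing_share`, `age_median`, `age_mean` and `without_cabin` are hypothetical.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the real data; all values are made up.
toy = pd.DataFrame({
    "Age": [22.0, np.nan, 30.0, 80.0],
    "Cabin": [np.nan, "C85", np.nan, np.nan],
})

# Share of missing values per column, useful when deciding fill vs. drop.
missing_share = toy.isnull().mean()

# The median (30.0) is less affected by the extreme age of 80 than the mean (44.0).
age_median = toy["Age"].fillna(toy["Age"].median())
age_mean = toy["Age"].fillna(toy["Age"].mean())

# Alternative to labelling Cabin "Unknown": drop the column entirely.
without_cabin = toy.drop(columns=["Cabin"])
```

When a column is mostly empty, as Cabin is in this toy frame (75% missing), dropping it is often the simpler choice. The median is usually the safer fill for a skewed numeric column.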

    # Assign the result back; recent pandas versions warn about
    # inplace=True when it is called on a column selection.
    df['Age'] = df['Age'].fillna(df['Age'].median())
    df['Embarked'] = df['Embarked'].fillna(df['Embarked'].mode()[0])
    df['Cabin'] = df['Cabin'].fillna('Unknown')

3. Encode Categorical Features

    df['Sex'] = df['Sex'].map({'male': 0, 'female': 1})
    df = pd.get_dummies(df, columns=['Embarked'], drop_first=True)

4. Normalize/Standardize Numerical Features

Use StandardScaler or MinMaxScaler:

    from sklearn.preprocessing import StandardScaler

    scaler = StandardScaler()
    df[['Age', 'Fare']] = scaler.fit_transform(df[['Age', 'Fare']])

5. Visualize and Remove Outliers

    import seaborn as sns
    import matplotlib.pyplot as plt

    sns.boxplot(x=df['Fare'])
    plt.show()

    df = df[df['Fare'] < df['Fare'].quantile(0.99)]
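
To show what the 99th-percentile filter above actually removes, here is a small sketch on a made-up series of 100 fares. It also includes an IQR-based fence as an alternative rule. The IQR rule is an assumption added for comparison and is not part of the steps above; it matches the default whisker length (1.5 × IQR) of a seaborn boxplot. The names `fare`, `cutoff`, `kept_quantile`, `upper_fence` and `kept_iqr` are hypothetical.

```python
import pandas as pd

# Made-up fares: 99 ordinary values (1..99) plus one extreme value.
fare = pd.Series([float(v) for v in range(1, 100)] + [512.0], name="Fare")

# Rule used in the step above: keep rows strictly below the 99th percentile.
cutoff = fare.quantile(0.99)
kept_quantile = fare[fare < cutoff]

# Alternative (an assumption, not from the original steps): upper fence at
# Q3 + 1.5 * IQR, the same distance a default boxplot uses for its whiskers.
q1, q3 = fare.quantile(0.25), fare.quantile(0.75)
upper_fence = q3 + 1.5 * (q3 - q1)
kept_iqr = fare[fare <= upper_fence]

print(len(fare), len(kept_quantile), len(kept_iqr))
```

On this toy series both rules drop only the extreme fare of 512. On real data they can disagree: the quantile rule always removes roughly the top 1% of rows, while the IQR fence removes only values that are far from the bulk of the distribution.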