Hey, I have a problem with OrdinalEncoder. I'm using it to encode the Loan Prediction - Analytics Vidhya dataset from Kaggle, but two features (Dependents and Education) come out as -1 for all 614 rows. Can you please tell me what I did wrong?

What I have tried:

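import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
# The mapping= and cols= arguments below match the category_encoders package
from category_encoders import OneHotEncoder, OrdinalEncoder

object_cols = [col for col in X.columns if X[col].dtype == "object"]
num_cols = [col for col in X.columns if col not in object_cols]

# Fill numeric gaps only; taking the mean over string columns fails in recent pandas
X[num_cols] = X[num_cols].fillna(X[num_cols].mean())
X_test[num_cols] = X_test[num_cols].fillna(X_test[num_cols].mean())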

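# Impute missing categories with the most frequent value seen in the training data
object_imputer = SimpleImputer(strategy="most_frequent")
# Pass columns and index so the assignment aligns even if X does not have a default RangeIndex
X[object_cols] = pd.DataFrame(object_imputer.fit_transform(X[object_cols]), columns=object_cols, index=X.index)
X_test[object_cols] = pd.DataFrame(object_imputer.transform(X_test[object_cols]), columns=object_cols, index=X_test.index)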

mapping = [
    {"col": "Dependents", "mapping": {
        "0": 0, "1": 1, "2": 2, "3+": 3
    }},
    {"col": "Education", "mapping": {
        "Not Graduate": 0, "Graduate": 1
    }}
]
onehot_cols = ["Gender", "Property_Area"]
ord_cols = [col for col in object_cols if col not in onehot_cols]
for col in ord_cols:
    mapping.append({"col": col, "mapping": {
        "No": 0, "Yes": 1
    }})

encoder = Pipeline(steps=[
    ("ordinal", OrdinalEncoder(mapping=mapping)),
    ("onehot", OneHotEncoder(cols=onehot_cols))
])

X = pd.DataFrame(encoder.fit_transform(X))
X_test = pd.DataFrame(encoder.transform(X_test))
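
One possible explanation, sketched in plain Python so it runs without the dataset: ord_cols still contains Dependents and Education, so the loop appends a second {"No": 0, "Yes": 1} entry for each of them. The sketch assumes the encoder applies the mapping entries in list order and encodes any value missing from a mapping as -1 (the documented default handle_unknown="value" behaviour in category_encoders). The helpers apply_mappings and build_mapping and the object_cols list are hypothetical stand-ins for illustration, not part of any library or of the original code.

```python
def apply_mappings(values, col, mapping):
    """Hypothetical stand-in for the encoder: apply every entry for `col`
    in list order, encoding values missing from an entry as -1."""
    out = list(values)
    for entry in mapping:
        if entry["col"] == col:
            out = [entry["mapping"].get(v, -1) for v in out]
    return out


def build_mapping(object_cols, onehot_cols, skip_already_mapped):
    """Rebuild the mapping list from the question, optionally skipping
    columns that already have a hand-written mapping."""
    mapping = [
        {"col": "Dependents", "mapping": {"0": 0, "1": 1, "2": 2, "3+": 3}},
        {"col": "Education", "mapping": {"Not Graduate": 0, "Graduate": 1}},
    ]
    already_mapped = {entry["col"] for entry in mapping}
    for col in object_cols:
        if col in onehot_cols:
            continue
        if skip_already_mapped and col in already_mapped:
            continue
        mapping.append({"col": col, "mapping": {"No": 0, "Yes": 1}})
    return mapping


# Assumed object columns of X (Loan_ID and Loan_Status already dropped)
object_cols = ["Gender", "Married", "Dependents", "Education",
               "Self_Employed", "Property_Area"]
onehot_cols = ["Gender", "Property_Area"]
sample = ["0", "1", "3+"]

as_posted = build_mapping(object_cols, onehot_cols, skip_already_mapped=False)
adjusted = build_mapping(object_cols, onehot_cols, skip_already_mapped=True)

# Columns that received more than one mapping entry in the posted loop
duplicated = sorted(
    col for col in {e["col"] for e in as_posted}
    if sum(e["col"] == col for e in as_posted) > 1
)
print(duplicated)                                       # ['Dependents', 'Education']
print(apply_mappings(sample, "Dependents", as_posted))  # [-1, -1, -1]
print(apply_mappings(sample, "Dependents", adjusted))   # [0, 1, 3]
```

If that is what is happening, excluding the columns that already have a mapping when building ord_cols should leave only the genuine Yes/No columns (Married and Self_Employed under the assumption above) for the loop.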

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
