Did you compare your CBM code with CatBoost's categorical encoding? https://catboost.ai/docs/concepts/algorithm-main-stages_cat-to-numberic.html By the way, the CatBoost people say that the category_encoders imitation of their encoder is mistaken: https://github.com/scikit-learn-contrib/category_encoders