The code for using PyCaret is quite simple:
import pandas as pd
from pycaret.classification import setup, compare_models

df = pd.read_csv(TRAIN_CSV_FILE)
setup(data=df, target="TARGET", session_id=1023)
compare_models(verbose=False)
But the first run reported an error:
ValueError: array is too big; `arr.size * arr.dtype.itemsize` is larger than the maximum possible size.
This is because my data has a custom_id column whose values are long digit strings like “11100000033445566”. It seems PyCaret recognizes it as an integer type.
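Here is a minimal sketch of checking for and dropping such a column before calling setup. The tiny inline CSV and the feature column are made up for illustration; only custom_id and TARGET come from my data. Note that pandas itself infers an integer dtype for an all-digit column when reading the CSV.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real training CSV (values are made up)
csv_text = (
    "custom_id,feature,TARGET\n"
    "11100000033445566,0.5,1\n"
    "11100000033445577,1.5,0\n"
)

df = pd.read_csv(io.StringIO(csv_text))

# Inspect the inferred dtype: an all-digit ID column is read as an integer
print(df["custom_id"].dtype)

# Drop the identifier column so it is never treated as a feature
df = df.drop(columns=["custom_id"])
print(list(df.columns))
```

PyCaret's setup also accepts an ignore_features argument (a list of column names), which might achieve the same thing without modifying the DataFrame. Whether it avoids this particular error is not something I tested here.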
After dropping that column, I got the model comparison results:
It looks like GradientBoostingClassifier works as well as LightGBM.