Using XGBoost to predict large sparse data

To make predictions with XGBoost, I wrote code like this:

But it reported an error:

It looks like SciPy's csr_matrix is not accepted by XGBoost here. Maybe I need to convert the sparse data to dense:

But that reported an error too:

The 'test' data is so big that it can't even be converted to dense data!
So XGBoost won't take the sparse format directly, and my sparse data can't be converted to dense. What should I do then?
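
To get a feel for why the dense conversion fails, here is a rough back-of-the-envelope sketch. The matrix shape and number of non-zeros are made-up figures for illustration, not the size of my real 'test' data, and the helper names `dense_nbytes` and `sparse_nbytes` are hypothetical. It compares the memory a CSR matrix actually uses with what the same matrix would need as a dense array:

```python
import numpy as np
import scipy.sparse as sp


def dense_nbytes(matrix):
    """Bytes needed to hold `matrix` as a dense array (rows * cols * itemsize)."""
    rows, cols = matrix.shape
    return rows * cols * matrix.dtype.itemsize


def sparse_nbytes(matrix):
    """Bytes used by the three arrays that back a CSR matrix."""
    return matrix.data.nbytes + matrix.indices.nbytes + matrix.indptr.nbytes


# Hypothetical shape: 1,000,000 rows x 100,000 columns, about 50,000 non-zeros.
rng = np.random.default_rng(0)
n_rows, n_cols, nnz = 1_000_000, 100_000, 50_000
row_idx = rng.integers(0, n_rows, size=nnz)
col_idx = rng.integers(0, n_cols, size=nnz)
values = np.ones(nnz, dtype=np.float32)

# Build the sparse matrix directly from (value, (row, col)) triplets.
X = sp.csr_matrix((values, (row_idx, col_idx)), shape=(n_rows, n_cols))

print(f"sparse: {sparse_nbytes(X) / 1e6:.1f} MB")
print(f"dense : {dense_nbytes(X) / 1e9:.1f} GB")
```

With numbers like these, the sparse matrix fits in a few megabytes, while the dense version would need hundreds of gigabytes, so a call such as `X.toarray()` has no chance on an ordinary machine.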

Actually, the solution is incredibly simple: just use XGBoost's DMatrix!
