Using XGBoost to predict large sparse data

For using XGBoost to predict, I wrote code like this:

But it reported error:

Seems csr_matrix in SciPy is not supported by XGBoost. Maybe I need to transfer sparse data to dense:

But it still reported:

The ‘test’ data is too big so it cann’t even be transfered to dense data!
XGBoost doesn’t support the sparse format, and my sparse data cannot be changed to dense. Then what should I do?

Actually, the solution is incredible simple — just use XGBoost’s DMatrix!

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.