Classification in Spark 2.0: “Input validation failed” and other wondrous tales

Christos - Iraklis Tsatsoulis | Data Science, Spark

Spark 2.0 was released last July but, despite its numerous improvements and new features, several annoyances remain that can cause headaches, especially in the Spark machine learning APIs. Today we’ll have a look at some of them, inspired by a recent answer of mine to a Stack Overflow question (the question was about Spark 1.6 but, as …