IDM: Welcome

As machine learning and data mining methods are applied to more and more real-world problems, the need to understand why those methods lead to certain results has become increasingly obvious.

The reasons for this need are diverse. Users might need to understand because money is on the line, as investment or profit, or even lives, e.g. in healthcare or disaster preparedness. Or they might want to understand because computational methods can then act as hypothesis generators: observing the results of a pattern mining operation or a produced clustering triggers new insights and informs new research directions, the final step in the “knowledge discovery” process [1].

There are also legal frameworks that require interpretability and explainability of algorithmic decisions, e.g. the EU General Data Protection Regulation or the US Fair Credit Reporting Act.

In machine learning (ML), most of the focus has been on using symbolic representations of non-symbolic models to help with explainability: “Why does the model give certain predictions?”

Data mining, specifically pattern mining and clustering, is already largely symbolic but lacks the label-provided supervision that can be leveraged in ML. As such, there are two questions one needs to answer to understand data mining results: 1) “How did the algorithm arrive at the result?”, and 2) “What does this result mean w.r.t. the underlying data?” I.e., instead of explaining a result, one needs to understand a process and/or interpret a result.

The best way of answering these questions is to offer supervision by involving the user, i.e. by making data mining interactive. This involvement gives the user a better understanding of how successive mining steps build on each other, which helps with interpreting the final result and explaining how the process arrived at it, and therefore builds trust.

Venues for publishing work on understanding ML have proliferated under the “explainable AI” (XAI) label, whereas this remains an underexplored topic for data mining. Since the end of the “Interactive Data Exploration and Analytics” (IDEA) workshop series at KDD (2013–2017), opportunities for presenting ongoing research have been rare. We therefore intend to bring the topic back to ECML PKDD, host of the 2012 “Instant Interactive Data Mining” workshop, to take the temperature of the community on that topic, assemble work in progress on it, and in general bring interested researchers together for exchange and discussion.

References

[1] U. M. Fayyad, G. Piatetsky-Shapiro, and P. Smyth. From data mining to knowledge discovery in databases. AI Magazine, 17(3):37–54, 1996.