2022
- Label-Descriptive Patterns and Their Application to Characterizing Classification Errors. In: Proceedings of the International Conference on Machine Learning (ICML), PMLR, 2022 (21.9% acceptance rate).
- Heteroscedastic Noise Based Causal Inference. In: Proceedings of the International Conference on Machine Learning (ICML), PMLR, 2022 (21.9% acceptance rate).
- Mining Interpretable Data-to-Sequence Generators. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), AAAI, 2022 (15.0% acceptance rate).
- Differentially Describing Groups of Graphs. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), AAAI, 2022 (oral presentation 5.5% acceptance rate; overall 15.0%).
- Naming the most anomalous cluster in Hilbert Space for structures with attribute information. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), AAAI, 2022 (15.0% acceptance rate).
- Omen: Discovering Sequential Patterns with Reliable Prediction Delays. Knowledge and Information Systems, Springer (IF 2.822)
2021
- Differentiable Pattern Set Mining. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'21), ACM, 2021 (15.4% acceptance rate).
- Graph Similarity Description: How Are These Graphs Similar?. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'21), ACM, 2021 (15.4% acceptance rate).
- What's in the Box? Explaining Neural Networks with Robust Rules. In: Proceedings of the International Conference on Machine Learning (ICML), PMLR, 2021 (21.4% acceptance rate).
- Discovering Reliable Causal Rules. In: Proceedings of the SIAM International Conference on Data Mining (SDM), SIAM, 2021 (21.2% acceptance rate).
- Mining Easily Understandable Models from Complex Event Data. In: Proceedings of the SIAM International Conference on Data Mining (SDM), SIAM, 2021 (21.2% acceptance rate).
- SUSAN: The Structural Similarity Random Walk Kernel. In: Proceedings of the SIAM International Conference on Data Mining (SDM), SIAM, 2021 (21.2% acceptance rate).
- Discovering Fully Oriented Causal Networks. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), AAAI, 2021 (21.3% acceptance rate).
- Integrative Analysis of Epigenetics Data Identifies Gene-Specific Regulatory Elements. Nucleic Acids Research, Oxford University Press (IF 16.97)
- Data-driven Equation for Drug-Membrane Permeability across Drugs and Membranes. Journal of Chemical Physics vol.154(24), AIP, 2021. (IF 2.991)
2020
- The Relaxed Maximum Entropy Distribution and its Application to Pattern Discovery. In: Proceedings of the IEEE International Conference on Data Mining (ICDM'20), IEEE, 2020 (19.7% acceptance rate).
- Just Wait For It... Mining Sequential Patterns with Reliable Prediction Delays. In: Proceedings of the IEEE International Conference on Data Mining (ICDM'20), IEEE, 2020 (full paper, 9.8% acceptance rate; overall 19.7%). (invited for the KAIS Special Issue on the Best of IEEE ICDM 2020)
- Discovering Succinct Pattern Sets Expressing Co-Occurrence and Mutual Exclusivity. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'20), ACM, 2020 (16.8% acceptance rate).
- Discovering Functional Dependencies from Mixed-Type Data. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'20), ACM, 2020 (16.8% acceptance rate).
- Discovering Approximate Functional Dependencies using Smoothed Mutual Information. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'20), ACM, 2020 (16.8% acceptance rate).
- Towards Plausible Graph Anonymization. In: Proceedings of the Network and Distributed System Security Symposium (NDSS), The Internet Society, 2020 (17.4% acceptance rate).
- Explainable Data Decompositions. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI'20), AAAI, 2020 (oral presentation 4.5% acceptance rate; overall 20.6%).
- What is Normal, What is Strange, and What is Missing in a Knowledge Graph. In: Proceedings of the Web Conference (WWW), ACM, 2020 (oral presentation; overall acceptance rate 19.2%).
- Discovering Dependencies with Reliable Mutual Information. Knowledge and Information Systems vol.62, pp 4223-4253, Springer, 2020. (IF 2.936)
- Identifying Domains of Applicability of Machine Learning Models for Materials Science. Nature Communications vol.11(4428), pp 1-9, Nature Research, 2020. (IF 12.12)
Max Planck Institute for Informatics
Saarland University
66123 Saarbrücken, Germany

I lead the research group on Exploratory Data Analysis at the CISPA Helmholtz Center for Information Security. In addition, I'm affiliated as Senior Researcher with the Database and Information Systems (D5) group of the Max Planck Institute for Informatics, and as Professor with the Department of Computer Science of Saarland University.
My research is mainly concerned with causality, unsupervised learning, and data mining. In particular, I enjoy developing theory and algorithms for answering exploratory questions about data, such as 'what is going on in this data?', 'what are the key dependencies?', 'are they causal or confounded?', and so on, without having to make unnecessary or unjustified assumptions about the data generating process. To identify what is worthwhile structure, i.e. what is worth knowing, I often employ well-founded statistical methods based on information theory, and then proceed to develop efficient algorithms that can extract useful and insightful results from large and complex data. I like all data types equally much.
I'm always looking for talented and motivated PhD candidates, postdocs, and HiWi's with a strong background in data mining, machine learning, statistics, and/or mathematics.
Currently, I'm investigating techniques for identifying informative and ideally causal structures in large collections of complex data; how to efficiently mine easily interpretable summaries from data; how to determine and discover causal dependencies from observational data; the theoretical and practical foundations of interactive exploration of very large data, including discovering things by serendipity; how to mine large relational databases; how to mine very large graphs, including characterising influence propagation in social networks; and well-founded approaches for meaningfully comparing and validating exploratory results.
Below, you'll find an overview of my activities, as well as a selection of my recent publications. You might further be interested in our recent workshop on Learning and Mining for Cybersecurity (LEMINCS) at KDD'19, our tutorial on Modern MDL meets Data Mining at KDD'19, or our tutorial on Summarizing Graphs at Multiple Scales at ICDM'18.
Or, in case you're looking for a bit of procrastination, consider
Research in Progress — the secret life of research, through the medium of animated GIFs.
Organisation & Invited Talks
- Panel Chair of SIAM SDM 2019, Calgary, Canada.
- Tutorial Chair of SIAM SDM 2017, Houston, USA.
- Program Co-Chair of ECML PKDD 2016, Riva del Garda, Italy.
- Publicity Co-Chair of ACM IUI 2015, Atlanta, USA.
- Sponsorship Co-Chair of ECML PKDD 2014, Nancy, France.
- Workshop Co-Chair of IEEE ICDM 2012, Brussels, Belgium.
- Organiser of the CECAM Workshop AI for Materials Science: Mining and Learning Interpretable, Explainable, and Generalizable Models from Data, June 2021, online.
- Organiser of the Max Planck BiGmax Summer School 2019 on Data-Driven Materials Discovery, Platja d'Aro, Spain.
- Organiser of the ACM SIGKDD 2019 Workshop on Learning and Mining for Cybersecurity (LEMINCS), Anchorage.
- Organiser of the ACM SIGKDD 2017 Workshop on Interactive Data Exploration and Analytics (IDEA), Halifax.
- Organiser of the ACM SIGKDD 2016 Workshop on Interactive Data Exploration and Analytics (IDEA), San Francisco.
- Organiser of the ACM SIGKDD 2015 Workshop on Interactive Data Exploration and Analytics (IDEA), Sydney, AU.
- Organiser of the ACM SIGKDD 2014 Workshop on Interactive Data Exploration and Analytics (IDEA), NYC, USA.
- Organiser of the International Workshop Data Mining: Beyond the Horizon, November 2014, Bristol, UK.
- Organiser of the ACM SIGKDD 2013 Workshop on Interactive Data Exploration and Analytics (IDEA), Chicago, USA.
- Organiser of the ACM SIGKDD 2013 Workshop on Outlier Detection and Description (ODD), Chicago, USA.
- Organiser of the ECML PKDD 2012 Workshop on Instant Interactive Data Mining (IID), Bristol, UK.
- Organiser of the ACM SIGKDD 2010 Workshop on Useful Patterns (UP), Washington DC, USA.
- Lecturer of the ACM SIGKDD 2019 Tutorial on Modern MDL meets Data Mining, Anchorage, Alaska.
- Lecturer of the IEEE ICDM 2018 Tutorial on Summarizing Graphs at Multiple Scales, Singapore.
- Lecturer of the SIAM SDM 2015 Tutorial on Information Theoretic Methods in Data Mining, Vancouver, Canada.
- Lecturer of the ECML PKDD 2014 Tutorial on Information Theoretic Methods in Data Mining, Nancy, France.
- Lecturer of the IEEE ICDM 2011 Tutorial on Mining Sets of Patterns, Vancouver, Canada.
- Lecturer of the ECML PKDD 2010 Tutorial on Mining Sets of Patterns, Barcelona, Spain.
- Keynote at the 4th SciCAR conference Science Meets Computer-Assisted Reporting, Nov 2-3 2020, Dortmund.
- Keynote at the DSN Workshop on Data-Centric Dependability and Security, Jun 29 2020, Valencia.
- Keynote Lecturer at the EuADS Summer School on Explainable Data Science, Sep 10-13 2019, Luxembourg.
- Keynote speaker at the NOMAD Summer School on Materials Discovery, Sep 24-27 2018, Lausanne, Switzerland.
- Keynote speaker at the 3rd Workshop on Formal Reasoning about Causation, Responsibility, and Explanation in Science and Technology at ETAPS 2018, April 20 2018, Thessaloniki, Greece.
- Keynote speaker at the Symposium on Managing and Exploiting the Raw Material of the 21st Century at the Spring Meeting of the German Physical Society, Mar 11-16 2018, Berlin, Germany.
- Keynote speaker at the CECAM Big-Data Driven Materials Science workshop, Sep 11-13 2017, Lausanne, Switzerland.
- Keynote speaker at the International Conference on Formal Concept Analysis, June 14-16 2017, Rennes, France.
- Keynote speaker at the IRISA PEPS Prefute Symposium, October 26 2016, Rennes, France.
- Keynote speaker at the ECML PKDD 2016 PhD Forum, September 19 2016, Riva del Garda, Italy.
- Keynote speaker at the LORIA Mathematics for Decision and Discovery symposium, May 11 2016, Nancy, France.
- Keynote speaker at the SFB 876 Graduate School Lecture Series, April 14 2016, Dortmund, Germany.
- Keynote lecturer at the Estonian Summer School on Computer and System Science (ESSCaSS'15).
- Keynote speaker at the SFB 1102 Scientific Retreat, Dagstuhl, Germany, June 28 2015.
- Keynote speaker at the opening of GradUS, the Saarland University Graduate Centre, June 15 2015, Saarbrücken.
- Keynote speaker at the SFB 1102 Workshop on Data Mining for Linguistic Analysis, March 13 2015, Saarbrücken.
- Keynote speaker at the IEEE ICDM 2013 PhD Forum, Dallas, Texas.
- Keynote speaker at the IEEE ICDM 2011 Workshop on Data Mining for Computational Collective Intelligence.
- Keynote speaker at the ECML PKDD 2008 Workshop From Local Patterns to Global Models, Antwerp, Belgium.
Awards & Grants
- IEEE ICDM'18 Tao Li Award for Excellence in Research.
- IEEE ICDM'18 Best Paper Award for 'Discovering Reliable Dependencies from Data'.
- UdS-CS'15 Busy Beaver Teaching Award for 'Topics in Algorithmic Data Analysis'.
- ACM SIGKDD'11 Best Student Paper Award for 'Tell Me What I Need to Know'.
- ACM SIGKDD'10 Doctoral Dissertation Runner-Up Award for 'Making Pattern Mining Useful'.
- ECML PKDD'09 Best Student Paper Award for 'Identifying the Components'.
- ELLIS Faculty of the Saarbrücken Unit on Artificial Intelligence and Machine Learning (SAM).
- Honorary Professorship in Computer Science, Saarland University.
- PI of Crushing Antimicrobial Resistance using Explainable AI (HAICU, Helmholtz Association)
- PI of Trusted Federated Data Analytics (Pilot Project, Helmholtz Association)
- Deputy-PI of BigMax (MaxNet, Max Planck Society)
- Independent Research Group 'Exploratory Data Analysis' at the Cluster of Excellence MMCI at U.Saarland ('13–'18).
- Research Project 'Instant, Interactive & Adaptive Data Mining' of the Research Foundation – Flanders (FWO) ('12–'15).
- Post-Doctoral Fellowship of the Research Foundation – Flanders (FWO) ('10–'13).
Editorial Board Memberships
- Data Mining and Knowledge Discovery (DAMI) since '15.
- Knowledge and Information Systems (KAIS) since '20.
- Guest Editorial Board for the ECML PKDD Journal Track '13–'22.
- Guest Editor of the Special Issue on 'Interactive Data Exploration and Analytics' of Transactions on Knowledge Discovery and Data Mining (TKDD)
Journal Reviewing
- Transactions on Knowledge Discovery and Data Mining (TKDD)
- Transactions on Knowledge and Data Engineering (TKDE)
- Journal of Machine Learning Research (JMLR)
- Statistical Analysis and Data Mining (SAM)
- Machine Learning journal (MLj)
- Information Systems (IS)
- Knowledge and Information Systems (KAIS)
- Social Network Analysis and Mining (SNAM)
- Transactions on Intelligent Systems and Technology (TIST)
Program Committees
- ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD) '10–'22, area chair '19–'20
- AAAI Conference on Artificial Intelligence (AAAI) '20–'21, area chair '20
- IEEE International Conference on Data Mining (ICDM) '12–'22, area chair '21–'22
- International Conference on Machine Learning (ICML) '18, '21
- Neural Information Processing Systems (NeurIPS) '17–'21
- European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD) '08–'22, program chair '16, area chair '14, '18–'22
- SIAM Conference on Data Mining (SDM) '10–'22, area chair '19–'22
- Artificial Intelligence and Statistics (AIStats), '22
- Uncertainty in Artificial Intelligence (UAI), '21
- IEEE International Conference on Data Science and Advanced Analytics (DSAA), area chair '19
- International World Wide Web Conference (WWW) '16
- German Conference on Artificial Intelligence (KI) '17
- Intelligent Data Analysis (IDA) '15
- Intelligent User Interfaces (IUI) senior PC '15
- European Conference on Artificial Intelligence (ECAI) '14
- ACM International Conference on Information and Knowledge Management (CIKM) '12–'13
- IEEE International Conference on Data Engineering (ICDE) '13
- Belgian-Dutch Conference on Machine Learning (BENELEARN) '13
- International Conference on Advances in Social Network Analysis and Mining (ASONAM) '12
- International Conference on Pattern Recognition Applications and Methods (ICPRAM) '12
- Workshop on Epidemiology meets Data Mining and Knowledge Discovery (epiDAMIK) '18
- Workshop on Outlier Detection De-Constructed (ODD) '18
- Workshop on Big Graph Mining (BGM) '14
- Workshop on Optimization Methods for Anomaly Detection (OMAD) '14
- Workshop on Practical Theories for Exploratory Data Mining (PTDM) '12
- Workshop on Discovering, Summarizing and Using Multiple Clusterings (MultiClust) '11–'13
- Workshop From Local Patterns to Global Models (LeGo) '08–'09
Courses
- Topics in Algorithmic Data Analysis (SS'22)
- Don't Panic (SS'22)
- Elements of Machine Learning (WS'21)
- Topics in Algorithmic Data Analysis (SS'21)
- Elements of Machine Learning (WS'20)
- Topics in Algorithmic Data Analysis (SS'20)
- Don't Panic (WS'19)
- Elements of Statistical Learning (WS'19)
- Topics in Algorithmic Data Analysis (SS'19)
- Elements of Statistical Learning (WS'18)
- Topics in Algorithmic Data Analysis (SS'18)
- Don't Panic (WS'17)
- Information Retrieval and Data Mining (WS'17)
- Topics in Algorithmic Data Analysis (SS'17)
- Information Theory (WS'16)
- Topics in Algorithmic Data Analysis (SS'16)
- Information Retrieval and Data Mining (WS'15)
- Time Series Analytics (WS'15)
- Topics in Algorithmic Data Analysis (SS'15)
- The Information Theory Seminar (WS'14)
- Topics in Algorithmic Data Analysis (SS'14)
- Artificial Intelligence (SS'13)
- Introduction to Artificial Intelligence (SS'10–'12)
- Advanced Data Mining (SS'10–'13)
- Database Security (WS'11)
- Project Databases (WS'10)
- Introduction to Data Mining (WS'09–'10)
- Internet Programming ('06–'08)
- Databases ('05–'06)
Researchers and Assistants
- Dr. Corinna Coupette
- Joscha Cueppers
- Sebastian Dalleiger
- Jonas Fischer
- Janis Kalofolias
- David Kaltenpoth
- Sarah Mameche
- Osman Ali Mian
- Boris Wiegand
- Muneeb Aadil
- Ekaterina Arkhangelskaya
- Abraham Ezema
- Ravil Gasanov
- Martin Gassner
- Nisha George
- Saif Ali Khan
- Paul Krieger
- Jyotsna Singh
- Marco Schuster
- Nils Walter
- Matthias Wilms
- Sascha Xu
- Mohammad Yaseen
Former Postdoctoral Researchers
Former PhD Students
- Alexander Marx (29 June 2021)
- Panagiotis Mandros (4 March 2021)
- Kailash Budhathoki (3 July 2020)
- Dr. Roel Bertens (27 May 2017)
- Dr. Koen Smets (16 May 2012)
- Dr. Michael Mampaey (21 Oct 2011)
Former MSc Thesis Students
- Ekaterina Arkhangelskaya (2022)
- Abraham Ezema (2021)
- Tim Bruxmeier (2021)
- Frauke Hinrichs (2021)
- Sarah Mameche (2021)
- Jana Hess (2021)
- Anna Oláh (2020)
- Edith Heiter (2020)
- Sandra Sukarieh (2020)
- Joscha Cueppers (2019)
- Divyam Saran (2019)
- Osman Ali Mian (2019)
- Simina Ana Cotop (2019)
- Magnus Halbe (2018)
- Maha Aburahma (2018)
- Iva Farag-Baykova (2018)
- Yuliia Brendel (2018)
- Maike Eissfeller (2018)
- Boris Wiegand (2018)
- Tatiana Dembelova (2018)
- Robin Burghartz (2017)
- Henrik Jilke (2017)
- Benjamin Hättasch (2017)
- Amirhossein Baradaranshahroudi (2016)
- Apratim Bhattacharyya (2016)
- Beata Wójciak (2016)
- Margarita Salyaeva (2016)
- Manan Gandhi (2016)
- Kathrin Grosse (2016)
- Kailash Budhathoki (2015)
- Panagiotis Mandros (2015)
- Thomas Van Brussel (2012)
- Tanja Van den Eede (2011)
- Sandy Moens (2010)
- Andie Similon (2010)
- Sander Schuckmann (2008)
Former BSc Thesis Students
- Matthias Wilms (2021)
- Daniel Kindler (2021)
- Frauke Hinrichs (2017)
- Magnus Halbe (2016)
- Stefan Bier (2014)
Former Research Assistants
- Khánh Hiep Tran
- Grégoire Pacreau
- Michael Hedderich
- Patrick Ferber
- Shweta Mahajan
- Tobias Heinen
- Cristian Caloian
- David Ziegler
- Stefan Neumann
- Andrea Fuksova
- Eustace Ebhotemhen
- Shilpa Garg
- Sinan Bozca
- Michael Wessely