2016  
Causal Inference by Compression. In: Proceedings of the IEEE International Conference on Data Mining (ICDM'16), IEEE, 2016 (regular paper, 8.5% acceptance rate; overall 19.6%). 

Keeping it Short and Simple: Summarising Complex Event Sequences with Multivariate Patterns. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'16), pp 735–744, ACM, 2016 (oral presentation, 8.9% acceptance rate; overall 18.1%).

Reconstructing an Epidemic over Time. In: Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), pp 1835–1844, ACM, 2016 (overall 18.1% acceptance rate).

Flexibly Mining Better Subgroups. In: Proceedings of the SIAM International Conference on Data Mining (SDM), pp 585–593, SIAM, 2016 (overall 25% acceptance rate).

Universal Dependency Analysis. In: Proceedings of the SIAM International Conference on Data Mining (SDM), pp 792–800, SIAM, 2016 (overall 25% acceptance rate).

Linear-time Detection of Non-Linear Changes in Massively High Dimensional Time Series. In: Proceedings of the SIAM International Conference on Data Mining (SDM), pp 828–836, SIAM, 2016 (overall 25% acceptance rate).

Is Exploratory Search Different? A Comparison of Information Search Behavior for Exploratory and Lookup Tasks. Journal of the Association for Information Science and Technology (JASIST) vol.67(11), pp 2635–2651, Wiley, 2016. (IF 2.26)

2015  
AdaptiveNav: Adaptive Discovery of Interesting and Surprising Nodes in Large Graphs. In: Proceedings of the IEEE Conference on Visualization (VIS), IEEE, 2015. 

The Difference and the Norm – Characterising Similarities and Differences between Databases. In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), pp 206–223, Springer, 2015.

Non-Parametric Jensen-Shannon Divergence. In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), pp 173–189, Springer, 2015.

Causal Inference by Direction of Information. In: Proceedings of the SIAM International Conference on Data Mining (SDM), pp 909–917, SIAM, 2015.

Getting to Know the Unknown Unknowns: Destructive-Noise Resistant Boolean Matrix Factorization. In: Proceedings of the SIAM International Conference on Data Mining (SDM), pp 325–333, SIAM, 2015.

Hidden Hazards: Finding Missing Nodes in Large Graph Epidemics. In: Proceedings of the SIAM International Conference on Data Mining (SDM), pp 415–423, SIAM, 2015.

Summarizing and Understanding Large Graphs. Statistical Analysis and Data Mining vol.8(3), pp 183–202, Wiley, 2015.

The Blind Men and the Elephant: About Meeting the Problem of Multiple Truths in Data from Clustering and Pattern Mining Perspectives. Machine Learning vol.98(1), pp 121–155, Springer, 2015. (IF 1.587)

2014  
Narrow or Broad? Estimating Subjective Specificity in Exploratory Search. In: Proceedings of the ACM Conference on Information and Knowledge Management (CIKM), pp 819–828, ACM, 2014 (IR track full paper, overall 21% acceptance rate).

A Fresh Look on Knowledge Bases: Distilling Named Events from News. In: Proceedings of the ACM Conference on Information and Knowledge Management (CIKM), pp 1689–1698, ACM, 2014 (KM track full paper, overall 21% acceptance rate).

Multivariate Maximal Correlation Analysis. In: Proceedings of the International Conference on Machine Learning (ICML), pp 775–783, JMLR: W&CP vol.32, 2014 (25.0% acceptance rate).

VoG: Summarizing and Understanding Large Graphs. In: Proceedings of the SIAM International Conference on Data Mining (SDM), pp 91–99, SIAM, 2014. (fast track journal invitation, as one of the best of SDM'14; full paper with presentation, 15.4% acceptance rate)

Interesting Patterns. In: Aggarwal, CC & Han, J (eds) Frequent Pattern Mining, pp 105–134, Springer, 2014.

Mining and Using Sets of Patterns through Compression. In: Aggarwal, CC & Han, J (eds) Frequent Pattern Mining, pp 165–198, Springer, 2014.

Frequent Pattern Mining Algorithms for Data Clustering. In: Aggarwal, CC & Han, J (eds) Frequent Pattern Mining, pp 403–424, Springer, 2014.

mdl4bmf: Minimum Description Length for Boolean Matrix Factorization. Transactions on Knowledge Discovery from Data vol.8(4), pp 1–30, ACM, 2014. (IF 1.68)

Uncovering the Plot: Detecting Surprising Coalitions of Entities in Multi-Relational Schemas. Data Mining and Knowledge Discovery vol.28(5), pp 1398–1428, Springer, 2014. (IF 2.877) (ECML PKDD'14 Journal Track)

Unsupervised Interaction-Preserving Discretization of Multivariate Data. Data Mining and Knowledge Discovery vol.28(5), pp 1366–1397, Springer, 2014. (IF 2.877) (ECML PKDD'14 Journal Track)
Exploratory Data Analysis
Cluster of Excellence MMCI
Saarland University
Building E 1.7 Room 3.22
66123 Saarbrücken, Germany
Since October 2013, I have led the independent research group on Exploratory Data Analysis at the DFG Cluster of Excellence on Multimodal Computing and Interaction at Saarland University.
In addition, I'm affiliated as Senior Researcher with the Database and Information Systems (D5) group of the Max Planck Institute for Informatics.
My research is mainly concerned with exploratory data mining. That is, I develop theory and algorithms for answering the question 'this is my data, tell me what I need to know'. To identify what you need to know, i.e., what the most interesting structure in the data is, I often employ well-founded statistical methods. In particular, information theory, through the principles of Minimum Description Length (MDL) and Maximum Entropy, has proven to be a highly valuable tool. Next, I develop highly efficient algorithms for extracting these interesting structures, i.e., models, from very large and complex data, and I investigate how we can use these structures in a wide range of applications, including identifying rare diseases, e-health, bioinformatics, market analysis, and product recommendation.
I'm always looking for talented and motivated PhD candidates, postdocs, and HiWis with a strong background in data mining, machine learning, statistics, and/or mathematics.
Currently I'm investigating techniques for identifying informative local structures in large collections of complex data; how to efficiently mine good data descriptions directly from such data; the theoretical and practical foundations of interactive exploration of very large data, including discovering things by serendipity; how to mine large relational databases; how to mine very large graphs, including characterising influence propagation in social networks; and well-founded approaches for meaningfully comparing and validating explorative results.
Below, you'll find an overview of my activities, as well as a selection of my recent publications. You might further be interested in my publications, implementations, our workshop on Interactive Data Exploration and Analytics (IDEA) at KDD'16, or our tutorials on Information Theoretic Methods in Data Mining at ECML PKDD'14 and SIAM SDM'15. Or, in case you're looking for a bit of procrastination, consider Research in Progress: the secret life of research, through the medium of animated GIFs.
 Organisation & Invited Talks
 Tutorial Chair of SIAM SDM 2017, Houston, USA.
 Program Co-Chair of ECML PKDD 2016, Riva del Garda, Italy.
 Publicity Co-Chair of ACM IUI 2015, Atlanta, USA.
 Sponsorship Co-Chair of ECML PKDD 2014, Nancy, France.
 Workshop Co-Chair of IEEE ICDM 2012, Brussels, Belgium.
 Organiser of the ACM SIGKDD 2016 Workshop on Interactive Data Exploration and Analytics (IDEA), San Francisco.
 Organiser of the ACM SIGKDD 2015 Workshop on Interactive Data Exploration and Analytics (IDEA), Sydney, AU.
 Organiser of the ACM SIGKDD 2014 Workshop on Interactive Data Exploration and Analytics (IDEA), NYC, USA.
 Organiser of the International Workshop Data Mining: Beyond the Horizon, November 2014, Bristol, UK.
 Organiser of the ACM SIGKDD 2013 Workshop on Interactive Data Exploration and Analytics (IDEA), Chicago, USA.
 Organiser of the ACM SIGKDD 2013 Workshop on Outlier Detection and Description (ODD), Chicago, USA.
 Organiser of the ECML PKDD 2012 Workshop on Instant Interactive Data Mining (IID), Bristol, UK.
 Organiser of the ACM SIGKDD 2010 Workshop on Useful Patterns (UP), Washington DC, USA.
 Lecturer of the SIAM SDM 2015 Tutorial on Information Theoretic Methods in Data Mining, Vancouver, Canada.
 Lecturer of the ECML PKDD 2014 Tutorial on Information Theoretic Methods in Data Mining, Nancy, France.
 Lecturer of the IEEE ICDM 2011 Tutorial on Mining Sets of Patterns, Vancouver, Canada.
 Lecturer of the ECML PKDD 2010 Tutorial on Mining Sets of Patterns, Barcelona, Spain.
 Invited speaker at the IRISA PEPS Prefute Symposium, October 26 2016, Rennes, France.
 Invited speaker at the ECML PKDD 2016 PhD Forum, September 19 2016, Riva del Garda, Italy.
 Invited speaker at the LORIA Mathematics for Decision and Discovery symposium, May 11 2016, Nancy, France.
 Invited speaker at the SFB 876 Graduate School Lecture Series, April 14 2016, Dortmund, Germany.
 Invited lecturer at the 14th Estonian Summer School on Computer and System Science (ESSCaSS'15).
 Invited speaker at the SFB 1102 Scientific Retreat, Dagstuhl, Germany, June 28 2015.
 Invited speaker at the opening of GradUS, the Saarland University Graduate Centre, June 15 2015, Saarbrücken.
 Invited speaker at the SFB 1102 Workshop on Data Mining for Linguistic Analysis, March 13 2015, Saarbrücken.
 Invited speaker at the IEEE ICDM 2013 PhD Forum, Dallas, Texas.
 Invited speaker at the IEEE ICDM 2011 Workshop on Data Mining for Computational Collective Intelligence.
 Invited speaker at the ECML PKDD 2008 Workshop From Local Patterns to Global Models, Antwerp, Belgium.

Awards & Grants
 ACM SIGKDD'11 Best Student Paper Award for 'Tell Me What I Need to Know'
 ACM SIGKDD'10 Doctoral Dissertation Runner-Up Award for 'Making Pattern Mining Useful'
 ECML PKDD'09 Best Student Paper Award for 'Identifying the Components'
 UdS-CS Busy Beaver teaching award for 'Topics in Algorithmic Data Analysis (TADA)' SS'15.
 UdS-CS 'Topics in Algorithmic Data Analysis (TADA)' ranked highest of all SS'14 Advanced Lectures.
 Young Researcher at the Heidelberg Laureate Forum 2014, Heidelberg, Germany.
 Independent Research Group 'Exploratory Data Analysis' at the Cluster of Excellence MMCI at U.Saarland ('13–'18)
 Research Project 'Instant, Interactive & Adaptive Data Mining' of the Research Foundation – Flanders (FWO) ('12–'15)
 Post-Doctoral Fellowship of the Research Foundation – Flanders (FWO) ('10–'13)
 UA-BOF-KP Small Project (2010)
 UA-BOF-IWS Postdoctoral Researcher ('09–'10)
 Editorial Board Memberships

Journal Reviewing
 Transactions on Knowledge Discovery from Data (TKDD)
 Transactions on Knowledge and Data Engineering (TKDE)
 Journal of Machine Learning Research (JMLR)
 Statistical Analysis and Data Mining (SAM)
 Machine Learning journal (MLj)
 Information Systems (IS)
 Knowledge and Information Systems (KAIS)
 Social Network Analysis and Mining (SNAM)
 Transactions on Intelligent Systems and Technology (TIST)

Program Committees
 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD) '10–'16
 ACM International Conference on Information and Knowledge Management (CIKM) '12–'13
 IEEE International Conference on Data Mining (ICDM) '12,'14–'16
 IEEE International Conference on Data Engineering (ICDE) '13
 SIAM Conference on Data Mining (SDM) '10,'11,'15–'17
 International World Wide Web Conference (WWW) '16
 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD) '08–'15, area chair '14, program chair '16
 Intelligent User Interfaces (IUI) senior PC '15
 Intelligent Data Analysis (IDA) '15
 European Conference on Artificial Intelligence (ECAI) '14
 International Conference on Advances in Social Network Analysis and Mining (ASONAM) '12
 International Conference on Pattern Recognition Applications and Methods (ICPRAM) '12
 Belgian-Dutch Conference on Machine Learning (BENELEARN) '13
 Workshop on Big Graph Mining (BGM) '14
 Workshop on Optimization Methods for Anomaly Detection (OMAD) '14
 Workshop on Practical Theories for Exploratory Data Mining (PTDM) '12
 Workshop on Discovering, Summarizing and Using Multiple Clusterings (MultiClust) '11–'13
 Workshop From Local Patterns to Global Models (LeGo) '08–'09

Graduate Courses
 Information Theory (WS'16)
 Topics in Algorithmic Data Analysis (SS'16)
 Information Retrieval and Data Mining (WS'15)
 Time Series Analytics (WS'15)
 Topics in Algorithmic Data Analysis (SS'15)
 The Information Theory Seminar (WS'14)
 Topics in Algorithmic Data Analysis (SS'14)
 Advanced Data Mining (SS'10–'13)
 Database Security (WS'11)
 Project Databases (WS'10)

Undergraduate Courses
 Artificial Intelligence (SS'13)
 Introduction to Artificial Intelligence (SS'10–'12)
 Introduction to Data Mining (WS'09–'10)
 Internet Programming ('06–'08)
 Databases ('05–'06)

Researchers and Assistants
 Dr. Mario Boley
 Kailash Budhathoki
 Janis Kalofolias
 Panagiotis Mandros
 Alexander Marx
 Roel Bertens
 Amirhossein Baradaranshahroudi
 Iva Baykova
 Robin Burghartz
 Jonas Fischer
 Patrick Ferber
 Xingaong Gao
 Magnus Halbe
 Michael A. Hedderich
 Frauke Hinrichs
 Former Postdoctoral Researchers

Former PhD Students
 Polina Rozenshtein (visiting, from Aalto U.)
 Dr. Koen Smets (16 May 2012)
 Dr. Michael Mampaey (21 Oct 2011)

Former MSc Thesis Students
 Amirhossein Baradaranshahroudi (2016)
 Apratim Bhattacharyya (2016)
 Beata Wójciak (2016)
 Margarita Salyaeva (2016)
 Manan Gandhi (2016)
 Kathrin Grosse (2016)
 Kailash Budhathoki (2015)
 Panagiotis Mandros (2015)
 Thomas Van Brussel (2012)
 Tanja Van den Eede (2011)
 Sandy Moens (2010)
 Andie Similon (2010)
 Sander Schuckmann (2008)

Former BSc Students
 Magnus Halbe (2016)
 Stefan Bier (2014)

Former Research Assistants
 Shweta Mahajan
 Sebastian Brust
 Sinan Bozca
 Cristian Caloian
 Eustace Ebhotemhen
 Andrea Fuksova
 Shilpa Garg
 Tobias Heinen
 Stefan Neumann
 Michael Wessely
 David Ziegler