Tiles and Tilings
by Jilles Vreeken, based on the paper 'Tiling Databases' by Geerts, Goethals & Mielikäinen.

Mining Tiles: given a binary database, the code finds all tiles (i.e. itemsets) whose area (cardinality times support) is larger than a given threshold. This can be done fairly efficiently, but may produce a very large number of results.
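
To make the notion of area concrete, here is a minimal sketch of tile mining by naive depth-first enumeration. It is not taken from the source code offered on this page: the names `Database`, `Tile` and `mineTiles` are hypothetical, the database is assumed to be a small rectangular matrix of booleans, and the only pruning is to stop extending itemsets that no row supports.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration; not taken from the actual source code.
using Row = std::vector<bool>;      // one transaction: row[i] is true if item i is present
using Database = std::vector<Row>;  // binary database: rows x items

struct Tile {
    std::vector<std::size_t> items;  // columns of the tile (the itemset)
    std::vector<std::size_t> rows;   // rows that contain every item of the itemset
    std::size_t area() const { return items.size() * rows.size(); }
};

// Depth-first extension of `current` with items numbered `next` or higher.
static void extend(const Database& db, std::size_t numItems, std::size_t next,
                   const Tile& current, std::size_t minArea, std::vector<Tile>& out) {
    for (std::size_t item = next; item < numItems; ++item) {
        // Keep only the supporting rows that also contain `item`.
        std::vector<std::size_t> rows;
        for (std::size_t r : current.rows)
            if (db[r][item]) rows.push_back(r);
        if (rows.empty()) continue;  // support 0: no superset has positive area

        Tile child{current.items, rows};
        child.items.push_back(item);
        if (child.area() > minArea) out.push_back(child);
        extend(db, numItems, item + 1, child, minArea, out);
    }
}

// Returns all tiles whose area is strictly larger than `minArea`.
std::vector<Tile> mineTiles(const Database& db, std::size_t minArea) {
    std::vector<Tile> out;
    if (db.empty()) return out;
    for (const Row& row : db) assert(row.size() == db[0].size());  // rectangular input

    Tile root;  // empty itemset, supported by every row
    for (std::size_t r = 0; r < db.size(); ++r) root.rows.push_back(r);
    extend(db, db[0].size(), 0, root, minArea, out);
    return out;
}
```

For the three rows 110, 111 and 011, calling `mineTiles(db, 3)` yields the itemsets {0,1} and {1,2}, each with support 2 and area 4. Lowering the threshold to 2 already doubles the number of results, which illustrates how quickly the output can grow.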

Mining Tilings: given a binary database, the code finds the set of k (possibly overlapping) itemsets that together cover as many of the 1s in the data as possible. This is fairly inefficient, in particular for dense datasets. (A non-overlapping tiling miner is in the works but is not finished, and as I have not had time to work on it further, it may remain so for a long time.)
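
As an illustration of the tiling objective, the sketch below greedily picks, up to k times, the tile that covers the most 1s not yet covered by earlier picks. Greedy selection is a standard heuristic for coverage problems of this kind; whether the code on this page works this way is an assumption, and the names `Database`, `Tile` and `greedyTiling` are hypothetical. The candidate search is exhaustive over all itemsets, so it is only usable on toy data with fewer than 20 items.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration; not taken from the actual source code.
using Database = std::vector<std::vector<bool>>;  // rows x items, assumed rectangular

struct Tile {
    std::vector<std::size_t> items;  // the itemset
    std::vector<std::size_t> rows;   // rows that contain every item of the itemset
};

// Greedily selects up to k (possibly overlapping) tiles.
// `coveredOnes` receives the number of distinct 1s covered by the result.
std::vector<Tile> greedyTiling(const Database& db, std::size_t k, std::size_t& coveredOnes) {
    coveredOnes = 0;
    std::vector<Tile> tiling;
    if (db.empty()) return tiling;
    const std::size_t numItems = db[0].size();
    assert(numItems < 20);  // exhaustive search over 2^numItems itemsets: toy sizes only
    std::vector<std::vector<bool>> covered(db.size(), std::vector<bool>(numItems, false));

    for (std::size_t step = 0; step < k; ++step) {
        Tile best;
        std::size_t bestGain = 0;
        // Each non-zero bit mask encodes one candidate itemset.
        for (std::size_t mask = 1; mask < (std::size_t{1} << numItems); ++mask) {
            Tile cand;
            for (std::size_t i = 0; i < numItems; ++i)
                if (mask & (std::size_t{1} << i)) cand.items.push_back(i);

            std::size_t gain = 0;  // 1s this candidate would newly cover
            for (std::size_t r = 0; r < db.size(); ++r) {
                bool containsAll = true;
                std::size_t fresh = 0;
                for (std::size_t i : cand.items) {
                    if (!db[r][i]) { containsAll = false; break; }
                    if (!covered[r][i]) ++fresh;
                }
                if (containsAll) { cand.rows.push_back(r); gain += fresh; }
            }
            if (gain > bestGain) { bestGain = gain; best = cand; }
        }
        if (bestGain == 0) break;  // every 1 is already covered

        for (std::size_t r : best.rows)
            for (std::size_t i : best.items) covered[r][i] = true;
        coveredOnes += bestGain;
        tiling.push_back(best);
    }
    return tiling;
}
```

On the rows 110, 111 and 011 with k = 2, this picks {0,1} first (4 new cells) and then {1,2} (3 new cells), covering all seven 1s even though the two tiles overlap in one cell. The inner loop over every itemset also hints at why tiling is expensive on dense data, where many itemsets have non-empty support.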

The implementation is in C++ and was tested on Windows.

Implementation

the C++ source code (Nov 2016), by Jilles Vreeken.

Related Publications

Remmerie, N, De Vijlder, T, Valkenborg, D, Laukens, K, Smets, K, Vreeken, J, Mertens, I, Carpentier, S, Panis, B, De Jaeger, G, Prinsen, E & Witters, E Unraveling Tobacco BY-2 Protein Complexes with BN PAGE/LC-MS/MS and Clustering Methods. Journal of Proteomics vol.74(8), pp 1201-1217, Elsevier, 2011. (IF 5.074)