Subjectively interesting alternative clusterings

Authors / Contributors:
[Kleanthis-Nikolaos Kontonasios, Tijl De Bie]
Place, publisher, year:
2015
Contained in:
Machine Learning, 98/1-2(2015-01-01), 31-56
Format:
Article (online)
ID: 60547804X
LEADER caa a22 4500
001 60547804X
003 CHVBK
005 20210128100403.0
007 cr unu---uuuuu
008 210128e20150101xx s 000 0 eng
024 7 0 |a 10.1007/s10994-013-5333-z  |2 doi 
035 |a (NATIONALLICENCE)springer-10.1007/s10994-013-5333-z 
245 0 0 |a Subjectively interesting alternative clusterings  |h [Electronic data]  |c [Kleanthis-Nikolaos Kontonasios, Tijl De Bie] 
520 3 |a We deploy a recently proposed framework for mining subjectively interesting patterns from data to the problem of alternative clustering, where patterns are sets of clusters (clusterings) in the data. This framework outlines how subjective interestingness of patterns (here, clusterings) can be quantified using sound information theoretic concepts. We demonstrate how it motivates a new objective function quantifying the interestingness of a clustering, automatically accounting for a user's prior beliefs and for redundancies between the discovered patterns. Directly searching for the optimal set of clusterings defined in this way is hard. However, the optimization problem can be solved approximately if clusterings are generated iteratively. In this iterative scheme, each subsequent clustering is maximally interesting given the whole set of previously generated clusterings, automatically trading off interestingness with non-redundancy. Although generating each clustering in an iterative fashion is computationally hard as well, we develop an approximation technique similar to spectral clustering algorithms. Our method can generate as many clusterings as the user requires. Subjective evaluation or the value of the objective function can guide the termination of the process. In addition our method allows varying the number of clusters in each successive clustering. Experiments on artificial and real-world datasets show that the mined clusterings fulfill the requirements of a good clustering solution by being both non-redundant and of high compactness. Comparison with existing solutions shows that our approach compares favourably with regard to well-known objective measures of similarity and quality of clusterings, even though it is not designed to directly optimize them. 
540 |a The Author(s), 2013 
690 7 |a Subjective interestingness  |2 nationallicence 
690 7 |a Maximum entropy modelling  |2 nationallicence 
690 7 |a Alternative clustering  |2 nationallicence 
700 1 |a Kontonasios  |D Kleanthis-Nikolaos  |u Intelligent Systems Laboratory, University of Bristol, Bristol, UK  |4 aut 
700 1 |a De Bie  |D Tijl  |u Intelligent Systems Laboratory, University of Bristol, Bristol, UK  |4 aut 
773 0 |t Machine Learning  |d Springer US; http://www.springer-ny.com  |g 98/1-2(2015-01-01), 31-56  |x 0885-6125  |q 98:1-2<31  |1 2015  |2 98  |o 10994 
856 4 0 |u https://doi.org/10.1007/s10994-013-5333-z  |q text/html  |z Online access via DOI 
898 |a BK010053  |b XK010053  |c XK010000 
900 7 |a Metadata rights reserved  |b Springer special CC-BY-NC licence  |2 nationallicence 
908 |D 1  |a research-article  |2 jats 
949 |B NATIONALLICENCE  |F NATIONALLICENCE  |b NL-springer 
950 |B NATIONALLICENCE  |P 856  |E 40  |u https://doi.org/10.1007/s10994-013-5333-z  |q text/html  |z Online access via DOI 
950 |B NATIONALLICENCE  |P 700  |E 1-  |a Kontonasios  |D Kleanthis-Nikolaos  |u Intelligent Systems Laboratory, University of Bristol, Bristol, UK  |4 aut 
950 |B NATIONALLICENCE  |P 700  |E 1-  |a De Bie  |D Tijl  |u Intelligent Systems Laboratory, University of Bristol, Bristol, UK  |4 aut 
950 |B NATIONALLICENCE  |P 773  |E 0-  |t Machine Learning  |d Springer US; http://www.springer-ny.com  |g 98/1-2(2015-01-01), 31-56  |x 0885-6125  |q 98:1-2<31  |1 2015  |2 98  |o 10994