Multilabel learning aims to predict the labels of unseen instances by learning from training samples, each of which is associated with a set of known labels. In this paper, we propose a hierarchical tree model for multilabel learning and develop the ML-Tree algorithm to find the tree structure. ML-Tree treats a tree as a hierarchy of data and constructs it by inducing one-against-all SVM classifiers at each node, which recursively partition the data into child nodes. For each node, we define a predictive label vector that represents how predictive labels are transmitted through the tree model; this vector supports both multilabel prediction and the automatic discovery of label relationships.
If two labels frequently co-occur as predictive labels at leaf nodes, they are assumed to be related, so the amount of predictive label co-occurrence provides an estimate of the label relationships. We evaluate ML-Tree on 11 real data sets from different domains and compare it with six well-established multilabel learning algorithms. Performance is assessed with 16 commonly used measures, and Friedman and Nemenyi tests are conducted to assess the statistical significance of the differences in performance. Experimental results demonstrate the effectiveness of our method.
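
The co-occurrence estimate described above can be illustrated with a minimal sketch. It is not the ML-Tree implementation: the function names `label_cooccurrence` and `cooccurrence_rate`, the example labels, and the choice to normalize by the number of leaf nodes are all assumptions made for illustration. The sketch assumes that each leaf node has already been assigned a set of predictive labels and counts how often each pair of labels appears together.

```python
from collections import Counter
from itertools import combinations


def label_cooccurrence(leaf_label_sets):
    """Count how many leaves contain each unordered pair of labels.

    leaf_label_sets: iterable of label collections, one per leaf node
    (hypothetical input; the tree itself is not built here).
    """
    pair_counts = Counter()
    for labels in leaf_label_sets:
        # Sort so that each unordered pair maps to a single key.
        for a, b in combinations(sorted(set(labels)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def cooccurrence_rate(pair_counts, n_leaves):
    """Normalize counts by the number of leaves (an assumed normalization)."""
    return {pair: count / n_leaves for pair, count in pair_counts.items()}


# Hypothetical predictive labels at four leaf nodes (illustrative data only).
leaves = [
    {"beach", "sea"},
    {"beach", "sea", "sunset"},
    {"mountain"},
    {"sea", "sunset"},
]

counts = label_cooccurrence(leaves)
rates = cooccurrence_rate(counts, len(leaves))

for pair, rate in sorted(rates.items()):
    print(pair, rate)
```

Under this assumed normalization, pairs with higher rates would be read as more strongly related labels, and a pair that never shares a leaf simply does not appear in the result.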