Tue Dec 08 04:00 PM -- 08:55 PM (PST) @ 210D
An interactive system for the extraction of meaningful visualizations from high-dimensional data
Madalina Fiterau · Artur Dubrawski · Donghan Wang

We demonstrate our novel techniques for building ensembles of low-dimensional projections that facilitate data understanding and visualization by human users, given a learning task such as classification or regression. Our system trains user-friendly models called Informative Projection Ensembles (IPEs). Each ensemble comprises a set of compact submodels that comply with stringent user-specified requirements on model size and complexity, so that the patterns extracted from the data can be visualized. IPEs handle data in a query-specific manner: each sample is assigned to a specialized Informative Projection, and the data is partitioned automatically during learning. With this setup, the models attain high performance while maintaining the transparency and simplicity of low-dimensional classifiers and regressors.
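The query-specific idea can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a nearest-centroid base classifier, a greedy choice of feature subsets that cover training samples not yet classified correctly, and a confidence margin for routing each query to a submodel. All function names (`fit_centroids`, `train_ensemble`, `predict`) are hypothetical, and the sketch requires at least two classes.

```python
from itertools import combinations
from math import dist


def fit_centroids(X, y, dims):
    """Per-class centroids of the data restricted to the feature subset `dims`."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(dims))
        for i, d in enumerate(dims):
            acc[i] += row[d]
        counts[label] = counts.get(label, 0) + 1
    return {c: tuple(v / counts[c] for v in s) for c, s in sums.items()}


def margin_and_label(centroids, row, dims):
    """Return (margin, label): the gap between the two closest centroids, and the closest class."""
    point = tuple(row[d] for d in dims)
    ranked = sorted((dist(point, c), label) for label, c in centroids.items())
    return ranked[1][0] - ranked[0][0], ranked[0][1]


def train_ensemble(X, y, subspace_dim=2, n_submodels=2):
    """Greedily keep projections that correctly classify still-uncovered training samples."""
    candidates = []
    for dims in combinations(range(len(X[0])), subspace_dim):
        centroids = fit_centroids(X, y, dims)
        correct = {i for i, (row, label) in enumerate(zip(X, y))
                   if margin_and_label(centroids, row, dims)[1] == label}
        candidates.append((dims, centroids, correct))

    ensemble, uncovered = [], set(range(len(X)))
    while candidates and uncovered and len(ensemble) < n_submodels:
        best = max(candidates, key=lambda c: len(c[2] & uncovered))
        if not best[2] & uncovered:
            break  # no remaining projection explains any new sample
        candidates.remove(best)
        ensemble.append((best[0], best[1]))
        uncovered -= best[2]
    return ensemble


def predict(ensemble, row):
    """Route the query to the submodel with the largest margin (an assumed selection rule)."""
    best = None
    for dims, centroids in ensemble:
        margin, label = margin_and_label(centroids, row, dims)
        if best is None or margin > best[0]:
            best = (margin, label, dims)
    return best[1], best[2]


# Toy data: feature 0 separates the first four samples, feature 1 the last four.
X = [[0.0, 4.0], [1.0, 4.0], [9.0, 4.0], [10.0, 4.0],
     [4.0, 0.0], [4.0, 1.0], [4.0, 9.0], [4.0, 10.0]]
y = [0, 0, 1, 1, 0, 0, 1, 1]
ensemble = train_ensemble(X, y, subspace_dim=1, n_submodels=2)
print(predict(ensemble, [0.5, 4.0]))  # → (0, (0,))
print(predict(ensemble, [4.0, 9.5]))  # → (1, (1,))
```

Returning the chosen feature subset alongside the label reflects the transparency goal: the subset is exactly the low-dimensional view a user would inspect for that query.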

In this demo, we illustrate how Informative Projection Ensembles have been of great use in practical applications. Users can also train their own models in real time, specifying settings such as the number of submodels, the dimensionality of the subspaces, the costs associated with features, and the type of base classifier or regressor to use. Users can see the decision-support system in action, performing classification, regression, or clustering on batches of test data. The handling of test data is transparent as well: the system highlights the selected submodel and shows how that submodel assigns labels or values to the queries. Users can give feedback on the assigned outputs and perform pairwise comparisons of the trained models.
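The user-adjustable settings listed above could be gathered into a single validated record. The sketch below is only an assumption about how such settings might be organized; the class name `TrainingSettings`, the field names, the default values, and the accepted task strings are all hypothetical and do not describe the demo system's actual interface.

```python
from dataclasses import dataclass, field

# Task names taken from the description above; the string spellings are assumed.
KNOWN_TASKS = ("classification", "regression", "clustering")


@dataclass
class TrainingSettings:
    """Hypothetical container for the settings a user chooses before training."""
    n_submodels: int = 3                 # number of projections in the ensemble
    subspace_dim: int = 2                # features per projection (2 or 3 stay plottable)
    base_learner: str = "knn"            # type of base classifier or regressor
    task: str = "classification"
    feature_costs: dict = field(default_factory=dict)  # feature name -> cost

    def __post_init__(self):
        if self.n_submodels < 1:
            raise ValueError("n_submodels must be at least 1")
        if self.subspace_dim < 1:
            raise ValueError("subspace_dim must be at least 1")
        if self.task not in KNOWN_TASKS:
            raise ValueError(f"unknown task: {self.task!r}")
        if any(cost < 0 for cost in self.feature_costs.values()):
            raise ValueError("feature costs must be non-negative")


settings = TrainingSettings(n_submodels=2, feature_costs={"heart_rate": 1.5})
```

Validating at construction time means an invalid combination is rejected before any training starts, which matters when models are trained interactively.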

We encourage participants to bring their own data to analyze. Users can save the outcome of the analysis, whether for their own datasets or for non-proprietary ones. The system supports the CSV format for data and XML for models.
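For participants preparing data, the following sketch shows one plausible way to read a CSV dataset and write a model summary as XML using only the Python standard library. The column layout (a header row plus a `label` column) and the XML element names (`ensemble`, `submodel`, `feature`) are assumptions made for illustration; the system's actual file schemas are not specified here.

```python
import csv
import io
import xml.etree.ElementTree as ET


def read_csv_dataset(text, label_column="label"):
    """Parse CSV text with a header row into feature names, numeric rows, and labels."""
    reader = csv.DictReader(io.StringIO(text))
    feature_names = [n for n in reader.fieldnames if n != label_column]
    X, y = [], []
    for record in reader:
        X.append([float(record[n]) for n in feature_names])
        y.append(record[label_column])
    return feature_names, X, y


def ensemble_to_xml(submodels):
    """Serialize submodels (dicts with 'learner' and 'features') to a hypothetical XML layout."""
    root = ET.Element("ensemble")
    for sub in submodels:
        node = ET.SubElement(root, "submodel", learner=sub["learner"])
        for name in sub["features"]:
            ET.SubElement(node, "feature").text = name
    return ET.tostring(root, encoding="unicode")


csv_text = "x1,x2,label\n0.5,4.0,a\n4.0,9.5,b\n"
names, X, y = read_csv_dataset(csv_text)
xml_text = ensemble_to_xml([{"learner": "knn", "features": ["x1"]}])
```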