Workshop

Challenges of Data Visualization

Barbara Hammer ⋅ Laurens van der Maaten ⋅ Fei Sha ⋅ Alexander Smola

Abstract

The increasing amount and complexity of electronic data sets make visualization a key technology for providing an intuitive interface to the information. Unsupervised learning has developed powerful techniques for, e.g., manifold learning, dimensionality reduction, collaborative filtering, and topic modeling. However, the field has so far not fully appreciated the problems faced by data analysts who seek to apply unsupervised learning to information visualization, such as heterogeneous and context-dependent objectives or streaming and distributed data of varying credibility. Moreover, the unsupervised learning field has hitherto failed to develop human-in-the-loop approaches to data visualization, even though such approaches, including, e.g., user relevance feedback, are necessary to arrive at valid and interesting results.\par As a consequence, a number of challenges arise in the context of data visualization that cannot be solved by classical methods in the field:
\begin{itemize}
\item \emph{Methods have to deal with modern data formats and data sets:}\par\noindent How can the technologies be adapted to deal with streaming and possibly non-i.i.d. data sets? How can specific data formats, such as spatio-temporal data, spectral data, or data characterized by a general, possibly non-metric dissimilarity measure, be visualized appropriately? How can we deal with heterogeneous data of varying credibility? How can the dissimilarity measure be adapted to emphasize the aspects that are relevant for visualization?
\item \emph{Available techniques for specific tasks should be combined in a canonical way:}\par\noindent How can unsupervised learning techniques be combined to construct good visualizations? For instance, how can we effectively combine techniques for clustering, collaborative filtering, and topic modeling with dimensionality reduction to construct scatter plots that reveal the similarity between groups of data, movies, or documents? How can we arrive at context-dependent visualizations?
\item \emph{Visualization techniques should be accompanied by theoretical guarantees:}\par\noindent What are reasonable mathematical specifications of data visualization to shape this inherently ill-posed problem? Can the user control these specifications in an efficient way? How can visualizations be evaluated? What are reasonable benchmarks? What are reasonable evaluation measures?
\item \emph{Visualization techniques should be ready to use for users outside the field:}\par\noindent
Which methods are suited to users outside the field? How can we avoid the need to set specific technical parameters by hand or to choose manually among different possible mathematical algorithms? Can this need be replaced by intuitive interactive mechanisms that non-experts can use?
\end{itemize}
The goal of the workshop is to identify the state of the art with respect to these challenges and to discuss how modern techniques can meet these demands.
