GLOBEM Dataset: Multi-Year Datasets for Longitudinal Human Behavior Modeling Generalization
Xuhai Xu · Han Zhang · Yasaman Sefidgar · Yiyi Ren · Xin Liu · Woosuk Seo · Jennifer Brown · Kevin Kuehn · Mike Merrill · Paula Nurius · Shwetak Patel · Tim Althoff · Margaret Morris · Eve Riskin · Jennifer Mankoff · Anind Dey

Tue Nov 29 02:00 PM -- 04:00 PM (PST) @ Hall J #1031

Recent research has demonstrated the capability of behavior signals captured by smartphones and wearables for longitudinal behavior modeling. However, there is a lack of a comprehensive public dataset that serves as an open testbed for fair comparison among algorithms. Moreover, prior studies mainly evaluate algorithms using data from a single population within a short period, without measuring the cross-dataset generalizability of these algorithms. We present the first multi-year passive sensing datasets, containing over 700 user-years and 497 unique users’ data collected from mobile and wearable sensors, together with a wide range of well-being metrics. Our datasets can support multiple cross-dataset evaluations of behavior modeling algorithms’ generalizability across different users and years. As a starting point, we provide the benchmark results of 18 algorithms on the task of depression detection. Our results indicate that both prior depression detection algorithms and domain generalization techniques show potential but need further research to achieve adequate cross-dataset generalizability. We envision our multi-year datasets can support the ML community in developing generalizable longitudinal behavior modeling algorithms.

Author Information

Xuhai Xu (University of Washington)
Han Zhang (Department of Computer Science, University of Washington)
Han Zhang

Han is a PhD student in the Paul G. Allen School of Computer Science & Engineering. Her research interests lie in human behavior modeling and the design of computer-aided methods to improve human well-being. She is also interested in fairness in machine learning and building explainable ML. Her recent projects focus on the early prediction of college students' academic performance by leveraging multiple types of data, as well as on understanding their academic-related behaviors.

Yasaman Sefidgar (University of Washington)
Yiyi Ren (University of Washington)
Xin Liu (University of Washington)
Woosuk Seo (University of Michigan - Ann Arbor)
Jennifer Brown (University of Washington)
Kevin Kuehn (University of Washington)
Mike Merrill (Department of Computer Science, University of Washington)
Paula Nurius
Shwetak Patel (University of Washington)
Tim Althoff (University of Washington)
Margaret Morris (University of Washington)
Eve Riskin (University of Washington)
Jennifer Mankoff (University of Washington)
Anind Dey (University of Washington)

More from the Same Authors