NeurIPS Evaluating Large-Scale Learning Systems, Virginia Smith

Plenary speaker in Workshop: OPT 2023: Optimization for Machine Learning

Virginia Smith

Abstract: To deploy machine learning models in practice, it is critical to have a way to reliably evaluate their effectiveness. Unfortunately, the scale and complexity of modern machine learning systems make it difficult to provide faithful evaluations and gauge performance across potential deployment scenarios. In this talk, I discuss our work addressing challenges in large-scale ML evaluation. First, I explore the problem of hyperparameter optimization in federated networks of devices, where issues of device subsampling, heterogeneity, and privacy can introduce noise into the evaluation process and make it challenging to perform optimization effectively. Second, I present ReLM, a system for validating and querying large language models (LLMs). Although LLMs have been touted for their ability to generate natural-sounding text, there is a growing need to evaluate the behavior of LLMs in light of issues such as data memorization, bias, and inappropriate language. ReLM poses LLM validation queries as regular expressions to enable faster and more effective LLM evaluation.
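
To make the idea of a regular-expression validation query concrete, here is a minimal sketch in Python. It is not ReLM's interface, which the abstract does not describe; it only applies a regular expression to text that has already been generated. The pattern `PHONE_QUERY`, the helper `flag_matches`, and the sample strings are all hypothetical and chosen for illustration.

```python
import re

# Hypothetical validation query: flag outputs that contain a
# US-style phone number, as a stand-in for a memorization check.
PHONE_QUERY = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")


def flag_matches(outputs, query):
    """Return the outputs that contain at least one match for the query."""
    return [text for text in outputs if query.search(text)]


# Hypothetical model outputs, hard-coded here for illustration only.
sample_outputs = [
    "You can reach the office at 555-123-4567.",
    "The office is open on weekdays.",
]

flagged = flag_matches(sample_outputs, PHONE_QUERY)
```

Here `flagged` holds only the outputs that match the pattern. Filtering finished text this way is the simplest possible reading of the idea; a system that evaluates the model directly against the pattern, as the abstract suggests ReLM does, would work differently.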
