Timezone: »

Poster
Constrained Robust Submodular Partitioning
Shengjie Wang · Tianyi Zhou · Chandrashekhar Lavania · Jeff A Bilmes

Thu Dec 09 04:30 PM -- 06:00 PM (PST) @
In the robust submodular partitioning problem, we aim to allocate a set of items into $m$ blocks, so that the evaluation of the minimum block according to a submodular function is maximized. Robust submodular partitioning promotes the diversity of every block in the partition. It has many applications in machine learning, e.g., partitioning data for distributed training so that the gradients computed on every block are consistent. We study an extension of the robust submodular partition problem with additional constraints (e.g., cardinality, multiple matroids, and/or knapsack) on every block. For example, when partitioning data for distributed training, we can add a constraint that the number of samples of each class is the same in each partition block, ensuring data balance. We present two classes of algorithms, i.e., Min-Block Greedy based algorithms (with an $\Omega(1/m)$ bound), and Round-Robin Greedy based algorithms (with a constant bound) and show that under various constraints, they still have good approximation guarantees. Interestingly, while normally the latter runs in only weakly polynomial time, we show that using the two together yields strongly polynomial running time while preserving the approximation guarantee. Lastly, we apply the algorithms on a real-world machine learning data partitioning problem showing good results.

#### Author Information

##### Tianyi Zhou (University of Washington, Seattle)

Tianyi Zhou (https://tianyizhou.github.io) is a tenure-track assistant professor of computer science at the University of Maryland, College Park. He received his Ph.D. from the school of computer science & engineering at the University of Washington, Seattle. His research interests are in machine learning, optimization, and natural language processing (NLP). His recent works study curriculum learning that can combine high-level human learning strategies with model training dynamics to create a hybrid intelligence. The applications include semi/self-supervised learning, robust learning, reinforcement learning, meta-learning, ensemble learning, etc. He published >80 papers and is a recipient of the Best Student Paper Award at ICDM 2013 and the 2020 IEEE Computer Society TCSC Most Influential Paper Award.