Poster
Tanimoto Random Features for Scalable Molecular Machine Learning
Austin Tripp · Sergio Bacallado · Sukriti Singh · José Miguel Hernández-Lobato

Wed Dec 13 08:45 AM -- 10:45 AM (PST) @ Great Hall & Hall B1+B2 #1217
Event URL: https://github.com/AustinT/tanimoto-random-features-neurips23

The Tanimoto coefficient is commonly used to measure the similarity between molecules represented as discrete fingerprints, either as a distance metric or a positive definite kernel. While many kernel methods can be accelerated using random feature approximations, no such approximations currently exist for the Tanimoto kernel. In this paper, we propose two novel kinds of random features that allow this kernel to scale to large datasets, and in the process discover a novel extension of the kernel to real-valued vectors. We theoretically characterize these random features and provide error bounds on the spectral norm of the Gram matrix. Experimentally, we show that these random features are effective at approximating the Tanimoto coefficient on real-world datasets and are useful for molecular property prediction and optimization tasks. Future updates to this work will be available at http://arxiv.org/abs/2306.14809.
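
For readers unfamiliar with the quantity being approximated, here is a minimal sketch of the exact Tanimoto coefficient itself. It does not show the random features proposed in the paper. The function names `tanimoto_binary` and `tanimoto_minmax`, the toy fingerprints, and the convention of returning 1.0 for two all-zero vectors are assumptions made for this illustration. The min-max form is a commonly used generalization to non-negative count vectors and is not necessarily the real-valued extension the abstract refers to.

```python
import numpy as np


def tanimoto_binary(x, y):
    """Tanimoto (Jaccard) coefficient of two binary fingerprints: |x AND y| / |x OR y|."""
    x = np.asarray(x, dtype=bool)
    y = np.asarray(y, dtype=bool)
    union = np.logical_or(x, y).sum()
    if union == 0:
        # Assumed convention for this sketch: two empty fingerprints are identical.
        return 1.0
    return float(np.logical_and(x, y).sum() / union)


def tanimoto_minmax(x, y):
    """Min-max form for non-negative vectors: sum(min(x, y)) / sum(max(x, y))."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    denom = np.maximum(x, y).sum()
    if denom == 0:
        # Same assumed convention as above for all-zero inputs.
        return 1.0
    return float(np.minimum(x, y).sum() / denom)


# Hypothetical 6-bit fingerprints, for illustration only.
fp_a = [1, 1, 0, 1, 0, 0]
fp_b = [1, 0, 0, 1, 1, 0]

# 2 shared bits out of 4 bits set in either fingerprint.
similarity = tanimoto_binary(fp_a, fp_b)
print(similarity)
```

On binary inputs the two functions agree, since the elementwise minimum and maximum reduce to logical AND and OR. Computing the exact coefficient for every pair in a dataset of n molecules costs O(n^2) evaluations, which is the scaling bottleneck that random feature approximations are meant to address.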

Author Information

Austin Tripp (University of Cambridge)
Sergio Bacallado (University of Cambridge)
Sukriti Singh (University of Cambridge)
José Miguel Hernández-Lobato (University of Cambridge)
