Workshop: First Workshop on Quantum Tensor Networks in Machine Learning

Xiao-Yang Liu, Qibin Zhao, Jacob Biamonte, Cesar F Caiafa, Paul Pu Liang, Nadav Cohen, Stefan Leichenauer

2020-12-11T08:00:00-08:00 - 2020-12-11T19:00:00-08:00
Abstract: Quantum tensor networks in machine learning (QTNML) are envisioned to have great potential to advance AI technologies. Quantum machine learning promises quantum advantages (potentially exponential speedups in training, quadratic speedup in convergence, etc.) over classical machine learning, while tensor networks provide powerful simulations of quantum machine learning algorithms on classical computers. As a rapidly growing interdisciplinary area, QTNML may serve as an amplifier for computational intelligence, a transformer for machine learning innovations, and a propeller for AI industrialization.

Tensor networks, contracted networks of factor tensors, have arisen independently in several areas of science and engineering. Such networks appear in the description of physical processes, and an accompanying collection of numerical techniques has elevated quantum tensor networks into a variational model of machine learning. Underlying these algorithms is the compression of high-dimensional data needed to represent quantum states of matter. These compression techniques have recently proven ripe for application to many traditional problems in deep learning. Quantum tensor networks have shown significant power in compactly representing deep neural networks and in enabling their efficient training and theoretical understanding. Further QTNML technologies are rapidly emerging, such as approximating probability functions and probabilistic graphical models. However, QTNML is a relatively young topic, and many open problems remain to be explored.
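
As a concrete, minimal illustration of the compression idea mentioned above (an illustrative sketch, not taken from any workshop contribution), the following NumPy code uses the standard TT-SVD procedure to compress a low-rank weight matrix, reshaped into a 4-way tensor, into tensor-train (MPS) cores; all names and dimensions are arbitrary.

```python
import numpy as np

def tt_svd(tensor, max_rank):
    """Compress a d-way tensor into tensor-train (MPS) cores via sequential truncated SVDs."""
    dims = tensor.shape
    cores, rank = [], 1
    mat = tensor.reshape(rank * dims[0], -1)
    for k in range(len(dims) - 1):
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        r_new = min(max_rank, len(s))
        cores.append(u[:, :r_new].reshape(rank, dims[k], r_new))
        mat = (np.diag(s[:r_new]) @ vt[:r_new]).reshape(r_new * dims[k + 1], -1)
        rank = r_new
    cores.append(mat.reshape(rank, dims[-1], 1))
    return cores

def tt_to_full(cores):
    """Contract the cores back into the full tensor (to check the approximation)."""
    full = cores[0]
    for core in cores[1:]:
        full = np.tensordot(full, core, axes=([-1], [0]))
    return full.squeeze(axis=(0, -1))

# Compress a low-rank 64 x 64 "weight matrix", viewed as an 8 x 8 x 8 x 8 tensor.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 16)) @ rng.standard_normal((16, 64))
cores = tt_svd(W.reshape(8, 8, 8, 8), max_rank=16)
W_tt = tt_to_full(cores).reshape(64, 64)
print([c.shape for c in cores])                      # compact TT cores
print(np.linalg.norm(W - W_tt) / np.linalg.norm(W))  # near-zero relative error
```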

Quantum algorithms are typically described by quantum circuits (quantum computational networks). These circuits are themselves a class of tensor networks, creating an evident interplay between classical tensor network contraction algorithms and the execution of tensor contractions on quantum processors. The modern field of quantum-enhanced machine learning has started to utilize several tools from tensor network theory to create new quantum models of machine learning and to better understand existing ones.
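
To make the circuit-as-tensor-network correspondence concrete, here is a minimal illustrative NumPy sketch (not from the workshop materials) that contracts the tensor network of a two-qubit Bell-state circuit, a Hadamard followed by a CNOT, in a single einsum call.

```python
import numpy as np

# Gate tensors: a single-qubit gate is a 2x2 tensor; a two-qubit gate is 2x2x2x2.
H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
CNOT = np.zeros((2, 2, 2, 2))
for c in (0, 1):
    for t in (0, 1):
        CNOT[c, t ^ c, c, t] = 1.0   # legs: (out control, out target, in control, in target)

# Initial product state |00> as two rank-1 tensors.
q0 = np.array([1.0, 0.0])
q1 = np.array([1.0, 0.0])

# Contract the whole network: psi[a, b] = CNOT[a, b, c, d] * H[c, e] * q0[e] * q1[d]
psi = np.einsum('abcd,ce,e,d->ab', CNOT, H, q0, q1)
print(psi)   # ~[[0.707, 0], [0, 0.707]], i.e. (|00> + |11>) / sqrt(2)
```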

The interplay between tensor networks, machine learning and quantum algorithms is rich. Indeed, this interplay is based not just on numerical methods but on the equivalence of tensor networks to various quantum circuits, rapidly developing algorithms from the mathematics and physics communities for optimizing and transforming tensor networks, and connections to low-rank methods for learning. A merger of tensor network algorithms with state-of-the-art approaches in deep learning is now taking place. A new community is forming, which this workshop aims to foster.

Schedule

2020-12-11T08:00:00-08:00 - 2020-12-11T08:05:00-08:00
Opening Remarks
Xiao-Yang Liu
A short introduction
2020-12-11T08:05:00-08:00 - 2020-12-11T08:37:00-08:00
Talk 1: Expressiveness in Deep Learning via Tensor Networks and Quantum Entanglement
Nadav Cohen
Understanding deep learning calls for addressing three fundamental questions: expressiveness, optimization and generalization. This talk will describe a series of works aimed at unraveling some of the mysteries behind expressiveness. I will begin by showing that state-of-the-art deep learning architectures, such as convolutional networks, can be represented as tensor networks --- a prominent computational model for quantum many-body simulations. This connection will inspire the use of quantum entanglement for defining measures of the data dependencies modeled by deep networks. Next, I will derive a quantum max-flow / min-cut theorem characterizing the entanglement captured by deep networks. The theorem will give rise to new results that shed light on expressiveness in deep learning and, in addition, provide new tools for deep network design. The works covered in the talk were done in collaboration with Yoav Levine, Or Sharir, Ronen Tamari, David Yakira and Amnon Shashua.
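For readers unfamiliar with the entanglement measures referenced in the talk, the following generic sketch (an illustration, not the speakers' code) computes the entanglement entropy of a tensor across a chosen bipartition of its axes from the singular-value spectrum of the corresponding matricization.

```python
import numpy as np

def entanglement_entropy(tensor, left_axes):
    """Entropy of the singular-value spectrum across a bipartition of a tensor's axes."""
    right_axes = [a for a in range(tensor.ndim) if a not in left_axes]
    left_dim = int(np.prod([tensor.shape[a] for a in left_axes]))
    mat = np.transpose(tensor, left_axes + right_axes).reshape(left_dim, -1)
    s = np.linalg.svd(mat, compute_uv=False)
    p = s**2 / np.sum(s**2)          # Schmidt probabilities
    p = p[p > 1e-12]
    return -np.sum(p * np.log(p))

# A product (rank-1) tensor carries no entanglement; a GHZ-like tensor carries log(2).
prod_state = np.einsum('i,j,k->ijk', *[np.array([1.0, 0.0])] * 3)
ghz = np.zeros((2, 2, 2)); ghz[0, 0, 0] = ghz[1, 1, 1] = 1 / np.sqrt(2)
print(entanglement_entropy(prod_state, [0]))   # ~0.0
print(entanglement_entropy(ghz, [0]))          # ~0.693 = log(2)
```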
2020-12-11T08:37:00-08:00 - 2020-12-11T08:45:00-08:00
Talk 1 Q&A
2020-12-11T08:45:00-08:00 - 2020-12-11T09:30:00-08:00
Talk 2: TBD (by Prof. Anima Anandkumar)
Animashree Anandkumar
TBD
2020-12-11T09:30:00-08:00 - 2020-12-11T10:15:00-08:00
Talk 3: Quantum in ML and ML in Quantum
Ivan Oseledets
In this talk, I will cover recent results in two areas: 1) using quantum-inspired methods in machine learning, including low-entanglement states (matrix product states / tensor-train decompositions) for regression and classification tasks; and 2) using machine learning methods for the efficient classical simulation of quantum systems. I will cover our results on simulating quantum circuits on parallel computers using graph-based algorithms, as well as efficient numerical methods for optimization with tensor trains for computing a large number of eigenstates (up to B=100) on GPUs. The code combines classical linear algebra algorithms, Riemannian optimization methods and an efficient software implementation in TensorFlow.
1. Rakhuba, M., Novikov, A. and Oseledets, I., 2019. Low-rank Riemannian eigensolver for high-dimensional Hamiltonians. Journal of Computational Physics, 396, pp. 718-737.
2. Schutski, R., Lykov, D. and Oseledets, I., 2020. Adaptive algorithm for quantum circuit simulation. Physical Review A, 101(4), 042335.
3. Khakhulin, T., Schutski, R. and Oseledets, I., 2019. Graph Convolutional Policy for Solving Tree Decomposition via Reinforcement Learning Heuristics. arXiv preprint arXiv:1910.08371.
2020-12-11T10:15:00-08:00 - 2020-12-11T10:25:00-08:00
Talk 3 Q&A
2020-12-11T10:25:00-08:00 - 2020-12-11T10:52:00-08:00
Talk 4: A Century of the Tensor Network Formulation from the Ising Model
Tomotoshi Nishino
A hundred years have passed since the Ising model was proposed by Lenz in 1920. One finds that the square lattice Ising model is already an example of a two-dimensional tensor network (TN), formed by contracting 4-leg tensors. In 1941, Kramers and Wannier assumed a variational state in the form of a matrix product state (MPS) and optimized it `numerically'. Baxter reached the concept of the corner transfer matrix (CTM) and performed a variational computation in 1968. Independently of these statistical studies, the MPS was introduced by Affleck, Lieb, Kennedy and Tasaki (AKLT) in 1987 for the study of one-dimensional quantum spin chains, by Derrida for asymmetric exclusion processes, and also (implicitly) through the establishment of the density matrix renormalization group (DMRG) by White in 1992. After a brief (?) introduction to these prehistories, I'll speak about my contributions to this area: the application of the DMRG and CTMRG methods to two-dimensional statistical models, including those on hyperbolic lattices, fractal systems, and random spin models. Analysis of the spin-glass state, which is related to learning processes, from the viewpoint of its entanglement structure would be a target of future studies in this direction.
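As a self-contained illustration of the statement that the square-lattice Ising model is itself a two-dimensional tensor network of 4-leg tensors, the sketch below (illustrative, not from the talk) builds the site tensor and contracts a tiny 2x2 torus, checking the partition function against a brute-force sum over spin configurations.

```python
import numpy as np
from itertools import product

beta = 0.4

# Bond Boltzmann weight W and a symmetric square root Q (so that Q @ Q = W).
W = np.array([[np.exp(beta), np.exp(-beta)],
              [np.exp(-beta), np.exp(beta)]])
evals, evecs = np.linalg.eigh(W)
Q = evecs @ np.diag(np.sqrt(evals)) @ evecs.T

# 4-leg site tensor: a spin copy-tensor dressed with half a bond weight on each leg.
T = np.einsum('si,sj,sk,sl->ijkl', Q, Q, Q, Q)   # legs: (up, right, down, left)

# Partition function of a 2x2 torus (each neighbouring pair shares two bonds
# because of the periodic wrap-around); bond labels a..h.
Z_tn = np.einsum('faeb,hbga,ecfd,gdhc->', T, T, T, T)

# Brute-force check over the 2^4 spin configurations of the same lattice.
Z_bf = 0.0
for s00, s01, s10, s11 in product([-1, 1], repeat=4):
    energy = 2 * (s00 * s01 + s10 * s11 + s00 * s10 + s01 * s11)
    Z_bf += np.exp(beta * energy)

print(Z_tn, Z_bf)   # the two values agree
```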
2020-12-11T10:50:00-08:00 - 2020-12-11T11:30:00-08:00
Talk 5: Getting Started with Tensor Networks
Glen Evenbly
I will provide an overview of the tensor network formalism and its applications, and discuss the key operations, such as tensor contractions, required for building tensor network algorithms. I will also demonstrate the TensorTrace graphical interface, a software tool designed to allow users to implement tensor network routines easily and effectively. Finally, the utility of tensor networks for tasks in machine learning will be briefly discussed.
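As a primer on the key operation mentioned here (illustrative NumPy only; TensorTrace itself is a separate graphical tool), the lines below show one pairwise tensor contraction written three equivalent ways: tensordot, einsum, and the underlying permute-reshape-matmul.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 5, 6))
B = rng.standard_normal((6, 5, 7))

# C[a, d] = sum_{b, c} A[a, b, c] * B[c, b, d]
C1 = np.tensordot(A, B, axes=([1, 2], [1, 0]))
C2 = np.einsum('abc,cbd->ad', A, B)
# The same contraction as a permute-reshape-matmul, which is how it is typically executed:
C3 = A.reshape(4, 5 * 6) @ B.transpose(1, 0, 2).reshape(5 * 6, 7)
print(np.allclose(C1, C2) and np.allclose(C1, C3))   # True
```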
2020-12-11T11:30:00-08:00 - 2020-12-11T12:10:00-08:00
Talk 6: Tensor Network Models for Structured Data
Guillaume Rabusseau
In this talk, I will present uniform tensor network models (also known as translation-invariant tensor networks), which are particularly suited for modelling structured data such as sequences and trees. Uniform tensor networks are tensor networks in which the core tensors appearing in the decomposition of a given tensor are all equal, which can be seen as a weight-sharing mechanism. In the first part of the talk, I will show how uniform tensor networks are particularly suited to represent functions defined over sets of structured objects such as sequences and trees. I will then present how these models relate to classical computational models such as hidden Markov models, weighted automata, second-order recurrent neural networks and context-free grammars. In the second part of the talk, I will present a classical learning algorithm for weighted automata and show how it can be interpreted as a means to convert non-uniform tensor networks into uniform ones. Lastly, I will present ongoing work leveraging the tensor network formalism to design efficient and versatile probabilistic models for sequence data.
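A minimal sketch of the uniform (translation-invariant) model described here, written as a weighted automaton / uniform MPS with a single shared core; all dimensions and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 3, 4   # alphabet size, bond dimension (number of automaton states)

# Uniform MPS over sequences: the same core A is reused at every position
# (weight sharing), plus boundary vectors alpha and beta.
alpha = rng.standard_normal(k)
A = rng.standard_normal((d, k, k))   # A[x] is the transition matrix for symbol x
beta = rng.standard_normal(k)

def f(sequence):
    """f(x_1 ... x_n) = alpha^T A[x_1] ... A[x_n] beta."""
    v = alpha
    for x in sequence:
        v = v @ A[x]
    return v @ beta

print(f([0, 2, 1, 1]))   # value assigned to one example sequence
```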
2020-12-11T12:10:00-08:00 - 2020-12-11T13:10:00-08:00
Panel Discussion 1: Theoretical, Algorithmic and Physical
Jacob Biamonte, Xiao-Yang Liu, Nadav Cohen, Martin Ganahl, Glen Evenbly, Ivan Oseledets, Paul Springer
Theoretical, algorithmic and physical discussions of quantum tensor networks in machine learning.
2020-12-11T12:10:00-08:00 - None
Panel Discussion 2: Software and High Performance Implementation
Software and High Performance Implementation: Quantum Tensor Networks in Machine Learning.
2020-12-11T13:10:00-08:00 - 2020-12-11T13:50:00-08:00
Talk 7: cuTENSOR: High-Performance CUDA Tensor Primitives
Paul Springer
We'll discuss cuTENSOR, a high-performance CUDA library for tensor operations that efficiently handles the ubiquitous presence of high-dimensional arrays (i.e., tensors) in today's HPC and DL workloads. This library supports highly efficient tensor operations such as tensor contractions (a generalization of matrix-matrix multiplications), point-wise tensor operations such as tensor permutations, and tensor decompositions (a generalization of matrix decompositions). While providing high performance, cuTENSOR also allows users to express their mathematical equations for tensors in a straightforward way that hides the complexity of dealing with these high-dimensional objects behind an easy-to-use API.
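cuTENSOR itself is a C/CUDA library; as a rough illustration of the kind of contraction it accelerates, and of one way to reach it from Python, here is a hedged sketch using CuPy, which can dispatch einsum-style contractions to cuTENSOR when built with it. The CUPY_ACCELERATORS setting and dispatch behaviour are assumptions based on CuPy's documentation, not part of the talk.

```python
# Assumes a CUDA GPU, CuPy installed with cuTENSOR support, and
# CUPY_ACCELERATORS=cutensor set in the environment (assumption, see above).
import cupy as cp

A = cp.random.standard_normal((64, 32, 48))
B = cp.random.standard_normal((48, 32, 80))

# A tensor contraction generalizing matrix-matrix multiplication:
# C[m, n] = sum_{u, k} A[m, u, k] * B[k, u, n]
C = cp.einsum('muk,kun->mn', A, B)
cp.cuda.Stream.null.synchronize()   # wait for the GPU kernel to complete
print(C.shape)                      # (64, 80)
```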
2020-12-11T13:50:00-08:00 - 2020-12-11T14:30:00-08:00
Talk 8: TensorNetwork: A Python Package for Tensor Network Computations
Martin Ganahl
TensorNetwork is an open-source Python package for tensor network computations. It has been designed to help researchers and engineers rapidly develop highly efficient tensor network algorithms for physics and machine learning applications. After a brief introduction to tensor networks, I will discuss some of the main design principles of the TensorNetwork package and show how one can use it to speed up tensor network algorithms by running them on accelerated hardware or by exploiting tensor sparsity.
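A minimal usage sketch of the TensorNetwork package's node-and-edge API, based on its public documentation; exact signatures and defaults may vary between versions.

```python
import numpy as np
import tensornetwork as tn

a = tn.Node(np.random.standard_normal((10, 20)), name="a")
b = tn.Node(np.random.standard_normal((20, 30)), name="b")
edge = a[1] ^ b[0]        # connect a's second leg to b's first leg
c = tn.contract(edge)     # contract the shared edge
print(c.tensor.shape)     # (10, 30)
```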
2020-12-11T14:30:00-08:00 - 2020-12-11T15:10:00-08:00
Talk 9: Tensor Methods for Efficient and Interpretable Spatiotemporal Learning
Rose Yu
Multivariate spatiotemporal data is ubiquitous in science and engineering, from climate science to sports analytics, to neuroscience. Such data contain higher-order correlations and can be represented as a tensor. Tensor latent factor models provide a powerful tool for reducing dimensionality and discovering higher-order structures. However, existing tensor models are often slow or fail to yield interpretable latent factors. In this talk, I will demonstrate advances in tensor methods to generate interpretable latent factors for high-dimensional spatiotemporal data. We provide theoretical guarantees and demonstrate their applications to real-world climate, basketball, and neuroscience data.
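As background for the tensor latent factor models mentioned here, the following is a bare-bones CP (PARAFAC) alternating-least-squares routine in NumPy, an illustration of the general technique rather than of the methods presented in the talk.

```python
import numpy as np

def unfold(X, mode):
    """Mode-n matricization (remaining modes kept in original order, C layout)."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def khatri_rao(U, V):
    """Column-wise Kronecker product of two factor matrices."""
    return np.einsum('ir,jr->ijr', U, V).reshape(-1, U.shape[1])

def cp_als(X, rank, n_iter=100, seed=0):
    """Rank-R CP (PARAFAC) factorization of a 3-way tensor by alternating least squares."""
    rng = np.random.default_rng(seed)
    A, B, C = (rng.standard_normal((n, rank)) for n in X.shape)
    for _ in range(n_iter):
        A = unfold(X, 0) @ khatri_rao(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = unfold(X, 1) @ khatri_rao(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = unfold(X, 2) @ khatri_rao(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C

# Synthetic check: an exactly rank-3 tensor is typically recovered to small error.
rng = np.random.default_rng(1)
A0, B0, C0 = (rng.standard_normal((n, 3)) for n in (10, 12, 14))
X = np.einsum('ir,jr,kr->ijk', A0, B0, C0)
A, B, C = cp_als(X, rank=3)
print(np.linalg.norm(X - np.einsum('ir,jr,kr->ijk', A, B, C)) / np.linalg.norm(X))
```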
2020-12-11T15:10:00-08:00 - 2020-12-11T15:50:00-08:00
Talk 10: Tensor Networks as a Data Structure in Probabilistic Modeling and for Learning Dynamical Laws from Data
Jens Eisert
Recent years have seen significant interest in exploiting tensor networks in the context of machine learning, both as a tool for the formulation of new learning algorithms and for enhancing the mathematical understanding of existing methods. In this talk, we will explore two readings of such a connection. On the one hand, we will consider the task of identifying the underlying non-linear governing equations of a system, required both for obtaining an understanding and for making future predictions. We will see that this problem can be addressed in a scalable way by making use of tensor-network-based parameterizations for the governing equations. On the other hand, we will investigate the expressive power of tensor networks in probabilistic modelling. Inspired by the connection between tensor networks and machine learning, and the natural correspondence between tensor networks and probabilistic graphical models, we will provide a rigorous analysis of the expressive power of various tensor-network factorizations of discrete multivariate probability distributions. Joint work with A. Goeßmann, M. Götte, I. Roth, R. Sweke, G. Kutyniok, I. Glasser, N. Pancotti, and J. I. Cirac.
2020-12-11T15:50:00-08:00 - 2020-12-11T16:30:00-08:00
Talk 11: Tensor networks and counting problems on the lattice
Frank Verstraete
An overview will be given of counting problems on the lattice, such as the calculation of the hard square constant and of the residual entropy of ice. Unlike Monte Carlo techniques which have difficulty in calculating such quantities, we will demonstrate that tensor networks provide a natural framework for tackling these problems. We will also show that tensor networks reveal nonlocal hidden symmetries in those systems, and that the typical critical behaviour is witnessed by matrix product operators which form representations of tensor fusion categories.
2020-12-11T16:30:00-08:00 - 2020-12-11T16:42:00-08:00
Contributed Talk 1: Paper 3: Tensor network approaches for data-driven identification of non-linear dynamical laws
Alex Goeßmann
To date, scalable methods for data-driven identification of non-linear governing equations do not exploit or offer insight into fundamental underlying physical structure. In this work, we show that various physical constraints can be captured via tensor network based parameterizations for the governing equation, which naturally ensures scalability. In addition to providing analytic results motivating the use of such models for realistic physical systems, we demonstrate that efficient rank-adaptive optimization algorithms can be used to learn optimal tensor network models without requiring a priori knowledge of the exact tensor ranks.
2020-12-11T16:42:00-08:00 - 2020-12-11T16:54:00-08:00
Contributed Talk 2: Paper 6: Anomaly Detections with Tensor Networks
Jensen Wang
Originating from condensed matter physics, tensor networks are compact representations of high-dimensional tensors. In this paper, the prowess of tensor networks is demonstrated on the particular task of one-class anomaly detection. We exploit the memory and computational efficiency of tensor networks to learn a linear transformation over a space with dimension exponential in the number of original features. The linearity of our model enables us to ensure a tight fit around training instances by penalizing the model's global tendency to predict normality via its Frobenius norm---a task that is infeasible for most deep learning models. Our method outperforms deep and classical algorithms on tabular datasets and produces competitive results on image datasets, despite not exploiting the locality of images.
2020-12-11T16:54:00-08:00 - 2020-12-11T17:06:00-08:00
Contributed Talk 3: Paper 19: Deep convolutional tensor network
Philip Blagoveschensky
Neural networks have achieved state-of-the-art results in many areas, supposedly due to parameter sharing, locality, and depth. Tensor networks (TNs) are linear algebraic representations of quantum many-body states based on their entanglement structure, and they have found use in machine learning. We devise a novel TN-based model called Deep Convolutional Tensor Network (DCTN) for image classification, which has parameter sharing, locality, and depth. It is based on the entangled plaquette states (EPS) TN. We show how EPS can be implemented as a backpropagatable layer. We test DCTN on the MNIST, FashionMNIST, and CIFAR10 datasets. A shallow DCTN performs well on MNIST and FashionMNIST and has a small parameter count. Unfortunately, depth increases overfitting and thus decreases test accuracy. Moreover, a DCTN of any depth performs badly on CIFAR10 due to overfitting; the reason remains to be determined. We discuss how the hyperparameters of DCTN affect its training and overfitting.
2020-12-11T17:06:00-08:00 - 2020-12-11T17:18:00-08:00
Contributed Talk 4: Paper 27: Limitations of gradient-based Born Machine over tensor networks on learning quantum nonlocality
Khadijeh Najafi
Nonlocality is an important constituent of quantum physics, lying at the heart of many striking features of quantum states such as entanglement. An important category of highly entangled quantum states are the Greenberger-Horne-Zeilinger (GHZ) states, which play key roles in various quantum-based technologies and are of particular interest for benchmarking noisy quantum hardware. The Born machine, a quantum-inspired generative model that leverages the probabilistic nature of quantum physics, has shown great success in learning classical and quantum data over tensor network (TN) architectures. To this end, we investigate the task of training a Born machine to learn the GHZ state over two different tensor network architectures. Our results indicate that gradient-based training schemes over the TN Born machine fail to learn the non-local information of the coherent superposition (or parity) of the GHZ state. This raises the important question of what kind of architecture design, initialization and optimization schemes would be more suitable for learning the non-local information hidden in a quantum state, and whether we can adapt quantum-inspired training algorithms to learn such quantum states.
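For context, the GHZ state targeted in this work has an exact matrix-product (tensor network) representation with bond dimension 2, and a Born machine assigns probabilities |psi(x)|^2. The sketch below (illustrative, not the paper's code) writes down that MPS and enumerates its nonzero amplitudes.

```python
import numpy as np
from itertools import product

n = 4   # number of qubits

# GHZ state as a bond-dimension-2 matrix product state:
# amplitude(x_1 ... x_n) = l^T A[x_1] ... A[x_n] r / sqrt(2)
A = np.zeros((2, 2, 2))
A[0] = np.diag([1.0, 0.0])   # core for physical value 0
A[1] = np.diag([0.0, 1.0])   # core for physical value 1
l = np.array([1.0, 1.0])
r = np.array([1.0, 1.0])

def amplitude(bits):
    v = l
    for x in bits:
        v = v @ A[x]
    return (v @ r) / np.sqrt(2)

# A Born machine over this MPS would assign P(x) = |amplitude(x)|**2.
for bits in product([0, 1], repeat=n):
    a = amplitude(bits)
    if abs(a) > 1e-12:
        print(bits, a)   # only the all-0 and all-1 strings, each with amplitude 1/sqrt(2)
```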
2020-12-11T17:18:00-08:00 - 2020-12-11T17:30:00-08:00
Contributed Talk 5: Paper 32: High-order Learning Model via Fractional Tensor Network Decomposition
Chao Li
We consider high-order learning models in which the weight tensor is represented by a (symmetric) tensor network (TN) decomposition. Although such models have been widely used on various tasks, it is challenging to determine the optimal order in complex systems (e.g., deep neural networks). To tackle this issue, we introduce a new notion of fractional tensor network (FrTN) decomposition, which generalizes conventional TN models of integer order by allowing the order to be an arbitrary fraction. Due to the density of fractions in the field of real numbers, the order of the model can be formulated as a learnable parameter and simply optimized by stochastic gradient descent (SGD) and its variants. Moreover, we uncover that FrTN is strongly connected to well-known methods such as ℓp-pooling (Gulcehre et al., 2014) and "squeeze-and-excitation" (Hu et al., 2018) operations in deep learning. On the numerical side, we apply the proposed model to enhance the classic ResNet-26/50 (He et al., 2016) and MobileNet-v2 (Sandler et al., 2018) on both the CIFAR-10 and ILSVRC-12 classification tasks, and the results demonstrate the effectiveness brought by the learnable order parameters in FrTN.
2020-12-11T17:30:00-08:00 - 2020-12-11T18:10:00-08:00
Talk 12: Learning Quantum Channels with Tensor Networks
Giacomo Torlai
We present a new approach to quantum process tomography, the reconstruction of an unknown quantum channel from measurement data. Specifically, we combine a tensor-network representation of the Choi matrix (a complete description of a quantum channel) with unsupervised machine learning of single-shot projective measurement data. We show numerical experiments for both unitary and noisy quantum circuits, for a number of qubits well beyond the reach of standard process tomography techniques.
2020-12-11T18:10:00-08:00 - 2020-12-11T18:50:00-08:00
Talk 13: High Performance Computation for Tensor Networks Learning
Anwar Walid, Xiao-Yang Liu
In this talk, we study high-performance computation for tensor networks to address time and space complexities that grow rapidly with tensor size. We propose efficient primitives that exploit parallelism in tensor learning for efficient implementation on GPUs.
2020-12-11T18:50:00-08:00 - 2020-12-11T19:00:00-08:00
Closing Remarks
Xiao-Yang Liu
TBD