Lawrence Berkeley National Laboratory Recent Work

Title: Euclidean Symmetry and Equivariance in Machine Learning
Permalink: https://escholarship.org/uc/item/7q29x3qh
Journal: Trends in Chemistry, 3(2)
ISSN: 2589-5974
Author: Smidt, TE
Publication Date: 2021-02-01
DOI: 10.1016/j.trechm.2020.10.006
Peer reviewed
Trends in Chemistry
Special Issue: Machine Learning for Molecules and Materials
Forum

Euclidean Symmetry and Equivariance in Machine Learning

Tess E. Smidt 1,2,*

Understanding the role of symmetry in the physical sciences is critical for choosing an appropriate machine-learning method. While invariant models are the most prevalent symmetry-aware models, equivariant models such as Euclidean neural networks more faithfully represent physical interactions and are ready to take on challenges across the physical sciences.

Glossary

3D Euclidean symmetry: the symmetry group of 3D space; includes 3D translations, rotations, and inversion.
3D point groups: subgroups of 3D Euclidean symmetry; the symmetry groups of molecules; includes rotations, mirrors, and inversion.
3D space groups: subgroups of 3D Euclidean symmetry; the symmetry groups of crystals; includes rotations, mirrors, glides, screws, inversion, and discrete translations.
Equivariant: changes deterministically under the operation of group elements.
Equivariant model: a machine learning model that can handle equivariant quantities as input, intermediate data, and output and uses equivariant (i.e., tensor) operations.
Equivariant operations: operations on equivariant quantities (i.e., geometric tensors), for example, dot products, cross products, and tensor products; includes invariant operations.
Euclidean neural networks: neural networks that are equivariant to 3D Euclidean symmetry.
Geometric tensors: equivariant properties of 3D physical systems; includes invariant scalars.
Invariant: does not change under the operation of group elements.
Neural networks: a subset of machine learning methods in which models must be differentiable; model weights are updated using gradients of the error; also known as deep learning.

Euclidean Symmetry Is the Freedom to Choose a Coordinate System

There is no inherent way to orient physical systems, yet we still need to choose a coordinate system to articulate their geometry and physical properties (Figure 1A). Unless coded otherwise, machine-learned models make no assumption of symmetry and are sensitive to an arbitrary choice of coordinate system. To be able to recognize a 3D pattern in any orientation, such a model will need to be shown roughly 500 rotated versions of the same example, a process called data augmentation. One of the motivations for explicitly treating symmetry in machine-learning models is to eliminate the need for data augmentation, so that the model can instead focus on, for example, learning quantum mechanics.

In 3D space, we can transform between coordinate systems using elements of Euclidean symmetry [3D rotations, 3D translations, and inversion (x, y, z) → (−x, −y, −z), which includes mirror symmetry]; hence, we say that 3D space has Euclidean symmetry.

One useful way of categorizing machine learning models applied in the physical sciences is by whether they use symmetry and, if so, where they use invariant (see Glossary) versus equivariant operations. Between the two types of symmetry-aware models, invariant models eliminate coordinate systems by dealing only with quantities that are invariant to the choice of coordinate system (scalars), while equivariant models preserve how quantities predictably change under coordinate transformations.

Invariance Is Easier to Deal with Than Equivariance

There is good reason for the popularity of invariant scalar features for machine learning: scalars can be given to any machine learning algorithm without violating symmetry. More practically, scalars are easier to handle than geometric tensors, and invariant models are high (if not top) performers on many existing benchmarks [1].

It is common practice to use equivariant operations to generate invariant features for machine learning models. For example, SOAP descriptors [2] are equivariant functions that operate on the geometry and atom types of local atomic environments to produce invariant scalars (Figure 2A). Geometry in invariant models is reduced to bond lengths, bond angles, dihedral angles, and other scalar invariants of geometric elements; Figure 1B describes the invariant versus equivariant properties of 3D vectors.
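The contrast between coordinate-dependent inputs and invariant features can be sketched in a few lines of NumPy. This is an illustration of mine, not a SOAP implementation; the point coordinates and rotation angle are arbitrary placeholders.

```python
import numpy as np

def rotation_z(theta):
    """Rotation matrix about the z axis (an element of Euclidean symmetry)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def pairwise_distances(points):
    """Invariant scalar features: all interatomic distances."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

points = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [0.0, 1.1, 0.0]])
rotated = points @ rotation_z(0.7).T  # same system, different coordinate frame

# Raw coordinates depend on the arbitrary choice of frame...
assert not np.allclose(points, rotated)
# ...while invariant features do not.
assert np.allclose(pairwise_distances(points), pairwise_distances(rotated))
```

A model fed raw coordinates sees the two arrays as different examples (hence data augmentation); a model fed the distance matrix sees them as identical.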
Invariant models can yield equivariant quantities, but only through gradients of the equivariant operations used in featurization [3]; this may not be practical or possible depending on the quantity of interest.

When Invariance Is Not Enough

Physical systems and their interactions are inherently equivariant. With an invariant model, a featurization method must be used to represent a naturally equivariant physical system in terms of invariant features. With an equivariant model, a physical system can be articulated using the same means used by many physical simulations: geometric coordinates plus any relevant quantities needed to describe the system (e.g., external fields, or atom-wise properties such as velocities). Even if the desired target is a scalar, the interactions that yield that scalar may not be scalar in nature. For example, the only ways to interact a particle with charge q and velocity v with an external magnetic field B are the equivariant cross product, F = q v × B, and the difference between the momentum p and the charge-weighted vector potential qA, H = |p − qA|² / 2m. To predict quantities that are fundamentally generated from equivariant interactions, either (i) the equivariant interactions must be included in the scalar featurization used for an invariant model (which requires knowing to include those interactions), or (ii) an equivariant model may be used, which may make more accurate or efficient predictions because it has more expressive operations [4]. The more exotic or complex a property, the more likely there are non-trivial geometric tensor interactions at play (e.g., multipole interactions, antisymmetric exchange).
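The equivariance of the cross product in the Lorentz force example above can be checked numerically. This is my own sketch: the velocity, field, and rotation are random placeholders, and the rotation is built by fixing the determinant of a QR factor to +1.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_rotation(rng):
    """A random proper rotation: orthogonalize a random matrix, fix det = +1."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1
    return q

q_charge = 1.6e-19       # particle charge (placeholder value)
v = rng.normal(size=3)   # velocity
B = rng.normal(size=3)   # magnetic field
R = random_rotation(rng)

F = q_charge * np.cross(v, B)                       # F = q v x B
F_rotated_inputs = q_charge * np.cross(R @ v, R @ B)

# Equivariance: rotating the inputs and then computing the force
# gives the same result as computing the force and then rotating it.
assert np.allclose(F_rotated_inputs, R @ F)
```

The same check fails for an arbitrary nonlinear function of v and B; it succeeds here because the cross product is an equivariant operation.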
Euclidean Neural Networks Are Equivariant Models

Neural networks are one of the most flexible machine learning methods, since the only requirement is that the network be differentiable. A neural network is a function f that takes in inputs x and trainable parameters W to produce outputs y, f(x, W) = y. Given pairs of inputs x and target outputs y_true, a neural network is trained by computing derivatives of the loss [e.g., L = (y − y_true)²] with respect to the trainable parameters W and updating W according to a learning rate η, W ← W − η ∂L/∂W.

Figure 1. Examples of Equivariant and Invariant Properties of Geometric Objects in 3D. (A) A system of molecules described by two different coordinate systems. (B) A 3D vector has a magnitude (orange), direction (pink), and location (purple), which are invariant (filled square) or equivariant (open square) under (table from top to bottom) 3D translations, rotations, and inversion. The magnitude is a scalar and thus invariant to all of these operations. (C) In Euclidean symmetry there are four 'vector-like' geometric tensors that transform distinctly under (table from top to bottom) inversion, perpendicular rotation by π, and parallel reflection: a vector, a pseudovector, a double-headed vector, and a helix [5].
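The training loop described above can be sketched for the simplest possible model, f(x, W) = W·x with a single scalar parameter. This is a toy of mine; the input, target, and learning rate are arbitrary placeholders.

```python
def train_step(W, x, y_true, lr):
    """One gradient-descent update for f(x, W) = W * x, L = (y - y_true)^2."""
    y = W * x                      # forward pass: f(x, W) = y
    grad = 2.0 * (y - y_true) * x  # dL/dW
    return W - lr * grad           # W <- W - eta * dL/dW

W = 0.0
for _ in range(200):
    W = train_step(W, x=2.0, y_true=6.0, lr=0.05)

# The parameter converges to W = 3, so that f(2, W) = 6 = y_true.
assert abs(W - 3.0) < 1e-6
```

Real networks apply the same update to millions of parameters via automatic differentiation; nothing in this recipe refers to coordinate systems, which is why symmetry must be imposed through the architecture.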
Figure 2. Visualizing Equivariant Properties as Spherical Harmonic Projections. (A) A simplified example of how invariant descriptors are derived from equivariant operations. The local atomic environments of the two carbons in an ethane molecule can be projected onto spherical harmonics (shown for L ≤ 6 with negative magnitudes set to zero). The coloring on the projection surface signifies the distance from the origin. The coefficients of the projection, organized in the visual by rows of equal degree L and columns of equal order m, form a geometric tensor in the irreducible basis (the same basis as the spherical harmonics). One method of computing scalars from these two geometric tensors is to compute the 'power spectrum', which is the tensor generalization of the dot product. This results in seven scalars (one for each L used in the projection). (B) A symmetric 3 × 3 matrix and its visualization as a spherical harmonic signal. The magnitude of the signal at a given angle is interpreted as a radial distance, and the projection is colored to signify the magnitude. For both (B) and (C), yellow indicates the maximum value and dark purple the minimum value, which in this case can be negative. (C) The output of three randomly initialized Euclidean neural networks when given a tetrahedron geometry (top) and an octahedron geometry (bottom) as input. The outputs have equal or higher symmetry than the inputs.
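The per-degree contraction behind the caption's 'power spectrum' can be sketched as follows. This is a simplified reading of mine: the coefficients are random placeholders rather than a real projection of the ethane environments, and the contraction shown is just a dot product over the orders m within each degree L.

```python
import numpy as np

rng = np.random.default_rng(1)
L_max = 6  # degrees L = 0..6, as in the Figure 2A projection

def random_coefficients(rng, L_max):
    """Placeholder coefficients: one array of 2L + 1 values per degree L."""
    return [rng.normal(size=2 * L + 1) for L in range(L_max + 1)]

def power_spectrum(c1, c2):
    """Contract over m within each degree L -> one scalar per L."""
    return np.array([np.dot(a, b) for a, b in zip(c1, c2)])

c1 = random_coefficients(rng, L_max)  # first carbon environment (placeholder)
c2 = random_coefficients(rng, L_max)  # second carbon environment (placeholder)
p = power_spectrum(c1, c2)

# Seven scalars, one for each L = 0..6, matching the caption.
assert p.shape == (7,)
```

Because both coefficient sets transform by the same orthogonal (Wigner) matrix within each degree under rotation, each per-L dot product is unchanged by rotation, which is what makes the result an invariant descriptor.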
First proposed in 2018, Euclidean neural networks (tensor field networks [6], Clebsch–Gordan nets [7], 3D steerable CNNs [8], and their descendants [9,10]) are a flexible, general framework for learning 3D Euclidean symmetry equivariant functions that can be trained in the context of a given dataset. To achieve equivariance in Euclidean neural networks, scalar multiplication is replaced by the more general tensor product, and convolutional kernels are restricted to be composed of spherical harmonics and learnable radial functions, W(r) = R(|r|) Y_lm(r̂). Scalar nonlinearities must also be replaced with equivariant equivalents.
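A minimal toy of this idea (mine, not e3nn itself): a layer whose output vectors are neighbor displacements weighted by a radial function of distance only is automatically rotation equivariant, because the displacement vectors play the role of the L = 1 spherical harmonic and the weight depends only on an invariant scalar. The equivariance can be verified directly.

```python
import numpy as np

def radial(r):
    """Placeholder radial function; in a real network this is learnable."""
    return np.exp(-r)

def equivariant_vector_layer(points):
    """Each point sums neighbor displacements weighted by radial(distance)."""
    diff = points[:, None, :] - points[None, :, :]   # displacement vectors
    dist = np.linalg.norm(diff, axis=-1)             # invariant distances
    w = np.where(dist > 0, radial(dist), 0.0)        # radial weights
    return (w[..., None] * diff).sum(axis=1)         # one vector per point

rng = np.random.default_rng(2)
points = rng.normal(size=(5, 3))

# Random proper rotation from a QR factor with det fixed to +1.
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
if np.linalg.det(q) < 0:
    q[:, 0] *= -1

out = equivariant_vector_layer(points)
out_rotated_input = equivariant_vector_layer(points @ q.T)

# Equivariance check: f(R x) == R f(x).
assert np.allclose(out_rotated_input, out @ q.T)
```

Euclidean neural networks generalize this pattern to all degrees L via tensor products, but the check shown, comparing f(R x) against R f(x), is how equivariance of any layer can be tested.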
Due to these mathematical complexities, Euclidean neural networks can be challenging to implement; however, open-source implementations exist (e.g., e3nn is an open-source PyTorch library that combines the implementations of [6,8] and implements a variety of additional equivariant layers and utility functions for converting and visualizing geometric tensors) (i).

Equivariance has also been encoded in other machine learning approaches, such as kernel methods, by using equivariant kernels and an equivariant definition of covariance [11]. Euclidean neural networks can learn equivariant functions that generate scalar invariants or equivariant kernels for use with these traditional machine learning methods.

Euclidean neural networks can be used to build end-to-end models for the prediction of physical properties (e.g., molecular dynamics forces) from atomic geometries and initial atomic features (e.g., atom types). They can also be used to manipulate atomic geometries and craft hierarchical features [12]. The only difference between models for these varied purposes is how the equivariant operations and learnable equivariant modules are composed.

Euclidean Neural Networks Naturally Handle Geometric Tensors, Point Groups, and Space Groups

Euclidean neural networks provide a mathematically rigorous framework for articulating scientific questions in terms of geometric tensors and their tensor interactions. Inputs, outputs, and intermediate data are completely specified by their transformation properties.

Geometric tensors take many forms and can represent many different quantities: numerical geometric tensors, atomic orbitals, or projections of geometry. Figures 1C and 2B give additional examples of this variety.

Another particularly useful aspect of handling Euclidean symmetry in full generality is that you get all subgroup symmetries (e.g., 3D point groups and 3D space groups) for free. Figure 2C shows an example of how the output of even randomly initialized Euclidean neural networks will have equal or higher symmetry than the input.

Since these networks intrinsically uphold any and all selection rules that occur in physical systems, they act as 'symmetry compilers' that check your thinking about the data types of your physical system. Using these models requires more forethought than a traditional neural network; in exchange, they cannot learn to do something that does not make sense symmetrically.

Beginning to Uncover Features of E(3) Equivariant Models

Euclidean symmetry is a simple assumption that has many unintuitive consequences: geometry, geometric tensors, normal modes, selection rules in spectroscopy, space groups, point groups, multipole interactions, second-order phase transitions, and so on. Likewise, the uses for Euclidean symmetry equivariant machine learning models may similarly surprise us in the as-yet-untapped modes of investigation they offer. For example, we recently found that the gradients of Euclidean neural networks can be used to find symmetry-implied 'missing' data unbeknown to the researcher (e.g., order parameters of second-order phase transitions [13]). By bounding learnable functions of physical data to be equivariant to 3D Euclidean symmetry, we can ask more targeted questions with machine learning and explore the full-ranging consequences of our fundamental assumption of symmetry.

Acknowledgments
T.E.S. thanks Lucy Morrell, Sinéad Griffin, Josh Rackers, and Risi Kondor for helpful discussions. T.E.S. was supported by the Laboratory Directed Research and Development Program of Lawrence Berkeley National Laboratory under US Department of Energy Contract No. DE-AC02-05CH11231.

Resources
(i) https://doi.org/10.5281/zenodo.3723557

1 Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
2 Center for Advanced Mathematics for Energy Research Applications (CAMERA), Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA

*Correspondence: tsmidt@lbl.gov (T.E. Smidt).

https://doi.org/10.1016/j.trechm.2020.10.006

© 2020 Elsevier Inc. All rights reserved.
References
1. Klicpera, J. et al. (2020) Directional message passing for molecular graphs. International Conference on Learning Representations
2. Bartók, A.P. et al. (2013) On representing chemical environments. Phys. Rev. B 87, 184115
3. Schütt, K. et al. (2017) SchNet: a continuous-filter convolutional neural network for modeling quantum interactions. Adv. Neural Inf. Proces. Syst. 31, 991–1001
4. Miller, B.K. et al. (2020) Relevance of rotationally equivariant convolutions for predicting molecular properties. arXiv. Published online August 22, 2020. https://arxiv.org/abs/2008.08461
5. Hlinka, J. (2014) Eight types of symmetrically distinct vectorlike physical quantities. Phys. Rev. Lett. 113, 165502
6. Thomas, N. et al. (2018) Tensor field networks: rotation- and translation-equivariant neural networks for 3D point clouds. arXiv. Published online May 18, 2018. https://arxiv.org/abs/1802.08219
7. Kondor, R. et al. (2018) Clebsch–Gordan nets: a fully Fourier space spherical convolutional neural network. Adv. Neural Inf. Proces. Syst. 32, 10117–10126
8. Weiler, M. et al. (2018) 3D steerable CNNs: learning rotationally equivariant features in volumetric data. Adv. Neural Inf. Proces. Syst. 32, 10402–10413
9. Anderson, B.M. et al. (2019) Cormorant: covariant molecular neural networks. Adv. Neural Inf. Proces. Syst. 32, 14510–14519
10. Fuchs, F.B. et al. (2020) SE(3)-transformers: 3D roto-translation equivariant attention networks. arXiv. Published online June 22, 2020. https://arxiv.org/abs/2006.10503
11. Grisafi, A. et al. (2018) Symmetry-adapted machine learning for tensorial properties of atomistic systems. Phys. Rev. Lett. 120, 036002
12. Eismann, S. et al. (2020) Hierarchical, rotation-equivariant neural networks to predict the structure of protein complexes. arXiv. Published online June 5, 2020. https://arxiv.org/abs/2006.09275
13. Smidt, T.E. et al. (2020) Finding symmetry breaking order parameters with Euclidean neural networks. arXiv. Published online July 4, 2020. https://arxiv.org/abs/2007.02005