Title
Selecting XFEL single-particle snapshots by geometric machine learning
Date Issued
01 January 2021
Access level
open access
Resource Type
journal article
Author(s)
Hosseinizadeh A.
Mashayekhi G.
Fung R.
Ourmazd A.
Schwander P.
Department of Physics, University of Wisconsin-Milwaukee, 3135 N. Maryland Ave, Milwaukee, 53211, WI, United States
Publisher(s)
American Crystallographic Association
Abstract
A promising new route for structural biology is single-particle imaging with an X-ray Free-Electron Laser (XFEL). This method has the advantage that the samples do not require crystallization and can be examined at room temperature. However, high-resolution structures can only be obtained from a sufficiently large number of diffraction patterns of individual molecules, so-called single particles. Here, we present a method that allows for efficient identification of single particles in very large XFEL datasets, operates at low signal levels, and is tolerant to background. This method uses supervised Geometric Machine Learning (GML) to extract low-dimensional feature vectors from a training dataset, fuse test datasets into the feature space of training datasets, and separate the data into binary distributions of "single particles" and "non-single particles." As a proof of principle, we tested simulated and experimental datasets of the Coliphage PR772 virus. We created a training dataset and classified three types of test datasets: First, a noise-free simulated test dataset, which gave near-perfect separation. Second, simulated test datasets that were modified to reflect different levels of photon counts and background noise. These modified datasets were used to quantify the predictive limits of our approach. Third, an experimental dataset collected at the Stanford Linear Accelerator Center. The single-particle identification for this experimental dataset was compared with previously published results and it was found that GML covers a wide photon-count range, outperforming other single-particle identification methods. Moreover, a major advantage of GML is its ability to retrieve single particles in the presence of structural variability.
Volume
8
Issue
1
Language
English
OECD Knowledge area
Particle physics, Fields of physics; Engineering, Technology
Scopus EID
2-s2.0-85101230143
Source
Structural Dynamics
Sponsor(s)
We acknowledge valuable discussions with I. Poudyal and M. Schmidt. The development of underlying techniques was supported by the U.S. Department of Energy, Office of Science, Basic Energy Sciences under Award No. DE-SC0002164 (underlying dynamical techniques), by the U.S. National Science Foundation under Award Nos. STC 1231306 (underlying data analytical techniques) and DBI-2029533 (underlying analytical models), and by the UWM Research Growth Initiative.
Sources of information: Directorio de Producción Científica Scopus