Fig. 1. The TranSPHIRE pipeline and the SPHIRE backend.
a Upper register (solid line): Overview of the integrated TranSPHIRE pipeline and all automated processing steps. The pipeline includes file management tasks, i.e., parallelized data transfer, file compression, and file backup (gray); 2D processing, i.e., motion correction, CTF estimation, particle picking, 2D clustering, and 2D class selection (turquoise); and 3D processing, i.e., ab initio 3D reconstruction and 3D refinement (red). Additionally, the pipeline includes an automated feedback loop that adapts particle picking to the current data set during runtime (purple). Lower register (dotted line): The SPHIRE software package forms the backend for TranSPHIRE and provides the tools used for 2D and 3D processing. SPHIRE also includes tools for advanced processing, such as heterogeneity analysis and local resolution determination. b The TranSPHIRE feedback loop. Gray arrows indicate the flow of data processing. Purple arrows indicate the flow of the feedback loop. Left (input): Micrographs are initially picked using the crYOLO general model. Center (processing): Particles are picked and extracted. Once a pre-defined number of particles has been accumulated, the pipeline performs 2D classification; the resulting 2D class averages are labeled as either “good” or “bad” by Cinderella. The class labels and crYOLO box files are then used to re-train crYOLO and adapt its internal model to the processed data. In the next feedback round, this updated model is used to re-pick the data. Right (output): After five feedback rounds, the complete data set is picked with the final optimized picking model and 2D classified in batches. For every batch, a particle stack of “good” particles is created and made available for 3D processing.
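The control flow of the feedback loop in panel b can be sketched as follows. This is a minimal illustration only, not the TranSPHIRE implementation: the function `run_feedback_loop` and the callables `pick`, `classify_2d`, `is_good`, and `retrain` are hypothetical placeholders standing in for crYOLO picking, 2D classification, Cinderella labeling, and crYOLO re-training, and the particle threshold `min_particles` is an assumed parameter. Only the default of five rounds is taken from the caption.

```python
from typing import Any, Callable, List, Sequence, Tuple


def run_feedback_loop(
    micrographs: Sequence[Any],
    general_model: Any,
    pick: Callable[[Any, Any], List[Any]],       # hypothetical: (model, micrograph) -> particles
    classify_2d: Callable[[List[Any]], List[Any]],  # hypothetical: particles -> class averages
    is_good: Callable[[Any], bool],              # hypothetical: class average -> "good"?
    retrain: Callable[[Any, List[Any]], Any],    # hypothetical: (model, good classes) -> new model
    min_particles: int,                          # assumed "pre-defined number" of particles
    n_rounds: int = 5,                           # five feedback rounds, as in the caption
) -> Tuple[Any, int]:
    """Return the adapted picking model and the number of completed rounds."""
    model = general_model
    rounds_done = 0
    for _ in range(n_rounds):
        # Pick with the current model until enough particles are accumulated.
        particles: List[Any] = []
        for mic in micrographs:
            particles.extend(pick(model, mic))
            if len(particles) >= min_particles:
                break
        if len(particles) < min_particles:
            break  # not enough data to classify; stop early
        # Classify, keep the classes labeled "good", and re-train on them.
        classes = classify_2d(particles)
        good_classes = [c for c in classes if is_good(c)]
        model = retrain(model, good_classes)
        rounds_done += 1
    return model, rounds_done
```

The model returned after the last round corresponds to the final optimized picking model that is then applied to the complete data set; the batch-wise 2D classification of the output stage is left out of the sketch.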