Proposed Masked-LMCTrans (longitudinal multimodality coattentional convolutional neural network transformer) for reconstruction of 1% ultra-low-dose PET/MRI. (A) Framework of Masked-LMCTrans. The reference baseline PET (with the tumor area masked out, shown by the yellow mask) and MRI, along with the follow-up 1% PET/MRI scans, are fed into the model as combined inputs. The DenseNet feature encoder encodes PET and MRI separately before aggregation, using composite operations of batch normalization, rectified linear unit, and 3 × 3 convolution (BN-ReLU-Conv) with dense connectivity. The coattentional transformer block fuses information from the baseline and follow-up scans (feature maps colored orange and blue, respectively; the fused feature maps in later layers are shown in mixed colors). Fusion is performed by exchanging information between baseline and follow-up through queries, keys, and values (denoted Q, K, and V). In this manner, Masked-LMCTrans reconstructs the follow-up PET image from the 1% input by making use of longitudinal similarity. (B) Representative posttreatment fluorine 18 fluorodeoxyglucose PET/MRI scan in a 14-year-old male patient with Hodgkin lymphoma. Contrast and structural detail are markedly improved on the Masked-LMCTrans–reconstructed PET compared with the simulated 1% PET. The red bounding box shows the anatomic structure of the spine, which is completely missing on the simulated 1% PET but successfully reconstructed by Masked-LMCTrans with the help of the reference baseline PET. The small tumor in the left supraclavicular region (arrow) on the baseline PET resolved after treatment and is not shown on the reconstructed PET.
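
The Q, K, V exchange described in panel A can be illustrated with a small, dependency-free sketch of bidirectional cross-attention. This is not the authors' implementation. The function names (`cross_attention`, `coattention`), the single attention head, the projection matrices shared between the two directions, and the toy token values are all assumptions made for illustration; a real model would use a tensor library and learned weights.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Swap rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(query_tokens, context_tokens, w_q, w_k, w_v):
    """Queries come from one stream; keys and values come from the other."""
    q = matmul(query_tokens, w_q)
    k = matmul(context_tokens, w_k)
    v = matmul(context_tokens, w_v)
    scale = math.sqrt(len(q[0]))  # scaled dot-product attention
    scores = matmul(q, transpose(k))
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights


def coattention(baseline, followup, w_q, w_k, w_v):
    """Exchange information in both directions (hypothetical shared weights)."""
    followup_fused, followup_weights = cross_attention(followup, baseline, w_q, w_k, w_v)
    baseline_fused, baseline_weights = cross_attention(baseline, followup, w_q, w_k, w_v)
    return baseline_fused, followup_fused, baseline_weights, followup_weights


# Toy feature tokens (made-up values): 3 baseline tokens, 2 follow-up tokens.
baseline = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
followup = [[0.5, 0.5], [0.2, 0.1]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for learned projections

baseline_fused, followup_fused, baseline_weights, followup_weights = coattention(
    baseline, followup, identity, identity, identity
)
```

In this sketch each fused follow-up token is a weighted average of baseline tokens, and vice versa, which is one way the low-count follow-up features could borrow structure from the baseline scan.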