2022 Jul 20;4(5):e210299. doi: 10.1148/ryai.210299

Figure 2:

CheXDet architecture. An EfficientNet backbone extracts features and downsamples the input in width and height. The resulting multiscale features (ie, p2, p3, p4, p5, and p6) are fed into three bidirectional feature pyramid network (BiFPN) layers for information aggregation and enrichment. Each BiFPN layer combines top-down feature aggregation (red arrows), bottom-up feature aggregation (green arrows), and feature aggregation at the same scale (blue arrows). Next, a region proposal network (RPN) module and a region of interest (ROI) alignment module generate bounding-box proposals from the BiFPN features. The proposal features are then passed through four convolutional (conv) layers. Finally, two fully connected layers perform classification and regression, respectively, on the proposals to generate the predictions.
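The three aggregation paths named in the caption can be illustrated with a toy sketch in plain Python. This is not the CheXDet implementation. It assumes, beyond what the caption states, that level p_k is downsampled by a factor of 2**k, that maps are resized by nearest-neighbour upsampling and 2x2 max pooling, and that fusion is a plain average standing in for learned weights. The function names (`pyramid_shapes`, `upsample2x`, `downsample2x`, `fuse`, `bifpn_layer`) are hypothetical.

```python
# Toy sketch of BiFPN-style aggregation on single-channel 2D lists.
# All resizing and fusion choices here are illustrative assumptions.

def pyramid_shapes(height, width, levels=(2, 3, 4, 5, 6)):
    """Spatial size per level, assuming p_k has stride 2**k and the
    input size is divisible by the largest stride."""
    return {f"p{k}": (height // 2**k, width // 2**k) for k in levels}

def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def downsample2x(fmap):
    """2x2 max pooling; expects even height and width."""
    return [
        [max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
         for j in range(0, len(fmap[0]), 2)]
        for i in range(0, len(fmap), 2)
    ]

def fuse(*fmaps):
    """Element-wise average of same-sized maps (stand-in for learned fusion)."""
    n = len(fmaps)
    return [[sum(vals) / n for vals in zip(*rows)] for rows in zip(*fmaps)]

def bifpn_layer(features):
    """One bidirectional pass; features are ordered fine to coarse (p2..p6)."""
    n = len(features)
    # Top-down path: coarse context flows into finer levels.
    td = list(features)
    for i in range(n - 2, -1, -1):
        td[i] = fuse(features[i], upsample2x(td[i + 1]))
    # Bottom-up path: fine detail flows back into coarser levels.
    out = [td[0]]
    for i in range(1, n):
        inputs = [td[i], downsample2x(out[i - 1])]
        if i < n - 1:
            inputs.append(features[i])  # same-scale connection
        out.append(fuse(*inputs))
    return out

# Toy pyramid with side lengths 16, 8, 4, 2, 1; stack three layers as in the caption.
toy = [[[0.0] * s for _ in range(s)] for s in (16, 8, 4, 2, 1)]
toy[-1] = [[8.0]]  # signal only at the coarsest level
enriched = toy
for _ in range(3):
    enriched = bifpn_layer(enriched)
```

Each layer preserves the spatial size of every level, so layers can be stacked freely. After a single pass, the signal placed at the coarsest level is already visible at the finest one, which is the purpose of the top-down path.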
