Correction to: Scientific Reports https://doi.org/10.1038/s41598-022-23052-9, published online 08 November 2022
The original version of this Article contained errors in Figures 2 and 7 where the images did not display correctly. The original Figures 2 and 7 and accompanying legends appear below.
Figure 2.
Schematic overview of multi-modal retrieval (left) and zero-shot classification tasks (right).
Figure 7.
Item bounding-box detection. Localization maps can be easily extended to provide zero-shot bounding boxes for items of interest. Green bounding boxes show the predicted locations for the fashion concepts “Backpack” (left) and “Straw hat” (right). Images above are taken from the publicly available Unsplash Lite Dataset 1.2.0. FashionCLIP was tested extensively on ModaNet; please reach out to the authors for links to those images.
The original Article has been corrected.


