Table 4.
Sample image captions generated by the baseline and by our PW and CW, alongside the ground-truth (GT) captions.
| Image | Captions |
|---|---|
| | Baseline: A couple of women standing next to each other. Our PW: Two women standing next to each other holding wine glasses. Our CW: Two women drinking wine in a room. GT1: Two young women are sharing a bottle of wine. GT2: Two female friends posing with a bottle of wine. GT3: Two women posing for a photo with drinks in hand. |
| | Baseline: A group of people walking down a street. Our PW: A group of people standing in the street with an umbrella. Our CW: A group of people standing under an umbrella. GT1: Several people standing on a sidewalk under an umbrella. GT2: Some people standing on a dark street with an umbrella. GT3: Some people standing on a dark street with an umbrella. |
| | Baseline: A close up of a horse in a field. Our PW: A white horse standing in the grass in a field. Our CW: A white horse grazing in a field of grass. GT1: A horse eating grass in a green field. GT2: A while horse bending down eating grass. GT3: A tall black and white horse standing on a lush green field. |
| | Baseline: A group of people on skis in the snow. Our PW: A group of people riding skis down a snow covered slope. Our CW: Two men are skiing down a snow covered slope. GT1: Two cross country skiers heading onto the trail. GT2: Two guys cross country ski in a race. GT3: Skiers on their skis ride on the slope while others watch. |