Table 3. Ethical-Lens significantly enhances toxicity alignment across various perspectives, mostly surpassing the performance of DALL·E 3.
| Methods | GPT4-V: Nudity | GPT4-V: NSFW | GPT4-V: Public | GPT4-V: Politic | GPT4-V: Culture | HEIM: NSFW | HEIM: Nudity |
|---|---|---|---|---|---|---|---|
| DD 1.0 | 0.044 | 0.078 | 0.158 | 0.163 | 0.041 | 0.037 | 0.051 |
| +Ethical-Lens | 0.023 | 0.009 | 0.048 | 0.042 | 0.023 | 0.023 | 0.041 |
| SD 1.5 | 0.097 | 0.078 | 0.166 | 0.157 | 0.033 | 0.069 | 0.077 |
| +Ethical-Lens | 0.058 | 0.013 | 0.063 | 0.041 | 0.021 | 0.043 | 0.059 |
| SD 2.0 | 0.068 | 0.056 | 0.184 | 0.155 | 0.030 | 0.049 | 0.054 |
| +Ethical-Lens | 0.052 | 0.015 | 0.060 | 0.033 | 0.007 | 0.036 | 0.037 |
| SDXL 1.0 | 0.046 | 0.068 | 0.182 | 0.160 | 0.033 | 0.047 | 0.046 |
| +Ethical-Lens | 0.039 | 0.006 | 0.014 | 0.009 | 0.009 | 0.030 | 0.028 |
| DALL·E 3 | 0.015 | 0.042 | 0.021 | 0.084 | 0.050 | 0.014 | 0.020 |
The table compares scores for each alignment perspective within the toxicity dimension for different text-to-image models and our Ethical-Lens on the Tox1K dataset. Lower scores are better.
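As a reading aid, the following minimal Python sketch recomputes two quantities directly from the values in Table 3: the relative score reduction obtained by adding Ethical-Lens to each base model, and the columns in which each Ethical-Lens variant scores below DALL·E 3. The names `relative_reduction` and `columns_below` are hypothetical helpers introduced here, and relative reduction is an illustrative quantity chosen for this sketch, not a metric reported by the authors.

```python
# Scores transcribed from Table 3 (lower is better).
# Column order: five GPT4-V perspectives, then two HEIM perspectives.
COLUMNS = [
    "GPT4-V: Nudity", "GPT4-V: NSFW", "GPT4-V: Public",
    "GPT4-V: Politic", "GPT4-V: Culture", "HEIM: NSFW", "HEIM: Nudity",
]

BASE = {
    "DD 1.0":   [0.044, 0.078, 0.158, 0.163, 0.041, 0.037, 0.051],
    "SD 1.5":   [0.097, 0.078, 0.166, 0.157, 0.033, 0.069, 0.077],
    "SD 2.0":   [0.068, 0.056, 0.184, 0.155, 0.030, 0.049, 0.054],
    "SDXL 1.0": [0.046, 0.068, 0.182, 0.160, 0.033, 0.047, 0.046],
}

WITH_LENS = {
    "DD 1.0":   [0.023, 0.009, 0.048, 0.042, 0.023, 0.023, 0.041],
    "SD 1.5":   [0.058, 0.013, 0.063, 0.041, 0.021, 0.043, 0.059],
    "SD 2.0":   [0.052, 0.015, 0.060, 0.033, 0.007, 0.036, 0.037],
    "SDXL 1.0": [0.039, 0.006, 0.014, 0.009, 0.009, 0.030, 0.028],
}

DALLE3 = [0.015, 0.042, 0.021, 0.084, 0.050, 0.014, 0.020]


def relative_reduction(before, after):
    """Fraction by which each score drops (hypothetical helper, not from the paper)."""
    return [(b - a) / b for b, a in zip(before, after)]


def columns_below(scores, reference):
    """Names of the columns where `scores` is strictly lower than `reference`."""
    return [c for c, s, r in zip(COLUMNS, scores, reference) if s < r]


reductions = {m: relative_reduction(BASE[m], WITH_LENS[m]) for m in BASE}
beats_dalle3 = {m: columns_below(WITH_LENS[m], DALLE3) for m in BASE}

for model in BASE:
    formatted = ", ".join(f"{r:.0%}" for r in reductions[model])
    print(f"{model} + Ethical-Lens reductions: {formatted}")
    print(f"  below DALL·E 3 in: {beats_dalle3[model]}")
```

Keeping the scores as plain lists in the table's column order makes each comparison a single `zip`, which is enough for a table of this size.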