Table 7.
Case study on the gathered HardTest. Each example has a binary “True Label”, where “1” denotes offensive content. The table reports the offensiveness probabilities assigned by COLD-R Mac and MuDA Mix, as well as the predictions from InstructGPT and BaiduTC.
