Author manuscript; available in PMC: 2022 Jun 2.
Published in final edited form as: Proc Conf. 2021 Jun;2021:4972–4984. doi: 10.18653/v1/2021.naacl-main.395

Algorithm 1.

Computes a probability score for a text document D given a masked language model M. A call to Forward returns a matrix in which each row is a distribution over all the tokens in the vocabulary. The Append function adds a value to the end of a list.

procedure Masked-Prob(D, M)
 sents ← Sentence-Split(D)
 P ← Initialize empty list
 for i = 1 … |sents| do
  T ← Tokenize(sents[i])
  for j = 1 … 10 do
   A ← sample 15% from 1 … |T|
   T′ ← T
   for all a ∈ A do
    T′[a] ← [MASK]
   outputs ← Forward(M, T′)
   for all a ∈ A do
    prob ← outputs[a][T[a]]
    Append(P, prob)
 return mean(P)
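
The procedure above can be sketched in Python roughly as follows. This is a minimal illustration, not the authors' implementation. The parameter names (`forward`, `sentence_split`, `tokenize`, `mask_id`, `rounds`, `mask_rate`, `rng`) are hypothetical, and the toy models at the bottom stand in for a real masked language model. The pseudocode does not say how 15% of |T| is rounded or whether sampling is with replacement, so the sketch assumes rounding to the nearest integer, at least one masked position per round, and sampling without replacement.

```python
import random
from statistics import mean
from typing import Callable, List, Optional, Sequence


def masked_prob(
    document: str,
    forward: Callable[[List[int]], Sequence[Sequence[float]]],
    sentence_split: Callable[[str], List[str]],
    tokenize: Callable[[str], List[int]],
    mask_id: int,
    rounds: int = 10,
    mask_rate: float = 0.15,
    rng: Optional[random.Random] = None,
) -> float:
    """Mean probability the model assigns to the original tokens at masked positions."""
    if rng is None:
        rng = random.Random()
    probs: List[float] = []
    for sentence in sentence_split(document):
        tokens = tokenize(sentence)
        if not tokens:
            continue
        # Assumption: round 15% of the length, but always mask at least one token.
        k = max(1, round(mask_rate * len(tokens)))
        for _ in range(rounds):
            # Assumption: positions are sampled without replacement.
            positions = rng.sample(range(len(tokens)), k)
            masked = list(tokens)  # T' is a copy of T
            for a in positions:
                masked[a] = mask_id
            outputs = forward(masked)  # one distribution per token position
            for a in positions:
                probs.append(outputs[a][tokens[a]])
    if not probs:
        raise ValueError("document produced no tokens to score")
    return mean(probs)


# Toy stand-ins for a real model: a 4-token vocabulary where id 3 plays [MASK].
vocab = {"the": 0, "cat": 1, "sat": 2}


def split_on_periods(doc: str) -> List[str]:
    return [s.strip() for s in doc.split(".") if s.strip()]


def toy_tokenize(sentence: str) -> List[int]:
    return [vocab[word] for word in sentence.split()]


def uniform_forward(token_ids: List[int]) -> List[List[float]]:
    # Every position gets a uniform distribution over the 4 vocabulary entries.
    return [[0.25] * 4 for _ in token_ids]


def peaked_forward(token_ids: List[int]) -> List[List[float]]:
    # Every position puts all probability mass on token id 0 ("the").
    return [[1.0, 0.0, 0.0, 0.0] for _ in token_ids]


uniform_score = masked_prob(
    "the cat sat. the cat", uniform_forward, split_on_periods,
    toy_tokenize, mask_id=3, rng=random.Random(0),
)
peaked_score = masked_prob(
    "the the the", peaked_forward, split_on_periods,
    toy_tokenize, mask_id=3, rng=random.Random(0),
)
```

With a real model, `forward` would wrap the model call and apply a softmax over the vocabulary at each position, and `tokenize` would return the model's own token ids. Passing the model in as a callable keeps the sketch independent of any particular library.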