1: |
θ ← 4.5 |
▷ Initialize the reward threshold. |
2: |
h[1: 10] ← [0, 0, 0, 0, 0, 0, 0, 0, 0, 0] |
▷ Initialize the recent reward history. |
3: |
for each second’s vocalization do
|
|
4: |
if
S > θ
then
|
▷ If salience is high, reward. |
5: |
r ← 1 |
|
6: |
else
|
|
7: |
r ← 0 |
|
8: |
h[11] ← r
|
▷ Update the recent reward history. |
9: |
h ← h[2: 11] |
|
10: |
if
then
|
▷ If the recent reward rate is 30% or higher |
11: |
θ ← θ + .1 |
▷ increase the reward threshold |
12: |
h[1: 10] ← [0, 0, 0, 0, 0, 0, 0, 0, 0, 0] |
▷ and reset the recent reward history. |