Algorithm 1.
LinUCB (Li et al., 2010)
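| 0: Inputs: $\alpha \in \mathbb{R}_+$ |
| 1: for $t = 1, 2, 3, \ldots, T$ do |
| 2: Observe features of all arms $a \in \mathcal{A}_t$: $x_{t,a} \in \mathbb{R}^d$ |
| 3: for all $a \in \mathcal{A}_t$ do |
| 4: if $a$ is new then |
| 5: $A_a \leftarrow I_d$ ($d$-dimensional identity matrix) |
| 6: $b_a \leftarrow 0_{d \times 1}$ ($d$-dimensional zero vector) |
| 7: end if |
| 8: $\hat{\theta}_a \leftarrow A_a^{-1} b_a$ |
| 9: $P_{t,a} \leftarrow \hat{\theta}_a^\top x_{t,a} + \alpha \sqrt{x_{t,a}^\top A_a^{-1} x_{t,a}}$ |
| 10: end for |
| 11: Choose arm $a_t = \arg\max_{a \in \mathcal{A}_t} P_{t,a}$ with ties broken arbitrarily, and observe a real-valued payoff $r_t$ |
| 12: $A_{a_t} \leftarrow A_{a_t} + x_{t,a_t} x_{t,a_t}^\top$ |
| 13: $b_{a_t} \leftarrow b_{a_t} + r_t x_{t,a_t}$ |
| 14: end for |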
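
The loop in Algorithm 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the reference implementation of Li et al. (2010): the class and method names (`LinUCBArm`, `LinUCB`, `choose`, `update`) are hypothetical, and the code stores the inverse of each arm's matrix directly, updating it with the Sherman-Morrison identity rather than re-inverting after the rank-one update in step 12. The two are mathematically equivalent. Ties are broken in favor of the first arm encountered, which is one arbitrary choice among many.

```python
import math


class LinUCBArm:
    """Per-arm state of the disjoint linear model: the inverse of A_a, and b_a."""

    def __init__(self, d):
        # Steps 5-6: A_a = I_d (so its inverse is also I_d) and b_a = 0.
        self.A_inv = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
        self.b = [0.0] * d

    def _A_inv_times(self, v):
        # Matrix-vector product A_a^{-1} v.
        return [sum(row[j] * v[j] for j in range(len(v))) for row in self.A_inv]

    def score(self, x, alpha):
        # Step 8: theta_hat = A_a^{-1} b_a.
        theta = self._A_inv_times(self.b)
        # Step 9: estimated payoff plus alpha times the confidence width.
        mean = sum(t * xi for t, xi in zip(theta, x))
        quad = sum(xi * v for xi, v in zip(x, self._A_inv_times(x)))
        return mean + alpha * math.sqrt(max(quad, 0.0))

    def update(self, x, r):
        # Step 12 (assumption: Sherman-Morrison form of A_a <- A_a + x x^T,
        # applied to the stored inverse; A_a^{-1} is symmetric).
        u = self._A_inv_times(x)
        denom = 1.0 + sum(xi * ui for xi, ui in zip(x, u))
        d = len(x)
        for i in range(d):
            for j in range(d):
                self.A_inv[i][j] -= u[i] * u[j] / denom
        # Step 13: b_a <- b_a + r x.
        for i in range(d):
            self.b[i] += r * x[i]


class LinUCB:
    def __init__(self, d, alpha):
        self.d = d
        self.alpha = alpha
        self.arms = {}  # arm id -> LinUCBArm

    def choose(self, features):
        """features: dict mapping each available arm id to its feature list x_{t,a}."""
        best_arm, best_score = None, -math.inf
        for a, x in features.items():
            if a not in self.arms:  # Step 4: arm a is new.
                self.arms[a] = LinUCBArm(self.d)
            s = self.arms[a].score(x, self.alpha)
            if s > best_score:  # Step 11: first maximizer wins ties.
                best_arm, best_score = a, s
        return best_arm

    def update(self, a, x, r):
        # Only the chosen arm's statistics change (steps 12-13).
        self.arms[a].update(x, r)
```

A typical round calls `a = agent.choose(features)`, observes the payoff `r` for that arm, and then calls `agent.update(a, features[a], r)`. Storing the inverse keeps each round at O(d²) per arm; for larger `d`, an array library and a linear solver would be the more natural choice.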