In a previous post, I discussed ML reductions in general.

For the contextual bandit class of problems specifically, there are two seminal papers. They go hand in hand with the counterfactual evaluation and learning problem, which I have discussed in other posts (here, here).

**References:**

**Video:**

- https://www.youtube.com/watch?v=gzxRDw3lXv8
- This is a fantastic video in which Robert E. Schapire discusses the contextual bandit (CB) problem.
