In a previous post, I discussed ML reductions in general.

For the specific case of the contextual bandit class of problems, there are two seminal papers. They go hand in hand with the counterfactual evaluation and learning problem, which I have discussed in other posts (here, here).

**References:**

**Video:**

- https://www.youtube.com/watch?v=gzxRDw3lXv8
- This is a fantastic video in which Robert E. Schapire discusses the contextual bandit (CB) problem.

