Alias method

“You are given an n-sided die where side i has probability pi of being rolled. What is the most efficient data structure for simulating rolls of the die?”

A very similar question was posted to me recently  :

The approaches above are very cool, and illustrate the use of augmented search trees.

However, it seems there is a better method for this problem – and it has been out there for a while now. This was a fascinating read :

Additional pointers for the alias method:



Docker. Getting Started.





docker cp <containerId>:/file/path/within/container /host/path/target

Feature Scaling in SGD

SGD is the perfect algorithm for use in online learning. Except it has one major drawback – is sensitive to feature scaling.

In some of my trials with the SGD learner in scikit-learn, I have seen terrible performance if I don’t do feature scaling.

Which begs the question – How does VW do feature scaling ? After all VW does online learning.


It seems VW uses a kind of SGD that is scale variant: