Visualization Using D3 (and dependent libraries)

This link gives a nice summary of data visualization libraries using D3:

Interestingly, it mentions mermaid and rickshaw, two cool libraries I recently came across.

Real time:


Details on running R scripts

  • trace(functionName, edit=TRUE).  Then write browser() where you want it to break
  • source('~/scripts/trial3_criteo_ensembles_ag.R', echo=TRUE)
  • Rscript -e ".libPaths()"

Other functions in R I didn’t know:

  • class(score)
  • names(score)
  • head(criteoTest)
  • class(criteoTest)
  • rxGetVarInfo(criteoTest)
  • warnings()
  • sapply(score, class)

Alias method

“You are given an n-sided die where side i has probability p_i of being rolled. What is the most efficient data structure for simulating rolls of the die?”

A very similar question was posed to me recently:

The approaches above are very cool, and illustrate the use of augmented search trees.

However, it seems there is a better method for this problem, and it has been out there for a while now. This was a fascinating read:

Additional pointers for the alias method:
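
To make the idea concrete, here is a minimal sketch of the alias method in Python. It follows the commonly described Vose variant (that choice is mine, not something the links above prescribe), and the function names build_alias_table and roll are made up for illustration. Building the table takes O(n) time; each roll then costs one random index plus one biased coin flip.

```python
import random


def build_alias_table(probs):
    """Build the probability and alias tables for a list of side probabilities.

    Assumes probs are non-negative and sum to 1.
    """
    n = len(probs)
    scaled = [p * n for p in probs]  # rescale so the average column height is 1
    prob = [0.0] * n
    alias = [0] * n
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]

    while small and large:
        s = small.pop()
        l = large.pop()
        prob[s] = scaled[s]  # column s keeps its own mass...
        alias[s] = l         # ...and is topped up with mass from l
        scaled[l] = scaled[l] + scaled[s] - 1.0
        if scaled[l] < 1.0:
            small.append(l)
        else:
            large.append(l)

    # Whatever is left has height 1 (up to floating-point error).
    for i in large + small:
        prob[i] = 1.0
        alias[i] = i
    return prob, alias


def roll(prob, alias, rng=random):
    """Simulate one roll in O(1): pick a column, then flip a biased coin."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]
```

Usage would look like prob, alias = build_alias_table([0.5, 0.25, 0.25]) followed by repeated calls to roll(prob, alias). The tree-based approaches need O(log n) per roll, which is where the alias method wins when the probabilities do not change between rolls.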



Docker. Getting Started.





docker cp <containerId>:/file/path/within/container /host/path/target

Feature Scaling in SGD

SGD is the perfect algorithm for use in online learning. Except it has one major drawback – it is sensitive to feature scaling.

In some of my trials with the SGD learner in scikit-learn, I have seen terrible performance if I don’t do feature scaling.
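
In a batch setting the usual fix is to standardize the features before training. In an online setting the full data is not available up front, so one workaround is to standardize with running statistics. The sketch below is only an illustration of that idea using Welford's update; the class name RunningScaler is hypothetical, and this is not how scikit-learn or VW handle scaling internally.

```python
import math


class RunningScaler:
    """Standardize features one example at a time using running mean/variance."""

    def __init__(self, n_features):
        self.n = 0
        self.mean = [0.0] * n_features
        self.m2 = [0.0] * n_features  # running sum of squared deviations

    def update(self, x):
        """Fold one example (a list of floats) into the running statistics."""
        self.n += 1
        for j, value in enumerate(x):
            delta = value - self.mean[j]
            self.mean[j] += delta / self.n
            self.m2[j] += delta * (value - self.mean[j])

    def transform(self, x):
        """Return x with each feature centered and divided by its running std."""
        out = []
        for j, value in enumerate(x):
            std = math.sqrt(self.m2[j] / self.n) if self.n > 0 else 0.0
            out.append((value - self.mean[j]) / std if std > 0 else 0.0)
        return out


# Two features on very different scales end up comparable after scaling.
scaler = RunningScaler(n_features=2)
for example in [[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]]:
    scaler.update(example)
scaled = scaler.transform([3.0, 300.0])
```

Each example would be passed through update and transform before the SGD step, so the learner always sees features of roughly unit scale. Early estimates are noisy, which is one reason a learner with built-in scale handling is attractive.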

Which raises the question: how does VW do feature scaling? After all, VW does online learning.


It seems VW uses a kind of SGD that is scale invariant: