ML Algos in R

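I have been trying to understand how the ML algos in R fit together and compare with each other. The table below lines up base R functions with their RevoScaleR and MicrosoftML (MML) counterparts.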

#  R                                  RevoScaleR  MML (MicrosoftML)     Comment
1  lm                                 rxLinMod    -                     Linear models
2  glm                                rxGlm       -                     Generalized linear models
3  glm (binomial family, logit link)  rxLogit     rxLogisticRegression  Logistic regression
4  rpart                              rxDTree     -                     Decision tree implementations
5  gbm                                rxBTrees    rxFastTrees           Boosted decision tree implementations
6  -                                  rxDForest   rxFastForest



#  Title
1  Fitting Logistic Regression Models
2  Generalized Linear Models
3  rxDTree(): a new type of tree algorithm for big data
4  A first look at rxBTrees
5  A First Look at rxDForest()



Using SSH Keys on Cloud Platforms


  • openssl.exe req -x509 -nodes -days 365 -newkey rsa:2048 -keyout myPrivateKey.key -out myCert.pem
    • We will mostly use the .key file
    • The .pem file is only needed for Classic deployments. Typically we won't use this.


  • Look up the use of req:
    • The req command primarily creates and processes certificate requests. With the -x509 flag it writes a self-signed certificate instead of a request, which is why the output of req here is a certificate (myCert.pem).
    • But we are interested in the private key (myPrivateKey.key), hence the -keyout flag.
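
To make the req discussion concrete, here is a minimal sketch that runs the same command non-interactively and then looks at what each output file contains. The scratch directory and the -subj value "/CN=example.test" are assumptions added for illustration; they are not part of the original command.

```shell
#!/bin/sh
# Sketch: generate a key and self-signed certificate, then inspect both files.
set -eu

workdir="$(mktemp -d)"               # scratch directory (assumption)
key="$workdir/myPrivateKey.key"
cert="$workdir/myCert.pem"

# -subj fills in the certificate subject so openssl does not prompt for it.
# "/CN=example.test" is a placeholder value.
openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
    -keyout "$key" -out "$cert" -subj "/CN=example.test" 2>/dev/null

# A private key should be readable only by its owner.
chmod 600 "$key"

# The first line of each file names the kind of PEM block it holds.
head -n 1 "$key"
head -n 1 "$cert"

# Show the subject and validity window of the self-signed certificate.
openssl x509 -in "$cert" -noout -subject -dates
```

The two head calls are the quickest way to tell a key file from a certificate file, whatever extension each one carries.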




  • In AWS, the private key is saved in a .pem file. You just use the .pem file to connect to the instances.
    • I expected the .pem extension to be for certificates, not for keys. In fact PEM is only a Base64 text encoding, and it can hold either.
    • This was one of my confusions, because AWS saves the key in the .pem file.



  • Use ssh-agent to store private keys. Makes life much simpler!
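
As a rough illustration of the ssh-agent workflow, the sketch below starts an agent, loads a key, lists what the agent holds, and shuts the agent down again. The throwaway ed25519 key and its path are assumptions made so the example is self-contained; in practice you would pass the path of your own private key to ssh-add.

```shell
#!/bin/sh
# Sketch: start an agent, add a key, list loaded keys, stop the agent.
set -eu

workdir="$(mktemp -d)"
key="$workdir/demo_key"              # throwaway demo key (hypothetical path)

# Create an unencrypted demo key; a real key would normally have a passphrase.
ssh-keygen -q -t ed25519 -N "" -C "demo" -f "$key"

# Start the agent and export SSH_AUTH_SOCK and SSH_AGENT_PID into this shell.
eval "$(ssh-agent -s)" >/dev/null

# Load the key. For a passphrase-protected key, this is the one time
# the passphrase is asked for.
ssh-add "$key" 2>/dev/null

# List the fingerprints of the keys the agent currently holds.
loaded="$(ssh-add -l)"
echo "$loaded"

# Stop the agent when finished (uses SSH_AGENT_PID).
eval "$(ssh-agent -k)" >/dev/null
```

Once a key is loaded, ssh can use it without an explicit -i option for as long as the agent is running.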


Visualization Using D3 (and dependent libraries)

This link gives a nice summary of data visualization libraries using D3:

Interestingly, it mentions mermaid and rickshaw, two cool libraries I recently came across.

Real time:


Details on Running R Scripts

  • trace(functionName, edit=TRUE), then write browser() where you want execution to break
  • source('~/scripts/trial3_criteo_ensembles_ag.R', echo=TRUE)
  • Rscript -e ".libPaths()"

Other functions in R I didn’t know:

  • class(score)
  • names(score)
  • head(criteoTest)
  • class(criteoTest)
  • rxGetVarInfo(criteoTest)
  • warnings()
  • sapply(score, class)
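
The commands above can be tried from a shell without opening an R session. This is only a sketch: the script name inspect_demo.R is hypothetical, and the built-in mtcars data set stands in for the score and criteoTest objects, which are not available here.

```shell
#!/bin/sh
# Sketch: write a small R script and run it with Rscript.
set -eu

workdir="$(mktemp -d)"
script="$workdir/inspect_demo.R"     # hypothetical script name

# The quoted 'EOF' stops the shell from expanding anything inside the R code.
cat > "$script" <<'EOF'
score <- head(mtcars[, c("mpg", "cyl")])  # stand-in for the real object
print(class(score))
print(names(score))
print(sapply(score, class))
EOF

if command -v Rscript >/dev/null 2>&1; then
    # Single quotes pass the R expression to Rscript untouched.
    Rscript -e '.libPaths()'
    # Run the script file non-interactively.
    Rscript "$script"
else
    echo "Rscript not found on PATH; install R to run this sketch" >&2
fi
```

Plain ASCII quotes matter here: the curly quotes that word processors and blog editors substitute are rejected by both the shell and R.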