Some nice links:
- N-Shot Learning: Learning More with Less Data
- Meta-Learning: Learning to Learn Fast
Getting around to deploying TF models can be quite tricky.
In fact, there are multiple ways to save/load TF models, each serving a slightly different purpose / use-case.
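One of those ways is the Keras whole-model route via `model.save` / `load_model`. A minimal sketch of a save/load round trip (the model architecture and file name here are illustrative assumptions, not from the original notes):

```python
import numpy as np
import tensorflow as tf

# A tiny illustrative model (assumption: any Keras model works the same way).
model = tf.keras.Sequential([
    tf.keras.layers.Dense(4, activation="relu", input_shape=(3,)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

x = np.random.rand(8, 3).astype("float32")
before = model.predict(x, verbose=0)

# Whole-model save: architecture + weights + optimizer state in one file.
model.save("my_model.keras")
restored = tf.keras.models.load_model("my_model.keras")
after = restored.predict(x, verbose=0)
```

The restored model should produce identical predictions; other routes (weights-only checkpoints, the SavedModel export for serving) trade convenience for deployment flexibility.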
I have been trying to understand how the ML algorithms in R fit together and compare with each other.
Binomial family and the logit link function
| # | CRAN | RevoScaleR | MicrosoftML | Description |
|---|------|------------|-------------|-------------|
| 4 | rpart | rxDTree | — | Decision Tree implementations |
| 5 | gbm | rxBTrees | rxFastTrees | Boosted Decision Tree implementations |
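As a quick reminder of what the binomial family with the logit link means: logistic regression models the log-odds of the outcome as a linear function. A small numpy sketch of the link and its inverse (illustrative, not tied to any specific package above):

```python
import numpy as np

def logit(p):
    # The logit link: maps a probability in (0, 1) to the real line (log-odds).
    return np.log(p / (1 - p))

def inv_logit(z):
    # Inverse link (the sigmoid): maps a linear predictor back to a probability.
    return 1 / (1 + np.exp(-z))

# In R, glm(y ~ x, family = binomial(link = "logit")) fits
#   logit(P(y = 1)) = b0 + b1 * x
# i.e. P(y = 1) = inv_logit(b0 + b1 * x).
p = inv_logit(0.5)
```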
| # | Title | Link |
|---|-------|------|
| 1 | Fitting Logistic Regression Models | https://msdn.microsoft.com/en-us/microsoft-r/scaler-user-guide-logistic-regression |
| 2 | Generalized Linear Models | https://msdn.microsoft.com/en-us/microsoft-r/scaler-user-guide-generalized-linear-mode |
| 3 | rxDTree(): a new type of tree algorithm for big data | http://blog.revolutionanalytics.com/2013/07/rxdtree-a-new-type-of-tree-algorithm.html |
| 4 | A first look at rxBTrees | http://blog.revolutionanalytics.com/2015/03/a-first-look-at-rxbtrees.html |
| 5 | A First Look at rxDForest() | http://blog.revolutionanalytics.com/2014/01/a-first-look-at-rxdforest.html |
SGD is the perfect algorithm for online learning, except that it has one major drawback: it is sensitive to feature scaling.
In some of my trials with the SGD learner in scikit-learn, I have seen terrible performance if I don’t do feature scaling.
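That sensitivity is easy to reproduce. A small sketch (the data setup is my own illustration, not from the original trials): one informative feature on a small scale plus a huge-scale noise feature, fit with `SGDClassifier` before and after `StandardScaler`:

```python
import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.preprocessing import StandardScaler

# Illustrative setup: the label depends only on the small-scale feature,
# while an uninformative feature is four orders of magnitude larger.
rng = np.random.default_rng(0)
n = 1000
signal = rng.normal(size=n)              # informative, unit scale
noise = rng.normal(scale=1e4, size=n)    # uninformative, huge scale
X = np.column_stack([signal, noise])
y = (signal > 0).astype(int)

raw = SGDClassifier(random_state=0).fit(X, y)
acc_raw = raw.score(X, y)                # the huge feature dominates the updates

Xs = StandardScaler().fit_transform(X)
scaled = SGDClassifier(random_state=0).fit(Xs, y)
acc_scaled = scaled.score(Xs, y)         # scaling restores good performance
```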
Which raises the question: how does VW do feature scaling? After all, VW does online learning.
It seems VW uses a kind of SGD that is scale-invariant.
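The core idea can be sketched as follows. This is only the spirit of VW's normalized online learning, under simplifying assumptions (squared loss, linear model, a plain running-max scale estimate); the actual VW update is more involved:

```python
import numpy as np

def normalized_sgd(X, y, lr=0.1, epochs=10):
    """Sketch of a scale-invariant SGD update: each coordinate's step is
    divided by a running estimate of that feature's scale, so no global
    pre-scaling pass over the data is needed."""
    n, d = X.shape
    w = np.zeros(d)
    s = np.full(d, 1e-12)                 # running per-feature scale estimates
    for _ in range(epochs):
        for i in range(n):
            x = X[i]
            s = np.maximum(s, np.abs(x))  # track the largest value seen per feature
            err = w @ x - y[i]
            w -= lr * err * x / s**2      # per-feature normalized gradient step
    return w
```

On features with wildly different scales this converges without any preprocessing, because each coordinate effectively sees unit-scale inputs.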
This is a nice link which lists the packages out there for Graphical Models.
A discussion came up the other day about L1 vs. L2 regularization, i.e. Lasso vs. Ridge.
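The practical difference is easy to see on a toy problem (my own illustrative setup): with L1 the irrelevant coefficients go to exactly zero, while L2 only shrinks them.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Toy data: 10 features, of which only the first 3 drive the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
true_w = np.zeros(10)
true_w[:3] = [3.0, -2.0, 1.5]
y = X @ true_w + rng.normal(scale=0.1, size=200)

lasso = Lasso(alpha=0.1).fit(X, y)   # L1 penalty: sparse solution
ridge = Ridge(alpha=1.0).fit(X, y)   # L2 penalty: small but nonzero weights
```

So Lasso doubles as a feature selector, while Ridge spreads weight across correlated features.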