Getting Started With Scala

The other day I was discussing how to get a gentle introduction to the Scala language. Here is what I prescribe.

Sufficient Scala:

Additional Resources:

References:

WordPress.com to Jekyll

I am trying to figure out how to export my blog from WordPress.com to Jekyll.

My goal is then to host it on GitHub Pages under my username.

Some references:

Jekyll on Ubuntu 16.04

Jekyll has a very short ‘getting started’ guide:
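
From memory, the guide boils down to just a few commands, roughly like the following (the site name here is a placeholder):

```sh
gem install jekyll
jekyll new my-blog       # scaffold a new site
cd my-blog
jekyll serve             # then browse to http://localhost:4000
```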

Apparently that’s all that’s needed. However, my experience on Ubuntu 16.04 was more intense than this.

I kept hitting the following [issue](http://askubuntu.com/questions/793381/cant-install-rails-on-ubuntu-16-04).

I was almost about to give up on this thing before finding a few helpful links.
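
For the record, here is a sketch of the kind of fix those threads point at, assuming the gem install is failing while building native extensions (package names from memory; adjust as needed):

```sh
# Ruby, its dev headers, and a compiler toolchain are needed to build
# native gem extensions on Ubuntu 16.04
sudo apt-get install ruby ruby-dev build-essential zlib1g-dev

# then retry the install
gem install jekyll
```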


L1 / L2 loss functions and regularization

A discussion came up the other day about L1 vs. L2, Lasso vs. Ridge, etc.

In particular (see the formulas sketched after this list):

  • What's the difference between the L1 and L2 loss functions?
  • What's the difference between L1 and L2 regularizers?
  • What's the difference between Lasso and Ridge?
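
As a quick refresher, here is my shorthand for keeping these straight, with $y_i$ the targets, $\hat{y}_i$ the predictions, $w$ the weights, and $\lambda$ the regularization strength:

```latex
% L1 (absolute) vs. L2 (squared) loss:
L_1 = \sum_i \lvert y_i - \hat{y}_i \rvert
\qquad
L_2 = \sum_i (y_i - \hat{y}_i)^2

% L1 vs. L2 regularization: a penalty added to the loss:
\Omega_{L1}(w) = \lambda \lVert w \rVert_1
\qquad
\Omega_{L2}(w) = \lambda \lVert w \rVert_2^2

% Lasso = least squares + L1 penalty (drives some weights to exactly zero);
% Ridge = least squares + L2 penalty (shrinks all weights smoothly):
\min_w \lVert y - Xw \rVert_2^2 + \lambda \lVert w \rVert_1      % Lasso
\min_w \lVert y - Xw \rVert_2^2 + \lambda \lVert w \rVert_2^2    % Ridge
```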


References:

Mahout Spark Shell Locally

I was playing around with Mahout, and one of the things I wanted to try was Mahout's Spark shell on my local machine.

There is a nice example of how to do this, but I hit a stack dump the moment I tried to start the Mahout shell using `bin/mahout spark-shell`:

```
java.lang.RuntimeException: java.io.InvalidClassException: org.apache.spark.rpc.netty.RequestMessage; local class incompatible: stream classdesc serialVersionUID = -2221986757032131007, local class serialVersionUID = -5447855329526097695
	at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:616)
	at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1630)
	at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1521)
```

The problem was that the Spark version Mahout expected was 1.6.2 (specified in the POM file), while the Spark cluster I had started was running the latest version, 2.0.1. The two versions serialize classes differently, hence the serialVersionUID mismatch in the stack trace.
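
A quick way to confirm which Spark version Mahout's build expects; this assumes the version is declared as a `spark.version` property in the top-level `pom.xml` (the property name may differ in your checkout):

```sh
cd ~/repos/mahout                 # wherever you cloned Mahout
grep '<spark.version>' pom.xml    # e.g. <spark.version>1.6.2</spark.version>
```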

Here are the steps I took to get it going:

Installing Mahout & Spark on your local machine

  • Create a directory for Mahout somewhere on your machine, change to it, and check out the master branch of Apache Mahout from GitHub (see the command sketch after this list).
  • Change to the mahout directory and build Mahout using `mvn -DskipTests clean install`.
  • Download the Apache Spark source (http://www.apache.org/dyn/closer.cgi/spark). I used version 1.6.2 to match Mahout's POM.
    • Note: download the source code, not just the pre-built binaries.
    • Select ‘Source Code’ as the package type.
  • Change to the directory where you unpacked Spark and type `sbt/sbt assembly` to build it.
    • This takes close to an hour.
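
A sketch of those installation steps as shell commands, assuming `~/repos` and `~/packages` as working directories (any paths will do):

```sh
# check out and build Mahout (master branch)
mkdir -p ~/repos && cd ~/repos
git clone https://github.com/apache/mahout.git
cd mahout
mvn -DskipTests clean install

# unpack and build the Spark 1.6.2 source release
cd ~/packages
tar xzf spark-1.6.2.tgz
cd spark-1.6.2
sbt/sbt assembly    # this is the step that takes close to an hour
```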


Starting Mahout’s Spark shell

  • Go to the directory where you unpacked Spark and type `sbin/start-all.sh` to start Spark locally.
  • Open a browser and point it to http://localhost:8080/ to check whether Spark started successfully. Copy the URL of the Spark master at the top of the page (it starts with spark://).
    • This starts Spark in standalone mode with 1 master and 1 worker.
    • I verified that the Spark version in use was 1.6.2.
  • Define the following environment variables in a file `mymahoutsparksettings.sh` and source it so they are set in your shell:
```
abgoswam@abgoswam-ubuntu:~/repos/mahout$ cat mymahoutsparksettings.sh
#!/usr/bin/env bash

export MAHOUT_HOME=/home/abgoswam/repos/mahout
export SPARK_HOME=/home/abgoswam/packages/spark-1.6.2
export MASTER=spark://abgoswam-ubuntu:7077

echo "Set variables for Mahout"
abgoswam@abgoswam-ubuntu:~/repos/mahout$
```

  • Finally, change to the directory where you unpacked Mahout and type `bin/mahout spark-shell`; you should see the shell start up and get the `mahout>` prompt.
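
Putting it all together, the start-up sequence on my machine looked roughly like this (paths and hostname are from my setup above; adjust to yours):

```sh
# set MAHOUT_HOME, SPARK_HOME and MASTER for this shell
cd ~/repos/mahout
source mymahoutsparksettings.sh

# start the standalone Spark cluster (1 master + 1 worker)
cd ~/packages/spark-1.6.2
sbin/start-all.sh

# launch the Mahout Spark shell
cd ~/repos/mahout
bin/mahout spark-shell
```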

References: