The other day I was discussing how to get a gentle introduction to the Scala language. Here is what I recommend:
- [Paper] An Overview of the Scala Programming Language
- [Doc] Brief Scala Tutorial.
- [Doc] Scala by Example by Martin Odersky
- [Book – Beginner] Atomic Scala
- [Book – Beginner] Scala for the Impatient
- [Code Online] Scala Tutorials
- [Code Online] Scala Exercises
- [Online Tutorial] https://learnxinyminutes.com/docs/scala/
- [Online Tutorial] Official Documentation
- [Online Tutorial] Twitter’s Scala School
- [Online Tutorial] https://www.tutorialspoint.com/scala/
- [Course] Functional Programming Principles in Scala by Martin Odersky
- [Book – Intermediate] Programming in Scala by Martin Odersky, Lex Spoon, and Bill Venners
- [Tutorial] Official Guide and Overview
- [Book – Intermediate] Programming Scala – by Alex Payne and Dean Wampler
- [Book – Intermediate] Functional Programming in Scala – by Paul Chiusano and Rúnar Bjarnason
- [Book – Intermediate] Scala in Depth – by Joshua D. Suereth
- [Book – Intermediate] Scala Puzzlers by Andrew Phillips and Nermin Šerifović
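To give a flavor of what these resources teach, here is a tiny, self-contained sketch of idiomatic Scala (the names and data are made up for illustration): immutable values, case classes, and higher-order functions on collections.

```scala
object ScalaTaste {
  // Case classes give immutable data with equality and pattern matching for free.
  case class Book(title: String, level: String)

  val shelf: List[Book] = List(
    Book("Atomic Scala", "Beginner"),
    Book("Programming in Scala", "Intermediate")
  )

  // Higher-order functions: filter and map take functions as arguments;
  // `_.title` is shorthand for an anonymous function.
  def titlesAt(level: String): List[String] =
    shelf.filter(_.level == level).map(_.title)

  def main(args: Array[String]): Unit =
    println(titlesAt("Beginner")) // prints List(Atomic Scala)
}
```

Nothing here is mutated in place; transformations return new collections, which is the style most of the books above emphasize.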
I am trying to figure out how to export my blog from WordPress.com to Jekyll.
My goal is then to host it on GitHub Pages under my username.
- The above two links were quite useful in understanding what’s going on
Jekyll has a very short ‘getting started’ guide; apparently that’s all that’s needed.
However, my experience on Ubuntu 16.04 was more involved than that.
I kept hitting the following [issue](http://askubuntu.com/questions/793381/cant-install-rails-on-ubuntu-16-04), and was almost ready to give up before I found the links.
- It was interesting to see the role of ZooKeeper in this diagram of the Hadoop ecosystem.
There was a discussion the other day about L1 vs. L2 loss, Lasso vs. Ridge, and so on:
- What’s the difference between the L1 and L2 loss functions?
- What’s the difference between L1 and L2 regularizers?
- What’s the difference between Lasso and Ridge?
- [Differences between L1 and L2 as Loss Function and Regularization](http://www.chioka.in/differences-between-l1-and-l2-as-loss-function-and-regularization/)
- Good read
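For quick reference, these are the standard definitions behind the three questions (my own summary, consistent with the linked article):

```latex
% As loss functions over residuals y_i - \hat{y}_i:
\text{L1 loss: } S = \sum_{i=1}^{n} \lvert y_i - \hat{y}_i \rvert
\qquad
\text{L2 loss: } S = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2

% As regularization penalties added to a model's training loss:
\text{Lasso (L1): } \min_{w} \; \mathrm{Loss}(w) + \lambda \sum_{j} \lvert w_j \rvert
\qquad
\text{Ridge (L2): } \min_{w} \; \mathrm{Loss}(w) + \lambda \sum_{j} w_j^2
```

In short: the L2 loss squares residuals, so it is more sensitive to outliers than the L1 loss; as penalties, L1 tends to drive some weights exactly to zero (sparse models, built-in feature selection), while L2 only shrinks them toward zero.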
I was playing around with Mahout, and one of the things I wanted to try was Mahout’s Spark shell on my local machine.
There is a nice example for doing this, but I hit a stack dump the moment I tried to start up the Mahout shell:
```
java.lang.RuntimeException: java.io.InvalidClassException: org.apache.spark.rpc.netty.RequestMessage; local class incompatible: stream classdesc serialVersionUID = -2221986757032131007, local class serialVersionUID = -5447855329526097695
	at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:616)
	at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1630)
	at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1521)
```
The problem was that the Spark version Mahout expected was 1.6.2 (specified in the POM file), while the Spark cluster I had started was the latest version, 2.0.1.
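As background on why the version mismatch surfaces as a serialVersionUID error (this demo is my own illustration, not from the Mahout docs): Java serialization stamps every `Serializable` class with a version stamp computed from the class's structure unless one is declared explicitly, so two different builds of the "same" class (e.g. from Spark 1.6.2 vs. 2.0.1 jars) can carry different stamps, and the receiving side rejects the stream with `InvalidClassException`.

```scala
import java.io.ObjectStreamClass

object SerialVersionDemo {
  // A stand-in for a message class shipped between a Spark driver and executors.
  case class Message(body: String)

  // Look up the serialVersionUID the JVM computes for a Serializable class.
  // Change the class's fields or methods and this value changes too, which is
  // exactly what happens between incompatible library versions.
  def uidOf(cls: Class[_]): Long =
    ObjectStreamClass.lookup(cls).getSerialVersionUID

  def main(args: Array[String]): Unit =
    println(s"serialVersionUID of Message: ${uidOf(classOf[Message])}")
}
```

The fix in this case is simply to run a Spark version matching the one in Mahout's POM, as described below.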
Here are the steps I took to get it going:
Installing Mahout & Spark on your local machine
- Create a directory for Mahout somewhere on your machine, change into it, and check out the master branch of Apache Mahout from GitHub: `git clone https://github.com/apache/mahout mahout`
- Look at the POM file to check for the spark version dependency
- Change to the `mahout` directory and build Mahout using `mvn -DskipTests clean install`
- Download Apache Spark (http://www.apache.org/dyn/closer.cgi/spark)
- Note: Download the source code, not just the pre-built binaries.
- Select ‘Source Code’ in the Project type
- Change to the directory where you unpacked Spark and type `sbt/sbt assembly` to build it
- This takes close to an hour
Starting Mahout’s Spark shell
- Go to the directory where you unpacked Spark and type `sbin/start-all.sh` to start Spark locally
- Open a browser and point it to http://localhost:8080/ to check whether Spark started successfully. Copy the URL of the Spark master shown at the top of the page (it starts with spark://)
- This starts spark in the Standalone mode with 1 master and 1 worker
- I verified that the Spark version in use was 1.6.2
- Define the following environment variables in a file `mymahoutsparksettings.sh` and source that file so they are set:
```
#!/usr/bin/env bash
export MAHOUT_HOME=/home/abgoswam/repos/mahout
export SPARK_HOME=/home/abgoswam/packages/spark-1.6.2
export MASTER=spark://abgoswam-ubuntu:7077
echo "Set variables for Mahout"
```
- Finally, change to the directory where you unpacked Mahout and type `bin/mahout spark-shell`; you should see the shell start up and get the prompt
- [Playing with Mahout’s Spark Shell](https://mahout.apache.org/users/sparkbindings/play-with-shell.html)
- [Mahout Spark Shell: An Overview](https://datasciencehacks.wordpress.com/2014/10/11/mahout-spark-shell-an-overview/)
The best way to get started with Hadoop is to play with it in a single-node setting.
These links give a good intro to the Apache Beam model.