Running Scala Tests in IntelliJ IDEA

I set up my first test suite in IntelliJ IDEA to try out some Scala code.

To run unit tests, I used the FunSuite style from ScalaTest.

import org.scalatest.FunSuite

 

Tips:

[1]  I modified the build.sbt to include the line:

libraryDependencies += "org.scalatest" %% "scalatest" % "3.0.0" % "test"

Reloading the project was not enough. For the change to take effect, I had to terminate the running sbt prompt and open a new one.

 

[2]  I did not use JUnit.

 


Tool Shortcuts

As I keep switching between tools, it has become necessary to note down the shortcuts for each of them.

Spyder:

  • (control + 1)  : comment/uncomment lines
  • (control + d)  : delete lines
  • Is there a way to cut lines in Spyder (besides the standard select-and-cut technique)?

Notepad++

  • (control + q)  : comment/uncomment lines for different file formats e.g. .sh / .bat

GVim

  • yy  : copy (yank) lines; dd : cut lines
  • p : paste lines

VS:

  • (control + x) : cut lines
  • (control + k, control + d) : format correction

IntelliJ IDEA:

  • (alt + F12) : command prompt
  • (control + alt + l) : format correction
  • (control + d) : exit sbt’s Scala console (‘sbt’ -> ‘console’) and return to the sbt prompt.
  • (control + ‘/’) : comment / uncomment

 

 


Terminal Choices and Tips

There seems to be a deluge of terminal options on the Windows platform.

  • Cygwin
  • Mintty
    • What's the relationship between Cygwin and Mintty?
    • The Cygwin shortcut on my machine looks as follows:
      • C:\cygwin64\bin\mintty.exe -i /Cygwin-Terminal.ico -
  • GitBash
  • MSYS
  • CMD
  • PowerShell (PS)


Bash Scripting on Windows

There are a couple of ways to make bash scripts written on Windows run under Cygwin. Both address the \r\n issue: file editors on Windows end each line with \r\n, while bash expects just \n, so the stray \r breaks the script.

  • Convert the line endings using Notepad++  (Edit->EOL Conversion)
  • Run the dos2unix tool to make the file Unix compatible.
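
If neither tool is at hand, the same conversion can be sketched in a few lines of Python. This is only an illustration of what the two options above do, not a replacement for dos2unix; the function names to_unix_line_endings and convert_file are made up for this example.

```python
from pathlib import Path


def to_unix_line_endings(data: bytes) -> bytes:
    """Replace Windows (CRLF) line endings with Unix (LF) ones."""
    return data.replace(b"\r\n", b"\n")


def convert_file(path: str) -> None:
    """Rewrite the file at `path` in place with Unix line endings."""
    p = Path(path)
    # Work on raw bytes so Python does not translate newlines on its own.
    p.write_bytes(to_unix_line_endings(p.read_bytes()))
```

Calling convert_file on a script (for example a hypothetical myscript.sh) rewrites it in place, so it is worth keeping a copy of the original the first time.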


Getting Started with Spark on Windows 10 (Part 1)

The references below helped me get started with Spark on Windows. Here are a few additional tips based on my experience:

Tips:

  • Added the following as system environment variables for sbt, Spark, Scala, Hadoop and Java.
    • sbt_spark_scala
    • hadoop_java
  • Added the following to the system PATH environment variable:
    • systempath
    • Note that the Java path is picked up automatically. I believe that’s because a Java path entry is already present inside PATH [C:\ProgramData\Oracle\Java\javapath]
  • spark-shell on Cygwin gives an error at launch.
    • The error looks something like: [: too many arguments
    • I think Spark on Cygwin has not been fully tried out yet. Read this.
  • After getting spark-shell to launch in the Cmd window, I hit another weird stack trace.
  • There is a weird stack dump that happens when exiting the spark-shell (i.e. on hitting :q within the Spark shell).
    • It seems to be non-deterministic. I am not aware of the root cause of this issue.

References:

 

ML Reductions for Contextual Bandit

In a previous post, I discussed ML reductions in general.

For the contextual bandit class of problems specifically, there are 2 seminal papers. They go hand in hand with the counterfactual evaluation and learning problem, which I have discussed in other posts (here, here).


Frequency Counting in Python

One of the most frequent operations in data analysis is computing frequency counts.

Here are the various ways of doing this task:

  • using python collections: Counter and defaultdict
  • using numpy
    • with numpy.unique, with return_counts argument
    • with bincount, nonzero, zip / vstack
  • using pandas
  • using scipy
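
As a rough illustration of the first two families in the list, here is a minimal sketch that counts the same values with Counter, defaultdict, numpy.unique and numpy.bincount. The sample data and variable names are made up for this example, the bincount variant assumes small non-negative integers, and the pandas and scipy options are not shown.

```python
from collections import Counter, defaultdict

import numpy as np

values = [3, 1, 3, 2, 3, 1]  # made-up sample data

# 1. collections.Counter: counts any hashable items in one call.
counter_counts = Counter(values)

# 2. collections.defaultdict: the same idea, written out by hand.
dd_counts = defaultdict(int)
for v in values:
    dd_counts[v] += 1

# 3. numpy.unique with return_counts: sorted unique values plus their counts.
uniq, uniq_counts = np.unique(values, return_counts=True)
unique_counts = dict(zip(uniq.tolist(), uniq_counts.tolist()))

# 4. numpy.bincount + nonzero + zip: only valid for non-negative integers.
bins = np.bincount(values)        # bins[i] is the number of times i occurs
present = np.nonzero(bins)[0]     # indices (values) that occur at least once
bincount_counts = dict(zip(present.tolist(), bins[present].tolist()))

print(counter_counts.most_common())
print(unique_counts)
```

All four approaches produce the same value-to-count mapping here; Counter is usually the least effort for plain Python lists, while the numpy variants fit better when the data is already in an array.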

 
