Java, Maven, Scala, SBT Concepts

I am dipping a toe into Maven and trying to make sense of how it fits together with the IDE, the command-line mvn tool, POM files and so on.

tips:

  • IntelliJ has default support within the IDE for both Maven and SBT, so as long as we are not running mvn or sbt from the command line we should be good.

java fundamental concepts:

maven:

base scala in intellij:

scala with SBT:

Ubuntu Learnings

Remote to Ubuntu 16.04 from Windows 10:

Copy On Select:

Install Java

profile and bashrc files

abgoswam@abgoswam-ubuntu:~$ ls .bash* -lh
-rw------- 1 abgoswam abgoswam 4.1K Oct 23 15:20 .bash_history
-rw-r--r-- 1 abgoswam abgoswam  220 Oct 21 22:44 .bash_logout
-rw-r--r-- 1 abgoswam abgoswam 3.7K Oct 21 22:44 .bashrc

abgoswam@abgoswam-ubuntu:~$ ls .profile* -lh
-rw-r--r-- 1 abgoswam abgoswam 655 Oct 21 22:44 .profile

abgoswam@abgoswam-ubuntu:~$ ls -lh /etc/profile*
-rw-r--r-- 1 root root  670 Oct 23 14:55 /etc/profile
-rw-r--r-- 1 root root  884 Oct 23 15:18 /etc/profile.save

/etc/profile.d:
total 20K
-rw-r--r-- 1 root root   40 Nov 30  2015 appmenu-qt5.sh
-rw-r--r-- 1 root root  101 Jun 29 12:03 apps-bin-path.sh
-rw-r--r-- 1 root root  663 May 18 02:19 bash_completion.sh
-rw-r--r-- 1 root root 1003 Dec 29  2015 cedilla-portuguese.sh
-rw-r--r-- 1 root root 1.9K Mar 16  2016 vte-2.91.sh

abgoswam@abgoswam-ubuntu:~$ ls -lh /etc/bash*
-rw-r--r-- 1 root root 2.2K Aug 31  2015 /etc/bash.bashrc
-rw-r--r-- 1 root root   45 Aug 12  2015 /etc/bash_completion

Gvim

Making Gvim auto open in new tab:

 

Spark:

Windowing Operations in Azure Stream Analytics

Windowing is a very common operation in stream analytics.

Beneath the surface, a fair amount of complex data structuring goes on to support the windowing operations. I would love to dig deeper into it someday.

Example:

Here is an example of a query I wrote recently using windowing operators in Azure Stream Analytics. It shows three interesting things:
1. Windowing
2. CTEs
3. Aggregation over string columns (using TopOne)

WITH ContextReward AS (
    SELECT 
        eventid,
        TopOne() OVER (ORDER BY [EventEnqueuedUtcTime] ASC) CR,
        MAX (reward) AS reward
    FROM Input
    GROUP BY eventid, HoppingWindow(Duration(hour, 2), Hop(hour, 1))
)

SELECT 
    reward,
    eventid, 
    CR.actionname AS actionname,
    CR.age AS age,
    CR.gender AS gender,
    CR.weight AS weight,
    CR.actionprobability
INTO OutputWindow
FROM ContextReward

SELECT * INTO Output FROM Input 
SELECT * INTO OutputCSV FROM Input
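
To get a feel for what HoppingWindow(Duration(hour, 2), Hop(hour, 1)) does to each event, here is a minimal Python sketch of the grouping. It is only an illustration, not how Stream Analytics works internally. I am assuming windows are aligned to multiples of the hop and cover the interval (end - duration, end]; the field names and sample events are made up, and times are in minutes.

```python
from collections import defaultdict


def hopping_window_ends(t, duration, hop):
    """End times of every hopping window (end - duration, end] that contains t."""
    # Smallest multiple of hop that is >= t (integer ceiling division).
    end = -(-t // hop) * hop
    ends = []
    while end - duration < t:
        ends.append(end)
        end += hop
    return ends


def aggregate(events, duration=120, hop=60):
    """Group by (eventid, window end), keeping MAX(reward) and the earliest event."""
    groups = defaultdict(list)
    for ev in events:
        for end in hopping_window_ends(ev["time"], duration, hop):
            groups[(ev["eventid"], end)].append(ev)
    result = {}
    for key, evs in groups.items():
        # Earliest event in the group, similar in spirit to TopOne() ... ORDER BY time ASC.
        first = min(evs, key=lambda ev: ev["time"])
        result[key] = {"reward": max(ev["reward"] for ev in evs), "CR": first}
    return result


# Hypothetical sample: the same eventid seen twice, 60 minutes apart.
events = [
    {"eventid": "e1", "time": 30, "reward": 0, "actionname": "x"},
    {"eventid": "e1", "time": 90, "reward": 30, "actionname": "x"},
]
for key, row in sorted(aggregate(events).items()):
    print(key, row["reward"], row["CR"]["time"])
```

Because the window is twice as long as the hop, every event lands in two overlapping windows, so the same eventid can show up in more than one output row.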

 

References:

802.3 vs. 802.11

This gives a nice overview of the differences between Ethernet and Wi-Fi at the protocol level.

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.456.9874&rep=rep1&type=pdf

  • The crux of the problem is this: “The CSMA/CD protocol is not used in a wireless environment due to the user has no capability to sense/listen to the channel for collision while sending the packet” [12].
  • This makes collision avoidance techniques necessary for Wi-Fi, and those techniques limit how fast packets can be transmitted in a given frequency band, which leads to slower speeds.
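
To make the cost of collision avoidance a bit more concrete, here is a rough Python sketch of the random backoff idea: before sending, a station waits a random number of idle slots, and the range it draws from grows after each failed attempt. The constants (slot time, minimum and maximum contention window) are illustrative assumptions on my part, not values taken from the paper.

```python
import random

SLOT_TIME_US = 9   # assumed slot length in microseconds (illustrative)
CW_MIN = 15        # assumed minimum contention window, in slots
CW_MAX = 1023      # assumed maximum contention window, in slots


def contention_window(retries, cw_min=CW_MIN, cw_max=CW_MAX):
    """Window roughly doubles after each failed attempt, capped at cw_max."""
    return min((cw_min + 1) * 2 ** retries - 1, cw_max)


def backoff_slots(retries, rng=random):
    """Pick a random number of idle slots to wait before transmitting."""
    return rng.randint(0, contention_window(retries))


for retries in range(8):
    cw = contention_window(retries)
    wait_us = backoff_slots(retries) * SLOT_TIME_US
    print(retries, cw, wait_us)
```

Every slot spent waiting is airtime in which nothing is sent, which is one way to see why avoiding collisions costs throughput compared with detecting them on a wire.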

 

REST Calls in Python. JSON. Pandas.

I recently had to make REST calls in Python to send data to Azure EventHub.

In this particular case I could not use the Python SDK to talk to EventHub. While writing the code for the raw REST calls I came across several gems, which I am listing below.

Tips:

  • Use the Python ‘requests’ library.
    • I have yet to figure out how to make async calls: can this library do async as well, or would I have to use something else?
  • Sending JSON is the way to go.
    • Don’t even try sending anything else.
  • Pandas has great functionality for converting Series/DataFrames to JSON.
    • The ‘to_json’ function is very flexible, including options such as orient=‘records’.
  • Python has an awesome library called ‘json’ for dealing with JSON data.
    • To deserialize, use json.loads().
    • To convert a dict to JSON, use json.dumps().
    • Note: to preserve the key order, use ‘collections.OrderedDict’.
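
As a rough sketch of the first two tips, this is more or less the shape of a raw JSON POST with ‘requests’. The helper names, the URL you pass in and the way the authorization value is obtained are all placeholders of mine; the actual EventHub endpoint and token format are not shown here.

```python
import json


def build_payload(records):
    """Serialize a list of dicts to a compact JSON array string."""
    return json.dumps(records, separators=(",", ":"))


def post_json(url, records, auth_header=None, timeout=10):
    """POST records as a JSON body and return the requests.Response."""
    # Imported lazily so build_payload is usable even where requests is not installed.
    import requests

    headers = {"Content-Type": "application/json"}
    if auth_header is not None:
        # Hypothetical: whatever token the target service expects goes here.
        headers["Authorization"] = auth_header
    response = requests.post(url, data=build_payload(records),
                             headers=headers, timeout=timeout)
    response.raise_for_status()  # turn 4xx/5xx responses into exceptions
    return response


print(build_payload([{"reward": 30, "actionname": "x"}]))
```

A call would then look something like post_json(url, records, auth_header=token), where url and token come from your own service configuration. Note that requests.post blocks until the response arrives.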

Check this out:


import collections
import json

myj = '[{"reward":30,"actionname":"x","age":60,"gender":"M","weight":150,"Scored Labels":30.9928596354},{"reward":20,"actionname":"y","age":60,"gender":"M","weight":150,"Scored Labels":19.0217225957}]'

# object_pairs_hook keeps the keys in the order they appear in the JSON text
myj_l = json.loads(myj, object_pairs_hook=collections.OrderedDict)

myj_l
Out[177]:
[OrderedDict([(u'reward', 30), (u'actionname', u'x'), (u'age', 60), (u'gender', u'M'), (u'weight', 150), (u'Scored Labels', 30.9928596354)]),
 OrderedDict([(u'reward', 20), (u'actionname', u'y'), (u'age', 60), (u'gender', u'M'), (u'weight', 150), (u'Scored Labels', 19.0217225957)])]

# serialize each record back to one JSON string per line
for item in myj_l:
    print(json.dumps(item))

{"reward": 30, "actionname": "x", "age": 60, "gender": "M", "weight": 150, "Scored Labels": 30.9928596354}
{"reward": 20, "actionname": "y", "age": 60, "gender": "M", "weight": 150, "Scored Labels": 19.0217225957}

References:

Code: