Factory Pattern in Python

I recently used the Factory pattern in Python.

I was a little surprised when comparing the typical usage of this pattern in C# vs. Python.

In particular, in C# you’d typically have an interface defining the methods and then provide an implementation of these methods in the concrete classes.

In Python, there is no interface keyword. In my example I embedded a static method in the base class to select the appropriate derived class.
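
Here is a minimal sketch of that idea. The Shape, Circle and Square names are made up for illustration and are not the classes from my actual code; the only point is the static method on the base class that picks the derived class.

```python
class Shape(object):
    """Base class; plays the role an interface would play in C#."""

    def area(self):
        # Derived classes are expected to override this.
        raise NotImplementedError("subclasses must implement area()")

    @staticmethod
    def create(kind, *args):
        """Factory method: select the derived class from a string key."""
        # Hypothetical registry; Circle and Square are resolved at call time.
        registry = {"circle": Circle, "square": Square}
        if kind not in registry:
            raise ValueError("unknown shape kind: %r" % (kind,))
        return registry[kind](*args)


class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return 3.141592653589793 * self.radius ** 2


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2
```

Callers only ever touch the base class, for example Shape.create("square", 2).area(), and never name the concrete classes directly.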

Web requests in Python

I recently tried making web requests in Python.

I used the urllib2 library to make 10 requests to the Azure Machine Learning web service.

Interestingly, I found that using urllib2 was incurring a lot of latency. I replaced urllib2 with the requests library and boom, the latency improved tremendously.


  • It seems the requests library uses HTTP keep-alive by default. As such, it was not re-initiating the connection for each of the multiple requests. urllib2, on the other hand, was re-initiating the connection for each request.
  • Note: the requests library is still making synchronous calls.
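
To make the keep-alive point concrete, here is a rough sketch of sending several requests over one shared session and timing each one. As far as I know, connection reuse in requests is tied to a Session object, so the sketch takes one explicitly. The timed_posts helper and the example.invalid URL are hypothetical, not the code or endpoint I actually used.

```python
import time


def timed_posts(session, url, payloads):
    """POST each payload through one shared session.

    Returns the elapsed seconds for each request, so the cost of the first
    request (which opens the connection) can be compared with later ones.
    """
    timings = []
    for payload in payloads:
        start = time.time()
        response = session.post(url, json=payload)
        response.raise_for_status()  # fail loudly on HTTP errors
        timings.append(time.time() - start)
    return timings


# Usage sketch (needs the third-party requests package and a real endpoint):
#
#   import requests
#   with requests.Session() as session:
#       timings = timed_posts(session, "https://example.invalid/score",
#                             [{"x": 1}] * 10)
```

The calls are still synchronous; the session only avoids setting up a new connection for every request.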

REST Calls in Python. JSON. Pandas.

I recently had to make REST calls in Python to send data to Azure EventHub.

In this particular case I could not use the Python SDK to talk to EventHub. As I wrote the code to make the raw REST calls, I came across several gems. I am listing them below.


  • Use the Python ‘requests’ library.
    • I have yet to figure out how to make async calls. Can I use this library for async as well, or would I have to use something else?
  • Sending JSON is the way to go.
    • Don’t even try sending anything else.
  • Pandas has great functionality to convert Series/DataFrames to JSON.
    • The ‘to_json’ function has awesome functionality, including orient by ‘records’ etc.
  • Python has an awesome library called ‘json’ to deal with JSON data.
    • To deserialize, use json.loads().
    • To convert a dict to JSON, use json.dumps().
    • Note: if you want to preserve the key order, you have to use ‘collections.OrderedDict’.
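
To illustrate the to_json and json points from the list, here is a small sketch. The DataFrame contents are made-up sample values, loosely modeled on the fields used later in this post, and it assumes pandas is installed.

```python
import json

import pandas as pd

# Hypothetical sample rows; column order is fixed explicitly.
df = pd.DataFrame(
    [[30, "x", 60], [20, "y", 60]],
    columns=["reward", "actionname", "age"],
)

# orient="records" emits a JSON array with one object per row,
# which is a convenient shape for a REST request body.
payload = df.to_json(orient="records")

# json.loads turns the JSON string back into a list of dicts.
rows = json.loads(payload)
```

The payload string can then be sent as the request body, while rows is handy for inspecting or post-processing individual records.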

Check this out:

import collections
import json

myj = '[{"reward":30,"actionname":"x","age":60,"gender":"M","weight":150,"Scored Labels":30.9928596354},{"reward":20,"actionname":"y","age":60,"gender":"M","weight":150,"Scored Labels":19.0217225957}]'

myj_l = json.loads(myj, object_pairs_hook=collections.OrderedDict)

[OrderedDict([(u'reward', 30), (u'actionname', u'x'), (u'age', 60), (u'gender', u'M'), (u'weight', 150), (u'Scored Labels', 30.9928596354)]),
 OrderedDict([(u'reward', 20), (u'actionname', u'y'), (u'age', 60), (u'gender', u'M'), (u'weight', 150), (u'Scored Labels', 19.0217225957)])]

# Each OrderedDict serializes back to JSON with its key order intact.
for item in myj_l:
    print(json.dumps(item))

{"reward": 30, "actionname": "x", "age": 60, "gender": "M", "weight": 150, "Scored Labels": 30.9928596354}
{"reward": 20, "actionname": "y", "age": 60, "gender": "M", "weight": 150, "Scored Labels": 19.0217225957}



K-means Clustering

One of my friends recently asked me about the K-means algorithm.

  • How does it work?
  • What are the typical applications, i.e. where and how is it used in the industry?

In the discussion that followed, we ended up playing around with several available visualizations that do an awesome job of explaining this technique.

We also hacked around with some code from Joel Grus’ book (Data Science from Scratch) to develop more intuition for the K-means algorithm.
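
For the "how does it work" question, here is a bare-bones sketch of the usual K-means loop: assign each point to its nearest centroid, move each centroid to the mean of its points, and repeat until nothing changes. This is my own simplified version, not the code from the book; the function names, the fixed random seed and the choice of starting centroids from the input points are all assumptions made for the sketch.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Component-wise mean of a non-empty list of tuples."""
    n = float(len(points))
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, iterations=100, seed=0):
    """Return (centroids, assignments) for a list of numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k of the input points
    assignments = None
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for every point.
        new_assignments = [
            min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for i in range(k):
            members = [p for p, a in zip(points, assignments) if a == i]
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = mean_point(members)
    return centroids, assignments
```

For an image, the points would be the (R, G, B) tuples of the pixels, and each resulting centroid is one representative color.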




We got some very interesting insights from trying to use K-means to cluster an image containing different colors: