# Calculating Pi using distributed clusters.

I came across an interesting concept: computing the value of Pi using distributed clusters.
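A natural reason Pi estimation distributes well is Monte Carlo sampling: each node samples independently, and only the hit counts need to be combined. The original post's exact approach isn't reproduced here; this is a hedged, single-machine sketch of that map-reduce shape (the function names are my own):

```python
import random

def count_hits(n_samples, seed):
    """One node's share of work: sample points in the unit square and
    count how many land inside the quarter circle of radius 1."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits

def estimate_pi(total_samples=400_000, nodes=4):
    """Split the sampling across 'nodes' and combine the counts.
    On a real cluster, each count_hits call would run on a separate machine."""
    per_node = total_samples // nodes
    total_hits = sum(count_hits(per_node, seed=node) for node in range(nodes))
    # (area of quarter circle) / (area of unit square) = pi / 4
    return 4.0 * total_hits / (per_node * nodes)
```

Since only an integer per node crosses the network, the aggregation step is trivial, which is what makes this approach embarrassingly parallel.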

References:

Code:

# Techniques for Approximate Counting

In many scenarios, we can live with approximate counts.

There are a few techniques in particular which are increasingly used in online scenarios. For example, services like Redshift, Redis, and Spark have built-in support for these data structures.

1. HyperLogLog.
2. Count-Min Sketch.
3. Bloom Filters.
4. Locality-Sensitive Hashing.
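Of the four, Count-Min Sketch is perhaps the simplest to build from scratch: a small 2-D array of counters, with each item hashed to one cell per row, and the estimate taken as the minimum over rows, so collisions can over-count but never under-count. A minimal illustration (not a production implementation; real ones use faster hashes):

```python
import hashlib

class CountMinSketch:
    def __init__(self, width=1000, depth=5):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _indexes(self, item):
        # Derive one column index per row from a salted hash of the item.
        for row in range(self.depth):
            digest = hashlib.sha256(f"{row}:{item}".encode()).hexdigest()
            yield row, int(digest, 16) % self.width

    def add(self, item, count=1):
        for row, col in self._indexes(item):
            self.table[row][col] += count

    def estimate(self, item):
        # The minimum over rows is an upper bound on the true count.
        return min(self.table[row][col] for row, col in self._indexes(item))
```

Memory is fixed (`width * depth` counters) regardless of how many distinct items are added, which is the whole point of the sketch.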

References:

Code:

# Algorithms on Parentheses. Catalan Numbers.

There are a few interesting questions I came across recently related to parentheses.

These gave me interesting insights into a range of algorithms, and also introduced me to Catalan numbers.

Questions:

• Balanced Brackets.
• Given “n”, generate all valid parentheses strings of length “2n”.
• Check for balanced parentheses in an expression.
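The second question is where Catalan numbers appear: the number of valid parentheses strings of length 2n is the nth Catalan number, C(n) = C(2n, n) / (n + 1). A backtracking sketch for the generation question, plus a simple depth-counter check for the balance questions:

```python
from math import comb

def generate_parens(n):
    """Generate all valid parentheses strings of length 2n via backtracking."""
    results = []
    def backtrack(s, opened, closed):
        if len(s) == 2 * n:
            results.append(s)
            return
        if opened < n:              # we can still open a bracket
            backtrack(s + "(", opened + 1, closed)
        if closed < opened:         # we can only close what is already open
            backtrack(s + ")", opened, closed + 1)
    backtrack("", 0, 0)
    return results

def catalan(n):
    """nth Catalan number: counts exactly the strings generate_parens(n) emits."""
    return comb(2 * n, n) // (n + 1)

def is_balanced(s):
    """Balance check for a single bracket type: depth must never go negative."""
    depth = 0
    for ch in s:
        depth += 1 if ch == "(" else -1
        if depth < 0:
            return False
    return depth == 0
```

For multiple bracket types ("Balanced Brackets" with `()[]{}`), the depth counter generalizes to a stack of expected closers; the single-type version above keeps the idea visible.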

References:

Code:

# Understanding the SynchronizationContext and Best Practices in Asynchronous Programming.

To do asynchronous programming in C#, one needs to understand the concept of SynchronizationContext.

I came across this nice set of posts by Stephen Cleary, where he explains some of the fundamental design choices and best practices.

What prompted this line of exploration was a deadlock I hit while working on a web app using the async pattern. This problem is described here.
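Cleary's posts are about C#, but the shape of the deadlock carries over to any single-threaded scheduling context: code blocks the only thread while waiting for work that can only run on that same thread. Here is a Python analogue (not the actual C# repro), with a one-worker thread pool standing in for the single-threaded SynchronizationContext:

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError

executor = ThreadPoolExecutor(max_workers=1)  # stand-in for a single-threaded context

def outer():
    # inner is queued behind outer on the same single worker...
    inner = executor.submit(lambda: 42)
    # ...so blocking here waits for work that can never start: a deadlock.
    # The timeout exists only so the example terminates.
    return inner.result(timeout=0.5)

future = executor.submit(outer)
try:
    future.result()
    deadlocked = False
except TimeoutError:
    deadlocked = True
executor.shutdown()
```

The C# fix Cleary advocates has the same spirit: don't block synchronously on async work from the context it needs (await all the way down), or free the continuation from the context with `ConfigureAwait(false)`.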

References:

# Comparing ML algos: Multi-Armed Bandit, Contextual Bandit, Logistic Regression, Online Learning

We have a system running Multi-Armed Bandit.

So when it came to selecting the next generation of ML algo to try out, we had a few choices:

1. Multi-Armed Bandit (we had this running)
• This entails ranking the items based on their respective conversion rates up to that point in time.
2. Contextual Bandit
• We use Vowpal Wabbit for this.
• Internally, Vowpal Wabbit treats contextual bandits in two distinct ways:
• Without Action Dependent Features (non-ADF)
• With Action Dependent Features (ADF)
• In non-ADF mode, VW creates multiple models (i.e. one model per class).
• In ADF mode, VW creates a single model.
3. Logistic Regression.
• This entails reducing the problem to a binary classification problem.
• Then we use the model to score the items, and finally rank the items by model score.
4. Online ML
• Again treating this as a binary classification problem, except this time we update the model in an online fashion.
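The baseline in option 1 (rank by conversion rate so far) can be sketched as an epsilon-greedy bandit. This is my own minimal illustration of that idea, not our production system; the class and parameter names are hypothetical:

```python
import random

class ConversionRateBandit:
    """Rank arms by observed conversion rate, exploring with probability epsilon."""

    def __init__(self, arms, epsilon=0.1):
        self.epsilon = epsilon
        self.shows = {arm: 0 for arm in arms}
        self.conversions = {arm: 0 for arm in arms}

    def record(self, arm, converted):
        """Log one impression and whether it converted."""
        self.shows[arm] += 1
        self.conversions[arm] += int(converted)

    def rate(self, arm):
        return self.conversions[arm] / self.shows[arm] if self.shows[arm] else 0.0

    def rank(self):
        arms = list(self.shows)
        if random.random() < self.epsilon:
            random.shuffle(arms)  # explore: random order
            return arms
        return sorted(arms, key=self.rate, reverse=True)  # exploit: best rate first
```

The contextual bandit and logistic regression options differ from this mainly in that the score is a function of features rather than a single per-item running average.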

Interestingly, on the dataset I was using I didn’t see much of a difference in algorithmic performance across the 4 different algorithms above.

Code:

_trials_compare3

# Consistency Guarantees

• Strong: reads always see the most recent committed write.
• Bounded staleness: reads may lag writes, but by at most a configured time window or number of versions.
• Session: within a single client session, reads honor read-your-writes and monotonic reads.
• Eventual: replicas converge over time; reads may temporarily see stale or out-of-order writes.