Web Server Log Analysis with Apache Spark

References:

 

 


Balanced Ternary

I was solving a math problem about representing every natural number as a sum or difference of distinct powers of 3. For example, 5 = 9 - 3 - 1.

Interestingly, this led me to a numeral system called ‘Balanced Ternary’, a base-3 system whose digits are -1, 0 and 1. Check it out!

Exploring this problem gave me interesting insights into the base representation of numbers, something I have kept on the back burner for a long while. I finally got a chance to follow up on it.
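Here is a minimal sketch of the idea, not the code from the original post. It assumes the usual conversion trick: take the remainder mod 3 and treat a remainder of 2 as the digit -1 with a carry into the next position. The function names `to_balanced_ternary` and `as_signed_powers` are made up for illustration.

```python
def to_balanced_ternary(n):
    """Return the balanced ternary digits of n, least significant first.

    Each digit is -1, 0 or 1, so n == sum(d * 3**i for i, d in enumerate(digits)).
    """
    digits = []
    while n != 0:
        r = n % 3
        if r == 2:
            # A remainder of 2 is written as -1, carrying 1 into the next digit.
            r = -1
        digits.append(r)
        n = (n - r) // 3
    return digits or [0]


def as_signed_powers(n):
    """Return the distinct signed powers of 3 that add up to n."""
    return [d * 3**i for i, d in enumerate(to_balanced_ternary(n)) if d != 0]


print(as_signed_powers(5))  # [-1, -3, 9], i.e. 5 = 9 - 3 - 1
```

Because every digit is -1, 0 or 1, each power of 3 appears at most once, which is exactly the "distinct powers" condition in the problem.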

References:

Problem:

Code:

 

‘map’ in python

In some of my previous posts (Google while coding…, Functional tools in python.., Applying operations over dataframes) I have noted the use of ‘map’ in pandas.

‘map’ is used in Python in the following scenarios:

  • as a built-in functional operator in Python
    • Applies the function to the items of the argument sequence(s) and
      returns the results (a list in Python 2, a lazy iterator in Python 3)
  • as an element-wise method on a pandas Series
  • as an element-wise transformation on an RDD in pyspark
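As a quick illustration of the first scenario, here is a small sketch that assumes Python 3. When the built-in is given several sequences, it passes one item from each to the function, so no explicit zip is needed; the variable names `lazy` and `largest` are just for this example.

```python
a = [1, 2, 3, 4, 5]
b = [2, 2, 9, 0, 9]

# map pulls one item from each sequence per call: max(1, 2), max(2, 2), ...
lazy = map(max, a, b)

# In Python 3 the result is a one-shot iterator, so materialise it with list().
largest = list(lazy)
print(largest)  # [2, 2, 9, 4, 9]
```

Once the iterator has been consumed, calling list() on it again gives an empty list, which is a common surprise when moving code over from Python 2.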

 

Code:


#------PYTHON---------
#https://bradmontgomery.net/blog/pythons-zip-map-and-lambda/
#Problem: you've got two collections of values and you need to keep the largest (or smallest) of each pair. These could be metrics from two different systems, stock quotes from two different services, or just about anything.

a = [1, 2, 3, 4, 5]
b = [2, 2, 9, 0, 9]

#Approach 1.
maxval = []
for i in range(len(a)):
    if a[i] >= b[i]:
        maxval.append(a[i])
    else:
        maxval.append(b[i])

#Approach 2.
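#list() and the print() call make this work on both Python 2 and Python 3 (where map returns an iterator)
print(list(map(lambda pair: max(pair), zip(a, b))))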

#---------PANDAS----------
#http://stackoverflow.com/questions/19798153/difference-between-map-applymap-and-apply-methods-in-pandas
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(5, 3), columns=list('bde'), index=['Seattle', 'Utah', 'Ohio', 'Texas', 'Oregon'])

#1. Using DataFrame 'apply'
#Applies a function that operates on 1D arrays to each column (the default) or each row (axis=1).
f = lambda x: x.max() - x.min()
df.apply(f)

#2. Using DataFrame 'applymap'
#Element-wise Python functions can be used, too. Suppose you wanted to compute a formatted string from each floating point value in the frame. You can do this with applymap.
#Note: pandas 2.1+ deprecates applymap in favour of the equivalent DataFrame.map.
formatf = lambda x: '%.2f' % x
df.applymap(formatf)

#3. Using Series 'map'
formatf = lambda x: '%.2f' % x
df['f'] = df['e'].map(formatf)

#Summing up: apply works on a row/column basis of a DataFrame, applymap works element-wise on a DataFrame, and map works element-wise on a Series.