Day: March 24, 2022

Libraries of Map-Reduce Libraries

Limitations of Map Reduce Libraries

Here are some of the limitations of using Map Reduce libraries. We cannot refer to attributes by name directly. Functions are scattered and lack consistency. Libraries such as Pandas address readability and maintainability. Libraries such as PySpark take care of scalability. Note: We use the approach of loops or …
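To illustrate the first limitation, here is a minimal sketch of positional access on a comma-delimited record. The sample record is one of the order_items lines quoted elsewhere on this page; treating index 1 as the order id and index 4 as the subtotal is an assumption based on how those positions are used in that excerpt.

```python
# One record from the order_items sample shown on this page.
record = '3,2,502,5,250.0,50.0'

# After splitting, every attribute is reachable only by position.
fields = record.split(',')

# Assumption: index 1 is the order id and index 4 is the subtotal.
order_id = int(fields[1])
subtotal = float(fields[4])

print(order_id, subtotal)
```

Nothing in the code says what `fields[1]` or `fields[4]` mean, which is the readability problem that named columns are meant to solve.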


Revenue Per Order Using Itertools

Revenue per order using itertools

Get revenue per order using the order_items data set.

In [1]: %run 02_preparing_data_sets.ipynb

In [2]: order_items[:4]
Out[2]: ['1,1,957,1,299.98,299.98', '2,2,1073,1,199.99,199.99', '3,2,502,5,250.0,50.0', '4,2,403,1,129.99,129.99']

In [3]: order_subtotals = map(lambda oi: (int(oi.split(',')[1]), float(oi.split(',')[4])), order_items)

In [4]: list(order_subtotals)[:3]
Out[4]: [(1, 299.98), (2, 199.99), (2, 250.0)]

In [5]: order_subtotals = map(lambda oi: (int(oi.split(',')[1]), float(oi.split(',')[4])), order_items)
        order_subtotals_sorted = sorted(order_subtotals)

In [6]: order_subtotals_sorted[:3]
Out[6]: [(1, …
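The excerpt stops after the sorting step. One plausible way to finish the computation is sketched below, assuming the post goes on to group the sorted pairs with `itertools.groupby`; that continuation is an assumption, and only the four sample records shown above are used instead of the full data set.

```python
import itertools

# The four sample records from Out[2]; the real data set is larger.
order_items = [
    '1,1,957,1,299.98,299.98',
    '2,2,1073,1,199.99,199.99',
    '3,2,502,5,250.0,50.0',
    '4,2,403,1,129.99,129.99',
]

# Build (order_id, subtotal) pairs and sort them so that equal
# order ids are adjacent, which groupby requires.
order_subtotals_sorted = sorted(
    (int(oi.split(',')[1]), float(oi.split(',')[4])) for oi in order_items
)

# Sum the subtotals within each run of identical order ids.
revenue_per_order = [
    (order_id, round(sum(subtotal for _, subtotal in group), 2))
    for order_id, group in itertools.groupby(
        order_subtotals_sorted, key=lambda rec: rec[0]
    )
]

print(revenue_per_order)
```

The sort matters: `groupby` only merges consecutive items with the same key, so unsorted input would yield several partial groups per order.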


Order Count by Status Using Itertools

Order Count by Status using itertools

Get count by order status using the orders data set.

In [1]: %run 02_preparing_data_sets.ipynb

In [2]: orders[:3]
Out[2]: ['1,2013-07-25 00:00:00.0,11599,CLOSED', '2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT', '3,2013-07-25 00:00:00.0,12111,COMPLETE']

In [3]: orders_sorted = sorted(orders, key=lambda k: k.split(',')[3])

In [4]: orders_sorted[:3]
Out[4]: ['50,2013-07-25 00:00:00.0,5225,CANCELED', '112,2013-07-26 00:00:00.0,5375,CANCELED', '527,2013-07-28 00:00:00.0,5426,CANCELED']

In [5]: import itertools as iter

In [6]: orders_grouped = iter.groupby(orders_sorted, lambda order: order.split(',')[3]) …
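The excerpt ends at the `groupby` call. A minimal sketch of how the counts could then be derived follows; the counting step is an assumption about where the post goes next, and the input is just the six sample records visible in Out[2] and Out[4], not the full data set. The module is imported under its own name here to avoid shadowing the built-in `iter`.

```python
import itertools

# Six sample records taken from the outputs shown above.
orders = [
    '1,2013-07-25 00:00:00.0,11599,CLOSED',
    '2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT',
    '3,2013-07-25 00:00:00.0,12111,COMPLETE',
    '50,2013-07-25 00:00:00.0,5225,CANCELED',
    '112,2013-07-26 00:00:00.0,5375,CANCELED',
    '527,2013-07-28 00:00:00.0,5426,CANCELED',
]


def status_of(order):
    """Return the status field (fourth column) of an order record."""
    return order.split(',')[3]


# groupby needs records with the same status to be adjacent.
orders_sorted = sorted(orders, key=status_of)

# Each group is a one-shot iterator, so it is counted as it is consumed.
count_by_status = [
    (status, sum(1 for _ in group))
    for status, group in itertools.groupby(orders_sorted, key=status_of)
]

print(count_by_status)
```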


Cumulative Operations Using Itertools

Cumulative Operations using itertools

Get cumulative sales from a list of transactions.

In [1]: ns = [1, 2, 3, 4]
        # Cumulative totals [1, 3, 6, 10]
        # Cumulative product [1, 2, 6, 24]

In [2]: import itertools as iter

In [3]: iter.accumulate?
Init signature: iter.accumulate(iterable, func=None, *, initial=None)
Docstring: Return series of accumulated sums (or other binary function …
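As a small sketch of the signature shown above, the following reproduces the cumulative totals and cumulative product from the comments in the first cell. Using `operator.mul` as the binary function is one option chosen here for illustration; the `initial` keyword requires Python 3.8 or later.

```python
import itertools
import operator

ns = [1, 2, 3, 4]

# With no func argument, accumulate produces running sums.
cumulative_totals = list(itertools.accumulate(ns))

# Passing a binary function changes the operation, here multiplication.
cumulative_product = list(itertools.accumulate(ns, operator.mul))

# initial prepends a starting value, so the result has one extra element.
totals_from_zero = list(itertools.accumulate(ns, initial=0))

print(cumulative_totals)
print(cumulative_product)
print(totals_from_zero)
```

`accumulate` returns a lazy iterator, so it is wrapped in `list` to see all values at once.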
