# Day: March 24, 2022

## Limitations of Map Reduce Libraries

Here are some of the limitations of using Map Reduce libraries. We cannot refer to attributes by name directly. Functions are scattered and lack consistency. Readability and maintainability are addressed by libraries such as Pandas. Libraries such as PySpark take care of scalability.

Note: We use the approach of loops or …
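To make the first limitation concrete, here is a minimal sketch using one record in the comma-separated format of the orders data. The column names are an assumption for illustration; the text does not list them.

```python
# One record in the comma-separated format used for the orders data.
order = '1,2013-07-25 00:00:00.0,11599,CLOSED'

# With plain strings and tuples, a field is reachable only by its position.
fields = order.split(',')
order_status = fields[3]

# Assumed column names (not given in the text), attached by hand for readability.
columns = ['order_id', 'order_date', 'order_customer_id', 'order_status']
order_dict = dict(zip(columns, fields))

print(order_status)                # CLOSED
print(order_dict['order_status'])  # CLOSED
```

Every function that touches a record has to repeat this kind of bookkeeping, which is the readability cost the section describes.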

## Revenue Per Order Using Itertools

Get revenue per order using the `order_items` data set.

In [1]: `%run 02_preparing_data_sets.ipynb`

In [2]: `order_items[:4]`

Out[2]: `['1,1,957,1,299.98,299.98', '2,2,1073,1,199.99,199.99', '3,2,502,5,250.0,50.0', '4,2,403,1,129.99,129.99']`

In [3]: `order_subtotals = map(lambda oi: (int(oi.split(',')[1]), float(oi.split(',')[4])), order_items)`

In [4]: `list(order_subtotals)[:3]`

Out[4]: `[(1, 299.98), (2, 199.99), (2, 250.0)]`

In [5]: `order_subtotals = map(lambda oi: (int(oi.split(',')[1]), float(oi.split(',')[4])), order_items)` followed by `order_subtotals_sorted = sorted(order_subtotals)`

In [6]: `order_subtotals_sorted[:3]`

Out[6]: [(1, …
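The excerpt stops at the sorting step. One way the computation could continue is sketched below, using only the four sample records shown in Out[2]. The `parse` helper and the rounding to two decimals are assumptions added for illustration, not part of the original notebook.

```python
from itertools import groupby
from operator import itemgetter

# The four sample records shown in Out[2]; the notebook loads the full data set.
order_items = [
    '1,1,957,1,299.98,299.98',
    '2,2,1073,1,199.99,199.99',
    '3,2,502,5,250.0,50.0',
    '4,2,403,1,129.99,129.99',
]

def parse(oi):
    """Hypothetical helper: return (order_id, subtotal) for one record."""
    fields = oi.split(',')
    return int(fields[1]), float(fields[4])

# groupby only merges adjacent items, so sort by order id first.
order_subtotals_sorted = sorted(map(parse, order_items))

# Sum the subtotals within each group of equal order ids.
revenue_per_order = [
    (order_id, round(sum(subtotal for _, subtotal in group), 2))
    for order_id, group in groupby(order_subtotals_sorted, key=itemgetter(0))
]

print(revenue_per_order)  # [(1, 299.98), (2, 579.98)]
```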

## Order Count by Status Using Itertools

Get count by order status using the `orders` data set.

In [1]: `%run 02_preparing_data_sets.ipynb`

In [2]: `orders[:3]`

Out[2]: `['1,2013-07-25 00:00:00.0,11599,CLOSED', '2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT', '3,2013-07-25 00:00:00.0,12111,COMPLETE']`

In [3]: `orders_sorted = sorted(orders, key=lambda k: k.split(',')[3])`

In [4]: `orders_sorted[:3]`

Out[4]: `['50,2013-07-25 00:00:00.0,5225,CANCELED', '112,2013-07-26 00:00:00.0,5375,CANCELED', '527,2013-07-28 00:00:00.0,5426,CANCELED']`

In [5]: `import itertools as iter`

In [6]: `orders_grouped = iter.groupby(orders_sorted, lambda order: order.split(',')[3])` …
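The remaining step is to count the records in each group. A minimal sketch follows, assuming a small in-memory list built from the six records visible in Out[2] and Out[4] rather than the full data set. It imports `itertools` under its own name, since the alias `iter` hides Python's built-in `iter` function.

```python
import itertools

# The six records visible in the outputs above; the notebook loads the full file.
orders = [
    '1,2013-07-25 00:00:00.0,11599,CLOSED',
    '2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT',
    '3,2013-07-25 00:00:00.0,12111,COMPLETE',
    '50,2013-07-25 00:00:00.0,5225,CANCELED',
    '112,2013-07-26 00:00:00.0,5375,CANCELED',
    '527,2013-07-28 00:00:00.0,5426,CANCELED',
]

def status_of(order):
    """Hypothetical helper: the status is the fourth comma-separated field."""
    return order.split(',')[3]

# Sort by status so that equal statuses sit next to each other.
orders_sorted = sorted(orders, key=status_of)

# Each group is a one-shot iterator, so it is consumed while counting.
order_count_by_status = [
    (status, sum(1 for _ in group))
    for status, group in itertools.groupby(orders_sorted, key=status_of)
]

print(order_count_by_status)
# [('CANCELED', 3), ('CLOSED', 1), ('COMPLETE', 1), ('PENDING_PAYMENT', 1)]
```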

## Overview of Itertools Groupby

Let us understand how we can use `itertools.groupby` to take care of aggregations by key. `itertools.groupby` groups data by a key. Applying an aggregate function to each group then handles use cases similar to the following: Get count by …
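One detail worth illustrating: `itertools.groupby` only merges adjacent items with equal keys, so the input normally has to be sorted by the key first. The status list below is made-up sample data.

```python
from itertools import groupby

# Made-up sample data with repeated keys that are not adjacent.
statuses = ['CLOSED', 'COMPLETE', 'CLOSED', 'COMPLETE', 'COMPLETE']

# Without sorting, the same key shows up in several separate groups.
unsorted_counts = [(key, len(list(group))) for key, group in groupby(statuses)]

# After sorting, each key forms exactly one group.
sorted_counts = [(key, len(list(group))) for key, group in groupby(sorted(statuses))]

print(unsorted_counts)
# [('CLOSED', 1), ('COMPLETE', 1), ('CLOSED', 1), ('COMPLETE', 2)]
print(sorted_counts)
# [('CLOSED', 2), ('COMPLETE', 3)]
```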

## Using Itertools Starmap

Let us understand the usage of `starmap`. We will first create a collection with sales and commission percentages. Using that collection, compute the total commission amount. If the commission percentage is None or not present, treat it as 0. Each element in the collection should be a tuple. First element is the sales …
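A minimal sketch of this exercise is shown below. The sample tuples and the `commission` helper are assumptions for illustration. The second tuple element is taken to be the commission percentage, and a one-element tuple stands for "not present".

```python
from itertools import starmap

# Hypothetical sample data: (sales, commission_pct) tuples.
transactions = [
    (1000, 10),    # 10% commission
    (1500, None),  # commission explicitly None
    (2000, 5),     # 5% commission
    (750,),        # commission not present at all
]

def commission(sales, commission_pct=None):
    """Return the commission amount, treating a missing percentage as 0."""
    pct = commission_pct if commission_pct is not None else 0
    return sales * pct / 100

# starmap unpacks each tuple into the function's positional arguments,
# so commission(1000, 10), commission(1500, None), ... are called in turn.
total_commission = sum(starmap(commission, transactions))

print(total_commission)  # 200.0
```

Compared with `map`, which would pass each tuple as a single argument, `starmap` lets the function declare its parameters by name.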

## Cumulative Operations Using Itertools

Get cumulative sales from a list of transactions.

In [1]: `ns = [1, 2, 3, 4]`, with the expected results noted as comments: `# Cumulative totals [1, 3, 6, 10]` and `# Cumulative product [1, 2, 6, 24]`

In [2]: `import itertools as iter`

In [3]: `iter.accumulate?`

Init signature: `iter.accumulate(iterable, func=None, *, initial=None)`

Docstring: Return series of accumulated sums (or other binary function …
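The two results listed in the first cell can be reproduced with `itertools.accumulate` as sketched here. Using `operator.mul` for the product is one reasonable choice and is not confirmed by the excerpt.

```python
from itertools import accumulate
import operator

ns = [1, 2, 3, 4]

# With no function given, accumulate yields running sums.
cumulative_totals = list(accumulate(ns))

# Passing a binary function changes the operation; here, a running product.
cumulative_product = list(accumulate(ns, operator.mul))

print(cumulative_totals)   # [1, 3, 6, 10]
print(cumulative_product)  # [1, 2, 6, 24]
```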