Sai

Run Other Python Notebook Demo

In [1]: %run 07_preparing_data_sets.ipynb

In [2]: orders[:10]
Out[2]:
['1,2013-07-25 00:00:00.0,11599,CLOSED',
 '2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT',
 '3,2013-07-25 00:00:00.0,12111,COMPLETE',
 '4,2013-07-25 00:00:00.0,8827,CLOSED',
 '5,2013-07-25 00:00:00.0,11318,COMPLETE',
 '6,2013-07-25 00:00:00.0,7130,COMPLETE',
 '7,2013-07-25 00:00:00.0,4530,COMPLETE',
 '8,2013-07-25 00:00:00.0,2911,PROCESSING',
 '9,2013-07-25 00:00:00.0,5657,PENDING_PAYMENT',
 '10,2013-07-25 00:00:00.0,5648,PENDING_PAYMENT']

In [3]: order_items[:10]
Out[3]:
['1,1,957,1,299.98,299.98',
 '2,2,1073,1,199.99,199.99',
 '3,2,502,5,250.0,50.0',
 '4,2,403,1,129.99,129.99',
 '5,4,897,2,49.98,24.99',
 '6,4,365,5,299.95,59.99',
 '7,4,502,3,150.0,50.0',
 '8,4,1014,4,199.92,49.98',
 '9,5,957,1,299.98,299.98',
 '10,5,365,5,299.95,59.99']

Preparing Data Sets

We will primarily use the orders and order_items data sets to understand how to manipulate collections.

orders is available at path /data/retail_db/orders/part-00000
order_items is available at path /data/retail_db/order_items/part-00000

orders – columns
order_id – it is of type integer and unique
order_date – it can be considered as string
order_customer_id – it is of …
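As a minimal sketch of how the two data sets might be loaded into Python lists of strings: the helper name read_records is hypothetical (it is not from the original notebook), and the files exist only on a machine where retail_db has been set up at the paths given above, so the load is guarded by an existence check.

```python
from pathlib import Path


def read_records(path):
    """Return the lines of a text file as a list of strings, without newlines."""
    with open(path) as fh:
        return fh.read().splitlines()


# Paths quoted from the text above; they may not exist on your machine.
ORDERS_PATH = '/data/retail_db/orders/part-00000'
ORDER_ITEMS_PATH = '/data/retail_db/order_items/part-00000'

if Path(ORDERS_PATH).exists() and Path(ORDER_ITEMS_PATH).exists():
    orders = read_records(ORDERS_PATH)
    order_items = read_records(ORDER_ITEMS_PATH)
    # Each element is one comma-separated record.
    print(orders[:3])
    print(order_items[:3])
```

Each element of the resulting list is one raw record, so individual columns are obtained by splitting the string on ",".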


Filtering Data

Let us perform a few tasks to understand how to filter the data in collections using loops and conditionals. Here are the details about orders.

Data is in text file format.
Each line in the file contains one record.
Each record contains 4 attributes, separated by ",":
order_id
order_date
order_customer_id
order_status

In [1]: …
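A small sketch of filtering with a loop and a conditional, using the first five order records shown in the notebook demo on this page. The function name filter_by_status and the choice of order_status as the filter column are assumptions made for illustration; the original post may use different tasks.

```python
# Sample records copied from the notebook demo output on this page.
orders = [
    '1,2013-07-25 00:00:00.0,11599,CLOSED',
    '2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT',
    '3,2013-07-25 00:00:00.0,12111,COMPLETE',
    '4,2013-07-25 00:00:00.0,8827,CLOSED',
    '5,2013-07-25 00:00:00.0,11318,COMPLETE',
]


def filter_by_status(records, status):
    """Return the records whose fourth attribute (order_status) equals status."""
    matching = []
    for record in records:
        # Attributes are separated by ","; order_status is at index 3.
        order_status = record.split(',')[3]
        if order_status == status:
            matching.append(record)
    return matching


complete_orders = filter_by_status(orders, 'COMPLETE')
print(complete_orders)
# ['3,2013-07-25 00:00:00.0,12111,COMPLETE', '5,2013-07-25 00:00:00.0,11318,COMPLETE']
```

The same pattern works for any of the four attributes: split the record, pick the index, and compare inside the if.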


Row Level Transformations

Let us understand how to perform row level transformations using the orders data set. Here are the details about orders.

Data is in text file format.
Each line in the file contains one record.
Each record contains 4 attributes, separated by ",":
order_id
order_date
order_customer_id
order_status

In [1]: %%sh
ls -ltr /data/retail_db/orders/part-00000
…
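To illustrate what a row level transformation on these records can look like, here is a hedged sketch that turns each record into one derived value: the order date as an integer in yyyymmdd form. The function name get_order_date_as_int and this particular transformation are assumptions for illustration, not necessarily what the original post does.

```python
# Sample records copied from the notebook demo output on this page.
orders = [
    '1,2013-07-25 00:00:00.0,11599,CLOSED',
    '2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT',
    '3,2013-07-25 00:00:00.0,12111,COMPLETE',
]


def get_order_date_as_int(record):
    """Extract order_date from one record and return it as an int (yyyymmdd)."""
    order_date = record.split(',')[1]      # e.g. '2013-07-25 00:00:00.0'
    date_part = order_date[:10]            # keep only 'yyyy-mm-dd'
    return int(date_part.replace('-', ''))


# Apply the transformation to every row; the output has one value per input row.
order_dates = []
for order in orders:
    order_dates.append(get_order_date_as_int(order))

print(order_dates)
# [20130725, 20130725, 20130725]
```

A row level transformation produces exactly one output element for each input record, which is why the loop appends once per iteration without any condition.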
