Section 2: 9.Cloudera Quickstart VM – Setup NYSE Dataset in MYSQL
We provide NYSE datasets as part of our state of art labs. However, if you want to setup in Cloudera Quickstart VM for the practice, you can follow the instructions.
- Create users in MYSQL Database available in cloudera Quickstart VM
- Clone data from Github Repository
- Unzip files that are cloned
- Load all the files into the MYSQL table created
Use below gist for your reference gist.github.com
https://gist.github.com/dgadiraju/ce1ccf641911c5d6cb414a0b06b93a06
Create Database
CREATE USER nyse;
CREATE USER 'nyse_user' IDENTIFIED BY 'itversity';
GRANT ALL ON nyse.* TO nyse_user;
GRANT FILE ON *.* TO nyse_user;
GRANT SUPER ON *.* TO nyse_user;
FLUSH PRIVILEGES;
USE nyse;
CREATE TABLE `stock_eod` (
`stockticker` varchar(10) NOT NULL DEFAULT '',
`tradedate` varchar(30) NOT NULL DEFAULT '',
`openprice` decimal(10,2) DEFAULT NULL,
`highprice` decimal(10,2) DEFAULT NULL,
`lowprice` decimal(10,2) DEFAULT NULL,
`closeprice` decimal(10,2) DEFAULT NULL,
`volume` bigint(20) DEFAULT NULL,
PRIMARY KEY (`stockticker`,`tradedate`)
);
Load_nyse.data.sh
# We will be using /data as the reference directory
sudo mkdir -p /data
sudo su - root
cd /data
git clone https://github.com/dgadiraju/nyse_all.git
gunzip /data/nyse_all/nyse_data/
for f in /data/nyse_all/nyse_data/*.txt
do
unlink stock_eod.txt
ln -s $f stock_eod.txt
mysqlimport \
--host=127.0.0.1 \
--user=nyse_user \
--password=itversity \
--fields-terminated-by=',' \
--lines-terminated-by='\n' \
--local \
--lock-tables \
--verbose \
nyse stock_eod.txt
echo "Done: '"$f"' at $(date)"
done