Setting up winutils.exe and Data sets

Setting up winutils.exe on Windows (64-bit)

The video above shows how to set up winutils.exe.

  • This step is needed only on Windows, not on Mac or Linux.
  • To download winutils.exe, click here.
  • Unzip the downloaded folder, then create a new folder named hadoop on the C drive.
  • Inside hadoop, create a new folder named bin and paste winutils.exe into it.
  • Open the environment variables settings. Under System variables, click New and enter HADOOP_HOME as the variable name and C:\hadoop as the variable value.
  • Edit the Path variable and add C:\hadoop\bin as a new entry.
  • Open a new Command Prompt and enter winutils.exe to check that it is accessible.
  • The winutils.exe setup is now complete.
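The checks in the steps above can also be scripted. The following is a minimal sketch, not part of the original walkthrough: the function name check_winutils is hypothetical, and it assumes the layout described above (HADOOP_HOME pointing at a folder that contains bin\winutils.exe, with that bin folder listed in PATH).

```python
import os
from pathlib import Path


def check_winutils(hadoop_home=None):
    """Return a list of problems; an empty list means the setup looks fine.

    hadoop_home is optional (hypothetical helper parameter); when omitted,
    the HADOOP_HOME environment variable is used.
    """
    home = hadoop_home or os.environ.get("HADOOP_HOME")
    if not home:
        return ["HADOOP_HOME is not set"]

    problems = []
    bin_dir = Path(home) / "bin"

    # winutils.exe is expected directly inside <HADOOP_HOME>\bin
    exe = bin_dir / "winutils.exe"
    if not exe.is_file():
        problems.append(f"winutils.exe not found at {exe}")

    # Compare normalized paths so case and trailing slashes do not matter
    def norm(p):
        return os.path.normcase(os.path.normpath(str(p)))

    path_entries = [norm(p) for p in os.environ.get("PATH", "").split(os.pathsep) if p]
    if norm(bin_dir) not in path_entries:
        problems.append(f"{bin_dir} is not listed in PATH")

    return problems


if __name__ == "__main__":
    issues = check_winutils()
    print("winutils.exe setup looks fine" if not issues else "\n".join(issues))
```

Run the script from a newly opened Command Prompt, since environment variable changes are not picked up by windows that were already open.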

Setting up Data Sets – retail_db

The video above shows how to set up the data sets.

  • To access the data sets, click here.
  • Download the repository as a zip file.
  • Unzip the file, create a new folder on the C drive, and copy all the files into it.
  • Open the retail_db folder; it contains several data sets.
  • Open any of the data sets in retail_db to confirm that it contains data.
  • Make sure this data set is in place before building Spark-based applications.
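A quick way to confirm the data is in place is to list the files under retail_db and peek at the first line of each. This is a rough sketch under assumptions: the function name summarize_retail_db is hypothetical, and C:\data stands in for whatever folder you created on the C drive, since the steps above do not name it.

```python
from pathlib import Path


def summarize_retail_db(data_root):
    """Map each file under <data_root>/retail_db to its first line of text."""
    retail_db = Path(data_root) / "retail_db"
    if not retail_db.is_dir():
        raise FileNotFoundError(f"{retail_db} does not exist")

    summary = {}
    for path in sorted(retail_db.rglob("*")):
        if path.is_file():
            # errors="replace" keeps the check from failing on odd bytes
            with path.open("r", encoding="utf-8", errors="replace") as f:
                first_line = f.readline().rstrip("\r\n")
            summary[path.relative_to(retail_db).as_posix()] = first_line
    return summary


if __name__ == "__main__":
    data_root = r"C:\data"  # assumption: replace with the folder you created
    try:
        for name, first_line in summarize_retail_db(data_root).items():
            print(f"{name}: {first_line}")
    except FileNotFoundError as err:
        print(err)
```

If the script prints at least one file name followed by a non-empty line, the data set has been copied correctly.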