A Linux-based environment is the better choice for setting up Big Data clusters. It provides the following capabilities:
- Ability to connect to remote servers using ssh
- Copy files using scp or rsync
- Use an SSH-based proxy such as sshuttle to access web applications running on servers behind a firewall. Traffic is authenticated over ssh, so no ports need to be opened to the public.
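As a rough sketch of what these capabilities look like in practice, the commands below show one typical invocation of each tool. The login `ubuntu@gateway.example.com`, the subnet `10.0.0.0/16`, and the file and directory names are hypothetical placeholders, not values from this setup. The `run` helper is also an addition for illustration: it only prints each command unless `DRY_RUN=0` is set, so the script is safe to try before a server is available.

```shell
#!/usr/bin/env bash
# All hosts, users, paths, and subnets below are hypothetical placeholders.
REMOTE="ubuntu@gateway.example.com"   # assumed SSH login on a reachable server
SUBNET="10.0.0.0/16"                  # assumed private subnet behind the firewall

# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Open a shell on the remote server.
run ssh "$REMOTE"

# Copy a single file to the remote home directory.
run scp ./cluster.conf "$REMOTE:~/"

# Synchronize a directory (archive mode, verbose, compressed in transit).
run rsync -avz ./scripts/ "$REMOTE:~/scripts/"

# Route traffic for the private subnet through the server over SSH.
run sshuttle -r "$REMOTE" "$SUBNET"
```

With the sshuttle tunnel running, a web application listening on a private address in that subnet can be opened from a local browser without exposing its port publicly.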
Setup Process
Here are the instructions for setting up Ubuntu using the Windows Subsystem for Linux (WSL).
- Open PowerShell as Administrator and run this command:
Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Windows-Subsystem-Linux
- Restart the computer when prompted so that the feature takes effect.
- Go to the Microsoft Store, search for Ubuntu, and install it.
- Click Launch and complete the setup process by providing a username and password.
- Install sshuttle using apt-get (sudo is needed because the Ubuntu user created during setup is not root):
sudo apt-get install sshuttle -y
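Once the installation finishes, it can be useful to confirm that the tools mentioned earlier are available inside the Ubuntu shell. The following is a minimal sketch; the `have_tool` helper is a hypothetical name introduced here, and it relies only on the standard `command -v` shell builtin.

```shell
#!/usr/bin/env bash
# have_tool is a hypothetical helper: succeeds if the command is on PATH.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

# Report the status of each tool used in this setup.
for tool in ssh scp rsync sshuttle; do
  if have_tool "$tool"; then
    echo "$tool: found"
  else
    echo "$tool: missing"
  fi
done
```

Any tool reported as missing can usually be installed with apt-get in the same way as sshuttle.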