Hadoop The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. Install Hadoop on Ubuntu 14.04 In this chapter, we'll install a single-node Hadoop cluster backed by the Hadoop Distributed File System on Ubuntu. Installing Updates: Open terminal and run command $sudo apt-get update Installing Java: $ sudo apt-get install default-jdk Check java Version $java -version Installing SSH $sudo apt-get install ssh check ssh $which ssh Create and Setup SSH Certificates Hadoop requires SSH access to manage its nodes, i.e. remote machines plus our local machine. For our single-node setup of Hadoop, we therefore need to configure SSH access to localhost. So, we need to have SSH up and running on our machine and configured it to allow SSH public key authentication. $ssh-keygen -t rsa This command adds the