Showing posts from August, 2016

Hadoop - How to setup a Hadoop Cluster

Below is the step-by-step guide I used to set up a Hadoop cluster.
Scenario
3 VMs involved:

1) NameNode, ResourceManager - Host name: NameNode.net
2) DataNode 1 - Host name: DataNode1.net
3) DataNode 2 - Host name: DataNode2.net


Pre-requisite
1) You can create a new Hadoop user or use an existing user, but make sure the user has access to the Hadoop installation on ALL nodes

2) Install Java. Refer here for a good version. In this guide, Java is installed at /usr/java/latest

3) Download a stable version of Hadoop from Apache Mirrors

This guide is based on Hadoop 2.7.1 and assumes that we have created a user called hadoop


Set up Passphraseless SSH from NameNode to all Nodes.
1) Run the command

ssh-keygen
This command asks a series of questions; accepting the defaults is fine. It then creates a private key (id_rsa) and a public key (id_rsa.pub) in the user's .ssh directory (/home/hadoop/.ssh)

2) Copy the public key to all Nodes with the following

ssh-copy-id -i /home/h…
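
As a side note on step 1, the key pair can also be generated without answering any prompts. The following is only a minimal sketch: the function name generate_key_pair is hypothetical, and it assumes an RSA key with an empty passphrase is acceptable in your environment.

```shell
# Hypothetical helper (not part of the original guide): create an RSA key
# pair in the given directory without any interactive prompts.
generate_key_pair() {
    key_dir="$1"

    # ssh expects the key directory to be private to the user.
    mkdir -p "$key_dir" && chmod 700 "$key_dir" || return 1

    # Do not overwrite an existing key.
    if [ ! -f "$key_dir/id_rsa" ]; then
        # -q: quiet, -t rsa: key type, -N "": empty passphrase, -f: output file
        ssh-keygen -q -t rsa -N "" -f "$key_dir/id_rsa"
    fi
}
```

Calling generate_key_pair "$HOME/.ssh" as the hadoop user should leave id_rsa and id_rsa.pub in the same location that the interactive command uses by default.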

JAVA - _JAVA_OPTIONS and JAVA_TOOL_OPTIONS environment variable

JAVA_TOOL_OPTIONS and _JAVA_OPTIONS are two useful environment variables that let you set JVM options through the environment rather than on the command line. However, they differ slightly.

1. Precedence - From my testing, the precedence (order of evaluation) is

_JAVA_OPTIONS > Command line > JAVA_TOOL_OPTIONS

This gives _JAVA_OPTIONS and JAVA_TOOL_OPTIONS different use cases.

Use _JAVA_OPTIONS to override JVM options that have already been defined on the command line.

Use JAVA_TOOL_OPTIONS to add JVM options to a predefined command line without overriding what it already sets.
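
The precedence can be checked by setting the same system property from all three sources and printing the value the JVM ends up with. This is only a sketch: the property name demo.source and the function name show_effective_value are made up for the demonstration, and it assumes a java binary on the PATH that supports -XshowSettings:properties.

```shell
# Hypothetical property name, used only for this demonstration.
PROP="demo.source"

# Set the same property via JAVA_TOOL_OPTIONS, _JAVA_OPTIONS and the
# command line, then print the value the JVM actually uses.
show_effective_value() {
    if ! command -v java >/dev/null 2>&1; then
        echo "java not found on PATH" >&2
        return 1
    fi

    # -XshowSettings:properties writes the system properties to stderr,
    # so stderr is redirected before filtering for the demo property.
    JAVA_TOOL_OPTIONS="-D$PROP=JAVA_TOOL_OPTIONS" \
    _JAVA_OPTIONS="-D$PROP=_JAVA_OPTIONS" \
    java "-D$PROP=command-line" -XshowSettings:properties -version 2>&1 \
        | grep -F "$PROP = "
}
```

Running show_effective_value prints the line for demo.source; the value on that line names the source that took precedence, which can be compared against the order given above.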

2. Documentation - JAVA_TOOL_OPTIONS is well documented, but _JAVA_OPTIONS is not. So, _JAVA_OPTIONS may not be officially supported.

3. Support - _JAVA_OPTIONS is Oracle-specific. The IBM Java equivalent is IBM_JAVA_OPTIONS. JAVA_TOOL_OPTIONS is platform-independent.

Reference:

1. JAVA_TOOL_OPTIONS
2. IBM_JAVA_OPTIONS
3. http://stackoverflow.com/questions…