Installing HBase and Configuring It for Single-Node and Distributed Modes
Installing HBase
HBase can be installed in any of three modes: Standalone, Pseudo-Distributed, and Fully Distributed.
Installing HBase in Standalone Mode
Download the latest stable version of HBase from https://www.interior-dsgn.com/apache/hbase/stable/ using the “wget” command, and extract it using the tar “zxvf” command, as shown below.
$ cd /home/(username)/
$ wget https://mirror.nexcess.net/apache/hbase/stable/hbase-1.0.0-bin.tar.gz
$ tar -zxvf hbase-1.0.0-bin.tar.gz
Rename the extracted HBase folder to /home/(username)/Hbase as shown below.
$ mv hbase-1.0.0/ Hbase
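The extract and rename steps can be wrapped in a small helper so that they are repeatable. This is only a minimal sketch and not part of the HBase distribution: the function name unpack_hbase and its arguments are hypothetical, and it assumes that the archive unpacks to a folder named after the archive without the -bin.tar.gz suffix, as hbase-1.0.0-bin.tar.gz does.

```shell
# Hypothetical helper: extract an HBase binary archive and rename the result.
# Usage: unpack_hbase <archive> <target-folder>
unpack_hbase() {
    archive=$1
    target=$2
    # Assumption: hbase-1.0.0-bin.tar.gz unpacks to a folder named hbase-1.0.0
    name=${archive##*/}
    extracted=${name%-bin.tar.gz}
    workdir=$(dirname "$archive")

    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi
    if [ -e "$target" ]; then
        echo "target already exists: $target" >&2
        return 1
    fi

    # Extract next to the archive, then move the folder to its final name.
    tar -zxf "$archive" -C "$workdir" || return 1
    mv "$workdir/$extracted" "$target"
}

# Example (run after the wget step):
# unpack_hbase "$HOME/hbase-1.0.0-bin.tar.gz" "$HOME/Hbase"
```

Refusing to overwrite an existing target keeps a second run from moving the new folder inside the old one.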
Configuring HBase in Standalone Mode
Before proceeding with HBase, you have to edit the following files and configure HBase.
hbase-env.sh
To set the Java home for HBase, open the hbase-env.sh file in the conf folder. Edit the JAVA_HOME environment variable and change the existing path to your current JAVA_HOME value as shown below.
$ cd /home/(username)/Hbase/conf
$ gedit hbase-env.sh
This will open the hbase-env.sh file of HBase. Now replace the existing JAVA_HOME value with your current value as shown below.
export JAVA_HOME=/opt/java # (your Java path)
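Before saving the file, it can help to confirm that the path really is a Java installation. The following is a minimal sketch; check_java_home is a hypothetical helper name, and the only thing it tests is that an executable bin/java exists under the given folder.

```shell
# Hypothetical helper: succeed only if the given folder looks like a Java home.
check_java_home() {
    if [ -x "$1/bin/java" ]; then
        echo "JAVA_HOME looks valid: $1"
    else
        echo "no executable bin/java under: $1" >&2
        return 1
    fi
}

# Example: check the value currently exported in the environment, if any.
if [ -n "${JAVA_HOME:-}" ]; then
    check_java_home "$JAVA_HOME" || true
fi
```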
hbase-site.xml
This is the main configuration file of HBase. Set the data directory to an appropriate location by opening the HBase home folder in /usr/local/HBase. Inside the conf folder, you will find several files. Open the hbase-site.xml file as shown below.
$ gedit hbase-site.xml
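Inside the hbase-site.xml file, you will find the <configuration> and </configuration> tags. Within them, set the following properties. The hdfs:// root directory and the hbase.cluster.distributed value shown here assume that HDFS is already running on localhost at port 8020.
<property>
<name>hbase.rootdir</name>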
<value>hdfs://localhost:8020/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/(username)/zookeeper</value>
</property>
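The properties must all sit inside a single <configuration> element. As a minimal sketch, the complete file can be written from the shell with a here-document. The helper name write_hbase_site and its two arguments are hypothetical; the property values are the ones used in this tutorial, and any existing file is copied to hbase-site.xml.bak first.

```shell
# Hypothetical helper: write a complete hbase-site.xml into a conf folder.
# Usage: write_hbase_site <conf-folder> <zookeeper-data-folder>
write_hbase_site() {
    conf_dir=$1
    zk_dir=$2

    # Keep a copy of any existing configuration before overwriting it.
    if [ -f "$conf_dir/hbase-site.xml" ]; then
        cp "$conf_dir/hbase-site.xml" "$conf_dir/hbase-site.xml.bak"
    fi

    cat > "$conf_dir/hbase-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:8020/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>$zk_dir</value>
</property>
</configuration>
EOF
}

# Example:
# write_hbase_site "$HOME/Hbase/conf" "$HOME/zookeeper"
```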
With this, the HBase installation and configuration are complete. Start HBase using the start-hbase.sh script provided in the bin folder of HBase. To do so, go to the HBase home folder and run the start script as shown below.
$ ./bin/start-hbase.sh
If everything goes well, the start script prints a message saying that HBase has started.
starting master, logging to /home/username/HBase/bin/../logs/hbase-tpmaster-localhost.localdomain.out
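The log message only says that the start was attempted. One way to confirm that the master is actually running is to look for an HMaster process in the output of jps, the process listing tool that ships with the JDK. This is a minimal sketch; has_daemon is a hypothetical helper that reads jps-style lines of the form "<pid> <name>" from standard input.

```shell
# Hypothetical helper: succeed if <name> appears as a process name in
# jps-style output ("<pid> <name>" per line) read from standard input.
has_daemon() {
    awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# Example: report whether an HMaster process is running (requires jps).
if command -v jps >/dev/null 2>&1; then
    if jps | has_daemon HMaster; then
        echo "HMaster is running"
    else
        echo "HMaster is not running"
    fi
fi
```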
Installing HBase in Pseudo-Distributed Mode
Let us now check how HBase is installed in pseudo-distributed mode.
Configuring HBase
Before proceeding with HBase, configure Hadoop and HDFS on your local system or on a remote system and make sure they are running. Stop HBase if it is running.
hbase-site.xml
Edit hbase-site.xml file to add the following properties.
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
This property specifies the mode in which HBase runs. In the same file, change hbase.rootdir from the local file system to the address of your HDFS instance, using the hdfs:// URI syntax. In this example, HDFS is running on localhost at port 8020.
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:8020/hbase</value>
</property>
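To double-check which values HBase will pick up, the configuration file can be read back from the shell. The following is a rough sketch rather than a real XML parser: get_hbase_property is a hypothetical helper, and it assumes the usual layout in which the <value> tag follows its <name> tag, each on a single line.

```shell
# Hypothetical helper: print the <value> that follows <name>NAME</name>
# in a Hadoop-style XML configuration file.
# Usage: get_hbase_property <file> <property-name>
get_hbase_property() {
    awk -v want="<name>$2</name>" '
        index($0, want) { armed = 1 }
        armed && match($0, /<value>.*<\/value>/) {
            # Skip the 7-character opening tag and the 8-character closing tag.
            print substr($0, RSTART + 7, RLENGTH - 15)
            exit
        }
    ' "$1"
}

# Example:
# get_hbase_property "$HOME/HBase/conf/hbase-site.xml" hbase.rootdir
```

The helper prints nothing if the property is not set in the file.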
Starting HBase
After configuration is over, browse to HBase home folder and start HBase using the following command.
$ cd /home/(username)/HBase
$ bin/start-hbase.sh
Note: Before starting HBase, make sure Hadoop is running.
Checking the HBase Directory in HDFS
HBase creates its directory in HDFS. To see the created directory, go to the Hadoop home folder and type the following command.
$ ./bin/hadoop fs -ls /hbase
If everything goes well, it will give you the following output.
Found 7 items
drwxr-xr-x - hbase users 0 2014-06-25 18:58 /hbase/.tmp
drwxr-xr-x - hbase users 0 2014-06-25 21:49 /hbase/WALs
drwxr-xr-x - hbase users 0 2014-06-25 18:48 /hbase/corrupt
drwxr-xr-x - hbase users 0 2014-06-25 18:58 /hbase/data
-rw-r--r-- 3 hbase users 42 2014-06-25 18:41 /hbase/hbase.id
-rw-r--r-- 3 hbase users 7 2014-06-25 18:41 /hbase/hbase.version
drwxr-xr-x - hbase users 0 2014-06-25 21:49 /hbase/oldWALs
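The same check can be done from a script instead of by eye. The sketch below assumes the eight-column listing format of "hadoop fs -ls" with the path in the last column and no spaces in paths; ls_paths and has_paths are hypothetical helper names.

```shell
# Hypothetical helper: print the path column (last field) of each entry in
# "hadoop fs -ls" output read from standard input, skipping the summary line.
ls_paths() {
    awk '$1 != "Found" && NF >= 8 { print $NF }'
}

# Hypothetical helper: succeed if every path given as an argument is listed.
# Usage: ./bin/hadoop fs -ls /hbase | has_paths /hbase/data /hbase/WALs
has_paths() {
    listing=$(ls_paths)
    for wanted in "$@"; do
        printf '%s\n' "$listing" | grep -qxF "$wanted" || {
            echo "missing: $wanted" >&2
            return 1
        }
    done
}
```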
Starting and Stopping a Master
Using “local-master-backup.sh”, you can start up to 9 backup master servers, which makes 10 masters in total counting the primary. From the HBase home folder, execute the following command to start them. Each number is a port offset from the default master ports and identifies one backup master, so this command starts two.
$ ./bin/local-master-backup.sh start 2 4
To kill a backup master, you need its process ID, which is stored in a file named “/tmp/hbase-USER-X-master.pid”. You can kill the backup master using the following command.
$ cat /tmp/hbase-user-1-master.pid |xargs kill -9
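Signal 9 stops the process immediately and gives it no chance to shut down cleanly. As a gentler alternative, the sketch below sends the default termination signal first and uses kill -9 only if the process is still alive after a timeout. The helper name stop_by_pidfile and the default timeout of 10 seconds are assumptions made for this illustration.

```shell
# Hypothetical helper: stop the process whose ID is stored in a pid file.
# Sends SIGTERM first and falls back to SIGKILL after a timeout.
# Usage: stop_by_pidfile <pid-file> [timeout-seconds]
stop_by_pidfile() {
    pid_file=$1
    timeout=${2:-10}
    if [ ! -r "$pid_file" ]; then
        echo "cannot read $pid_file" >&2
        return 1
    fi
    pid=$(cat "$pid_file")

    # Ask the process to terminate; if that fails, it is already gone.
    kill "$pid" 2>/dev/null || return 0

    # Poll once per second until the process exits or the timeout expires.
    while [ "$timeout" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        timeout=$((timeout - 1))
    done

    # Last resort: force the process to stop.
    kill -9 "$pid" 2>/dev/null
    return 0
}

# Example:
# stop_by_pidfile /tmp/hbase-user-1-master.pid
```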
Starting and Stopping RegionServers
You can run multiple region servers from a single system using the following command.
$ ./bin/local-regionservers.sh start 2 3
To stop a region server, use the following command.
$ ./bin/local-regionservers.sh stop 3
Starting HBaseShell
After installing HBase successfully, you can start the HBase shell. Follow the sequence of steps given below. First, open a terminal and log in as superuser.
Start Hadoop File System
Go to the sbin folder under the Hadoop home folder and start the Hadoop file system as shown below.
$ cd $HADOOP_HOME/sbin
$ ./start-all.sh
Start HBase
Go to the HBase root directory and start HBase using the script in its bin folder.
$ cd /usr/local/HBase
$ ./bin/start-hbase.sh
Start HBase Master Server
Run this from the same directory. The number identifies a specific backup master.
$ ./bin/local-master-backup.sh start 2
Start Region
Start the region server as shown below.
$ ./bin/local-regionservers.sh start 3
Start HBase Shell
You can start HBase shell using the following command.
$ cd bin
$ ./hbase shell
This will give you the HBase Shell Prompt as shown below.
2014-12-09 14:24:27,526 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
HBase Shell; enter 'help' for list of supported commands.
Type "exit " to leave the HBase Shell
Version 0.98.8-hadoop2, r6cfc8d064754251365e070a10a82eb169956d5fe, Fri Nov 14 18:26:29 PST 2014
hbase(main):001:0>
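The shell can also run commands from a file, which is handy for a quick check that the installation responds. The sketch below writes three standard HBase shell commands (status, list, and exit) to a file and passes the file to the shell. The helper name write_smoke_test and the file path are hypothetical, and the example assumes it is run from the bin folder as in the step above.

```shell
# Hypothetical helper: write a small HBase shell script to the given file.
write_smoke_test() {
    cat > "$1" <<'EOF'
status
list
exit
EOF
}

# Example: run the commands if the hbase launcher is in the current folder.
write_smoke_test /tmp/hbase-smoke-test.txt
if [ -x ./hbase ]; then
    ./hbase shell /tmp/hbase-smoke-test.txt
fi
```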
HBase Web Interface
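To access the web interface of HBase, open the following URL in a browser. Port 60010 is the default Master web UI port in HBase releases before 1.0; from 1.0 onward the default is 16010.
http://localhost:60010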
This interface lists your currently running Region servers, backup masters and HBase tables.
[Figure: HBase Region servers and Backup Masters]
[Figure: HBase Tables]
Setting Java Environment
We can also communicate with HBase using Java libraries, but before accessing HBase through the Java API you need to set the classpath for those libraries.
Setting the Classpath
Before proceeding with programming, set the classpath to HBase libraries in .bashrc file. Open .bashrc in any of the editors as shown below.
$ gedit ~/.bashrc
Set classpath for HBase libraries (lib folder in HBase) in it as shown below.
export CLASSPATH=$CLASSPATH:/home/hadoop/hbase/lib/*
This prevents the “class not found” exception when accessing HBase through the Java API.
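Because .bashrc is read every time a new shell starts, a plain append can add the same entry again and again in nested shells. The following sketch appends the entry only when it is not already present. The helper name append_classpath is hypothetical, and the lib path is the one used in this tutorial.

```shell
# Hypothetical helper: print a classpath with <entry> appended to <current>,
# leaving <current> unchanged if it already contains <entry>.
# Usage: append_classpath <current> <entry>
append_classpath() {
    current=$1
    entry=$2
    case ":$current:" in
        *":$entry:"*) printf '%s\n' "$current" ;;
        "::")         printf '%s\n' "$entry" ;;
        *)            printf '%s\n' "$current:$entry" ;;
    esac
}

# Example for ~/.bashrc; the quotes keep the * literal so that Java,
# not the shell, expands it to the jar files in the lib folder.
CLASSPATH=$(append_classpath "${CLASSPATH:-}" '/home/hadoop/hbase/lib/*')
export CLASSPATH
```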