Sneak Peek into Apache Zookeeper

by Abhishek Amralkar, August 28th, 2018

Apache Zookeeper is an open source tool from the Apache Software Foundation, originally developed at Yahoo. Thanks, Yahoo, for Zookeeper.

Zookeeper is written in Java and it is platform independent.

What is a Distributed System?

A distributed system is a group of independent computers that are connected together and appear to users as a single computer. The components communicate over the network by passing messages, and each performs a subset of the work toward a common goal.

Why use a Distributed System?

  • Reliability: The system continues to run even if one or more of its servers fail.
  • Scalability: The system can be scaled out or in horizontally as the workload requires.

Challenges of Distributed Systems

  • Race condition: Occurs when two or more threads (or, in a cluster, processes) access shared data and try to change it at the same time.
  • Deadlock: Occurs when each node in the cluster is waiting for another member of the cluster to release a lock, so none of them can proceed.
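To make the race condition concrete, here is a small Python sketch. It is not Zookeeper code, and the names (`counter`, `state`, `increment`) are made up for illustration. The first part replays a bad interleaving by hand so the lost update is deterministic; the second part shows a lock serializing the same read-modify-write.

```python
import threading

# Part 1: a hand-replayed interleaving of two clients doing "read, add 1, write".
counter = 0
a_seen = counter      # client A reads 0
b_seen = counter      # client B reads 0 before A has written
counter = a_seen + 1  # A writes 1
counter = b_seen + 1  # B also writes 1, so A's update is lost
lost_update_result = counter

# Part 2: the same kind of work with a lock around the read-modify-write.
state = {"value": 0}
lock = threading.Lock()

def increment(times):
    for _ in range(times):
        with lock:  # only one thread at a time may read and write
            state["value"] += 1

threads = [threading.Thread(target=increment, args=(1000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(lost_update_result)  # 1, although two increments were attempted
print(state["value"])      # 2000
```

A lock like this only works inside one process. Across machines, the lock itself has to live in a shared service, which is the kind of job Zookeeper is used for.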

Because Coordinating Distributed Systems is a Zoo

Zookeeper is a high-performance coordination service used by distributed systems. It lets the processes of a distributed system coordinate with each other through a shared hierarchical namespace of data registers.

Why do we need a service like Zookeeper?

1. Configuration Management

2. Synchronization

3. Leader Election

4. Naming Service

5. Notification System

Zookeeper provides an API on top of which all of the above tasks can be implemented.

Zookeeper Architecture

The Zookeeper service runs in 2 modes:

  • Standalone
  • Quorum

In standalone mode, the Zookeeper service runs on a single server and no state is replicated.

In quorum mode, a group of Zookeeper servers, which we call a Zookeeper ensemble, replicates the state, and together they serve client requests.

  1. A collection of Zookeeper servers is called an ensemble.
  2. At any point in time an ensemble has 1 leader, and the rest of the servers are followers. It is good practice to run at least 3 Zookeeper servers in a production ensemble.
  3. A client is connected to exactly one of the Zookeeper servers at any point in time.
  4. Any Zookeeper server in the ensemble can serve client read requests.
  5. All write requests go to the leader. Zookeeper uses the ZAB (Zookeeper Atomic Broadcast) protocol to propagate every write from the leader to the followers.
  6. Zookeeper data is organized as a hierarchical namespace.
  7. Each node in the namespace is called a Znode.
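The "at least 3 servers" advice follows from majority quorums: the ensemble stays available only while more than half of its servers are up. A quick sketch of the arithmetic (the function names are made up for illustration):

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1

def tolerated_failures(ensemble_size):
    """How many servers can fail while a majority is still alive."""
    return ensemble_size - quorum_size(ensemble_size)

for n in (1, 3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
# 1 1 0
# 3 2 1
# 4 3 1
# 5 3 2
```

Note that 4 servers tolerate no more failures than 3, which is why ensembles are usually sized with an odd number of servers.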

Znode:

Every node in the Zookeeper tree is a Znode. Each Znode maintains a stat structure that holds its metadata.

A Znode can be either persistent or ephemeral. As the name says, a persistent Znode remains in Zookeeper until a delete call is made, while an ephemeral Znode is deleted when the session of the client that created it ends. A Znode can also be sequential, in which case Zookeeper appends a unique, increasing number to its name. So there are basically 4 types of Znode: persistent, ephemeral, persistent_sequential and ephemeral_sequential.
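The four types can be played with in a toy in-memory model. This is only a sketch to show the behaviour described above; `TinyTree` and its methods are hypothetical and are not the Zookeeper client API. The per-parent counter is a simplification, since the real server picks the sequence number itself and actual values may differ.

```python
class TinyTree:
    """In-memory toy model of znode create modes (not the real Zookeeper API)."""

    def __init__(self):
        self.nodes = {}      # path -> (data, owner session id or None)
        self.next_seq = {}   # parent path -> next sequence number

    def create(self, path, data, session, ephemeral=False, sequential=False):
        if sequential:
            parent = path.rsplit("/", 1)[0] or "/"
            seq = self.next_seq.get(parent, 0)
            self.next_seq[parent] = seq + 1
            path = f"{path}{seq:010d}"   # 10-digit zero-padded suffix
        if path in self.nodes:
            raise KeyError(f"node already exists: {path}")
        self.nodes[path] = (data, session if ephemeral else None)
        return path

    def close_session(self, session):
        """Ending a session removes every ephemeral znode it created."""
        for path in [p for p, (_, owner) in self.nodes.items() if owner == session]:
            del self.nodes[path]


tree = TinyTree()
tree.create("/app", "config", session=1)                     # persistent
tree.create("/app/worker", "w1", session=1, ephemeral=True)  # ephemeral
first = tree.create("/app/lock-", "", session=1, ephemeral=True, sequential=True)
second = tree.create("/app/lock-", "", session=2, ephemeral=True, sequential=True)
print(first, second)   # /app/lock-0000000000 /app/lock-0000000001

tree.close_session(1)  # session 1 goes away
print(sorted(tree.nodes))  # ['/app', '/app/lock-0000000001']
```

Ephemeral sequential Znodes are the usual building block for leader election and locks: every candidate creates one, and the one holding the lowest number wins until its session ends.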

ZAB:

Write requests that clients send to followers are always forwarded to the leader. The leader executes the request and broadcasts the result of the execution as a state update, in the form of a transaction. A transaction comprises the exact set of changes that a server must apply to the data tree when the transaction is committed. The leader commits a transaction once a quorum (majority) of the servers has acknowledged it.
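The propose, acknowledge and commit flow can be sketched in a few lines. This is a heavily simplified illustration, not the real protocol: it leaves out epochs, recovery and ordering guarantees, and the `Server` class and `broadcast` function are hypothetical names.

```python
class Server:
    def __init__(self, name, alive=True):
        self.name = name
        self.alive = alive
        self.tree = {}       # committed data: path -> value
        self.last_zxid = 0

    def ack(self, zxid, change):
        """A follower acknowledges a proposal only if it is reachable."""
        return self.alive

    def commit(self, zxid, change):
        path, value = change
        self.tree[path] = value
        self.last_zxid = zxid


def broadcast(leader, followers, zxid, change):
    """Leader proposes a transaction; it commits once a majority has acked."""
    ensemble = 1 + len(followers)
    quorum = ensemble // 2 + 1
    acks = 1 + sum(1 for f in followers if f.ack(zxid, change))  # leader counts itself
    if acks < quorum:
        return False
    for server in [leader] + [f for f in followers if f.alive]:
        server.commit(zxid, change)
    return True


leader = Server("s1")
followers = [Server("s2"), Server("s3", alive=False)]
committed = broadcast(leader, followers, 1, ("/abhishek", "anay"))
print(committed, followers[0].tree, followers[1].tree)
# True {'/abhishek': 'anay'} {}

followers[0].alive = False   # now only the leader is left
print(broadcast(leader, followers, 2, ("/abhishek", "changed")))  # False
```

With one follower down the write still commits because 2 of 3 servers agree; with both down the leader alone is not a majority and the write is rejected.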

How to get standalone Zookeeper up and running?

Download the latest stable version of Zookeeper from here. Note that this link may vary from mirror to mirror.

I usually keep my installations in /opt, but you can download it to any location of your choice on your server/machine.

* wget https://archive.apache.org/dist/zookeeper/zookeeper-3.4.13/zookeeper-3.4.13.tar.gz

* tar xzf zookeeper-3.4.13.tar.gz

* cd zookeeper-3.4.13

* cp conf/zoo_sample.cfg conf/zoo.cfg
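The copied zoo.cfg is a plain key=value file. The sketch below parses a few settings of the kind found in the sample file; the values shown are the ones I would expect in zoo_sample.cfg, but treat them as an assumption and check your own copy. `parse_cfg` is a made-up helper, not part of Zookeeper.

```python
# Assumed contents, modelled on the sample config; verify against your own file.
SAMPLE_CFG = """\
# The number of milliseconds of each tick
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/tmp/zookeeper
clientPort=2181
"""

def parse_cfg(text):
    """Parse key=value lines, skipping blanks and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings

cfg = parse_cfg(SAMPLE_CFG)
# initLimit is counted in ticks, so multiply by tickTime to get milliseconds.
init_timeout_ms = int(cfg["tickTime"]) * int(cfg["initLimit"])
print(cfg["clientPort"], cfg["dataDir"], init_timeout_ms)  # 2181 /tmp/zookeeper 20000
```

For anything beyond a quick test it is worth pointing dataDir at a directory outside /tmp, since that is where Zookeeper keeps its snapshots.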

To start the Zookeeper server:

* sudo bin/zkServer.sh start

You should see output like the below if Zookeeper starts successfully:

ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-3.4.13/bin/../conf/zoo.cfg
Starting zookeeper … STARTED
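Besides reading the startup message, you can ask the server itself whether it is healthy. The sketch below sends the four-letter command ruok to the client port and expects imok back. It assumes the server listens on localhost:2181 and that four-letter commands are enabled; newer releases may require whitelisting them in the configuration. `is_zookeeper_ok` is a made-up helper name.

```python
import socket

def is_zookeeper_ok(host="localhost", port=2181, timeout=2.0):
    """Send the four-letter command 'ruok'; a healthy server answers 'imok'."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"ruok")
            reply = conn.recv(16)
    except OSError:
        return False   # refused, timed out, or unreachable
    return reply == b"imok"

if __name__ == "__main__":
    print(is_zookeeper_ok())
```

The function returns False rather than raising when nothing is listening, so it is safe to call from a startup or monitoring script.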

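Zookeeper CLI

Start the CLI with bin/zkCli.sh (it connects to localhost:2181 by default) and run the commands below at its prompt.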

1. To create /abhishek, a persistent Znode

create /abhishek "anay"

The output should look like this:

Created /abhishek

To create a sequential Znode, add the -s flag to the above command; to create an ephemeral Znode, add the -e flag.

2. To get data from Zookeeper

get /abhishek

You should see output like below

anay
cZxid = 0x9
ctime = Sun Nov 11 20:55:34 IST 2018
mZxid = 0x9
mtime = Sun Nov 11 20:55:34 IST 2018
pZxid = 0x9
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 4
numChildren = 0

Let's understand the above output. It is the Znode's data followed by its metadata:

  • anay: The data that the Znode holds.
  • cZxid: Zxid (Zookeeper transaction id) of the change that created this Znode, 0x9 here.
  • ctime: Znode creation time.
  • mZxid: Zxid of the change that last modified this Znode.
  • mtime: Last modified time of the Znode.
  • pZxid: Zxid of the change that last modified the children of this Znode.
  • cversion: Number of changes to the children of this Znode.
  • dataVersion: Number of changes to the data of this Znode.
  • aclVersion: Number of changes to the ACL of this Znode.
  • ephemeralOwner: The session id of the owner of this Znode if it is an ephemeral node. If it is not an ephemeral node, it will be zero.
  • dataLength: Length of the data in the Znode.
  • numChildren: Number of children of this Znode.
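If you capture this output in a script, it is easy to turn into a dictionary. A minimal sketch, assuming the data fits on a single line and the metadata uses the `key = value` layout shown above (`parse_get_output` is a made-up helper):

```python
def parse_get_output(text):
    """Split zkCli 'get' output into the data line and a dict of stat fields."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    data, stat = lines[0], {}
    for line in lines[1:]:
        key, _, value = line.partition(" = ")
        stat[key.strip()] = value.strip()
    return data, stat

OUTPUT = """\
anay
cZxid = 0x9
ctime = Sun Nov 11 20:55:34 IST 2018
mZxid = 0x9
mtime = Sun Nov 11 20:55:34 IST 2018
pZxid = 0x9
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 4
numChildren = 0
"""

data, stat = parse_get_output(OUTPUT)
is_ephemeral = int(stat["ephemeralOwner"], 16) != 0   # zero means persistent
print(data, int(stat["cZxid"], 16), int(stat["dataLength"]), is_ephemeral)
# anay 9 4 False
```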

To create a child Znode:

create /abhishek/anayamralkar loveyouson

3. watch command

A watch shows a notification when the data in a Znode changes. We can set a watch with the get command.

get /abhishek [watch] 1

The output is similar to that of the get command, but a notification is shown when the data in the Znode changes.

get /abhishek [watch] 1
anay
cZxid = 0x9
ctime = Sun Nov 11 20:55:34 IST 2018
mZxid = 0x9
mtime = Sun Nov 11 20:55:34 IST 2018
pZxid = 0x9
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 4
numChildren = 0
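One thing worth knowing is that a watch set this way is a one-time trigger: it fires on the next change and then has to be set again. The toy model below shows that behaviour; `WatchedNode` is a hypothetical class for illustration, not the Zookeeper client API.

```python
class WatchedNode:
    """Toy model of a one-time data watch (not the real client API)."""

    def __init__(self, data):
        self.data = data
        self.watchers = []

    def get(self, watcher=None):
        if watcher is not None:
            self.watchers.append(watcher)   # registered for the next change only
        return self.data

    def set(self, data):
        self.data = data
        fired, self.watchers = self.watchers, []   # clear before notifying
        for notify in fired:
            notify("NodeDataChanged")


events = []
node = WatchedNode("anay")
node.get(watcher=events.append)   # read the data and leave a watch
node.set("first change")          # fires the watch
node.set("second change")         # no watch is registered any more
print(events)  # ['NodeDataChanged']
```

A client that wants to keep following a Znode typically reads it again with a fresh watch each time a notification arrives.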

4. The set command is used to change the data associated with a Znode

set /abhishek anayamralkar
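The set command also accepts an optional version (its usage is `set path data [version]`). When a version is given, the update is applied only if it matches the Znode's current dataVersion, which is how two clients avoid silently overwriting each other. A toy model of that check, with hypothetical names (`VersionedNode`, `BadVersion`) rather than the real client API:

```python
class BadVersion(Exception):
    pass

class VersionedNode:
    def __init__(self, data):
        self.data = data
        self.version = 0     # plays the role of dataVersion

    def set(self, data, expected_version=-1):
        """-1 means 'skip the check'; otherwise the versions must match."""
        if expected_version != -1 and expected_version != self.version:
            raise BadVersion(f"expected {expected_version}, found {self.version}")
        self.data = data
        self.version += 1
        return self.version


node = VersionedNode("anay")
seen = node.version                               # both clients read version 0
node.set("from client A", expected_version=seen)  # succeeds, version becomes 1
rejected = None
try:
    node.set("from client B", expected_version=seen)  # stale version
except BadVersion as err:
    rejected = str(err)
print(node.data, node.version, rejected)
# from client A 1 expected 0, found 1
```

Client B now knows its view was stale and can read the Znode again before retrying.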

5. The delete command deletes a Znode (it must have no children; rmr removes a Znode along with its children)

delete /abhishek

6. The ls command lists the children of a Znode

[zk: localhost:2181(CONNECTED) 31] ls /
[zookeeper, abhishek]
[zk: localhost:2181(CONNECTED) 32] ls /abhishek
[anayamralkar]

7. The stat command shows the metadata (stat structure) of a Znode

stat /

stat /abhishek

8. The help command shows all available commands

ZooKeeper -server host:port cmd args
stat path [watch]
set path data [version]
ls path [watch]
delquota [-n|-b] path
ls2 path [watch]
setAcl path acl
setquota -n|-b val path
history
redo cmdno
printwatches on|off
delete path [version]
sync path
listquota path
rmr path
get path [watch]
create [-s] [-e] path data acl
addauth scheme auth
quit
getAcl path
close
connect host:port

9. The quit command exits the Zookeeper shell

quit