Wednesday, August 21, 2013

LVM setup - Some Hadoop stories

When creating our first cluster we didn't listen to the advice. We made a number of mistakes:

We assigned the Hadoop data and log directories to subdirectories within the root filesystem (/). It ran fine for some months until HDFS finally filled up. HDFS consumed the entire space in /, leaving the OS to die a slow death.
I had other empty partitions available, which I configured as additional locations in the dfs.data.dir property.
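
For reference, a DataNode can spread its blocks across several directories by listing them comma-separated in hdfs-site.xml (dfs.data.dir in Hadoop 1.x). The paths below are illustrative, not the ones from our cluster:

<property>
  <name>dfs.data.dir</name>
  <!-- comma-separated list; the DataNode round-robins new blocks across these -->
  <value>/data/1/dfs/dn,/data/2/dfs/dn</value>
</property>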

But that couldn't save my root fs. Frequent reads and writes by HDFS into / fragmented the filesystem so badly that a good chunk of the free space became unrecoverable. At last we decided to pull down the cluster.

Learning from the experience, we decided there are two problems with the conventional approach to setting up a Hadoop node.

1) We need a mechanism to add extra hard disks to the nodes and grow their storage without affecting the cluster's existing filesystem. We also want to be able to reclaim the allocated space when the cluster is not heavily used. The solution was to go for LVM - Logical Volume Management.

2) Never, ever use the root filesystem for storing Hadoop's log and data directories.

With this we delved into setting up LVM on each node of our cluster.

I basically followed the instructions given in
http://www.howtoforge.com/linux_lvm

Some commands to get familiar with for the exercise are:

fdisk
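
For example, assuming the new disk shows up as /dev/sdb (device names here are illustrative), fdisk carves out a partition and tags it as type 8e (Linux LVM):

fdisk /dev/sdb
# n  -> create a new primary partition (accept the defaults to use the whole disk)
# t  -> set the partition type to 8e (Linux LVM)
# w  -> write the partition table and exit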

create and display physical volume

pvcreate
pvdisplay
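
Continuing the same assumed layout, the new partition /dev/sdb1 is registered as a physical volume:

pvcreate /dev/sdb1    # initialize the partition for use by LVM
pvdisplay             # confirm the new physical volume and its size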

create and display volume group

vgcreate
vgdisplay
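
Following the howtoforge tutorial's naming, the volume group here is called fileserver (any name works):

vgcreate fileserver /dev/sdb1    # pool the physical volume into a volume group
vgdisplay                        # check the group's total and free extents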

create and display logical volume

lvcreate
lvdisplay
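
A logical volume named media (matching the device paths used below) can then be carved out of the group; the initial 40G size is just an example:

lvcreate -n media -L 40G fileserver    # create a 40GB logical volume in 'fileserver'
lvdisplay                              # verify /dev/fileserver/media exists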

mkfs
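
Finally the volume gets a filesystem and a mount point, assuming ext3 here (resize2fs, used below, works on ext2/ext3/ext4):

mkfs.ext3 /dev/fileserver/media            # create the filesystem on the logical volume
mkdir -p /media/data1                      # mount point referenced later in this post
mount /dev/fileserver/media /media/data1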


Once these logical volumes are created, we can dynamically increase and decrease their size by issuing commands like:

lvextend -L80G /dev/fileserver/media
extends the volume from its current size to 80GB

and then grow the filesystem to fill the new space -
resize2fs /dev/fileserver/media

For reducing the volume:
first unmount it
umount /media/data1
then check and shrink the filesystem (resize2fs refuses to shrink a filesystem without a clean fsck first)
e2fsck -f /dev/fileserver/media
resize2fs /dev/fileserver/media 70G
shrinks the fs to 70GB

lvreduce -L70G /dev/fileserver/media
reduces the volume by 10GB, down to 70GB

That way, if our HDFS runs out of storage, we can add disks on the fly and keep growing the logical volumes.
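
For completeness, a sketch of that grow-on-the-fly path, assuming the new disk appears as /dev/sdc and already has an LVM partition /dev/sdc1:

pvcreate /dev/sdc1                        # register the new disk with LVM
vgextend fileserver /dev/sdc1             # add it to the existing volume group
lvextend -L +100G /dev/fileserver/media   # grow the logical volume by 100GB
resize2fs /dev/fileserver/media           # grow the filesystem to use the new space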
