University of Victoria
Faculty of Engineering
Coop Workterm Report

Hadoop Distributed File System Propagation Adapter for Nimbus

Department of Physics
University of Victoria
Victoria, BC

Matthew Vliet
V00644304
Computer [email protected] 31, 2010

In partial fulfilment of the requirements of the
Bachelor of Engineering Degree

Contents

1 Report Specification
    1.1 Audience
    1.2 Prerequisites
    1.3 Purpose
2 Introduction
    2.1 Hadoop Distributed File System
3 HDFS Propagation Adapter
    3.1 Propagation
    3.2 Unpropagation
4 Future Work
5 Conclusion
6 Acknowledgements
7 Glossary
Appendices
A Source Code
    A.1 propagate_hdfs.py

List of Figures

1 Example request and retrieval of a file from HDFS
2 HDFS vs. SCP vs. HTTP file transfer speeds
3 hadoop source test command
4 hadoop copyToLocal command
5 hadoop copyFromLocal command

Hadoop Distributed File System Propagation Adapter for Nimbus
Matthew Vliet - [email protected] 31, 2010

Abstract

Nimbus is an open source toolkit providing Infrastructure as a Service (IaaS) capabilities to a Linux cluster. Nimbus, as with all IaaS software, needs a method of storing virtual machine disk images in a reliable and easily accessible manner. The current bottleneck preventing Nimbus from scaling to very large deployments is the lack of a high-performance and reliable method for distributing and storing virtual machine images. The Hadoop Distributed File System (HDFS) is introduced as a possible solution to alleviate the bottleneck and allow Nimbus to scale to very large IaaS deployments. HDFS provides a fault-tolerant distributed file system that can easily be adapted to function as an image storage and distribution system for Nimbus.

1 Report Specification

1.1 Audience

The intended audience of this report is the members of the High Energy Physics Research Computing group at the University of Victoria, and future co-op students within the group. Users of the Nimbus toolkit may also be interested, as the work directly relates to that software.

1.2 Prerequisites

It is assumed that the readers of this document are familiar with the general concepts of virtualization, grid computing, compute clusters, networking, and file systems. A general knowledge of the Nimbus[1] toolkit will also be helpful.

1.3 Purpose

The contents of this report are intended to explain how the Hadoop Distributed File System (HDFS[2]) can be used to alleviate a performance bottleneck observed in the virtual machine image distribution mechanisms currently used by the Nimbus toolkit. By adapting Nimbus to use HDFS as a storage and distribution mechanism, the performance bottleneck can be eliminated.
Integration of Nimbus and HDFS is achieved by the implementation of an adapter allowing Nimbus to store and retrieve virtual machine images from HDFS.

2 Introduction

In the field of cloud computing, one of the primary difficulties often encountered is the efficient distribution and management of large VM image files to nodes within the cloud environment. The most commonly employed solution to the issue of image management and distribution is to make use of an image repository where all images are stored and managed. The repository acts as a central location to which cloud users can upload their virtual machine images, and from which machines in the cloud can retrieve the images needed to boot virtual machines. The Nimbus[1] toolkit, used by researchers at the High Energy Physics Research Computing group (HEPRC) at the University of Victoria, is one of the many available toolkits that allow for the creation of Infrastructure-as-a-Service (IaaS) computing environments, and it employs the concept of a central repository for the storage of disk images.

Currently, when Nimbus starts a virtual machine it must first get a copy of the disk image from a remote image repository; this process is referred to as propagation. The default method for the propagation of images from the image repository to a virtual machine manager (VMM) is to use SCP. This means that all VM disk images will be propagated from the single server where the disk images reside. When there are only a handful of concurrent propagation requests to the single repository server there are no issues, as the server load will be relatively low. When there are many tens or hundreds of concurrent propagation requests to the single repository server, the load on the server is too high for the requests to be handled in a timely manner. This report describes a method of using a distributed file system to reduce the load experienced by a single server by spreading the load across the many servers that comprise an HDFS.

2.1 Hadoop Distributed File System

The Hadoop Distributed File System's main goal is to provide a distributed, fault-tolerant file system. HDFS achieves its fault tolerance by splitting up files into blocks of a set size and distributing multiple replicas of those blocks across multiple storage nodes called data-nodes. The distribution of blocks of data takes into account factors such as the physical location of data-nodes as well as where other related blocks of data are stored.
Whenever possible, no two replicas of the same block will be stored on the same data-node.

When accessing a file located on an HDFS, the client talks to a single server referred to as the name-node. The name-node can be thought of as a namespace lookup, as it does not store any of the data blocks; it simply contains a listing of which data-nodes contain which blocks of a file. When the name-node receives a request to access a file, it responds to the client not with the data, but with a list of block IDs and the data-node that contains each block. The client then proceeds to request the individual blocks from the many data-nodes where the blocks are stored. The blocks of data are reassembled on the client side to create the requested file. Since no single server is responsible for transmitting the file to the client, the load is distributed across the many data-nodes that contain the blocks of data. This results in faster file transfer times than those of a single file server using a protocol such as SCP or HTTP. The speed increase is most pronounced when many clients are requesting files from HDFS, as the load is distributed to all data-nodes rather than to a single file server. A simplified example of a client requesting a file from HDFS is shown in Figure 1. In this example a client attempts to retrieve a single file, which has been broken into nine blocks, from an HDFS where there is one name-node and four data-nodes.
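The read path described above can be sketched as a small in-memory model. This is only a toy illustration, not HDFS code: the class and function names (NameNode, DataNode, store_file, read_file) are hypothetical, the round-robin replica placement is a simplification, and choosing the least-busy replica holder stands in for the location-aware choice a real HDFS client makes.

```python
# Toy model of an HDFS read: the name-node only answers "where are the blocks?",
# and the client pulls each block from a data-node and reassembles the file.
# All names here are hypothetical and chosen for illustration only.

class DataNode:
    def __init__(self, name):
        self.name = name
        self.blocks = {}            # block id -> bytes
        self.requests_served = 0

    def read_block(self, block_id):
        self.requests_served += 1
        return self.blocks[block_id]


class NameNode:
    """Holds only metadata: for each file, an ordered list of
    (block id, [data-nodes holding a replica])."""
    def __init__(self):
        self.block_map = {}

    def locate(self, path):
        return self.block_map[path]


def store_file(name_node, data_nodes, path, data, block_size, replication):
    """Split data into blocks and place `replication` replicas of each block
    on distinct data-nodes (simple round-robin placement, an assumption)."""
    if replication > len(data_nodes):
        raise ValueError("replication cannot exceed the number of data-nodes")
    layout = []
    for i, offset in enumerate(range(0, len(data), block_size)):
        block_id = "%s#%d" % (path, i)
        holders = [data_nodes[(i + r) % len(data_nodes)] for r in range(replication)]
        for node in holders:
            node.blocks[block_id] = data[offset:offset + block_size]
        layout.append((block_id, holders))
    name_node.block_map[path] = layout


def read_file(name_node, path):
    """Ask the name-node for block locations, then fetch each block from the
    least-busy replica holder and concatenate the blocks in order."""
    chunks = []
    for block_id, holders in name_node.locate(path):
        node = min(holders, key=lambda n: n.requests_served)
        chunks.append(node.read_block(block_id))
    return b"".join(chunks)


# One name-node, four data-nodes, and a file that splits into nine blocks,
# mirroring the example described in the text.
name_node = NameNode()
data_nodes = [DataNode("dn%d" % i) for i in range(4)]
image = bytes(range(90))
store_file(name_node, data_nodes, "/images/vm.img", image,
           block_size=10, replication=2)
restored = read_file(name_node, "/images/vm.img")
per_node_requests = [n.requests_served for n in data_nodes]
```

In this model the name-node never touches file data, and the nine block requests end up spread over all four data-nodes rather than landing on one server.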

Figure 1: Example request and retrieval of a file from HDFS

To illustrate the improvement that HDFS can provide, the speeds of HDFS, SCP, and HTTP file transfers were compared by simultaneously transmitting a 500 MB file to 11 nodes of a cluster. The results from this test can be seen in Figure 2. The block replication value refers to the number of times that a unique data block is replicated across all data-nodes; a value of 10 would mean there are ten copies of the block scattered across the data-nodes. In this particular experiment, there are ten data-nodes and a single name-node in the test cluster. As can be seen in Figure 2, the fastest transfer times were achieved when using HDFS with a high block replication value.
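A rough piece of arithmetic may help explain the trend. The sketch below uses the figures quoted in the text (a 500 MB file, 11 receiving nodes, 10 data-nodes) and assumes, purely for illustration, that requests spread evenly over the available sources. The replication values in the loop are examples and are not necessarily those plotted in Figure 2; none of this reproduces the measured results.

```python
# Idealised load arithmetic for the experiment described above.
# Numbers taken from the text: 500 MB file, 11 receiving nodes, 10 data-nodes.
# The even-spread assumption is ours; real transfers will not balance perfectly.

FILE_MB = 500
CLIENTS = 11
DATA_NODES = 10


def mb_served_per_source(sources):
    """Data each source must send if the total load splits evenly."""
    return FILE_MB * CLIENTS / float(sources)


def sources_for_block(replication):
    """Number of distinct data-nodes able to serve any one block."""
    return min(replication, DATA_NODES)


single_server_mb = mb_served_per_source(1)       # SCP or HTTP: one server
hdfs_mb = mb_served_per_source(DATA_NODES)       # HDFS: load over ten data-nodes

for replication in (1, 2, 5, 10):                # illustrative values only
    print("replication %2d -> each block available from %2d data-node(s)"
          % (replication, sources_for_block(replication)))
print("single server sends %.0f MB in total" % single_server_mb)
print("each data-node sends about %.0f MB" % hdfs_mb)
```

Under these assumptions a single SCP or HTTP server must send the whole file eleven times, while a higher replication value gives each client more data-nodes to choose from for any given block.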

Figure 2: HDFS vs. SCP vs. HTTP file transfer speeds

3 HDFS Propagation Adapter

In order for Nimbus to communicate with HDFS, a propagation adapter must be written to translate Nimbus propagation requests into requests for files from HDFS. Several methods of interfacing Nimbus with HDFS were examined, with only one method being viable for use in a propagation adapter for Nimbus. The explored methods included using the native Java HDFS library, a C language library named libhdfs[4], a language-independent Thrift[5] gateway server, and finally the hadoop command line program supplied by a Hadoop install.

Because workspace-control is written in Python, the method used to interface with HDFS needs to be accessible from a Python script. This immediately removes the possibility of using the native Java package easily or efficiently. The C library was initially tried, but it was found to take significant work to write an error-free wrapper allowing the library to be called from Python. Another downside of the C library is that it is actually just a wrapper around the native Java library. The Thrift gateway server was also rejected, as its read and write methods force UTF-8 encoding, making it useless for binary data. This left the hadoop command line program as the only option for an interface to HDFS.

The hadoop program is quite simple to use and is easily incorporated into a propagation adapter. Among its many features, the hadoop program provides an interface that bridges the gap between a POSIX system and HDFS. Among the available commands, only copyToLocal, copyFromLocal, and test are needed by the HDFS propagation adapter.

3.1 Propagation

When attempting to propagate an image from an HDFS repository onto a VMM, the propagation adapter first checks the propagation source to see if the image exists. This is done by calling the hadoop program and utilizing the test command as seen in Figure 3. The -e flag instructs the program to test whether the specified path exists on the specified HDFS.

    hadoop fs -fs hdfs://<name-node> -test -e <path to image>

Figure 3: hadoop source test command

Once the source has been confirmed to exist, the adapter runs the copyToLocal command to transfer a copy of the image from the specified HDFS to a location local to the VMM. The format of this command is seen in Figure 4. Once the file transfer has completed, the image has been propagated.

    hadoop fs -fs hdfs://<name-node> -copyToLocal <path to image> <local path>

Figure 4: hadoop copyToLocal command

3.2 Unpropagation

Unpropagation is a feature supported by Nimbus that allows a virtual machine image to be saved back to a repository after it has been used to execute a VM. The HDFS propagation adapter supports unpropagation with one restriction: the unpropagation destination must not already exist. In other words, there cannot be a file on the HDFS with the same path and name as the requested destination for unpropagation. The naming restriction is imposed as a safeguard to prevent overwriting existing data on the HDFS.

Once the destination is confirmed to not exist, unpropagation to an HDFS repository is accomplished by running the copyFromLocal command. This command transfers a file from the local system to a remote HDFS. The structure of this command can be seen in Figure 5.

    hadoop fs -fs hdfs://<name-node> -copyFromLocal <local path> <remote destination>

Figure 5: hadoop copyFromLocal command

4 Future Work

Future work for the HDFS propagation adapter will mainly consist of maintenance and bug fixes. The adapter is complete; however, future work can take place on better integration of HDFS with Nimbus. In the current version of Nimbus, HDFS is supported as a propagation scheme, but Nimbus is not intimately aware of the HDFS itself.
In future versions of Nimbus it will be possible to give the Nimbus service more control over the management of disk images stored on the HDFS.

5 Conclusion

In order to compare the performance benefit of using HDFS versus the previous methods of image propagation, a fully working propagation adapter was written to allow Nimbus to communicate with HDFS. This adapter provides the basic functionality needed for Nimbus to retrieve and store images on an HDFS. Although the provided functionality works and is all that is needed, the current implementation does not provide tight integration with all of the Nimbus tools. The project is considered a success in that the initial requirements of allowing Nimbus to retrieve and store images on HDFS have been met. There is a large amount of room to further improve the system, as mentioned in the Future Work section.

6 Acknowledgements

I would like to thank Dr. Randall Sobie for this work term opportunity. Additional thanks to Ian Gable for guidance on this work term, the Nimbus developers, and all members of the HEPRC group.

7 Glossary

Hadoop: A map-reduce based data analysis environment developed by Apache.
HDFS: Hadoop Distributed File System. The default file system bundled with a Hadoop installation.
Nimbus: EC2 compatible VM management software aimed at the scientific community.
Node: A physical computer within a cluster.
SCP: Secure Copy Protocol. A secure remote file transfer utility.
UTF-8: A form of Unicode text encoding.
VM: Virtual Machine.
VMM: VM Manager. A piece of software that manages the creation and termination of VMs.

References

[1] Nimbus toolkit
[2] Hadoop Distributed File System
[3] Hadoop
[4] libhdfs
[5] Apache Thrift

Appendices

A Source Code

A.1 propagate_hdfs.py

    from commands import getstatusoutput
    import os
    from time import time

    from propagate_adapter import PropagationAdapter
    from workspacecontrol.api.exceptions import *


    class propadapter(PropagationAdapter):
        """Propagation adapter for HDFS.

        Image file must exist on HDFS to validate propagation
        Image file must not exist on HDFS to validate unpropagation
        """

        def __init__(self, params, common):
            PropagationAdapter.__init__(self, params, common)
            self.hadoop = None             # Hadoop executable location
            self.parsed_source_url = None  # Source URL when propagating
            self.parsed_dest_url = None    # Destination URL when unpropagating

        def validate(self):
            self.c.log.debug("Validating hdfs propagation adapter")

            self.hadoop = self.p.get_conf_or_none("propagation", "hdfs")
            if not self.hadoop:
                raise InvalidConfig("no path to hadoop")

            # Expand any environment variables first
            self.hadoop = os.path.expandvars(self.hadoop)
            if os.path.isabs(self.hadoop):
                # The tails of the next two messages were cut off in the printed
                # listing; they are completed by analogy with the else branch.
                if not os.access(self.hadoop, os.F_OK):
                    raise InvalidConfig("HDFS resolves to an absolute path, but it does not seem to exist: '%s'" % self.hadoop)
                if not os.access(self.hadoop, os.X_OK):
                    raise InvalidConfig("HDFS resolves to an absolute path, but it does not seem executable: '%s'" % self.hadoop)
            else:
                raise InvalidConfig("HDFS contains unknown environment variables: '%s'" % self.hadoop)

            self.c.log.debug("HDFS configured: %s" % self.hadoop)

        def validate_propagate_source(self, imagestr):
            # Validate uri format
            if imagestr[:7] != 'hdfs://':
                raise InvalidInput("Invalid hdfs url, must be of the form hdfs://")

            # Validate file exists on filesystem
            cmd = self._generate_hdfs_test_cmd(imagestr)
            try:
                status, output = getstatusoutput(cmd)
            except:
                errmsg = "HDFS validation - unknown error when checking that file exists."
                self.c.log.error(errmsg)
                raise UnexpectedError(errmsg)

            # 0 returned if file exists
            # 1 returned if file does not exist
            if status:
                errmsg = "HDFS validation - file does not exist on hdfs."
                self.c.log.error(errmsg)
                raise InvalidInput(errmsg)

        def validate_unpropagate_target(self, imagestr):
            # Validate uri format
            if imagestr[:7] != 'hdfs://':
                raise InvalidInput("Invalid hdfs url, must be of the form hdfs://")

            # Validate file does not exist on filesystem already
            cmd = self._generate_hdfs_test_cmd(imagestr)
            try:
                status, output = getstatusoutput(cmd)
            except:
                errmsg = "HDFS validation - unknown error when checking that directory exists."
                self.c.log.error(errmsg)
                raise UnexpectedError(errmsg)

            # 0 returned if file exists
            # 1 returned if file does not exist
            if not status:
                errmsg = "HDFS validation - File already exists at destination: %s" % imagestr
                self.c.log.error(errmsg)
                raise InvalidInput(errmsg)

        def propagate(self, remote_source, local_absolute_target):
            # The logging-call prefixes were cut off in the printed listing;
            # self.c.log.info is assumed here and in unpropagate.
            self.c.log.info("HDFS propagation - remote source: %s" % remote_source)
            self.c.log.info("HDFS propagation - local target: %s" % local_absolute_target)

            cmd = self._generate_hdfs_pull_cmd(remote_source, local_absolute_target)
            self.c.log.info("Running HDFS command: %s" % cmd)

            transfer_time = -time()
            try:
                status, output = getstatusoutput(cmd)
            except:
                errmsg = "HDFS propagation - unknown error. Propagation failed"
                self.c.log.error(errmsg)
                raise UnexpectedError(errmsg)

            if status:
                errmsg = "problem running command: '%s' ::: return code" % cmd
                errmsg += ": %d ::: output:\n%s" % (status, output)
                self.c.log.error(errmsg)
                raise UnexpectedError(errmsg)
            else:
                transfer_time += time()
                self.c.log.info("Transfer complete: %fs" % round(transfer_time))

        def unpropagate(self, local_absolute_source, remote_target):
            # Uses this method's own parameter names; the printed listing
            # referred to names that are not defined in this scope.
            self.c.log.info("HDFS unpropagation - local source: %s" % local_absolute_source)
            self.c.log.info("HDFS unpropagation - remote target: %s" % remote_target)

            cmd = self._generate_hdfs_push_cmd(remote_target, local_absolute_source)
            self.c.log.info("Running HDFS command: %s" % cmd)

            transfer_time = -time()
            try:
                status, output = getstatusoutput(cmd)
            except:
                errmsg = "HDFS unpropagation - unknown error. Unpropagation failed"
                self.c.log.error(errmsg)
                raise UnexpectedError(errmsg)

            if status:
                errmsg = "problem running command: '%s' ::: return code" % cmd
                errmsg += ": %d ::: output:\n%s" % (status, output)
                self.c.log.error(errmsg)
                raise UnexpectedError(errmsg)
            else:
                transfer_time += time()
                self.c.log.info("Unpropagation transfer complete: %fs" % round(transfer_time))

        # ------------------------------
        # Private helper functions
        # ------------------------------

        def _generate_hdfs_pull_cmd(self, remote_target, local_absolute_target):
            # Generate command in the form of:
            # /path/to/hadoop/bin/hadoop fs -fs <file system uri> -copyToLocal <src> <localdst>
            if not self.parsed_source_url:
                self.parsed_source_url = self._parse_url(remote_target)

            ops = [self.hadoop, "fs",
                   "-fs", self.parsed_source_url[0] + '://' + self.parsed_source_url[1],
                   "-copyToLocal", self.parsed_source_url[2], local_absolute_target]
            cmd = " ".join(ops)
            return cmd

        def _generate_hdfs_push_cmd(self, remote_target, local_absolute_source):
            # Generate command in the form of:
            # /path/to/hadoop/bin/hadoop fs -fs <file system uri> -copyFromLocal <local> <dst>
            if not self.parsed_dest_url:
                self.parsed_dest_url = self._parse_url(remote_target)

            ops = [self.hadoop, "fs",
                   "-fs", self.parsed_dest_url[0] + '://' + self.parsed_dest_url[1],
                   "-copyFromLocal", local_absolute_source, self.parsed_dest_url[2]]
            cmd = " ".join(ops)
            return cmd

        def _generate_hdfs_test_cmd(self, imagestr):
            # Generate command in the form of:
            # /path/to/hadoop/bin/hadoop fs -fs <file system uri> -test -e <path>
            # The url is parsed on every call rather than cached, so that testing
            # an unpropagation target never reuses a previously parsed source url.
            parsed_url = self._parse_url(imagestr)

            ops = [self.hadoop, "fs",
                   "-fs", parsed_url[0] + '://' + parsed_url[1],
                   "-test", "-e", parsed_url[2]]
            cmd = " ".join(ops)
            return cmd

        def _parse_url(self, url):
            # Since Python2.4 urlparse library doesn't recognize the hdfs scheme,
            # we need to parse the url by hand.
            url = url.split('://')
            if len(url) != 2:
                raise InvalidInput("url not of the form <scheme>://<netloc>/<path>")

            scheme = url[0]
            netloc, path = url[1].split('/', 1)

            # Add leading / back in since it was used to partition netloc from path
            path = '/' + path

            return (scheme, netloc, path)
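The command construction used by the adapter can also be exercised on its own, outside of Nimbus. The following is a minimal standalone sketch, not part of the adapter: the function names (parse_hdfs_url, exists_cmd, pull_cmd, push_cmd, run), the hadoop install path, and the name-node host are all hypothetical. It swaps in the subprocess module and argument lists in place of the listing's commands.getstatusoutput and joined command string.

```python
# Standalone sketch of the three hadoop invocations shown in Figures 3-5.
# Every name, host, and path below is a hypothetical example.
import subprocess


def parse_hdfs_url(url):
    """Split 'hdfs://netloc/path' into (scheme, netloc, path) by hand."""
    parts = url.split("://")
    if len(parts) != 2 or parts[0] != "hdfs" or "/" not in parts[1]:
        raise ValueError("url not of the form hdfs://<netloc>/<path>: %r" % url)
    netloc, path = parts[1].split("/", 1)
    # Add the leading / back, since it was used to separate netloc from path.
    return parts[0], netloc, "/" + path


def _base(hadoop, url):
    """Common prefix 'hadoop fs -fs <file system uri>' plus the remote path."""
    scheme, netloc, path = parse_hdfs_url(url)
    return [hadoop, "fs", "-fs", "%s://%s" % (scheme, netloc)], path


def exists_cmd(hadoop, url):
    base, path = _base(hadoop, url)
    return base + ["-test", "-e", path]


def pull_cmd(hadoop, url, local_path):
    base, path = _base(hadoop, url)
    return base + ["-copyToLocal", path, local_path]


def push_cmd(hadoop, local_path, url):
    base, path = _base(hadoop, url)
    return base + ["-copyFromLocal", local_path, path]


def run(cmd):
    """Run an argument list without a shell; return (exit status, output)."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    output = proc.communicate()[0]
    return proc.returncode, output


HADOOP = "/opt/hadoop/bin/hadoop"                          # hypothetical path
IMAGE = "hdfs://namenode.example.org:9000/images/vm.img"   # hypothetical url

check = exists_cmd(HADOOP, IMAGE)
pull = pull_cmd(HADOOP, IMAGE, "/tmp/vm.img")
push = push_cmd(HADOOP, "/tmp/vm.img", IMAGE)
```

Passing a list to subprocess means a path containing spaces or shell metacharacters reaches hadoop as a single argument, whereas a joined command string is interpreted by the shell. Calling run(check) on a machine with Hadoop installed would return the exit status that the adapter's validation relies on.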