Rocks Cluster : A Cluster Oriented Linux Distribution Or How To . - HPCKP

Rocks Cluster: a cluster-oriented Linux distribution, or how to install a computer cluster in a day

Physical setup

Installing the Frontend

If you have home-made rolls or community rolls, now is the time to provide them.

What are “rolls”?

Rolls are packages of packages designed to integrate themselves in the managing system in the same way as the base software. Some of them are provided by the distribution developers. On the other hand, extended documentation on how to create new ones has promoted the appearance of others created by the community.

Examples:

- HPC: The primary purpose of the HPC Roll is to provide configured software tools that can be used to run parallel applications on your cluster. The following software packages are included in the HPC Roll:
  - MPI over ethernet environments (OpenMPI, MPICH, MPICH2)
  - PVM
  - Benchmarks (stream, iperf, IOzone)

- SGE: The SGE Roll installs and configures the SUN Grid Engine scheduler. Provides:
  - SGE ready to be used (preconfigured hostgroups, pe, queue, etc.)
  - Integrated with the HPC roll (no extra configuration is needed to use OpenMPI or MPICH)

- BIO: The Bio-informatics Roll is a collection of some of the most common bio-informatics tools that are being used by the community today:
  - HMMER - From Janelia Farm research institute
  - NCBI BLAST - From National Center for Biotechnology Information
  - MpiBLAST - From Los Alamos National Laboratory
  - biopython
  - ClustalW - From the European BioInformatics Institute
  - MrBayes - From School of Computational Science at the Florida State University
  - T-Coffee - From Information Genomique et Structurale at Centre National de la Recherche Scientifique
  - Emboss - From European Molecular Biology Institute
  - Phylip - From the Dept. of Biology at the University of Washington
  - fasta - From the University of Virginia
  - Glimmer - From Center for Bioinformatics and Computational Biology at the University of Maryland
  - TIGR Assembler - From the J. Craig Venter Institute
  - All the perl utilities mentioned below are from CPAN: perl-bioperl, perl-bioperl-ext, perl-bioperl-run, perl-bioperl-db

- Area51: The Rocks Area51 Roll contains utilities and services used to analyze the integrity of the files and the kernel on your cluster. The following software packages are included in the Area51 Roll:
  - Tripwire
  - chkrootkit

Installing the Frontend

Compute node default partitioning

Partition Name       Size
/                    16 GB
swap                 1 GB
/var                 4 GB
/state/partition1    remainder of root disk

Installing the Frontend: DONE!!!

Installing compute nodes

]# insert-ethers

Customization and postconfiguration

Adding external NFS servers

]# echo "data0 -fstype=nfs4,rsize=32768,wsize=32768,nodev,nosuid,_netdev,intr,noatime,nostrict 10.3.1.3:/&
data1 -fstype=nfs4,rsize=32768,wsize=32768,nodev,nosuid,_netdev,intr,noatime,nostrict 10.3.1.3:/&
data2 -fstype=nfs4,rsize=32768,wsize=32768,nodev,nosuid,_netdev,intr,noatime,nostrict 10.3.1.3:/&
apps -fstype=nfs4,rsize=32768,wsize=32768,nodev,nosuid,_netdev,intr,noatime,nostrict 10.3.1.3:/&" >> /etc/auto.share
]# rocks sync config
]# cd /var/411
]# make

Adding extra RPMs

]# cp my_new_rpm.el5.x86_64.rpm /export/rocks/install/contrib/5.4/x86_64/RPMS/
]# vi nd-login.xml
]# cd /export/rocks/install/
]# rocks sync config
]# rocks create distro

Now we should reinstall all nodes :( but we can do this instead:

]# rocks run host rpm -Uvh /share/rocks/install/contrib/5.4/x86_64/RPMS/
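The automount entries in the NFS step differ only in their key, so they can be generated with a loop rather than typed by hand. The following is a minimal sketch, not part of Rocks: `make_entries` is a hypothetical helper name, and the server address and option string are copied from the slide on the assumption that the options use the usual `name=value` mount syntax. It only prints the entries. Appending them to /etc/auto.share and running `rocks sync config` remain separate, manual steps.

```shell
#!/bin/sh
# Sketch: print one autofs map entry per share name.
# SERVER and OPTS mirror the values shown on the slide; adjust them for your site.
SERVER="10.3.1.3"
OPTS="-fstype=nfs4,rsize=32768,wsize=32768,nodev,nosuid,_netdev,intr,noatime,nostrict"

# make_entries is a hypothetical helper, not a Rocks or autofs command.
make_entries() {
    for share in "$@"; do
        # autofs replaces '&' with the map key, so each key mounts SERVER:/<key>
        printf '%s %s %s:/&\n' "$share" "$OPTS" "$SERVER"
    done
}

make_entries data0 data1 data2 apps
```

Reviewing the printed lines before redirecting them into the map file makes it easier to catch a mistyped option on every entry at once.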

Monitoring the cluster

Ganglia is installed and configured automatically:

http://your_frontend_address/ganglia

But, most of the time, I prefer to use:

]# qstat -f -u \* | less

I have not talked about (and you are probably going to ask):

- Lustre: there is a Lustre roll and lots of “How to install Lustre on Rocks” guides.
- InfiniBand: how to configure a fast network for message passing is explained in the basic manual. Apart from that, there are InfiniBand rolls and “Howtos”, and I know a company in Barcelona (ANIMA) that installs computer clusters with Rocks Cluster and InfiniBand.

It's not all a bed of roses

- By default, users are expected to access the cluster through the frontend. This can be corrected by installing a node with the login appliance.
- /home and /share/apps live on the frontend node and are exported through NFS to all nodes, even to the frontend itself. This can cause huge “iowait” problems. It can be corrected by “never” using those directories and using external file servers to store your data and applications.
- The compute nodes' OS depends on the frontend's OS. This cannot be corrected :(
