Ganglia Users Guide - Central-7-0-x86-64.rocksclusters


Ganglia Users Guide
7.0 Edition

Published Dec 01 2017
Copyright 2017 University of California

This document is subject to the Rocks License (see Appendix: Rocks Copyright).

Table of Contents

Preface
1. Overview
2. Installing
    2.1. On a New Server
    2.2. On an Existing Server
3. Using the Ganglia Roll
    3.1. Using the Ganglia Roll
4. Customizing the Ganglia Roll
    4.1. Customizing the Ganglia Web interface
    4.2. Monitoring Multiple Clusters with Ganglia
A. Frequently Asked Questions
B. Rocks Copyright
C. Third Party Copyrights and Licenses
    C.1. Ganglia
    C.2. phpsysinfo
    C.3. rrdtool

List of Tables

1-1. Summary
1-2. Compatibility

Preface

This Roll installs and configures the Ganglia [1] cluster monitoring system.

Notes
1. http://ganglia.info/

Chapter 1. Overview

Table 1-1. Summary

    Name                    ganglia
    Version                 7.0
    Maintained By           Rocks Group
    Architecture            i386, x86_64
    Compatible with Rocks   7.0

The ganglia roll has the following requirements of other rolls. Compatibility with all known rolls is assured, and all known conflicts are listed. There is no assurance of compatibility with third-party rolls.

Table 1-2. Compatibility

    Requires        Conflicts
    Base            (none listed)
    Kernel
    OS
    Web Server

Chapter 2. Installing

2.1. On a New Server

The ganglia roll should be installed during the initial installation of your server (or cluster). This procedure is documented in section 3.2 of the Rocks Users Guide. Select the ganglia roll from the list of available rolls when you see a screen similar to the one below.

[Screenshot: roll selection screen]

2.2. On an Existing Server

The Ganglia Roll can be installed on a running frontend. The following procedure installs the roll on the frontend. After the frontend reboots, the roll will be fully configured.

First download the Ganglia Roll ISO from the Rocks web site. Then, as root, execute:

    # rocks add roll ganglia*iso
    # rocks enable roll ganglia
    # cd /export/rocks/install
    # rocks create distro
    # rocks run roll ganglia | bash

Then reboot:

    # init 6

To apply ganglia to the compute nodes, you will need to reinstall the compute nodes, e.g.:

    # rocks set host boot compute action=install
    # rocks run host compute command="reboot"

Chapter 3. Using the Ganglia Roll

3.1. Using the Ganglia Roll

3.1.1. Cluster Status

You can check the status of your cluster by pointing a browser to http://YOUR_FRONTEND_NAME/ganglia/ (see the image below for an example). This link provides a graphical interface to live cluster information provided by the Ganglia monitors [1] running on each cluster node. The monitors gather values for various metrics such as CPU load, free memory, disk usage, network I/O, and operating system version. These metrics are sent through the private cluster network and are used by the frontend node to generate the historical graphs.

In addition to metric parameters, the Ganglia monitors collect a heartbeat message from each node. When a number of heartbeats from a node are missed, this web page declares the node "dead". Dead nodes often have problems that require additional attention, and are marked with the Skull-and-Crossbones icon or a red background.

Ganglia [2] was designed at Berkeley by Matt Massie (massie@cs.berkeley.edu) in 2000, and is currently developed by an open source partnership between Berkeley, SDSC, and others. It is distributed through Sourceforge.net and GitHub.com under a BSD-style license (see Appendix C.1).
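The "dead node" rule in section 3.1.1 can be illustrated with a small Python sketch. It assumes the Ganglia XML report format, in which each HOST element carries a TN attribute (seconds since the host was last heard from) and a TMAX attribute (the expected maximum interval between reports). The sample XML, the function name dead_hosts, and the threshold of four missed intervals are illustrative assumptions, not values taken from this guide.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample report; host names and timings are made up.
SAMPLE_XML = """<GANGLIA_XML VERSION="3.6.0" SOURCE="gmetad">
<GRID NAME="Example" AUTHORITY="http://frontend/ganglia/" LOCALTIME="0">
<CLUSTER NAME="Cluster A" LOCALTIME="0" OWNER="" LATLONG="" URL="">
<HOST NAME="compute-0-0.local" IP="10.1.255.254" REPORTED="0" TN="12" TMAX="20" DMAX="0"/>
<HOST NAME="compute-0-1.local" IP="10.1.255.253" REPORTED="0" TN="400" TMAX="20" DMAX="0"/>
</CLUSTER>
</GRID>
</GANGLIA_XML>"""


def dead_hosts(xml_text, missed_factor=4):
    """Return names of hosts silent for more than missed_factor * TMAX seconds.

    missed_factor is an assumed threshold chosen for illustration only.
    """
    root = ET.fromstring(xml_text)
    dead = []
    for host in root.iter("HOST"):
        tn = int(host.get("TN", "0"))      # seconds since last report
        tmax = int(host.get("TMAX", "0"))  # expected max gap between reports
        if tmax > 0 and tn > missed_factor * tmax:
            dead.append(host.get("NAME"))
    return dead


if __name__ == "__main__":
    for name in dead_hosts(SAMPLE_XML):
        print("dead:", name)
```

With the sample data, only compute-0-1.local exceeds the assumed threshold (400 seconds against 4 x 20), so it is the only host reported.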

Notes
1. http://ganglia.info/
2. http://ganglia.info/

Chapter 4. Customizing the Ganglia Roll

4.1. Customizing the Ganglia Web interface

The Ganglia Web interface (at http://YOUR_FRONTEND_NAME/ganglia/) allows extensive customization. This is done by modifying the file /var/www/html/ganglia/conf.php on your frontend. The default configuration file contains:

    <?php
    $conf['rrdtool'] = "/opt/rocks/bin/rrdtool";
    ?>

To change the font used in the graphs, for example, replace the $conf['rrdtool'] line above with something like the following, where "Sans" is the font to use:

    $conf['rrdtool'] = "env RRD_DEFAULT_FONT='Sans' /opt/rocks/bin/rrdtool";

You can also set the default metric and prevent certain graphs from appearing by adding something like the following between the <?php and ?> lines:

    $conf['show_stacked_graphs'] = 0;
    $conf['default_metric'] = 'cpu_report';

You can also override the installation defaults supplied in the file /var/www/html/ganglia/conf_default.php. For example, to modify the list of available time ranges (each value is a duration in seconds), add something like the following to conf.php:

    $conf['time_ranges'] = array(
        '15min' => 900,
        'hour'  => 3600,
        '2hr'   => 7200,
        '4hr'   => 14400,
        'day'   => 86400,
        '3day'  => 259200,
        'week'  => 604800,
        'month' => 2419200,
        'year'  => 31449600
    );

Note that you should not modify conf_default.php directly!

For further ideas on customizing conf.php, read the default configuration file /var/www/html/ganglia/conf_default.php. You should also see the Ganglia Web 2 homepage [1].
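The values in the time range list above are durations in seconds. The following Python sketch converts them to days, hours, and minutes as a sanity check before adding a custom range. The dictionary simply copies the example values from this section; the helper name as_days_hours_minutes is hypothetical.

```python
# Time range values copied from the conf.php example in section 4.1 (seconds).
TIME_RANGES = {
    "15min": 900,
    "hour": 3600,
    "2hr": 7200,
    "4hr": 14400,
    "day": 86400,
    "3day": 259200,
    "week": 604800,
    "month": 2419200,
    "year": 31449600,
}


def as_days_hours_minutes(seconds):
    """Split a duration in seconds into (days, hours, minutes)."""
    days, rest = divmod(seconds, 86400)
    hours, rest = divmod(rest, 3600)
    minutes = rest // 60
    return days, hours, minutes


if __name__ == "__main__":
    for label, seconds in TIME_RANGES.items():
        days, hours, minutes = as_days_hours_minutes(seconds)
        print(f"{label:>6}: {days} d {hours} h {minutes} min")
```

The arithmetic shows that the example defines 'month' as 28 days (four weeks) and 'year' as 364 days (52 weeks), which is worth keeping in mind when reading the long-range graphs.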

4.2. Monitoring Multiple Clusters with Ganglia

Ganglia can track and present monitoring data from multiple clusters. A collection of monitored clusters is called a Grid in Ganglia's nomenclature. This section describes the steps required to set up a multi-cluster monitoring grid.

The essential idea is to instruct the gmetad daemon on one of your frontend nodes to track the second cluster in addition to its own. This procedure can be repeated to monitor a large set of clusters from one location.

For this discussion, your two clusters are named "A" and "B". We will choose the frontend on cluster "A" to be the top-level monitor.

1. On the "A" frontend, add the following line to /etc/gmetad.conf:

       data_source "Cluster B" B.frontend.domain.name

   Then restart the gmetad server on the "A" frontend.

2. On the "B" frontend, get the IP address of "A.frontend.domain.name", edit /etc/ganglia/gmond.conf, and change the section from:

       tcp_accept_channel {
           port = 8649
           acl {
               default = "deny"
               access {
                   ip = 127.0.0.1
                   mask = 32
                   action = "allow"
               }
               access {
                   ip = 10.0.0.0
                   mask = 8
                   action = "allow"
               }
           }
       }

   to:

       tcp_accept_channel {
           port = 8649
           acl {
               default = "deny"
               access {
                   ip = 127.0.0.1
                   mask = 32
                   action = "allow"
               }
               access {
                   ip = 10.0.0.0
                   mask = 8
                   action = "allow"
               }
               access {
                   ip = ip-address-of-A.frontend
                   mask = 32
                   action = "allow"
               }
           }
       }

   Then restart the gmond server on the "B" frontend.

3. Take a look at the Ganglia page on "A". It should include statistics for B, and a summary or "roll-up" view of both clusters.

This screenshot is from the iVDGL Physics Grid3 project. It is a very large grid monitored by Ganglia in a manner similar to the one described here.

Notes
1. lia-web-2
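The access rules added in step 2 of section 4.2 can be reasoned about offline with a short Python sketch. It models the acl block as an ordered list of (ip, mask, action) rules with a default of "deny", and assumes the first matching rule decides the outcome. The address 192.0.2.10 is a hypothetical stand-in for the IP address of the "A" frontend, and acl_action is an illustrative helper, not part of Ganglia.

```python
import ipaddress

# Rules mirroring the modified tcp_accept_channel acl in section 4.2.
# 192.0.2.10 is a made-up placeholder for ip-address-of-A.frontend.
ACL_RULES = [
    ("127.0.0.1", 32, "allow"),
    ("10.0.0.0", 8, "allow"),
    ("192.0.2.10", 32, "allow"),
]


def acl_action(client_ip, rules=ACL_RULES, default="deny"):
    """Return the action for client_ip, assuming first-match semantics."""
    addr = ipaddress.ip_address(client_ip)
    for ip, mask, action in rules:
        network = ipaddress.ip_network(f"{ip}/{mask}", strict=False)
        if addr in network:
            return action
    return default


if __name__ == "__main__":
    for candidate in ("127.0.0.1", "10.1.255.254", "192.0.2.10", "192.0.2.11"):
        print(candidate, "->", acl_action(candidate))
```

A mask of 32 matches exactly one address, which is why only the "A" frontend itself, and not its neighbours on the same network, gains access to the gmond TCP port on "B".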

Appendix A. Frequently Asked Questions

1. I see IP addresses but not names in my Ganglia graphs. Why is this?

The DNS system in the cluster sometimes causes Ganglia to record bogus node names (usually their IP addresses). To clear this situation, restart the "gmond" and "gmetad" services on the frontend. This action may also be useful later, as it flushes any dead nodes from the Ganglia output.

    # service gmond restart
    # service gmetad restart

This method is also useful when replacing or renaming nodes in your cluster.

2. When looking at the Ganglia page, I don't see graphs, just the error:

    There was an error collecting ganglia data (127.0.0.1:8652): XML error: not well-formed (invalid token) at xxx

This indicates a parse error in the Ganglia gmond XML output. It is generally caused by characters that are not XML safe (especially &) in the cluster name or cluster owner fields, although any Ganglia field (including node names) with these characters will cause this problem.

We hope future versions of Ganglia will correctly escape all names to make them XML safe. If you have a bad name, edit /etc/ganglia/gmond.conf on the frontend node, remove the offending characters, then restart gmond.
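The XML problem described in question 2 can be reproduced with Python's standard library. The sketch below places a candidate name into an XML attribute without escaping and reports whether the result still parses. The CLUSTER element mimics the shape of Ganglia's output only loosely, and is_xml_safe is a hypothetical helper for checking names before putting them in gmond.conf.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr


def is_xml_safe(value):
    """Return True if value can be placed in an XML attribute unescaped."""
    try:
        ET.fromstring(f'<CLUSTER NAME="{value}"/>')
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    for name in ("Physics Cluster", "R&D Cluster"):
        status = "ok" if is_xml_safe(name) else "NOT XML safe"
        # quoteattr shows what a correctly escaped attribute would look like.
        print(f"{name!r}: {status}, escaped form {quoteattr(name)}")
```

A name such as "R&D Cluster" fails the check because the bare ampersand starts an entity reference that never completes, which is the kind of input behind the "not well-formed (invalid token)" error.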

Appendix B. Rocks Copyright

Rocks(r)
www.rocksclusters.org
version 6.2 (SideWinder)
version 7.0 (Manzanita)

Copyright (c) 2000 - 2017 The Regents of the University of California. All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice unmodified and in its entirety, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

3. All advertising and press materials, printed or electronic, mentioning features or use of this software must display the following acknowledgement:

   "This product includes software developed by the Rocks(r) Cluster Group at the San Diego Supercomputer Center at the University of California, San Diego and its contributors."

4. Except as permitted for the purposes of acknowledgment in paragraph 3, neither the name or logo of this software nor the names of its authors may be used to endorse or promote products derived from this software without specific prior written permission. The name of the software includes the following terms, and any derivatives thereof: "Rocks", "Rocks Clusters", and "Avalanche Installer". For licensing of the associated name, interested parties should contact Technology Transfer & Intellectual Property Services, University of California, San Diego, 9500 Gilman Drive, Mail Code 0910, La Jolla, CA 92093-0910, Ph: (858) 534-5815, FAX: (858) 534-7345, E-MAIL: invent@ucsd.edu

THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Appendix C. Third Party Copyrights and Licenses

This section enumerates the licenses from all the third party software components of this Roll. A "best effort" attempt has been made to ensure the complete and current licenses are listed. In the case of errors or omissions please contact the maintainer of this Roll. For more information on the licenses of any components please consult with the original author(s) or see the Rocks CVS repository [1].

C.1. Ganglia

Copyright (c) 2001, 2002, 2003, 2004, 2005 by The Regents of the University of California. All rights reserved.

Permission to use, copy, modify, and distribute this software and its documentation for any purpose, without fee, and without written agreement is hereby granted, provided that the above copyright notice and the following two paragraphs appear in all copies of this software.

IN NO EVENT SHALL THE UNIVERSITY OF CALIFORNIA BE LIABLE TO ANY PARTY FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS DOCUMENTATION, EVEN IF THE UNIVERSITY OF CALIFORNIA HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

THE UNIVERSITY OF CALIFORNIA SPECIFICALLY DISCLAIMS ANY WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE SOFTWARE PROVIDED HEREUNDER IS ON AN "AS IS" BASIS, AND THE UNIVERSITY OF CALIFORNIA HAS NO OBLIGATION TO PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS.

C.2. phpsysinfo

GNU LIBRARY GENERAL PUBLIC LICENSE
Version 2, June 1991

Copyright (C) 1991 Free Software Foundation, Inc.
59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.

[This is the first released version of the library GPL. It is numbered 2 because it goes with version 2 of the ordinary GPL.]

Preamble

The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public Licenses are intended to guarantee your freedom to share and change

free software--to make sure the software is free for all its users.

This license, the Library General Public License, applies to some specially designated Free Software Foundation software, and to any other libraries whose authors decide to use it. You can use it for your libraries, too.

When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things.

To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the library, or if you modify it.

For example, if you distribute copies of the library, whether gratis or for a fee, you must give the recipients all the rights that we gave you. You must make sure that they, too, receive or can get the source code. If you link a program with the library, you must provide complete object files to the recipients so that they can relink them with the library, after making changes to the library and recompiling it. And you must show them these terms so they know their rights.

Our method of protecting your rights has two steps: (1) copyright the library, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the library.

Also, for each distributor's protection, we want to make certain that everyone understands that there is no warranty for this free library. If the library is modified by someone else and passed on, we want its recipients to know that what they have is not the original version, so that any problems introduced by others will not reflect on the original authors' reputations.

Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that companies distributing free software will individually obtain patent licenses, thus in effect transforming the program into proprietary software. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all.

Most GNU software, including some libraries, is covered by the ordinary GNU General Public License, which was designed for utility programs. This license, the GNU Library General Public License, applies to certain designated libraries. This license is quite different from the ordinary one; be sure to read it in full, and don't assume that anything in it is the same as in the ordinary license.

The reason we have a separate public license for some libraries is that they blur the distinction we usually make between modifying or adding to a program and simply using it. Linking a program with a library, without changing the library, is in some sense simply using the library, and is analogous to running a utility program or application program. However, in

a textual and legal sense, the linked executable is a combined work, a derivative of the original library, and the ordinary General Public License treats it as such.

Because of this blurred distinction, using the ordinary General Public License for libraries did not effectively promote software sharing, because most developers did not use the libraries. We concluded that weaker conditions might promote sharing better.

However, unrestricted linking of non-free programs would deprive the users of those programs of all benefit from the free status of the libraries themselves. This Library General Public License is intended to permit developers of non-free programs to use free libraries, while preserving your freedom as a user of such programs to change the free libraries that are incorporated in them. (We have not seen how to achieve this as regards changes in header files, but we have achieved it as regards changes in the actual functions of the Library.) The hope is that this will lead to faster development of free libraries.

The precise terms and conditions for copying, distribution and modification follow. Pay close attention to the difference between a "work based on the library" and a "work that uses the library". The former contains code derived from the library, while the latter only works together with the library.

Note that it is possible for a library to be covered by the ordinary General Public License rather than by this special one.

GNU LIBRARY GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION

0. This License Agreement applies to any software library which contains a notice placed by the copyright holder or other authorized party saying it may be distributed under the terms of this Library General Public License (also called "this License"). Each licensee is addressed as "you".

A "library" means a collection of software functions and/or data prepared so as to be conveniently linked with application pr
