HP XC System 3.x Software User Manual

Browse online or download the user manual for the HP XC System 3.x Software: HP XC System 3.x Software User's Guide [en].


Table of Contents

Page 1 - User's Guide

HP XC System Software User's Guide. Version 3.1. Printed in the US. HP Part Number: 5991-7400. Published: November 2006.

Page 3 - Table of Contents

Table 10-2 LSF-HPC Equivalents of SLURM srun Options (continued)
LSF-HPC equivalent: -ext "SLURM[constraint=list]"; description: Specifi…

Page 4 - Table of Contents

Table 10-2 LSF-HPC Equivalents of SLURM srun Options (continued)
Description: You cannot use this option. LSF-HPC uses this…

Page 5 - Table of Contents

Table 10-2 LSF-HPC Equivalents of SLURM srun Options (continued)
Description: Meaningless under LSF-HPC integrated with SLURM…

Page 6 - Table of Contents

11 Advanced Topics
This chapter covers topics intended for the advanced user and addresses the following topics:
• "Enabling Remote Execution…

Page 7 - List of Figures

$ hostname
mymachine
Then, use the host name of your local machine to retrieve its IP address:
$ host mymachine
mymachine has address 14.26.206.134
Step 2.…

Page 8

First, examine the available nodes on the HP XC system. For example:
$ sinfo
PARTITION  AVAIL  TIMELIMIT  NODES  STATE  NODELIST
lsf        up     infinite   …

Page 9

the rule). Typically the rule for an object-file target is a single compilation line, so it is common to talk about concurrent compilations, though GN…

Page 10

testall:
	@ \
	for i in ${HYPRE_DIRS}; \
	do \
	  if [ -d $$i ]; \
	  then \
	    echo "Making $$i ..."…

Page 11 - List of Examples

$ make PREFIX='srun -n1 -N1' MAKE_J='-j4'
11.3.2 Example Procedure 2
Go through the directories in parallel and have the make procedure within…

Page 12

The modified Makefile is invoked as follows:
$ make PREFIX='srun -n1 -N1' MAKE_J='-j4'
11.4 Local Disks on Compute Nodes
The use of a…
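
To make the procedure above concrete, here is a minimal sketch of a Makefile prepared for this technique; the target and file names are hypothetical, and the only assumption is that each compile rule can be prefixed with a launch command (recipe lines must be tab-indented):

# Hypothetical fragment: $(PREFIX) lets every rule be launched through srun,
# and $(MAKE_J) carries a -j level for any sub-make.
PREFIX =
MAKE_J =

prog: a.o b.o
	$(PREFIX) $(CC) -o prog a.o b.o

%.o: %.c
	$(PREFIX) $(CC) -c $< -o $@

Invoked with PREFIX='srun -n1 -N1', each compilation then runs as its own one-task SLURM job on a compute node.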

Page 13 - About This Document

List of Examples
5-1 Submitting a Job from the Standard Input…

Page 14 - About This Document

Verify with your system administrator that MPICH has been installed on your system. The HP XC System Software Administration Guide provides procedures…

Page 15 - 5 Related Information

IMPORTANT: Be sure that the number of nodes and processors in the bsub command corresponds to the number specified by the appropriate options in the wr…
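
As a sketch of what this warning means (script name hypothetical): the -n4 passed to bsub reserves four slots, and the srun inside the script should request the same four:

$ cat myscript.sh
#!/bin/sh
srun -n4 hostname
$ bsub -n4 -I ./myscript.sh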

Page 17 - Linux Web Sites

A Examples
This appendix provides examples that illustrate how to build and run applications on the HP XC system. The examples in this section show you…

Page 18 - 7 HP Encourages Your Comments

Examine the partition information:
$ sinfo
PARTITION  AVAIL  TIMELIMIT  NODES  STATE  NODELIST
lsf        up     infinite   6      idle   n[5-10]
Examine the loca…

Page 19 - 1.1 System Architecture

View the job:
$ bjobs -l 8
Job <8>, User <smith>, Project <default>, Status <DONE>, Queue <normal>, Interactive mode, Extsc…

Page 20 - 1.1.4 Node Specialization

A.4 Launching a Parallel Interactive Shell Through LSF-HPC
This section provides an example that shows how to launch a parallel interactive shell throu…

Page 21 - 1.1.6 File System

date and time stamp: Submitted from host <n2>, to Queue <normal>, CWD <$HOME>, 4 Processors Requested, Requested Resources <type=a…

Page 22

$ lshosts
HOST_NAME    type     model    cpuf  ncpus  maxmem  maxswp  server  RESOURCES
lsfhost.loc  SLINUX6  DEFAULT  1.0   8      1M      -       Yes     (slurm)
$…

Page 23 - 1.3 User Environment

$ sinfo
PARTITION  AVAIL  TIMELIMIT  NODES  STATE  NODELIST
lsf        up     infinite   4      idle   n[13-16]
Submit the job:
$ bsub -n8 -Ip /bin/sh
Job <1008…

Page 25 - 1.5.3 Standard LSF

loadSched  -  -  -  -  -  -  -  -  -  -  -
loadStop   -  -  -  -  -  -  -  -  -  -  -
View the finished jobs:
$ bhist -…

Page 26 - 1.5.5 HP-MPI

Greetings from process 2! from ( n14 pid 14011)
Greetings from process 3! from ( n14 pid 14012)
Greetings from process 4! from ( n15 pid 18227)
Greetings…

Page 27 - 2 Using the System

If myjob runs on an HP XC host, the SLURM[nodes=4-4] allocation option is applied. If it runs on an Alpha/AXP host, the SLURM option is ignored.
• Run m…
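
A minimal sketch of such a submission (job name hypothetical); the external-scheduler option asks SLURM for exactly four nodes and is simply ignored on hosts where it does not apply:

$ bsub -n4 -ext "SLURM[nodes=4-4]" -I ./myjob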

Page 28 - 2.2.1 Introduction

Glossary
A
administration branch
    The half (branch) of the administration network that contains all of the general-purpose administration ports to the nodes…

Page 29

external network node
    A node that is connected to a network external to the HP XC system.
F
fairshare
    An LSF job-scheduling policy that specifies how reso…

Page 30 - Using the System

Integrated Lights Out
    See iLO.
interconnect
    A hardware component that provides high-speed connectivity between the nodes in the HP XC system. It is used f…

Page 31 - 3.1 Overview of Modules

MCS
    An optional integrated system that uses chilled water technology to triple the standard cooling capacity of a single rack. This system helps take t…

Page 32 - 3.2 Supplied Modulefiles

PXE
    Preboot Execution Environment. A standard client/server interface that enables networked computers that are not yet installed with an operating sys…

Page 34 - 3.7 Unloading a Modulefile

Index
A
ACML library, 42
application development, 37
building parallel applications, 42
building serial applications, 39
communication between nodes, 109
com…

Page 35 - 3.10 Creating a Modulefile

About This Document
This document provides information about using the features and functions of the HP XC System Software. It describes how the HP XC u…

Page 36

compute node, 37
configuring local disk, 109
core availability, 38
CP3000, 20
MKL library, 42
system interconnect, 22
CP3000BL, 20
CP4000, 20
ACML library, 42…

Page 37 - 4 Developing Applications

submission, 47
submission from non-HP XC host, 55
job accounting, 81
job allocation information, obtaining, 92
job manager, 84
job scheduler, 84
JOBID transla…

Page 38 - 4.4 Interrupting a Job

P
parallel application
    build environment, 40
    building, 42
    compiling and linking, 42
    debugging, 57
    debugging with TotalView, 57
    developing, 37
    environment for…

Page 39 - 4.5 Setting Debugging Options

T
TotalView, 57
    debugging an application, 59
    exiting, 61
    setting preferences, 59
    setting up, 58
tuning applications, 73
U
user environment, 31
utilization metr…

Page 40 - 4.7.1.4 Pthreads

Ctrl+x
    A key sequence. A sequence such as Ctrl+x indicates that you must hold down the key labeled Ctrl while you press another key or mouse button.
ENV…

Page 41

See the following sources for information about related HP products.
HP XC Program Development Environment
The Program Development Environment home page…

Page 42 - Developing Applications

— Administering Platform LSF
— Administration Primer
— Platform LSF Reference
— Quick Reference Card
— Running Jobs with Platform LSF
LSF procedures and in…

Page 43 - 4.8 Developing Libraries

• http://sourceforge.net/projects/modules/
Web site for Modules, which provide for easy dynamic modification of a user's environment through module…

Page 44 - Developing Applications

Software RAID Web Sites
• http://www.tldp.org/HOWTO/Software-RAID-HOWTO.html and http://www.ibiblio.org/pub/Linux/docs/HOWTO/other-formats/pdf/Software-…

Page 45 - 4.8 Developing Libraries

1 Overview of the User Environment
The HP XC system is a collection of computer nodes, networks, storage, and software, built into a cluster, that work…

Page 46

© Copyright 2003, 2005, 2006 Hewlett-Packard Development Company, L.P. Confidential computer software. Valid license from HP required for possession, u…

Page 47 - 5 Submitting Jobs

Table 1-1 Determining the Node Platform
Partial output of /proc/cpuinfo:
processor  : 0
vendor_id  : GenuineIntel
cpu family : 15
mo…

Page 48 - Submitting Jobs

nodes must be launched from nodes with the login role. Nodes with the compute role are referred to as compute nodes in this manual.
1.1.5 Storage and I/…

Page 49 - 5.3 Submitting a Parallel Job

and keeps software from conflicting with user-installed software. Files are segregated into the following types and locations:
• Software specific to HP…

Page 50 - Submitting Jobs

free -m
    Use the following command to display the amount of free and used memory in megabytes.
cat /proc/partitions
    Use the following command to display th…

Page 51

SLURM commands
HP XC uses the Simple Linux Utility for Resource Management (SLURM) for system resource management and job scheduling. Standard SLURM co…

Page 52 - Submitting Jobs

1.5.2 Load Sharing Facility (LSF-HPC)
The Load Sharing Facility for High Performance Computing (LSF-HPC) from Platform Computing Corporation is a batch…

Page 53

HP-MPI
    Determines HOW the job runs. It is part of the application, so it performs communication. HP-MPI can also pinpoint the processor on which each r…

Page 54 - Submitting Jobs

2 Using the System
This chapter describes the tasks and commands that the general user must know to use the system. It addresses the following topics:…

Page 55

2.2.1 Introduction
As described in "Run-Time Environment" (page 24), SLURM and LSF-HPC cooperate to run and manage jobs on the HP XC system, combining L…

Page 56 - Submitting Jobs

For more information about using this command and a sample of its output, see "Getting Information About the LSF Execution Host Node" (page 91).
• The…

Page 57 - 6 Debugging Applications

Table of Contents
About This Document … 13
1 Intende…

Page 58 - 6.2.1.2 Setting Up TotalView

My cluster name is hptclsf
My master name is lsfhost.localdomain
In this example, hptclsf is the LSF cluster name, and lsfhost.localdomain is the name o…

Page 59

3 Configuring Your Environment with Modulefiles
The HP XC system supports the use of Modules software to make it easier to configure and modify your…

Page 60 - Debugging Applications

(perhaps with incompatible shared objects) installed, it is probably wise to set MPI_CC (and others) explicitly to the commands made available by the c…
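
For example, a hedged sketch of pinning the HP-MPI compiler wrapper to one specific compiler; the compiler path is an assumption, so substitute whatever the loaded compiler module actually provides:

$ export MPI_CC=/usr/bin/gcc    # make mpicc drive this C compiler
$ mpicc -o myapp myapp.c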

Page 61 - 6.2.1.8 Exiting TotalView

Table 3-1 Supplied Modulefiles (continued)
Modulefile | Sets the HP XC User Environment to Use
imkl/8.0 (default) | Intel Math Kernel Library.
… | Intel Version 7…

Page 62

3.5 Viewing Loaded Modulefiles
A loaded modulefile is a modulefile that has been explicitly loaded in your environment by the module load command. To vi…
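
The excerpt is cut off before the command itself; with the Modules software the loaded set is listed like this (output illustrative):

$ module list
Currently Loaded Modulefiles:
  1) mpi/hp/default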

Page 63 - 7 Monitoring Node Activity

3.8 Viewing Modulefile-Specific Help
You can view help information for any of the modulefiles on the HP XC system. For example, to access modulefile-spe…
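
The example is truncated; the standard Modules command for this takes a modulefile name (name illustrative):

$ module help mpi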

Page 64 - Monitoring Node Activity

To install a random product or package, look at the manpages for modulefiles, examine the existing modulefiles, and create a new modulefile for t…
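
As a hedged sketch of such a new modulefile (all names and paths hypothetical): the #%Module1.0 cookie marks the file as a modulefile, and prepend-path exposes the package's commands:

$ cat ~/modulefiles/mytool
#%Module1.0
## Hypothetical modulefile for a locally installed package
prepend-path PATH    /opt/mytool/bin
prepend-path MANPATH /opt/mytool/man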

Page 65

4 Developing Applications
This chapter discusses topics associated with developing applications in the HP XC environment. Before reading this chapter, y…

Page 66 - Monitoring Node Activity

Table 4-1 Compiler Commands
C | C++ | Fortran | Type | Notes
gcc | g++ | g77 | St… | All HP XC platforms. The HP XC System Software supplies these compilers by default.

Page 67

The Ctrl/Z key sequence is ignored.
4.5 Setting Debugging Options
In general, the debugging information for your application that is needed by most debu…
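
For example, the compilers discussed in this chapter accept the usual -g flag to emit debugging information (file names hypothetical):

$ mpicc -g -o myapp myapp.c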

Page 68 - Monitoring Node Activity

3 Configuring Your Environment with Modulefiles … 31
3.1 Overview of Modules …

Page 69

For further information about developing parallel applications in the HP XC environment, see the following:
• "Launching Jobs with the srun Command" (p…

Page 70 - Monitoring Node Activity

Intel: -pthread
PGI: -lpgthread
For example:
$ mpicc object1.o ... -pthread -o myapp.exe
4.7.1.5 Quadrics SHMEM
The Quadrics implementation of SHMEM runs on HP…

Page 71

Information about using the GNU parallel Make is provided in "Using the GNU Parallel Make Capability". For further information about using GNU parallel…

Page 72

If you have not already loaded the mpi compiler utilities module, load it now as follows:
$ module load mpi
To compile and link a C application using t…
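
The excerpt is truncated before the command; a hedged sketch of the compile-and-link step with the HP-MPI wrapper (file names hypothetical):

$ mpicc -o hello_world hello_world.c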

Page 73 - 8 Tuning Applications

For released libraries, dynamic and archive, the usual custom is to have a ../lib directory that contains the libraries. This, by itself, will work if…

Page 74 - Tuning Applications

NOTE: There is no shortcut as there is for the dynamic loader.
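
To illustrate the distinction the note draws (paths hypothetical): the dynamic loader can be redirected at run time, while an archive library must be fully resolved at link time:

$ export LD_LIBRARY_PATH=$HOME/proj/lib:$LD_LIBRARY_PATH   # dynamic: run-time shortcut
$ cc -o app app.o -L$HOME/proj/lib -lmystuff               # resolved at link time
$ cc -o app app.o $HOME/proj/lib/libmystuff.a              # archive: explicit path, no shortcut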

Page 76 - Running a Program

5 Submitting Jobs
This chapter describes how to submit jobs on the HP XC system; it addresses the following topics:
• "Overview of Job Submission" (page…

Page 77

launched on the LSF-HPC node allocation (compute nodes). The LSF-HPC node allocation is created by the -n num-procs parameter, which specifies the number of cores…

Page 78

    return 0;
}
The following is the command line used to compile this program:
$ cc hw_hostname.c -o hw_hostname
When run on the login node, it shows the…

Page 79 - 9 Using SLURM

5.4 Submitting a Batch Job or Job Script … 53
5.5 Sub…

Page 80 - 9.3.1.2 The srun Modes

The SLURM srun command is required to run jobs on an LSF-HPC node allocation. The srun command is the user job launched by the LSF bsub command. SLURM…

Page 81 - 9.7 Job Accounting

Example 5-7 Submitting an MPI Job
$ bsub -n4 -I mpirun -srun ./hello_world
Job <24> is submitted to default queue <normal>. <<Waiting…

Page 82 - 9.8 Fault Tolerance

bsub -n num-procs -ext "SLURM[slurm-arguments]" [bsub-options] [-srun [srun-options]] [jobname] [job-options]
The slurm-arguments parameter ca…

Page 83 - 10 Using LSF-HPC

Example 5-11 Using the External Scheduler to Submit a Job That Excludes One or More Nodes
$ bsub -n4 -ext "SLURM[nodes=4; exclude=n3]" -I sru…

Page 84 - Using LSF-HPC

Example 5-14 Submitting a Job Script
$ cat myscript.sh
#!/bin/sh
srun hostname
mpirun -srun hellompi
$ bsub -I -n4 myscript.sh
Job <29> is submitted…

Page 85

Example 5-17 Submitting a Batch Job Script That Uses the srun --overcommit Option
$ bsub -n4 -I ./myscript.sh
Job <81> is submitted to default que…

Page 86 - 10.4 Job Terminology

5.6 Running Preexecution Programs
A preexecution program is a program that performs setup tasks that an application needs. It may create director…
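
One hedged way to attach such a program is LSF's standard pre-execution option on bsub (script name hypothetical; the section may describe other mechanisms as well):

$ bsub -n4 -E "./presetup.sh" -I srun ./myapp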

Page 87 - 10.5.3 Preemption

6 Debugging Applications
This chapter describes how to debug serial and parallel applications in the HP XC development environment. In general, effectiv…

Page 88 - 10.6 Submitting Jobs

This section provides only the minimum instructions to get you started using TotalView. Instructions for installing TotalView are included in the HP XC Sys…

Page 89 - SLURM_NPROCS=4

6.2.1.4 Using TotalView with LSF-HPC
HP recommends the use of xterm when debugging an application with LSF-HPC. You also need to allocate the nodes you…

Page 90 - Using LSF-HPC

10.5 Using LSF-HPC Integrated with SLURM in the HP XC Environment … 87
10.5.1 Useful Commands …

Page 91

Use the -g option to enable debugging information.
2. Run the application in TotalView:
$ mpirun -tv -srun -n2 ./Psimple
3. The TotalView main control wi…

Page 92 - Using LSF-HPC

$ mpicc -g -o Psimple simple.c -lm
2. Run the application:
$ mpirun -srun -n2 Psimple
3. Start TotalView:
$ totalview
4. Select Unattached in the TotalView…

Page 94 - Using LSF-HPC

7 Monitoring Node Activity
This chapter describes the optional utilities that provide performance information about the set of nodes associated with you…

Page 95

Figure 7-1 The xcxclus Utility Display
The icons show most node utilization statistics as a percentage of the total resource utilization. For example, F…

Page 96 - Using LSF-HPC

1. The node designator is on the upper left of the icon.
2. The left portion of the icon represents the Ethernet connection or connections. In this illustra…

Page 97

Figure 7-3 The clusplot Utility Display
The clusplot utility uses the GNUplot open source plotting program.
7.4 Using the xcxperf Utility to Display Nod…

Page 98 - Using LSF-HPC

$ xcxperf -o test
Figure 7-4 The xcxperf Utility Display
Specifying the data file prefix when you invoke the xcxperf utility from the command line plays…

Page 99

Figure 7-5 The perfplot Utility Display
7.6 Running Performance Health Tests
You can run the ovp command to generate reports on the performance health o…

Page 100 - Using LSF-HPC

NOTE: The --nodelist=nodelist option is particularly useful for determining problematic nodes. If you use this option and the --nnodes=n option, the --n…

Page 101

List of Figures
4-1 Library Directory Structure …

Page 102 - Using LSF-HPC

$ ovp --verify=perf_health/cpu_usage
XC CLUSTER VERIFICATION PROCEDURE
date time
Verify perf_health: Testing cpu_usage ... +++ PASSED +++
This v…

Page 103 - 11 Advanced Topics

Verify perf_health: Testing memory ...
    Specified nodelist is n[11-15]
    Number of nodes allocated for this test is 5
    Job <103…

Page 105

8 Tuning Applications
This chapter discusses how to tune applications in the HP XC environment.
8.1 Using the Intel Trace Collector and Intel Trace Anal…

Page 106 - Advanced Topics

Example 8-1 The vtjacobic Example Program
For the purposes of this example, the examples directory under /opt/IntelTrace/ITC is copied to the user'…

Page 107 - 11.3.1 Example Procedure 1

8.2 The Intel Trace Collector and Analyzer with HP-MPI on HP XC
NOTE: The Intel Trace Collector (ITC) was formerly known as VampirTrace. The Intel Trac…

Page 108 - 11.3.2 Example Procedure 2

Running a Program
Ensure that the -static-libcxa flag is used when you use mpirun.mpich to launch a C or Fortran program. The following is a C example ca…

Page 109 - 11.5.2 Private File View

86 Difference is 2.809467246160129E-005
88 Difference is 2.381154327036583E-005
90 Difference is 2.01814296456522…

Page 111

9 Using SLURM
HP XC uses the Simple Linux Utility for Resource Management (SLURM) for system resource management and job scheduling. This chapter address…

Page 113 - A Examples

Example 9-1 Simple Launch of a Serial Program
$ srun hostname
n1
9.3.1 The srun Roles and Modes
The srun command submits jobs to run under SLURM manageme…

Page 114 - Submit the job:

Example 9-3 Reporting on Failed Jobs in the Queue
$ squeue --state=FAILED
JOBID  PARTITION  NAME  USER  ST  TIME  NODES  NODELIST(REASON)…

Page 115 - View the job:

# chmod a+r /hptc_cluster/slurm/job/jobacct.log
You can find detailed information on the sacct command and job accounting data in the sacct(1) manpage.
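
For example, once the log is readable, the accounting data for a single job can be pulled with sacct (job ID illustrative):

$ sacct --jobs=103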

Page 116 - Examples

10 Using LSF-HPC
The Load Sharing Facility (LSF-HPC) from Platform Computing Corporation is a batch system resource manager used on the HP XC system. On…

Page 117 - Show the environment:

• The bsub command is used to submit jobs to LSF.
• The bjobs command provides information on batch jobs.
10.2 Overview of LSF-HPC Integrated with SLURM…

Page 118 - Run the job:

10.3 Differences Between LSF-HPC and LSF-HPC Integrated with SLURM
LSF-HPC integrated with SLURM for the HP XC environment supports all the standard fe…

Page 119 - Submit the job:

$ lshosts
HOST_NAME    type     model     cpuf  ncpus  maxmem  maxswp  server  RESOURCES
lsfhost.loc  SLINUX6  Opteron8  60.0  8      2007M   -       Yes     (slurm)
$…

Page 120 - View the node state:

Pseudo-parallel job
A job that requests only one slot but specifies any of these constraints:
• mem
• tmp
• nodes=1
• mincpus > 1
Pseudo-parallel jobs a…
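
A hedged sketch of a submission that would fall into this category (application name hypothetical): one slot is requested, but a constraint is attached through the external scheduler:

$ bsub -n1 -ext "SLURM[mincpus=4]" -I ./threaded_app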

Page 121 - View the finished job:

10.6 Submitting Jobs
The bsub command submits jobs to LSF-HPC; it is used to request a set of resources on which to launch a job. This section focuses o…

Page 122 - Examples

Figure 10-1 How LSF-HPC and SLURM Launch and Manage a Job
[Figure: numbered callouts show a user on the login node running $ bsub -n4 -ext "SLURM[node…", LSF-HPC dispatching the job through the job_starter.sh script, and the script running $ srun -n1 myscript on the allocated compute nodes.]

Page 123 - Glossary

List of Tables
1-1 Determining the Node Platform …

Page 124 - Glossary

4. LSF-HPC prepares the user environment for the job on the LSF execution host node and dispatches the job with the job_starter.sh script. This user envi…

Page 125

LSF-HPC daemons run on only one node in the HP XC system, so the bhosts command will list one host, which represents all the resources of the HP XC sys…

Page 126 - Glossary

In the previous example output, the LSF execution host (lsfhost.localdomain) is listed under the HOST_NAME column. The status is listed as ok, indicati…

Page 127

After LSF-HPC integrated with SLURM allocates nodes for a job, it attaches allocation information to the job. The bjobs -l command provides job allocati…

Page 128

Example 10-2 Job Allocation Information for a Finished Job
$ bhist -l 24
Job <24>, User <lsfadmin>, Project <default>,…

Page 129

Example 10-4 Using the bjobs Command (Long Output)
$ bjobs -l 24
Job <24>, User <msmith>, Project <default>, Status <RUN>,…

Page 130 - Index

For detailed information about a finished job, add the -l option to the bhist command, as shown in Example 10-6. The -l option specifies that the long for…

Page 131

123    hptclsf@99  lsf  8  RUNNING  0
123.0  hptclsf@99  lsf  0  RUNNING  0
In these examples, the job…

Page 132 - Index

You can simplify this by first setting the SLURM_JOBID environment variable to the SLURM JOBID in the environment, as follows:
$ export SLURM_JOBID=150
$…

Page 133

$ export SLURM_JOBID=150
$ export SLURM_NPROCS=4
$ mpirun -tv srun [additional parameters as needed]
After you finish with this interactive allocation, exi…
