HP XC System 3.x Software User Manual


Table of Contents

Page 1 - User's Guide

HP XC System Software User's Guide. Version 3.2.1. HP Part Number: A-XCUSR-321. Published: October 2007.

Page 3 - Table of Contents

sometime in the future, depending on resource availability and batch system scheduling policies. Batch job submissions typically provide instructions on …

Page 4 - 4 Table of Contents

LSF-HPC allocates the appropriate whole node for exclusive use by the serial job in the same manner as it does for parallel jobs, hence the name “pseudo-parallel” …

Page 5 - Table of Contents 5

The HP XC system has several features that make it optimal for running parallel applications, particularly (but not exclusively) MPI applications. You …

Page 6 - 6 Table of Contents

Figure 10-1 How LSF-HPC and SLURM Launch and Manage a Job [figure: a user on a login node submits with $ bsub -n4 -ext "SLURM[node…", and the job_starter.sh script runs $ srun -n1 myscript on the allocated nodes]

Page 7 - Table of Contents 7

4. LSF-HPC prepares the user environment for the job on the LSF execution host node and dispatches the job with the job_starter.sh script. This user environment …

Page 8

10.10.1 Examining System Core Status
The bhosts command displays LSF-HPC resource usage information. This command is useful to examine the status of the …
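
The excerpt above cuts off mid-sentence. As a minimal sketch of the check it describes, bhosts can be run with no arguments; the column names noted below are standard LSF output fields, though the exact values depend on the cluster:

$ bhosts
# typical columns: HOST_NAME, STATUS, JL/U, MAX, NJOBS, RUN, SSUSP, USUSP, RSV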

Page 9 - List of Figures

• The maxmem column displays the minimum maxmem over all available compute nodes in the lsf partition.
• The maxtmp column (not shown) displays the minimum max…

Page 10

10.11 Getting Information About Jobs
There are several ways you can get information about a specific job after it has been submitted to LSF-HPC integrated with SLURM …

Page 11 - List of Tables

Example 10-3 Job Allocation Information for a Running Job
$ bjobs -l 24
Job <24>, User <lsfadmin>, Project <default>, …

Page 12

Example 10-5 Using the bjobs Command (Short Output)
$ bjobs 24
JOBID USER   STAT QUEUE FROM_HOST EXEC_HOST JOB_NAME SUBMIT_TIME
24    msmith …

Page 13 - List of Examples

List of Tables
1-1 Determining the Node Platform … 24
1-…

Page 14

Table 10-2 Output Provided by the bhist Command (Field — Description):
JOBID — The job ID that LSF-HPC assigned to the job.
USER — The user who submitted the job.
…

Page 15 - About This Document

$ bjobs -l 99 | grep slurm
date and time stamp: slurm_id=123;ncpus=8;slurm_alloc=n[13-16];
The SLURM JOBID is 123 for the LSF JOBID 99. You can also find …
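
Once the SLURM job ID has been recovered this way, SLURM's own tools can query the allocation directly; a minimal sketch using the IDs from the excerpt:

$ bjobs -l 99 | grep slurm   # map the LSF JOBID (99) to the SLURM JOBID (123)
$ squeue --jobs 123          # then inspect that job with SLURM's squeue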

Page 16

$ bjobs -l 124 | grep slurm
date and time stamp: slurm_id=150;ncpus=8;slurm_alloc=n[1-4];
LSF allocated nodes n[1-4] for this job. The SLURM JOBID is 150 …

Page 17 - Related Information

Example 10-10 Launching an Interactive MPI Job on All Cores in the Allocation
This example assumes 2 cores per node.
$ mpirun -srun --jobid=150 -n8 hell…

Page 18

10.14 LSF-HPC Equivalents of SLURM srun Options
Table 10-3 describes the srun options and lists their LSF-HPC equivalents.
Table 10-3 LSF-HPC Equivalent…

Page 19 - Linux Web Sites

Table 10-3 LSF-HPC Equivalents of SLURM srun Options (continued)
Columns: srun Option — Description — LSF-HPC Equivalent.
… You can use when launching parallel tasks. Spec…

Page 20 - Manpages

Table 10-3 LSF-HPC Equivalents of SLURM srun Options (continued)
Columns: srun Option — Description — LSF-HPC Equivalent.
… Use as an argument to srun when launching paral…

Page 21 - HP Encourages Your Comments

11 Advanced Topics
This chapter covers topics intended for the advanced user. It addresses the following topics:
• “Enabling Remote Execution w…

Page 22

$ echo $DISPLAY
:0
Next, get the name of the local machine serving your display monitor:
$ hostname
mymachine
Then, use the host name of your local machine …
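
The excerpt breaks off before the final step. A minimal sketch of how that step presumably continues, reusing the placeholder host name mymachine from the excerpt (sh/bash syntax is an assumption; csh users would use setenv):

$ export DISPLAY=mymachine:0   # point the remote session's X display at the local machine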

Page 23 - 1.1 System Architecture

$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
lsf       up    infinite      2 idle  n[46,48]
According to the information returned about this …

Page 25 - 1.1.6 File System

One way is to prefix the actual compilation line in the rule with an srun command. So, instead of executing cc foo.c -o foo.o it would execute srun cc …
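
A minimal sketch of such a rule, using the foo.c/foo.o names from the excerpt (the rule itself is hypothetical, and the recipe line must be tab-indented):

# offload each compile onto a compute node by prefixing the compiler with srun
foo.o: foo.c
	srun cc -c foo.c -o foo.o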

Page 26 - File System Layout

then \
  echo "Making $$i ..."; \
  (cd $$i; make); \
  echo ""; \
fi; \
done

Page 27 - 1.3 User Environment

$(MAKE) $(MAKE_J) struct_matrix_vector/libHYPRE_mv.a struct_linear_solvers/libHYPRE_ls.a utilities/libHYPRE_utilities.a $(PREFIX) $(MAK…

Page 28 - 1.4.1 Parallel Applications

11.5 I/O Performance Considerations
Before building and running your parallel application, I/O performance issues on the HP XC cluster must be considered …

Page 29 - 1.5 Run-Time Environment

respectively. These subsections are not full solutions for integrating MPICH with the HP XC System Software.
Figure 11-1 MPICH Wrapper Script
#!/bin/csh
s…

Page 30 - 1.5.5 HP-MPI

A Examples
This appendix provides examples that illustrate how to build and run applications on the HP XC system. The examples in this section show you …

Page 31

Examine the partition information:
$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
lsf       up    infinite      6 idle  n[5-10]
Examine the loca…

Page 32

date and time stamp: Submitted from host <lsfhost.localdomain>, CWD <$HOME>, 2 Processors Requested;
date and time stam…

Page 33 - 2 Using the System

example steps through a series of commands that illustrate what occurs when you launch an interactive shell.
Examine the LSF execution host information: …

Page 34 - 2.2.1 Introduction

Summary of time in seconds spent in various states by date and time
PEND  PSUSP  RUN  USUSP  SSUSP  UNKWN  TOTAL
11    0      124  0      0      0      …

Page 35 - 2.2.8 Resuming Suspended Jobs

List of Examples
5-1 Submitting a Job from the Standard Input … 54
5-2 …

Page 36 - 36 Using the System

srun hostname
srun uname -a
Run the job:
$ bsub -I -n4 myjobscript.sh
Job <1006> is submitted to default queue <normal>.
<<Waiting for di…
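
From the two srun lines visible in the excerpt, myjobscript.sh is presumably a sketch along these lines (the #!/bin/sh header is an assumption):

#!/bin/sh
srun hostname
srun uname -a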

Page 37 - 3.1 Overview of Modules

Show the SLURM job ID:
$ env | grep SLURM
SLURM_JOBID=74
SLURM_NPROCS=8
Run some commands from the pseudo-terminal:
$ srun hostname
n13
n13
n14
n14
n15
n15
n16
n16

Page 38 - 3.2 Supplied Modulefiles

View the node state:
$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
lsf       up    infinite      4 idle  n[13-16]
A.7 Submitting an HP-MPI Job…

Page 39

EXTERNAL MESSAGES:
MSG_ID FROM     POST_TIME MESSAGE ATTACHMENT
0      -        -         -       -
1      lsfadmin …

Page 41 - 3.9 Modulefile Conflicts

Glossary
A
administration branch — The half (branch) of the administration network that contains all of the general-purpose administration ports to the nodes…

Page 42 - 3.10 Creating a Modulefile

operating system and its loader. Together, these provide a standard environment for booting an operating system and running preboot applications.
enclosure…

Page 43 - 4 Developing Applications

image server — A node specifically designated to hold images that will be distributed to one or more client systems. In a standard HP XC installation, th…

Page 44 - 4.2 Compilers

LVS — Linux Virtual Server. Provides a centralized login capability for system users. LVS handles incoming login requests and directs them to a node with …

Page 45 - 4.5 Setting Debugging Options

onboard administrator — See OA.
P
parallel application — An application that uses a distributed programming model and can run on multiple processors. An HP XC MP…

Page 47 - 4.7.1.5 Quadrics SHMEM

an HP XC system, the use of SMP technology increases the number of CPUs (amount of computational power) available per unit of space.
ssh — Secure Shell. A…

Page 48 - 48 Developing Applications

Index
A
ACML library, 49
application development, 43
building parallel applications, 49
building serial applications, 46
communication between nodes, 123
com…

Page 49 - $ module load mpi

CP3000, 24
MKL library, 49
system interconnect, 26
CP3000BL, 24
CP4000, 24
ACML library, 49
compilers, 44, 48
designing libraries for, 50
MKL library, 49
softw…

Page 50 - 4.8 Developing Libraries

job accounting, 94
job allocation information
obtaining, 107
job manager, 96
job scheduler, 96
JOBID translation, 110
L
launching jobs
srun, 91
libraries, 31
bu…

Page 51 - 4.8 Developing Libraries 51

building, 49
compiling and linking, 49
debugging, 67
debugging with TotalView, 68
developing, 43
environment for developing, 28
examples of, 125
partition
rep…

Page 52 - 52 Developing Applications

setting preferences, 69
setting up, 68
tuning applications, 85
U
Unified Parallel C (see UPC)
UPC, 43
user environment, 37
utilization metrics, 73
V
Vampir, 87

Page 53 - 5 Submitting Jobs

About This Document
This document provides information about using the features and functions of the HP XC System Software. It describes how the HP XC u…

Page 54 - 54 Submitting Jobs

audit(5) — A manpage. The manpage name is audit, and it is located in Section 5.
Command — A command name or qualified command phrase.
Computer output — Text dis…

Page 55 - 5.3 Submitting a Parallel Job

Provides an overview of the HP XC system administrative environment, cluster administration tasks, node maintenance tasks, LSF® administration tasks, and…

Page 56 - 56 Submitting Jobs

Supplementary Software Products — This section provides links to third-party and open source software products that are integrated into the HP XC System…

Page 57

• http://www.llnl.gov/linux/pdsh/
Home page for the parallel distributed shell (pdsh), which executes commands across HP XC client nodes in parallel.
• h…

Page 58

© Copyright 2003, 2005, 2006, 2007 Hewlett-Packard Development Company, L.P.
Confidential computer software. Valid license from HP required for possession…

Page 59

MPI Web Sites
• http://www.mpi-forum.org
Contains the official MPI standards documents, errata, and archives of the MPI Forum. The MPI Forum is an open g…

Page 60 - 60 Submitting Jobs

Manpages for third-party software components might be provided as a part of the deliverables for that component.
Using discover(8) as an example, you can…

Page 62 - 62 Submitting Jobs

1 Overview of the User Environment
The HP XC system is a collection of computer nodes, networks, storage, and software, built into a cluster, that work…

Page 63

$ head /proc/cpuinfo
Table 1-1 presents the representative output for each of the platforms. This output may differ according to changes in models and s…

Page 64 - 64 Submitting Jobs

distributes login requests from users. A node with the login role is referred to as a login node in this manual.
compute role — The compute role is assigned…

Page 65 - -R "type=SLINUX64"

the HP XC. So, for example, if the HP XC system interconnect is based on a Quadrics® QsNetII® switch, then the SFS will serve files over ports on that …

Page 66

Additional information on supported system interconnects is provided in the HP XC Hardware Preparation Guide.
1.1.8 Network Address Translation (NAT)
The…

Page 67 - 6 Debugging Applications

Modulefiles can be loaded into your environment automatically when you log in to the system, or any time you need to alter the environment. The HP…

Page 68 - 6.2.1.2 Setting Up TotalView

1.4.2 Serial Applications
You can build and run serial applications under the HP XC development environment. A serial application is a command or applic…

Page 69

Table of Contents
About This Document … 15
Intended…

Page 70 - 70 Debugging Applications

1.5.3 Standard LSF
Standard LSF is also available on the HP XC system. The information for using standard LSF is documented in the LSF manuals from Platform…

Page 71 - 6.2.1.8 Exiting TotalView

— however, it manages the global MPI exchange so that all processes can communicate with each other.
See the HP-MPI documentation for more information.
1…

Page 73 - 7 Monitoring Node Activity

2 Using the System
This chapter describes the tasks and commands that the general user must know to use the system. It addresses the following topics: •…

Page 74 - 74 Monitoring Node Activity

overview of some basic ways of running and managing jobs. Full information and details about the HP XC job launch environment are provided in “Using…

Page 75

For more information about using this command and a sample of its output, see “Examining System Core Status” (page 105).
• The LSF lshosts command displays…

Page 76

2.3.1 Determining the LSF Cluster Name and the LSF Execution Host
The lsid command returns the LSF cluster name, the LSF-HPC version, and the name of the…

Page 77

3 Configuring Your Environment with Modulefiles
The HP XC system supports the use of Modules software to make it easier to configure and modify your…

Page 78 - $ xcxperf test

access the mpi** scripts and libraries. You can specify the compiler it uses through a variety of mechanisms long after the modulefile is loaded.
The pr…

Page 79

Table 3-1 Supplied Modulefiles (continued) (Modulefile — Sets the HP XC User Environment to Use:)
icc/8.1/default — Intel C/C++ Version 8.1 compilers.
Intel C/C+…

Page 80 - 80 Monitoring Node Activity

2.3.1 Determining the LSF Cluster Name and the LSF Execution Host … 36
2.4 Getting System Help and Information …

Page 81

Each module supplies its own online help. See “Viewing Modulefile-Specific Help” for information on how to view it.
3.3 Modulefiles Automatically Loaded…

Page 82 - 82 Monitoring Node Activity

For example, if you wanted to automatically load the TotalView modulefile when you log in, edit your shell startup script to include the following instruction, as sketched below.
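
A minimal sketch of such a startup-script entry, assuming a bash login shell and that the modulefile is named totalview (both are assumptions; csh users would edit their own startup file):

# in ~/.bashrc: load the TotalView modulefile at login
module load totalview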

Page 83

In this example, a user attempted to load the ifort/8.0 modulefile. After the user issued the command to load the modulefile, an error message occurred…

Page 84

4 Developing Applications
This chapter discusses topics associated with developing applications in the HP XC environment. Before reading this chapter, you…

Page 85 - 8 Tuning Applications

HP UPC is a parallel extension of the C programming language, which runs on both common types of multiprocessor systems: those with a common global address…

Page 86 - 86 Tuning Applications

4.3 Examining Nodes and Partitions Before Running Jobs
Before launching an application, you can determine the availability and status of the system's…

Page 87 - 8.2.1 Installation Kit

4.6.1 Serial Application Build Environment
You can build and run serial applications in the HP XC programming environment. A serial application is a com…

Page 88 - Running a Program

4.7.1.1 Modulefiles
The basics of your working environment are set up automatically by your system administrator during the installation of HP XC. However…

Page 89

To compile programs that use SHMEM, it is necessary to include the shmem.h file and to use the SHMEM and Elan libraries. For example:
$ gcc -o shping sh…

Page 90

4.7.1.12 MKL Library
MKL is a math library that references pthreads, and in enabled environments, can use multiple threads. MKL can be linked in a single…

Page 91 - 9 Using SLURM

5.2 Submitting a Serial Job Using LSF-HPC … 53
5.2.1 Submitting a…

Page 92 - 9.3.1.2 The srun Modes

To compile and link a C application using the mpicc command:
$ mpicc -o mycode hello.c
To compile and link a Fortran application using the mpif90 command…
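
The Fortran command is cut off; by analogy with the mpicc line it is presumably of this form (the source file name hello.f90 is an assumption):

$ mpif90 -o mycode hello.f90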

Page 93

names. However, HP recommends an alternative method. The dynamic linker, during its attempt to load libraries, will suffix candidate directories with the…

Page 94 - 9.9 Security

NOTE: There is no shortcut as there is for the dynamic loader.

Page 95 - 10 Using LSF-HPC

5 Submitting Jobs
This chapter describes how to submit jobs on the HP XC system; it addresses the following topics:
• “Overview of Job Submission” (page…

Page 96 - 96 Using LSF-HPC

The srun command is only necessary to launch the job on the allocated node if the HP XC JOB_STARTER script is not configured to run a job on the compute…

Page 97

The following is the C source code for this program; the file name is hw_hostname.c.
#include <unistd.h>
#include <stdio.h>
int main() {
…

Page 98

bsub -n num-procs [bsub-options] srun [srun-options] jobname [job-options]
The bsub command submits the job to LSF-HPC.
The -n num-procs parameter, which…
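
A concrete instance of this general form, combining options that appear elsewhere in this guide (-I requests an interactive job):

$ bsub -n4 -I srun hostname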

Page 99 - 10.4 Job Terminology

The srun command, used by the mpirun command to launch the MPI tasks in parallel in the lsf partition, determines the number of tasks to launch from the…

Page 100 - 100 Using LSF-HPC

With LSF-HPC integrated with SLURM, you can use the LSF-SLURM External Scheduler to pass SLURM options that specify the minimum number of nodes required…

Page 101 - 10.6 Submitting Jobs

Example 5-9 Using the External Scheduler to Submit a Job to Run on Specific Nodes
$ bsub -n4 -ext "SLURM[nodelist=n6,n8]" -I srun hostname
Job…

Page 102 - 102 Using LSF-HPC

9.3.3 Using the srun Command with LSF-HPC … 92
9.4 Monitoring Jobs with the…

Page 103 - SLURM_NPROCS=4

Example 5-13 Using the External Scheduler to Constrain Launching to Nodes with a Given Feature
$ bsub -n 10 -ext "SLURM[constraint=dualcore]" …

Page 104

Example 5-15 Submitting a Batch Script with the LSF-SLURM External Scheduler Option
$ bsub -n4 -ext "SLURM[nodes=4]" -I ./myscript.sh
Job <…

Page 105

Example 5-18 Environment Variables Available in a Batch Job Script
$ cat ./envscript.sh
#!/bin/sh
name=`hostname`
echo "hostname = $name"
echo…
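
The script listing is truncated. Based on the visible echo lines and the SLURM variables discussed in this chapter, a sketch of such a script might be (which variables the original prints is an assumption):

#!/bin/sh
name=`hostname`
echo "hostname = $name"
echo "SLURM_JOBID = $SLURM_JOBID"
echo "SLURM_NPROCS = $SLURM_NPROCS"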

Page 106 - 106 Using LSF-HPC

The ping_pong_ring application is submitted twice in a Makefile named mymake; the first time as run1 and the second as run2:
$ cat mymake
PPR_ARGS=1000 …

Page 107

1 This line attempts to submit a program that does not exist.
The following command line makes the program and executes it:
$ bsub -o %J.out -n2 -ext "…

Page 108 - 108 Using LSF-HPC

5.6 Submitting a Job from a Host Other Than an HP XC Host
To submit a job from a host other than an HP XC host to the HP XC system, use the LSF -R option…

Page 110

6 Debugging Applications
This chapter describes how to debug serial and parallel applications in the HP XC development environment. In general, effective…

Page 111

6.2.1 Debugging with TotalView
TotalView is a full-featured, GUI-based debugger specifically designed to meet the requirements of parallel applications…

Page 112 - 112 Using LSF-HPC

6.2.1.3 Using TotalView with SLURM
Use the following commands to allocate the nodes you need before you debug an application with SLURM, as shown here:
$…
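
The commands themselves are cut off. On the SLURM versions of that era, an allocation could be created with srun's allocate mode, so the sequence was presumably something like the following sketch (both lines are assumptions, not taken from the manual; HP-MPI's mpirun accepts -tv to start a program under TotalView):

$ srun -n4 -A                 # create a 4-task allocation (-A was srun's allocate mode)
$ mpirun -tv -srun ./a.out    # launch the program under TotalView inside that allocation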

Page 113

A.2 Launching a Serial Interactive Shell Through LSF-HPC … 125
A.3 Running LSF-HPC Jobs with…

Page 114

6.2.1.6 Debugging an Application
This section describes how to use TotalView to debug an application.
1. Compile the application to be debugged. For exa…

Page 115

6.2.1.7 Debugging Running Applications
As an alternative to the method described in “Debugging an Application”, it is also possible to "attach"…

Page 117 - 11 Advanced Topics

7 Monitoring Node Activity
This chapter describes the optional utilities that provide performance information about the set of nodes associated with your…

Page 118 - 118 Advanced Topics

Figure 7-1 The xcxclus Utility Display
The icons show most node utilization statistics as a percentage of the total resource utilization. For example, Figure…

Page 119

1 The node designator is on the upper left of the icon.
2 The left portion of the icon represents the Ethernet connection or connections. In this illustra…

Page 120 - 120 Advanced Topics

CPU Info — Enables you to display core utilization in terms of user or system statistics, or both.
Enables you to specify the data in terms of bandwidth or…

Page 121 - 11.3.2 Example Procedure 2

Figure 7-4 Plotting the Data from the xcxclus Utility
The xcxclus utility uses the GNUplot open source plotting program.
See xcxclus(1) for more information…

Page 122 - 11.3.3 Example Procedure 3

Use the -node option to specify another node in your allocation. For example, enter the following to display the performance metrics for node n6:
$ xcxp…

Page 123 - 11.5.2 Private File View

Ethernet — Displays the color map for the data on interrupts.
Interrupts — Displays the color map for the data on interrupts.
Displays the color map for the d…

Page 125 - A Examples

Figure 7-7 Plotting Node Data from the xcxperf Utility
The xcxperf utility plots the performance data and displays as much data as can be shown on one…

Page 126 - View the job:

NOTE: The --nodelist=nodelist option is particularly useful for determining problematic nodes.
If you use this option and the --nnodes=n option, the --n…

Page 127

network_stress network_bidirectional network_unidirectional
By default, the ovp command reports whether the nodes passed or failed the given test. …

Page 128 - 128 Examples

Details of this verification have been recorded in:
HOME_DIRECTORY/ovp_n16_mmddyy.log
The following example tests the memory of nodes n11, n12, n13, n14…

Page 130 - Show the job allocation:

8 Tuning Applications
This chapter discusses how to tune applications in the HP XC environment.
8.1 Using the Intel Trace Collector and Intel Trace Analyzer…

Page 131 - View the finished jobs:

Example 8-1 The vtjacobic Example Program
For the purposes of this example, the examples directory under /opt/IntelTrace/ITC is copied to the user's…

Page 132 - View the running job:

8.2 The Intel Trace Collector and Analyzer with HP-MPI on HP XC
NOTE: The Intel Trace Collector (ITC) was formerly known as VampirTrace. The Intel Trac…

Page 133 - View the finished job:

Running a Program
Ensure that the -static-libcxa flag is used when you use mpirun.mpich to launch a C or Fortran program.
The following is a C example ca…

Page 134

[0] Intel Trace Collector INFO: Writing tracefile vtjacobif.stf in /nis.home/user_name/xc_PDE_work/ITC_examples_xc6000
mpirun exits with status: 0
Runni…

Page 135 - Glossary

List of Figures
4-1 Library Directory Structure …

Page 137

9 Using SLURM
HP XC uses the Simple Linux Utility for Resource Management (SLURM) for system resource management and job scheduling.
This chapter addresses…

Page 138 - 138 Glossary

The srun command handles both serial and parallel jobs.
The srun command has a significant number of options to control the execution of your application…
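
As a simple illustration of those options (the combination shown is illustrative, not from the manual):

$ srun -N4 -n8 hostname   # run 8 tasks spread across 4 nodes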

Page 139

Example 9-2 Displaying Queued Jobs by Their JobIDs
$ squeue --jobs 12345,12346
JOBID PARTITION NAME USER ST TIME_USED NODES NODELIST(REASON)
…

Page 140 - 140 Glossary

Example 9-8 Reporting Reasons for Downed, Drained, and Draining Nodes
$ sinfo -R
REASON        NODELIST
Memory errors …

Page 141

10 Using LSF-HPC
The Load Sharing Facility (LSF-HPC) from Platform Computing Corporation is a batch system resource manager used on the HP XC system.
On…

Page 142 - 142 Index

LSF-HPC is installed and configured on all nodes of the HP XC system by default. Nodes without the compute role are closed, with '0' job slots…

Page 143

SLURM_JOBID — This environment variable is created so that subsequent srun commands make use of the SLURM allocation created by LSF-HPC for the job. This…
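
A sketch of what this enables inside a batch script or interactive allocation (illustrative):

$ echo $SLURM_JOBID   # the allocation LSF-HPC created for this job
$ srun hostname       # subsequent srun commands reuse that allocation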

Page 144 - 144 Index

Example 10-2 Examples of Launching LSF-HPC Jobs Without the srun Command
The following bsub command line invokes the bash shell to run the hostname command…

Page 145

• LSF-HPC integrated with SLURM only runs daemons on one node within the HP XC system. This node hosts an HP XC LSF Alias, which is an IP address and c…
