HP XC System 4.x Software User Manual

Browse online or download the user manual for the HP XC System 4.x Software (HP XC System 4.x Software User Guide).


Contents

Page 1 - XC User Guide

HP XC System Software XC User Guide, Version 4.0. HP Part Number: A-XCUSR-40a. Published: February 2009.

Page 2

List of Examples: 5-1 Submitting a Job from the Standard Input ... 50; 5-2 ...

Page 3 - Table of Contents

Table 10-2 Output Provided by the bhist Command (continued). Field: UNKWN; Description: The total unknown time of the job. The total time that the job has spent ...

Page 4 - 4 Table of Contents

$ sacct -j 123
Jobstep    Jobname            Partition  Ncpus   Status     Error
---------- ------------------ ---------- ------- ---------- -----
123 ...

Page 5 - Table of Contents 5

$ export SLURM_JOBID=150
$ srun hostname
n1
n2
n3
n4
Note: Be sure to unset the SLURM_JOBID when you are finished with the allocation, to prevent a previous S...

Page 6 - 6 Table of Contents

Note: If you exported any variables, such as SLURM_JOBID and SLURM_NPROCS, be sure to unset them as follows before submitting any further jobs from the ...
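The preview cuts off before the commands themselves; as a minimal sketch (assuming a bash or other POSIX shell), the unset step would look like this:
$ unset SLURM_JOBID
$ unset SLURM_NPROCS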

Page 7 - Table of Contents 7

Table 10-3 LSF Equivalents of SLURM srun Options (continued); columns: srun Option, Description, LSF Equivalent. LSF equivalent -ext "SLURM[nodelist=node1,...nodeN]": Requests ...

Page 8 - List of Figures

Table 10-3 LSF Equivalents of SLURM srun Options (continued); columns: srun Option, Description, LSF Equivalent. You cannot use this option. LSF uses this option to cre...

Page 9 - List of Tables

Table 10-3 LSF Equivalents of SLURM srun Options (continued); columns: srun Option, Description, LSF Equivalent. Use as an argument to srun when launching parallel tasks ...

Page 10 - List of Examples

11 Advanced Topics: This chapter covers topics intended for the advanced user. This chapter addresses the following topics: • "Enabling Remote Execution w...

Page 11 - About This Document

$ echo $DISPLAY
:0
Next, get the name of the local machine serving your display monitor:
$ hostname
mymachine
Then, use the host name of your local machine ...
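The preview truncates the final step. A minimal sketch of the idea (the assignment below is an assumption for illustration; mymachine:0 stands in for your own workstation and display):
$ export DISPLAY=mymachine:0    # point X clients on the HP XC node back at your local display
$ xterm &                       # an X application started now should appear on your monitor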

Page 12

$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
lsf       up    infinite      2  idle n[46,48]
According to the information returned about this ...

Page 13 - Related Information

About This Document: This document provides information about using the features and functions of the HP XC System Software. It describes how the HP XC u...

Page 14

proceed while another is waiting for I/O. On an HP XC system, there is the potential to use compute nodes to do compilations, and there are a variety o...

Page 15 - Linux Web Sites

for i in ${HYPRE_DIRS}; \
do \
  if [ -d $$i ]; \
  then \
    echo "Making $$i ..."; \
    (cd $...

Page 16 - Manpages

Modified Makefile:
all:
    $(MAKE) $(MAKE_J) struct_matrix_vector/libHYPRE_mv.a struct_linear_solvers/libHYPRE_ls.a utilities/libHYPRE_utilities.a

Page 17 - HP Encourages Your Comments

11.5 I/O Performance Considerations: Before building and running your parallel application, I/O performance issues on the HP XC cluster must be considered ...

Page 18

respectively. These subsections are not full solutions for integrating MPICH with the HP XC System Software. Figure 11-1 MPICH Wrapper Script: #!/bin/csh s...

Page 19 - 1.1 System Architecture

A Examples: This appendix provides examples that illustrate how to build and run applications on the HP XC system. The examples in this section show you ...

Page 20 - 1.1.4 Node Specialization

Examine the partition information:
$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
lsf       up    infinite      6  idle n[5-10]
Examine the loca...

Page 21 - 1.1.6 File System

date and time stamp: Submitted from host <lsfhost.localdomain>, CWD <$HOME>, 2 Processors Requested;
date and time stam...

Page 22 - File System Layout

steps through a series of commands that illustrate what occurs when you launch an interactive shell. Examine the LSF execution host information:
$ bhosts ...

Page 23 - 1.3 User Environment

Summary of time in seconds spent in various states by date and time:
PEND  PSUSP  RUN  USUSP  SSUSP  UNKWN  TOTAL
11    0      124  0      0      0      ...

Page 24 - 1.4.1 Parallel Applications

Ctrl+x: A key sequence. A sequence such as Ctrl+x indicates that you must hold down the key labeled Ctrl while you press another key or mouse button. ENVI...

Page 25 - 1.5 Run-Time Environment

srun hostname
srun uname -a
Run the job:
$ bsub -I -n4 myjobscript.sh
Job <1006> is submitted to default queue <normal>.
<<Waiting for di...

Page 26 - 1.5.5 HP-MPI

Show the SLURM job ID:
$ env | grep SLURM
SLURM_JOBID=74
SLURM_NPROCS=8
Run some commands from the pseudo-terminal:
$ srun hostname
n13
n13
n14
n14
n15
n15
n16
n16

Page 27

View the node state:
$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
lsf       up    infinite      4  idle n[13-16]
A.7 Submitting an HP-MPI Jo...

Page 28

EXTERNAL MESSAGES:
MSG_ID  FROM      POST_TIME  MESSAGE  ATTACHMENT
0       -         -          -        -
1       lsfadmin ...

Page 30 - 2.2.1 Introduction

Glossary. A: administration branch: The half (branch) of the administration network that contains all of the general-purpose administration ports to the nodes ...

Page 31 - 2.2.8 Resuming Suspended Jobs

operating system and its loader. Together, these provide a standard environment for booting an operating system and running preboot applications. enclos...

Page 32 - 32 Using the System

image server: A node specifically designated to hold images that will be distributed to one or more client systems. In a standard HP XC installation, th...

Page 33 - 3.1 Overview of Modules

LVS: Linux Virtual Server. Provides a centralized login capability for system users. LVS handles incoming login requests and directs them to a node with ...

Page 34 - 3.2 Supplied Modulefiles

onboard administrator: See OA. P: parallel application: An application that uses a distributed programming model and can run on multiple processors. An HP XC MP...

Page 35

Provides an overview of managing the HP XC user environment with modules, managing jobs with LSF, and describes how to build, run, debug, and troublesho...

Page 36 - 3.6 Loading a Modulefile

an HP XC system, the use of SMP technology increases the number of CPUs (amount of computational power) available per unit of space. ssh: Secure Shell. A...

Page 37 - 3.9 Modulefile Conflicts

Index. A: ACML library, 45; application development, 39; building parallel applications, 45; building serial applications, 42; communication between nodes, 113; com...

Page 38 - 3.10 Creating a Modulefile

CP3000, 20; MKL library, 45; system interconnect, 22; CP3000BL, 20; CP4000, 20; ACML library, 45; compilers, 40, 44; designing libraries for, 46; MKL library, 45; softw...

Page 39 - 4 Developing Applications

job accounting, 84; job allocation information: obtaining, 97; job manager, 86; job scheduler, 86; JOBID translation, 100; L: launching jobs: srun, 81; libraries, 27; bui...

Page 40 - 4.2 Compilers

developing, 39; environment for developing, 24; examples of, 115; partition: reporting state of, 83; PATH environment variable: setting with a module, 34; Pathscale ...

Page 41 - 4.5 Setting Debugging Options

UPC, 39; user environment, 33; V: Vampir, 77; VampirTrace/Vampir, 75; W: Web site: HP XC System Software documentation, 12; X: xterm: running from remote node, 107

Page 42 - 42 Developing Applications

software components are generic, and the HP XC adjective is not added to any reference to a third-party or open source command or product name. For exa...

Page 43 - 4.7.1.5 Quadrics SHMEM

• http://www.balabit.com/products/syslog_ng/
Home page for syslog-ng, a logging tool that replaces the traditional syslog functionality. The syslog-ng t...

Page 44 - 44 Developing Applications

MPI Web Sites
• http://www.mpi-forum.org
Contains the official MPI standards documents, errata, and archives of the MPI Forum. The MPI Forum is an open g...

Page 45 - $ module load mpi

Manpages for third-party software components might be provided as a part of the deliverables for that component. Using discover(8) as an example, you ca...

Page 47 - 4.8 Developing Libraries 47

1 Overview of the User Environment: The HP XC system is a collection of computer nodes, networks, storage, and software, built into a cluster, that work ...

Page 48 - 48 Developing Applications

© Copyright 2003, 2005, 2006, 2007, 2008, 2009 Hewlett-Packard Development Company, L.P. Confidential computer software. Valid license from HP required ...

Page 49 - 5 Submitting Jobs

$ head /proc/cpuinfo
Table 1-1 presents the representative output for each of the platforms. This output may differ according to changes in models and s...

Page 50 - 50 Submitting Jobs

distributes login requests from users. A node with the login role is referred to as a login node in this manual. compute role: The compute role is assign...

Page 51 - 5.3 Submitting a Parallel Job

the HP XC. So, for example, if the HP XC system interconnect is based on a Quadrics® QsNetII® switch, then the SFS will serve files over ports on that ...

Page 52 - 52 Submitting Jobs

Additional information on supported system interconnects is provided in the HP XC Hardware Preparation Guide. 1.1.8 Network Address Translation (NAT): The ...

Page 53

Modulefiles can be loaded into your environment automatically when you log in to the system, or any time you need to alter the environment. The HP ...

Page 54

1.4.2 Serial Applications: You can build and run serial applications under the HP XC development environment. A serial application is a command or applic...

Page 55

1.5.4 How LSF and SLURM Interact: In the HP XC environment, LSF cooperates with SLURM to combine the powerful scheduling functionality of LSF with the sc...

Page 56 - 56 Submitting Jobs

1.6 Components, Tools, Compilers, Libraries, and Debuggers: This section provides a brief overview of some of the common tools, compilers, libraries ...

Page 58 - 58 Submitting Jobs

2 Using the System: This chapter describes the tasks and commands that the general user must know to use the system. It addresses the following topics: • ...

Page 59

Table of Contents: About This Document ... 11; Intended ...

Page 60 - 60 Submitting Jobs

overview about some basic ways of running and managing jobs. Full information and details about the HP XC job launch environment are provided in "Using ...

Page 61 - -R "type=SLINUX64"

For more information about using this command and a sample of its output, see "Examining System Core Status" (page 95). • The LSF lshosts command display...

Page 62

2.3.1 Determining the LSF Cluster Name and the LSF Execution Host: The lsid command returns the LSF cluster name, the LSF version, and the name of the L...

Page 63 - 6 Debugging Applications

3 Configuring Your Environment with Modulefiles: The HP XC system supports the use of Modules software to make it easier to configure and modify your ...

Page 64 - 6.2.1.2 Setting Up TotalView

access the mpi** scripts and libraries. You can specify the compiler it uses through a variety of mechanisms long after the modulefile is loaded. The pr...

Page 65

Table 3-1 Supplied Modulefiles (continued); columns: Modulefile, Sets the HP XC User Environment to Use. icc/8.1/default: Intel C/C++ Version 8.1 compilers. Intel C/C+...

Page 66 - 66 Debugging Applications

Each module supplies its own online help. See "Viewing Modulefile-Specific Help" for information on how to view it. 3.3 Modulefiles Automatically Loaded ...
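For instance (the modulefile name below is illustrative, not taken from the preview), a module's own help text can be displayed with the standard Modules command:
$ module help totalview    # prints the help text supplied by the totalview modulefile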

Page 67 - 6.2.1.8 Exiting TotalView

For example, if you wanted to automatically load the TotalView modulefile when you log in, edit your shell startup script to include the following inst...
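The instruction itself is cut off above; a minimal sketch, assuming a bash user whose startup script is ~/.bashrc and a modulefile named totalview:
# added to ~/.bashrc so the modulefile loads at every login
module load totalview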

Page 68

In this example, a user attempted to load the ifort/8.0 modulefile. After the user issued the command to load the modulefile, an error message occurred ...

Page 69 - 7 Monitoring Node Activity

4 Developing Applications: This chapter discusses topics associated with developing applications in the HP XC environment. Before reading this chapter, y...

Page 70 - 70 Monitoring Node Activity

2.3.1 Determining the LSF Cluster Name and the LSF Execution Host ... 32; 2.4 Getting System Help and Information ...

Page 71

HP UPC is a parallel extension of the C programming language, which runs on both common types of multiprocessor systems: those with a common global add...

Page 72 - 72 Monitoring Node Activity

4.3 Examining Nodes and Partitions Before Running Jobs: Before launching an application, you can determine the availability and status of the system's ...

Page 73

4.6.1 Serial Application Build Environment: You can build and run serial applications in the HP XC programming environment. A serial application is a com...

Page 74

4.7.1.1 Modulefiles: The basics of your working environment are set up automatically by your system administrator during the installation of HP XC. Howev...

Page 75 - 8 Tuning Applications

To compile programs that use SHMEM, it is necessary to include the shmem.h file and to use the SHMEM and Elan libraries. For example:
$ gcc -o shping sh...
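The compile line is truncated in the preview. A plausible completed form, in which the source file name shping.c and the exact linker flags are assumptions rather than text from the manual:
$ gcc -o shping shping.c -lshmem -lelan    # include shmem.h in the source; link the SHMEM and Elan libraries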

Page 76 - 76 Tuning Applications

4.7.1.12 MKL Library: MKL is a math library that references pthreads, and in enabled environments, can use multiple threads. MKL can be linked in a singl...

Page 77 - 8.2.1 Installation Kit

To compile and link a C application using the mpicc command:
$ mpicc -o mycode hello.c
To compile and link a Fortran application using the mpif90 comma...
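The Fortran command is truncated; by analogy with the C example (the source file name is an assumption), it would look something like:
$ mpif90 -o mycode hello.f90    # mpif90 wraps whichever Fortran compiler the loaded MPI modulefile selects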

Page 78 - Running a Program

names. However, HP recommends an alternative method. The dynamic linker, during its attempt to load libraries, will suffix candidate directories with t...

Page 79

NOTE: There is no shortcut as there is for the dynamic loader.

Page 80

5 Submitting Jobs: This chapter describes how to submit jobs on the HP XC system; it addresses the following topics: • "Overview of Job Submission" (page ...

Page 81 - 9 Using SLURM

5.2 Submitting a Serial Job Using LSF ... 49; 5.2.1 Submit...

Page 82 - 9.3.1.2 The srun Modes

The srun command is only necessary to launch the job on the allocated node if the HP XC JOB STARTER script is not configured to run a job on the comput...

Page 83

#include <unistd.h>
#include <stdio.h>
int main()
{
    char name[100];
    gethostname(name, sizeof(name));
    printf("%s says Hello!\n", name);
    return 0;
}

Page 84 - 9.9 Security

The bsub command submits the job to LSF. The -n num-procs parameter, which is required for parallel jobs, specifies the number of cores requested for th...

Page 85 - 10 Using LSF

variable that was set by LSF; this environment variable is equivalent to the number provided by the -n option of the bsub command. Any additional SLURM ...

Page 86 - $ squeue --jobs $SLURM_JOBID

options that specify the minimum number of nodes required for the job, specific nodes for the job, and so on. Note: The SLURM external scheduler is a plu...

Page 87

Example 5-9 Using the External Scheduler to Submit a Job to Run on Specific Nodes
$ bsub -n4 -ext "SLURM[nodelist=n6,n8]" -I srun hostname
Job ...

Page 88

Example 5-13 Using the External Scheduler to Constrain Launching to Nodes with a Given Feature
$ bsub -n 10 -ext "SLURM[constraint=dualcore]" ...

Page 89 - 10.4 Job Terminology

Example 5-15 Submitting a Batch Script with the LSF-SLURM External Scheduler Option
$ bsub -n4 -ext "SLURM[nodes=4]" -I ./myscript.sh
Job <...

Page 90 - 90 Using LSF

Example 5-18 Environment Variables Available in a Batch Job Script
$ cat ./envscript.sh
#!/bin/sh
name=`hostname`
echo "hostname = $name"
echo ...

Page 91 - 10.6 Submitting Jobs

The ping_pong_ring application is submitted twice in a Makefile named mymake; the first time as run1 and the second as run2:
$ cat mymake
PPR_ARGS=1000 ...

Page 92 - 92 Using LSF

9.7 Job Accounting ... 84; 9...

Page 93 - SLURM_NPROCS=4

1: This line attempts to submit a program that does not exist. The following command line makes the program and executes it:
$ bsub -o %J.out -n2 -ext "...

Page 94

5.6 Submitting a Job from a Host Other Than an HP XC Host: To submit a job from a host other than an HP XC host to the HP XC system, use the LSF -R optio...

Page 96 - 96 Using LSF

6 Debugging Applications: This chapter describes how to debug serial and parallel applications in the HP XC development environment. In general, effectiv...

Page 97

6.2.1 Debugging with TotalView: TotalView is a full-featured, GUI-based debugger specifically designed to fill the requirements of parallel applic...

Page 98 - 98 Using LSF

6.2.1.3 Using TotalView with SLURM: Use the following commands to allocate the nodes you need before you debug an application with SLURM, as shown here:
$ ...
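The allocation command itself is cut off. As a rough sketch only, assuming the allocate mode (-A) of the SLURM release that shipped with HP XC and an arbitrary two-node allocation:
$ srun -N2 -A    # allocate two nodes and start a subshell bound to that allocation, then launch TotalView from it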

Page 99

6.2.1.6 Debugging an Application: This section describes how to use TotalView to debug an application. 1. Compile the application to be debugged. For exa...
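The compile example is cut off; the essential point is to build with debugging symbols so TotalView can map back to the source. A minimal sketch with placeholder names:
$ mpicc -g -o myprog myprog.c    # -g preserves the symbol information the debugger needs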

Page 100

6.2.1.7 Debugging Running Applications: As an alternative to the method described in "Debugging an Application", it is also possible to "attach" ...

Page 102 - 102 Using LSF

7 Monitoring Node Activity: This chapter describes the optional utilities that provide performance information about the set of nodes associated with you...

Page 103 - $ srun --jobid=250 uptime

A.4 Launching a Parallel Interactive Shell Through LSF ... 117; A.5 Submitting a Simple J...

Page 104

7.2 Running Performance Health Tests: You can run the ovp command to generate reports on the performance health of the nodes. Use the following format to ...
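The command format is truncated in the preview. Purely as an illustration (the option syntax below is an assumption, not confirmed by this excerpt), an invocation might look like:
$ ovp --verify=perf_health/cpu_usage    # run one performance-health test and write its report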

Page 105

NOTE: Except for the network_stress and network_bidirectional tests, these tests only apply to systems that install LSF incorporated with SLURM. The net...

Page 106

Verify perf_health:
Testing cpu_usage ...
The headnode is excluded from the cpu usage test.
Number of nodes allocated for this test i...

Page 107 - 11 Advanced Topics

The tests execution directory has been saved in: HOME_DIRECTORY/ovp_n16_mmddyy.tests
Details of this verification have been recorded in: HOME_DIRECTORY/o...

Page 109

8 Tuning Applications: This chapter discusses how to tune applications in the HP XC environment. 8.1 Using the Intel Trace Collector and Intel Trace Anal...

Page 110 - 110 Advanced Topics

Example 8-1 The vtjacobic Example Program: For the purposes of this example, the examples directory under /opt/IntelTrace/ITC is copied to the user's ...

Page 111 - 11.3.2 Example Procedure 2

8.2 The Intel Trace Collector and Analyzer with HP-MPI on HP XC. NOTE: The Intel Trace Collector (ITC) was formerly known as VampirTrace. The Intel Trac...

Page 112 - 11.3.3 Example Procedure 3

Running a Program: Ensure that the -static-libcxa flag is used when you use mpirun.mpich to launch a C or Fortran program. The following is a C example ca...

Page 113 - 11.5.2 Private File View

[0] Intel Trace Collector INFO: Writing tracefile vtjacobif.stf in /nis.home/user_name/xc_PDE_work/ITC_examples_xc6000
mpirun exits with status: 0
Runni...

Page 114 - % bsub -I options... wrapper

List of Figures: 4-1 Library Directory Structure ...

Page 116 - View the job:

9 Using SLURM: HP XC uses the Simple Linux Utility for Resource Management (SLURM) for system resource management and job scheduling. This chapter address...

Page 117

The srun command handles both serial and parallel jobs. The srun command has a significant number of options to control the execution of your applicatio...
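As a small illustration of that control (the task and node counts are arbitrary; -n and -N are standard srun options for task and node counts):
$ srun -n4 -N2 hostname    # launch four tasks spread across two nodes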

Page 118 - 118 Examples

Example 9-2 Displaying Queued Jobs by Their JobIDs
$ squeue --jobs 12345,12346
JOBID PARTITION NAME USER ST TIME_USED NODES NODELIST(REASON) ...

Page 119 - Display the script:

Example 9-8 Reporting Reasons for Downed, Drained, and Draining Nodes
$ sinfo -R
REASON         NODELIST
Memory errors  ...

Page 120 - Show the job allocation:

10 Using LSF: The Load Sharing Facility (LSF) from Platform Computing is a batch system resource manager used on the HP XC system. On an HP XC system, a j...

Page 121 - View the finished jobs:

The LSF environment is set up automatically for the user on login; LSF commands and their manpages are readily accessible: • The bhosts command is usefu...

Page 122 - View the running job:

"Translating SLURM and LSF JOBIDs" describes the relationship between the SLURM_JOBID and the LSF JOBID. SLURM_NPROCS: This environment variable passes al...

Page 123 - View the finished job:

Example 10-2 Examples of Launching LSF Jobs Without the srun Command: The following bsub command line invokes the bash shell to run the hostname command ...

Page 124

• LSF integrated with SLURM only runs daemons on one node within the HP XC system. This node hosts an HP XC LSF Alias, which is an IP address and corre...

Page 125 - Glossary

List of Tables: 1-1 Determining the Node Platform ... 20; 1-...

Page 126 - 126 Glossary

sometime in the future, depending on resource availability and batch system scheduling policies. Batch job submissions typically provide instructions on ...

Page 127

allocates the appropriate whole node for exclusive use by the serial job in the same manner as it does for parallel jobs, hence the name "pseudo-paralle...

Page 128 - 128 Glossary

request more than one core for a job. This option, coupled with the external SLURM scheduler, discussed in "LSF-SLURM External Scheduler", gives you mu...

Page 129

Figure 10-1 How LSF and SLURM Launch and Manage a Job (figure: a user on the login node runs $ bsub -n4 -ext "SLURM[nodes=4]" ..., LSF dispatches the job through the job_starter.sh script, which runs $ srun -n1 myscript on the allocated nodes, for example n16)

Page 130 - 130 Glossary

4. LSF prepares the user environment for the job on the LSF execution host node and dispatches the job with the job_starter.sh script. This user enviro...

Page 131

10.10.1 Examining System Core Status: The bhosts command displays LSF resource usage information. This command is useful to examine the status of the sys...

Page 132 - 132 Index

10.10.3 Getting Host Load Information: The LSF lsload command displays load information for LSF execution hosts.
$ lsload
HOST_NAME status r15s r1m ...

Page 133

on this topic. See the LSF manpages for full information about the commands described in this section. The following LSF commands are described in this ...

Page 134 - 134 Index

date and time stamp: Started on 4 Hosts/Processors <4*lsfhost.localdomain>;
date and time stamp: slurm_id=22;ncpus=8;slurm_alloc=n[5-8];
Example 1...

Page 135

Example 10-6 Using the bjobs Command (Long Output)
$ bjobs -l 24
Job <24>, User <msmith>, Project <default>, Status <RUN>, ...
