HP XC System 3.x Software User's Guide, Page 1

Browse online or download the user's guide for the HP XC System 3.x Software (HP XC System Software User's Guide, 118 pages).


Contents

Page 1 - User's Guide

HP XC System Software User's Guide, Version 3.0, part number 5991-4847, published January 2006

Page 3 - Table of Contents

Examine the local host information:
$ hostname
n2
Examine the job information:
$ bjobs
No unfinished job found
Run the LSF bsub -Is command to launch the in...

Page 4 - 5 Submitting Jobs

SCHEDULING PARAMETERS:
           r15s   r1m  r15m   ut   pg   io   ls   it  tmp  swp  mem
loadSched    -     -     -    -    -    -    -    -    -    -    -
loadStop     -     -     -    -  ...

Page 5 - 9 Using LSF

Examine the partition information:
$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE  NODELIST
lsf       up     infinite       6  idle   n[5-10]
Examine the loca...

Page 6 - 10 Advanced Topics

Examine the finished job's information:
$ bhist -l 124
Job <124>, User <lsfadmin>, Project <default>, Interactive pseudo-term...

Page 7 - List of Figures

n16
n16
Linux n14 2.4.21-15.3hp.XCsmp #2 SMP date and time stamp ia64 ia64 ia64 GNU/Linux
Linux n14 2.4.21-15.3hp.XCs...

Page 8

n15
n15
n16
n16
$ srun -n3 hostname
n13
n14
n15
Exit the pseudo-terminal:
$ exit
exit
View the interactive jobs:
$ bjobs -l 1008
Job <1008>, User smith, Proj...

Page 9 - List of Tables

Copyright 1992-2004 Platform Computing Corporation
My cluster name is penguin
My master name is lsfhost.localdomain
$ sinfo
PARTITION AVAIL TIMELIMIT N...

Page 10

6 Processors Requested; date and time stamp: Dispatched to 6 Hosts/Processors <6*lsfhost.localdomain>; d...

Page 12

Glossary, A: administration branch. The half (branch) of the administration network that contains all of the general-purpose administration ports to the nodes...

Page 13 - About This Document

List of Examples: 4-1 Directory Structure...

Page 14 - HP XC Information

FCFS: First-come, first-served. An LSF job-scheduling policy that specifies that jobs are dispatched according to their order in a queue, which is deter...

Page 15 - Supplementary Information

L: Linux Virtual Server, see LVS. load file: A file containing the names of multiple executables that are to be launched simultaneously by a single command. Lo...

Page 16 - Related Information

Network Information Services, see NIS. NIS: Network Information Services. A mechanism that enables centralization of common data that is pertinent across mul...

Page 17 - Additional Publications

SMP: Symmetric multiprocessing. A system with two or more CPUs that share equal (symmetric) access to all of the facilities of a computer system, such a...

Page 19 - System Architecture

Index, A: ACML library, 42; application development, 37; building parallel applications, 42; building serial applications, 39; communication between nodes, 97; comp...

Page 20 - Storage and I/O

configuring local disk, 96; core availability, 38. D: DDT, 53; debugger, TotalView, 53; debugging, DDT, 53; gdb, 53; idb, 53; pgdbg, 53; TotalView, 53; debugging options, setti...

Page 21 - File System

submitting jobs, 77; summary of bsub command, 77; using srun with, 64; viewing historical information of jobs, 82; LSF-SLURM external scheduler, 45; lshosts com...

Page 22 - System Interconnect Network

examples of, 99; programming model, 39; shared file view, 97; signal, sending to a job, 65; Simple Linux Utility for Resource Management (see SLURM); sinfo comman...

Page 24 - Run-Time Environment

About This Document: This document provides information about using the features and functions of the HP XC System Software. It describes how the HP XC u...

Page 25 - Standard LSF

• Chapter 10: Advanced Topics (page 91) provides information on remote execution, running an X terminal session from a remote node, and I/O performance...

Page 26

Documentation for the HP Integrity and HP ProLiant servers is available at the following URL: http://www.docs.hp.com/ For More Information: The HP Web sit...

Page 27 - 2 Using the System

• http://supermon.sourceforge.net/ Home page for Supermon, a high-speed cluster monitoring system that emphasizes low perturbation, high sampling rates,...

Page 28 - Introduction

Related Linux Web Sites: • http://www.redhat.com Home page for Red Hat®, distributors of Red Hat Enterprise Linux Advanced Server, a Linux distribution wi...

Page 29

• Perl Cookbook, Second Edition, by Tom Christiansen and Nathan Torkington • Perl in a Nutshell: A Desktop Quick Reference, by Ellen Siever, et al. Typogr...

Page 30 - 30 Using the System

1 Overview of the User Environment: The HP XC system is a collection of computer nodes, networks, storage, and software, built into a cluster, that work...

Page 31 - Overview of Modules

© Copyright 2003, 2005, 2006 Hewlett-Packard Development Company, L.P. Confidential computer software. Valid license from HP required for possession, u...

Page 32 - Supplied Modulefiles

Table 1-1 Determining the Node Platform (partial output of /proc/cpuinfo, by platform):
processor : 0  vendor_id : GenuineIntel  cpu family : 15...
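
As an illustrative check, not taken from the manual, the same vendor and family fields can be pulled directly from /proc/cpuinfo on any node; the output shown is only representative of an Intel-based node:
$ grep -E 'vendor_id|cpu family' /proc/cpuinfo
vendor_id       : GenuineIntel
cpu family      : 15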

Page 33 - Loading a Modulefile

SAN Storage: The HP XC system uses the HP StorageWorks Scalable File Share (HP StorageWorks SFS), which is based on Lustre technology and uses the Lustre...

Page 34 - Modulefile Conflicts

Be aware of the following information about the HP XC file system layout: • Open source software that by default would be installed under the /usr/loca...

Page 35 - Creating a Modulefile

free -m
Disk Partitions: Use the following command to display the disk partitions and their sizes:
cat /proc/partitions
Swap: Use the following command to d...

Page 36

Documentation CD contains XC LSF manuals from Platform Computing. LSF manpages are available on the HP XC system. SLURM commands: HP XC uses the Simple L...

Page 37 - 4 Developing Applications

...by default for LSF-HPC batch jobs. The system administrator has the option of creating additional partitions. For example, another partition could be c...

Page 38 - Interrupting a Job

SLURM: Allocates nodes for jobs as determined by LSF-HPC. It controls task/rank distribution within the allocated nodes. SLURM also starts the executabl...

Page 39 - Setting Debugging Options

2 Using the System: This chapter describes the tasks and commands that the general user must know to use the system. It addresses the following topics: •...

Page 40 - Modulefiles

Introduction: As described in Run-Time Environment (page 24), SLURM and LSF-HPC cooperate to run and manage jobs on the HP XC system, combining LSF-HPC...

Page 41 - Standard

$ lsload
For more information about using this command and a sample of its output, see Getting Host Load Information (page 76). Getting Information About...

Page 42 - 42 Developing Applications

Table of Contents: About This Document...13; Intended Audience...

Page 43 - Developing Libraries

Getting System Help and Information: In addition to the hardcopy documentation described in the preface of this document (About This Document), the HP XC...

Page 44 - 44 Developing Applications

3 Configuring Your Environment with Modulefiles: The HP XC system supports the use of Modules software to make it easier to configure and modify the you...

Page 45

...could cause inconsistencies in the use of shared objects. If you have multiple compilers (perhaps with incompatible shared objects) installed, it is pr...

Page 46 - 46 Submitting Jobs

Table 3-1 Supplied Modulefiles
Modulefile   Sets the HP XC User Environment to Use:
icc/8.0      Intel C/C++ Version 8.0 compilers.
icc/8.1      Intel C/C++ Version 8.1 compil...

Page 47

...you are attempting to load conflicts with a currently loaded modulefile, the modulefile will not be loaded and an error message will be displayed. If yo...

Page 48 - 48 Submitting Jobs

When a modulefile conflict occurs, unload the conflicting modulefile before loading the new modulefile. In the previous example, you should unload the...
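
As a hedged sketch of that workflow, using the icc modulefile names listed in Table 3-1 (your system's modulefile names may differ), the conflicting modulefile is unloaded before the new one is loaded, and module list confirms the result:
$ module unload icc/8.0
$ module load icc/8.1
$ module list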

Page 50 - 50 Submitting Jobs

4 Developing Applications: This chapter discusses topics associated with developing applications in the HP XC environment. Before reading this chapter, y...

Page 51 - Running Preexecution Programs

Table 4-1, “Compiler Commands” displays the compiler commands for Standard Linux, Intel, and PGI compilers for the C, C++, and Fortran languages. Table...

Page 52 - 52 Submitting Jobs

The Ctrl/C key sequence will report the state of all tasks associated with the srun command. If the Ctrl/C key sequence is entered twice within one sec...

Page 53

3 Configuring Your Environment with Modulefiles: Overview of Modules...

Page 54 - Using TotalView with SLURM

Developing Parallel Applications: This section describes how to build and run parallel applications. The following topics are discussed: • Parallel Appli...

Page 55 - Debugging an Application

Pthreads: POSIX Threads (Pthreads) is a standard library that programmers can use to develop portable threaded applications. Pthreads can be used in conj...

Page 56 - TotalView process window

http://www.pathscale.com/ekopath.html. GNU Parallel Make: The GNU parallel Make command is used whenever the make command is invoked. GNU parallel Make pr...

Page 57 - Exiting TotalView

Examples of Compiling and Linking HP-MPI Applications: The following examples show how to compile and link your application code by invoking a compiler...
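
A minimal sketch of such a compile-and-link step, assuming the HP-MPI compiler wrapper mpicc is on your path (for example, after loading the appropriate MPI modulefile) and a hypothetical source file named hello.c:
$ mpicc -o hello hello.c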

Page 58

...recommends an alternative method. The dynamic linker, during its attempt to load libraries, will suffix candidate directories with the machine type. Th...

Page 59

5 Submitting Jobs: This chapter describes how to submit jobs on the HP XC system; it addresses the following topics: • Overview of Job Submission (page 4...

Page 60 - 60 Tuning Applications

Submitting a Serial Job Using Standard LSF. Example 5-1 Submitting a Serial Job Using Standard LSF: Use the bsub command to submit a serial job to standar...
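
For instance, a serial executable can be submitted with bsub and its output written to a file; this is an illustrative sketch (the job name myjob and output file are hypothetical, and the job ID in the response will differ):
$ bsub -o myjob.out ./myjob
Job <nn> is submitted to default queue <normal>.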

Page 61

Example 5-3 Submitting an Interactive Serial Job Using LSF-HPC only
$ bsub -I hostname
Job <73> is submitted to default queue <normal>. ...

Page 62

The output for this command could also have been 1 core on each of 4 compute nodes in the SLURM allocation. Submitting a Non-MPI Parallel Job: Use the fol...
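
On HP XC the usual pattern is to request cores through bsub and launch the parallel command inside the allocation with srun; a hedged sketch (the core count and the hostname command are illustrative):
$ bsub -n4 -I srun hostname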

Page 63

...to the number provided by the -n option of the bsub command. Any additional SLURM srun options are job-specific, not allocation-specific. The mpi-jobnam...

Page 64 - The srun Roles and Modes

Running Preexecution Programs...51; 6 Debugg...

Page 65 - Job Accounting

In Example 5-9, a simple script named myscript.sh, which contains two srun commands, is displayed and then submitted. Example 5-9 Submitting a Job Script: $ c...
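
A minimal sketch of such a script and its submission (the two srun steps, hostname and uptime, are placeholders for real job steps; the bsub line matches the form shown for Example 5-12 on page 66):
$ cat myscript.sh
#!/bin/sh
srun hostname
srun uptime
$ bsub -n4 -I ./myscript.sh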

Page 66 - Security

Example 5-12 Submitting a Batch Job Script That Uses the srun --overcommit Option
$ bsub -n4 -I ./myscript.sh
Job <81> is submitted to default que...

Page 67

...program should pick up the SLURM_JOBID environment variable. The SLURM_JOBID has the information LSF-HPC needs to run the job on the nodes required by...

Page 68 - Overview of LSF-HPC

6 Debugging Applications: This chapter describes how to debug serial and parallel applications in the HP XC development environment. In general, effectiv...

Page 69 - Using LSF-HPC 69

This section provides only minimum instructions to get you started using TotalView. Instructions for installing TotalView are included in the HP XC Syst...

Page 70 - Job Terminology

Using TotalView with LSF-HPC: HP recommends the use of xterm when debugging an application with LSF-HPC. You also need to allocate the nodes you will nee...

Page 71 - Using LSF-HPC 71

4. The TotalView process window opens. This window contains multiple panes that provide various debugging functions and debugging information. The name of...

Page 72 - Notes on LSF-HPC

Exiting TotalView: It is important that you make sure your job has completed before exiting TotalView. This may require that you wait a few seconds from...

Page 74 - Job Startup and Job Control

7 Tuning Applications: This chapter discusses how to tune applications in the HP XC environment. Using the Intel Trace Collector and Intel Trace Analyzer...

Page 75 - Preemption

Getting Information About the lsf Partition...76; Submitting Jobs...

Page 76 - Getting Host Load Information

Example 7-1 The vtjacobic Example Program: For the purposes of this example, the examples directory under /opt/IntelTrace/ITC is copied to the user's...

Page 77 - Submitting Jobs

<install-path-name>/ITA/doc/Intel_Trace_Analyzer_Users_Guide.pdf
Using the Intel Trace Collector and Intel Trace Analyzer 61

Page 79 - Using LSF-HPC 79

8 Using SLURM: HP XC uses the Simple Linux Utility for Resource Management (SLURM) for system resource management and job scheduling. This chapter address...

Page 80 - 80 Using LSF

The srun command has a significant number of options to control the execution of your application closely. However, you can use it for a simple launch...
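
A simple launch needs only a task count and the command to run; for example (the node names in the output are illustrative):
$ srun -n4 hostname
n1
n1
n2
n2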

Page 81 - Examining the Status of a Job

The squeue command can report on jobs in the job queue according to their state; possible states are: pending, running, completing, completed, failed,...
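
A hedged example of filtering by state (the --states option is documented in the squeue(1) manpage; exact option spelling can vary with the SLURM version installed):
$ squeue --states=PENDING,RUNNING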

Page 82 - 82 Using LSF

# chmod a+r /hptc_cluster/slurm/job/jobacct.log
You can find detailed information on the sacct command and job accounting data in the sacct(1) manpage.

Page 83 - Using LSF-HPC 83

9 Using LSF: The Load Sharing Facility (LSF) from Platform Computing Corporation is a batch system resource manager used on the HP XC system. LSF is an i...

Page 84 - 84 Using LSF

...job management and information capabilities. LSF-HPC schedules, launches, controls, and tracks jobs that are submitted to it according to the policies...

Page 85 - Using LSF-HPC 85

Differences Between LSF-HPC and Standard LSF: LSF-HPC for the HP XC environment supports all the standard features and functions that standard LSF suppo...

Page 86 - 86 Using LSF

List of Figures: 9-1 How LSF-HPC and SLURM Launch and Manage a Job...73

Page 87 - Using LSF-HPC 87

• All HP XC nodes are dynamically configured as “LSF Floating Client Hosts” so that you can execute LSF commands from any HP XC node. When you do execu...

Page 88 - 88 Using LSF

Serial jobs are allocated a single CPU on a shared node with minimal capacities that satisfies other allocation criteria. LSF-HPC always tries to run mu...

Page 89 - Using LSF-HPC 89

• exclude=list-of-nodes • contiguous=yes
The srun(1) manpage provides details on these options and their arguments. The following are interactive exampl...

Page 90

• Use the bjobs command to monitor job status in LSF-HPC. • Use the bqueues command to list the configured job queues in LSF-HPC. How LSF-HPC and SLURM...

Page 91

This bsub command launches a request for four cores (from the -n4 option of the bsub command) across four nodes (from the -ext "SLURM[nodes=4]"...
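
Put together, the command behind that description would look something like the following sketch (the launched command, srun hostname, is illustrative):
$ bsub -n4 -ext "SLURM[nodes=4]" -I srun hostname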

Page 92 - 92 Advanced Topics

Preemption: LSF-HPC uses the SLURM "node share" feature to facilitate preemption. When a low-priority job is preempted, job processes are suspe...

Page 93

The following example shows the output from the lshosts command:
$ lshosts
HOST_NAME  type  model  cpuf  ncpus  maxmem  maxswp  server  RESOURCES
lsfhost...

Page 94 - 94 Advanced Topics

$ sinfo -p lsf
PARTITION AVAIL  TIMELIMIT  NODES  STATE  NODELIST
lsf       up     infinite     128  idle   n[1-128]
Use the following command to obtain more i...
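
The exact follow-up command is cut off above; a plausible variant that reports more per-node detail is sinfo's node-oriented long listing (the -N, -e, and -l options are standard sinfo flags, but this specific combination is an assumption, not the manual's text):
$ sinfo -p lsf -Nel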

Page 95 - Example Procedure 2

LSF-HPC node allocation (compute nodes). LSF-HPC node allocation is created by the -n num-procs parameter, which specifies the number of cores the job requ...

Page 96 - Local Disks on Compute Nodes

Refer to the LSF bsub command manpage for additional information about using the external scheduler (-ext) option. See the srun manpage for more detail...

Page 98

Getting Information About Jobs: There are several ways you can get information about a specific job after it has been submitted to LSF-HPC. This section...

Page 99 - Appendix A Examples

Job Allocation Information for a Finished Job: The following is an example of the output obtained using the bhist -l command to obtain job allocation inf...

Page 100 - View the job:

Example 9-5 Using the bjobs Command (Long Output)
$ bjobs -l 24
Job <24>, User <msmith>, Project <default>, Status <RUN>,...

Page 101

Example 9-7 Using the bhist Command (Long Output)
$ bhist -l 24
Job <24>, User <lsfadmin>, Project <default>, Interactive pseudo-termi...

Page 102 - 102 Examples

$ sacct -j 123
Jobstep     Jobname             Partition   Ncpus    Status      Error
----------  ------------------  ----------  -------  ----------  -----
123...

Page 103 - Run the job:

Be sure to unset SLURM_JOBID when you are finished with the allocation, to prevent a previous SLURM_JOBID from interfering with future jobs: $ unset...

Page 104 - Show the job allocation:

...confirm an expected high load on the nodes. The following is an example of this; the LSF JOBID is 200 and the SLURM JOBID is 250:
$ srun --jobid=250 upt...

Page 105 - View the node state:

Table 9-2 LSF-HPC Equivalents of SLURM srun Options
srun Option          Description                           LSF-HPC Equivalent
-n, --ntasks=nt...    Number of processes (tasks) to run.   bsub -n num
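
As a hedged illustration of that first row, the same six-task run can be expressed either directly through SLURM or as an LSF-HPC core request (the executable ./a.out is a placeholder):
$ srun -n6 ./a.out
$ bsub -n6 -I srun ./a.out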

Page 106 - View the finished job:

(Table 9-2, continued: srun Option / Description / LSF-HPC Equivalent) You cannot use this option; LSF-HPC uses it to create the allocation. Root attempts to submit or run a job as normal...

Page 107

(Table 9-2, continued: srun Option / Description / LSF-HPC Equivalent) Use as an argument to srun when launching parallel tasks. How long to wait after the first task terminates before...

Page 108

List of Tables: 1-1 Determining the Node Platform...

Page 110 - 110 Glossary

10 Advanced Topics: This chapter covers topics intended for the advanced user. This chapter addresses the following topics: • Enabling Remote Execution w...

Page 111

$ hostname
mymachine
Then, use the host name of your local machine to retrieve its IP address:
$ host mymachine
mymachine has address 14.26.206.134
Step 2. ...

Page 112 - 112 Glossary

Determine the address of your monitor's display server, as shown at the beginning of "Running an X Terminal Session from a Remote Node"...

Page 113

Further, if the recursive make is run remotely, it can be told to use concurrency on the remote node. For example:
$ cd subdir; srun -n1 -N1 $(MAKE) -j4...

Page 114

@ \
for i in ${HYPRE_DIRS}; \
do \
  if [ -d $$i ]; \
  then \
    echo "Cleaning $$i ..."; \
...

Page 115

struct_matrix_vector/libHYPRE_mv.a:
        $(PREFIX) $(MAKE) -C struct_matrix_vector
struct_linear_solvers/libHYPRE_ls.a:
        $(PREFIX) $(MAKE) -...

Page 116 - 116 Index

Shared File View: Although a file opened by multiple processes of an application is shared, each core maintains a private file pointer and file position....

Page 118 - 118 Index

Appendix A Examples: This appendix provides examples that illustrate how to build and run applications on the HP XC system. The examples in this section...
