The Pablo Performance Capture Facility (PCF) is a set of programs that can be substituted for standard I/O, MPI-I/O or HDF calls so that performance data in C, C++ and FORTRAN codes can be captured and measured. These programs perform exactly the same functions as their standard counterparts, but they have been augmented by instrumentation software, attached before and after the call itself, to capture internal data. Programmers compile their instrumented source code and link it with one of the PCF libraries. Depending on which PCF library is used for linking, the performance data captured during program execution is either recorded in SDDF trace files or transferred via Autopilot for remote monitoring of the activity.
If SDDF trace files are produced, the user has the option of producing either a runtime trace or a summary trace of the monitored activity. If a runtime trace is performed, a record is written each time a monitored event occurs. This allows the user to analyze the I/O behavior of the application over time. If the summary trace option is selected, the user gets only one record per processor containing a summary of the monitored activity on that processor.
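The difference between the two trace options can be sketched in a few lines of C. This is a toy model only, not PCF code: every name beginning with sketch_ or SKETCH_ is hypothetical, and the record layout (an event count plus total seconds) is an assumption made for illustration. It shows a runtime trace producing one record per event and a summary trace producing a single record at termination.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical trace modes. The real PCF constants are RUNTIME_TRACE and
 * SUMMARY_TRACE; their values are not assumed here. */
enum sketch_mode { SKETCH_RUNTIME, SKETCH_SUMMARY };

/* One output record: how many events it covers and the time they took. */
struct sketch_record {
    long   events;
    double seconds;
};

#define SKETCH_MAX_RECORDS 64

struct sketch_tracer {
    enum sketch_mode     mode;
    struct sketch_record summary;                  /* running totals */
    struct sketch_record out[SKETCH_MAX_RECORDS];  /* stands in for the trace file */
    size_t               n_out;                    /* records written so far */
};

void sketch_init(struct sketch_tracer *t, enum sketch_mode mode) {
    assert(t != NULL);
    t->mode = mode;
    t->summary.events = 0;
    t->summary.seconds = 0.0;
    t->n_out = 0;
}

/* Called once for each monitored event. */
void sketch_event(struct sketch_tracer *t, double seconds) {
    assert(t != NULL);
    t->summary.events += 1;
    t->summary.seconds += seconds;
    /* Runtime trace: write a record immediately, one per event. */
    if (t->mode == SKETCH_RUNTIME && t->n_out < SKETCH_MAX_RECORDS) {
        t->out[t->n_out].events = 1;
        t->out[t->n_out].seconds = seconds;
        t->n_out++;
    }
}

/* Called once at termination. */
void sketch_end(struct sketch_tracer *t) {
    assert(t != NULL);
    /* Summary trace: write the accumulated totals as a single record. */
    if (t->mode == SKETCH_SUMMARY && t->n_out < SKETCH_MAX_RECORDS) {
        t->out[t->n_out++] = t->summary;
    }
}
```

Feeding the same three events to a tracer in each mode leaves three records in the runtime tracer and one record, covering all three events, in the summary tracer.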
If Autopilot monitoring is selected, the user passes the address of the remote site that will receive the output. The user can then write an Autopilot client to analyze the data.
Below are instructions for instrumenting a code and for compiling, linking, and executing the instrumented source code.
In profiling the performance of an application's I/O behavior, the PCF can be used to capture performance metrics for the Unix I/O, MPI-I/O and HDF events that take place during program execution. An event, in this context, occurs each time a function in these categories is invoked. Below we describe how to instrument C, C++ and FORTRAN language programs.
Instrumenting a C Language Code:
The methods for performance monitoring of the Unix I/O, MPI-I/O or HDF differ only slightly. These differences are noted below. In each case, the user must add a statement to include a PCF header file PcCinterface.h at the beginning of the code. A call to an initialization routine must be placed in the main program before any calls to the routines to be monitored. If MPI parallelism is used, this call should follow the MPI_Init call. A call to a termination routine must occur after all of the monitored activity. If MPI is used, this call should precede the call to MPI_Finalize.
Unix I/O:
The Unix I/O initialization routine for C codes is unixIOCaptureInit and the termination routine is unixIOCaptureEnd. The syntax of these routines is defined below. In addition to adding the initialization and termination calls in the main program, the user must define a constant PcUIOcapture and add an include statement for a PCF header file PcCinterface.h to each of the source programs to be monitored. This header file contains macros which will cause the standard Unix I/O functions to be renamed to a function in the PCF library at compilation time. During execution, control is passed to the routine in the PCF library which records data and calls the corresponding standard I/O routine.
#include "PcCinterface.h"
void unixIOCaptureInit( const char* output, int procNum, int traceType );
where the parameters are
output
is the name of the output file if the PCF TraceFile library is used. Output will go to the file output.procNum.
If the PCF Autopilot library is used, it is the name of the remote host.
procNum
is the processor number. This is used to distinguish the trace files or sensors. Set the value to 0 if the code is run in scalar mode.
traceType
specifies the type of tracing to be performed: RUNTIME_TRACE or SUMMARY_TRACE. Note that SUMMARY_TRACE is not available with Autopilot.
#include "PcCinterface.h"
void unixIOCaptureEnd( void );
Select from the following links to see a sample C code before or after instrumentation:
Sample C Code Before Unix I/O Instrumentation
Sample C Code After Unix I/O Instrumentation
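The macro renaming described above can be illustrated with a self-contained sketch. It is not the contents of PcCinterface.h: the wrapper name sketch_traced_write, the two counters and the function app_emit are all hypothetical, and only write is redirected. The point is the mechanism: a macro replaces the standard name at compilation time, and the wrapper records data and then calls the standard routine.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Counters standing in for the data a trace library would record. */
long sketch_write_calls = 0;
long sketch_write_bytes = 0;

/* The wrapper is defined BEFORE the macro below, so the name "write"
 * inside it still refers to the standard routine. */
ssize_t sketch_traced_write(int fd, const void *buf, size_t count) {
    assert(buf != NULL);
    ssize_t n = write(fd, buf, count);   /* the standard call */
    sketch_write_calls += 1;             /* record that the event occurred */
    if (n > 0) {
        sketch_write_bytes += (long)n;   /* record how much was written */
    }
    return n;
}

/* From this point on, every write(...) in the source file is compiled
 * as a call to the wrapper instead. */
#define write(fd, buf, count) sketch_traced_write((fd), (buf), (count))

/* "Application" code: its text is unchanged, but the call now goes
 * through the wrapper. */
ssize_t app_emit(int fd) {
    return write(fd, "hello\n", 6);
}
```

Calling app_emit on an open descriptor writes the six bytes as usual and leaves sketch_write_calls at 1 and sketch_write_bytes at 6. Because the redirection happens in the preprocessor, each source file to be monitored has to see the macro, which is consistent with the requirement that the header be included in every monitored source program.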
MPI-I/O:
The MPI-I/O initialization routine for C codes is mpiIOCaptureInit and the termination routine is mpiIOCaptureEnd. These are described below. To trace the activity of the MPI-I/O routines, the Pablo PCF libraries provide procedures with names identical to the MPI-I/O routines. On entry to these procedures, data is captured and the corresponding profiling MPI-I/O routine is called. After return from the profiling routine, data is again captured. The user is only required to modify the main program by adding a call to the initialization routine after the call to the MPI_Init routine and a call to the termination routine just prior to calling the MPI_Finalize routine.
#include "PcCinterface.h"
void mpiIOCaptureInit( const char* output, int procNum, int traceType );
where the parameters are
output
is the name of the output file if the PCF TraceFile library is used. Output will go to the file output.procNum.
If the PCF Autopilot library is used, it is the name of the remote host.
procNum
is the processor number. This is used to distinguish the trace files or sensors. Set the value to 0 if the code is run in scalar mode.
traceType
specifies the type of tracing to be performed: RUNTIME_TRACE or SUMMARY_TRACE. Note that SUMMARY_TRACE is not available with Autopilot.
#include "PcCinterface.h"
void mpiIOCaptureEnd( void );
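The capture-on-entry, capture-after-return pattern used for the MPI-I/O routines can be sketched without MPI itself. In the toy example below, ptoy_file_write stands in for a name-shifted profiling entry point (MPI libraries provide such entry points with a PMPI_ prefix) and toy_file_write stands in for the procedure that carries the application-visible name. All identifiers are hypothetical, and the fields captured are an assumption chosen for illustration, not the data the PCF records.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Toy "profiling" routine that does the real work: it copies the bytes
 * into a fixed sink and returns the count, or -1 on a bad count. */
static char toy_sink[256];

int ptoy_file_write(const void *buf, int count) {
    if (count < 0 || (size_t)count > sizeof toy_sink) {
        return -1;
    }
    memcpy(toy_sink, buf, (size_t)count);
    return count;
}

/* Data captured for one monitored event. */
struct toy_event {
    clock_t t_entry;   /* captured on entry to the wrapper */
    clock_t t_exit;    /* captured after the profiling routine returns */
    int     count;     /* argument seen on entry */
    int     result;    /* return value seen on exit */
};

struct toy_event toy_last_event;
long toy_events_seen = 0;

/* Wrapper with the application-visible name. */
int toy_file_write(const void *buf, int count) {
    struct toy_event ev;
    assert(buf != NULL);
    ev.t_entry = clock();                     /* capture on entry */
    ev.count = count;
    ev.result = ptoy_file_write(buf, count);  /* call the profiling routine */
    ev.t_exit = clock();                      /* capture again after return */
    toy_last_event = ev;
    toy_events_seen += 1;
    return ev.result;
}
```

The application calls toy_file_write exactly as it would call the unwrapped routine, so no source changes are needed beyond initialization and termination. This matches the statement above that only the main program has to be modified.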
HDF:
The HDF initialization routine for C codes is hdfCaptureInit and the termination routine is hdfCaptureEnd. These are described below. To trace HDF, follow the instructions to build the special instrumented version of the HDF5 library, libhdf5-inst.a, in the pablo subdirectory of the source code supplied by HDF. Link with this library instead of the normal HDF5 library. In addition, the user is only required to modify the main program by adding calls to the initialization and termination routines.
Note: Initially, these special versions of the HDF5 libraries may only be available through the Pablo Website.
#include "ProcTrace.h"
void hdfCaptureInit( const char* output,
int procNum,
unsigned traceID1,[traceID2, …,]
int traceType );
where the parameters are
output
is the name of the output file if the PCF TraceFile library is used. Output will go to the file output.procNum.
If the PCF Autopilot library is used, it is the name of the remote host.
procNum
is the processor number. This is used to distinguish the trace files or sensors. Set the value to 0 if the code is run in scalar mode.
traceID1,[traceID2, …,]
is a nonempty list of trace IDs specifying the procedures to be traced. Each trace ID may represent either an individual procedure or a source file in the HDF library. In the case where the ID represents a source file, all procedures in that file will be traced. An arbitrary number of trace IDs may be passed. The include file ProcTrace.h in the HDF include directory contains the definitions of the trace IDs associated with the HDF procedures and HDF library source files. Passing the value ID_ALLHDF causes all HDF procedures to be traced.
traceType
specifies the type of tracing to be performed: RUNTIME_TRACE or SUMMARY_TRACE. Note that SUMMARY_TRACE is not available with Autopilot.
#include "ProcTrace.h"
void hdfCaptureEnd( void );
Instrumenting C++ Language Code:
In addition to Unix I/O, MPI-I/O and HDF, C++ stream I/O can also be monitored. The methods for performance monitoring of the Unix I/O, C++ Stream I/O, MPI-I/O or HDF differ only slightly. These differences are noted below. In each case, the user must add a statement to include a PCF header file PcCinterface.h at the beginning of the code. A call to an initialization routine must be placed in the main program before any calls to the routines to be monitored. If MPI parallelism is used, this call should follow the MPI_Init call. A call to a termination routine must occur after all of the monitored activity. If MPI is used, this call should precede the call to MPI_Finalize.
Unix I/O:
The Unix I/O initialization routine for C++ codes is unixIOCaptureInit and the termination routine is unixIOCaptureEnd. The syntax of these routines is defined below. In addition to adding the initialization and termination calls in the main program, the user must define a constant PcUIOcapture and add an include statement for a PCF header file PcCinterface.h to each of the source programs to be monitored. This header file contains macros which will cause the standard Unix I/O functions to be renamed to a function in the PCF library at compilation time. During execution, control is passed to the routine in the PCF library which records data and calls the corresponding standard I/O routine.
#include "PcCinterface.h"
void unixIOCaptureInit( const char* output,
int procNum,
int traceType );
where the parameters are
output
is the name of the output file if the PCF TraceFile library is used. Output will go to the file output.procNum.
If the PCF Autopilot library is used, it is the name of the remote host.
procNum
is the processor number. This is used to distinguish the trace files or sensors. Set the value to 0 if the code is run in scalar mode.
traceType
specifies the type of tracing to be performed: RUNTIME_TRACE or SUMMARY_TRACE. Note that SUMMARY_TRACE is not available with Autopilot.
#include "PcCinterface.h"
void unixIOCaptureEnd( void );
C++ Stream I/O:
To monitor C++ Stream I/O it is necessary to add an include statement for the header file PabloStream.h and to redefine ofstream, ifstream and fstream. For initialization, define an object (PabloObj in the listing below) of type PcCppBase:
#include "PabloStream.h"
#define ofstream traceOfstream
#define ifstream traceIfstream
#define fstream traceFstream
PcCppBase PabloObj (TraceFileName, PcConstants::tracetype, procNum);
…
[code to be traced]
PabloObj.end( );
where the parameters are
TraceFileName
is the root name of the output file if the PCF TraceFile library is used. Output will go to the file TraceFileName.procNum.
tracetype
specifies the type of tracing to be performed (runtime or summary).
procNum
is the processor number. This is used to distinguish the trace files. Set the value to 0 if the code is run in scalar mode.
MPI-I/O:
The MPI-I/O initialization routine for C++ codes is mpiIOCaptureInit and the termination routine is mpiIOCaptureEnd. These are described below. To trace the activity of the MPI-I/O routines, the Pablo PCF libraries provide procedures with names identical to the MPI-I/O routines. On entry to these procedures, data is captured and the corresponding profiling MPI-I/O routine is called. After return from the profiling routine, data is again captured. The user is only required to modify the main program by adding a call to the initialization routine after the call to the MPI_Init routine and a call to the termination routine just prior to calling the MPI_Finalize routine.
#include "PcCinterface.h"
void mpiIOCaptureInit( const char* output,
int procNum,
int traceType );
where the parameters are
output
is the name of the output file if the PCF TraceFile library is used. Output will go to the file output.procNum.
If the PCF Autopilot library is used, it is the name of the remote host.
procNum
is the processor number. This is used to distinguish the trace files or sensors. Set the value to 0 if the code is run in scalar mode.
traceType
specifies the type of tracing to be performed: RUNTIME_TRACE or SUMMARY_TRACE. Note that SUMMARY_TRACE is not available with Autopilot.
#include "PcCinterface.h"
void mpiIOCaptureEnd( void );
HDF:
The HDF initialization routine for C++ codes is hdfCaptureInit and the termination routine is hdfCaptureEnd. These are described below. To trace HDF, follow the instructions to build the special instrumented version of the HDF library, libhdf5-inst.a, in the pablo subdirectory of the source code supplied by HDF. Link with this library instead of the normal HDF library. In addition, the user is only required to modify the main program by adding calls to the initialization and termination routines.
Note: Initially, these special versions of the HDF libraries may only be available through the Pablo Website.
#include "ProcTrace.h"
void hdfCaptureInit( const char* output,
int procNum,
unsigned traceID1,[traceID2, …,]
int traceType );
where the parameters are
output
is the name of the output file if the PCF TraceFile library is used. Output will go to the file output.procNum.
If the PCF Autopilot library is used, it is the name of the remote host.
procNum
is the processor number. This is used to distinguish the trace files or sensors. Set the value to 0 if the code is run in scalar mode.
traceID1,[traceID2, …,]
is a nonempty list of trace IDs specifying the procedures to be traced. Each trace ID may represent either an individual procedure or a source file in the HDF library. In the case where the ID represents a source file, all procedures in that file will be traced. An arbitrary number of trace IDs may be passed. The include file ProcTrace.h in the HDF include directory contains the definitions of the trace IDs associated with the HDF procedures and HDF library source files. Passing the value ID_ALLHDF causes all HDF procedures to be traced.
traceType
specifies the type of tracing to be performed: RUNTIME_TRACE or SUMMARY_TRACE. Note that SUMMARY_TRACE is not available with Autopilot.
#include "ProcTrace.h"
void hdfCaptureEnd( void );
Instrumenting a FORTRAN Language Code:
The method for instrumenting a FORTRAN code is similar to instrumenting a C code. Again the methods for performance monitoring of the Unix I/O, MPI-I/O or HDF differ only slightly. These differences are noted below. In each case, the user must add a statement to include a PCF header file fPCFinterface.h at the beginning of the code. A call to an initialization routine must be placed in the main program before any calls to the routines to be monitored. If MPI parallelism is used, this call should follow the MPI_Init call. A call to a termination routine must occur after all of the monitored activity. If MPI is used, this call should precede the call to MPI_Finalize.
Unix I/O:
The Unix I/O initialization routine for FORTRAN codes is unixIOCaptureInit and the termination routine is unixIOCaptureEnd. The syntax of these routines is defined below. Because FORTRAN I/O is implemented through language statements rather than by I/O calls, monitoring of these events requires "wrapping" I/O statements with calls to PCF trace bracketing routines. The procedure is the same as that used for bracketing FORTRAN I/O statements with the Pablo Trace Library except for the difference in initialization and termination calls.
subroutine unixIOCaptureInit( output, procNum, traceType )
where the parameters are
character*(*) output
is the name of the output file if the PCF TraceFile library is used. Output will go to the file output.procNum.
If the PCF Autopilot library is used, it is the name of the remote host.
integer procNum
is the processor number. This is used to distinguish the trace files or sensors.
integer traceType
specifies the type of tracing to be performed: RUNTIME_TRACE or SUMMARY_TRACE. Note that SUMMARY_TRACE is not available with Autopilot.
subroutine unixIOCaptureEnd( )
Select from the following links to see a sample FORTRAN code before or after instrumentation:
Sample FORTRAN Code Before Unix I/O Instrumentation
Sample FORTRAN Code After Unix I/O Instrumentation
MPI-I/O:
The MPI-I/O initialization routine for FORTRAN codes is mpiIOCaptureInitF and the termination routine is mpiIOCaptureEndF. These are described below. To trace the activity of the MPI-I/O routines, the Pablo PCF libraries provide procedures with names identical to the MPI-I/O routines. On entry to these procedures, data is captured and the corresponding profiling MPI-I/O routine is called. After return from the profiling routine, data is again captured. The user is only required to modify the main program by adding a call to the initialization routine after the call to the MPI_Init routine and a call to the termination routine just prior to calling the MPI_Finalize routine.
subroutine mpiIOCaptureInitF( output, procNum, traceType )
where the parameters are
character*(*) output
is the name of the output file if the PCF TraceFile library is used. Output will go to the file output.procNum.
If the PCF Autopilot library is used, it is the name of the remote host.
integer procNum
is the processor number. This is used to distinguish the trace files or sensors.
integer traceType
specifies the type of tracing to be performed: RUNTIME_TRACE or SUMMARY_TRACE. Note that SUMMARY_TRACE is not available with Autopilot.
subroutine mpiIOCaptureEndF( )
HDF:
The PCF does not support a FORTRAN interface for HDF at this time.
The compiling step for either C or FORTRAN requires that the include subdirectory for the Pablo header files be specified. Below, <PCF> is the directory containing the PCF software. Other flags may be required depending on the type of application. The names of the FORTRAN and C compilers may vary as well.
For FORTRAN:
    f77 -c -<Flags> prog.f -I<PCF>/include
For MPI C:
    cc -c -<Flags> prog.c -I<PCF>/include
For C++:
    C++ -c -<Flags> prog.C -I<PCF>/include
The link step is somewhat complicated. The PCF library is written in C++ and uses both pthread and Pablo SDDF procedures. If the source code is written in FORTRAN, there may be FORTRAN intrinsic externals to be resolved. If the Autopilot interface is used, Autopilot and Globus library externals must be resolved as well. The following notes should help in successfully linking.
The following illustrates the link steps used for the Trace File and Autopilot interfaces:
Let the following variables be set to the indicated paths:
PCF          the directory containing the PCF software
PTHREAD      the directory containing the pthread library
FORT         the directory containing the -lftn library
AUTOPILOT    the directory containing the Autopilot software
GLOBUS       the directory containing the globus pthreads_standard or pthreads_standard_debug
LDFLAGS_TF = -L$PCF -L$PTHREAD
LDFLAGS_AP = $LDFLAGS_TF -L$AUTOPILOT -L$GLOBUS
For scalar C TraceFile:
    CC -o myExe prog.o $LDFLAGS_TF -lPabloPCF_TF -lpthread <other Platform-specific libraries>
For scalar C Autopilot:
    CC -o myExe prog.o $LDFLAGS_AP -lPabloPCF_AP -lglobus_nexus -lglobus_io \
        -lglobus_mp -lglobus_dc -lglobus_gss_assist -lglobus_gss -lglobus_gaa -lglobus_common \
        -lpthread <other Platform-specific libraries>
· FORTRAN applications will require the additional flags -L$FORT -lftn
· MPI applications may require -lmpi and possibly -lmpi++ flags. mpiCC may be required in place of CC.
Execution is as usual:
Scalar:
    myExe
MPI:
    mpirun -np <nprocs> myExe
The type of output will depend on the type of performance monitoring performed, i.e., whether the TraceFile or Autopilot options are used.
If the TraceFile monitoring is used, the output will be put in trace files. One trace file will be produced for each processor. The name of the file will have the root name output passed as an argument to the initialization routine with a suffix indicating the processor number. For example, if the value of the output argument is myFile, then the trace files will be myFile.0, myFile.1, etc.
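A post-processing tool that needs to locate the per-processor trace files can rebuild their names from the same two values. The helper below is a minimal sketch and is not part of the PCF: the function name sketch_trace_file_name is hypothetical, and it assumes the suffix is the plain decimal processor number, as in the myFile.0 example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "<output>.<procNum>" in dst.
 * Returns 0 on success, or -1 if procNum is negative or the name does
 * not fit in dst_size bytes (including the terminating NUL). */
int sketch_trace_file_name(char *dst, size_t dst_size,
                           const char *output, int procNum) {
    assert(dst != NULL && output != NULL);
    if (procNum < 0) {
        return -1;
    }
    int n = snprintf(dst, dst_size, "%s.%d", output, procNum);
    /* snprintf returns the length it wanted to write; a value that is
     * negative or not smaller than dst_size means failure or truncation. */
    return (n >= 0 && (size_t)n < dst_size) ? 0 : -1;
}
```

With an output argument of myFile, processor numbers 0 and 1 give myFile.0 and myFile.1. A buffer too small to hold the full name is reported as an error rather than silently truncated.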
If the Autopilot monitoring is used, the monitored information is passed to the host specified by the output parameter to the initialization routine through Autopilot sensors. The sensors have the Name properties UnixIOCapture, MPIIOCapture, and HDFCapture. They also have the Processor Number property procNum where procNum is the value of that argument passed to the initialization routine. An Autopilot Client can use this information to monitor these activities.