Using the Pablo PCF: A Tutorial

The Pablo Performance Capture Facility (PCF) is a set of programs that can be substituted for standard I/O, MPI-I/O or HDF calls so that performance data in C, C++ and FORTRAN codes can be captured and measured. These programs perform exactly the same functions as their standard counterparts, but they have been augmented by instrumentation software, attached before and after the call itself, to capture internal data. Programmers compile their instrumented source code and link it with one of the PCF libraries. Depending on which PCF library is used for linking, the performance data captured by the instrumentation during program execution is either recorded in SDDF trace files or transferred via Autopilot for remote monitoring of the activity.

If SDDF trace files are produced, the user has the option of producing either a runtime trace or a summary trace of the monitored activity.  If a runtime trace is performed, a record is written each time a monitored event occurs.  This allows the user to analyze the I/O behavior of the application over time.  If the summary trace option is selected, the user gets only one record per processor containing a summary of the monitored activity on that processor.
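The difference between the two trace options can be illustrated with a minimal C sketch. It is not PCF code: the record layouts and the function names sketchRuntimeAppend and sketchSummaryAdd are hypothetical, and the actual SDDF records written by the PCF will differ. The sketch only shows that a runtime trace keeps one record per event, while a summary trace folds all events into a single record.

```c
#include <stddef.h>

/* Hypothetical record layouts; the real SDDF records differ. */
typedef struct {
    int    eventId;   /* which monitored call occurred */
    double seconds;   /* time spent in the call */
    long   bytes;     /* bytes transferred by the call */
} SketchEventRecord;

typedef struct {
    long   eventCount;
    double totalSeconds;
    long   totalBytes;
} SketchSummaryRecord;

/* Runtime trace: one record is stored for every monitored event.
 * Returns the number of records held after the call. */
size_t sketchRuntimeAppend(SketchEventRecord *trace, size_t used,
                           size_t capacity, SketchEventRecord ev)
{
    if (used < capacity) {
        trace[used++] = ev;
    }
    return used;
}

/* Summary trace: every event is folded into one per-processor record. */
void sketchSummaryAdd(SketchSummaryRecord *summary, SketchEventRecord ev)
{
    summary->eventCount++;
    summary->totalSeconds += ev.seconds;
    summary->totalBytes += ev.bytes;
}
```

With the runtime form the order and timing of individual events remain available for later analysis; with the summary form only the totals survive, which keeps the output small.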

If Autopilot monitoring is selected, the user passes the address of the remote site that will receive the output.  The user can then write an Autopilot client to analyze the data.

Below are instructions for instrumenting a code and for compiling, linking, and executing the instrumented source code.

 

Instrumenting A Code

In profiling the performance of an application's I/O behavior, the PCF can be used to capture performance metrics for the Unix I/O, MPI-I/O and HDF events that take place during program execution. An event, in this context, occurs each time a function in these categories is invoked.  Below we describe how to instrument C, C++ and FORTRAN language programs.

 

Instrumenting a C Language Code:

The methods for performance monitoring of the Unix I/O, MPI-I/O or HDF differ only slightly. These differences are noted below.  In each case, the user must add a statement to include a PCF header file PcCinterface.h at the beginning of the code. A call to an initialization routine must be placed in the main program before any calls to the routines to be monitored.  If MPI parallelism is used, this call should follow the MPI_Init call.  A call to a termination routine must occur after all of the monitored activity.  If MPI is used, this call should precede the call to MPI_Finalize.

Unix I/O

The Unix I/O initialization routine for C codes is unixIOCaptureInit  and the termination routine is unixIOCaptureEnd.  The syntax of these routines is defined below.  In addition to adding the initialization and termination calls in the main program,  the user must define a constant PcUIOcapture and add an include statement for a PCF header file PcCinterface.h to each of the source programs to be monitored.  This header file contains macros which will cause the standard Unix I/O functions to be renamed to a function in the PCF library at compilation time.  During execution, control is passed to the routine in the PCF library which records data and calls the corresponding standard I/O routine. 
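The renaming mechanism described above can be illustrated with a minimal C sketch. It is not the PCF implementation: the wrapper name pcSketchWrite, the counters, and the helper sketchEmitGreeting are hypothetical, and the macros actually defined in PcCinterface.h may look different. The sketch only shows how a macro can redirect a standard call to a wrapper that records data around the original routine.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical counters standing in for the data a trace library records. */
static unsigned long pcSketchWriteCalls = 0;
static unsigned long pcSketchWriteBytes = 0;

/* Hypothetical wrapper. It is defined before the macro below, so the
 * write() call inside it still reaches the standard routine. */
ssize_t pcSketchWrite(int fd, const void *buf, size_t count)
{
    ssize_t written;

    pcSketchWriteCalls++;                 /* capture on entry */
    written = write(fd, buf, count);      /* the standard Unix I/O call */
    if (written > 0) {
        pcSketchWriteBytes += (unsigned long)written;  /* capture on return */
    }
    return written;
}

/* From here on, every write(...) in this source file is renamed to the
 * wrapper at compilation time. */
#define write pcSketchWrite

/* Application code is unchanged; it still appears to call write(). */
ssize_t sketchEmitGreeting(int fd)
{
    static const char msg[] = "hello\n";
    return write(fd, msg, sizeof msg - 1);
}
```

Because the substitution happens in the preprocessor, only source files that see the macro are affected, which is consistent with the requirement that the header be included in each source file to be monitored.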

 

#include "PcCinterface.h"

void unixIOCaptureInit( const char* output,int procNum, int traceType );

 

where the parameters are

  output 

is the name of the output file if the PCF TraceFile library is used.  Output will go to the file output.procNum

it is the name of the remote host if the PCF Autopilot library is used

  procNum  

is the processor number.  This is used to distinguish the trace files or sensors.  Set the value to 0 if the code is run in scalar mode.

  traceType    

specifies the type of tracing to be performed: RUNTIME_TRACE or SUMMARY_TRACE.  Note that SUMMARY_TRACE is not available with Autopilot

 

#include "PcCinterface.h"

void unixIOCaptureEnd( void );

Select from the following links to see a sample C code before or after instrumentation:

 

Sample C Code Before Unix I/O Instrumentation

Sample C Code After Unix I/O Instrumentation
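As a rough illustration of the steps above, the following sketch brackets a small piece of Unix I/O with the initialization and termination calls. Several things in it are assumptions: HAVE_PABLO_PCF is a hypothetical build flag, defining PcUIOcapture with #define before the include is one plausible way of defining the constant, and the stand-in definitions (including the numeric values of RUNTIME_TRACE and SUMMARY_TRACE) exist only so the sketch builds where the PCF is not installed. In a real program the body of sketchTracedWrite would sit in main.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

#ifdef HAVE_PABLO_PCF              /* hypothetical flag, not part of the PCF */
#define PcUIOcapture               /* assumed way of defining the constant */
#include "PcCinterface.h"
#else
/* Stand-ins for builds without the PCF; the values are placeholders. */
enum { RUNTIME_TRACE = 0, SUMMARY_TRACE = 1 };
static int sketchCaptureActive = 0;
static void unixIOCaptureInit(const char *output, int procNum, int traceType)
{
    (void)output; (void)procNum; (void)traceType;
    sketchCaptureActive = 1;
}
static void unixIOCaptureEnd(void)
{
    sketchCaptureActive = 0;
}
#endif

/* Writes a small file between the capture calls. Returns 0 on success. */
int sketchTracedWrite(const char *dataPath, const char *traceRoot)
{
    static const char payload[] = "sample data\n";
    ssize_t n;
    int fd;

    /* Before any monitored call; procNum 0 because the run is scalar. */
    unixIOCaptureInit(traceRoot, 0, RUNTIME_TRACE);

    fd = open(dataPath, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        unixIOCaptureEnd();
        return -1;
    }
    n = write(fd, payload, sizeof payload - 1);
    close(fd);

    /* After all of the monitored activity. */
    unixIOCaptureEnd();
    return n == (ssize_t)(sizeof payload - 1) ? 0 : -1;
}
```

When linked against the PCF TraceFile library, a call such as sketchTracedWrite("data.out", "myFile") would be expected to leave its trace in myFile.0, following the output.procNum naming described above.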

 

MPI-I/O: 

The MPI-I/O initialization routine for C codes is mpiIOCaptureInit and the termination routine is mpiIOCaptureEnd.  These are described below. To trace the activity of the MPI-I/O routines, the Pablo PCF libraries provide procedures with names identical to the MPI-I/O routines.  On entry to these procedures, data is captured and the corresponding profiling MPI-I/O routine is called. After return from the profiling routine, data is again captured.  The user is only required to modify the main program by adding a call to the initialization routine after the call to the MPI_Init routine and a call to the termination routine just prior to calling the MPI_Finalize routine.

 

#include "PcCinterface.h"

void mpiIOCaptureInit( const char* output, int procNum, int traceType );

 

where the parameters are

  output 

is the name of the output file if the PCF TraceFile library is used.  Output will go to the file output.procNum

it is the name of the remote host if the PCF Autopilot library is used

  procNum  

is the processor number.  This is used to distinguish the trace files or sensors.  Set the value to 0 if the code is run in scalar mode.

  traceType    

specifies the type of tracing to be performed: RUNTIME_TRACE or SUMMARY_TRACE.  Note that SUMMARY_TRACE is not available with Autopilot

 

#include "PcCinterface.h"

void mpiIOCaptureEnd( void );

 

HDF: 

The HDF initialization routine for C codes is hdfCaptureInit and the termination routine is hdfCaptureEnd.  These are described below. To trace HDF activity, follow the instructions to build the special instrumented version of the HDF5 library, libhdf5-inst.a, in the pablo subdirectory of the source code supplied by HDF.  Link with this library instead of the normal HDF5 library.  In addition, the user is only required to modify the main program by adding calls to the initialization and termination routines.

Note: Initially, these special versions of the HDF5 libraries may only be available through the Pablo Website.

 

#include "ProcTrace.h"

void hdfCaptureInit( const char* output,
                     int procNum,
                     unsigned traceID1, [traceID2, …,]
                     int traceType );

 

where the parameters are

  output 

is the name of the output file if the PCF TraceFile library is used.  Output will go to the file output.procNum

It is the name of the remote host if the PCF Autopilot library is used

  procNum  

is the processor number.  This is used to distinguish the trace files or sensors.  Set the value to 0 if the code is run in scalar mode.

  traceID1,[traceID2, …,]

is a nonempty list of trace IDs specifying the procedures to be traced.  Each trace ID may represent either an individual procedure or a source file in the HDF library. In the case where the ID represents a source file, all procedures in that file will be traced. An arbitrary number of trace IDs may be passed. The include file ProcTrace.h in the HDF include directory contains the definitions of the trace IDs associated with the HDF procedures and HDF library source files. Passing the value ID_ALLHDF causes all HDF procedures to be traced.

 

  traceType    

specifies the type of tracing to be performed: RUNTIME_TRACE or SUMMARY_TRACE.  Note that SUMMARY_TRACE is not available with Autopilot

#include "ProcTrace.h"

void hdfCaptureEnd( void );

 

Instrumenting C++ Language Code:

In addition to Unix I/O, MPI-I/O and HDF, C++ stream I/O can also be monitored. The methods for performance monitoring of Unix I/O, C++ Stream I/O, MPI-I/O or HDF differ only slightly. These differences are noted below.  In each case, the user must add a statement to include a PCF header file PcCinterface.h at the beginning of the code. A call to an initialization routine must be placed in the main program before any calls to the routines to be monitored.  If MPI parallelism is used, this call should follow the MPI_Init call.  A call to a termination routine must occur after all of the monitored activity.  If MPI is used, this call should precede the call to MPI_Finalize.

Unix I/O

The Unix I/O initialization routine for C++ codes is unixIOCaptureInit  and the termination routine is unixIOCaptureEnd.  The syntax of these routines is defined below.  In addition to adding the initialization and termination calls in the main program,  the user must define a constant PcUIOcapture and add an include statement for a PCF header file PcCinterface.h to each of the source programs to be monitored.  This header file contains macros which will cause the standard Unix I/O functions to be renamed to a function in the PCF library at compilation time.  During execution, control is passed to the routine in the PCF library which records data and calls the corresponding standard I/O routine. 

 

#include "PcCinterface.h"

void unixIOCaptureInit( const char* output, int procNum, int traceType );

 

where the parameters are

  output 

is the name of the output file if the PCF TraceFile library is used.  Output will go to the file output.procNum

it is the name of the remote host if the PCF Autopilot library is used

  procNum  

is the processor number.  This is used to distinguish the trace files or sensors.  Set the value to 0 if the code is run in scalar mode.

  traceType    

specifies the type of tracing to be performed: RUNTIME_TRACE or SUMMARY_TRACE.  Note that SUMMARY_TRACE is not available with Autopilot

 

#include "PcCinterface.h"

void unixIOCaptureEnd( void );


 

C++ Stream I/O:

To monitor C++ Stream I/O it is necessary to add an include statement for the header file PabloStream.h and to redefine ofstream, ifstream and fstream.  For initialization, define an object of type PcCppBase (named PabloObj in the listing that follows):

 

#include "PabloStream.h"

#define ofstream traceOfstream

#define ifstream traceIfstream

#define fstream traceFstream

 

PcCppBase PabloObj (TraceFileName, PcConstants::tracetype, procNum);

[code to be traced]

PabloObj.end( );

 

where the parameters are

  TraceFileName

        is the root name of the output file if the PCF TraceFile library is used.  Output will go to the file TraceFileName.procNum.

  tracetype

specifies the type of tracing to be performed (runtime or summary).

  procNum  

is the processor number.  This is used to distinguish the trace files.  Set the value to 0 if the code is run in scalar mode.

 

MPI-I/O: 

The MPI-I/O initialization routine for C++ codes is mpiIOCaptureInit and the termination routine is mpiIOCaptureEnd.  These are described below. To trace the activity of the MPI-I/O routines, the Pablo PCF libraries provide procedures with names identical to the MPI-I/O routines.  On entry to these procedures, data is captured and the corresponding profiling MPI-I/O routine is called. After return from the profiling routine, data is again captured.  The user is only required to modify the main program by adding a call to the initialization routine after the call to the MPI_Init routine and a call to the termination routine just prior to calling the MPI_Finalize routine.

 

#include "PcCinterface.h"

void mpiIOCaptureInit( const char* output, int procNum, int traceType );

 

where the parameters are

  output 

        is the name of the output file if the PCF TraceFile library is used.  Output will go to the file output.procNum

        It is the name of the remote host if the PCF Autopilot library is used

  procNum  

        is the processor number.  This is used to distinguish the trace files or sensors.  Set the value to 0 if the code is run in scalar mode.

  traceType    

specifies the type of tracing to be performed: RUNTIME_TRACE or SUMMARY_TRACE.  Note that SUMMARY_TRACE is not available with Autopilot

 

#include "PcCinterface.h"

void mpiIOCaptureEnd( void );

 

HDF

The HDF initialization routine for C++ codes is hdfCaptureInit and the termination routine is hdfCaptureEnd.  These are described below. To trace HDF activity, follow the instructions to build the special instrumented version of the HDF library, libhdf5-inst.a, in the pablo subdirectory of the source code supplied by HDF.  Link with this library instead of the normal HDF library.  In addition, the user is only required to modify the main program by adding calls to the initialization and termination routines.

Note: Initially, these special versions of the HDF libraries may only be available through the Pablo Website.

 

#include "ProcTrace.h"

void hdfCaptureInit( const char* output,
                     int procNum,
                     unsigned traceID1, [traceID2, …,]
                     int traceType );

 

where the parameters are

  output 

is the name of the output file if the PCF TraceFile library is used.  Output will go to the file output.procNum

it is the name of the remote host if the PCF Autopilot library is used

  procNum  

is the processor number.  This is used to distinguish the trace files or sensors.  Set the value to 0 if the code is run in scalar mode.

  traceID1,[traceID2, …,]

is a nonempty list of trace IDs specifying the procedures to be traced.  Each trace ID may represent either an individual procedure or a source file in the HDF library. In the case where the ID represents a source file, all procedures in that file will be traced. An arbitrary number of trace IDs may be passed. The include file ProcTrace.h in the HDF include directory contains the definitions of the trace IDs associated with the HDF procedures and HDF library source files. Passing the value ID_ALLHDF causes all HDF procedures to be traced.

  traceType    

specifies the type of tracing to be performed: RUNTIME_TRACE or SUMMARY_TRACE.  Note that SUMMARY_TRACE is not available with Autopilot

#include "ProcTrace.h"

void hdfCaptureEnd( void );

 

 

Instrumenting a FORTRAN Language Code:

 The method for instrumenting a FORTRAN code is similar to instrumenting a C code. Again the methods for performance monitoring of the Unix I/O, MPI-I/O or HDF differ only slightly. These differences are noted below.  In each case, the user must add a statement to include a PCF header file fPCFinterface.h at the beginning of the code. A call to an initialization routine must be placed in the main program before any calls to the routines to be monitored.  If MPI parallelism is used, this call should follow the MPI_Init call.  A call to a termination routine must occur after all of the monitored activity.  If MPI is used, this call should precede the call to MPI_Finalize.

 

Unix I/O

The Unix I/O initialization routine for FORTRAN codes is unixIOCaptureInit  and the termination routine is unixIOCaptureEnd.  The syntax of these routines is defined below.  Because FORTRAN I/O is implemented through language statements rather than by I/O calls, monitoring of these events requires "wrapping" I/O statements with calls to PCF trace bracketing routines. The procedure is the same as that used for bracketing FORTRAN I/O statements with the Pablo Trace Library except for the difference in initialization and termination calls. 

 

subroutine unixIOCaptureInit( output, procNum, traceType )

 

where the parameters are

  character*(*) output 

is the name of the output file if the PCF TraceFile library is used.  Output will go to the file output.procNum

it is the name of the remote host if the PCF Autopilot library is used

  integer procNum  

is the processor number.  This is used to distinguish the trace files or sensors

  integer traceType    

specifies the type of tracing to be performed: RUNTIME_TRACE or SUMMARY_TRACE.  Note that SUMMARY_TRACE is not available with Autopilot

subroutine unixIOCaptureEnd( )

 

Select from the following links to see a sample FORTRAN code before or after instrumentation:

Sample Fortran Code Before Unix I/O Instrumentation
Sample FORTRAN Code After Unix I/O Instrumentation

MPI-I/O:

The MPI-I/O initialization routine for FORTRAN codes is mpiIOCaptureInitF and the termination routine is mpiIOCaptureEndF.  These are described below. To trace the activity of the MPI-I/O routines, the Pablo PCF libraries provide procedures with names identical to the MPI-I/O routines.  On entry to these procedures, data is captured and the corresponding profiling MPI-I/O routine is called. After return from the profiling routine, data is again captured.  The user is only required to modify the main program by adding a call to the initialization routine after the call to the MPI_Init routine and a call to the termination routine just prior to calling the MPI_Finalize routine.

 

 subroutine mpiIOCaptureInitF( output, procNum, traceType )

 

where the parameters are

  character*(*) output 

is the name of the output file if the PCF TraceFile library is used.  Output will go to the file output.procNum

it is the name of the remote host if the PCF Autopilot library is used

 

  integer procNum  

is the processor number.  This is used to distinguish the trace files or sensors

 

  integer traceType    

specifies the type of tracing to be performed: RUNTIME_TRACE or SUMMARY_TRACE.  Note that SUMMARY_TRACE is not available with Autopilot

 

subroutine mpiIOCaptureEndF( )

HDF:

The PCF does not support a FORTRAN interface for HDF at this time.

Compiling, Linking, and Execution

·         Compiling

The compiling step for either C or FORTRAN requires that the include subdirectory for the Pablo header files be specified. Below, <PCF> is the directory containing the PCF software.  Other flags may be required depending on the type of application.  The names of the FORTRAN and C compilers may vary as well.

For FORTRAN

f77 -c -<Flags> prog.f -I<PCF>/include

For MPI C

cc -c -<Flags> prog.c -I<PCF>/include

For C++

C++ -c -<Flags> prog.C -I<PCF>/include

·         Linking

The link step is somewhat complicated.  The PCF library is written in C++ and uses both pthread and Pablo SDDF procedures. If the source code is written in FORTRAN, there may be FORTRAN intrinsic externals to be resolved.  If the Autopilot interface is used, Autopilot and Globus library externals must be resolved as well.  The following notes should help in linking successfully.

The following illustrates the link steps used for the Trace File and Autopilot interfaces:

Let the following variables be set to the indicated paths:

PCF                directory containing the PCF software

PTHREAD       the directory containing the pthread library

FORT               the directory containing the -lftn library

AUTOPILOT   the directory containing the Autopilot software

GLOBUS         the directory containing the globus pthreads_standard or pthreads_standard_debug

 

LDFLAGS_TF = -L$PCF  -L$PTHREAD 

LDFLAGS_AP = $LDFLAGS_TF  -L$AUTOPILOT -L$GLOBUS

For scalar C TraceFile

CC -o myExe prog.o $LDFLAGS_TF  -lPabloPCF_TF -lpthread <other Platform-specific libraries>

For scalar C Autopilot

CC -o myExe prog.o $LDFLAGS_AP -lPabloPCF_AP  -lglobus_nexus -lglobus_io \
   -lglobus_mp -lglobus_dc -lglobus_gss_assist -lglobus_gss -lglobus_gaa -lglobus_common \
   -lpthread <other Platform-specific libraries>

 

·         FORTRAN applications will require the additional flags -L$FORT -lftn

·         MPI applications may require the -lmpi and possibly -lmpi++ flags.  mpiCC may be required in place of CC.

·         Execution

Execution is as usual:

Scalar

myExe

MPI

mpirun -np <nprocs> myExe

The type of output will depend on the type of performance monitoring performed, i.e., whether the TraceFile or Autopilot options are used.

If TraceFile monitoring is used, the output will be put in trace files.  One trace file will be produced for each processor.  The name of each file consists of the root name passed as the output argument to the initialization routine, followed by a suffix indicating the processor number.  For example, if the value of the output argument is myFile, then the trace files will be myFile.0, myFile.1, etc.
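A post-processing tool that needs to locate the trace files can rebuild their names from the same two values. The helper below is a minimal sketch of that naming rule only; the function name sketchTraceFileName is hypothetical and is not part of the PCF.

```c
#include <stdio.h>
#include <string.h>

/* Builds "<output>.<procNum>" in buf.
 * Returns 0 on success, or -1 if the root name is missing or empty,
 * or if the result does not fit in bufSize bytes. */
int sketchTraceFileName(char *buf, size_t bufSize,
                        const char *output, int procNum)
{
    int needed;

    if (output == NULL || strlen(output) == 0) {
        return -1;
    }
    needed = snprintf(buf, bufSize, "%s.%d", output, procNum);
    return (needed >= 0 && (size_t)needed < bufSize) ? 0 : -1;
}
```

Calling it with the root name myFile and the processor numbers 0 and 1 yields myFile.0 and myFile.1, matching the example above.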

If the Autopilot monitoring is used, the monitored information is passed to the host specified by the output parameter to the initialization routine through Autopilot sensors. The sensors have the Name properties UnixIOCapture, MPIIOCapture, and HDFCapture. They also have the Processor Number property procNum where procNum is the value of that argument passed to the initialization routine.  An Autopilot Client can use this information to monitor these activities.