Using the Pablo MPI I/O Library: A Tutorial

The MPI I/O extension to the Pablo Trace Library is a set of routines that can be substituted for standard MPI I/O calls so that I/O performance data can be captured and measured. These routines do exactly what their standard MPI I/O counterparts do, but they are augmented with instrumentation, executed before and after the call itself, that captures performance data. Programmers compile their instrumented source code and link it with the trace library. During program execution, performance data captured by the instrumentation is recorded in SDDF (Self-Defining Data Format) files. Analysts use Pablo analysis tools to study these SDDF files and understand how MPI I/O behaves and interacts with other system components. This tutorial includes instructions for instrumenting a code; instructions for compiling, linking, and executing instrumented source code; and access to an end-to-end example of performance analysis.

Instrumenting A Code

In profiling an application's MPI I/O behavior, Pablo tools track, or trace, the I/O events that take place during program execution. An event, in this context, occurs each time a specific MPI I/O function is executed. The standard MPI I/O routines are replaced by routines of the same name in the Pablo library, which perform the instrumentation and call the corresponding PMPI_ entry. This way the original MPI I/O routine still does its work, while metrics reflecting its performance are captured and written to an SDDF record.

Standard MPI I/O Request    Standard C MPI I/O Call

Open a file                 MPI_File_open
Read from a file            MPI_File_read
Write to a file             MPI_File_write
Close the file              MPI_File_close
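
The replacement of a standard routine by an instrumented one that delegates to the underlying entry point can be sketched in plain C, without MPI. This is only an illustration under stated assumptions: real_write stands in for a PMPI_ entry, traced_write stands in for the instrumented routine, and the TraceSample structure and the use of clock() are hypothetical, not part of the Pablo library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical metrics captured around one call; not the Pablo record layout. */
typedef struct {
    clock_t start;   /* clock reading taken before the real call */
    clock_t end;     /* clock reading taken after the real call  */
    size_t  bytes;   /* number of bytes the caller asked to copy */
    int     status;  /* return value of the real call            */
} TraceSample;

/* Stand-in for the underlying routine (the role a PMPI_ entry plays). */
int real_write(char *dst, const char *src, size_t n)
{
    memcpy(dst, src, n);
    return 0;
}

/* Stand-in for the instrumented routine: measure, delegate, record. */
int traced_write(char *dst, const char *src, size_t n, TraceSample *out)
{
    assert(out != NULL);
    out->start  = clock();
    out->status = real_write(dst, src, n);  /* original behavior is preserved */
    out->end    = clock();
    out->bytes  = n;
    return out->status;
}
```

The caller sees the same result it would get from real_write alone; the only difference is the filled-in TraceSample, which is the kind of data a trace library would turn into a trace record.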

Users insert, within their code, a call to the Pablo trace library initialization routine. Then they instrument their code by replacing the statements or calls that trigger targeted I/O events with calls to trace routines, which are the trace library versions of the corresponding I/O calls or commands. Termination routines, placed just after the instrumented calls, return control to the application making the call.

When an instrumented event occurs, the trace library generates a record of the occurrence, a trace record, written in SDDF format. Each SDDF file contains both record descriptors and data records. The record descriptors define the structure of the data records to be generated for each occurrence of a targeted event, so that analysis tools can readily interpret the trace data records.

The data records within an SDDF file contain the actual metrics captured for a given occurrence of a trace event, ordered according to the corresponding descriptor record. The descriptors are generated at initialization time with a call to InitMPIOtrace. Data records are generated when the targeted event occurs.
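
The relationship between a descriptor and its data records can be sketched as follows. This is a minimal in-memory illustration, not the SDDF file format: the RecordDescriptor and DataRecord types, the field names, and record_get are all hypothetical. The point it shows is that a data record holds values only, and the descriptor supplies the meaning of each position.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FIELDS 8

/* Hypothetical descriptor: names the fields of one record type, in order. */
typedef struct {
    const char *record_name;
    size_t      field_count;
    const char *field_names[MAX_FIELDS];
} RecordDescriptor;

/* Hypothetical data record: values only, ordered as in the descriptor. */
typedef struct {
    const RecordDescriptor *descriptor;
    double                  values[MAX_FIELDS];
} DataRecord;

/* Look up a value by field name using the descriptor's field order.
   Returns 0 on success, -1 if the descriptor has no such field. */
int record_get(const DataRecord *rec, const char *field, double *out)
{
    assert(rec != NULL && rec->descriptor != NULL && out != NULL);
    for (size_t i = 0; i < rec->descriptor->field_count; i++) {
        if (strcmp(rec->descriptor->field_names[i], field) == 0) {
            *out = rec->values[i];
            return 0;
        }
    }
    return -1;
}
```

Because every record of a given type shares one descriptor, the descriptor needs to be written only once, while data records can be appended as events occur.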

The Pablo MPI I/O Trace Utilities can produce two types of MPI I/O trace records: runtime trace records and real-time (summary) trace records.

  1. Runtime Tracing

If the Runtime Tracing option is used, an SDDF packet, called an MPI I/O procedure trace record, is produced each time an MPI I/O procedure is entered. The MPI I/O procedure trace record contains data indicating the type of procedure, the processor number, and the time the record was produced.

On return from a traced MPI I/O procedure, another MPI I/O procedure trace record is produced. At the end of the run, the trace files contain enough information to produce a thorough analysis of the MPI I/O activity that occurred during execution of the program.
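
One kind of analysis that entry and return records make possible is measuring the time spent inside each procedure. The sketch below pairs each entry record with the following return record of the same type on the same processor. The ProcRecord layout, the event identifiers, and the is_exit flag are assumptions made for illustration; they are not the fields of the real trace records.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event identifiers; real trace files define their own. */
enum { EV_OPEN = 1, EV_READ, EV_WRITE, EV_CLOSE };

/* Hypothetical in-memory form of one procedure trace record. */
typedef struct {
    int    event;      /* which procedure the record refers to */
    int    processor;  /* processor that produced the record   */
    double seconds;    /* time the record was produced         */
    int    is_exit;    /* 0 = entry record, 1 = return record  */
} ProcRecord;

/* Sum the time between each entry record and the next return record for
   one event type on one processor. Assumes the records are in time order
   and that calls of the same type do not nest. */
double time_in_event(const ProcRecord *recs, size_t n, int event, int processor)
{
    double total = 0.0, entered = 0.0;
    int in_call = 0;
    assert(recs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (recs[i].event != event || recs[i].processor != processor)
            continue;               /* record belongs to another call stream */
        if (!recs[i].is_exit) {
            entered = recs[i].seconds;
            in_call = 1;
        } else if (in_call) {
            total += recs[i].seconds - entered;
            in_call = 0;
        }
    }
    return total;
}
```

Since two records are produced per call, the amount of trace data grows with the number of calls made, which is why runtime trace files can become large.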

  2. Real-time (summary) Tracing

If the Summary Tracing option is used, statistics about the MPI I/O are recorded in tables in memory during runtime. Prior to the end of execution, an SDDF record, called an MPI I/O summary record, is produced for each of the MPI I/O procedures that was traced. These records are written to the trace file associated with the processor.
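
The idea of accumulating statistics in an in-memory table and emitting one summary per procedure at the end can be sketched like this. The table layout, the procedure identifiers, and the functions summary_init, summary_add, and summary_flush are hypothetical; the sketch also assumes that a procedure counts as traced once it has been called at least once.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical procedure identifiers used to index the table. */
enum { PROC_OPEN, PROC_READ, PROC_WRITE, PROC_CLOSE, PROC_COUNT };

/* Hypothetical statistics kept for one procedure. */
typedef struct {
    unsigned long calls;          /* number of calls observed     */
    double        total_seconds;  /* total time spent in the call */
} SummaryEntry;

typedef struct {
    SummaryEntry entries[PROC_COUNT];
} SummaryTable;

/* Reset every entry before tracing starts. */
void summary_init(SummaryTable *t)
{
    assert(t != NULL);
    for (int p = 0; p < PROC_COUNT; p++) {
        t->entries[p].calls = 0;
        t->entries[p].total_seconds = 0.0;
    }
}

/* Called once per traced call: update the table, write nothing. */
void summary_add(SummaryTable *t, int proc, double seconds)
{
    assert(t != NULL && proc >= 0 && proc < PROC_COUNT);
    t->entries[proc].calls += 1;
    t->entries[proc].total_seconds += seconds;
}

/* Called once before the end of execution: copy out one summary per
   procedure that was called. Returns the number of summaries written. */
size_t summary_flush(const SummaryTable *t, int *procs, SummaryEntry *out, size_t cap)
{
    size_t n = 0;
    assert(t != NULL && procs != NULL && out != NULL);
    for (int p = 0; p < PROC_COUNT && n < cap; p++) {
        if (t->entries[p].calls > 0) {
            procs[n] = p;
            out[n] = t->entries[p];
            n++;
        }
    }
    return n;
}
```

The table has a fixed number of entries however many calls are made, so the output size depends on the number of procedures traced rather than on the number of calls.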

The Runtime Tracing option produces much more information than the Summary Tracing option but may also produce trace files that are extremely large. The size of the Summary Tracing output files is roughly proportional to the number of MPI I/O procedures being traced and the number of processors used during execution.

Compiling, Linking, and Execution

Depending on the programming language used, the procedures for compiling differ slightly. In every case, the Pablo trace header file must be included.

These header files can be found in the include subdirectory of the directory containing the Pablo software.

For MPI FORTRAN:  mpif77 -c mpioexample.f -I<Pablodir>/include -I<mpidir>/include
For MPI C:        mpicc -c mpiExample.c -I<Pablodir>/include -I<mpidir>/include

The files must be linked with the libraries libPabloTrace.a and libPabloTraceExt.a, which can be found in the lib subdirectory of the directory containing the Pablo software.

For MPI FORTRAN:  mpif77 -o myEXE mpioexample.o -L<Pablodir>/lib -lPabloTrace -lPabloTraceExt -lmpi [other libraries as necessary]
For MPI C:        mpicc -o myEXE mpiExample.o -L<Pablodir>/lib -lPabloTrace -lPabloTraceExt -lmpi [other libraries as necessary]

Execution is as usual:

MPI:  mpirun -np <nprocs> myEXE

This produces one trace file for each processor.

Download the Pablo MPI I/O Library User Guide