Using the Pablo UNIX I/O Library: A Tutorial

The UNIX I/O extension to the Pablo Trace Library is a set of routines that can be substituted for standard UNIX, C, and FORTRAN I/O calls so that I/O performance data can be captured and measured. These routines perform exactly the same operations as their standard I/O-call counterparts, but they have been augmented by instrumentation software, attached before and after the call itself, to capture internal data. Programmers compile their instrumented source code and link it with the trace library. During program execution, performance data captured by the instrumentation is recorded in SDDF files. Analysts use Pablo analysis tools to study these SDDF files and understand how UNIX I/O behaves and interacts with other system components. This tutorial includes instructions for instrumenting a code; instructions for compiling, linking, and executing instrumented source code; and access to an end-to-end example of performance analysis.

Instrumenting a code

In profiling the performance of an application's I/O behavior, Pablo tools track, or trace, the I/O events that take place during program execution. An event, in this context, occurs each time a specific statement or instruction (open, read, write, seek, flush, close, etc.) is executed.

Standard I/O Request    Standard C I/O Call    Instrumented C I/O Call

Open a file             fopen                  traceFOPEN
Read from a file        fread                  traceFREAD
Write to a file         fwrite                 traceFWRITE
Close the file          fclose                 traceFCLOSE

Users first insert a call to the Pablo trace library initialization routine into their code. They then instrument the code by replacing the statements or calls that trigger targeted I/O events with calls to trace routines, which are the trace library versions of the corresponding I/O calls or commands. Termination routines, placed just after the instrumentation, return control to the application making the call.
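The wrapping idea described above can be illustrated with a minimal, self-contained C sketch. It is not the Pablo implementation: the names sketch_init, sketch_fopen, sketch_fwrite, sketch_fclose, sketch_event_count, and the io_event record are hypothetical stand-ins, and the real trace routines (traceFOPEN and so on) and their exact signatures should be taken from the Pablo headers. The sketch only shows how a substitute routine can take a measurement before and after the standard call while returning the same result.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical event record; the real library writes SDDF records instead. */
typedef struct {
    const char *op;    /* name of the standard call that was wrapped */
    double seconds;    /* processor time spent in the call, from clock() */
    size_t items;      /* items transferred; 0 for open and close */
} io_event;

#define MAX_EVENTS 64
static io_event events[MAX_EVENTS];
static int n_events = -1;              /* -1 means "not initialized" */

/* Stand-in for the initialization routine: call before any wrapper. */
void sketch_init(void) { n_events = 0; }

static void record(const char *op, clock_t start, size_t items) {
    assert(n_events >= 0);             /* catches a missing sketch_init() */
    if (n_events < MAX_EVENTS) {
        events[n_events].op = op;
        events[n_events].seconds = (double)(clock() - start) / CLOCKS_PER_SEC;
        events[n_events].items = items;
        n_events++;
    }
}

/* Each wrapper has the same signature and result as the call it replaces. */
FILE *sketch_fopen(const char *path, const char *mode) {
    clock_t start = clock();           /* measurement before the call */
    FILE *fp = fopen(path, mode);      /* the unchanged standard call */
    record("fopen", start, 0);         /* measurement after the call */
    return fp;
}

size_t sketch_fwrite(const void *buf, size_t size, size_t n, FILE *fp) {
    clock_t start = clock();
    size_t written = fwrite(buf, size, n, fp);
    record("fwrite", start, written);
    return written;
}

int sketch_fclose(FILE *fp) {
    clock_t start = clock();
    int rc = fclose(fp);
    record("fclose", start, 0);
    return rc;
}

/* Number of events recorded since sketch_init(). */
int sketch_event_count(void) { return n_events; }
```

In application code, a statement such as fp = fopen(name, "w") would become fp = sketch_fopen(name, "w"), in the same way that the table above pairs fopen with traceFOPEN. Because each wrapper keeps the signature of the call it replaces, the rest of the program does not need to change.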

When an instrumented event occurs, the trace library generates a record of the occurrence, a trace record, written in SDDF format. Each SDDF file contains both record descriptors and data records. The record descriptors define the structure of the data records to be generated for each occurrence of a targeted event, enabling ready interpretation of trace data records by analysis tools.

The data records within an SDDF file contain the actual metrics captured for a given occurrence of a trace event, ordered according to the corresponding record descriptor. The descriptors are generated at initialization time with a call to InitIOtrace. Data records are generated each time a targeted event occurs. A complete list of the SDDF trace event descriptors for the trace records produced by the I/O extension to the Pablo trace library is available separately.
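The split between descriptors and data records can be sketched in C as follows. This is an illustration of the idea only: the DESC and DATA line layout, the field names, and the functions format_descriptor and format_record are assumptions made for this example and are not the SDDF syntax or any Pablo API. The descriptor is emitted once and names each field; every later data record carries only values, in the order the descriptor gives.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One field of a hypothetical record descriptor. */
typedef struct {
    const char *type;
    const char *name;
} field_desc;

/* Hypothetical descriptor for a "write" event: defined once, up front. */
static const field_desc write_fields[] = {
    {"double", "seconds"},
    {"int", "bytes"},
};
#define N_FIELDS (sizeof write_fields / sizeof write_fields[0])

/* Emit the descriptor line, e.g. at initialization time. */
int format_descriptor(char *out, size_t cap, const char *event) {
    assert(out != NULL && cap > 0);
    snprintf(out, cap, "DESC %s", event);
    for (size_t i = 0; i < N_FIELDS; i++) {
        size_t used = strlen(out);     /* snprintf always NUL-terminates */
        snprintf(out + used, cap - used, " %s:%s",
                 write_fields[i].type, write_fields[i].name);
    }
    return (int)strlen(out);
}

/* Emit one data record: values only, in descriptor order. */
int format_record(char *out, size_t cap, const char *event,
                  double seconds, int bytes) {
    assert(out != NULL && cap > 0);
    return snprintf(out, cap, "DATA %s %.3f %d", event, seconds, bytes);
}
```

An analysis tool that has read the descriptor line knows that the first value of every "write" data record is a duration and the second a byte count, so the data records themselves need no field names.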

Compiling, Linking, and Execution

The procedures for compiling differ slightly depending on the programming language used. In every case, the main requirement is that you include the Pablo trace header file.

These header files can be found in the include subdirectory of the directory containing the Pablo software.

For C          cc -c ioexample2.c -I<Pablodir>/include
For FORTRAN    f77 -c ioexample.f -I<Pablodir>/include
For MPI C      mpicc -c mpiExample.c -I<Pablodir>/include -I<mpidir>/include

The files must be linked with the libraries libPabloTrace.a and libPabloTraceExt.a, which can be found in the lib subdirectory of the directory containing the Pablo software.

For C          cc -o myEXE ioexample2.o -L<Pablodir>/lib -lPabloTrace -lPabloTraceExt [other libraries as necessary]
For FORTRAN    f77 -o myEXE ioexample.o -L<Pablodir>/lib -lPabloTrace -lPabloTraceExt [other libraries as necessary]
For MPI C      mpicc -o myEXE mpiExample.o -L<Pablodir>/lib -lPabloTrace -lPabloTraceExt -lmpi [other libraries as necessary]

Execution is as usual:

                Command line
C or FORTRAN    myEXE
MPI             mpirun -np <nprocs> myEXE

Running the instrumented program produces one trace file for each processor.

Download the Pablo UNIX I/O Library User Guide