The UNIX I/O extension to the Pablo Trace Library is a set of routines that can be substituted for standard UNIX, C, and FORTRAN I/O calls so that I/O performance data can be captured and measured. These routines perform exactly the same operations as their standard I/O-call counterparts, but they have been augmented by instrumentation software, attached before and after the call itself, to capture internal data. Programmers compile their instrumented source code and link it with the trace library. During program execution, the performance data captured by the instrumentation is recorded in SDDF files. Analysts use Pablo analysis tools to study these SDDF files and understand how UNIX I/O behaves and interacts with other system components. This tutorial includes instructions for instrumenting code; instructions for compiling, linking, and executing instrumented source code; and access to an end-to-end example of performance analysis.
In profiling the performance of an application's I/O behavior, Pablo tools track, or trace, the I/O events that take place during program execution. An event, in this context, occurs each time a specific statement or instruction (open, read, write, seek, flush, close, etc.) is executed.
| Standard I/O Request | Standard C I/O Call | Instrumented C I/O Call |
| --- | --- | --- |
| Open a file | fopen | traceFOPEN |
| Read from a file | fread | traceFREAD |
| Write to a file | fwrite | traceFWRITE |
| Close the file | fclose | traceFCLOSE |

Users insert, within their code, a call to the Pablo trace library initialization routine. Then they instrument their code by replacing statements or calls that trigger targeted I/O events with calls to trace routines, which are the trace library versions of the corresponding I/O calls or commands. Termination routines, placed just after instrumentation, return control to the application making the call.
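The substitution can be pictured with a short C sketch. The trace routine names come from the table above, but this tutorial does not give their signatures, so the sketch assumes each one takes the same arguments as the standard call it replaces. It also defines local stand-ins for them so that it compiles without the Pablo library. `copy_file` is a hypothetical helper, not part of Pablo.

```c
#include <stdio.h>

/*
 * Stand-ins so the sketch compiles without the Pablo library. In a real
 * build these definitions would be dropped and IOTrace.h included instead.
 * ASSUMPTION: each trace routine takes the same arguments as the standard
 * call it replaces.
 */
#define traceFOPEN(path, mode)       fopen((path), (mode))
#define traceFREAD(buf, sz, n, fp)   fread((buf), (sz), (n), (fp))
#define traceFWRITE(buf, sz, n, fp)  fwrite((buf), (sz), (n), (fp))
#define traceFCLOSE(fp)              fclose((fp))

/* Hypothetical helper: copy src to dst using the instrumented calls.
 * Returns the number of bytes copied, or -1 on error. */
long copy_file(const char *src, const char *dst)
{
    /* A real program would call the trace library initialization routine
     * before the first instrumented call. */
    FILE *in = traceFOPEN(src, "rb");       /* was: fopen(src, "rb") */
    if (in == NULL)
        return -1;
    FILE *out = traceFOPEN(dst, "wb");      /* was: fopen(dst, "wb") */
    if (out == NULL) {
        traceFCLOSE(in);
        return -1;
    }

    char buf[4096];
    long total = 0;
    size_t n;
    while ((n = traceFREAD(buf, 1, sizeof buf, in)) > 0) {  /* was: fread  */
        if (traceFWRITE(buf, 1, n, out) != n) {             /* was: fwrite */
            total = -1;
            break;
        }
        total += (long)n;
    }

    traceFCLOSE(in);                        /* was: fclose(in)  */
    if (traceFCLOSE(out) != 0)              /* was: fclose(out) */
        total = -1;
    return total;
}
```

Only the call names change; the surrounding control flow and error handling stay as they were in the uninstrumented code.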
Trace Routines
The UNIX I/O extension to the Pablo Trace Library provides programmers with a predefined set of I/O events and corresponding trace routines. There are three classes of UNIX I/O trace routines:
- Trace Interface Routines provide performance-gathering interfaces to standard library routines called from programs written in C. The details traced depend on the operation performed. A complete list of the predefined I/O detail trace records is available in the Pablo documentation. In outline, these routines:
  - Check and record the timestamp
  - Make the actual I/O request
  - Check the timestamp again and compute the operation duration
  - Generate the detail trace event record
  - Return the result of the I/O request to the program that called it
- Trace Bracketing Routines allow users to collect performance information on I/O requests implemented as statements within the language rather than as library calls (for instance in FORTRAN). These routines come in pairs, which are used to bracket the actual I/O request. Since the bracketing routines are not direct replacements for the actual I/O request, it is up to the user to call these routines with accurate arguments.
- Production Control Routines provide options for controlling the volume and type of trace events generated by the extension. Complete lists of the production control routines for C and for FORTRAN are available in the Pablo documentation. Users can employ these routines to:
  - selectively enable I/O tracing for only parts of the code
  - selectively disable and reenable the production of detail I/O trace events
  - summarize I/O activity as the program executes, generating I/O summary trace events instead of detail traces. There are three types of summaries:
    - Lifetime summaries: accumulate information on file I/O activity between opens and closes
    - TimeWindow summaries: accumulate information about all I/O activity during windows, or specified units of time
    - File Region summaries: accumulate information about I/O activity in specified regions of open files
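The steps that a trace interface routine performs (timestamp, request, timestamp, record, return) can be sketched in C as follows. This is not the Pablo implementation: `timed_fread`, `io_record`, and `now_seconds` are hypothetical names, and the sketch stores its result in a plain struct rather than writing an SDDF trace record.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical in-memory stand-in for one detail trace record. */
struct io_record {
    double start;     /* timestamp taken before the request, in seconds */
    double duration;  /* elapsed time of the request, in seconds        */
    size_t items;     /* result returned by the underlying fread        */
};

/* Read a wall-clock timestamp in seconds (C11 timespec_get). */
static double now_seconds(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Hypothetical wrapper with the same arguments as fread plus a record. */
size_t timed_fread(void *buf, size_t size, size_t count, FILE *fp,
                   struct io_record *rec)
{
    double t0 = now_seconds();                  /* 1. record the timestamp    */
    size_t got = fread(buf, size, count, fp);   /* 2. make the actual request */
    double t1 = now_seconds();                  /* 3. check the timestamp again */

    rec->start = t0;                            /* 4. fill in the record      */
    rec->duration = t1 - t0;
    rec->items = got;

    return got;                                 /* 5. return the result       */
}
```

Because the wrapper returns exactly what `fread` returned, the calling code does not need to change beyond the call itself.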
Trace Records
When an instrumented event occurs, the trace library generates a record of the occurrence, a trace record, written in SDDF format. Each SDDF file contains both record descriptors and data records. The record descriptors define the structure of the data records to be generated for each occurrence of targeted events, enabling ready interpretation of trace data records by analysis tools.
The data records within an SDDF file contain the actual metrics captured for a given occurrence of a trace event, ordered according to the corresponding descriptor record. The descriptors are generated at initialization time with a call to InitIOtrace. Data records are generated when the targeted event occurs. A complete list of the SDDF trace event descriptors for the trace records produced by the I/O extension to the Pablo trace library is available in the Pablo documentation.
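The relationship between descriptors and data records can be illustrated with a small C sketch. It does not reproduce SDDF syntax or any Pablo data structure; `descriptor`, `data_record`, and `field_value` are hypothetical names chosen only to show how a descriptor, written once, lets a reader interpret records that carry values alone.

```c
#include <stddef.h>
#include <string.h>

#define MAX_FIELDS 4

/* Hypothetical descriptor: names the record type and its fields, in order. */
struct descriptor {
    const char *record_name;
    size_t      field_count;
    const char *field_names[MAX_FIELDS];
};

/* Hypothetical data record: values only, in the descriptor's field order. */
struct data_record {
    double values[MAX_FIELDS];
};

/* Look up a field by name through the descriptor.
 * Returns 1 and stores the value in *out on success, 0 if no such field. */
int field_value(const struct descriptor *d, const struct data_record *r,
                const char *name, double *out)
{
    for (size_t i = 0; i < d->field_count; i++) {
        if (strcmp(d->field_names[i], name) == 0) {
            *out = r->values[i];
            return 1;
        }
    }
    return 0;
}
```

In this arrangement the field names are stored once per record type, and each event contributes only its values, which is why the ordering of values must match the descriptor.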
Depending on the programming language used, the procedures for compiling differ slightly. In either case, you need to include the Pablo trace header file.
- For C: the trace header file is IOTrace.h
- For FORTRAN: the trace header file is fIOTrace.h
These header files can be found in the include subdirectory of the directory containing the Pablo software.
For C
cc -c ioexample2.c -I<Pablodir>/include
For FORTRAN
f77 -c ioexample.f -I<Pablodir>/include
For MPI C
mpicc -c mpiExample.c -I<Pablodir>/include -I<mpidir>/include
The files must be linked with the libraries libPabloTrace.a and libPabloTraceExt.a, which can be found in the lib subdirectory of the directory containing the Pablo software.
For C
cc -o myEXE ioexample2.o -L<Pablodir>/lib -lPabloTrace -lPabloTraceExt [other libraries as necessary]
For FORTRAN
f77 -o myEXE ioexample.o -L<Pablodir>/lib -lPabloTrace -lPabloTraceExt [other libraries as necessary]
For MPI C
mpicc -o myEXE mpiExample.o -L<Pablodir>/lib -lPabloTrace -lPabloTraceExt -lmpi [other libraries as necessary]
Execution is as usual:
For C or FORTRAN, from the command line
myEXE
For MPI
mpirun -np <nprocs> myEXE
This will produce a tracefile for each processor.