SCSI Disk Feature-Extraction Facility
During the past five years, the capacity, speed, and reliability of commodity disks have increased dramatically. However, the disparity between processor and disk speed continues to grow. To achieve high performance, particularly for I/O-intensive applications, developers must use storage devices intelligently, implementing I/O to take advantage of file system and disk policies. Understanding disk behavior is a prerequisite to optimizing I/O performance and to designing appropriate file systems.
The SCSI Disk Feature-Extraction Facility (SDFEF), a component of the Pablo Performance Analysis Environment, is a software toolkit that automatically extracts important parameters of disk operation. It was designed to characterize disk features automatically and to extract key parameters of SCSI disks using the SCSI standard for disk interrogation [BGYJ95]. These parameters can be used as baseline data when interpreting physical I/O traces from device-driver instrumentation or when evaluating the ways file systems mediate application requests and disk responses.
SDFEF uses two methods to extract SCSI disk parameter data: interrogation and empirical extraction. The data generated by SDFEF contains results from both methods.
Modern SCSI disk drives supply some disk parameter values upon request. Most disk vendors implement both ANSI-standard and vendor-specific methods for requesting configuration information. For portability, SDFEF uses only ANSI-standard features supported by most popular SCSI disks. The interrogation data includes disk information such as disk defect lists, the capacity of the logical unit, and disk configuration parameters.
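As an illustration of what interrogation data looks like, the sketch below decodes the 8-byte response of the standard SCSI READ CAPACITY (10) command, which reports the last logical block address and the block length as two big-endian 32-bit fields. It is a minimal sketch and not SDFEF's own code: the function name parse_read_capacity_10 is hypothetical, and the sample bytes are constructed by hand rather than read from a real disk.

```python
import struct


def parse_read_capacity_10(data: bytes) -> dict:
    """Decode the 8-byte parameter data returned by READ CAPACITY (10).

    Hypothetical helper for illustration; it only decodes bytes and does
    not send any command to a device.
    """
    if len(data) < 8:
        raise ValueError("READ CAPACITY (10) returns 8 bytes of data")
    # Bytes 0-3: address of the last logical block (big-endian).
    # Bytes 4-7: block length in bytes (big-endian).
    last_lba, block_length = struct.unpack(">II", data[:8])
    num_blocks = last_lba + 1  # block addresses start at 0
    return {
        "last_lba": last_lba,
        "block_length": block_length,
        "num_blocks": num_blocks,
        "capacity_bytes": num_blocks * block_length,
    }


# Hand-built sample: last LBA 4194303, 512-byte blocks (made-up values).
sample = struct.pack(">II", 4194303, 512)
info = parse_read_capacity_10(sample)
print(info["capacity_bytes"])  # 2147483648 bytes, i.e. 2 GiB
```

The capacity of the logical unit follows directly from the two fields: the number of blocks is the last block address plus one, multiplied by the block length.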
Empirical extraction issues a specific series of user commands and extrapolates the desired values from the results. There are two reasons to use empirical extraction:
Some interrogation values are averaged values and can be misleading.
Some parameters can only be obtained empirically.
The empirical extraction data includes disk zone information, rotation speed, and prefetching-related information, among other parameters.
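To make the idea of extrapolating a parameter from timing results concrete, the sketch below estimates rotation speed from the completion times of repeated reads. It rests on an assumption that is not taken from SDFEF itself: if the same sector is read back to back with the drive cache bypassed, successive completions are roughly one revolution apart, so the typical interval approximates the rotation period. The function name estimate_rpm and the timestamps are hypothetical.

```python
from statistics import median


def estimate_rpm(completion_times_s):
    """Estimate rotation speed (RPM) from read completion timestamps.

    Assumes each timestamp (in seconds) marks the completion of a read
    of the same sector, so consecutive completions are about one
    revolution apart. Illustrative only; not SDFEF's actual procedure.
    """
    if len(completion_times_s) < 2:
        raise ValueError("need at least two completion times")
    # Interval between each pair of consecutive completions.
    intervals = [b - a for a, b in zip(completion_times_s, completion_times_s[1:])]
    # The median discards occasional outliers, such as a missed revolution.
    period_s = median(intervals)
    if period_s <= 0:
        raise ValueError("completion times must be increasing")
    return 60.0 / period_s


# Made-up timestamps: one revolution every 1/120 s, with one missed revolution.
times = [0 / 120, 1 / 120, 2 / 120, 4 / 120, 5 / 120, 6 / 120]
print(round(estimate_rpm(times)))  # 7200
```

Using the median rather than the mean keeps a single delayed read from skewing the estimate, which matters because host scheduling noise can add a full extra revolution to an individual measurement.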
SDFEF includes user-level programs and a Java user interface. The user-level programs automatically determine the parameters of a SCSI disk through interrogation and through extrapolation from the timing results of predefined operations, such as a series of READs. The Java user interface presents the obtained parameter values in a user-friendly way.
This research is supported in part by the US National Science Foundation and the US Department of Energy.