Improving Parallel Write by Node-Level Request Scheduling (IEEE CCGRID 2009)


In a cluster with multiple processors or CPU cores per node, many processes may run on each compute node. Each process tends to issue contiguous I/O requests for snapshots, checkpointing, and similar tasks. However, if a large number of processes enter the I/O phase at the same time, the requests from one process may be interleaved with the requests of other processes.

The I/O nodes then receive these requests in a non-contiguous order. This interleaved access pattern degrades performance in parallel file systems. To overcome this problem, we have designed the gather-arrange-scatter (GAS) I/O architecture for optimizing parallel write performance. GAS captures write operations, buffers them in memory, and schedules them to reduce the I/O cost at the I/O nodes.

The scheduling is done per compute node, and the requests are sent to the remote disks in parallel. In this paper, after introducing the GAS architecture in detail, we evaluate its efficiency and scalability using the NAS Parallel Benchmark BTIO. In BTIO with 16 nodes and 64 processes, GAS is 5.2% faster than ROMIO collective I/O on PVFS2, and 34.9% faster than MPI non-collective I/O in the same configuration.
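
The gather, arrange, and scatter steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `gather`, `arrange`, `scatter`, and `io_node_of`, the round-robin striping rule, and the tiny `STRIPE_SIZE` and `NUM_IO_NODES` values are all assumptions made for illustration, and the sketch assumes that no single request crosses a stripe boundary. It also sends requests sequentially, whereas the architecture sends them to the remote disks in parallel.

```python
from collections import defaultdict

# Hypothetical layout parameters, kept tiny so the example is easy to follow.
STRIPE_SIZE = 8    # bytes per stripe unit (assumption)
NUM_IO_NODES = 2   # number of I/O nodes (assumption)


def io_node_of(offset):
    """Map a file offset to an I/O node, assuming round-robin striping."""
    return (offset // STRIPE_SIZE) % NUM_IO_NODES


def gather(buffer, offset, data):
    """Capture a write request in the node-local memory buffer."""
    buffer.append((offset, data))


def arrange(buffer):
    """Group buffered requests by target I/O node, sort each group by
    offset, and merge requests that are contiguous in the file."""
    per_node = defaultdict(list)
    for offset, data in buffer:
        per_node[io_node_of(offset)].append((offset, data))

    scheduled = {}
    for node, requests in per_node.items():
        requests.sort(key=lambda r: r[0])
        merged = []
        for offset, data in requests:
            if merged and merged[-1][0] + len(merged[-1][1]) == offset:
                prev_offset, prev_data = merged[-1]
                merged[-1] = (prev_offset, prev_data + data)
            else:
                merged.append((offset, data))
        scheduled[node] = merged
    return scheduled


def scatter(scheduled, send):
    """Hand each arranged request to `send(node, offset, data)`.
    Sequential here; a real system would issue these in parallel."""
    for node in sorted(scheduled):
        for offset, data in scheduled[node]:
            send(node, offset, data)


if __name__ == "__main__":
    buffer = []
    # Two processes on one compute node, with interleaved arrival order.
    gather(buffer, 0, b"AAAA")
    gather(buffer, 8, b"BBBB")
    gather(buffer, 4, b"aaaa")
    gather(buffer, 12, b"bbbb")
    scatter(arrange(buffer), lambda node, off, data: print(node, off, data))
```

In this sketch, four small interleaved writes become one contiguous request per I/O node, which is the kind of reduction in non-contiguous accesses that the scheduling aims for.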