On-demand file staging system for Linux clusters (IEEE Cluster 2009)


This paper proposes Catwalk, an on-demand file staging system. Catwalk is designed to run on any Linux cluster without special or additional hardware. By hooking the system calls for file operations, Catwalk makes file staging transparent to users, so they no longer risk writing incorrect file staging scripts.
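The hooking idea can be illustrated with a small user-level analogy. This is only a sketch under stated assumptions: it wraps Python's `open` rather than hooking Linux system calls, and the names `make_staging_open`, `remote_dir`, and `local_dir` are hypothetical, not taken from Catwalk.

```python
import os
import shutil
import tempfile


def make_staging_open(remote_dir, local_dir, real_open=open):
    """Return an open()-like function that stages files in on first read.

    remote_dir stands in for the file server and local_dir for node-local
    storage; both are assumptions made for this illustration.
    """
    def staging_open(path, mode="r", *args, **kwargs):
        local_path = os.path.join(local_dir, path)
        os.makedirs(os.path.dirname(local_path), exist_ok=True)
        if "r" in mode and not os.path.exists(local_path):
            # On-demand stage-in: copy the file the first time it is read.
            shutil.copyfile(os.path.join(remote_dir, path), local_path)
        # The caller always works on the node-local copy.
        return real_open(local_path, mode, *args, **kwargs)

    return staging_open


if __name__ == "__main__":
    remote = tempfile.mkdtemp()
    local = tempfile.mkdtemp()
    with open(os.path.join(remote, "input.dat"), "w") as f:
        f.write("payload")

    staged_open = make_staging_open(remote, local)
    with staged_open("input.dat") as f:
        print(f.read())
```

The caller uses `staged_open` exactly like `open` and never writes a staging script; the copy happens as a side effect of the first read.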

Catwalk copies files over standard TCP, so it runs on ordinary, widely used Ethernet. Stage-in copies are pipelined to make full use of the bandwidth of a single file server. Catwalk is evaluated against NFS using synthetic but realistic workloads. The evaluations show that pipelined stage-in performs much better than NFS. Stage-out performance is comparable to that of NFS despite the extra file copying, and Catwalk stage-out loads the file server only lightly, whereas NFS imposes a much heavier server load.
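Why pipelining helps can be seen from a back-of-the-envelope timing model. The abstract does not describe the pipeline layout, so the sketch below assumes a simple chain (server to node 1, node 1 to node 2, and so on) with equal link bandwidth and full-size chunks; the function names and parameters are hypothetical.

```python
import math


def serial_stage_in_time(file_size, bandwidth, nodes):
    """Time for one server to send the whole file to each node in turn."""
    return nodes * file_size / bandwidth


def pipelined_stage_in_time(file_size, bandwidth, nodes, chunk_size):
    """Time for a chain pipeline in which every node forwards chunk k
    to its successor while receiving chunk k + 1.

    Assumes every chunk is full-size and every hop has the same bandwidth.
    """
    chunks = math.ceil(file_size / chunk_size)
    chunk_time = chunk_size / bandwidth
    # The last chunk leaves the server after `chunks` steps and then needs
    # `nodes - 1` further hops to reach the end of the chain.
    return (chunks + nodes - 1) * chunk_time


if __name__ == "__main__":
    # Hypothetical inputs: 1000 MB file, 100 MB/s links, 8 nodes, 10 MB chunks.
    print(serial_stage_in_time(1000, 100, 8))
    print(pipelined_stage_in_time(1000, 100, 8, 10))
```

With these illustrative inputs (not measurements from the paper), the model gives 80 s for serial copying and about 10.7 s for the chain pipeline, because the server sends the file only once.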

The biggest problems of NFS are its centralized design and its lack of scheduling for parallel workloads. Catwalk's results show that remote file access performance can be improved substantially when file accesses are scheduled properly. The proposed file staging system can therefore be a strong complement to NFS, especially on small clusters, which often have no dedicated parallel file system.