File system metadata virtualization
ColaboratorCortés, Toni; Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
Document typeDoctoral thesis
PublisherUniversitat Politècnica de Catalunya
Rights accessOpen Access
The advance of computing systems has brought new ways to use and access the stored data that push the architecture of traditional file systems to its limits, making them inadequate to handle the new needs. Current challenges affect both the performance of high-end computing systems and its usability from the applications perspective. On one side, high-performance computing equipment is rapidly developing into large-scale aggregations of computing elements in the form of clusters, grids or clouds. On the other side, there is a widening range of scientific and commercial applications that seek to exploit these new computing facilities. The requirements of such applications are also heterogeneous, leading to dissimilar patterns of use of the underlying file systems. Data centres have tried to compensate this situation by providing several file systems to fulfil distinct requirements. Typically, the different file systems are mounted on different branches of a directory tree, and the preferred use of each branch is publicised to users. A similar approach is being used in personal computing devices. Typically, in a personal computer, there is a visible and clear distinction between the portion of the file system name space dedicated to local storage, the part corresponding to remote file systems and, recently, the areas linked to cloud services as, for example, directories to keep data synchronized across devices, to be shared with other users, or to be remotely backed-up. In practice, this approach compromises the usability of the file systems and the possibility of exploiting all the potential benefits. We consider that this burden can be alleviated by determining applicable features on a per-file basis, and not associating them to the location in a static, rigid name space. Moreover, usability would be further increased by providing multiple dynamic name spaces that could be adapted to specific application needs. This thesis contributes to this goal by proposing a mechanism to decouple the user view of the storage from its underlying structure. The mechanism consists in the virtualization of file system metadata (including both the name space and the object attributes) and the interposition of a sensible layer to take decisions on where and how the files should be stored in order to benefit from the underlying file system features, without incurring on usability or performance penalties due to inadequate usage. This technique allows to present multiple, simultaneous virtual views of the name space and the file system object attributes that can be adapted to specific application needs without altering the underlying storage configuration. The first contribution of the thesis introduces the design of a metadata virtualization framework that makes possible the above-mentioned decoupling; the second contribution consists in a method to improve file system performance in large-scale systems by using such metadata virtualization framework; finally, the third contribution consists in a technique to improve the usability of cloud-based storage systems in personal computing devices.
- Tesis - TDX-UPC