ASBAL: An Autonomic Storage System

 

 

 Objectives

ASBAL is an autonomic storage system that improves I/O performance automatically. To achieve autonomic operation, the system first models the behavior of the storage devices and then predicts the behavior of the applications with regard to their file access patterns. It also proposes alternative data placements and predicts the performance that each new distribution would obtain. Finally, with this information, the system decides whether a better distribution exists and, if so, implements it.

 People Involved

 

 Motivation

One of the main trends in computing is the increasing need of applications for I/O capacity and performance. In addition, the gap between primary- and secondary-memory speed continues to grow, which means that I/O performance is becoming the bottleneck for these applications.

Many approaches to this problem have been proposed over the last decades. One of the most promising consists of finding the configuration of the storage system and the placement of data that maximize storage-system performance for a specific workload. Currently, these optimizations are usually done by experts who rely on their experience and intuition to choose the configuration and placement. This means that only a few sites can take advantage of this kind of "optimal" placement, because not everybody has (or can afford) an expert to place data in the best possible way. For this reason, a tool that could perform this tuning automatically would be a great step toward making the technique available to a wider range of sites. Such a tool becomes even more useful if the optimal configuration and placement vary over time, which makes it harder to keep the right placement up to date.

Our objective is to design a storage system capable of extracting all potential performance and capacity available in a heterogeneous environment with as little human interaction as possible. We envision the system as an advanced data-placement mechanism that analyzes the workload to decide the best distribution of data among all available devices, as well as the best placement within each device.

 Autonomic storage system: a global picture

In order to place the work in the right context, we will first give a global description of how the whole system works, and then we will describe in detail each of the parts that build it.


The figure presents the steps performed by our system, which are described in the following paragraphs. At the starting point, when the system is new, we have a set of disks attached to the system. The first thing we have to do is model them. The model should have two main properties. First, it should be able to predict the performance of a given workload without having to run it. Second, it should treat the disk as a "black box". Otherwise, the model would not work for disks with new characteristics or mechanics.
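To make the "black box" idea concrete, here is a minimal sketch of a model that predicts service time only from observed samples, with no knowledge of the disk's geometry or mechanics. It is an illustration under stated assumptions, not the model used by ASBAL: the class name `BlackBoxDiskModel`, the choice of features (seek distance and request size), and the bucket-averaging technique are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


class BlackBoxDiskModel:
    """Hypothetical sketch: learns service times from measurements only."""

    def __init__(self, distance_bucket=1000, size_bucket=8):
        # Bucket widths are arbitrary illustrative values (in blocks).
        self.distance_bucket = distance_bucket
        self.size_bucket = size_bucket
        self._samples = defaultdict(list)  # bucket key -> observed times (ms)
        self._all = []                     # every observed time, for fallback

    def _key(self, distance, size):
        return (distance // self.distance_bucket, size // self.size_bucket)

    def train(self, distance, size, service_time_ms):
        """Record one measured request; nothing about disk internals is used."""
        self._samples[self._key(distance, size)].append(service_time_ms)
        self._all.append(service_time_ms)

    def predict_request(self, distance, size):
        """Average of similar observed requests, or the global average."""
        bucket = self._samples.get(self._key(distance, size))
        if bucket:
            return mean(bucket)
        if not self._all:
            raise ValueError("model has not been trained")
        return mean(self._all)

    def predict_workload(self, requests):
        """Predict total time (ms) for (block, size) pairs without running them."""
        total = 0.0
        position = 0  # assumed head position after the previous request
        for block, size in requests:
            total += self.predict_request(abs(block - position), size)
            position = block + size
        return total


model = BlackBoxDiskModel()
model.train(0, 8, 1.0)
model.train(0, 8, 3.0)
model.train(5000, 8, 10.0)
predicted_ms = model.predict_workload([(0, 8), (5008, 8)])
```

Because the model is built only from measurements, the same code could be trained on any new device simply by measuring a set of requests on it.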

Once all disks are modeled, and we can therefore predict the performance of any possible workload, we start learning the workload behavior. This step mainly consists of tracing the requests issued to the disks and keeping them in file-system internal data structures.
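The tracing step can be pictured with a small sketch. The structure below is an assumption made for illustration; the text does not specify what the file-system internal data structures look like, and the names `Request`, `WorkloadTrace`, and `hottest_blocks` are hypothetical.

```python
from collections import Counter, namedtuple

# One traced request: when it happened, where it started, how long it was.
Request = namedtuple("Request", "timestamp block size is_write")


class WorkloadTrace:
    """Hypothetical in-memory trace of the requests issued to a disk."""

    def __init__(self):
        self.requests = []          # requests in arrival order
        self.block_hits = Counter() # block number -> number of accesses

    def record(self, timestamp, block, size, is_write=False):
        self.requests.append(Request(timestamp, block, size, is_write))
        # Count every block touched by the request.
        for b in range(block, block + size):
            self.block_hits[b] += 1

    def hottest_blocks(self, n):
        """Return the n most frequently accessed block numbers."""
        return [b for b, _ in self.block_hits.most_common(n)]


trace = WorkloadTrace()
trace.record(0.0, 10, 2)  # touches blocks 10 and 11
trace.record(1.0, 11, 1)  # touches block 11 again
hot = trace.hottest_blocks(1)
```

A summary such as the per-block access count is one plausible input for the placement generator described next.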

Periodically (once a day, once a week, or at any other interval, depending on the needs), the system uses the learned workload behavior to generate different placement alternatives that may (or may not) improve the performance of the applications running on the system. As these new placements cannot be physically implemented and tested, we use the disk models to predict the performance each new placement would achieve.
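The generate-and-predict step might be sketched as follows. Everything here is a simplifying assumption for illustration: a placement is a mapping from logical block to disk name, alternatives are produced by moving one hot block at a time, each disk model is a plain callable returning a predicted time in milliseconds, and the total time is the sum over disks (ignoring parallelism). The function names are hypothetical.

```python
def predict_time(placement, trace, disk_models):
    """Predicted time (ms) of the trace under a placement, using models only."""
    per_disk = {disk: [] for disk in disk_models}
    for block in trace:
        per_disk[placement[block]].append(block)
    # Simplification: disks are assumed to serve their requests one after another.
    return sum(disk_models[disk](reqs) for disk, reqs in per_disk.items())


def generate_alternatives(placement, hot_blocks, disks):
    """Yield candidate placements, each moving one hot block to another disk."""
    for block in hot_blocks:
        for disk in disks:
            if disk != placement[block]:
                candidate = dict(placement)
                candidate[block] = disk
                yield candidate


def best_alternative(placement, trace, hot_blocks, disk_models):
    """Return the candidate with the lowest predicted time, or None."""
    candidates = generate_alternatives(placement, hot_blocks, list(disk_models))
    return min(candidates,
               key=lambda p: predict_time(p, trace, disk_models),
               default=None)


# Toy disk models: a fixed cost per request (illustrative values).
disk_models = {
    "fast": lambda reqs: 1.0 * len(reqs),
    "slow": lambda reqs: 5.0 * len(reqs),
}
placement = {1: "slow", 2: "fast"}
trace = [1, 1, 1, 2]

current_ms = predict_time(placement, trace, disk_models)
best = best_alternative(placement, trace, [1], disk_models)
best_ms = predict_time(best, trace, disk_models)
```

No block is moved at this stage; every candidate is evaluated purely through the models.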

After the performance of all proposed placements has been predicted, we pick the best one and compare it with the performance of the current placement (which has been learned at the same time as the workload). If the new placement is predicted to be more than 10% better than the current one, we take the effort of moving the blocks to implement the new placement and thus improve the performance of the applications using these disks.
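The decision rule can be written as a relative-gain threshold. The sketch below assumes that performance is expressed as a time in milliseconds (lower is better) and that "10% better" means a relative reduction of more than 10% in that time; the function name `should_migrate` and its parameters are hypothetical.

```python
def should_migrate(current_time_ms, predicted_time_ms, min_gain=0.10):
    """Return True if the predicted placement beats the current one by more
    than min_gain (relative reduction in time)."""
    if current_time_ms <= 0:
        return False  # no meaningful baseline to compare against
    gain = (current_time_ms - predicted_time_ms) / current_time_ms
    return gain > min_gain


worth_moving = should_migrate(100.0, 85.0)      # 15% predicted improvement
not_worth_moving = should_migrate(100.0, 95.0)  # only 5% predicted improvement
```

The threshold keeps the system from moving blocks for gains too small to justify the effort of the migration.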

It is also important to note that whenever a new disk is added, it has to be modeled first; the generator of placement alternatives can then place blocks on it.

 Related Publications

  • Autonomic Storage System Based on Automatic Learning
    F. Hidrobo and T. Cortes
    International Conference on High-performance Computing (HiPC 2004)
    Bangalore, India, December 19-22, 2004
  • Towards an Automatic Storage System to Improve Parallel I/O
    F. Hidrobo and T. Cortes
    Parallel and Distributed Computing and Systems (PDCS 2003)
    Marina del Rey, USA, November 3-5, 2003
  • Towards a Zero-Knowledge Model for Disk Drive
    F. Hidrobo and T. Cortes
    Autonomic Computing Workshop (AMS 2003)
    Seattle, WA, June 25, 2003