Research Area: | Distributed storage | ||
---|---|---|---|
Status: | Finished | Degree: | PhD |
Directors: | Students: | ||
Proposed start date: | 2011-09-01 | Proposed end date: | 2015-06-12 |
Attachments: | |||
Description: | |||
End-users increasingly demand larger amounts of reliable, highly available online space to store their personal information (e.g., documents, pictures). This motivates researchers to devise and evaluate novel personal storage systems that can cope with users' growing storage demands. In this dissertation, we study two emerging personal storage architectures: Personal Clouds and social storage systems. The two architectures are radically different and pursue distinct goals. On the one hand, Personal Clouds such as Dropbox or SugarSync are centralized online cloud services for personal information that enable users to store, synchronize and share data from a variety of devices and operating systems. On the other hand, a social storage service is built upon a decentralized system that leverages preexisting trust or social relationships between users to enable mutually beneficial resource sharing. In line with these two architectures, this thesis addresses two general challenges. Our first challenge is to understand the operation and performance of Personal Clouds. Under this challenge, (i) we first unveil the internal structure of a global-scale Personal Cloud, namely UbuntuOne (U1), by describing its architecture, metadata service and storage interactions. Moreover, (ii) we provide a back-end analysis of U1 that covers the storage workload, user behavior and the performance of the U1 metadata store. We also suggest improvements to U1, in terms of storage optimizations, user behavior detection and security, that can benefit similar Personal Cloud systems. Beyond the internal facets of Personal Clouds, users and applications may interact with these services externally. In this sense, we also contribute by (iii) measuring and characterizing the transfer performance (e.g., speed, variability) of Personal Cloud REST API services. 
Furthermore, we found that these API services may be a vector for abusing Personal Cloud free accounts, which motivated us to study this vulnerability and propose several countermeasures. Overall, our contributions under this challenge provide a holistic view of the behavior of Personal Clouds that extends the state-of-the-art knowledge on these systems. Our second challenge is to explore the Quality of Service (QoS) of social storage systems. In undertaking this challenge, we observed that social storage systems are strongly affected by availability correlations and by the very small groups of users available to store data. This scenario poses research questions about how to provide adequate storage QoS to users that remain open. In this sense, our first contribution is (iv) to analyze the QoS of social storage systems in terms of data availability, transfer performance and load balancing. Moreover, (v) we evaluate the suitability of common approaches for estimating data availability when users are correlated, showing that these techniques are severely biased and how this bias affects the data redundancy calculation. Consequently, we propose a history-based method to calculate data availability tailored to heterogeneous and correlated availabilities. Given the performance limitations inherent to many social storage scenarios, (vi) we design a hybrid architecture that combines user resources and cloud storage to enhance the QoS achieved by the system and to let users find the right balance between control and performance. In the experimental evaluation of this architecture, we focus especially on the role that the social topology plays in the system’s performance. In summary, we contribute new insights on the performance of social storage systems as well as alternative architectural designs. Our contributions may help to increase the feasibility and performance of these systems, which is fundamental to their eventual adoption by end-users. 
Keywords: Personal Clouds, Performance Analysis, Distributed Storage, Friend-to-Friend Systems, Social Clouds. |
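
The bias mentioned in contribution (v) can be illustrated with a small sketch. It is not the method proposed in the dissertation, whose details are not given in this abstract. It only contrasts an estimate that assumes users are online independently with an estimate read directly from availability histories. The function names, the replication-only setting (an object is available if at least one replica holder is online) and the traces are all hypothetical and chosen for illustration.

```python
from math import prod


def independent_estimate(histories):
    """Availability of a replicated object, ASSUMING nodes are independent.

    Each history is a list of 0/1 flags (one per time slot) for one node.
    """
    per_node = [sum(h) / len(h) for h in histories]
    # Probability that not all replica holders are offline at once.
    return 1.0 - prod(1.0 - a for a in per_node)


def history_estimate(histories):
    """Fraction of observed time slots with at least one replica holder online."""
    slots = list(zip(*histories))
    return sum(any(slot) for slot in slots) / len(slots)


# Hypothetical traces: three friends who tend to be online in the same slots.
friends = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]

print(independent_estimate(friends))  # 0.8125
print(history_estimate(friends))      # 0.5
```

With these made-up traces, the independence assumption reports 0.8125 while the histories show the object was reachable in only half of the slots. A redundancy level derived from the first figure would therefore be too low for such correlated users.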