Personal Cloud Measurement
We have created a script named measurement.py. It takes two arguments:
- Provider: -p or --provider, followed by one of [dropbox|box|sugarsync]
- Test: -t or --type, followed by one of [load_and_transfer|service_variability]
Depending on the test type, the script executes one of the following files:
- load_and_transfer: File load_and_transfer_test.py
- service_variability: File service_variability.py
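The internals of measurement.py are not shown on this page; the following is a minimal sketch of how such a dispatcher could look. The argument names and test script file names are taken from the lists above, while the use of argparse/subprocess and the way the provider name is forwarded to the test scripts are assumptions.
{{{#!python
# Minimal sketch of how measurement.py could dispatch to the test scripts.
# Argument names and script file names come from this page; argparse,
# subprocess and the provider forwarding are assumptions.
import argparse
import subprocess
import sys

TESTS = {
    "load_and_transfer": "load_and_transfer_test.py",
    "service_variability": "service_variability.py",
}

def main():
    parser = argparse.ArgumentParser(description="Personal Cloud measurement launcher")
    parser.add_argument("-p", "--provider", required=True,
                        choices=["dropbox", "box", "sugarsync"])
    parser.add_argument("-t", "--type", required=True,
                        choices=["load_and_transfer", "service_variability"])
    args = parser.parse_args()

    # Run the script that implements the selected workload, passing the
    # provider name through.
    subprocess.call([sys.executable, TESTS[args.type], "--provider", args.provider])

if __name__ == "__main__":
    main()
}}}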
Load and Transfer Workload
The objective of this workload was twofold: measuring the maximum upload/download transfer speed of operations, and detecting correlations between the transfer speed and the load of an account. Intuitively, the first objective was achieved by alternating upload and download operations, since the provider only needed to handle one operation per account at a time. We achieved the second objective by acquiring information about the load of an account in each API call. The workload was executed continuously at each node as follows: first, the node created synthetic files of a size chosen at random from a predefined set of sizes (detailed in the Implementation below). The node then uploaded files until the capacity of the account was full. At this point, the node downloaded all the files, also in random order. After each download, the file was deleted.
Implementation
First of all, the script creates 4 files of different sizes (25, 50, 100 and 150 MB). Once done, it starts uploading randomly chosen files until the account is full and the provider returns an error. When this happens, the script starts downloading random files from the account, removing each one as soon as its download has finished.
This test runs for approximately 5 days.
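For illustration, a simplified sketch of this upload-until-full / download-and-delete cycle is shown below. The client object, its upload/download/delete/list_files methods and the QuotaExceededError are hypothetical stand-ins for the provider-specific API wrappers, not the actual code of load_and_transfer_test.py.
{{{#!python
# Simplified sketch of the load_and_transfer cycle. The 'client' object,
# its methods and QuotaExceededError are hypothetical stand-ins for the
# provider-specific API wrappers used by load_and_transfer_test.py.
import os
import random

FILE_SIZES_MB = [25, 50, 100, 150]

class QuotaExceededError(Exception):
    """Assumed to be raised by the client when the account quota is full."""

def create_synthetic_files(directory):
    """Create one random-content file for each size in FILE_SIZES_MB."""
    paths = []
    for size in FILE_SIZES_MB:
        path = os.path.join(directory, "synthetic_%dMB.dat" % size)
        with open(path, "wb") as f:
            f.write(os.urandom(size * 1024 * 1024))
        paths.append(path)
    return paths

def load_and_transfer_cycle(client, directory):
    paths = create_synthetic_files(directory)

    # Upload phase: upload randomly chosen files until the provider
    # reports that the account is full.
    while True:
        try:
            client.upload(random.choice(paths))
        except QuotaExceededError:
            break

    # Download phase: download the stored files in random order,
    # deleting each one right after its download finishes.
    remote_files = client.list_files()
    random.shuffle(remote_files)
    for remote_file in remote_files:
        client.download(remote_file)
        client.delete(remote_file)
}}}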
Service Variability Workload
This workload maintained a nearly continuous upload and download transfer flow at every node in order to analyze the performance variability of the service over time, providing an appropriate substrate for a time-series analysis of these services. The procedure was as follows: the upload process first created files corresponding to each defined file size, which were labeled as "reserved" since they were never deleted from the account. This guaranteed that the download process was never interrupted, because at least the reserved files were always available to be downloaded. Then, the upload process started uploading synthetic random files until the account was full. When the account was full, this process deleted all files except the reserved ones and continued uploading files. In parallel, the download process continuously downloaded random files stored in the account.
Implementation
The script creates 4 files as in the previous test. When the files are ready, it uploads a 50 MB file named reserved.dat, which remains in the account until the test ends. As soon as this file is completely uploaded, the script creates two threads, one for downloading and one for uploading.
- The upload thread continuously uploads files of sizes [25, 50, 100, 150] MB until the account is full. Once it is full, the thread removes all files except "reserved.dat". Then, it starts its cycle again.
- The download thread continuously lists all files in the account and downloads one chosen at random. There will always be at least one file ("reserved.dat") in the account. A sketch of both threads follows this list.
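A minimal sketch of the two threads is shown below, under the same assumptions about the hypothetical client interface and QuotaExceededError as in the previous sketch.
{{{#!python
# Minimal sketch of the two service_variability threads. The client
# interface (upload/download/delete/list_files) and QuotaExceededError
# are hypothetical, as in the previous sketch.
import random
import threading

RESERVED_FILE = "reserved.dat"

class QuotaExceededError(Exception):
    """Assumed to be raised by the client when the account quota is full."""

def upload_loop(client, local_paths, stop_event):
    while not stop_event.is_set():
        try:
            client.upload(random.choice(local_paths))
        except QuotaExceededError:
            # Account full: delete everything except reserved.dat so the
            # upload cycle can start again.
            for remote_file in client.list_files():
                if remote_file != RESERVED_FILE:
                    client.delete(remote_file)

def download_loop(client, stop_event):
    while not stop_event.is_set():
        # At least reserved.dat is always present, so the account is
        # never empty and this loop is never interrupted.
        remote_files = client.list_files()
        client.download(random.choice(remote_files))

def run_service_variability(client, local_paths, duration_seconds):
    stop_event = threading.Event()
    threads = [
        threading.Thread(target=upload_loop, args=(client, local_paths, stop_event)),
        threading.Thread(target=download_loop, args=(client, stop_event)),
    ]
    for t in threads:
        t.start()
    stop_event.wait(duration_seconds)  # let the workload run for the given time
    stop_event.set()
    for t in threads:
        t.join()
}}}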
Deployment
Finally, we executed the experiments in different ways depending on the chosen platform. In the case of PlanetLab, we employed the same machines in each test and therefore needed to execute all the combinations of workloads and providers sequentially. This minimized the impact of hardware and network heterogeneity, since all the experiments were executed under the same conditions. In contrast, in our labs we executed a given workload for all providers in parallel (i.e. assigning 10 machines per provider). This provided two main advantages: the measurement process was substantially faster, and a fair comparison of the three services over the same period of time was possible.
Traces
We will make the traces from our experiments publicly available upon the publication of an article which is currently under submission.