Managing applications on the cloud requires extensive decision making on the part of the Application Provider (AP). When an application faces a changing workload, its services are scaled up or down in response. The services run on virtual machines (VMs) or container instances. APs decide how the application scales by provisioning VMs and placing the services on them.
Various drivers guide this decision making, and application performance and cost are two of them. This thesis addresses the question of how APs can meet the performance constraints of their applications while minimizing the cost of the running VMs. Two versions of the problem are presented. The first version takes a deployment configuration as given and aims to meet mean response time constraints by replicating VMs and adding virtual processors. The presented solution is based on layered bottlenecks. A case study shows that the solution meets the response time constraints and uses fewer resources than a simple utilization-based approach.
The second version adds the minimization of cost as an objective, using VM types with different cost rates. This version does not require a deployment configuration and provides a complete solution in which resources can be both added and removed. A novel solution that combines the layered bottleneck strength value with a genetic algorithm is presented. For the case study, a decision maker is implemented for a web application. The proposed solution is compared with three other algorithms (exhaustive search, a plain genetic algorithm, and random search), all of which run within the decision maker. The results show that the proposed solution has a shorter runtime than exhaustive search and meets the response time constraints at near-optimal cost. It also achieves lower cost than the plain genetic algorithm and random search, at the expense of a slightly longer runtime.