Abstract
With clouds becoming a standard target for deploying applications, it is more important than ever to be able to seamlessly utilise resources and services from multiple providers. Proprietary vendor APIs make this challenging and lead to conditional code being written to accommodate various API differences, requiring application authors to deal with these complexities and to test their applications against each supported cloud. In this paper, we describe an open source Python library called CloudBridge that provides a simple, uniform, and extensible API for multiple clouds. The library defines a standard ‘contract’ that all supported providers must implement, and an extensive suite of conformance tests to ensure that any exposed behavior is uniform across cloud providers, thus allowing applications to confidently utilise any of the supported clouds without any cloud-specific code or testing.
Categories and Subject Descriptors: Computer systems organization, Architectures, Distributed architectures, Cloud computing
Keywords: Cloud Computing, Cross-cloud, Interoperability, Portability
1. INTRODUCTION
Rapid adoption and deployment of cloud infrastructure, whether it be for new applications, microservices, or migration of legacy applications into the cloud [1], calls for treatment of infrastructure as code [2]. Furthermore, with an increased reliance on cloud providers as application deployment platforms, it is becoming even more important to be able to deploy increasingly complex application stacks in a cross-cloud compatible manner [3]. With the core conceptual underpinnings of major cloud providers rapidly converging in terms of functionality and scope (e.g., virtual machines, firewall rules, block storage and object storage among others), cross-cloud compatibility would appear to be simple at first glance. However, writing such applications is often quite difficult in practice, as each vendor offers proprietary APIs with various differences.
To support cross-cloud compatibility at an API level, there are two broad approaches: wrappers and adapters [4]. Wrappers hide the actual vendor API calls and expose a single external API, translating the wrapper’s API to the vendor APIs (e.g., Libcloud and jClouds). Adapters may expose several popular APIs, allowing the end-user to utilise a familiar API of choice, which the adapter will relay to the vendor’s API. For example, Amazon’s APIs are supported in systems such as OpenNebula [5] and OpenStack.
While adapted APIs provide upfront relief to the application developer by being able to utilize a familiar SDK for multiple cloud providers (e.g., boto), in the long run, the subtle differences across the APIs call for special-casing the application code or limit the features of the application on select clouds. Some specific examples of this include the inability to name instances on OpenStack using the EC2 APIs and difficulties in correlating EC2 IDs with corresponding OpenStack IDs.
Many of the wrapped APIs that are currently available have similar issues, resulting in abstractions which often require complex special-cased code to deal with various differences between clouds. This is often because the abstractions adopt a lowest-common-denominator approach for wider compatibility. In practice, however, this poses a significant hurdle for application developers, as it is non-trivial to test an application for compatibility across a wide range of clouds or, indeed, even to obtain access to a wide range of clouds. This leads to reduced application portability and incurs temporal, financial and complexity costs.
In this paper, we present a new Python library called CloudBridge. CloudBridge follows a fixed set of principles to provide a minimal, uniform, and extensible programmatic interface to multiple cloud providers. It is minimal because the library forwards all calls to existing provider SDKs, forming a thin, simple layer with a consistent interface. It is uniform because it offers a consistent set of methods across all services, and all implementations are tested using the same test suite, obviating the need for applications to be individually tested against each implementation. It is extensible because the interface is well defined, and new provider implementations can be dynamically plugged in. The library focuses on the core infrastructure-as-a-service cloud features that are likely to exist, or can be simulated, for multiple providers. At the moment, support for compute and object storage services on Amazon Web Services and OpenStack clouds is available while support for the Google Cloud is under development.
The library is documented for both users and contributors, paving the path for community contributions to add support for additional clouds. Usage documentation is available at https://cloudbridge.readthedocs.org/. The library is released under the MIT open source license and the code is available at https://github.com/gvlproject/cloudbridge.
2. ARCHITECTURE
We begin our description of CloudBridge by discussing our design philosophy. We then describe the conceptual layout of the library, followed by implementation details.
2.1. Design Philosophy
CloudBridge aims to provide a uniform and extensible core for manipulating infrastructure on multiple cloud providers. Since cloud vendors differ in their service model and technical realization [6] (partially due to their business models), a library such as this is forced to choose a set of principles that drive its development and accommodate for the technical differences of the underlying cloud environments. Although the cloud computing space is highly diverse and rapidly evolving, we believe that the design philosophy as presented transcends the surface differences among providers and is able to cater to a wide range of vendors and applications.
First, the core library offers a uniform API irrespective of the underlying provider. This means that no special-casing of the application code is required, implying that the application can be deployed to any supported provider without modification. Second, the library provides a set of conformance tests for all supported clouds. With high test coverage, the application developer is in a position to reliably deploy their application without having to individually test against each supported cloud. From the library development standpoint, the extensive test coverage allows sweeping code changes and new features to be added with increased confidence that no existing functionality is affected [7]. Third, it focusses on mature clouds with a required minimal set of features, as opposed to following a lowest-common-denominator approach. The availability and implementation details of these services may differ across the providers but basic services such as the compute service, block storage service and a firewall service, among others, should exist for a cloud provider to be added. The fourth principle is that CloudBridge, as an additional layer of indirection, needs to be as thin as possible. This is realized by wrapping the cloud providers’ native SDKs, which leads to reduced development time, improved reliability and quicker iterations when adding new features.
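The second principle can be illustrated with a minimal sketch in which a single check is written once and run unchanged against every provider implementation. The classes `FakeAWSProvider` and `FakeOpenStackProvider`, the in-memory key pair service, and the helper names are hypothetical stand-ins invented for this illustration; they are not part of CloudBridge or its test suite.

```python
class InMemoryKeyPairService:
    """Hypothetical key pair service backed by a dict instead of a cloud."""

    def __init__(self):
        self._pairs = {}

    def create(self, name):
        self._pairs[name] = {"name": name}
        return self._pairs[name]

    def list(self):
        return list(self._pairs.values())

    def delete(self, name):
        self._pairs.pop(name, None)


class FakeAWSProvider:
    """Hypothetical provider; a real one would wrap a native SDK."""

    def __init__(self):
        self.key_pairs = InMemoryKeyPairService()


class FakeOpenStackProvider:
    """Second hypothetical provider exposing the same contract."""

    def __init__(self):
        self.key_pairs = InMemoryKeyPairService()


def check_key_pair_contract(provider):
    """One conformance check, written once and applied to every provider."""
    kp = provider.key_pairs.create("conformance-test")
    assert kp["name"] == "conformance-test"
    assert kp in provider.key_pairs.list()
    provider.key_pairs.delete("conformance-test")
    assert provider.key_pairs.list() == []


def run_conformance(provider_classes):
    """Run the same check against each provider and record the outcome."""
    results = {}
    for cls in provider_classes:
        try:
            check_key_pair_contract(cls())
            results[cls.__name__] = "pass"
        except AssertionError:
            results[cls.__name__] = "fail"
    return results
```

Because the check only touches the shared contract, adding a provider to the list is enough to subject it to the same expectations as the existing ones.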
This set of design principles implies that certain tradeoffs may be necessary in cases where service availability across clouds is substantially different. There are two general paths that can be undertaken in this case: (1) a higher-level service is simulated within the library on behalf of a cloud provider to adhere to the first design principle or (2) the service is decomposed into its most basic parts and the higher-level service functionality is left up to the application developer. Due to the potentially conflicting provider constraints and requirements, it is impossible to set a library-wide policy, and the choice must be made on a case-by-case basis.
2.2. API Overview and Interfaces
The CloudBridge library API revolves around three concepts: (1) providers; (2) services; and (3) resources. The provider encapsulates connection properties for a given cloud provider and manages the required connection. Services expose the provider functionality, offering the ability to create, query and manipulate resources. Resources represent a remote cloud resource, such as an individual machine instance. Figure 1 captures the relationships between these three concepts.
Figure 1.
A class diagram of CloudBridge concepts.
Driven by the first design principle (‘be uniform’), CloudBridge offers a consistent set of methods across all the available services (see Table 1). The available methods enable basic activities (List, Get, Find, and Create) to be performed on resources. This level of uniformity across the available services enhances developer experience as common functionality is always present. Similarly, individual resources offer a common set of basic properties (e.g., id, name). In addition, each resource is equipped with a set of fields and methods appropriate for that resource type.
Table 1.
A set of services currently supported in CloudBridge and the methods available in each service.
Service name | list | get | find | create |
---|---|---|---|---|
InstanceService | yes | yes | yes | yes |
VolumeService | yes | yes | yes | yes |
SnapshotService | yes | yes | yes | yes |
ImageService | yes | yes | yes | no |
NetworkService | yes | yes | yes | yes |
SubnetService | yes | yes | no | yes |
ObjectStoreService | yes | yes | yes | yes |
KeyPairService | yes | yes | yes | yes |
SecurityGroupService | yes | yes | yes | yes |
InstanceTypeService | yes | yes | yes | N/A |
RegionService | yes | yes | no | N/A |
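The uniform method set in Table 1 can be pictured as an abstract contract that each service implements. The following is a minimal sketch under stated assumptions: `UniformService` and `InMemoryVolumeService` are hypothetical names, resources are plain dictionaries, and the signatures are illustrative rather than the library's actual interface.

```python
from abc import ABC, abstractmethod


class UniformService(ABC):
    """Hypothetical contract: every service exposes the same core methods."""

    @abstractmethod
    def list(self):
        """Return all resources managed by this service."""

    @abstractmethod
    def get(self, resource_id):
        """Return the resource with the given id, or None."""

    @abstractmethod
    def find(self, name):
        """Return all resources whose name matches."""

    @abstractmethod
    def create(self, name):
        """Create and return a new resource."""


class InMemoryVolumeService(UniformService):
    """Illustrative implementation that stores resources in a dict."""

    def __init__(self):
        self._volumes = {}

    def list(self):
        return list(self._volumes.values())

    def get(self, resource_id):
        return self._volumes.get(resource_id)

    def find(self, name):
        return [v for v in self._volumes.values() if v["name"] == name]

    def create(self, name):
        resource_id = "vol-{}".format(len(self._volumes) + 1)
        self._volumes[resource_id] = {"id": resource_id, "name": name}
        return self._volumes[resource_id]
```

A service that lacks one of the methods in Table 1 (for example, an image service without create) would need a narrower contract; that variation is omitted here for brevity.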
2.3. Implementation
We have implemented CloudBridge in about 5,000 lines of Python, with the core code and conformance tests consisting of 3,000 lines, and provider implementations taking up 2,000. The library can be used with Python 2.7, 3.4 and 3.5 on Linux, OS X, and Windows. There is comprehensive documentation covering both usage and development, including an introductory tutorial and a contributor’s guide.
Structurally, the library implements the Bridge pattern [8] for relaying the providers’ native SDKs to the defined library interface. The adopted pattern allows the library interface to evolve independently of the underlying libraries and is supportive of adding new providers in a low-impact fashion (see Figure 2). Specifically, one needs to provide an implementation for the interface to translate the native SDK API calls into the ones defined by the library interface; no library-wide modifications are necessary. Further, the Factory pattern [8] is implemented for instantiating the requested provider implementation.
Figure 2.
Bridge pattern as implemented in CloudBridge.
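The interface/implementation split and the factory described above can be sketched as follows. This is a simplified illustration, not the library's code: `FakeAWSSDK`, `FakeOpenStackSDK`, `ProviderInterface`, the two implementation classes, and `SketchProviderFactory` are all hypothetical names, and the fake SDKs merely return canned data in deliberately different shapes.

```python
from abc import ABC, abstractmethod


class FakeAWSSDK:
    """Stand-in for one native SDK, with its own call and result shape."""

    def describe_instances(self):
        return [{"InstanceId": "i-1", "Tag": "web"}]


class FakeOpenStackSDK:
    """Stand-in for another native SDK with a different call and shape."""

    def servers(self):
        return [("uuid-1", "web")]


class ProviderInterface(ABC):
    """Library-side abstraction that application code depends on."""

    @abstractmethod
    def list_instances(self):
        """Return a list of {'id': ..., 'name': ...} dictionaries."""


class AWSImplementation(ProviderInterface):
    def __init__(self, config):
        self._sdk = FakeAWSSDK()

    def list_instances(self):
        # Translate the SDK-specific result into the library's format.
        return [{"id": i["InstanceId"], "name": i["Tag"]}
                for i in self._sdk.describe_instances()]


class OpenStackImplementation(ProviderInterface):
    def __init__(self, config):
        self._sdk = FakeOpenStackSDK()

    def list_instances(self):
        # Same output format, different underlying SDK call.
        return [{"id": uid, "name": name}
                for uid, name in self._sdk.servers()]


class SketchProviderFactory:
    """Factory: maps a provider name to its implementation class."""

    _registry = {"aws": AWSImplementation,
                 "openstack": OpenStackImplementation}

    def create_provider(self, name, config):
        if name not in self._registry:
            raise ValueError("Unknown provider: {}".format(name))
        return self._registry[name](config)
```

In this arrangement, supporting a new provider means adding one implementation class and one registry entry, while the interface and the application code stay untouched.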
Driven by the fourth design principle (‘be a thin layer’), the library heavily relies on third party SDKs and libraries. This greatly simplifies the implementation aspects but also introduces a significant risk with dependency management. As the cloud computing space evolves, so will the SDKs’ functionality. To combat this challenge, we have adopted a model where the development branch of the library tracks the latest dependent SDK versions and any changes introduced by a dependency are dealt with during the development cycle. Just prior to a CloudBridge version release, the current dependent library versions are fixed in the library setup. Hence, at any point in the future, the specific versions whose operation has been verified will get installed and the library, as well as any higher-level applications, will continue to be operational.
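The release-time pinning step described above might be sketched as below. The helper names and the package names in the usage note are hypothetical, and the sketch assumes the verified versions are available as a simple mapping; it does not reproduce the library's actual setup script.

```python
def development_requirements(package_names):
    """During development, leave packages unpinned to track latest releases."""
    return sorted(package_names)


def release_requirements(verified_versions):
    """At release time, pin each package to the version that was verified.

    `verified_versions` maps a package name to the version string that the
    test suite was last run against (an assumption of this sketch).
    """
    return ["{}=={}".format(name, version)
            for name, version in sorted(verified_versions.items())]
```

For example, `release_requirements({"example-sdk-b": "2.0.1", "example-sdk-a": "1.4"})` yields `["example-sdk-a==1.4", "example-sdk-b==2.0.1"]`, a list that could be passed to a packaging tool as the set of install requirements.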
Before services and resources for a provider can be implemented, it is necessary to define the provider connection properties. This is achieved by supplying an implementation of the CloudProvider abstract class, which initializes the underlying SDK connection with the required authentication tokens. Typically, the implementation will lazily establish several connection objects for various provider services (e.g., compute, network, object store). Next, each provider implementation implements all the interface services and resources. Table 1 captures the core methods for the available services as defined in the interface. As described in the previous section, resources implement a core set of methods and properties as well as additional properties applicable to the given resource. Figure 3 provides an example of the Instance class with its attributes and methods. The implementation for each attribute or method uses the native provider SDK to obtain the required information or perform the action defined by the library interface. The implementation manipulates the obtained information to return the results in a format that matches the CloudBridge definition.
Figure 3.
Interface definition for the Instance class.
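The lazy establishment of connection objects mentioned above can be illustrated with a short sketch. The class `LazyProvider`, its property names, and the `connect` callable are assumptions made for this example; a real provider implementation would call its native SDK where the callable is invoked.

```python
class LazyProvider:
    """Hypothetical provider that opens each connection on first use."""

    def __init__(self, config, connect):
        self._config = config
        # connect(service_name, config) -> connection object (assumed shape)
        self._connect = connect
        self._connections = {}

    def _get_connection(self, service_name):
        # Create the connection only once, the first time it is requested.
        if service_name not in self._connections:
            self._connections[service_name] = self._connect(
                service_name, self._config)
        return self._connections[service_name]

    @property
    def compute_conn(self):
        return self._get_connection("compute")

    @property
    def object_store_conn(self):
        return self._get_connection("object_store")
```

With this structure, an application that only uses the compute service never pays the cost of authenticating against the object store.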
For application developers, CloudBridge provides support for some additional features. As managing lists of resources is a common operation, CloudBridge resources can be iterated on (e.g., for instance in provider.compute.instances). This provides for more compact application code that is idiomatic Python. Similarly, consuming large lists of resource objects can quickly become a performance and usability problem. To alleviate this issue, CloudBridge provides support for paging object lists. Note that certain provider and resource combinations provide support for this natively via their SDK, which is then simply relayed via CloudBridge. In cases where this is not supported, CloudBridge provides a client-side implementation by caching a full list of resources. The CloudBridge implementation also provides access to native provider SDK objects. Although this breaks the uniformity of the API and requires special-casing application source code, it offers an avenue for using provider functionality that is not available in CloudBridge. This feature is not advocated by the authors but may be an appropriate solution in certain circumstances.
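The client-side paging fallback described above, where a full list is cached and then served in pages, could look roughly like the following. `ClientPagedList` and its methods are hypothetical names chosen for this sketch and do not mirror the library's paging classes.

```python
class ClientPagedList:
    """Serve pages from a cached full listing when the SDK cannot page."""

    def __init__(self, fetch_all, page_size):
        if page_size < 1:
            raise ValueError("page_size must be at least 1")
        self._fetch_all = fetch_all  # callable returning every resource
        self._page_size = page_size
        self._cache = None

    def page(self, index):
        """Return the page at the given zero-based index (may be empty)."""
        if self._cache is None:
            # Fetch the full list once and keep it for later pages.
            self._cache = list(self._fetch_all())
        start = index * self._page_size
        return self._cache[start:start + self._page_size]

    def __iter__(self):
        # Iterate over all resources, one page at a time.
        index = 0
        while True:
            items = self.page(index)
            if not items:
                return
            for item in items:
                yield item
            index += 1
```

The trade-off in this approach is that the first page costs as much as a full listing; only the presentation to the caller is paged.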
3. USAGE
We demonstrate usage of CloudBridge via a snippet of code to launch a virtual machine on a select cloud. As can be seen in Table 2, after the initial imports (lines 1–2), the first step is to set up a provider-specific object by supplying credentials and cloud-specific parameters (lines 3–10). Before an instance can be launched, we need to create a key pair and a security group (lines 11–13). Each time a resource is created or a reference to an existing one fetched, CloudBridge wraps the resource in an appropriate CloudBridge resource object. Once the preliminary requirements for launching an instance have been satisfied, we launch an instance by supplying desired parameters (lines 14–16). After the instance has launched, we can inspect its properties (lines 17–23) and ultimately terminate it (line 24). As visible from this example, all the code after this initial setup is equivalent irrespective of the provider. This is true for any service or resource available within CloudBridge and is the enabler of cloud-agnostic applications.
Table 2.
A snippet of code used to launch an instance using CloudBridge, shown for two different clouds.
Line | AWS | OpenStack (NeCTAR) |
---|---|---|
1. | from cloudbridge.cloud.factory import CloudProviderFactory | |
2. | from cloudbridge.cloud.factory import ProviderList | |
3. | config = {'aws_access_key': 'A_KEY', | config = {'os_username': 'username', |
4. | 'aws_secret_key': 'S_KEY'} | 'os_password': 'password', |
5. | | 'os_tenant_name': 'tenant name', |
6. | | 'os_auth_url': 'authentication URL', |
7. | | 'os_region_name': 'region name'} |
8. | | |
9. | provider = CloudProviderFactory().create_provider(ProviderList.AWS, config) | provider = CloudProviderFactory().create_provider(ProviderList.OPENSTACK, config) |
10. | image_id = 'ami-d85e75b0' | image_id = 'c1f4b7bc-a563-4feb-b439-a2e071d861aa' |
11. | kp = provider.security.key_pairs.create('cloudbridge_intro') | |
12. | sg = provider.security.security_groups.create('cloudbridge_intro', 'A security group used by CloudBridge') | |
13. | sg.add_rule('tcp', 22, 22, '0.0.0.0/0') | |
14. | img = provider.compute.images.get(image_id) | |
 | # Find the smallest instance with at least 2 vcpus and 4 GB of ram | |
15. | inst_type = sorted([t for t in provider.compute.instance_types.list() if t.vcpus >= 2 and t.ram >= 4], key=lambda x: x.vcpus*x.ram)[0] | |
16. | inst = provider.compute.instances.create(name='CloudBridge-intro', image=img, instance_type=inst_type, key_pair=kp, security_groups=[sg]) | |
17. | # Wait until ready | |
18. | inst.wait_till_ready() | |
19. | # Show instance state | |
20. | inst.state | |
21. | # 'running' | |
22. | inst.public_ips | |
23. | # [u'54.166.125.219'] | |
24. | inst.terminate() | |
4. DISCUSSION
As captured by the design philosophy item on uniformity, any of the available API methods are uniform across all the implemented providers allowing the application to, conceptually, be developed without special-cased code. However, cloud vendors may not provide all the underlying services. For example, the Australian national cloud NeCTAR (nectar.org.au/research-cloud) does not provide the volume storage service for all users.
Similarly, the US-based Chameleon cloud (chameleoncloud.org) does not provide an object storage service. In such scenarios, the application code must still account for the availability of a service before consuming it.
CloudBridge offers two methods for dealing with this scenario. First, the CloudProvider class implements a has_service method that allows the application to check for the existence of a service (e.g., provider.has_service(CloudServiceType.OBJECT_STORE)). Second, the application can create a hybrid provider object: for services that can be used independently, CloudBridge offers the ability to specify a per-service configuration object. Table 3 shows an example where all the services except the object store service (Swift) target Chameleon while the object store service targets the NeCTAR cloud. Unfortunately, this cannot be achieved for an arbitrary combination of services because, for example, a volume service must exist on the same cloud as the instance service.
Table 3.
Hybrid provider configuration sample.

    config = {'os_username': 'Chameleon username',
              'os_password': 'Chameleon password',
              'os_tenant_name': 'Chameleon tenant',
              'os_auth_url': 'Chameleon auth url',
              'os_region_name': 'Chameleon region',
              'os_swift_username': 'NeCTAR username',
              'os_swift_password': 'NeCTAR password',
              'os_swift_tenant_name': 'NeCTAR tenant name',
              'os_swift_auth_url': 'NeCTAR auth url',
              'os_swift_region_name': 'NeCTAR region name'}
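One way to picture how a per-service configuration such as the one in Table 3 might be resolved is sketched below. The function `resolve_service_config`, the `BASE_KEYS` tuple, and in particular the fallback rule (use the service-prefixed key if present, otherwise the general key) are assumptions inferred from the key names in the table; the library's actual lookup logic is not shown in this paper.

```python
BASE_KEYS = ("username", "password", "tenant_name", "auth_url", "region_name")


def resolve_service_config(config, service=None):
    """Return connection settings for a service from a flat config dict.

    For service="swift", a key such as 'os_swift_username' takes precedence
    over 'os_username' (assumed rule). With service=None, only the general
    'os_*' keys are used.
    """
    resolved = {}
    for key in BASE_KEYS:
        general = "os_{}".format(key)
        specific = "os_{}_{}".format(service, key) if service else None
        if specific is not None and specific in config:
            resolved[key] = config[specific]
        else:
            resolved[key] = config.get(general)
    return resolved
```

Under this assumed rule, the object store connection would be built from the NeCTAR values while every other service would fall back to the Chameleon values.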
In future, we plan to extend these capabilities to broadly determine a provider’s feature set so that application developers can adjust according to available features in a service (e.g., availability of block device mapping support in the ComputeService). However, this creates a tension between too fine a granularity of feature determination (which poses a greater burden on the developer to respond appropriately to a larger number of possible conditions), and too coarse a granularity, which would mandate that all clouds have all features at all times. In general, CloudBridge will err on the side of coarser granularity with clear guidelines on usage, to reduce the burden on application developers. The side-effect of this is that, by design, CloudBridge will be unable to support a broad gamut of providers with disparate capabilities. The upside of this design is that CloudBridge’s abstraction is simpler.
5. RELATED WORK
Conceptually, there are two models for interoperable service functionality: a common service API or an adapter library. A common API is captured by a standard that providers implement and allows applications to uniformly interact with multiple clouds using that same API. Cloud Infrastructure Management Interface (CIMI) [9] and the Open Cloud Computing Interface (OCCI) [10] are the two standards that exist today for low-level infrastructure management. Although conceptually desirable, in practice, adoption of these standards has been low [11] as vendors (particularly the major ones) have little incentive to conform to the standard, and conforming limits the amount of innovation and differentiation a vendor can supply. In the long run, as vendor services converge in terms of functionality and scope, standards-based implementations may become more prevalent. The alternative to a common service API is a library such as CloudBridge that provides standardization via code. Arguably, the most popular such library today is Apache Libcloud (libcloud.apache.org), which provides a common Python API for many popular cloud services.
While Libcloud and CloudBridge may appear to be solving the same problem at first glance, we believe that these two libraries target two different levels of abstraction and are therefore conceptually and functionally very different. A rough analogy would be that Libcloud is to CloudBridge what urllib is to requests - a detail-oriented vs. a higher-level abstraction. In principle, Libcloud could be used as an underlying provider library for CloudBridge, although in practice, we have chosen not to do so at present for reasons explained below. In more detail, the differences are as follows:
- Libcloud offers a lowest-common-denominator approach by targeting a large number of clouds with a minimum level of abstraction, and offering extra functionality as cloud-specific extensions. In practice, this requires that special-case code be written using cloud-specific “extended methods”, unless the application’s requirements are very modest;
- Libcloud wraps libraries at a REST/HTTP level, instead of using native SDKs. The positive aspects of this approach are that it minimizes the number of dependencies and gives complete control over the client-side implementation (e.g. switching to an asynchronous implementation). The negative aspects are that the library becomes slow to incorporate new features because they have to be implemented directly in the library vs. reusing native libraries, meaning that Libcloud will always be slightly behind the curve (native libraries are always more up-to-date since the cloud provider adds native library support with the release of a new service feature). Further, there is a significant duplication of work, often of a non-trivial nature (e.g. Boto has over 77,000 lines of code that largely has to be replicated in the 250,000 lines of Libcloud code);
- Libcloud’s testing is largely provider specific, since the exposed methods tend to be provider specific. This offers few guarantees to the developer of having a “write once, run anywhere” experience. Since cloud infrastructures are non-trivial to get access to, deploy and test on, this makes it extremely difficult to support a large variety of clouds in practice. In contrast, CloudBridge takes this burden upon itself, leaving the application developer free to test against a single implementation or conformant mock only;
- Libcloud supports a wider range of provider implementations. At present, Libcloud supports more than 30 cloud providers, exceeding CloudBridge by an order of magnitude. Over time, we will be adding support for additional providers to CloudBridge, with Libcloud itself being potentially used as an SDK;
- Libcloud does not offer a way to determine provider capabilities at runtime, unlike libraries such as jClouds. CloudBridge, at present, has limited support for this capability (e.g., via the has_service method).
While we did consider using Libcloud as an underlying provider for CloudBridge, we found that in the case of EC2 and OpenStack, the vendor-provided SDKs were naturally at a higher level of maturity and functionality as mentioned above, and were therefore more suitable choices. Nevertheless, Libcloud may be a more suitable choice for future provider additions to CloudBridge, and would need to be evaluated on a case-by-case basis.
Similar to CloudBridge and Libcloud, comparable libraries have been developed for other languages, for example jClouds for Java (jclouds.org), Fog for Ruby (fog.io), and pkgcloud for JavaScript and its Node.js framework (github.com/pkgcloud/pkgcloud).
Finally, a variation of the adapter model is an HTTP service that provides its own APIs that target multiple clouds. Projects such as Apache Deltacloud (deltacloud.apache.org) and Dasein Cloud (dasein.org) allow HTTP requests to be sent to a single API and a single service while the service internally translates the request to the native provider. While the benefit of using HTTP as a common interface is exciting due to its wide applicability, it does require the application developer to operate at a lower level of abstraction than using a native, language-specific library, losing idiomatic expression and convenience.
6. CONCLUSIONS AND FUTURE WORK
Although there are existing libraries that facilitate cross-cloud application deployment, they tend to lack uniformity across their APIs. This reduces application portability and slows down development because developers need to learn what methods are available for each individual provider, conditionally handle the differences, and test their applications against each cloud. In this paper, we described a new Python library that follows a fixed set of principles to produce a minimal, uniform and extensible interface for all supported clouds and thereby deliver an interoperable interface for them. By design, the library is focused on a set of cloud features that are common across mature cloud providers such as Amazon, OpenStack and Google and, in our experience, those features are sufficient to address the requirements of a wide range of applications. Therefore, we believe the simplicity and portability of CloudBridge will go a long way in promoting its adoption.
Looking forward, we are currently adding support for the Google Cloud provider and will be looking into adding support for additional cloud services, such as the Metadata service, support for Projects/Tenancies and the increasingly available Container-as-a-Service.
We are also considering extending CloudBridge with support for orchestrating and managing container technologies, such as Docker and LXC. We hope to integrate this with the cloud specific container services mentioned above, and provide a uniform abstraction for managing the lifecycle of containers, their storage connections and their deployment.
Another possible future enhancement for CloudBridge is to add support for aggregate clouds, through application of the Composite pattern [8]. By allowing a composite provider instance to be created out of two simple provider instances, it would be possible to have features such as transparent failover or cloud-bursting for example, using a user supplied policy. While there are several challenges in providing such support, it is something that we hope to explore in future.
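As a purely speculative illustration of the composite idea, the sketch below shows a provider that wraps an ordered list of member providers and fails over to the next one when a call fails. None of these classes exist in CloudBridge; `FailoverProvider`, `FailingProvider`, `WorkingProvider`, the `create_instance` method, and the "try in order" policy are all assumptions made for this example.

```python
class FailingProvider:
    """Hypothetical member provider whose launch request always fails."""

    def create_instance(self, name):
        raise RuntimeError("quota exceeded")


class WorkingProvider:
    """Hypothetical member provider that always succeeds."""

    def __init__(self, label):
        self.label = label

    def create_instance(self, name):
        return {"name": name, "cloud": self.label}


class FailoverProvider:
    """Composite: exposes the same method as its members and delegates."""

    def __init__(self, providers):
        self._providers = list(providers)

    def create_instance(self, name):
        errors = []
        # Policy assumed here: try members in order, return first success.
        for provider in self._providers:
            try:
                return provider.create_instance(name)
            except RuntimeError as exc:
                errors.append(str(exc))
        raise RuntimeError("all providers failed: " + "; ".join(errors))
```

Because the composite exposes the same method as a simple provider, application code would not need to know whether it is talking to one cloud or several; a user-supplied policy could replace the fixed ordering used here.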
ACKNOWLEDGMENTS
This project was supported in part through grant VLS402 from National eCollaboration Tools and Resources, grant eRIC07 from Australian National Data Service, grant number HG006620 from the National Human Genome Research Institute, and grant number CA184826 from the National Cancer Institute, National Institutes of Health.
REFERENCES
- [1]. Botta A, de Donato W, Persico V, and Pescapé A, “Integration of cloud computing and Internet of Things: A survey,” Futur. Gener. Comput. Syst, vol. 56, pp. 684–700, October 2015.
- [2]. Fitzgerald B, Forsgren N, Stol K-J, Humble J, and Doody B, “Infrastructure Is Software Too!,” SSRN Electron. J, October 2015.
- [3]. Petcu D, “Multi-Cloud: expectations and current approaches,” Proc. 2013 Int. Work. Multi-cloud Appl. Fed. clouds - MultiCloud ’13, pp. 1–6, 2013.
- [4]. Di Martino B, Cretella G, and Esposito A, Cloud Portability and Interoperability. Cham: Springer International Publishing, 2015.
- [5]. Moreno-Vozmediano R and Llorente IM, “IaaS Cloud Architecture: From Virtualized Datacenters to Federated Cloud Infrastructures,” Computer (Long. Beach. Calif), vol. 45, no. 12, pp. 65–72, December 2012.
- [6]. Prodan R and Ostermann S, “A survey and taxonomy of infrastructure as a service and web hosting cloud providers,” in 2009 10th IEEE/ACM International Conference on Grid Computing, 2009, pp. 17–25.
- [7]. Bertolino A, “Software Testing Research: Achievements, Challenges, Dreams,” in Future of Software Engineering (FOSE ’07), 2007, pp. 85–103.
- [8]. Gamma E, Helm R, Johnson R, and Vlissides J, “Design patterns: elements of reusable object-oriented software,” January 1995.
- [9]. Davis D and Pilz G, “Cloud Infrastructure Management Interface (CIMI) Model and REST Interface over HTTP,” DSP-0263, May 2012.
- [10]. Edmonds A, Metsch T, Papaspyrou A, and Richardson A, “Toward an Open Cloud Standard,” IEEE Internet Comput, vol. 16, no. 4, pp. 15–25, July 2012.
- [11]. Ortiz SJ, “The Problem with Cloud-Computing Standardization,” Computer (Long. Beach. Calif), vol. 44, no. 7, pp. 13–16, 2011.