Ceph and OpenStack join forces
Dream Team
When building cloud environments, you need more than just a scalable infrastructure; you also need a high-performance storage component. We look at Ceph, a distributed object store and filesystem that pairs well in the cloud with OpenStack.
However overused "cloud" has become as a buzzword, the overuse has done little to dent the cloud's popularity, especially in the enterprise: OpenNebula, openQRM, and Eucalyptus are all examples of enterprise clouds. However, no solution has enjoyed the kind of attention received by OpenStack, which resulted from a collaboration between the US space agency NASA and the US hosting service provider Rackspace.
In the past two years, the project has evolved from the underdog to the standard; OpenStack conferences attract far more visitors than many traditional community events. Even the leaders of the OpenStack Foundation [1] are surprised by the success, and companies now evaluate cloud solutions with a preference for OpenStack (see the box "OpenStack in the Enterprise").
OpenStack in the Enterprise
The German IT service provider Teuto.NET uses Ceph in combination with OpenStack in its own data center for some customers. Burkhard Noltensmeier, the Managing Director of the company, is looking to offer his customers this product as a standalone service hosted on Ostack.de.
As Noltensmeier told Linux Magazine when we enquired, the company evaluated other solutions up front. VMware's vCloud [2] was out of the question for the open source-centric company; the IT department did look at Google Ganeti in conjunction with DRBD, however. Although this combination works quite well, it lacks the ability to manage system images and network resources automatically.
A team made up of Ceph and OpenStack, however, offers a better API for "provisioning the infrastructure and in particular networking," Noltensmeier continues. It is highly available, easy to expand, and integrates well with third-party storage and network solutions, he said.
Besides Ceph, the potential storage candidates include iSCSI, Filer by NetApp, and EqualLogic by Dell; the possible hypervisors include Hyper-V, Xen, VMware, and others. OpenStack's storage component Swift was ruled out, however, because Ceph offers the ability to integrate block and object storage via a common infrastructure, and Ceph clusters were also "relatively easy to use" and robust, Noltensmeier said.
However, Noltensmeier also sees some room for improvement: For example, he would prefer better caching. Setting up a RAID on the cluster computers (erasure coding) and disaster recovery at pool level are not yet available in the stable release. That said, the speed at which Ceph is developing is impressive.
The Touchy Topic of Storage
When admins plan a cloud setup, they must deal with the topic of storage. Traditional storage systems are not sufficient in cloud computing because of two factors: scalability and automation. Automation is the lesser problem, because an admin can include classic storage in automated processes. Scalability is more difficult to manage: Once a SAN is in the rack, any potential expansion is often expensive, if it is feasible at all.
One aggravating factor for a public cloud provider is that you can hardly predict the required disk space. On the basis of recent demand, you might be able to calculate the growth of disk space, but that is little help if a new customer appears out of the blue requiring 5TB of space for its image hosting service. As with computing components, the provider must be able to expand the cloud quickly, and this is where the open source and free Ceph software enters the game.
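The planning problem can be made concrete with a minimal Python sketch, assuming a simple linear extrapolation of past usage. The monthly figures, the cluster capacity, and the function name project_usage_tb are hypothetical and chosen only for illustration; only the unexpected 5TB request comes from the scenario above.

```python
def project_usage_tb(history_tb, months_ahead):
    """Linearly extrapolate disk usage from the average monthly growth."""
    if len(history_tb) < 2:
        raise ValueError("need at least two data points")
    # Average growth per month over the observed history
    avg_growth = (history_tb[-1] - history_tb[0]) / (len(history_tb) - 1)
    return history_tb[-1] + avg_growth * months_ahead


# Hypothetical figures: used capacity at the end of each of the last four months
history_tb = [30.0, 31.0, 32.0, 33.0]
capacity_tb = 40.0  # hypothetical usable capacity of the existing storage

forecast_tb = project_usage_tb(history_tb, months_ahead=3)
with_new_customer_tb = forecast_tb + 5.0  # the surprise 5TB customer

print("forecast fits:", forecast_tb <= capacity_tb)
print("forecast plus new customer fits:", with_new_customer_tb <= capacity_tb)
```

With these assumed numbers, the trend-based forecast stays within capacity, but a single unplanned customer pushes demand past it, which is why the storage layer itself has to be expandable at short notice.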
Ceph Rules
The source code of the distributed filesystem Ceph [3] is largely covered by the LGPL and BSD licenses. The software is currently considered a hot product in the storage scene because it unites precisely the capabilities that cloud storage demands: scalability and automation.
Thanks to these features, Inktank, the provider of the storage solution, has been able to compete for market share with long-established companies like EMC and NetApp.
An existing Ceph cluster can be easily extended by adding disks; only commodity hardware is required. Ordinary SATA disks are sufficient, and they strain the project budget considerably less than (usually much smaller) SAS disks would.
Opportunely, Inktank has promoted the seamless integration of OpenStack and Ceph in recent months and, in a sense, ensures out of the box that OpenStack optimally leverages the capabilities of Ceph. To determine where Ceph fits best in OpenStack, it makes sense to look at the kind of data the OpenStack components process and how they deal with storage.
Who Stores What?
In OpenStack services, two types of data basically occur: metadata and the actual payload. Technically, the metadata includes the cloud-specific configuration, which is usually handled by a separate database, such as MySQL. Most of the services deployed for virtualization purposes in OpenStack exclusively store metadata in the long term; this includes Keystone (identity service), Neutron (Network as a Service), Heat (Orchestration), and Ceilometer (monitoring).
The OpenStack dashboard, a.k.a. Horizon, does not create any data, neither metadata nor user data. The compute service Nova is a special case: The component takes on the job of managing virtual machines. In addition to persistent metadata, it creates temporary images of virtual machines. The images of VMs in OpenStack are volatile, virtually by definition: In technical jargon, this is known as "ephemeral storage." If Nova wants to store a VM permanently, it needs to rely on an additional service.
This just leaves two services, Glance and Cinder, which create payload data in addition to their own metadata. Glance is the image service. Its purpose in life is to offer redundant image files from which new VMs can be booted. Cinder is the block storage service; it steps in to help Nova when the VMs require persistent storage. The task is thus to team Ceph with these two services to achieve optimal OpenStack integration.
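The classification described above can be summarized in a short Python sketch. It is only an illustrative model of the text, not part of any OpenStack API: the Service class, the payload labels, and the function ceph_candidates are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str
    role: str
    stores_metadata: bool
    payload: str  # "none", "ephemeral", or "persistent" (labels assumed here)


# Services and roles as described in the article
SERVICES = [
    Service("Keystone", "identity service", True, "none"),
    Service("Neutron", "Network as a Service", True, "none"),
    Service("Heat", "orchestration", True, "none"),
    Service("Ceilometer", "monitoring", True, "none"),
    Service("Horizon", "dashboard", False, "none"),
    Service("Nova", "compute", True, "ephemeral"),
    Service("Glance", "image service", True, "persistent"),
    Service("Cinder", "block storage", True, "persistent"),
]


def ceph_candidates(services):
    """Return the services that create persistent payload data."""
    return [s.name for s in services if s.payload == "persistent"]


print(ceph_candidates(SERVICES))  # ['Glance', 'Cinder']
```

Filtering on persistent payload data leaves exactly the two services named in the text, Glance and Cinder, as the integration points for Ceph.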