#OpenStack likes #Ceph: You will love the way it works

Ceph scales out, Ceph is highly redundant, Ceph is flexible, Ceph is full of cool features. Ceph and #OpenStack are close friends.

We love you, Ceph!

Firefly is the latest Ceph release and a beautiful starting point for us to develop what we have named “the first storage solution fully supported by KIO”. As I mentioned in my previous notes on the current reality of “Software Defined Storage”, commodity hardware is helping companies like ours to design and support our own storage technology. These new trends are backed by powerful innovators like EMC ScaleIO (or ViPR), Nutanix, Pivot3 (who dare to call themselves the “inventors of Software Defined Storage”), VMware Virtual SAN, SimpliVity and more. By the way, yesterday I met a highly trustworthy source who told me there is a chance that Nutanix will decouple its software from its hardware to gain more flexibility, and that it could happen very soon.

Back to Ceph: we are really committed to using it as our reference technology for OpenStack storage. Why? Several reasons:

  • The most powerful reason: “Ceph is open source and freely-available, and it always will be”
  • Much faster: because Ceph is freely available, we can afford to store all data blocks on Solid State Disks only, at reasonable market prices. That removes the need for storage tiering and greatly simplifies managing applications with different performance needs. We are also adding flash PCI cards for caching, to make your applications fly like never before!
  • Flexibility: Ceph supports a true scale-out operating model, letting you add more boxes/servers to increase capacity and performance without disrupting access to data.
  • Highly redundant: Ceph keeps multiple copies of the stored data (typically three) spread among the servers.
  • A solid integration with OpenStack’s Cinder & Swift
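To make the scale-out and redundancy points concrete, here is a rough back-of-the-envelope sketch in Python. It assumes a purely replicated pool and ignores filesystem overhead and the free-space headroom a real cluster needs; the server counts and disk sizes are hypothetical, not KIO's actual configuration.

```python
def usable_capacity_tb(servers: int, disks_per_server: int,
                       disk_tb: float, replicas: int = 3) -> float:
    """Rough usable capacity of a replicated pool.

    Every block is stored `replicas` times, so usable space is the
    raw capacity divided by the number of copies.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    raw_tb = servers * disks_per_server * disk_tb
    return raw_tb / replicas


if __name__ == "__main__":
    # Hypothetical cluster: 4 servers, 10 SSDs each, 1 TB per SSD.
    before = usable_capacity_tb(servers=4, disks_per_server=10, disk_tb=1.0)
    # Scale out by adding two more identical servers.
    after = usable_capacity_tb(servers=6, disks_per_server=10, disk_tb=1.0)
    print(f"usable before: {before:.1f} TB, after: {after:.1f} TB")
```

The point of the arithmetic is that capacity grows linearly with the number of boxes, while the replica count sets the price you pay for redundancy.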

We looked at other solutions that integrate with OpenStack’s Cinder, like SolidFire, EMC VNX and NetApp. We chose Ceph to reach the market prices that our customers demand. SolidFire came pretty close to being selected, but the uncertainty about its efficiency factor, about the results it would actually deliver, and about how much those depend on the type of application data made us prefer a solution like Ceph, which lets us predict our storage costs accurately.

A short summary of what Ceph does

Ceph provides a unified storage solution: object, block and file access. Ceph’s block storage works through a special interface called RBD (RADOS Block Device). Ceph doesn’t use traditional storage protocols like Fibre Channel over Ethernet or iSCSI. Even so, Ceph is optimized to deliver really good performance because it “automatically stripes and replicates the data across the cluster”. Ceph thinly provisions each server image in the cluster, spreading its pieces across different servers and disks (where the OSD daemons reside) to balance performance and disk capacity usage.
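The striping and thin provisioning described above can be illustrated with a small, self-contained Python sketch. It is a simplified model, not Ceph code: it assumes an image is cut into fixed-size objects (4 MiB is the usual RBD default) and that an object only consumes space once something has been written to it. The `ThinImage` class and `objects_for_write` function are hypothetical names made up for this illustration.

```python
from typing import List, Set

OBJECT_SIZE = 4 * 1024 * 1024  # assumed object size: 4 MiB


def objects_for_write(offset: int, length: int,
                      object_size: int = OBJECT_SIZE) -> List[int]:
    """Return the indices of the image objects touched by a write."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    first = offset // object_size
    last = (offset + length - 1) // object_size
    return list(range(first, last + 1))


class ThinImage:
    """Toy model of a thinly provisioned block image."""

    def __init__(self, size_bytes: int, object_size: int = OBJECT_SIZE):
        self.size_bytes = size_bytes
        self.object_size = object_size
        self.allocated: Set[int] = set()  # objects that actually exist

    def write(self, offset: int, length: int) -> None:
        if offset + length > self.size_bytes:
            raise ValueError("write goes past the end of the image")
        self.allocated.update(
            objects_for_write(offset, length, self.object_size))

    def used_bytes(self) -> int:
        # Upper bound: counts every touched object as fully used.
        return len(self.allocated) * self.object_size


if __name__ == "__main__":
    image = ThinImage(size_bytes=10 * 1024**3)   # a 10 GiB image
    image.write(offset=0, length=6 * 1024**2)    # write the first 6 MiB
    print(sorted(image.allocated), image.used_bytes())
```

In a real cluster each of those objects is then placed (and replicated) on different OSDs, which is what spreads both the I/O load and the capacity usage.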

(Below you can see how Ceph and OpenStack work together through Cinder.)

[Image: Ceph Firefly and OpenStack Icehouse integrated through Cinder: CRUSH, RADOS, RBD block storage]

Ceph is composed of three types of components:

  • OSD daemons are in charge of data operations such as storing and replicating data. OSDs are grouped into failure domains (for example, different cabinets in a datacenter).
  • Ceph Monitors (M) keep information about the cluster and its components.
  • The Ceph Metadata Server makes it easier to serve POSIX file system users.

Ceph uses the CRUSH algorithm to decide where data should be placed in the cluster; CRUSH is what lets Ceph scale efficiently and balance the use of the cluster’s resources.
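To give a feel for what computed placement means, here is a toy Python sketch. It is not the CRUSH algorithm: it swaps in plain highest-score (rendezvous) hashing as a stand-in, and the cabinet and OSD names are hypothetical. What it does share with the idea described above is that any client can work out where an object lives from its name and the cluster map alone, and that each copy lands in a different failure domain.

```python
import hashlib
from typing import Dict, List


def _score(object_name: str, osd: str) -> int:
    """Deterministic pseudo-random score for an (object, OSD) pair."""
    digest = hashlib.sha256(f"{object_name}/{osd}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def place(object_name: str, cluster: Dict[str, List[str]],
          replicas: int = 3) -> List[str]:
    """Pick `replicas` OSDs, each from a different failure domain."""
    # Best-scoring OSD inside each failure domain (e.g. each cabinet).
    candidates = [
        max(osds, key=lambda osd: _score(object_name, osd))
        for osds in cluster.values() if osds
    ]
    if len(candidates) < replicas:
        raise ValueError("not enough failure domains for the replica count")
    # Keep the candidates with the highest scores overall.
    candidates.sort(key=lambda osd: _score(object_name, osd), reverse=True)
    return candidates[:replicas]


if __name__ == "__main__":
    # Hypothetical cluster map: four cabinets with two OSDs each.
    cluster = {
        "cabinet-a": ["osd.0", "osd.1"],
        "cabinet-b": ["osd.2", "osd.3"],
        "cabinet-c": ["osd.4", "osd.5"],
        "cabinet-d": ["osd.6", "osd.7"],
    }
    print(place("volume-1234.object-0", cluster))
```

Because the placement is computed rather than looked up in a central table, there is no directory server to become a bottleneck as the cluster grows.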

It sounds simple, but you need to dedicate some time to architecting and implementing it. However, you will not regret trying it!

See you around!

4 replies »

  1. CEPH sounds great on slides. You’d be wise to look deeper though (e.g. the Ceph war stories session at the OpenStack Design Summit… where the term Cephpocalypse was coined). Data loss is generally frowned upon when selecting a storage solution.

  2. Good to hear from you, Rob. Ceph is not ready to be an out-of-the-box solution; it requires a lot of work. Data loss is a risk that you have to remove by testing your implementation against tens of possible failures. You also need to add code in the middle to make it work the way you want. Data loss is a risk on any storage platform; vendors minimize the chance of it happening by testing and by tweaking/coding software and firmware. Therefore, we have to do the same thing with the resources that we have… testing… testing… testing
