What do I have to choose to efficiently store, protect and manage my private cloud’s data?

The number of storage systems available in the market has grown on my radar during the last months, and so has the complexity of choosing one of them. I am personally afraid of choosing one now and regretting it later, aren't you? The storage ecosystem is changing so fast that if you select something now, the criteria you just used probably won't hold a few months later.

This new trend of software-defined everything has blessed the storage field, and now we've got roadmaps fully based on software version and update releases instead of on fresh new hardware components. In fact, storage hardware is turning into a commodity asset for most companies, and software rides new versions of it almost transparently (in some cases you will just need to compile your code again).

Ok then, what do I have to choose to store my private cloud servers' data? Or maybe the correct question is: what do I have to choose to efficiently store, protect and manage my private cloud's data?


The eternal comparison between local disks and networked consolidated storage

Storing data is the easiest part. You can use local disk devices in every server to store your private cloud's data, like some Clouders do to bring the lowest prices. However, there are storage features in the market that bring a remarkable difference to your business.

Why use a networked storage subsystem instead of local disk devices? Well, the first word that comes to my mind is redundancy. In fact, this storage subsystem could be composed of local disk devices in every server in a fully converged cloud architecture (see my previous notes to know what I mean). But the fact is you really need an independent piece of software that can take care of the data, and of the services that depend on it, right away when a failure comes up. You need a storage subsystem that can protect this data by moving or copying it to other disks or servers, hopefully without any disruption to users or applications. The other reason is cost: what if your storage requirement exceeds the available capacity of the server's disks? Will you buy another server just to get more storage capacity? I don't think so; you will rely on the capacity and performance management features that our precious storage subsystem brings.

So you will choose something that brings redundancy (hopefully without any disruption against the most common failures), capacity and performance management, and additionally cost savings through features like compression, thin provisioning and de-duplication.

Solutions like Ceph, ViPR (ScaleIO included), SolidFire, Nutanix and VMware Virtual SAN offer performance and capacity scale-out and distributed redundancy at different levels. However, Ceph is limited when it comes to additional storage savings through features like de-duplication and compression. Ceph brings thin provisioning, and you can get dedup and compression by using btrfs as the file system for your storage nodes, but btrfs is not recommended for production data... yet. Ceph also doesn't bring a consolidated dashboard out of the box to simplify operation. On the other hand, Ceph helps you reduce your costs because there is no license fee for its usage. You have to figure out the trade-off between paying licenses and building your own management software.
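To give an idea of how little ceremony Ceph's thin provisioning needs, here is a minimal sketch using the python-rados and python-rbd bindings; the pool name, image name and size are placeholders of mine, not a recommendation:

    # Minimal sketch with the python-rados / python-rbd bindings.
    # Pool name, image name and size are placeholders.
    import rados
    import rbd

    # Connect to the cluster with the standard config and keyring
    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx('volumes')  # an existing pool
        try:
            # RBD images are thin-provisioned by design: this 100 GiB
            # image consumes almost no space until data is written.
            rbd.RBD().create(ioctx, 'vm-disk-01', 100 * 1024**3)
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()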

Hypervisor datastores versus networked storage products

Why do you need a storage subsystem with APIs to connect orchestration tools? Why not just use the hypervisor's datastore definition to manage the storage resources directly? IMHO, if you need to take thousands of snapshots of your cloud data, a datastore is more a capacity and performance problem than a help; if you want to scale out to hundreds of TBs, a datastore is a clumsy object on the road; and if you want to troubleshoot a performance issue, a datastore can turn it into a nightmare.

Datastores were conceived to manage bunches of dumb disks. You deserve more, don't you?

ViPR and Nutanix are the most mature solutions to work with any mainstream hypervisor (VMware, Hyper-V, Xen, KVM). I dare say that SolidFire and Ceph bring the flexibility to adapt themselves to any of these hypervisors, though it requires additional work and testing.

Orchestration software defines the storage choice

If your cloud is composed of a full set of highly orchestrated and automated services (I mean, not just a bunch of virtual servers), you will find several options for cloud orchestration and automation tools in the market; I will use OpenStack as my reference throughout this note. Then you need something with plenty of APIs supporting a seamless integration with your cloud's software, commonly called a software-defined storage product.

You have to be 100% sure the vendor or community that supports this storage product is committed to keeping it updated with the future releases of your cloud software of choice. Take OpenStack: vendors and communities should already be testing the Juno release and components like Cinder, Swift, Glance and Nova.
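From the consumer side, that integration surface is deliberately small. As a sketch, this is roughly what requesting a volume looks like with the python-cinderclient library of that era (v1 API); the credentials, endpoint and volume type below are placeholders of mine:

    # Sketch using python-cinderclient (v1 API, contemporary with Juno).
    # Credentials, auth URL and the volume type are placeholders.
    from cinderclient import client

    cinder = client.Client('1', 'admin', 'secret', 'demo',
                           'http://controller:5000/v2.0')

    # The orchestrator only asks for "10 GB of type ssd-tier"; the
    # storage product wired in behind that type does the real work.
    vol = cinder.volumes.create(size=10,
                                display_name='app-data-01',
                                volume_type='ssd-tier')
    print(vol.id, vol.status)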

I would say SolidFire is the most mature product for Cinder and Nova. Nutanix and ViPR are just starting this journey with OpenStack, supporting other components like Swift and Glance; and Ceph requires work, and a company like KIO with the skills and the experience, to support this integration. There are also other companies bringing solutions, like HDS, EMC VNX and VMAX, Dell EqualLogic, HP 3PAR and LeftHand, Huawei, IBM and NetApp (for the full list, see the Cinder Support Matrix). I am not covering them in this note because not all of them provide capabilities like scale-out in performance and capacity, or the flexibility to work with innovative architectures like a fully converged one; and some are too expensive to be considered for services managed by Clouders like us. GlusterFS is a nice solution, but it lacks built-in features like the replication and block access that Ceph has, so I am leaving it out of this comparison... for now.

Applications set your IO storage profile

Are you building a cloud to support a web portal with a low, medium or high number of concurrent user sessions? Or perhaps you just want to archive millions of images for a long time at the lowest cost. Well, applications define the IO profile and therefore the storage requirements. Maybe you will need just an object store based on SATA disks, or your portal could be so sophisticated that a unified solution bringing objects, files and blocks is the best option. Some data could be highly transactional, requiring very low-latency responses and reading randomly 80% of the time; then you will need SSD and/or flash, or a mix of different disk types.
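To make that concrete, here is a quick back-of-envelope sketch; the per-disk IOPS figures and the workload target are rough assumptions of mine, just to show the scale of the difference:

    # Back-of-envelope sizing; per-device IOPS figures are rough
    # assumptions, not vendor numbers.
    import math

    SATA_IOPS = 100       # one 7.2k SATA spindle, small random reads
    SSD_IOPS = 20000      # one mainstream SSD

    required_iops = 15000   # hypothetical transactional tier target
    random_ratio = 0.80     # 80% random reads, as described above

    random_iops = required_iops * random_ratio
    print('SATA disks needed:', math.ceil(random_iops / SATA_IOPS))  # 120
    print('SSDs needed:', math.ceil(random_iops / SSD_IOPS))         # 1

One hundred and twenty spindles versus a single SSD for the same random workload: that is why the IO profile, not raw capacity, should drive the disk mix.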

Storage requires flexibility to fit the application's IO profile. I mean, it is not just about objects to support orchestration components; it is about the best way to serve your application's IO requirements.

Ceph supports different types of access and can be set up to fit different IO requirements by changing the block size or the object size, or by adding a cache or tiering layer on faster disk devices. You could define an object/file store on SATA disks and a block store based on SSD/flash. ViPR today can manage similar options. Nutanix is thinking about decoupling its hardware from its software to bring more flexibility. SolidFire is still tied to its appliances, and this lack of flexibility could limit its scope.
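As an illustration of that flexibility, here is a minimal sketch that drives the stock ceph CLI to put a faster pool in front of a SATA-backed one as a writeback cache tier (a feature shipped since Ceph Firefly). Pool names and PG counts are placeholders, and the CRUSH rule that pins the hot pool to SSD OSDs is assumed to exist already:

    # Minimal sketch around the stock ceph CLI; pool names and PG
    # counts are placeholders.
    import subprocess

    def ceph(*args):
        """Run one ceph CLI subcommand, raising on failure."""
        subprocess.check_call(('ceph',) + args)

    ceph('osd', 'pool', 'create', 'cold-data', '128')   # SATA-backed pool
    ceph('osd', 'pool', 'create', 'hot-cache', '128')   # SSD-backed pool

    # Attach the SSD pool as a writeback cache tier; clients keep
    # talking to 'cold-data' while hits are served from the SSD tier.
    ceph('osd', 'tier', 'add', 'cold-data', 'hot-cache')
    ceph('osd', 'tier', 'cache-mode', 'hot-cache', 'writeback')
    ceph('osd', 'tier', 'set-overlay', 'cold-data', 'hot-cache')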

Flexibility brings more options to get the performance that fits your applications' IO requirements, and at the right cost.

See you around!

3 thoughts on “What do I have to choose to efficiently store, protect and manage my private cloud’s data?”

  1. Mauricio,
    Great post, I have a few things to add. First, when purchasing storage, it is important to think about the five-to-seven-year plan for all instances rather than just a single point solution for a single instance. Most regrets come from taking the “Tylenol”: purchasing a box solely for performance for a point solution, and eventually it wears off. To mitigate that, here is what I would look for:

    1. Linear scale-out of performance and capacity so that the system can grow as the number of instances grows. It is imperative that scale-out is done without disruption and with minimal performance impact. It is also important to be able to scale back: with most companies having multiple data centers, the ability to move performance and capacity from one data center to another becomes key over the long run.

    2. Volume-level QoS to enable multiple applications to run on the same storage array. The main benefit is that, similar to hypervisors provisioning vCPUs and vRAM, with volume-level QoS multiple different workloads can run right next to each other without the risk of noisy neighbors.

    3. Global in-line efficiencies of de-duplication, compression and thin provisioning. There is a huge difference between on/off (typically, if it can be turned off, there is a reason for it), per-volume (small de-duplication range), post-process (performance hit) and global in-line.

    4. Built from the ground up around a REST-based API so that everything can be scripted and automated. In addition, most systems that are API-native have deep integrations with VMware, OpenStack and CloudStack, which protects those initiatives going forward (see the sketch after this list).

    If the storage system satisfies those four pieces, then you will be future-proofed going forward.
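    A quick illustration of points 2 and 4 combined: the sketch below creates a volume with per-volume QoS through a SolidFire-style JSON-RPC endpoint. The management address, API version, credentials and account ID are placeholders, not real values.

        # Sketch against a SolidFire-style JSON-RPC endpoint; the
        # address, API version, credentials and accountID are placeholders.
        import requests

        def solidfire(method, params):
            """POST one JSON-RPC call and return its 'result' payload."""
            resp = requests.post(
                'https://mvip.example.com/json-rpc/7.0',
                json={'method': method, 'params': params, 'id': 1},
                auth=('admin', 'secret'),
                verify=False)  # lab only; verify certificates in production
            return resp.json()['result']

        # One call provisions capacity and pins the noisy-neighbor guardrails
        result = solidfire('CreateVolume', {
            'name': 'app-data-01',
            'accountID': 1,
            'totalSize': 100 * 10**9,  # 100 GB
            'enable512e': True,
            'qos': {'minIOPS': 1000, 'maxIOPS': 5000, 'burstIOPS': 8000},
        })
        print(result['volumeID'])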

