Swapping a failed ceph osd drive

8/28/2016

In the world of computing, there are some things which, in Dustin's words, "is not good software." For the past 6 months, I've been messing around with Ceph for object storage. Luckily for me, it is good software. However, it has bad documentation, sometimes bordering on horrible. This is the sorry state of open source software these days (ever try to use a HashiCorp product lately?). While most things in Ceph are fairly automated, swapping an OSD's underlying storage device is not. It seems like a scale-out storage product would want to nail down the simple act of swapping a dead hard drive, right?

Sadly, the official documentation for swapping a drive is... long and not admin friendly. Seriously, Ceph team, this is your suggested process? Red Hat is now Ceph's corporate overlord, and even they think this is an arduously bad process. Look, they even filed a bug on it: https://bugzilla.redhat.com/show_bug.cgi?id=1210539

Luckily, I've figured out a much easier way. Here it is:

1. On the storage node, find the OSD # to drive letter mapping (sdq is the dying drive):

2. On the storage node, stop the OSD:

3. On the storage node, replace the drive and check its (new) drive letter in dmesg.

4. On a ceph mon node, convert the OSD ID into an OSD UUID:

5. On a ceph mon node, remove the old OSD auth key:

6. On the storage node, prepare the new OSD drive (/dev/sdq here). Note that this uses the cluster ID. You can find your cluster ID by running "ceph -s":

7. On the storage node, activate the new volume (using systemd in this cluster):

After that, your OSD will start rebuilding PGs. No need to mess with the CRUSH map. Just swap the drive, remove the old key, prepare the new drive with the old OSD UUID, start the OSD, done.
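
For reference, here is a rough sketch of what the seven steps could look like as commands. It is not a record of the exact commands used on this cluster. It assumes the ceph-disk tooling of that era, a cluster named "ceph", systemd unit names of the form ceph-osd@N, and sdX-style device names where the data partition is partition 1. The OSD number 12 is a made-up placeholder, and the helper names osd_uuid_from_dump and print_swap_plan are hypothetical. The script only prints the plan; it does not touch the cluster. The UUID helper also assumes the UUID is the last field of the osd.N line in the plain-text "ceph osd dump" output, so check that against your version before trusting it.

```shell
#!/bin/sh
# Sketch only: prints the commands for each step, runs nothing against Ceph.

# Pull one OSD's UUID out of `ceph osd dump` text supplied on stdin.
# Assumption: the UUID is the last whitespace-separated field on the
# line that starts with "osd.N".
osd_uuid_from_dump() {
    awk -v osd="osd.$1" '$1 == osd { print $NF }'
}

# Print the swap plan for OSD number $1 whose new drive is device $2.
# <cluster-id> and <osd-uuid> are left as placeholders to fill in by hand.
print_swap_plan() {
    id="$1"
    dev="$2"
    cat <<EOF
# 1. storage node: map OSD numbers to devices
ceph-disk list
# 2. storage node: stop the failed OSD
systemctl stop ceph-osd@${id}
# 3. storage node: swap the drive, then confirm its device name
dmesg | tail
# 4. mon node: look up the OSD UUID (last field of the osd.${id} line)
ceph osd dump | grep '^osd\.${id} '
# 5. mon node: remove the old auth key
ceph auth del osd.${id}
# 6. storage node: prepare the new drive, reusing the old OSD UUID
ceph-disk prepare --cluster ceph --cluster-uuid <cluster-id> --osd-uuid <osd-uuid> ${dev}
# 7. storage node: activate the new data partition (assumed to be partition 1)
ceph-disk activate ${dev}1
EOF
}

# OSD_ID and DEV can be overridden from the environment.
print_swap_plan "${OSD_ID:-12}" "${DEV:-/dev/sdq}"
```

To get just the UUID on the mon node, the helper can be fed the real dump: ceph osd dump | osd_uuid_from_dump 12. Printing the plan instead of executing it is deliberate, since the steps alternate between the storage node and a mon node and step 3 involves physically pulling a drive.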