
Updated the nccli string to utilscli to avoid AT&T specific Network cloud terminology. Change-Id: I8dae02559a422dab0bdb8007daaa4f86a67f087e
65 lines
1.7 KiB
Markdown
65 lines
1.7 KiB
Markdown
# Ceph Maintenance
|
||
|
||
This MOP covers Maintenance Activities related to Ceph.
|
||
|
||
## Table of Contents ##
|
||
|
||
<!-- TOC depthFrom:1 depthTo:6 withLinks:1 updateOnSave:1 orderedList:0 -->
|
||
|
||
- Table of Contents
|
||
- 1. Generic commands
|
||
- 2. Replace failed OSD
|
||
|
||
## 1. Generic Commands ##
|
||
|
||
### Check OSD Status
|
||
To check the current status of OSDs, execute the following:
|
||
|
||
```
|
||
utilscli osd-maintenance check_osd_status
|
||
```
|
||
|
||
### OSD Removal
|
||
To purge OSDs in down state, execute the following:
|
||
|
||
```
|
||
utilscli osd-maintenance osd_remove
|
||
```
|
||
|
||
### OSD Removal By OSD ID
|
||
To purge OSDs by OSD ID in down state, execute the following:
|
||
|
||
```
|
||
utilscli osd-maintenance remove_osd_by_id --osd-id <OSDID>
|
||
```
|
||
|
||
### Reweight OSDs
|
||
To adjust an OSD’s crush weight in the CRUSH map of a running cluster, execute the following:
|
||
|
||
```
|
||
utilscli osd-maintenance reweight_osds
|
||
```
|
||
|
||
## 2. Replace failed OSD ##
|
||
|
||
In the context of a failed drive, Please follow below procedure. Following commands should be run from utility container
|
||
|
||
Capture the failed OSD ID. Check for status `down`
|
||
|
||
utilscli ceph osd tree
|
||
|
||
Remove the OSD from Cluster. Replace `<OSD_ID>` with above captured failed OSD ID
|
||
|
||
utilscli osd-maintenance osd_remove_by_id --osd-id <OSD_ID>
|
||
|
||
Remove the failed drive and replace it with a new one without bringing down the node.
|
||
|
||
Once new drive is placed, delete the concern OSD pod in `error` or `CrashLoopBackOff` state. Replace `<pod_name>` with failed OSD pod name.
|
||
|
||
kubectl delete pod <pod_name> -n ceph
|
||
|
||
Once pod is deleted, kubernetes will re-spin a new pod for the OSD. Once Pod is up, the osd is added to ceph cluster with weight equal to `0`. we need to re-weight the osd.
|
||
|
||
utilscli osd-maintenance reweight_osds
|
||
|