.. rww1579702317136
.. _800-series-alarm-messages:

=========================
800 Series Alarm Messages
=========================

The system inventory and maintenance service reports system changes with
different degrees of severity. Use the reported alarms to monitor the overall
health of the system.

.. include:: ../_includes/x00-series-alarm-messages.rest
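
For example, the following minimal sketch lists any active 800 series alarms
from the command line. It assumes the ``fm`` CLI is installed and
authenticated on the active controller; the ``fm alarm-list`` table layout
parsed here is an assumption and may differ between releases.

.. code-block:: python

   # Minimal sketch: list active 800-series (storage) alarms.
   # Assumes the "fm" CLI is installed and authentication has been sourced;
   # the column layout parsed below may differ between releases.
   import subprocess

   out = subprocess.run(["fm", "alarm-list"], capture_output=True,
                        text=True, check=True).stdout

   for line in out.splitlines():
       cells = [c.strip() for c in line.strip().strip("|").split("|")]
       if cells and cells[0].startswith("800."):
           print(cells)
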

.. _800-series-alarm-messages-table-zrd-tg5-v5:

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 800.001**
     - Storage Alarm Condition:
       1 mons down, quorum 1,2 controller-1,storage-0
   * - Entity Instance
     - cluster=<dist-fs-uuid>
   * - Degrade Affecting Severity:
     - None
   * - Severity:
     - C/M\*
   * - Proposed Repair Action
     - If problem persists, contact next level of support.
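
For alarm 800.001, the following minimal sketch reports any configured Ceph
monitor that is not currently in quorum. It assumes the ``ceph`` CLI is
available on the host and that ``ceph quorum_status`` emits JSON, which is
the case for current Ceph releases.

.. code-block:: python

   # Minimal sketch for alarm 800.001: list configured Ceph monitors that
   # are not in the current quorum.  Assumes the "ceph" CLI is available.
   import json
   import subprocess

   status = json.loads(subprocess.run(["ceph", "quorum_status"],
                                      capture_output=True, text=True,
                                      check=True).stdout)

   configured = {mon["name"] for mon in status["monmap"]["mons"]}
   in_quorum = set(status["quorum_names"])

   for name in sorted(configured - in_quorum):
       print(f"monitor {name} is configured but not in quorum")
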

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 800.003**
     - Storage Alarm Condition: Quota/Space mismatch for the <tiername> tier.
       The sum of Ceph pool quotas does not match the tier size.
   * - Entity Instance
     - cluster=<dist-fs-uuid>.tier=<tiername>
   * - Degrade Affecting Severity:
     - None
   * - Severity:
     - m
   * - Proposed Repair Action
     - Update ceph storage pool quotas to use all available tier space.
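
For alarm 800.003, the following minimal sketch sums the quotas of all Ceph
pools and compares the total against a tier size supplied by the operator.
It assumes the ``ceph`` CLI is available and that ``osd pool get-quota``
supports JSON output; the tier size placeholder must be filled in manually.

.. code-block:: python

   # Minimal sketch for alarm 800.003: compare the sum of Ceph pool quotas
   # with the tier size.  Assumes "osd pool get-quota" supports JSON output.
   import json
   import subprocess

   TIER_SIZE_BYTES = 0  # placeholder: set to the size of the affected tier

   def ceph_json(*args):
       out = subprocess.run(["ceph", *args, "--format", "json"],
                            capture_output=True, text=True, check=True).stdout
       return json.loads(out)

   total = sum(ceph_json("osd", "pool", "get-quota", pool)["quota_max_bytes"]
               for pool in ceph_json("osd", "pool", "ls"))

   if total != TIER_SIZE_BYTES:
       print(f"pool quota total {total} B does not match tier size "
             f"{TIER_SIZE_BYTES} B")
   else:
       print("pool quotas match the tier size")
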

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 800.010**
     - Potential data loss. No available OSDs in storage replication group.
   * - Entity Instance
     - cluster=<dist-fs-uuid>.peergroup=<group-x>
   * - Degrade Affecting Severity:
     - None
   * - Severity:
     - C\*
   * - Proposed Repair Action
     - Ensure storage hosts from replication group are unlocked and available.
       Check if OSDs of each storage host are up and running. If problem
       persists, contact next level of support.
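
For alarms 800.010 and 800.011, the following minimal sketch reports any Ceph
OSD that is not both up and in. It assumes the ``ceph`` CLI is available on
the active controller.

.. code-block:: python

   # Minimal sketch for alarms 800.010/800.011: report OSDs that are not
   # both "up" and "in".  Assumes the "ceph" CLI is available.
   import json
   import subprocess

   tree = json.loads(subprocess.run(["ceph", "osd", "tree", "--format", "json"],
                                    capture_output=True, text=True,
                                    check=True).stdout)

   for node in tree["nodes"]:
       if node.get("type") != "osd":
           continue
       # "status" is up/down; a reweight of 0 means the OSD is out of the map.
       if node.get("status") != "up" or node.get("reweight", 0) == 0:
           print(f"{node['name']}: status={node.get('status')}, "
                 f"reweight={node.get('reweight')}")
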

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 800.011**
     - Loss of replication in peergroup.
   * - Entity Instance
     - cluster=<dist-fs-uuid>.peergroup=<group-x>
   * - Degrade Affecting Severity:
     - None
   * - Severity:
     - M\*
   * - Proposed Repair Action
     - Ensure storage hosts from replication group are unlocked and available.
       Check if OSDs of each storage host are up and running. If problem
       persists, contact next level of support.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 800.102**
     - Storage Alarm Condition:
       PV configuration <error/failed to apply\> on <hostname>.
       Reason: <detailed reason\>.
   * - Entity Instance
     - pv=<pv\_uuid>
   * - Degrade Affecting Severity:
     - None
   * - Severity:
     - C/M\*
   * - Proposed Repair Action
     - Remove failed PV and associated Storage Device then recreate them.
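
For alarm 800.102, the UUID of the failed physical volume is carried in the
alarm's entity instance (``pv=<pv_uuid>``). The following minimal sketch
extracts it; it assumes the ``fm`` CLI is available and that the installed
release supports ``--query`` filtering.

.. code-block:: python

   # Minimal sketch for alarm 800.102: pull the failed PV UUID out of the
   # entity instance column.  The "--query" filter is assumed to be supported.
   import subprocess

   out = subprocess.run(["fm", "alarm-list", "--query", "alarm_id=800.102"],
                        capture_output=True, text=True, check=True).stdout

   for line in out.splitlines():
       for cell in (c.strip() for c in line.strip().strip("|").split("|")):
           if "pv=" in cell:
               print("failed physical volume:", cell.split("pv=", 1)[1])
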

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 800.103**
     - Storage Alarm Condition:
       \[ Metadata usage for LVM thin pool <VG name>/<Pool name> exceeded
       threshold and automatic extension failed
       Metadata usage for LVM thin pool <VG name>/<Pool name> exceeded
       threshold \]; threshold x%, actual y%.
   * - Entity Instance
     - <hostname>.lvmthinpool=<VG name>/<Pool name>
   * - Degrade Affecting Severity:
     - None
   * - Severity:
     - C\*
   * - Proposed Repair Action
     - Increase Storage Space Allotment for Cinder on the 'lvm' backend.
       Consult the user documentation for more details. If problem persists,
       contact next level of support.
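
For alarm 800.103, the following minimal sketch reports LVM thin pools whose
metadata usage exceeds a threshold. It assumes ``lvs`` is available and run
with root privileges; the 80% threshold is illustrative only, not the
platform's configured value.

.. code-block:: python

   # Minimal sketch for alarm 800.103: flag thin pools with high metadata
   # usage.  Assumes "lvs" is run as root; the threshold is illustrative.
   import subprocess

   THRESHOLD = 80.0  # percent

   out = subprocess.run(["lvs", "--noheadings", "--separator", "|",
                         "-o", "vg_name,lv_name,metadata_percent"],
                        capture_output=True, text=True, check=True).stdout

   for line in out.splitlines():
       if not line.strip():
           continue
       vg, lv, meta = (field.strip() for field in line.split("|"))
       # metadata_percent is empty for volumes that are not thin pools.
       if meta and float(meta) > THRESHOLD:
           print(f"thin pool {vg}/{lv}: metadata usage {meta}% exceeds "
                 f"{THRESHOLD}%")
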

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 800.104**
     - Storage Alarm Condition:
       <storage-backend-name> configuration failed to apply on host: <host-uuid>.
   * - Degrade Affecting Severity:
     - None
   * - Severity:
     - C\*
   * - Proposed Repair Action
     - Update backend setting to reapply configuration. Consult the user
       documentation for more details. If problem persists, contact next level
       of support.
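
For alarm 800.104, the following minimal sketch prints the state of each
configured storage backend so that a failed configuration can be identified
and reapplied. It assumes the ``system`` CLI is available and authenticated;
the column layout of ``system storage-backend-list`` is an assumption.

.. code-block:: python

   # Minimal sketch for alarm 800.104: show storage backend rows.  The exact
   # column layout of "system storage-backend-list" is an assumption.
   import subprocess

   out = subprocess.run(["system", "storage-backend-list"],
                        capture_output=True, text=True, check=True).stdout

   for line in out.splitlines():
       cells = [c.strip() for c in line.strip().strip("|").split("|")]
       # Data rows carry the backend name, type, and current state/task.
       if len(cells) > 2 and cells[0] and cells[0] != "uuid":
           print(cells)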