File-based backup of a vCenter Server using VAMI fails due to the vmware-postgres-archiver service not running

Article ID: 307180

Products

VMware vCenter Server

Issue/Introduction

  • File-based backup using the vCenter VAMI fails because not all mandatory services are running
  • When reviewing the current service status using the service-control command, you find that vmware-postgres-archiver is stopped:
    # service-control --status --all
    Running:
     applmgmt lookupsvc lwsmd observability observability-vapi pschealth vc-ws1a-broker vlcm vmafdd vmcad vmdird vmware-analytics 
     vmware-certificateauthority vmware-certificatemanagement vmware-cis-license vmware-content-library vmware-eam vmware-envoy vmware-envoy-hgw 
     vmware-envoy-sidecar vmware-hvc vmware-infraprofile vmware-perfcharts vmware-pod vmware-rbd-watchdog vmware-rhttpproxy vmware-sca vmware-sps 
     vmware-stsd vmware-topologysvc vmware-trustmanagement vmware-updatemgr vmware-vapi-endpoint vmware-vdtc vmware-vmon vmware-vpostgres 
     vmware-vpxd vmware-vpxd-svcs vmware-vsan-health vmware-vsm vsphere-ui vstats vtsdb wcp
    Stopped:
     vmcam vmonapi vmware-imagebuilder vmware-netdumper vmware-postgres-archiver vmware-vcha
  • Attempting to start the service fails
  • /var/log/vmware/vpostgres/pg_archiver.log.stderr contains errors similar to the following:
    Starting service process with pid: 37973.
    <timestamp> DEBUG pg_archiver Updated startup LSN using segment file "0000000100000000000000E0.gz"
    <timestamp> DEBUG pg_archiver Updated startup LSN using segment file "0000000100000000000000ED.gz"
    <timestamp> DEBUG pg_archiver Updated startup LSN using segment file "0000000100000000000000FD.gz"
    <timestamp> DEBUG pg_archiver Updated startup LSN using segment file "000000010000000100000015.gz"
    <timestamp> DEBUG pg_archiver Updated startup LSN using segment file "000000010000000100000018.gz"
    <timestamp> DEBUG pg_archiver Updated startup LSN using segment file "00000001000000010000001E.gz"
    <timestamp> DEBUG pg_archiver Updated startup LSN using segment file "00000001000000010000001F.gz"
    <timestamp> DEBUG pg_archiver Updated startup LSN using segment file "000000010000000100000020.gz.partial"
    <timestamp> DEBUG pg_archiver starting log streaming at 1/20000000 (timeline 1)
    <timestamp> ERROR pg_archiver unexpected termination of replication stream: ERROR: requested WAL segment 000000010000000100000020 has already been removed
    <timestamp> ERROR pg_archiver disconnected
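
The two symptoms above can be checked with a small script. This is a minimal sketch, not a VMware-provided tool: the function names `service_is_stopped` and `log_has_removed_segment_error` are hypothetical, and the parsing assumes the output layout shown above (a `Running:` or `Stopped:` heading line followed by space-separated service names).

```shell
#!/bin/bash
# Hypothetical helpers; both read their input from stdin.

# Return 0 if the named service is listed under "Stopped:" in the
# output of `service-control --status --all`.
service_is_stopped() {
  local service="$1"
  awk -v svc="$service" '
    # A line holding only "Word:" starts a new section (Running:, Stopped:, ...)
    /^[[:space:]]*[A-Za-z]+:[[:space:]]*$/ { section = $1; next }
    section == "Stopped:" {
      for (i = 1; i <= NF; i++) if ($i == svc) found = 1
    }
    END { exit found ? 0 : 1 }
  '
}

# Return 0 if the archiver log contains the "already been removed" error.
log_has_removed_segment_error() {
  grep -q 'requested WAL segment [0-9A-F]* has already been removed'
}
```

On the appliance, the helpers could be used like this:

```shell
service-control --status --all | service_is_stopped vmware-postgres-archiver \
  && echo "archiver is stopped"
log_has_removed_segment_error < /var/log/vmware/vpostgres/pg_archiver.log.stderr \
  && echo "archiver requested a WAL segment that no longer exists"
```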

Environment

  • VMware vCenter Server Appliance 6.7
  • VMware vCenter Server Appliance 7.0.x

Cause

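A segment file in /storage/archive/vpostgres/ has been deleted, or the continuity of the archived WAL segments is broken. As a result, the archiver requests a WAL segment that PostgreSQL has already removed, and the replication stream terminates.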
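
To relate a segment file name from the log to the streaming position the archiver reports, the name can be converted to a start LSN. The sketch below assumes the standard PostgreSQL WAL file naming (8 hex digits each for timeline, log, and segment) and the default 16 MiB segment size; the function name `wal_segment_to_lsn` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the start LSN of a WAL segment file name.
# Assumes the default 16 MiB WAL segment size.
wal_segment_to_lsn() {
  local name="${1%%.*}"            # drop suffixes such as .gz or .gz.partial
  if [ "${#name}" -ne 24 ]; then
    echo "not a WAL segment name: $1" >&2
    return 1
  fi
  local log_hex="${name:8:8}"      # middle 8 hex digits: high part of the LSN
  local seg_hex="${name:16:8}"     # last 8 hex digits: segment within the log
  printf '%X/%X\n' "$((16#$log_hex))" "$((16#$seg_hex * 16 * 1024 * 1024))"
}

wal_segment_to_lsn 000000010000000100000020.gz.partial   # prints 1/20000000
```

The result matches the log line `starting log streaming at 1/20000000 (timeline 1)`: the archiver resumes from the newest segment it holds, and the error appears when PostgreSQL no longer has that segment.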

Resolution

Before following the steps below, make sure that a fresh backup or an offline snapshot of the vCenter Server Appliance (VCSA), taken in a powered-off state, exists. If the VCSA is part of an Enhanced Linked Mode (ELM) replication group, offline snapshots of all ELM nodes are required for a successful rollback if anything goes wrong.

To resolve this issue, manually remove the replication slot from the postgres database.

  1. Run this query to find the slot name (vpg_archiver by default):
    # /opt/vmware/vpostgres/current/bin/psql -U postgres -d VCDB -c "select * from pg_replication_slots;"
      slot_name   | plugin | slot_type | datoid | database | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn
    --------------+--------+-----------+--------+----------+--------+------------+------+--------------+-------------+---------------------
     vpg_archiver |        | physical  |        |          | f      |            |      |              | 0/DB000000  |
    (1 row)
  2. Run this command to delete the slot (substitute the slot name returned by the previous query):
    # /opt/vmware/vpostgres/current/bin/psql -U postgres -d VCDB -c "select pg_drop_replication_slot('vpg_archiver');"
    pg_drop_replication_slot
    --------------------------
    (1 row)
  3. Run this command to remove all segments in the /storage/archive/vpostgres directory:
    # rm /storage/archive/vpostgres/*
  4. Run this command to start the vmware-postgres-archiver service:
    # /bin/service-control --start vmware-postgres-archiver
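
Steps 2 through 4 can be collected into one function for review before execution. This is a minimal sketch built only from the commands above, not an official script: the names `run`, `reset_archiver_slot`, and `DRY_RUN` are hypothetical, and the slot-name check (letters, digits, and underscores only) is an added safety assumption. By default it only prints the commands.

```shell
#!/bin/bash
PSQL=/opt/vmware/vpostgres/current/bin/psql
ARCHIVE_DIR=/storage/archive/vpostgres

# Print the command when DRY_RUN is 1 (the default here); execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "DRY-RUN: $*"
  else
    "$@"
  fi
}

# Drop the replication slot, clear the archived segments, start the archiver.
# Argument: slot name from step 1 (defaults to vpg_archiver).
reset_archiver_slot() {
  local slot="${1:-vpg_archiver}"
  # Refuse names containing anything other than letters, digits, or underscores
  case "$slot" in
    *[!A-Za-z0-9_]*) echo "invalid slot name: $slot" >&2; return 2 ;;
  esac
  run "$PSQL" -U postgres -d VCDB -c "select pg_drop_replication_slot('$slot');" || return 1
  run rm "$ARCHIVE_DIR"/* || return 1
  run /bin/service-control --start vmware-postgres-archiver
}
```

Calling `reset_archiver_slot vpg_archiver` prints the three commands; `DRY_RUN=0 reset_archiver_slot vpg_archiver` executes them. Run it for real only after the backup or snapshot described above exists.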