Recover From VCSA 6.5 503 Service Unavailable

I noticed that my DR VCSA was down when I could not reach it from the Production VCSA in Enhanced Linked Mode.

I tried to connect directly to it and received the error:

503 Service Unavailable (Failed to connect to endpoint: [N7Vmacore4Http20NamedPipeServiceSpecE:0x00007f9b48016e80] _serverNamespace = / action = Allow _pipeName =/var/run/vmware/vpxd-webserver-pipe)

I SSH’ed to the DR VCSA and checked volume space with “df -h” and found /storage/seat was full.
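To spot a full volume without scanning the df output by eye, a small filter like the one below can help. This is only a minimal sketch: the function name full_volumes and the 90% threshold are my own assumptions, not something that ships with the VCSA.

```shell
# Print "mountpoint use%" for every filesystem at or above a threshold.
# Reads df output on stdin; the threshold defaults to 90 (an arbitrary choice).
full_volumes() {
  threshold="${1:-90}"
  awk -v t="$threshold" '
    NR > 1 {                      # skip the header line
      use = $5
      sub(/%/, "", use)           # "100%" -> "100"
      if (use + 0 >= t) print $6, $5
    }'
}

# -P keeps each filesystem on one line so the columns stay aligned.
df -hP | full_volumes 90
```

On an appliance in the state described here, /storage/seat would be the line that shows up.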

I drilled into each directory until I found the reason this volume was filling up. It looks to be an events table for the Postgres DB, so I will expand this volume from the host that the VCSA lives on.
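The drilling can be done one level at a time with du. The helper below is a rough sketch of that idea; the name biggest_dirs and the default of 10 entries are assumptions of mine, and it relies on the GNU du option --max-depth.

```shell
# List the largest entries directly under a directory, biggest first.
# The first line of the output is the total for the directory itself.
biggest_dirs() {
  dir="${1:-/storage/seat}"
  count="${2:-10}"
  # -x stays on one filesystem, -k reports KiB so a numeric sort works.
  du -xk --max-depth=1 "$dir" 2>/dev/null | sort -rn | head -n "$count"
}

# Example: biggest_dirs /storage/seat
# Then repeat on the largest subdirectory it reports.
```

Repeating the call on the top result each time walks down to whatever is eating the space.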

Once the disk is expanded, run “vpxd_servicecfg storage lvm autogrow” to grow the volume, and then “df -h” to verify the volume shows free space.
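If you would rather not compare the df output by hand, the before and after sizes can be checked in a small wrapper. This is a sketch under my own assumptions: fs_size_kb and grow_and_verify are hypothetical helper names, and the grow command is simply whatever you pass in.

```shell
# Size of the filesystem mounted at $1, in KiB (POSIX df format, column 2).
fs_size_kb() {
  df -Pk "$1" | awk 'NR == 2 { print $2 }'
}

# Run a grow command and succeed only if the mount point got bigger.
grow_and_verify() {
  mount_point="$1"
  shift
  before=$(fs_size_kb "$mount_point")
  "$@" || return 1                # the grow command itself failed
  after=$(fs_size_kb "$mount_point")
  echo "$mount_point: ${before} KiB -> ${after} KiB"
  [ "$after" -gt "$before" ]
}

# On the appliance, using the command from this post:
# grow_and_verify /storage/seat vpxd_servicecfg storage lvm autogrow
```

A non-zero exit here means either the command failed or the volume did not grow, which usually points back at the disk not having been expanded on the host side.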

Then restart the VCSA or start all services with “service-control --start --all”.
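The services take a while to come up, so it can be handy to poll the web endpoint until the 503 goes away. The loop below is a sketch only: wait_for_ui and http_status are names I made up, the URL in the example is a placeholder, and the retry count and delay are arbitrary.

```shell
# Print the HTTP status code for a URL (-k skips certificate validation,
# which is common on a lab VCSA with a self-signed certificate).
http_status() {
  curl -k -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Poll until the endpoint answers with a 2xx or 3xx status, or give up.
wait_for_ui() {
  url="$1"
  tries="${2:-30}"
  delay="${3:-10}"
  n=1
  while [ "$n" -le "$tries" ]; do
    code=$(http_status "$url")
    case "$code" in
      2??|3??) echo "up after $n attempt(s): HTTP $code"; return 0 ;;
    esac
    echo "attempt $n: HTTP ${code:-none}, retrying in ${delay}s" >&2
    sleep "$delay"
    n=$((n + 1))
  done
  return 1
}

# Example (hypothetical hostname):
# wait_for_ui https://dr-vcsa.example.local/ 30 10
```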

Now that vCenter was back up, I was able to edit the events and tasks database settings to make sure automatic cleanup of old tasks and events is enabled.

I wasn’t seeing much space reclaimed, so I manually cleaned up the database with a script VMware has provided.

To prune the old tasks and events right away, get the script from https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2110031

Use WinSCP to copy the script to the /tmp directory. Here is a link on using WinSCP with VCSA 6.5.

This is what I ran. I chose to prune down to 3 days because it is a lab and I wanted to see a significant change.

/opt/vmware/vpostgres/current/bin/psql -U postgres -v TaskMaxAgeInDays=3 -v EventMaxAgeInDays=3 -v StatMaxAgeInDays=15 -d VCDB -t -q -f /tmp/2110031_Postgres_task_event_stat.sql
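Since the three retention values are the only part of that command that changes, it can be wrapped so a typo in one of them is caught before psql runs. The wrapper below is a sketch: prune_vcdb and the DRY_RUN switch are my own inventions, while the psql path, flags and script name are copied from the command above.

```shell
PSQL=/opt/vmware/vpostgres/current/bin/psql
SCRIPT=/tmp/2110031_Postgres_task_event_stat.sql

# Usage: prune_vcdb TASK_DAYS EVENT_DAYS STAT_DAYS
# Set DRY_RUN=1 to print the command instead of running it.
prune_vcdb() {
  for days in "$1" "$2" "$3"; do
    case "$days" in
      ''|*[!0-9]*)
        echo "retention must be a whole number of days: '$days'" >&2
        return 2 ;;
    esac
  done
  # Rebuild the argument list as the full psql command line.
  set -- "$PSQL" -U postgres \
    -v TaskMaxAgeInDays="$1" -v EventMaxAgeInDays="$2" -v StatMaxAgeInDays="$3" \
    -d VCDB -t -q -f "$SCRIPT"
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Example: DRY_RUN=1 prune_vcdb 3 3 15
```

Running it with DRY_RUN=1 first is a cheap way to confirm the values before anything is deleted.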

Here is the output from running the script to cleanup the DB:

root@DR-VCN-01 [ /opt/vmware/vpostgres/current/bin ]# /opt/vmware/vpostgres/current/bin/psql -U postgres -v TaskMaxAgeInDays=3 -v EventMaxAgeInDays=3 -v StatMaxAgeInDays=15 -d VCDB -t -q -f /tmp/2110031_Postgres_task_event_stat.sql
psql.bin:/tmp/2110031_Postgres_task_event_stat.sql:424: WARNING: Database cleanup may take long time depends from size of VPX_TASK, VPX_EVENT, VPX_SAMPLE_TIME1, VPX_SAMPLE_TIME2, VPX_SAMPLE_TIME3, VPX_SAMPLE_TIME4 and all VPX_HIST_STATx_y tables at 2017-10-22 04:41:41.386445+00


psql.bin:/tmp/2110031_Postgres_task_event_stat.sql:427: WARNING: Starting clean up tasks at 2017-10-22 04:41:41.38707+00


psql.bin:/tmp/2110031_Postgres_task_event_stat.sql:432: WARNING: Starting clean up events at 2017-10-22 04:41:41.730353+00


psql.bin:/tmp/2110031_Postgres_task_event_stat.sql:438: WARNING: Starting clean up statistics level 1. at 2017-10-22 04:43:30.215752+00


psql.bin:/tmp/2110031_Postgres_task_event_stat.sql:442: WARNING: Starting clean up statistics level 2. at 2017-10-22 04:43:30.226072+00


psql.bin:/tmp/2110031_Postgres_task_event_stat.sql:446: WARNING: Starting clean up statistics level 3. at 2017-10-22 04:43:30.226904+00


psql.bin:/tmp/2110031_Postgres_task_event_stat.sql:450: WARNING: Starting clean up statistics level 4. at 2017-10-22 04:43:30.308779+00


psql.bin:/tmp/2110031_Postgres_task_event_stat.sql:454: WARNING: Starting clean up orpan data at 2017-10-22 04:43:30.370942+00


psql.bin:/tmp/2110031_Postgres_task_event_stat.sql:471: WARNING: Done at 2017-10-22 04:43:50.255839+00


root@DR-VCN-01 [ /opt/vmware/vpostgres/current/bin ]# df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 4.9G 0 4.9G 0% /dev
tmpfs 4.9G 16K 4.9G 1% /dev/shm
tmpfs 4.9G 676K 4.9G 1% /run
tmpfs 4.9G 0 4.9G 0% /sys/fs/cgroup
/dev/sda3 11G 5.0G 5.2G 50% /
tmpfs 4.9G 3.0M 4.9G 1% /tmp
/dev/mapper/netdump_vg-netdump 985M 1.3M 932M 1% /storage/netdump
/dev/sda1 120M 28M 87M 24% /boot
/dev/mapper/autodeploy_vg-autodeploy 12G 26M 12G 1% /storage/autodeploy
/dev/mapper/imagebuilder_vg-imagebuilder 17G 28M 16G 1% /storage/imagebuilder
/dev/mapper/core_vg-core 25G 45M 24G 1% /storage/core
/dev/mapper/dblog_vg-dblog 15G 839M 14G 6% /storage/dblog
/dev/mapper/updatemgr_vg-updatemgr 99G 404M 93G 1% /storage/updatemgr
/dev/mapper/db_vg-db 14G 100M 13G 1% /storage/db
/dev/mapper/seat_vg-seat 32G 164M 30G 1% /storage/seat
/dev/mapper/log_vg-log 15G 612M 14G 5% /storage/log

root@DR-VCN-01 [ /opt/vmware/vpostgres/current/bin ]#

It took around 2 minutes to prune 27GB from the database.
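The "around 2 minutes" comes from the first and last timestamps in the script output. As a quick check of that arithmetic, here is a small sketch; elapsed_seconds is a helper name I made up, and it assumes a date command that accepts -d with this timestamp format (GNU date does).

```shell
# Seconds between two UTC timestamps given as 'YYYY-MM-DD HH:MM:SS'.
elapsed_seconds() {
  start=$(date -u -d "$1" +%s) || return 1
  end=$(date -u -d "$2" +%s) || return 1
  echo $((end - start))
}

# First WARNING line to the "Done" line, fractional seconds dropped.
elapsed_seconds '2017-10-22 04:41:41' '2017-10-22 04:43:50'   # prints 129
```

That works out to 2 minutes 9 seconds, with nearly all of it spent between the "clean up events" and "statistics level 1" lines.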

Checking free space with “df -h” again after running the script shows /storage/seat at only 1% used.

The post Recover from VCSA 6.5 503 Service Unavailable first appeared on Software Defined Blog.