Remove stale/duplicate SRM entries from PSC 2.0

Things to know.

  1. The ID in the SRM solution user should match the ID of the VRUI service. See the example below.

/usr/lib/vmidentity/tools/scripts/lstool.py list --url http://localhost:7080/lookupservice/sdk --type VCDR

RESULT SRM-5abc29cb-fc9a-46b4-a157-72be62820b34@vsphere.local

/usr/lib/vmidentity/tools/scripts/lstool.py list --url http://localhost:7080/lookupservice/sdk --type VRUI

RESULT Service ID: h5-dr-5abc29cb-fc9a-46b4-a157-72be62820b34

The IP addresses should also match.
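The comparison above can be sketched in Python. This is only an illustration: it assumes the solution user and the service ID both embed the same UUID, as in the example output, and the function names are hypothetical and not part of lstool.py.

```python
import re

# UUID pattern as it appears in the lstool.py output above.
UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
    re.IGNORECASE,
)


def extract_uuid(value):
    """Return the first UUID found in value (lower-cased), or None."""
    match = UUID_RE.search(value)
    return match.group(0).lower() if match else None


def entries_match(srm_solution_user, vrui_service_id):
    """True when both lookup service entries carry the same UUID."""
    srm_uuid = extract_uuid(srm_solution_user)
    vrui_uuid = extract_uuid(vrui_service_id)
    return srm_uuid is not None and srm_uuid == vrui_uuid


if __name__ == "__main__":
    print(entries_match(
        "SRM-5abc29cb-fc9a-46b4-a157-72be62820b34@vsphere.local",
        "h5-dr-5abc29cb-fc9a-46b4-a157-72be62820b34",
    ))
```

An entry whose UUID has no partner on the other side is a candidate for the stale entry to unregister.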

To remove a stale entry

/usr/lib/vmidentity/tools/scripts/lstool.py unregister --url http://localhost:7080/lookupservice/sdk --id h5-dr-5abc29cb-fc9a-46b4-a157-72be62820b34 --user 'administrator@vsphere.local' --password 'VMware123!' --no-check-cert

For all the other commands, including the Windows commands, please check out my original post.

https://paulsbestscriptlets.com/2018/02/27/remove-stale-srm-entries-from-psc/

Please find commands for 7.0 below:

/usr/lib/vmware-lookupsvc/tools/lstool.py list --url http://localhost:7090/lookupservice/sdk --type com.vmware.vcDr
/usr/lib/vmware-lookupsvc/tools/lstool.py list --url http://localhost:7090/lookupservice/sdk --type VCDR
/usr/lib/vmware-lookupsvc/tools/lstool.py list --url http://localhost:7090/lookupservice/sdk --type VrUi

/usr/lib/vmware-lookupsvc/tools/lstool.py unregister --url http://localhost:7090/lookupservice/sdk --id 'Service_ID' --user 'administrator@vsphere.local' --password 'password' --no-check-cert

Set NTP on all VMs ESXi

More and more we see NTP on VMs being overlooked. Now that we have COVID-19, infrastructure and system admins need an NTP strategy that is robust and does not impact production.

Here is a PowerCLI method to set all VMs to get their time from the ESXi hosts.

The post below covers the prerequisite step of setting NTP on ESXi.

https://paulsbestscriptlets.com/2018/07/05/ntp-check-with-powercli/

Step 1
Create the temp folder on your Windows machine and generate a CSV for the VMs in vCenter.
Get-View -ViewType VirtualMachine | Select Name,@{N='ToolsConfigInfo';E={$_.Config.Tools.syncTimeWithHost}} | Export-Csv "c:\temp\TimeSyncStatus.csv" -NoTypeInformation
Step 2
Create the PowerCLI script.

Copy all text below this line
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Import-Csv c:\temp\TimeSyncStatus.csv | foreach {
    $strNewVMName = $_.Name
    # Update VMware Tools config without reboot
    $vm = $strNewVMName
    $vmtest = Get-VM $vm | Get-View
    $vmConfigSpec = New-Object VMware.Vim.VirtualMachineConfigSpec
    $vmConfigSpec.Tools = New-Object VMware.Vim.ToolsConfigInfo
    $vmConfigSpec.Tools.syncTimeWithHost = $true
    $vmtest.ReconfigVM($vmConfigSpec)
}
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Copy all text above this line
Step 3
Save the file as TimeSync.ps1 and run it from PowerCLI.
.\TimeSync.ps1
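
The TimeSyncStatus.csv file from Step 1 can also be used to see which VMs still have time sync disabled, before or after running the script. Below is a rough Python sketch of that check. It assumes the CSV has a Name column and a ToolsConfigInfo column containing True or False, as produced by the Export-Csv command above; the sample rows and the function name are made up for illustration.

```python
import csv
import io


def vms_without_time_sync(csv_text):
    """Return names of VMs whose ToolsConfigInfo value is not 'True'."""
    unsynced = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Header case can vary, so compare column names case-insensitively.
        fields = {key.lower(): value for key, value in row.items() if key}
        sync_flag = (fields.get("toolsconfiginfo") or "").strip().lower()
        # An empty value is treated the same as False.
        if sync_flag != "true":
            unsynced.append(fields.get("name"))
    return unsynced


# Hypothetical sample in the shape Export-Csv -NoTypeInformation writes.
SAMPLE_CSV = (
    '"Name","ToolsConfigInfo"\n'
    '"vm01","True"\n'
    '"vm02","False"\n'
)

if __name__ == "__main__":
    print(vms_without_time_sync(SAMPLE_CSV))
```

To use it against the real file, read c:\temp\TimeSyncStatus.csv into a string and pass it to the function.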

The VCIX Journey.

Please read the article below to see how I managed to clear the VCAP Deploy exam. The Deploy and Design exams each offer a different set of challenges. The VCAP Deploy is entirely hands-on: navigating vCenter and deep diving into creating and managing storage, networking, vMotion, alarms, RPO and backups.

https://paulsbestscriptlets.com/2017/07/17/my-vcap-6-deploy-experience/

The VCAP Design was a different beast and proved very challenging. The exam is heavily weighted towards project management and coming to terms with the best design decisions that can be made for a wide range of scenarios. While I was studying, I found the vBrownbag video series invaluable.

Here is the blueprint I created in an easy-to-follow org chart.

It is very easy to go wrong in the exam. It will take you a full 2 hours to complete. I had 20 minutes to spare at the end, as this was the second time I had taken the exam. If you have finished the exam after 20 minutes then you have likely failed. For each part of every question I left a comment giving my reasons for the choice I made, so expect to take your time reading and rereading each question and justifying your answers. The training material was a massive amount of learning and it took me places I never would have gone :). Overall I got great benefit and insight from the exam.

My preparation had taken me to a state where I couldn’t have done much more so I was quietly confident.

On to the VCDX, dare I say it! 😊

Tampered Keystore vSphere Replication 8.x

This is a common issue and happens when we get a certain combination of vCenter 6.5 and most notably vSphere Replication 8.1.0.

To resolve the issue, follow the steps in the article below. However, there is a twist.

The twist is that the old password will actually be 'vmware' for steps 3 and 4, and for step 8 it will be the old password returned by the first command.

sed -i -- 's/old_password/new_password/g' /var/opt/apache-tomcat/conf/server.xml

Reboot a few times and ensure you can search the inventory with the FQDN. Adjust the hostname to match.

https://docs.vmware.com/en/vSphere-Replication/8.1/com.vmware.vsphere.replication-admin.doc/GUID-0481E271-A990-427E-AFE0-7345EB7B489E.html

To change the password for the hms-keystore.jks keystore, open the remote console of your vSphere Replication virtual machine and log in as root.
Obtain the current keystore password.
# /opt/vmware/hms/bin/hms-configtool -cmd list | grep keystore
Example of the output hms-keystore-password = old_password
Change the keystore password.
# /usr/java/default/bin/keytool -storepasswd -storepass old_password -new new_password -keystore /opt/vmware/hms/security/hms-keystore.jks
Change the vSphere Replication appliance private key password.
The following command is a long, single command and must be run at once. There are breaks in the command for better visibility. Verify that the command returns a success message.
# /usr/java/default/bin/keytool -keypasswd -alias jetty -keypass
old_password -new new_password -storepass new_password -keystore
/opt/vmware/hms/security/hms-keystore.jks
Update the configuration with the new password.
# /opt/vmware/hms/bin/hms-configtool -cmd reconfig -property 'hms-keystore-password=new_password'
Update the tomcat server.xml file with the new password.
# sed -i -- 's/old_password/new_password/g' /var/opt/apache-tomcat/conf/server.xml
Reboot the appliance for the changes to take effect.
# reboot
Use a supported browser to log in to the vSphere Replication VAMI.
The URL for the VAMI is https://vr-appliance-address:5480.
On the VR tab, click Configuration, and click Save and Restart Service.

SRM Health Check

The health check for SRM involves the following. 

 

  • Check protection groups for errors.
  • Check recovery plans:
  • Validate the resource mappings.  Ensure they include ESXi resource mappings.
  • Validate the SRA replications
  • Validate the NTP settings
  • Validate port settings
  • Run a test plan

Figure 1 )

 

Invalid VMs, check SRM settings. 

Figure 2 )

Figure 3 )

 

Validate the resource mappings.  Ensure they include ESXi resource mappings. 

 

SRM 8.X test recovery. Planned migration and recovery, cleanup and reprotect steps.

 

 

Figure 4 )

 

Validate the SRA replications

Figure 5 )

 

Validate the NTP settings

 

Ref: https://paulsbestscriptlets.wordpress.com/2018/07/05/ntp-check-with-powercli/

 

 

One-liners to speed up troubleshooting common NTP related issues. 

Connect-ViServer <vCenter address> -user administrator@vsphere.local -password VMware123!

Check NTP Server is configured.

Get-VMHost | Sort Name | Select Name, @{N="NTPServer";E={$_ | Get-VMHostNtpServer}}, @{N="ServiceRunning";E={(Get-VmHostService -VMHost $_ | Where-Object {$_.Key -eq "ntpd"}).Running}}

Check timestamp for ESXi host.

foreach($esxcli in Get-VMHost | Get-EsxCli){"" | select @{n='Time';e={$esxcli.system.time.get()}},@{n='hostname';e={$esxcli.system.hostname.get().hostname}} }

Set time manually on esxi host. 

Directly on host: esxcli system time set -H 03 -m 29 -s 40

For all hosts.

$esxlist = Get-VMHost

foreach ($vmhost in $esxlist) {$esxcli = Get-EsxCli -VMHost $vmhost; $esxcli.system.time.get()}

foreach ($vmhost in $esxlist) {$esxcli = Get-EsxCli -VMHost $vmhost; $esxcli.system.time.set(10,09,53,01,20,2019)}

The parameters are, in this order:

(day,hour,minute,month,second,year)
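
Because the arguments are positional and not in chronological order, it is easy to swap month and minute by mistake. The small Python sketch below only illustrates the ordering: it takes a normal date and time and returns the values in the (day, hour, minute, month, second, year) order listed above. The function name is hypothetical and nothing here talks to a host.

```python
from datetime import datetime


def esxcli_time_set_args(when):
    """Order a datetime as (day, hour, minute, month, second, year)."""
    return (when.day, when.hour, when.minute, when.month, when.second, when.year)


if __name__ == "__main__":
    # 10 January 2019, 09:53:20: the values used in the example above.
    print(esxcli_time_set_args(datetime(2019, 1, 10, 9, 53, 20)))
```

For 10 January 2019 at 09:53:20 this gives (10, 9, 53, 1, 20, 2019), which matches the values passed to set() above.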

 

Figure 6 )

 

Validate port settings

 

http://vmwaresrmguru.blogspot.com/

 

 

https://kb.vmware.com/s/article/1009562

 

 

 

Figure 7 )

 

Run a test plan

 

VMware Tools: is it fully installed, and what happens during a Site Recovery test failover?

Here is an example of a failed test of a recovery plan.

018-09-12T14:03:08.744-03:00 error vmware-dr[00352] [SRM@6876 sub=Recovery ctxID=652c898a opID=5688b3ef-4d94-4134-aa1d-a61e19733b64:f43d:6d7e:a4ba:1a65:823:2d26] [df57903e-fbaa-409e-adfb-024e004a9ff9.failoverOrchJob] IP customization failed for VM vmName01 [vm-318]: (vmodl.fault.SystemError) {
–> faultCause = (vmodl.MethodFault) null,
–> faultMessage = <unset>,
–> reason = “vix error codes = (1, 2).
–> ”
–> msg = “Received SOAP response fault from [<cs p:000002bfc3b152c0, TCP:dr-vcenter-01:443>]: createTemporaryDirectory
–> A general system error occurred: vix error codes = (1, 2).
–> ”
–> }
–> [context]zKq7AVMEAAgAAIIDkgAPdm13YXJlLWRyAAAKPAJ2bWFjb3JlLmRsbAAB9U0Adm1vbWkuZGxsAAF5nAEC7AsDZHItdm1vbWkuZGxsAAN3EGZkci1yZWNvdmVyeS5kbGwAAztfsgNpOxkENvgFZnVuY3Rpb25hbC5kbGwAAJsgGwD7LxsAiSQhBYdPAk1TVkNSMTIwLmRsbAAFLlECBmSDAEtFUk5FTDMyLkRMTAAHsXAGbnRkbGwuZGxsAA==[/context]
–> [backtrace begin] product: VMware vCenter Site Recovery Manager, version: 8.1.0, build: build-9569154, tag: vmware-dr, cpu: x86_64, os: windows, buildType: release
–> backtrace[03] vmacore.dll[0x00023C0A]
–> backtrace[04] vmomi.dll[0x00004DF5]
–> backtrace[05] vmomi.dll[0x00019C79]
–> backtrace[06] dr-vmomi.dll[0x00030BEC]
–> backtrace[07] dr-recovery.dll[0x00661077]
–> backtrace[08] dr-recovery.dll[0x00B25F3B]
–> backtrace[09] dr-recovery.dll[0x00193B69]
–> backtrace[10] functional.dll[0x0005F836]
–> backtrace[11] vmacore.dll[0x001B209B]
–> backtrace[12] vmacore.dll[0x001B2FFB]
–> backtrace[13] vmacore.dll[0x00212489]
–> backtrace[14] MSVCR120.dll[0x00024F87]
–> backtrace[15] MSVCR120.dll[0x0002512E]
–> backtrace[16] KERNEL32.DLL[0x00008364]
–> backtrace[17] ntdll.dll[0x000670B1]
–> [backtrace end]

In SRM-UI only “A general system error occurred: vix error codes = (1, 2).” is shown.

 

 

VIX is the failback authentication process and it does not always trigger correctly due to timeouts and other issues that might be happening in vCenter at the time of the test failover.

Below is what I expect to see when the VMware Authentication service is installed.

It should look like this.

Image_2019-06-14_09-25-03.png

Linux

Capture

According to other articles, the service is not installed by default and you need to select the extra service during installation. Here is another article, where the .exe gets removed during a VMware Tools upgrade!

https://kb.vmware.com/s/article/2144610

It becomes very complex if we go deeper, as in the thread below.

https://stackoverflow.com/questions/14888390/the-vmware-authorization-service-is-not-running

The takeaways are that SRM is super at checking whether VMware Tools is installed correctly, and what we should check in similar scenarios.

Bulk IP Customization

Many times it’s not practical to modify the IP addresses of every individual VM as they are configured.  Luckily VMware has provided a way to bulk upload IP addresses.

From an SRM server, open a command prompt and change the working directory to: C:\Program Files\VMware\VMware vCenter Site Recovery Manager\bin

NOTE: Path may be different depending on your install location.

Generate a .CSV file to edit your IP addresses by running dr-ip-customizer.exe with the --cfg, --cmd, --vc, -i and --out switches.

--cfg should be the location of the vmware-dr.xml file, --cmd should be "generate", --vc lists the vCenter server, and --out lists the location to generate the .csv file.

Example: dr-ip-customizer.exe --cfg "C:\Program Files\VMware\VMware vCenter Site Recovery Manager\Config\vmware-dr.xml" --cmd generate --vc FQDNofvCenter -i --out c:\ipaddys.csv

Open the .csv file and fill out the information.  Notice that there are two entries for the VM.  This is because there are two vCenters and in order to do protection and fail back we need the IP Addresses for both sides.
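
Before applying the file, it can help to confirm that every VM really does have rows for both vCenters. The Python sketch below is one way to do that. The column names "VM Name" and "vCenter Server" are assumptions about the generated CSV header, and the sample rows are invented, so check them against your own file first.

```python
import csv
import io
from collections import defaultdict


def vms_missing_a_site(csv_text, expected_sites=2):
    """Return VM names that have rows for fewer vCenters than expected."""
    sites_per_vm = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Column names are assumed; adjust to match the generated header.
        sites_per_vm[row["VM Name"]].add(row["vCenter Server"])
    return sorted(
        name for name, sites in sites_per_vm.items() if len(sites) < expected_sites
    )


# Hypothetical sample: vm02 has an entry for only one vCenter.
SAMPLE_IP_CSV = (
    "VM Name,vCenter Server,IP Address\n"
    "vm01,vc-prod.example.local,10.0.0.10\n"
    "vm01,vc-dr.example.local,10.1.0.10\n"
    "vm02,vc-prod.example.local,10.0.0.11\n"
)

if __name__ == "__main__":
    print(vms_missing_a_site(SAMPLE_IP_CSV))
```

Any VM name returned has an entry for only one side, so protection or failback would be missing its IP settings.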

Once the IP address information is entered, run the customizer again with --cmd "apply" and the --csv file location.

Example: dr-ip-customizer.exe --cfg "C:\Program Files\VMware\VMware vCenter Site Recovery Manager\Config\vmware-dr.xml" --cmd apply --vc FQDNofvCenter -i --csv c:\ipaddys.csv

Configuring VMware Switch for NLB

For all those using load balancing with VMware, you will need to configure an NLB portgroup to avoid port flooding. To prevent RARP packets being sent every time a VM is vMotioned or powered on, you will need to set Notify Switches to No on the required portgroups. If you are using unicast, you will also need to set the Security Policy Forged Transmits to Accept.

 

 

 

Nice article Ryan.

 

Configuring VMware Switch for NLB

 

 

How to create a vSAN iSCSI LUN

10 Ways to Troubleshoot Poor vSphere Performance

25 Sep 2017 by Jason Fenech

A poorly performing virtual machine is probably one of the topmost ailments you’ll bump into as a virtualization admin. The issue is also one of the hardest nuts to crack due to its multifaceted nature. Regardless, there are a number of things you can do to read the symptoms, narrow down the cause and apply a fix. Taking hints from this VMware KB, today we explore 10 ways to troubleshoot poor vSphere performance. Have a look at this Altaro webinar for a more holistic approach to boosting vSphere performance.

 

Performance Monitoring


Before we move on, let’s revisit the Performance monitoring function embedded in vSphere clients. This is one tool that will help you examine performance related issues. Figure 1 shows a performance chart for datastore read and write latencies for a VM using the vSphere Web client. The 3ms peaks observed are well within the acceptable range, however, sustained levels exceeding 10ms are indicative of a looming storage issue or perhaps network congestion.
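
To make the difference between a short peak and a sustained level concrete, here is a small Python sketch. It flags runs of consecutive latency samples above 10 ms. The article does not define how long "sustained" is, so the default of three consecutive samples is purely an assumption, and the function name and sample values are made up.

```python
def sustained_breaches(samples_ms, threshold_ms=10.0, min_consecutive=3):
    """Return (start, end) index pairs where latency stays above threshold_ms
    for at least min_consecutive samples in a row (end is inclusive)."""
    runs = []
    run_start = None
    for index, value in enumerate(samples_ms):
        if value > threshold_ms:
            if run_start is None:
                run_start = index
        else:
            if run_start is not None and index - run_start >= min_consecutive:
                runs.append((run_start, index - 1))
            run_start = None
    # Close a run that is still open at the end of the series.
    if run_start is not None and len(samples_ms) - run_start >= min_consecutive:
        runs.append((run_start, len(samples_ms) - 1))
    return runs


if __name__ == "__main__":
    # One isolated 12 ms peak, then four samples in a row above 10 ms.
    print(sustained_breaches([2, 3, 12, 2, 11, 12, 13, 14, 3]))
```

With the sample data, the single 12 ms peak is ignored and only the run of four samples is reported.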

Use alarms wherever possible so you’re always on top of any performance issue. Alternatively, consider deploying vRealize Operations Manager for a more in-depth assessment of your environment.

Figure 1 – Performance graph for datastore read and write latencies

 

How to troubleshoot poor vSphere performance


The steps, or rather, questions you should ask yourself, are listed in an orderly fashion starting with the most trivial. Re-evaluate the performance of the affected VM after each step if you decide to try out a relevant fix. You can then choose to skip to the next step depending on the level of observed improvement if any. If you come across something as glaringly obvious as a failed disk on a host, it goes without saying that you’d want to fix this first before moving on!

 

1 – Is it really unexpected behavior?

A VM subjected to a heavy workload can sometimes be perceived as performing poorly. Some examples are virtualized instances of SQL servers, processor intensive or badly written SQL queries or mail servers with large user bases. The performance monitoring charts in vSphere Web client will help you gauge resource utilization across a given period of time so you can assess if the behavior anomaly was a one-off or ongoing and then gauge whether the behavior is expected or not. Products such as MS SQL and Exchange Server will, by design, take up any RAM thrown at them, bleeding memory from the VM’s guest OS unless otherwise configured. With that in mind, it’s always a good idea to refer to the product’s documentation.

 

2 – Are you running the latest product?

Updates and new releases may address performance issues in the form of ironed out bugs or improved drivers and code. Sometimes, however, the latest release could, in fact, make the problem even worse. So test, test and test again before taking the leap or at least wait until there’s a sufficient uptake of the new release or update, so you can make an informed decision.

 

3 – Are your VMs running VMware Tools?

Make sure that vmtools are installed, updated and running on every VM that supports them. The VMware Tools package, above all, provides a set of optimized virtual device drivers that directly affect VM performance (for the better usually). Again, using the vSphere Web client, you can easily check your overall vmtools health as shown in Fig. 2. Remember to add the vmtools fields by right-clicking on the fields header and selecting them accordingly.

Figure 2 – Displaying the vmtools status for VMs managed by vCenter Server

 

Alternatively, you could cook yourself a PowerCLI script that checks for the vmtools package and its current state. The bulk of the properties related to vmtools is found under <vm>.guest.extensiondata.

Figure 3 – Using PowerCLI to query the state of vmtools on VMs

 

 

4 – Is your VM adequately powered in terms of resources?

Though seemingly obvious, you’d be surprised as to how many VMs are not assigned sufficient resources as per the guest OS requirements and the applications running under it. Remember that, regardless of the benefits virtualization brings about, there are always overheads to contend with. For instance, if the VM runs out of RAM, it will start swapping to disk on a more frequent basis. If the underlying storage is flaky, performance will be badly hit. Whenever possible, use reservations, resource pools, and DRS to ensure that the correct amount of resources are assigned to a VM for maximum operational efficiency.

 

5 – Is antivirus software or similar running on ESXi?

Yes, even though rare in practice, you can, in fact, find antivirus software – think vShield – running on ESXi. This can adversely affect VM performance on a number of counts if it is not configured properly. One should also keep in mind that there is little justification for running AV software on ESXi given its small footprint and inbuilt security features. Best practice, in fact, calls for anti-malware software to be relegated to the VM’s guest OS. If you must install AV on ESXi, do make it a point to exclude VM files such as VMDKs from scanning schedules especially during peak utilization hours.

AV software aside, there’s also Backup and other I/O intensive software that may adversely impact VM performance.

 

6 – Is your underlying storage healthy?

Whether you’re using local or SAN-based datastores, it all boils down to the performance and health of your disks and the underlying sub-systems housing them. Simply put, if VMs do not get their fair share of IOPS, performance will start degrading. Here are a few things you can check and do:

Bad disks: Run regular health checks on your disk / networked storage and replace aging or failing disks immediately.

ESXi OS: Use separate disk(s) for the ESXi host’s OS, the swap partition, and VMs residing on local datastores. Also, consider using RAID to improve read and write performance.

Snapshots: Delete any unused or redundant snapshots. The more snapshots you have, the greater the disk overheads will be vis-à-vis I/O activity.

Encryption: Use disk encryption only when necessary. Encryption = overheads = decreased performance.

 

7 – Do your ESXi hosts have enough resources?

Running a dozen or so VMs configured with 16GB of RAM concurrently on a single ESXi host that has only 96GB of RAM is simply asking for trouble. Consider adding RAM to the host or use DRS – if you have multiple ESXi hosts and proper licensing – for better load distribution.
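
The arithmetic behind that example is worth spelling out. The Python sketch below compares configured VM memory with physical host memory. It is a deliberately simple illustration that ignores per-VM overhead and memory reclamation techniques, and the function name is hypothetical.

```python
def memory_overcommit_ratio(vm_count, vm_ram_gb, host_ram_gb):
    """Configured VM memory divided by physical host memory."""
    return (vm_count * vm_ram_gb) / host_ram_gb


if __name__ == "__main__":
    ratio = memory_overcommit_ratio(12, 16, 96)
    print(f"Configured: {12 * 16} GB, host: 96 GB, ratio: {ratio:.1f}x")
```

Twelve VMs at 16GB each are configured for 192GB in total, twice what the 96GB host physically has.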

 

8 – Do you have CPU power management enabled?

CPU power management, when enabled on ESXi servers, may introduce speed latency that can be picked up by applications or workloads resulting in a slower performance. If you suspect this to be the case, do consult the vendor documentation on how to disable CPU power management. If disabling it has no effect, re-enable it in the spirit of running energy-friendly data centers.

 

9 – Is everything good on the networking front?

Make sure that your ESXi host networking does not become a bottleneck preventing VMs from running and operating optimally. Symptoms may range from a laggy response when connecting to VMs via remote clients or management consoles to overly lengthy vMotion transfers. Make sure that the network cards on your hosts are correctly configured. If your infrastructure permits it, partition or segregate network traffic. Run services such as management, vMotion and storage on their own dedicated network. Use optimized TCP/IP stacks and things like Jumbo frames where applicable. Make sure that the firmware on any networking hardware thrown in the mix is up to date. Finally, do not exclude issues with the virtual switches. Check your portgroups, VLAN assignment and so on.

 

10 – Have you checked your ESXi OS and hardware health lately?

Just like any other system, ESXi needs regular maintenance for it to operate at full throttle both from a hardware and operating system perspective. Purple screens aside, you cannot immediately tell if there’s some issue brewing just waiting to rear its ugly head on a weekend night. Make sure to monitor disk usage and use the health monitoring software that generally comes bundled with your server(s) or products like PRTG.

 

Wrap Up


This pretty much sums up today’s post on how to improve VM performance. That said, there are probably more factors affecting VM performance and ways and means to tackle them. I suggest you read the material referenced by the links provided throughout this post and other posts, such as this one on DRS, for more information. Also, get yourself into the habit of visiting sites such as the VMware Technology Network where you’ll find like-minded people sharing similar queries, problems, and potential solutions.

Finally, be sure to check out our dedicated ebook: vSphere Troubleshooting Guide by vExpert Ryan Birk

 

Thank you Jason for such a great post.

 

ref:https://www.altaro.com/vmware/troubleshooting-vm-performance/