This is expected behavior. In vSphere 6.7 Update 1 and later, DRS supports initial placement of vGPU VMs but does not load balance them.
Starting with vSphere 8.0 U2, DRS can estimate the Stun Time for a given vGPU VM configuration. When the DRS Cluster Advanced Options are set and the Estimated VM Devices Stun Time for a VM is lower than the VM Devices vMotion Stun Time limit, DRS will automate VM migrations.
To enable this functionality, make sure the infrastructure meets the following requirements:
Then add the following DRS Cluster Advanced Options:
Option: PassthroughDrsAutomation Value: 1
Option: LBMaxVmotionPerHost Value: 1
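The eligibility rule described above can be modeled in a few lines. This is only an illustrative sketch, not a VMware API: the function `drs_may_migrate` and the plain dictionary of options are hypothetical, and the sketch assumes that "options are set" means both options have the value 1.

```python
# Illustrative model of the rule in this article; not a VMware API.
DEFAULT_STUN_LIMIT_S = 100  # default vMotion Stun Time limit stated in the article

# Assumption: both cluster advanced options must be set to 1.
REQUIRED_OPTIONS = {
    "PassthroughDrsAutomation": "1",
    "LBMaxVmotionPerHost": "1",
}


def drs_may_migrate(cluster_options, estimated_stun_s,
                    stun_limit_s=DEFAULT_STUN_LIMIT_S):
    """Return True if, under the stated assumptions, DRS would automate
    migration of a vGPU VM with the given estimated stun time."""
    options_set = all(
        str(cluster_options.get(key)) == value
        for key, value in REQUIRED_OPTIONS.items()
    )
    # The estimated stun time must be strictly lower than the limit.
    return options_set and estimated_stun_s < stun_limit_s


if __name__ == "__main__":
    opts = {"PassthroughDrsAutomation": 1, "LBMaxVmotionPerHost": 1}
    print(drs_may_migrate(opts, estimated_stun_s=35))
    print(drs_may_migrate(opts, estimated_stun_s=140))
```

With both options set, a VM estimated at 35 seconds falls under the default limit and qualifies, while one estimated at 140 seconds does not until its limit is raised.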
For vGPU VMs with Stun Times exceeding the "vMotion Stun Time Limit" (default 100 seconds), a VI Admin can add the following DRS Cluster Advanced Option:
Option: VmDevicesStunTimeTolerated Value: <number of seconds, greater than any VM's Estimated Stun Time in the cluster> (default 100 seconds)
OR
Modify the "vMotion Stun Time Limit" in the VM's Configuration -> "VM Options" Tab -> "Advanced" Section
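For the cluster-wide option, the value must be greater than every VM's Estimated Stun Time. A minimal sketch of that arithmetic follows; the helper `tolerated_stun_value` and its `margin_s` parameter are hypothetical, and the estimated stun times are assumed to have been collected by hand from each VM's configuration.

```python
import math


def tolerated_stun_value(estimated_stun_times_s, margin_s=1):
    """Return a whole number of seconds strictly greater than the largest
    estimated stun time in the list (hypothetical helper)."""
    if not estimated_stun_times_s:
        raise ValueError("at least one estimated stun time is required")
    if margin_s < 1:
        raise ValueError("margin_s must be at least 1 second")
    # floor(x) + 1 is always strictly greater than x, so any margin >= 1 works.
    return math.floor(max(estimated_stun_times_s)) + margin_s


if __name__ == "__main__":
    # Example estimates in seconds for three vGPU VMs (made-up values).
    print(tolerated_stun_value([42.0, 130.5, 99]))
```

A larger `margin_s` leaves headroom in case the estimates change after a VM is reconfigured.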
For older releases, follow these steps to resolve the issue:
For additional information, refer to Using vMotion to Migrate vGPU Virtual Machines.