In large-scale deployments, GPU servers cannot be shared or pooled. Each typically serves a narrowly targeted, localized application, and GPU utilization averages below 15 percent, even for advanced users.
A GPU card can be configured in one of two modes: vSGA (shared virtual graphics) or vGPU. For compute workloads, such as machine learning or high performance computing applications, configure the NVIDIA card in vGPU mode. To make this change, access the ESXi host either through the ESXi Shell or over SSH.
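As a rough illustration of the mode check, the Python sketch below inspects text captured from `esxcli graphics host get` on the host and reports which command, if any, would switch the default graphics type to vGPU. It rests on assumptions not stated in the text above: that esxcli reports vSGA as `Shared` and vGPU as `SharedPassthru`, and that the output contains a `Default Graphics Type:` line. The function names are hypothetical helpers, not part of any VMware or NVIDIA tool.

```python
# Sketch only: decides whether an ESXi host needs to be switched to vGPU mode,
# given text captured from `esxcli graphics host get`.
# Assumption: esxcli names the two modes "Shared" (vSGA) and "SharedPassthru" (vGPU).

VGPU_TYPE = "SharedPassthru"  # assumed esxcli name for vGPU mode
VSGA_TYPE = "Shared"          # assumed esxcli name for vSGA mode


def parse_default_graphics_type(esxcli_output):
    """Return the value of the 'Default Graphics Type' field, or None if absent.

    Assumes the output consists of 'Key: Value' lines.
    """
    for line in esxcli_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "default graphics type":
            return value.strip()
    return None


def command_to_enable_vgpu(esxcli_output):
    """Return the command to run on the host, or None if vGPU mode is already set."""
    if parse_default_graphics_type(esxcli_output) == VGPU_TYPE:
        return None
    return "esxcli graphics host set --default-type " + VGPU_TYPE


if __name__ == "__main__":
    # Illustrative sample text, not output captured from a real host.
    sample = (
        "   Default Graphics Type: Shared\n"
        "   Shared Passthru Assignment Policy: Performance\n"
    )
    print(command_to_enable_vgpu(sample))
```

The helper only builds the command string. Running it on the host, from the ESXi Shell or an SSH session, is left to the administrator, so the sketch can be tried safely against saved output first.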