Affine transformation virtual 3D object using a finger gesture-based interactive system in the virtual environment. A convolutional neural network (CNN) based thumb and index fingertip detection system are presented here for seamless interaction with a virtual 3D object in the virtual environment. First, a two-stage CNN is employed to detect the hand and fingertips, and using the information of the fingertip position, the scale, rotation, translation, and in general, the affine transformation of the virtual object is performed. This is the version 2.0 that includes a more generalized affine transformation of virtual objects in the virtual environment with more experimentation and analysis. Previous versions only include the geometric transformation of a virtual 3D object with respect to a finger gesture. Paper for the affine transformation of the virtual 3D object has been published in Virtual Reality & Intelligent Hardware, Elsevier Science Publishers in 2020.