Cryo-electron microscopy (cryo-EM) can produce reconstructed 3D images of biomolecules at near-atomic resolution. However, raw cryo-EM images are highly corrupted 2D projections of the target 3D biomolecules. Reconstructing the 3D molecular shape requires estimating the orientation of the biomolecule that produced each 2D image, as well as the camera parameters needed to correct for intensity defects. Current techniques for these tasks are often computationally expensive, while dataset sizes keep growing. There is a need for next-generation algorithms that preserve accuracy while improving speed and scalability. In this paper, we use variational autoencoders (VAEs) to learn a low-dimensional latent representation of cryo-EM images. Analyzing the latent space with the differential geometry of shape spaces leads us to design a new method for estimating the orientation and camera parameters of single-particle cryo-EM images, which has the potential to accelerate the traditional reconstruction algorithm.