Gases and plasmas can be modeled either statistically, as a collection of discrete particles, or as a continuum, as a continuous distribution. The velocities of a collection of discrete particles are often modeled with a Maxwellian distribution, which is useful in many scenarios but limited by its assumption of thermal equilibrium. In this work, we develop an architecture that learns a low-dimensional, general parameterization of the velocity distribution from scientific instrument plasma data. Such parameterizations have direct applications in data compression and in simplifying downstream learning algorithms. We verify that the dimensionally reduced distribution preserves the key underlying physics of the data after reconstruction, specifically the fluid parameters derived from the instrument plasma moments (e.g., density, velocity, temperature). Finally, we present evidence for an information bottleneck arising from the relationship between the number of reduced parameters and the quality of the reconstructed fluid parameters. Applying the learned architecture to data compression, we achieve a 30x compression ratio with losses deemed acceptable.
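
The fluid parameters named above (density, velocity, temperature) are velocity moments of the distribution. As a minimal illustration, and not the architecture or instrument pipeline of this work, the sketch below samples a 1D drifting Maxwellian on a uniform velocity grid and recovers its moments by numerical integration. The function names, the 1D simplification, the proton mass, and the numerical values are all assumptions chosen for illustration only.

```python
import numpy as np

K_B = 1.380649e-23    # Boltzmann constant, J/K
M_P = 1.67262192e-27  # proton mass, kg (assumed particle species)


def maxwellian_1d(v, n, u, T, m=M_P):
    """1D drifting Maxwellian f(v) with density n, bulk speed u, temperature T."""
    norm = n * np.sqrt(m / (2.0 * np.pi * K_B * T))
    return norm * np.exp(-m * (v - u) ** 2 / (2.0 * K_B * T))


def fluid_moments_1d(v, f, m=M_P):
    """Density, bulk velocity, and temperature of f sampled on a uniform grid v."""
    dv = v[1] - v[0]
    n = np.sum(f) * dv                                 # zeroth moment
    u = np.sum(v * f) * dv / n                         # first moment / density
    T = m * np.sum((v - u) ** 2 * f) * dv / (n * K_B)  # second central moment
    return n, u, T


# Hypothetical input values: density (m^-3), bulk speed (m/s), temperature (K).
n0, u0, T0 = 5.0e6, 4.0e5, 1.0e5
v = np.linspace(-1.0e6, 2.0e6, 4001)  # velocity grid, m/s
f = maxwellian_1d(v, n0, u0, T0)

n, u, T = fluid_moments_1d(v, f)
print(f"n = {n:.6e} m^-3, u = {u:.6e} m/s, T = {T:.6e} K")
```

Applying the same moment calculation to both an original and a reconstructed distribution is one simple way to quantify how well a reduced representation preserves these fluid parameters.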