Pulmonary embolism (PE) is a common, life-threatening condition that is challenging to diagnose, as patients often present with nonspecific symptoms. Prompt and accurate detection of PE, and in particular assessment of its severity, is critical for managing patient treatment. We introduce a diverse set of multimodal fusion models capable of utilizing weakly labeled multimodal data, combining volumetric pixel data with clinical patient data for automatic risk stratification of PE. The best-performing multimodal model is an intermediate fusion model that achieves an AUC of 0.96 for assessing PE severity, with a sensitivity of 90\% and a specificity of 94\%. To the best of our knowledge, this is the first study to attempt automatic assessment of PE severity.