In wildlife imagery, a model that assists human annotation faces two main challenges: (1) the training dataset is usually imbalanced, which biases the model's suggestions, and (2) the classes follow complex taxonomies. We establish a simple and efficient baseline that combines a debiasing loss function with a hyperbolic network architecture to address these issues, and it achieves noticeable improvements in image classification accuracy over a naive method. Moreover, we propose leveraging semantic correlation to train the model more effectively by adding a co-occurrence layer to the model during training. The proposed semantic correlation-based learning method significantly improves performance. We demonstrate the efficacy of our method on both our real-world wildlife aerial survey recognition dataset and the public image classification datasets CIFAR100-LT and CIFAR10-LT.