Wildlife poaching fuels the multi-billion dollar illegal wildlife trade and pushes countless species to the brink of extinction. To aid rangers in preventing poaching in protected areas around the world, we develop a multi-armed bandit approach to help rangers choose how much time to spend in each region, balancing exploration of infrequently visited regions and exploitation of known hotspots. However, naive bandit approaches compromise short-term performance for long-term optimality, resulting in more animals being poached while the algorithm learns. To accelerate learning, we leverage smoothness in the reward function and decomposability of actions. Additionally, when some species are more vulnerable, we ought to offer these animals greater protection; unfortunately, existing bandit approaches offer no way to prioritize important species. To bridge this gap, we propose a novel combinatorial bandit objective that trades off between reward maximization and species prioritization. We demonstrate that our approach improves performance on real-world poaching data from Cambodia.
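To make the abstract's idea concrete, here is a minimal illustrative sketch (not the paper's algorithm) of a UCB-style bandit over patrol regions whose selection index trades off estimated poaching-prevention reward against a per-region species-priority weight. All names, weights, and the simulated data below are hypothetical assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

n_regions = 5
# Hypothetical per-region species-vulnerability weights (higher = more vulnerable)
priority = np.array([0.1, 0.9, 0.2, 0.8, 0.3])
# True snare-detection probabilities, unknown to the learner (simulation only)
true_reward = np.array([0.5, 0.3, 0.7, 0.4, 0.6])
lam = 0.5  # trade-off between reward maximization and species prioritization

counts = np.zeros(n_regions)  # visits per region
means = np.zeros(n_regions)   # empirical mean reward per region

for t in range(1, 1001):
    # Standard UCB exploration bonus; the combined index adds the priority term
    bonus = np.sqrt(2 * np.log(t) / np.maximum(counts, 1))
    index = (1 - lam) * (means + bonus) + lam * priority
    index[counts == 0] = np.inf  # visit every region at least once
    a = int(np.argmax(index))
    r = rng.binomial(1, true_reward[a])  # simulated detection outcome
    counts[a] += 1
    means[a] += (r - means[a]) / counts[a]  # incremental mean update

print(counts.astype(int))  # patrol effort concentrates on high-index regions
```

Setting `lam = 0` recovers plain reward-maximizing UCB, while larger values shift patrol effort toward regions sheltering vulnerable species even when their expected detection reward is lower.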