Holistic Evaluation of Large Language Models for Medical Tasks with MedHELM