Multiple initiatives are currently underway to establish large biobanks with associated clinical and molecular data. The Million Veteran Program (MVP) was initiated by the VA in 2011 and has collected over 820,000 consented biosamples. The size and diversity of the MVP cohort, together with the availability of extensive VA electronic health records, make it a promising resource for precision medicine. We characterize the ethnic diversity of the initial data set and its implications for disease association studies and risk prediction. Over 29% of participants self-report a non-white ethnicity. MVP has substantial diversity in genetic ancestry and is on its way to meeting a pressing need for greater diversity in genome-wide association study (GWAS) analyses.