According to an article by The Verge, a new study shows that the data used to develop many medical algorithms came almost entirely from California, Massachusetts, and New York. “…according to the research published this week in the Journal of the American Medical Association. The narrow geographic distribution of the data used for these algorithms may be an unrecognized bias, the study authors argue.”

With AI, you need as much data as possible. There are many ways to analyze data, but in general, the larger and more diverse the data set, the more accurate the predictions. When using AI for medical purposes, it is extremely important to acknowledge the possible biases that developers and their data introduce, so that those biases can be eliminated as much as possible. Bias leads to errors and an unreliable solution, and in a medical context that could have serious consequences.

It is essential to uncover these biases early, before AI is heavily relied upon. Using AI for medicine requires abundant data from many different states, because climate and lifestyle vary by region and both affect health. If other states are not represented in the training data, patients in those states may not be served well by the technology. One simple way to surface the problem is to audit how the training data is distributed across states, as in the sketch below.
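The following is a minimal, illustrative sketch of such an audit. It assumes a hypothetical patient-records file (`patients.csv` with a `state` column) and a table of state population shares; the file names, column names, and the 0.5 threshold are all assumptions made for the example, not part of the study.

```python
# Illustrative sketch (hypothetical file and column names): audit how training
# data is distributed across US states and flag underrepresented ones.
import pandas as pd

# Hypothetical inputs: one row per patient with a "state" column, and a table
# of population shares per state ("state", "pop_share" summing to 1.0).
patients = pd.read_csv("patients.csv")
pop_share = pd.read_csv("state_population_share.csv").set_index("state")["pop_share"]

# Share of the training data contributed by each state.
data_share = patients["state"].value_counts(normalize=True)

# Compare data share to population share; states absent from the data get 0.
report = pd.DataFrame({
    "data_share": data_share,
    "pop_share": pop_share,
}).fillna({"data_share": 0.0})
report["ratio"] = report["data_share"] / report["pop_share"]

# Flag states represented at less than half their population share
# (0.5 is an arbitrary threshold chosen for this example).
underrepresented = report[report["ratio"] < 0.5].sort_values("ratio")
print(underrepresented)
```

A report like this makes the geographic skew described in the study visible at a glance: states contributing no records, or far fewer than their population would suggest, show up immediately.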

“It’s well-recognized that gender and racial diversity is important in those training sets: if an algorithm only gets men’s X-rays during training, it may not work as well when it’s given an X-ray from a woman who is hospitalized with difficulty breathing. But while researchers have learned to watch for some forms of bias, geography hasn’t been highlighted.”

It is interesting that geographic bias has only now been identified. As enterprises shift toward AI and rely on it more widely, the criteria for vetting these systems must become more disciplined. It is critical to identify and remove the biases that come from gaps and errors in the underlying data set, and doing so requires a truly representative data set.
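One way to catch this kind of bias after training, rather than reporting a single overall score, is to evaluate the model separately for each geographic group. This is a minimal sketch under assumed names: a hypothetical hold-out DataFrame with `state`, `label`, and `prediction` columns, using accuracy as an example metric.

```python
# Illustrative sketch: per-state evaluation of a trained classifier, assuming a
# hypothetical hold-out DataFrame with "state", "label", and "prediction" columns.
import pandas as pd
from sklearn.metrics import accuracy_score

def per_state_accuracy(holdout: pd.DataFrame) -> pd.Series:
    """Return accuracy for each state, so geographic performance gaps are visible."""
    return holdout.groupby("state").apply(
        lambda g: accuracy_score(g["label"], g["prediction"])
    )

# Example usage with the hypothetical hold-out set:
# holdout = pd.read_csv("holdout_with_predictions.csv")
# scores = per_state_accuracy(holdout)
# print(scores.sort_values())  # lowest-performing states first
```

If the per-state scores vary widely, that is a signal the training data was not representative, echoing the X-ray example above for geography instead of gender.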