Lena Nguyen, Class of 2021
Artificial intelligence (AI) has shown potential for diagnosing a variety of diseases through computer vision, the application of AI to visual tasks such as interpreting medical images. AI is frequently cited as a promising tool to assist physicians in diagnosing and treating patients. A study that compared the accuracy of pathologists identifying metastatic breast cancer in lymph node tissue with and without computer assistance found that computer-aided diagnosis (CAD) improved the detection of micrometastases (Steiner et al. 2018). This is especially promising because micrometastases are small clusters of cancer cells that have spread to other tissues but are too few in number to be detected by routine screening tests. Another study compared the performance of a convolutional neural network (CNN) with that of expert dermatologists in distinguishing melanomas from benign lesions, and found that the CNN outperformed the experts at identifying melanomas, though not at identifying benign lesions (Haenssle et al. 2018). However, the effectiveness of AI should not overshadow its many limitations, and the discussion should not be reduced to man versus machine.
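To make the idea of a CNN classifier less abstract, the sketch below shows a toy binary image classifier in PyTorch. It is purely illustrative and is not the network used by Haenssle et al.; the layer sizes, the 224x224 input, and the single melanoma-probability output are assumptions chosen for brevity.

```python
# Illustrative only: a minimal binary image classifier in PyTorch.
# This is NOT the architecture from Haenssle et al. (2018); all sizes are assumptions.
import torch
import torch.nn as nn

class TinyLesionClassifier(nn.Module):
    """Toy CNN that maps a dermoscopic-style image to a melanoma probability."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),  # learn local edge/texture filters
            nn.ReLU(),
            nn.MaxPool2d(2),                             # downsample 224x224 -> 112x112
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),                     # global average pooling
        )
        self.classifier = nn.Linear(32, 1)               # single logit: melanoma vs. benign

    def forward(self, x):
        h = self.features(x).flatten(1)
        return torch.sigmoid(self.classifier(h))         # probability of melanoma

model = TinyLesionClassifier()
fake_image = torch.rand(1, 3, 224, 224)                  # stand-in for a dermoscopic photo
print(model(fake_image))                                 # untrained, so roughly 0.5
```

In practice, networks like the one in the cited study are far deeper and are trained on many thousands of labeled images; the point here is only that the output is a probability for a single binary question.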
Claims promoting the potency of AI in diagnostics should be thoroughly scrutinized. A recent meta-analysis of CAD studies on melanoma found several recurring problems. One was the common use of public test sets, which led to significantly higher reported accuracy than independent test sets (Dick et al. 2019). Another was that most studies came solely from the field of computer science and rarely took into account important information available in a clinical setting, such as a patient's age or history of melanoma (Dick et al. 2019). Even the promising CNN study acknowledges significant limitations in its results. The test given to the CNN and the dermatologists offered only a positive or negative diagnosis as possible answers, which is not how real patients are evaluated. When the participants were instead allowed to recommend follow-up and other management decisions, the computer and the dermatologists achieved similar results (Haenssle et al. 2018). In addition, the study does not address possible nonresponse bias: only a small fraction of the dermatologists invited to participate actually did. Nonresponse bias can undermine a study's credibility, since many experts in the field may have declined to compete against an AI, skewing the results.
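The gap between results on familiar data and on truly independent data can be illustrated with a small synthetic experiment. The sketch below uses scikit-learn on made-up features and labels (nothing here is drawn from Dick et al.) to show how accuracy measured on data a model has already seen tends to be optimistic compared with accuracy on an independent sample.

```python
# Illustrative sketch with synthetic data: accuracy on data a model has already seen
# is typically inflated compared with an independent test set.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))                                       # synthetic "image features"
y = (X[:, 0] + rng.normal(scale=2.0, size=1000) > 0).astype(int)      # noisy binary labels

X_train, X_indep, y_train, y_indep = train_test_split(X, y, test_size=0.3, random_state=0)

clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)
print("Accuracy on familiar data:   ", accuracy_score(y_train, clf.predict(X_train)))
print("Accuracy on independent data:", accuracy_score(y_indep, clf.predict(X_indep)))
```

The first number comes out near perfect while the second is far lower, which is the same pattern the meta-analysis flagged when CAD systems were evaluated on public rather than independent test sources.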
There is also another form of bias that is not acknowledged as often as it should be: discrimination by the AI itself. While machines are often portrayed as unprejudiced because of their mechanical nature, machine learning can pick up patterns that appear numerically beneficial to its programmed goal yet encode discrimination. A study of Facebook's advertising AI found that it showed ads more often to certain demographics than others along racial and gender lines, even when that was not what the ad's creator intended (Ali et al. 2019). Although the AI used by Facebook is very likely different from those used in medicine, the finding shows that AI systems are not the flawless machines many take them to be; a biased model could have severe consequences, such as being more likely to assign a diagnosis based on race, gender, or age when no such correlation exists. Researchers should be extremely wary and test candidate systems, including how they perform across demographic groups, before any widespread medical application of the technology. This is not to say that artificial intelligence has no place in medicine: outside of diagnostics, AI shows potential for assisting surgeries that require precision, prescribing the right dosage for treatment, and reducing human errors due to fatigue. However, researchers and the medical literature should take care to recognize bias and conduct more clinical studies of AI so that it can be implemented safely in the future.
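One concrete, if simplified, way to test for this kind of bias is to compare a model's error rates across demographic groups before deployment. The sketch below uses entirely synthetic groups and predictions (the group names, sample sizes, and miss rates are assumptions, not figures from any cited study) to show how such an audit might report per-group sensitivity.

```python
# Illustrative sketch of a fairness audit on synthetic data: compare the model's
# sensitivity (true-positive rate) across demographic groups before deployment.
import numpy as np

rng = np.random.default_rng(1)
groups = rng.choice(["group_A", "group_B"], size=2000)    # hypothetical patient subgroups
y_true = rng.integers(0, 2, size=2000)                    # hypothetical true diagnoses
# A hypothetical model that misses positive cases more often for group_B.
miss_rate = np.where(groups == "group_A", 0.10, 0.30)
y_pred = np.where((y_true == 1) & (rng.random(2000) < miss_rate), 0, y_true)

for g in ("group_A", "group_B"):
    mask = (groups == g) & (y_true == 1)
    sensitivity = (y_pred[mask] == 1).mean()              # true-positive rate within the group
    print(f"{g}: sensitivity = {sensitivity:.2f}")
```

A gap like the one this toy example produces would be a warning sign that the model treats otherwise similar patients differently, which is exactly the kind of check that should happen before clinical use.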
Works Cited
Steiner D.F., MacDonald R., Liu Y., Truszkowski P., Hipp J.D., Gammage C., Thng F., Peng L., Stumpe M.C. Impact of Deep Learning Assistance on the Histopathologic Review of Lymph Nodes for Metastatic Breast Cancer. The American Journal of Surgical Pathology. 2018;42(12):1636-1646.
Haenssle H.A., Fink C., Schneiderbauer R., Toberer F., Buhl T., Blum A., Kalloo A., Hassen A.B.H., Thomas L., Enk A., et al. Man against machine: diagnostic performance of a deep learning convolutional neural network for dermoscopic melanoma recognition in comparison to 58 dermatologists. Annals of Oncology. 2018;29(8):1836-1842.
Dick V., Sinz C., Mittlböck M., Kittler H., Tschandl P. Accuracy of Computer-Aided Diagnosis of Melanoma. JAMA Dermatology. 2019;155(11):1291-1299.
Ali M., Sapiezynski P., Bogen M., Korolova A. Discrimination through Optimization: How Facebook's Ad Delivery Can Lead to Biased Outcomes. Proceedings of the ACM on Human-Computer Interaction. 2019;3(CSCW):1-30.