The imitation game: a review of the use of artificial intelligence in colonoscopy, and endoscopists’ perceptions thereof
Abstract
The development of deep learning systems in artificial intelligence (AI) has enabled advances in endoscopy, and AI-aided colonoscopy has recently been ushered into clinical practice as a clinical decision-support tool. This has enabled real-time AI-aided detection of polyps with a higher sensitivity than the average endoscopist, and evidence to support its use has been promising thus far. This review article provides a summary of currently published data relating to AI-aided colonoscopy, discusses current clinical applications, and introduces ongoing research directions. We also explore endoscopists’ perceptions and attitudes toward the use of this technology, and discuss factors influencing its uptake in clinical practice.
INTRODUCTION
“Can machines think?” These were the opening words of Alan Turing’s landmark paper “Computing Machinery and Intelligence” in 1950 [1], where he introduced the concept of using computers to simulate intelligent behavior and critical thinking [2, 3]. In that paper, he described the “Turing test,” or the “imitation game,” a simple test of a machine’s ability to exhibit intelligent behavior indistinguishable from that of a human—a machine would pass the test if a human could not reliably tell apart a machine from a human [1]. The term “artificial intelligence (AI)” was subsequently coined by John McCarthy in 1956, referring to “the science and engineering of making intelligent machines” [4, 5]. Seventy years later, we have arrived in an era in which machines can not only simulate human intelligence, but can even supersede it with increased speed, accuracy, and reproducibility [6].
AI systems are built on techniques that mimic facets of human intelligence, such as machine learning and deep learning [7]. Machine learning involves automatically identifying patterns within datasets, building algorithms from them, and “learning” to apply these predictive models to future scenarios for improved decision-making [8]. Deep learning involves the self-creation of an artificial neural network, a multilayered network in which various algorithms, like neurons of a human brain, can interconnect via hidden neural layers, to improve the overall efficiency of the network and process vast amounts of data [7, 9, 10].
The application of AI to medicine has ballooned in the past 2 decades as a result of advances in information technology, big data collection and processing, and increased digitization of medical data [6]. The advent of deep learning in particular has significantly advanced the field of AI by rendering AI systems capable of analyzing complex algorithms and self-learning [2].
AI IN COLONOSCOPY
Within medicine, one of the most successful uses of AI has been in the field of computer-aided diagnostics (CAD) in colonoscopy. The use of AI in colonoscopy has far-reaching clinical benefits.
Colorectal cancer is the third most diagnosed cancer in the world, with the second highest mortality rate [11]. In Singapore, it is the most common cancer, accounting for 16.9% of all cancer diagnoses in men and 13.1% in women [12]. Colonoscopy remains the gold standard for detection of colonic polyps and colorectal cancer, with both diagnostic (biopsy) and potentially therapeutic (polyp removal) advantages [13–15]. Periodic colonoscopy assessments have been shown to play an important preventive role in decreasing the incidence of colorectal cancer by detecting precancerous adenomas at an early stage [16, 17], and thereby also reducing mortality from colorectal cancer [18, 19].
However, postcolonoscopy colorectal cancer (PCCRC) remains a recognized problem, in which colorectal cancer is diagnosed after a colonoscopy in which no cancer was found [20]. A retrospective single-center study of 107 PCCRCs in England found that 73% of PCCRCs were determined to be affected by technical endoscopic factors and 27% by decision-making factors. Nineteen percent of index colonoscopies had poor bowel preparation and 85% of PCCRCs were classified as possible missed lesions [21]. A meta-analysis by Zhao et al. [22] of more than 15,000 tandem colonoscopies showed an adenoma miss rate (AMR) of 26%; in other words, about one-fourth of adenomas are missed during colonoscopy. These results show that the effectiveness of colonoscopy hinges upon quality indicators, such as the adenoma detection rate (ADR), complete resection rate, adequate bowel preparation, cecal intubation rate (i.e., completion of colonoscopy), and withdrawal time, as suggested by the European Society of Gastrointestinal Endoscopy guidelines [23]. This is where AI comes into play: to improve the quality of colonoscopies.
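As a rough illustration of the miss-rate figure above, the AMR in a tandem-colonoscopy design is commonly taken as the adenomas found only on the second pass divided by all adenomas found across both passes. The sketch below assumes that definition; the function name and the counts are hypothetical and are not taken from Zhao et al. [22].

```python
def adenoma_miss_rate(found_first_pass: int, found_second_pass_only: int) -> float:
    """Return the adenoma miss rate (AMR) for a tandem colonoscopy.

    Assumed definition: adenomas detected only on the second pass,
    divided by all adenomas detected on either pass.
    """
    if found_first_pass < 0 or found_second_pass_only < 0:
        raise ValueError("counts must be non-negative")
    total = found_first_pass + found_second_pass_only
    if total == 0:
        raise ValueError("no adenomas detected in either pass")
    return found_second_pass_only / total


# Hypothetical counts chosen only to illustrate a 26% miss rate.
amr = adenoma_miss_rate(found_first_pass=74, found_second_pass_only=26)
print(f"AMR = {amr:.0%}")
```

With these illustrative counts, 26 of 100 adenomas were seen only on the second examination, which corresponds to about one in four being missed at the index colonoscopy.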
The use of AI in colonoscopy is dependent on deep convolutional neural networks for real-time image analysis (Fig. 1) [24]. There are 2 major CAD systems thus far: computer-aided detection (CADe) and computer-aided diagnosis (CADx).
Computer-aided detection
CADe involves real-time image analysis to detect polyps, with the aim of increasing the ADR and decreasing the AMR [7, 25]. Studies have shown the ADR to be inversely related to the risk of interval colorectal cancer and mortality [23, 26], and a US study of more than 300,000 colonoscopies reported a 3% decrease in the risk of interval colorectal cancer for every 1% increase in the ADR [27]. CADe systems have enabled the real-time AI-aided detection of adenomas with higher sensitivity than the average endoscopist [28]. Examples include Medtronic’s “GI Genius,” Pentax’s “Discovery AI,” and Fujifilm’s “CAD-EYE.”
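To make the reported association more concrete, the sketch below projects the relative risk of interval colorectal cancer for a given gain in the ADR. It assumes that the 3% decrease per 1% ADR increase [27] compounds multiplicatively for each percentage point; that modeling choice, the function name, and the example gain are illustrative assumptions rather than results from the cited study.

```python
def projected_relative_risk(adr_gain_points: float,
                            decrease_per_point: float = 0.03) -> float:
    """Project the relative risk of interval cancer after an ADR gain.

    Assumption: each 1-percentage-point ADR increase multiplies the risk
    by (1 - decrease_per_point), i.e. the effect compounds per point.
    """
    if adr_gain_points < 0:
        raise ValueError("adr_gain_points must be non-negative")
    return (1.0 - decrease_per_point) ** adr_gain_points


# Hypothetical example: an endoscopist whose ADR rises by 5 percentage points.
rr = projected_relative_risk(5)
print(f"Projected relative risk: {rr:.3f} "
      f"(about {1 - rr:.0%} lower than baseline)")
```

Under this assumption, the benefit of each additional point shrinks slightly as the ADR rises, so the projection is best read as an order-of-magnitude guide rather than a prediction for an individual endoscopist.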
This has been well supported by many randomized controlled trials (RCTs) comparing the ADRs of unassisted and CADe-assisted colonoscopy [29–31]. A meta-analysis [32] of 5 RCTs [33–37] consisting of a total of 4,354 patients showed that the pooled ADR was significantly higher in the CADe group (36.6%) than in the control group (25.2%; relative risk, 1.44). Sessile serrated lesions, which are frequently missed on colonoscopy [38], were also detected at a higher rate using CADe (relative risk, 1.52) [32].
The first CADe system to be approved by the U.S. Food and Drug Administration was Medtronic’s GI Genius, which has been shown to have a 99.7% sensitivity rate [39]. A multicenter randomized trial of 685 subjects conducted by Repici et al. [37] showed a significantly higher ADR in patients who underwent GI Genius-aided colonoscopy (54.8%) than in patients who underwent colonoscopy without GI Genius assistance (40.4%; relative risk, 1.30). A more recent study by Wallace et al. [40] showed a nearly 50% reduction in the AMR with GI Genius (15.5%) compared to unassisted colonoscopy (32.4%).
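The effect sizes quoted in this section can be sanity-checked with simple ratios. The sketch below computes crude relative risks and a crude relative reduction directly from the percentages reported above; the function names are hypothetical. The crude ratios are not expected to match the published relative risks exactly, because those come from the studies' own statistical models.

```python
def relative_risk(rate_intervention: float, rate_control: float) -> float:
    """Crude relative risk: intervention rate divided by control rate."""
    if rate_control <= 0:
        raise ValueError("rate_control must be positive")
    return rate_intervention / rate_control


def relative_reduction(rate_new: float, rate_baseline: float) -> float:
    """Crude relative reduction of rate_new versus rate_baseline."""
    if rate_baseline <= 0:
        raise ValueError("rate_baseline must be positive")
    return (rate_baseline - rate_new) / rate_baseline


# Percentages as reported in the text.
pooled_rr = relative_risk(36.6, 25.2)        # meta-analysis ADRs [32]
gi_genius_rr = relative_risk(54.8, 40.4)     # Repici et al. ADRs [37]
amr_drop = relative_reduction(15.5, 32.4)    # Wallace et al. AMRs [40]

print(f"Crude pooled ADR ratio:      {pooled_rr:.2f}")
print(f"Crude GI Genius ADR ratio:   {gi_genius_rr:.2f}")
print(f"Crude AMR relative reduction: {amr_drop:.0%}")
```

The crude ratios land close to the reported relative risks of 1.44 and 1.30, and the AMR falls by roughly half, which is consistent with the description above.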
Computer-aided diagnosis
CADx involves characterizing polyps based on morphological parameters, such as surface, vascular patterns, shape, size, and location, to generate probability scores for malignancy or nonmalignancy [41]. This helps to improve the accuracy of optical biopsies, which refer to the in vivo prediction of polyp histology before resection and formal histological analysis [42]. Most of these systems use image-enhanced endoscopy techniques such as narrow-band imaging and blue laser imaging (BLI) to enhance the accuracy of predictions.
The goal is to reduce unnecessary polypectomies for nonneoplastic lesions, making colonoscopy more cost-effective and less time-consuming, and potentially avoiding the rare but significant complications of polypectomy, such as bleeding, infection, and perforation of the bowel [43]. These methods can potentially allow the implementation of a “resect and discard” [44] or “detect and leave” [45] strategy for diminutive polyps (5 mm or smaller), using technology that provides a negative predictive value (NPV) of more than 90% for adenomatous histology, according to the thresholds set by the American Society for Gastrointestinal Endoscopy (ASGE) Preservation and Incorporation of Valuable Endoscopic Innovations (PIVI) recommendations [46].
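For readers less familiar with the metric, the NPV is the proportion of polyps predicted to be nonadenomatous that are confirmed as nonadenomatous on histology. A minimal sketch of the threshold check follows; the function names and counts are hypothetical, and treating the 90% boundary as inclusive is an assumption.

```python
def negative_predictive_value(true_negatives: int, false_negatives: int) -> float:
    """NPV = TN / (TN + FN).

    Here a "negative" is a polyp predicted to be nonadenomatous; a false
    negative is an adenoma that was predicted to be nonadenomatous.
    """
    total_negative_calls = true_negatives + false_negatives
    if total_negative_calls <= 0:
        raise ValueError("no negative predictions to evaluate")
    return true_negatives / total_negative_calls


def meets_pivi_npv_threshold(npv: float, threshold: float = 0.90) -> bool:
    """Check an NPV against the 90% figure cited in the text.

    Assumption: the boundary value itself counts as meeting the threshold.
    """
    return npv >= threshold


# Hypothetical validation set of 200 diminutive polyps called nonadenomatous.
npv = negative_predictive_value(true_negatives=188, false_negatives=12)
print(f"NPV = {npv:.1%}, meets threshold: {meets_pivi_npv_threshold(npv)}")
```

Because the NPV depends on how common adenomas are in the population examined, a system validated in one setting may not reach the same value in another.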
There is currently intense interest in the area of developing real-time CADx systems. Fujifilm’s CAD-EYE is the first AI system to include both CADe and CADx (using BLI) on the same platform. CAD-EYE’s CADx module obtained an accuracy of 93.2% with white-light endoscopy and 94.9% with BLI [47]. These results exceed the ASGE PIVI thresholds of 90% NPV, and the system has been approved for clinical use in the European Union. Medtronic’s GI Genius has also included a CADx module in its newest iteration, but it has so far only managed to achieve 85% accuracy with white-light endoscopy in the recent study conducted by Biffi et al. [48].
Besides distinguishing neoplastic from nonneoplastic lesions, CADx has been explored as a way to diagnose the depth of cancer invasion. Tamai et al. [49] developed a CADx system that is able to distinguish colorectal lesions with deep submucosal invasion (T1b cancer) with 83.9% sensitivity and 82.6% specificity, achieving a diagnostic accuracy greater than that of clinicians (reported to be less than 80% [20]). This might prove to be very useful in clinical practice if it can potentially advise on the need for advanced resection methods such as endoscopic submucosal dissection and surgery [25].
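The comparison with clinicians' accuracy can be checked with simple arithmetic: overall accuracy is the prevalence-weighted average of sensitivity and specificity, so it must lie between the two. The sketch below uses the sensitivity and specificity reported by Tamai et al. [49]; the function name and the prevalence values are hypothetical.

```python
def overall_accuracy(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Accuracy as the prevalence-weighted mean of sensitivity and specificity."""
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be between 0 and 1")
    return sensitivity * prevalence + specificity * (1.0 - prevalence)


SENSITIVITY = 0.839  # reported for deep submucosal invasion [49]
SPECIFICITY = 0.826  # reported for deep submucosal invasion [49]

# Hypothetical prevalences of deep submucosal invasion among assessed lesions.
for prevalence in (0.1, 0.3, 0.5):
    acc = overall_accuracy(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.0%}: accuracy {acc:.1%}")
```

Whatever the prevalence, the resulting accuracy stays between 82.6% and 83.9%, which is above the figure of less than 80% quoted for clinicians.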
Computer-aided quality control
Adenoma detection relies upon 2 main factors: the identification of mucosal abnormalities and adequate colonic mucosal exposure [50]. While CADe and CADx target the former, the use of AI has also been explored for the latter. AI has been used to automate and enhance quality control in colonoscopy, by monitoring technical and mechanical factors of colonoscopy, such as withdrawal time and the adequacy of bowel preparation.
For example, the ENDOANGEL system, developed by Renmin Hospital of Wuhan University in China, is a real-time quality improvement system that provides automated monitoring of the withdrawal time, withdrawal speed, and adequacy of mucosal exposure, and relays this information to the endoscopist in real time. It has been shown to result in a significantly longer mean withdrawal time (6.38 minutes vs. 4.76 minutes) and significantly higher ADR (16% vs. 8%) than unassisted colonoscopy [51].
OUR CENTER’S EXPERIENCE WITH AI-AIDED COLONOSCOPY
Commercial AI-aided colonoscopy systems have only been introduced to Singapore fairly recently, since 2021. Their use has mostly been limited to product trials, and only Medtronic’s GI Genius CADe system has been formally evaluated. In 2021, our center conducted a single-institution cohort study of 24 consultant-grade endoscopists [52], of whom 18 performed 5 or more GI Genius-aided colonoscopies over a period of 3 months.
We examined the effects of GI Genius on the ADR at both the collective and individual levels. Collectively, the median ADR of 30.4% with GI Genius was higher than the baseline polypectomy rate of 24.3% (P = 0.02). Individually, 13 out of 18 endoscopists with 5 or more GI Genius-aided colonoscopies achieved a higher ADR with GI Genius, with 2 experiencing significant improvement in the ADR (39% and 40%). The median improvement was 8.5% (interquartile range, –2.8% to 17.8%). In addition, 14 of all 250 polypectomies (5.6%) performed were found to be sessile serrated lesions on histology, which was higher than the expected rate of 2% to 3% [53].
These results serve to highlight the benefits of CADe in helping even experienced endoscopists detect more adenomatous lesions, including those that are notoriously difficult to detect with the naked eye (Fig. 2). These results also concur with those achieved by multiple existing RCTs on CADe, as mentioned previously [32].
KNOWLEDGE, PERCEPTIONS, AND BEHAVIORS OF ENDOSCOPISTS TOWARDS AI-AIDED COLONOSCOPY
Physician sentiment is often a significant determinant of how quickly technologies are deployed in a clinical setting [54]. Despite evidence proving the benefit of AI-aided colonoscopy systems, not all endoscopists seem to welcome the advent of such systems.
An online survey conducted amongst 124 gastroenterologists in the United States from 2018 to 2019 [55] showed that while most indicated interest in the application of AI to colonoscopy (86.0%) and felt that CADe would improve their endoscopic performance (84.7%), there were significant concerns about cost (75.2%), operator dependence or “deskilling” due to over-reliance on AI (62.8%), and increased procedural time (60.3%). In contrast to the support for CADe, only 57.2% of respondents felt comfortable using CADx to support a “diagnose and leave” strategy for hyperplastic polyps, which may indicate that while endoscopists may be more welcoming towards the use of AI as an adjunct for diagnosis, there is still a significant proportion who are skeptical toward relying on AI solely to make decisions on intervention. Interestingly, it was shown that postfellowship experience of less than 15 years was the most important factor in determining whether physicians were likely to believe that CADe would lead to more removed polyps (odds ratio, 5.09; P=0.01), which serves to highlight how “expert” endoscopists may find AI less beneficial than “novice” endoscopists in elevating their endoscopic practice.
While the above survey was a prospective one undertaken prior to the use of actual AI in endoscopy, our center conducted a retrospective survey of endoscopists who had already experienced using AI-aided colonoscopy. We also sought to determine whether one’s existing experience with AI influenced the uptake of AI-aided colonoscopy. Using the same sample of 24 consultant-grade endoscopists from the study by Koh et al. [52], our center conducted a survey on the knowledge of AI, perceptions of AI in medicine, and behaviors regarding use of AI-aided colonoscopy, 2 months after the implementation of Medtronic’s GI Genius in colonoscopy, with a response rate of 66.7% (16 of 24) [56]. The parts of our survey pertaining to knowledge and perceptions of AI were modeled after the survey administered by Mehta et al. [57] to investigate knowledge and perceptions of AI amongst medical students in Ontario.
Knowledge of AI varied amongst endoscopists. All respondents (100%) understood common terms like “artificial intelligence” and “machine learning,” but only 9 (56.3%) understood more in-depth terms like “neural network” and “deep learning.”
Regarding perceptions of AI in medicine, most endoscopists were optimistic about AI’s capabilities in performing objective administrative (81.3% to 93.8%) and clinical tasks (62.5% to 93.8%). However, most (93.8%) were reserved about AI providing personalized, empathetic care. A minority (18.8%) of endoscopists perceived that AI would reduce the number of jobs available to them.
Regarding behaviors involving the use of AI-aided colonoscopy, only 11 endoscopists (68.8%) agreed or strongly agreed that GI Genius should be used as an adjunct in colonoscopy. Analyzing the 5 endoscopists (31.3%) who disagreed or were ambivalent about its use, there was no significant correlation with their knowledge or perceptions of AI, but a significant number did not enjoy using the program (P=0.01) and did not think it improved the quality of colonoscopy (P=0.03). We thus concluded that the acceptance of AI-aided colonoscopy systems is largely related to the endoscopist’s experience with using the program, rather than general knowledge or perceptions towards AI. The uptake of such systems will thus rely greatly on how the device is delivered to the end user.
INCREASING THE ADOPTION OF AI IN COLONOSCOPY
There is no doubt that AI in colonoscopy is a rapidly developing field, and it will likely find its way into mainstream colonoscopy practice and guidelines in the future. It is our opinion that AI should be embraced, having already proven its clinical benefits, and with the potential to do so much more. Time will tell if AI will help usher in a new era of enhanced colonoscopy surveillance and significant reductions in colorectal cancer mortality, which would indeed be practice-changing.
The benefits of AI in colonoscopy could also extend beyond the clinical, as a US modeling study [58] estimated that the addition of AI support to guideline-based screening for 60% of eligible US adults would cost USD 250 million per year, but could prevent approximately 7,000 colorectal cancer cases and 2,000 deaths every year. This would translate into net cost savings of more than USD 300 million per year, after accounting for the costs of missed cancers. These savings could make AI-aided colonoscopy attractive from a longer-term health economics standpoint, and similar data would be valuable in obtaining the necessary buy-in from policymakers in supporting AI technologies in colonoscopy.
While the incidence rates of colorectal cancer have traditionally been highest in developed countries, these rates have been notably increasing in developing countries [59], as they undergo economic growth, with increased adoption of a “western” lifestyle and dietary habits characterized by higher meat, fat, and total caloric intake, along with increased life expectancy and population growth [60]. However, this has rarely been accompanied by the implementation of appropriate colonoscopy screening programs [61], due to issues such as scarcity of resources and governance. Thus, the increased incidence of colorectal cancer in developing countries has also been paralleled by increasing mortality rates [60, 62] and represents a significant health burden. The implementation of AI technologies in colonoscopy, which is often first introduced in developed countries, hence presents issues with equity and access. Efforts to bridge these gaps are underway, including programs such as the Medtronic Health Equity Assistance Programme, which has delivered GI Genius modules to 62 facilities performing colonoscopies in less-developed communities [43].
Enabling AI modules to be compatible with various endoscopy stacks would certainly also help to increase uptake. Medtronic takes the lead in this—the GI Genius Intelligent Endoscopy Module is “brand agnostic” and compatible with most existing endoscopy stacks from all companies in use across the world [20, 52].
On an individual level, building on the conclusions of our research on endoscopists’ perceptions of AI in colonoscopy [56], it would be important for companies to focus on optimizing the user experience of their AI-aided colonoscopy products. There will be value in undertaking research pertaining to user experience and using implementational frameworks to improve uptake of AI-aided colonoscopy systems. Implementational frameworks, such as the integrated-PARIHS (integrated-Promoting Action on Research Implementation in Health Services) framework, focus on facilitation, whereby external facilitators train internal facilitators to become local experts [63]. Similarly, priority could be given to providing hands-on sessions for endoscopists to build familiarity with the system, which is likely to improve their eventual acceptance and uptake of it. A summary of the measures that may increase institutional AI uptake and implementation for colonoscopies is depicted in Fig. 3.
FUTURE DIRECTIONS FOR AI IN COLONOSCOPY
There is still much room for improvement, especially in the field of CADx. Besides Fujifilm’s CAD-EYE system, there has not been another commercial real-time colonoscopy system with both CADe and CADx functions that has been able to achieve the ASGE PIVI threshold of an NPV greater than 90% for resecting and discarding diminutive polyps without a pathologic assessment [20].
Furthermore, current CADx systems are only able to make a binary distinction between “adenoma” and “nonadenoma,” and are not able to perform more granular and informative classification such as “hyperplastic,” “sessile serrated,” “carcinoma,” and so on [43]. Current systems are also not able to further stratify lesions in terms of severity of dysplasia and depth of invasion. These limitations have several implications for practice. For instance, sessile serrated lesions are considered preneoplastic, whilst hyperplastic polyps are considered nonneoplastic; however, both are grouped together by current CADx systems under the umbrella term “nonadenoma” even though their management is vastly different. In truth, it is difficult to differentiate sessile serrated lesions from hyperplastic polyps, as they have similar surface structures, and current CADx systems will require more training or the use of alternative image-enhancing methods to differentiate these lesions with greater accuracy [20]. Other areas where CADx can potentially prove more beneficial are in the detection of submucosal tumors (in which the overlying mucosa may resemble normal mucosa), in differentiating adenoma from T1 cancer and T1a from T1b cancer (which will have implications for the treatment strategy, namely, whether to perform endoscopic mucosal resection, endoscopic submucosal dissection, or surgery) [20], and in evaluating the adequacy of large polyp resection to ensure clear margins [7].
Moving forward, another key area to explore would be the use of AI in transforming endoscopy education. Colonoscopy is known to be a challenging procedure with a steep learning curve, and it is demanding of both cognitive and technical abilities [64]. Mastering colonoscopy necessitates accurate identification of colonic polyps and meticulous mucosal exposure, both of which require hands-on experience and time [52, 65]. Studies have reported a minimum of 100 to 200 colonoscopy procedures to reach technical competence [66–68], with an average training period of 4 years in the United Kingdom [69]. Thus, there is bound to be great variation amongst endoscopists in terms of skill [26]. A US study reported ADRs ranging from 7.4% to 52.5% between endoscopists [27]. AI can potentially act as a levelling tool between novice and expert endoscopists, as a study by Jin et al. [70] in 2020 evaluating the efficacy of a CADx tool noted that the use of CADx in colonoscopy led to the greatest improvement in novice endoscopists (73.8% to 85.6%, P<0.05), who almost reached the accuracy of experts (89.0%, P=0.10). AI can potentially also help novice endoscopists achieve competency faster, through CADe, CADx, and real-time feedback on the quality of one’s endoscopy. Studies evaluating the learning curve of novice versus expert endoscopists with and without AI-aided colonoscopy are still underway, and it would be interesting to see how AI can help novice endoscopists mount the learning curve.
CONCLUSION
In conclusion, AI-aided colonoscopy is an expanding and exciting field of development that has shown promise in improving the quality of screening and diagnosis of colorectal cancer. Further studies are required to evaluate its real-world impact on colorectal cancer incidence and mortality rates and cost-effectiveness for implementation. The technology is still early on the adoption curve, and efforts to increase the uptake of the technology should take into account accessibility, usability, and physician sentiment.
Notes
Conflict of interest
Frederick H. Koh serves on the Editorial Board of Annals of Coloproctology, but was not involved in the reviewing or decision process of this manuscript. No other potential conflict of interest relevant to this article was reported.
Funding
None.
Author contributions
Conceptualization: all authors; Data curation: all authors; Writing–original draft: all authors; Writing–review & editing: all authors. All authors read and approved the final manuscript.
Additional information
SKH Endoscopy Centre members are listed as follows: Fung-Joon Foo, Winson J. Tan, Sharmini S. Sivarajah, Leonard M.L. Ho, Jia-Lin Ng, Frederick H. Koh, Cheryl Chong, Darius Aw, Nathanelle Khoo, Juinn-Haur Kam, Alvin Y.H. Tan, Tousif Kabir, Choon-Chieh Tan, Baldwin P.M. Yeung, Wai-Keong Wong, Bin-Chet Toh, Lester Ong, Jasmine Ladlad, Koy-Min Chue, Faith Leong, Hui-Wen Chua, Sabrina Ngaserin, Cui-Li Lin, Eng-Kiong Teo, Yi-Kang Ng, Tze-Tong Tey, Marianne A. De-Roza, Jonathan Lum, Kalki R. Chandrasekaran, Xiaoke Li, Pei-Shi Goh, Jinliang Li, Nazeemah B. Mohd-Nor, Siok-Peng Ng.