Additional content is available online at Elsevier eBooks for Practicing Clinicians
Artificial intelligence (AI) is ubiquitous. It autocompletes the sentences we type, populates web searches before we finish our thoughts, enables our phones to understand verbal commands, permits cars to drive themselves to spoken destinations, and increasingly supports medical diagnostic tests. In medicine it has identified retinal pathology with skill exceeding that of a trained ophthalmologist, can tirelessly detect mammographic lesions, and can identify abnormalities on a pathology slide. Some revile it as a technology that will cause massive unemployment and economic disruption and pose an existential threat to humanity; others embrace it as a tool that will liberate humanity from drudgery and elevate the most noble of human tasks.
Three broad capabilities of AI apply to the field of medicine. The first is the automation of fatiguing processes that involve analysis of massive amounts of data, such as continuous ECG tracings acquired over months. In this context, AI performs human-like tasks at massive scale. AI also permits embedding technology in novel forms, such as clothing and other wearables, to extract physiologic information and enable continuous monitoring of health; by extension, it enables remote monitoring in rural locations, space exploration, and extreme conditions. The second is the ability to extract signals beyond those a human is capable of recognizing, for example, determining the presence of ventricular dysfunction from a standard 12-lead electrocardiogram or from a single-lead ECG acquired with watch- or smartphone-enabled electrodes. In this context, AI brings new value to well-established medical diagnostic tests that exist in current clinical workflows and practice. The third, and broader still, is the ability to specifically, richly, and uniquely characterize an individual's physiologic data, allowing a new level of personalized predictive models and potentially creating a whole new category of individual "previvors" who know a disease is impending before any signs or symptoms develop. This opens the door to potential interventions, with associated social, legal, and economic implications. Such deep phenotyping may inform additional fields, such as genetics. AI in medicine is in its early stages; the promise is large, but its application requires rigorous testing, vetting, and validation, as do all tests that affect human health. Here we focus on AI and its role in cardiovascular medicine.
If intelligence is a cake, the bulk of the cake is unsupervised learning, the icing on the cake is supervised learning, and the cherry on the cake is reinforcement learning (RL). Yann LeCun, 2016
AI is a lay term referring to machine learning (ML). In his cake analogy, Dr. Yann LeCun∗ divides ML into its three main branches and highlights one of the technology's main challenges: the amount of data required for implementation. In all three types of learning (supervised, unsupervised, and reinforcement), instead of using an explicit set of human-devised rules to interpret a signal, large volumes of data are fed to a model, which uses statistical processes to identify relationships within the data. In short, the data train the model, free of human hypothesis (eFig. 11.1).
∗ Yann LeCun, Geoffrey Hinton, and Yoshua Bengio, often referred to as the "Godfathers of AI" or the founding fathers of modern AI research, were jointly awarded the prestigious Turing Award in 2018 for their contributions to the AI revolution.
Learning is the process of improving the ability to complete a task based on experience. As the task is repeated, an ML model improves by getting feedback (via an error, or loss, function) and changing the way it performs the task (by changing the weights and biases of the mathematical functions that comprise the "neurons" in a neural network, for example) until the feedback indicates that the task is done correctly, or at least above a certain standard. In all three types of ML, the feedback is the loss function: the difference between the wanted outcome (how we think the task should have been performed) and the actual outcome (how the task was performed). Learning, or training, is often computationally intensive. Once trained, many networks can operate with limited computational resources, for example on a smartphone. This makes many AI tools massively scalable.
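To make this loop concrete, the following minimal Python sketch (not from the chapter; the data, learning rate, and iteration count are arbitrary illustrative choices) fits a single "neuron" to synthetic data by gradient descent. The loss function measures the gap between the wanted and actual outcomes, and each iteration adjusts the weight and bias to shrink it.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100)              # inputs
y = 2.0 * x + 1.0                     # wanted outcomes (true rule: w=2, b=1)

w, b = rng.normal(), rng.normal()     # random initial weight and bias
lr = 0.1                              # learning rate

for step in range(200):
    y_hat = w * x + b                 # actual outcome for current parameters
    error = y_hat - y
    loss = np.mean(error ** 2)        # the loss function
    # Gradients of the loss with respect to w and b
    grad_w = 2 * np.mean(error * x)
    grad_b = 2 * np.mean(error)
    w -= lr * grad_w                  # adjust parameters to reduce the loss
    b -= lr * grad_b

print(f"learned w={w:.2f}, b={b:.2f}, loss={loss:.4f}")  # approaches w=2, b=1
```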
Supervised learning is the most commonly used form of ML. It requires labeled data (images and captions, ECGs and their rhythm descriptions), with labels often provided by human experts. The discovery of the rules that explain the relationship between the input (a signal) and the output (a label) is called training. For example, if ECG samples labeled normal rhythm or atrial fibrillation (AF) are fed to a model, it will learn to differentiate between the two rhythms. The specific features of the signal used to generate the model output are determined by the computer during training and are not discernible to humans (Fig. 11.1). Thus, AI is at times referred to as a "black box." In most cases, the model is a parametric function (F) of the inputs, initialized with a set of random parameters (weights). During training, in an iterative manner, F is applied to a set of inputs with known outputs (the labels). Applying F to the inputs yields estimated outputs (in the example, the probability of AF); with each iteration, the error between the estimated outputs and the real labels is used to assess model performance, and the function weights are adjusted in the direction that minimizes the error, improving performance. The methods used to adjust the weights are described in the "Optimization and Hyperparameters" subsection. The task can be either classification (determination of the appropriate class for a data sample from a limited set of options: dogs versus cats, male versus female) or regression (a continuous value for each sample: age from an image). Because supervised learning in a neural network is an iterative process, with each step inching toward an improved solution, the biggest challenge is that large datasets are required. Because each sample in the dataset requires a label, attaching an accurate label to each element may be a limiting factor.
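A minimal sketch of supervised classification follows, using fabricated two-dimensional "ECG features" and synthetic AF labels (both invented here for illustration); a logistic regression model stands in for the more complex networks described in the text.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Synthetic, labeled data: two hypothetical features per ECG sample.
rng = np.random.default_rng(0)
n = 500
normal = rng.normal(loc=[0.2, 0.1], scale=0.1, size=(n, 2))   # label 0
af = rng.normal(loc=[0.6, 0.5], scale=0.1, size=(n, 2))       # label 1
X = np.vstack([normal, af])
y = np.array([0] * n + [1] * n)

# Training discovers the input-to-label relationship; held-out data tests it.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression().fit(X_train, y_train)
probs = model.predict_proba(X_test)[:, 1]                     # estimated P(AF)
print(f"test AUC: {roc_auc_score(y_test, probs):.3f}")
```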
In unsupervised learning, the task revolves around the structure of the data itself. The most common form of unsupervised learning is clustering, in which the model groups data based on its characteristics rather than on labels provided during the training stage. The model is fed only unlabeled data and clusters samples based on similarity, using each sample's distance (Euclidean or other) from other samples. If the label of just a few samples in each cluster is known, the labels of the other samples in the cluster can be inferred, because all the samples in that cluster have similar features, even though the model created the clusters without specific labels on the data elements. An example would be the acquisition of multiple ECG segments from a patient during a dialysis session at various blood potassium levels. The ECG segments could be clustered, and the potassium values within each cluster should be similar. Because unsupervised learning requires only the raw samples and basic assumptions about the data structure (such as the number of clusters), the barrier imposed by labeled data is lowered.
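A sketch of the dialysis example follows, with each ECG segment reduced to two synthetic features (invented here for illustration); k-means, a common clustering algorithm, is given only the number of clusters, never a label.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic "ECG segment" features from two potassium states; no labels
# are ever shown to the model.
rng = np.random.default_rng(0)
low_k = rng.normal(loc=[0.3, 0.2], scale=0.05, size=(40, 2))
high_k = rng.normal(loc=[0.7, 0.8], scale=0.05, size=(40, 2))
segments = np.vstack([low_k, high_k])

# The only assumption supplied is the number of clusters.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(segments)

# Knowing the potassium of a few segments per cluster lets the rest of the
# cluster be inferred to have similar values.
for cluster in range(2):
    members = np.where(kmeans.labels_ == cluster)[0]
    print(f"cluster {cluster}: {len(members)} segments, e.g., indices {members[:3]}")
```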
RL develops the optimal strategy for an agent in an environment with known rules and rewards. An example would be a chess-playing agent that learns chess by playing against itself, without labels or recorded human games, using only the rules and the game score. Because RL requires known rules and rewards, its use in health care is still limited, and it is outside the scope of this chapter.
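Although RL is not treated further in this chapter, a minimal Q-learning sketch conveys the idea: an agent on a hypothetical five-cell track receives no labels and learns only from the rules (legal moves) and a reward at the goal. All parameters are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, goal = 5, 4
actions = [-1, +1]                        # move left or right
Q = np.zeros((n_states, len(actions)))    # expected return per state-action
alpha, gamma, eps = 0.5, 0.9, 0.3         # learning rate, discount, exploration

for episode in range(300):
    s = 0
    for _ in range(1000):                 # cap episode length
        # Explore randomly with probability eps; otherwise act greedily
        a = int(rng.integers(2)) if rng.random() < eps else int(Q[s].argmax())
        s_next = min(max(s + actions[a], 0), n_states - 1)
        r = 1.0 if s_next == goal else 0.0    # reward only at the goal
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next
        if s == goal:
            break

print(Q.argmax(axis=1)[:goal])  # learned policy: action 1 (right) in every state
```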
Inspired by the human brain, a fully connected neural network is a multilayer parametric function that implements a nonlinear mapping from the inputs to the outputs (see Fig. 11.1). Each node ("neuron") in each layer receives a weighted sum of all the nodes in the preceding layer and is activated using a nonlinear function. The values of the weights are defined during training, as the network learns the relationship between input and output.
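A sketch of the forward pass through a small fully connected network follows; the layer sizes are arbitrary, and the weights shown here as random would in practice be learned during training.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(0.0, z)                    # a common nonlinear activation

# Weights and biases; in a trained network these values are learned, not random.
W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)    # 4 inputs -> 8 hidden nodes
W2, b2 = rng.normal(size=(1, 8)), np.zeros(1)    # 8 hidden nodes -> 1 output

def forward(x):
    h = relu(W1 @ x + b1)    # each hidden node: weighted sum, then nonlinearity
    return W2 @ h + b2       # output node: weighted sum of all hidden nodes

print(forward(np.array([0.5, -1.2, 0.3, 0.9])))
```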
In convolutional neural networks, convolutional filters extract feature information from images in the convolutional layers, with the weights of the filters determined during training so that the features selected are those that best define the desired network output. Both types of networks can be used for either classification or regression tasks.
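A sketch of a single convolutional filter sliding along a one-dimensional toy signal (for example, an ECG trace) follows; the filter weights are fixed here to respond to rising edges, whereas in a convolutional network they would be learned during training.

```python
import numpy as np

# Toy 1-D signal with two sharp peaks and a filter that responds to rising
# edges. (Deep learning "convolution" layers compute this cross-correlation.)
signal = np.array([0, 0, 0, 1, 5, 1, 0, 0, 0, 1, 5, 1, 0], dtype=float)
edge_filter = np.array([-1.0, 0.0, 1.0])

feature_map = np.correlate(signal, edge_filter, mode="valid")
print(feature_map)  # large positive values mark where the signal rises sharply
```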