Neurobiological Basis of Drug Reward and Reinforcement

Introduction

Drug use disorders involve a number of factors including genetic and environmentally influenced predispositions, the actions of the drugs themselves, the immediate environment, and the neurobiological mechanisms that promote and support drug actions and addiction. This chapter deals mostly with the last of these aspects of drug use, abuse, and addiction, as we explore the ways in which the brain is built to adapt to environmental circumstances, and how these aspects of neural function can promote the continued use and abuse of certain drugs and ultimately promote disorders related to these drugs. We then consider the mechanisms through which drugs of abuse interact with the brain systems that promote maladaptive drug use and addiction.

Drug use disorders have been defined in several different ways, most of which stress the habitual or compulsive nature of addictive behavior; the physical, psychological, and social damage produced by the behavior; and the trauma associated with cessation of the behavior. Drug use disorders also share many features with disorders related to natural biological drives (e.g., sex and food consumption), physical activities (e.g., excessive exercise), and relatively benign drugs (e.g., caffeine) (discussed in Brunton et al. and Koob and Le Moal). Drugs of abuse with harmful effects are the main focus of the present discussion. In addition, excessive use of drugs may lead to health and psychological problems even in the absence of an agreed-upon definition of use disorders. Thus, it is important to understand the neural mechanisms that contribute to prolonged and maladaptive drug use.

The Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5), covers substance-related and addictive disorders. It should be noted that addiction per se is not a diagnosis recommended in the manual, as the term substance use disorder is preferred. Within this classification, the manual defines substance use disorder as a “…cluster of cognitive, behavioral, and physiological symptoms indicating that the individual continues using the substance despite significant substance-related problems.” Common features of these disorders are cognitive, affective, sleep, and behavioral changes that center on use or cessation of use of the addictive substance or action (sometimes referred to as a habit). Central features of the use disorder are risky drug use, craving, social impairment, continued use despite negative consequences, and the possible negative consequences of cessation of drug use. All of these aspects of substance use and addictive disorders can be seen to relate to innate brain mechanisms underlying processes referred to as reinforcement and/or reward. In this context, it is important to discuss current ideas about the neural mechanisms of reinforcement and reward before discussing the impact of drugs of abuse on these processes.

Experimental Psychology Concepts of Reward and Reinforcement

An important aspect of the use and abuse of a wide range of drugs is their reinforcing properties. A reinforcer is defined in experimental psychology as a substance or stimulus presented following a behavior that increases the incidence of the behavior above baseline levels. As Skinner wrote:

The operation of reinforcement is defined as the presentation of a certain kind of stimulus in a temporal relation with either a stimulus or a response. A reinforcing stimulus is defined as such by its power to produce the resulting change [in the response]. There is no circularity about this; some stimuli are found to produce the change, others not, and they are classified as reinforcing and non-reinforcing accordingly.

It is worth highlighting the dual use of the term stimulus by Skinner to refer to both the result of the action (the reinforcer), and a stimulus within the environment that can become associated with the response and the reinforcer. One definition of reinforcement, although not Skinner’s position, is that it involves a strengthening of the ability of stimuli to elicit responses (the so-called stimulus-response model), while a somewhat looser definition is a strengthening of the ability of the environment in general, including some neural activity within the animal itself and the animal’s past history in that environmental context, to elicit the response.

The concept of reinforcement is best known from the work of Konorski and Skinner on what is now called operant or instrumental conditioning. However, the term reinforcement has also been used in the context of Pavlovian, or classical, conditioning. One use of this term is that presentation of an unconditioned stimulus subsequent to the conditioned stimulus reinforces the ability of the conditioned stimulus to elicit a conditioned response. This term has also been used to refer to the effects of stimuli that predict the value of a rewarding stimulus presented prior to presentation of food or another naturally desirable outcome. For example, in the paradigm used by Schultz, responding for food (licking) as well as neuronal activity related to stimulus presentation and responding can be measured. Dayan and Balleine provided a nice discussion of the distinctions between reinforcement in the context of Pavlovian and instrumental conditioning.

Two forms of reinforcement, termed positive and negative, have also been postulated. Positive reinforcement refers to the process in which delivery of a desirable consequence increases the incidence of the behavior. This is easily understood in the context of the instrumental or operant conditioning paradigm in which delivery of palatable food will increase bar pressing by a rodent or key pecking by a pigeon. Negative reinforcement occurs when the performance of an action results in omission or avoidance of an undesirable stimulus (e.g., foot shock), and the incidence of the behavior increases as a result of this learning process. The initial phases of learning some skills, such as swimming, involve what might be termed negative reinforcement, as the skill helps to reduce the undesirable effects of the environment. Many investigators do not subscribe to the idea that positive and negative reinforcement are distinct processes, as both types of reinforcement basically refer to something that increases the incidence of a given behavior. However, negative reinforcement is a useful concept when measuring stimulus-behavior relationships, as it describes a condition in which increasing behavior leads to omission/avoidance of a stimulus. Learning in everyday life will often involve both positive and negative reinforcement.

Two other terms that have come to be used in the context of instrumental learning and addiction are punishment and reward. Consideration of the conditions that promote cessation of behavior led to the definition of the undesirable outcome as punishment, although the role of punishment has been hotly debated. The term reward was not so readily accepted by early behaviorists but has come into common use as a reference to the desirable outcome in an instrumental learning paradigm. The terms reward and positive reinforcement are often used interchangeably, but as we will discuss, these terms can be used to refer to different processes that control instrumental learning of actions, including drug self-administration.

Studies conducted over the last few decades have led to the refinement of the concepts of instrumental conditioning, reward, and reinforcement based on the role of the outcome produced by a particular behavior in conditioning paradigms. Dickinson, Balleine, and others have shown that responses developed under certain types of conditioning schedules will rapidly diminish if the value of the outcome is decreased or if receipt of the outcome is no longer contingent on making the response. This learning of action-outcome contingencies is best achieved with training schedules where the outcome is easily predictable and the probability of obtaining the outcome is enhanced with increased rates of responding (e.g., fixed or random ratio schedules). In this case, the outcome is said to have a rewarding action, based on its intrinsic value to the organism at the time of testing and its association with the instrumental action itself.

In contrast, training with schedules where predictability is poorer and increasing rates of responding do not increase the probability of successful outcomes (e.g., random interval schedules) produces responding that is insensitive to outcome devaluation or noncontingent presentation of the outcome. This stimulus-response type of conditioning can also occur with extensive training using schedules with higher predictability (discussed in Yin and Knowlton). In the case of stimulus-response learning, the association is made between antecedent environmental stimuli and the subsequent response, with the outcome serving as a reinforcer regardless of the immediate value of this outcome to the animal. This is closer to the classical definitions of reinforcement favored by stimulus-response theorists, Donahoe, and perhaps Skinner. It should also be noted that White drew a similar distinction between reward and reinforcement, albeit with a more traditional behaviorist emphasis on the definitions of these terms. Other investigators have defined reward in terms of positive reinforcement in combination with positive hedonic value, an idea that suggests more overlap between the two processes. Although the separate definitions of reward and reinforcement in this context may be debated, there is strong evidence for the two instrumental conditioning processes themselves. Thus, the differentiation of the roles of stimuli/environment and outcome in the two different learning processes is important, and separate discussion of reward and reinforcement in these contexts is useful.
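The contrast between ratio and interval schedules can be made concrete with a small simulation. The following is only an illustrative sketch, not a model taken from the studies cited above: the function names, the reward probability, the mean interval, and the assumption of evenly spaced responding are all hypothetical choices made for the example. It shows that on a random ratio schedule the number of outcomes earned grows roughly in proportion to the response rate, whereas on a random interval schedule it is capped by the passage of time.

```python
import random


def random_ratio(responses_per_min, minutes, p_reward=0.1, rng=None):
    """Random ratio schedule: every response pays off with probability p_reward.

    p_reward is an assumed value chosen only for illustration.
    """
    rng = rng or random.Random(0)
    n_responses = int(responses_per_min * minutes)
    return sum(1 for _ in range(n_responses) if rng.random() < p_reward)


def random_interval(responses_per_min, minutes, mean_interval_min=1.0, rng=None):
    """Random interval schedule: an outcome becomes available after a random
    (exponentially distributed) wait, and the next response collects it.

    Responses are assumed to be evenly spaced in time, a simplification.
    """
    rng = rng or random.Random(0)
    n_responses = int(responses_per_min * minutes)
    if n_responses == 0:
        return 0
    gap = 1.0 / responses_per_min  # minutes between successive responses
    armed_at = rng.expovariate(1.0 / mean_interval_min)
    rewards = 0
    for i in range(1, n_responses + 1):
        t = i * gap
        if t >= armed_at:
            # The outcome was waiting; collect it and start a new random wait.
            rewards += 1
            armed_at = t + rng.expovariate(1.0 / mean_interval_min)
    return rewards


if __name__ == "__main__":
    for rate in (10, 40):  # responses per minute
        rr = random_ratio(rate, 1000, rng=random.Random(1))
        ri = random_interval(rate, 1000, rng=random.Random(1))
        print(f"rate={rate}/min  random ratio: {rr}  random interval: {ri}")
```

In this sketch, quadrupling the response rate roughly quadruples the outcomes earned on the ratio schedule but barely changes them on the interval schedule, which is one way to picture why the action-outcome contingency is weaker, and harder to learn, under interval schedules.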

Before we consider how reward and reinforcement contribute to addiction, it is worth discussing the adaptive purpose of these neural systems. Behaviors that lead to enhanced survival and/or reproduction are necessary for propagation of genes and species. Innate feeding, reproductive, and harm-avoidance behaviors exist in all animals, but learning about features of the environment is necessary to obtain the opportunity to express these innate behaviors. Pavlovian conditioning is one such learning process whereby performance of something approximating an innate or reflexive behavior can come to be elicited by stimuli that were originally neutral with respect to predicting a particular outcome (e.g., obtaining food or avoiding harm). Instrumental conditioning adds another layer of sophistication to this process. Animals with this capacity can learn to perform new actions and new sets of actions to obtain a positive consequence or avoid punishment. Both types of learning have obvious adaptive utility, as the animal can now integrate complex features of the world and new behavioral strategies into maintaining safety, as well as the quest for food and mating partners. The power of the neural mechanisms involved in reward and reinforcement likely derives from this relationship to survival and reproductive success.

However, there is the possibility that reward and reinforcement mechanisms will not always be used for adaptive purposes. One such example is the phenomenon of self-starvation. Animals that are trained to perform an intracranial self-stimulation task, described later, will perform this task at the expense of sufficient eating if access to food is time-restricted. Similar self-starvation is observed if animals are given the opportunity to run on a wheel when on a limited food access schedule. This particular form of self-starvation has been considered as a model of human anorexia nervosa, which itself is clearly an example of maladaptive behavior involving the brain systems we will consider. Stimuli that originally signal a positive outcome can change their predictive value (a certain location may contain food at one time and a predator at another). Furthermore, stimuli or substances that interact with the neural mechanisms involved in reinforcement may come to have reinforcing value even when they are not coupled to a favorable outcome, or even when they are associated with harmful results. Most drugs of abuse can act in this manner, and can lead to reinforcement of what we might call maladaptive behaviors. In the remainder of this chapter we will consider the brain circuitry and cellular and molecular mechanisms involved in reinforcement. Consideration of this topic will also entail some discussion of the experimental techniques used to uncover these mechanisms.
