From A to CBT: Behavioral Treatments for Substance Abuse: May 2012

Saturday, May 26, 2012

       In their article "Addiction," Robinson and Berridge (2003) explore drug addiction by asking why, although many people experiment with drugs, only some become "addicts" (where "becoming an addict" is operationalized as developing a pattern of compulsive drug-seeking and self-administration which often comes at the expense of other activities). In particular, they seek to discover, first, what it is about certain individuals that makes them particularly susceptible to developing such habits, and second, what makes it so difficult for these individuals to overcome their drug-taking habits once they are established.

       In attempting to answer these questions, Robinson and Berridge (2003) begin by suggesting that the potent and addictive effects of drugs rely upon their ability to "hijack" the brain systems normally involved in learning about rewards and in enabling organisms to attribute motivational value to stimuli important to survival (Kelley and Berridge, 2002). They also argue that successive drug administrations can change these neural systems in ways that lead drug-taking behavior to become increasingly compulsive. Among the neural systems involved in these processes, they include those of the "... dopamine projections from the ventral tegmental area and the substantia nigra to the nucleus accumbens (NAcc) and striatum, as well as glutamate inputs from the prefrontal cortex, amygdala, and hippocampus and other key parts of the system to which [they] refer to as NAcc-related circuitry," (Robinson and Berridge, 2003, p. 26). According to Robinson and Berridge (2003), the connection between these systems and the process of addiction is rooted, first, in the fact that drugs engage these neural systems more powerfully than natural rewards can, and second, in the fact that over successive administrations, drugs can actually change the way in which these systems operate, in such a way as to make drug-taking behavior increasingly compulsive (Robinson and Berridge, 1993, 2000).

Robinson and Berridge (2003) go on to present several possible accounts for what these drug-induced changes in psychological functioning might be, and of how they might come about as a result of drug-induced changes in brain circuitry.

         One of these involves the Opponent Processes Theory of Motivation (Solomon & Corbit 1973), in which a compensatory b-process (a process designed to return the system to a state of homeostasis following a drug-induced disruption) originally appears only as a gradual decay in the intense reward state induced by drug self-administration. Over time, the b-process begins to come on earlier and to be stronger when it does so. This strengthening eventually allows the b-process to compete successfully with the a-process (the set of physiological processes triggered by the ingestion of the drugs themselves- e.g., the rapid heart rate and feelings of euphoria which follow the ingestion of cocaine). The result of this "competition" between the stimulus-bound, drug-induced a-process and the compensatory b-process is that the effects of the a-process come to be greatly attenuated (i.e., the addict obtains progressively less of the desired effects from each successive drug self-administration). Furthermore, the now-stronger b-process causes progressively more severe and debilitating withdrawal symptoms (the withdrawal symptoms arise because, over time, the b-process becomes able not only to counteract the effects of the a-process, but also to outlast them). Crucially, only the effects of the b-process change over time. The effects of the stimulus-dependent a-process neither increase nor decrease over time; rather, they become increasingly attenuated solely as a result of the increasing ability of the b-process to counteract them. The withdrawal symptoms tend to be particularly severe when the addict is exposed to the types of environments and other conditioned stimuli in the presence of which they would usually have taken the drugs, but is deprived of the opportunity to self-administer the drugs (Siegel et al. 2000).
At the same time, the increasing attenuation of the a-process results in the addict having to take ever-increasing amounts of the drug in order to obtain the same degree of desired effects as before (the phenomenon of the buildup of "drug tolerance"). The overall result is the seemingly paradoxical phenomenon of addicts continuously self-administering drugs, to the detriment of their health and personal well-being, not for the purpose of attaining any sort of pleasurable effects, but rather solely with the goal of avoiding the debilitating withdrawal symptoms which are certain to ensue should they cease to do so! In this way, as Robinson and Berridge argue, drug taking can become truly compulsive (Robinson and Berridge, 2003). Interestingly, according to Robinson and Berridge (2003), "Once the b-process is strengthened, even a small drug dose can instate it and thereby trigger withdrawal again. Conversely, prolonged abstinence from the drug would decay the b-process, and the ability to reactivate it would return back to normal. Once the b-process returns back to normal, the person would no longer be addicted," (Robinson and Berridge, 2003, p. 28). This presents a strong argument for treatment programs which emphasize total abstinence from alcohol. It also goes some distance in explaining the phenomenon of people who recently graduated from detoxification programs accidentally overdosing on their "one last time" intake of their drug of choice.
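
The dynamics described above can be illustrated with a small toy simulation. This is only a sketch under stated assumptions, not a model taken from Robinson and Berridge (2003) or Solomon and Corbit: the function names, the linear growth of the b-process, and every numeric value (a_strength, b_start, b_gain, b_max, decay_rate) are hypothetical choices made purely for illustration. The a-process is held constant, the b-process grows with each dose, the felt "high" is the difference between the two, and withdrawal is taken to equal the strength of the b-process.

```python
def simulate_opponent_process(n_doses, a_strength=1.0, b_start=0.2,
                              b_gain=0.15, b_max=1.5):
    """Toy model: return one (dose, net_high, withdrawal) tuple per dose.

    All parameter values are illustrative assumptions, not empirical estimates.
    """
    b_strength = b_start
    history = []
    for dose in range(1, n_doses + 1):
        # The a-process never changes; the high shrinks only because
        # the b-process counteracts more of it on each dose.
        net_high = max(a_strength - b_strength, 0.0)
        # The b-process outlasts the a-process, so its strength is used
        # here as a stand-in for the severity of withdrawal.
        withdrawal = b_strength
        history.append((dose, round(net_high, 3), round(withdrawal, 3)))
        # Repeated use strengthens the b-process, up to an assumed ceiling.
        b_strength = min(b_strength + b_gain, b_max)
    return history


def decay_b_process(b_strength, abstinent_periods, decay_rate=0.2,
                    b_baseline=0.2):
    """Toy model of abstinence: the b-process decays back toward baseline."""
    for _ in range(abstinent_periods):
        b_strength = b_baseline + (b_strength - b_baseline) * (1.0 - decay_rate)
    return b_strength


if __name__ == "__main__":
    for dose, net_high, withdrawal in simulate_opponent_process(8):
        print(f"dose {dose}: high={net_high}, withdrawal={withdrawal}")
```

In this sketch the high falls and the withdrawal rises across doses even though the a-process never changes, which mirrors the tolerance and withdrawal pattern the theory predicts; a long enough run of decay_b_process brings the b-process back near its baseline, corresponding to the claim that prolonged abstinence would end the addiction.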

       In their explanation of how the Opponent Processes Theory of Motivation might work at the neural level, Robinson and Berridge (2003) suggest that "... the positive a-process is caused by activation of mesolimbic dopamine projections to the nucleus accumbens and amygdala which in turn govern the reinforcing properties of drugs," (Robinson and Berridge, 2003, p. 29) (Koob and Le Moal, 1997). Furthermore, they propose that the buildup of tolerance that follows repeated drug self-administration comes as a consequence of a "... downregulation in the mesolimbic dopamine system," (Robinson and Berridge, 2003, p. 29). They also note that sudden cessation of drug intake (e.g., as might come about when people quit "cold turkey"), "... causes dopamine (and serotonin) levels to drop below normal levels... resulting in a dysphoric B-state of withdrawal," (Robinson and Berridge, 2003, p. 29). As if this were not enough, sudden cessation of drug self-administration apparently also causes the activation of "...additional compensatory b-processes via the hypothalamic-pituitary axis stress system, causing the release of corticotropin releasing factor (CRF) in the amygdala, as well as other stress responses," (Robinson and Berridge, 2003, p. 29) (Koob and Le Moal, 1997). As these researchers argue, such effects cause addicts to enter what has quickly become (as a result of these neurochemical changes) a negative emotional and motivational state. In order to escape the aversive nature of this state, addicts may attempt to "self-medicate" by resuming at least some use of the drug to which they were previously addicted, rekindling their previous dependence upon it.

     Robinson and Berridge (2003) go on to offer several interesting criticisms of the Opponent Processes Theory of Motivation as applied to drug addiction. For one thing, they argue, withdrawal (induced by activation of the b-process, as noted above) may actually be a far less powerful motivator of drug-seeking behavior than either the pleasurable effects which come as a direct result of taking the drugs or the effects of stress (Stewart and Wise, 1992). For example, Stewart and Wise (1992) used rats and lever-pressing behavior to elucidate which factors might stimulate relapse among rats which had previously been addicted to drugs (cocaine or heroin specifically) but which had since been deprived of the drugs for long enough to break their addiction. In that study, Stewart, Shaham, and other researchers measured the amount of lever-pressing which the rats were willing to perform under extinction conditions in order to obtain an injection of the drug to which they had previously been addicted (Stewart and Wise, 1992). The researchers' manipulation consisted of activating either the a-process or the b-process. Activating the a-process was done by giving the rats a small injection of the drug to which they had previously been addicted (Stewart and Wise, 1992). To activate the b-process, the researchers administered a drug called naltrexone to the rats (Stewart and Wise, 1992). Naltrexone is an opioid antagonist, which means that it blocks the brain's naturally-occurring opioid receptors, with the result that when it is administered to individuals who are dependent on the opioid drug heroin, these individuals begin experiencing withdrawal symptoms (Stewart and Wise, 1992).
For a heroin addict, the state which follows an administration of naltrexone is similar to the drug-withdrawal state which they would experience if they suddenly ceased self-administering heroin (Stewart and Wise, 1992). As Robinson and Berridge (2003) explain, in the context of the Opponent Processes Theory of Motivation as applied to drug addiction, the artificial "withdrawal" state which naltrexone induces among former drug addicts should be the most powerful force for getting them to relapse and resume self-administering heroin (this is the case because, according to the theory's explanation of drug addiction and relapse, the withdrawal symptoms which individuals experience after ceasing to self-administer a drug are the most powerful force motivating them to relapse). Interestingly, however, the results of the Stewart and Wise (1992) experiment suggest that either the administration of a small amount of the drug of choice or exposure to a stressor is actually a much more powerful means of inducing a relapse in drug-seeking behavior than the naltrexone-induced experience of withdrawal symptoms! (Stewart and Wise 1992). In other words, activation of the a-process (as operationalized by exposing the rats to a small amount of the drug to which they had previously been addicted) is actually much more powerful than activation of b-process withdrawal symptoms. This would imply that there must be another force at work, aside from the experience of the withdrawal symptoms alone, motivating individuals to continue self-administering the drugs.

As Robinson and Berridge (2003) also note, the Opponent Processes Theory of Motivation has difficulty explaining the unfortunate case in which individuals who were previously addicted to drugs, but who have since been abstinent for a significant period of time, suddenly begin self-administering drugs again, seemingly of their own volition. This is especially puzzling given that, by this time, the b-process (which will have been causing negative withdrawal symptoms) should have decayed away long ago- and thus, according to the Opponent Processes Theory of Motivation, these individuals should no longer be addicted to drugs!
A possible explanation for this, as Domjan notes, is that the cues surrounding the self-administration of a drug (e.g., being in a particular location, being surrounded by particular friends, etc.) become, over many consecutive drug self-administrations, conditioned stimuli predicting the impending self-administration of drugs. This allows them to trigger compensatory b-processes, so that an individual's body compensates ahead of time for the dose of the drug which the conditioned cues predict. In accordance with this, studies with both humans and animals have shown that such conditioned cues can indeed elicit withdrawal symptoms, at least under some conditions (Robinson and Berridge, 2003). The problem, however, is that they often do not; in fact, as Robinson and Berridge (2003) note, many former addicts report that such cues fail to do so! The fact that conditioned cues related to the self-administration of drugs do not appear powerful enough, on their own, to explain why relapse occurs has led researchers to search for different explanations.

       One such alternative, which Robinson and Berridge (2003) also present, is the possibility that addiction to drugs comes as the result of learning processes. According to Robinson and Berridge (2003), this view came about as a result of a series of findings illustrating the role of neural circuitry, particularly that associated with the nucleus accumbens, in learning about rewards. Most hypotheses attempting to explain drug addiction through learning processes argue that what occurs specifically is that "... drugs produce abnormally strong or aberrant associations involved in reward learning, more powerful than natural reward associations," (Robinson and Berridge, 2003). According to Robinson and Berridge (2003), the specific type of learning occurring in these situations could be of almost any kind.

The first possibility which Robinson and Berridge (2003) consider is that of drugs being involved in explicit learning, in which the association formed just happens to be abnormally strong. As Robinson and Berridge (2003) state, in such a case, individuals addicted to drugs will simply have formed an unusually strong action-outcome association between the action of self-administering drugs and the outcome of receiving them, as well as another, also unusually strong, association between perceiving certain environmental cues and the impending delivery of drugs (Balleine and Dickinson, 1998).
       Evidence for the fact that humans establish Response-Outcome associations in this manner comes, for instance, from a study by Balleine and Dickinson (1998) establishing the existence of what they call a "contingency degradation effect"- the idea that reducing the contingency between an action and a desired outcome by adding extra instances of the delivery of the outcome (reinforcer) should (and does) reduce the number of instances of the action.
      Similarly, a study by Colwill and Rescorla (1986) established the existence of what they called the "Reinforcer Devaluation Effect"- the idea that, if a subject performs an action in order to get a desired effect, then reducing the desirability of the effect should lead to a decrease in the number of instances of the action preceding it (which is exactly what happens).
      The fact that Robinson and Berridge (2003) include animals, as well as people, in this category of individuals who can form such abnormally-strong associations differentiates their work from, for example, that of Thorndike, who, as Domjan states, throughout his studies of animals learning to escape from puzzle boxes and the like, remained convinced that the animals' escapes were merely the result of trial-and-error learning, rather than of their becoming aware, in any kind of intelligent way, of the specific actions they took to escape the box and the ensuing positive outcome. Rather, he believed the animals initially made their escapes through trial-and-error, and then simply repeated the same type of escape over and over again.
        According to Robinson and Berridge (2003), "Abnormally strong explicit learning might distort declarative memories or expectations in two ways. (a) Conscious memories of the hedonic drug experience might be especially vivid and/or abnormally intrusive. (b) Drugs could exaggerate or distort declarative memories such that memory-based cognitive expectations about drugs become excessively optimistic," (Robinson and Berridge, 2003, p. 32). As Robinson and Berridge (2003) go on to explain, however, addicts' excessive drug-taking behaviors do not seem to be tied to distorted memories and excessively positive expectations about the consequences of doing so. Evidence for this comes through such sources as what the addicts themselves say about their lives and the amount of pleasure they expect from drug taking- which they themselves agree does not justify the excessive costs of continuing to self-administer controlled substances (Robinson and Berridge, 2003).
These conclusions are also consistent with the results of the previously-mentioned study by Colwill & Rescorla (1986) establishing the existence of the reinforcer devaluation effect- the main conclusion of which was that devaluing a desired reinforcer should lead to a decrease in the occurrences of actions performed with the goal of obtaining that reinforcer. Since, as Robinson and Berridge (2003) claim, most addicts do not have unrealistically positive expectations regarding the outcomes which drug-seeking and drug-taking behaviors produce, it appears that the "outcome" in this situation (i.e., whatever sensations they obtain from taking the drugs that they do) has effectively been "devalued" for them. Given this, according to the results of the study by Colwill and Rescorla (1986), the drug-seeking behaviors should lessen or cease. Thus, it makes sense that Robinson and Berridge (2003) claim that drug addiction likely cannot be explained as being rooted in the formation of conscious, although overly-intense and overly-positive, response-outcome associations between the response of drug self-administration and the outcome of drug-induced "highs."

In response to this, Robinson and Berridge (2003) consider an alternative explanation for drug addiction- that of it being the result of the establishment of deeply-ingrained and implicit (rather than declarative) S-R (stimulus-response) or S-S (stimulus-stimulus) habit associations within an addict's mind. As Robinson and Berridge (2003) note, the most promising of these possibilities is that of addicts' drug habits progressing from what are initially readily-verbalized, declarative-memory associations between the action of taking a drug (a response) and the outcome of a "high," which, based on previous cognitive expectations, has been labeled a "desirable outcome," to more automatic, stimulus-response guided behavior. As habits, the latter associations occur without the involvement of conscious memory processes. Essentially, such hypotheses suggest that, as was suggested in my Fundamentals of Learning class lecture, over-learned action patterns often become automatic habits. What Robinson and Berridge (2003) add to this is the idea that, over time, such habits can actually become compulsive.
This idea also makes sense in light of a study conducted by Adams (1982), which sought to investigate the question of why behavior is mediated by goal-directed, response-outcome associations at some times, and by habitual, stimulus-response associations at other times. In that study, rats were initially trained to press a lever (make a response) in order to obtain the reward of one sucrose pellet (an outcome/reinforcer) (Adams 1982). Initially, the schedule of reinforcement in this study was one of "continuous reinforcement," or a "fixed-ratio 1 (FR1) schedule"- that is, the rats were rewarded with a sucrose pellet for every lever press they made (Adams 1982). The rats were then divided into two groups: one group received 100 trials of training, while another group received 500 trials of training (Adams 1982). Then, the rats in each of these two groups were further divided into two groups (i.e., the rats in the "500 trial group" were divided into two groups, as were the rats in the "100 trial group") (Adams 1982). For one of the two groups of rats in each of the trial conditions (i.e., the 500-trial group and the 100-trial group, respectively), the sucrose-pellet outcome was then devalued through pairing the administration of the sucrose-pellet outcome with the administration of an aversive lithium chloride stimulus (Adams 1982). For the other half of the rats in each number-of-trials condition, the outcome was not devalued (i.e., the sucrose pellets were still delivered normally) (Adams 1982). Then, both groups of rats were tested under extinction conditions in order to see how many lever presses they would make (Adams 1982).
Fascinatingly, for the rats in the 100-trial group, the lever-pressing behavior remained goal-directed; that is, the rats would only press the lever if doing so allowed them to attain a desired outcome (that of being rewarded with regular sucrose pellets, which they saw as desirable, and not lithium-chloride sucrose pellets, which they saw as undesirable) (Adams 1982). Among the rats in the "500 trial group," however, all rats continued to press the lever, whether or not the sucrose-pellet outcome for doing so had been devalued! (Adams 1982). The results of this experiment by Adams (1982) lend further support to the idea that overtraining a behavior can cause it to transition from being a goal-directed behavior to becoming one that is habitual and mindlessly insensitive to the outcome it might result in.

However, in response to this, Robinson and Berridge (2003) offer the interesting criticism that while an overtrained behavior is indeed likely to become habitual, in their words, "... habits are not necessarily likely to become compulsive in any motivational sense, no matter how automatic they are," (Robinson and Berridge, 2003, p. 33). To this end, Robinson and Berridge (2003) present the examples of habits such as brushing one's teeth and the like and argue that, while these are habits and thus require minimal cognitive attention, they are not motivationally compulsive; people, with very few exceptions, do not feel absolutely compelled to do them, especially in the face of great sacrifice. Indeed, Robinson and Berridge (2003) argue that the sheer flexibility and the great range of behaviors in which addicts are willing to engage in order to obtain the drugs they crave, as well as the severely agitated and distressed manner in which many addicts react when they are suddenly deprived of a routine dose of the drugs to which they have become accustomed, mandate a search for further explanations of drug addiction, beyond simply the formation of strong habits.

Similarly, while Robinson and Berridge (2003) briefly consider the possibility that drug addiction is mediated by a distorted form of learning in which an addict might form an unreasonably strong association between, for instance, a particular location and the environmental stimuli present there, and the administration of a drug, they ultimately dismiss the possibility that such associations might be able to elicit compulsive drug taking. Rather, they suggest that "... most S-S associations may actually remain normal in addicts. What is aberrant in addiction is the response of brain motivational systems to Pavlovian-conditioned drug cues," (Robinson and Berridge, 2003, p. 35).

Thus, in sum, Robinson and Berridge (2003) argue that "... learning and incentive motivational processes are joined in the transition to addiction," (Robinson and Berridge, 2003, p. 36). This view is evident in their own theory of drug addiction, which they refer to as that of "incentive sensitization," (Robinson and Berridge, 2003, p. 36).

According to the "incentive sensitization," (Robinson and Berridge, 2003, p. 36) view of addiction pioneered by Robinson and Berridge (2003), drug addiction involves "drug-cues triggering excessive incentive motivation for drugs, leading to compulsive drug seeking, drug taking, and relapse," (Robinson and Berridge, 2003, p. 35), (Robinson and Berridge, 1993, 2000). The main idea behind this theory is that the self-administration of drugs creates lasting changes in the brain's nucleus accumbens-related structures. The structures which drugs tend to modify are those which attribute salience (or lack thereof) to various environmental stimuli. As a consequence of these drug-induced modifications, areas in the nucleus accumbens become lastingly hypersensitive [or as Robinson and Berridge call it, "sensitized," (Robinson and Berridge, 2003, p. 36)] to the effects of particular drugs and to the stimuli predicting their administration. These changes lead areas in the nucleus accumbens to attribute excessively high stimulus salience to drugs and to cues related to their self-administration- a process which, on a psychological level, leads to an extreme and abnormal desire to take drugs. As Robinson and Berridge (2003) go on to explain, such abnormal "wanting" "...can sometimes become manifest implicitly in drug-seeking behavior," (Robinson and Berridge, 2003, p. 36). The fact that this often happens at an implicit level means that individuals often remain unaware, on a conscious level, of the processes taking place [although Robinson and Berridge (2003) go on to explain that other cognitive processes can sometimes cause the representations which these processes involve to become consciously manifest, resulting in the affected individuals developing a very consciously-noticeable desire to take drugs].
       Significantly, Robinson and Berridge (2003) draw a distinction between "the sensitized neural systems responsible for incentive salience," (Robinson and Berridge, 2003, p. 36)- in other words, those responsible for how much a particular drug might be "wanted"- and "the neural systems that mediate the hedonic effects of drugs, how much they are 'liked,'" (Robinson and Berridge, 2003, p. 36). They further suggest that these two processes are governed by different areas of the brain, and thus largely operate independently of each other.
       In defining "sensitization," Robinson and Berridge (2003) speak of it as being largely the opposite of tolerance; that is, in sensitization, each subsequent self-administration of a drug actually increases its effects. According to Robinson and Berridge (2003), there are two major classes of sensitization: psychomotor sensitization and motivational sensitization. Both, however, are sensitive to changes in the neural structure of the nucleus accumbens- a process which they argue takes place when individuals repeatedly self-administer drugs. Interestingly, psychomotor sensitization, at least, is context-specific; for instance, rats which had been exposed to stimulants which resulted in their exhibiting symptoms of psychomotor sensitization only showed these symptoms when they were subsequently tested in an environment that contained many of the same cues as had been present when the drugs were originally being administered to them (Robinson and Berridge, 2003).
       Interestingly, as Robinson and Berridge (2003) note, individuals can differ in their degree of susceptibility to sensitization- something which they argue can be helpful in unraveling the puzzle of why some individuals, but not others, become drug addicts. Furthermore, they note that "... once sensitized, most individuals show cross-sensitization, which means that sensitization to one drug can cause sensitized effects for other drugs as well," (Robinson and Berridge, 2003, p. 38). Finally, Robinson and Berridge (2003) note that cross-sensitization can even occur between drugs and nondrug stress! Taken together, these factors begin to suggest how such factors as gateway drugs and exposure to stress may come into play in addiction.

In terms of the specific changes in the brain involved in sensitization, Robinson and Berridge (2003) argue that "sensitization-related changes have been described in many neurotransmitter systems that are integral to the function of NAcc-related circuits including serotonin, norepinephrine, acetylcholine, opioid and GABA systems," (Robinson and Berridge, 2003, p. 38). Thus, the changes produced in the nucleus accumbens by sensitization appear to be quite widespread. Furthermore, drug-induced changes occasionally occur even at the level of individual neurons, with neurons in the nucleus accumbens and in the prefrontal cortex often showing "...changes in the lengths of dendrites and the extent to which dendrites are branched," (Robinson and Berridge, 2003, p. 38). In addition, "...changes can also occur in the density and types of dendritic spines, which are primary site of excitatory glutamate synapses," (Robinson and Berridge, 2003, p. 39). As Robinson and Berridge (2003) argue, such changes may significantly alter the way information about rewards is processed in this area of the brain.

Overall, these effects can go some distance in explaining not only why it is that some people become addicted to drugs, but also why many of these individuals continue self-administering them, even in the face of deriving little pleasure and, often, much suffering as a result of continuing to do so.

                                                                      References

Adams, C. D. (1982). Variation in the sensitivity of instrumental responding to reinforcer devaluation. The Quarterly Journal of Experimental Psychology B: Comparative and Physiological Psychology, 34B(2), 77-98. Retrieved from http://www.tandf.co.uk/journals/pp/02724987.html

Colwill, R. M., & Rescorla, R. A. (1986). Postconditioning devaluation of a reinforcer affects instrumental responding. Journal of Experimental Psychology: Animal Behavior Processes, 11(1), 120-132. Retrieved from http://www.apa.org/pubs/journals/xan/index.aspx

Domjan, M. (2009). Learning and behavior. (6 ed., pp. 107, 115). Belmont, CA: Wadsworth, Cengage Learning.

Kelley, A. E., & Berridge, K. C. (2002). The neuroscience of natural rewards: Relevance to addictive drugs. Journal of Neuroscience, 22(9), 3306-3311.

Koob, G. F., & Le Moal, M. (1997). Drug abuse: Hedonic homeostatic dysregulation. Science, 278(5335), 52-58. doi: 10.1126/science.278.5335.52

Robinson, T. E., & Berridge, K. C. (2003). Addiction. Annual Review of Psychology, 54, 25-53. doi: 10.1146/annurev.psych.54.101601.145237

Robinson, T. E., & Berridge, K. C. (2000). The psychology and neurobiology of addiction: An incentive-sensitization view. Addiction, 95(Suppl2), S91-S117. doi: 10.1080/09652140050111681

Robinson, T. E., & Berridge, K. C. (1993). The neural basis of drug craving: An incentive-sensitization theory of addiction. Brain Research Reviews, 18(3), 247-291. doi: 10.1016/0165-0173(93)90013-P

Siegel, S., Baptista, M. A. S., Kim, J. A., & Weise-Kelley, L. (2000). Pavlovian psychopharmacology: The associative basis of tolerance. Experimental and Clinical Psychopharmacology. Special Issue: The Decade of Behavior: Psychopharmacology and Substance Abuse Research, 8(3), 276-293. doi: 10.1037/1064-1297.8.3.276

Solomon, R. L., & Corbit, J. D. (1973). An opponent process theory of motivation: I. temporal dynamics of affect. Psychological Review, 81(2), 119-145. doi: 10.1037/h0036128

Stewart, J., & Wise, R. A. (1992). Reinstatement of heroin self-administration habits: Morphine prompts and naltrexone discourages renewed responding after extinction. Psychopharmacology, 108(1-2), 79-84. doi: 10.1007/BF02245289

Saturday, May 19, 2012

In their article "The Neural Mechanisms of Drug Addiction," Hyman, Malenka, and Nestler (2006) approach the long-standing problem of drug addiction from the point of view of understanding the addictive nature of drugs as being rooted in the ability of the addictive substances to hijack the brain circuitry normally involved in reward-related learning. Specifically, they believe that the cause for the compulsive nature of drug addiction (i.e., the fact that addicts continue self-administering the addictive substances despite full knowledge of the negative consequences of doing so) is to be found among the same areas of the brain as are responsible for associative learning and the formation of long-term memories. These areas of the brain include the ventral (or "underside of") and dorsal (or "upper side of") regions of the striatum, an area of the brain involved in the anticipation of reward, (Speert, 2012), which, as Hyman, Malenka, and Nestler (2006) go on to explain, is sensitive to electrochemical signals from dopamine neurons located in the midbrain.
Hyman et al. (2006) begin their article by introducing drug addiction, and especially the problem of relapse despite a patient's willingness to stop self-administering a particular drug, as a serious public health problem. The solution, the creation of improved treatment programs, can only come from a greater understanding of the specific neural processes involved in addiction itself (Hyman, Malenka, and Nestler, 2006).
While Hyman et al. (2006) note that a great deal of progress has already been made, especially through the use of animal models (largely because the drugs themselves can readily be identified as the cause of the addiction), they emphasize the importance, and the difficulty, of linking findings obtained with animals, whose worlds can be carefully controlled by experimenters in the laboratory, to the experiences of human beings, whose worlds are far more complex and not subject to careful control and experimental manipulation. They argue, for example, that while much is known about the immediate and short-term effects of the binding of addictive drugs to specific sites in the brain, the long-term effects of such binding, and the changes in the brain it might engender, merit additional research. Thus far, as they explain, several distinct types of adaptation to long-term drug exposure have been found. One is homeostatic adaptation, which was discussed in a previous blog post in the context of the opponent-process theory of motivation. In that account, the self-administration of a psychoactive drug triggers a stimulus-dependent "a" process, which represents the disruption of homeostasis caused by the drug, and a "b" process, which comes online to balance out the effects of the "a" process. Over the course of repeated drug self-administrations, the "b" process comes to be initiated by cues signaling the impending self-administration of the drug, grows stronger, and comes online sooner. The result is a decreased overall effect of drug-taking and the buildup of drug tolerance: the need for an addict to take an ever-increasing amount of a drug of abuse in order to get the same effect as they did before the "b" process was strengthened (Domjan, 2009, p. 117). Another, in the words of Hyman et al. (2006), consists of "synapse-specific 'Hebbian' adaptations of the type thought to underlie specific long-term associative memory" (Hyman, Malenka, and Nestler, 2006, p. 566).
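The opponent-process account of tolerance described above can be illustrated with a small toy simulation. This is only a sketch under stated assumptions: the function names, the fixed size of the "a" process, and the growth rate and ceiling of the "b" process are all hypothetical values chosen for illustration, not figures taken from Domjan (2009) or Hyman et al. (2006).

```python
def net_effect(a_strength, b_strength):
    """Net drug effect: the primary "a" process minus the opposing "b" process."""
    return a_strength - b_strength


def simulate_tolerance(n_exposures, a_strength=1.0, b_gain=0.15, b_max=0.9):
    """Return the net effect felt on each successive drug exposure.

    Assumptions (hypothetical, for illustration only):
    - the "a" process has the same size on every exposure;
    - the "b" process starts at zero and moves a fixed fraction (b_gain)
      of the way toward a ceiling (b_max) after each exposure.
    """
    b_strength = 0.0
    history = []
    for _ in range(n_exposures):
        history.append(net_effect(a_strength, b_strength))
        # The "b" process is strengthened by each exposure.
        b_strength += b_gain * (b_max - b_strength)
    return history


if __name__ == "__main__":
    for exposure, effect in enumerate(simulate_tolerance(10), start=1):
        print(f"exposure {exposure:2d}: net effect = {effect:.3f}")
```

In this sketch the same dose produces a smaller and smaller net effect as the "b" process grows, which is the pattern the tolerance account predicts; a user who wanted the original effect back would have to increase the "a" process (that is, the dose).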
Hyman et al. (2006) begin their discussion by noting that while the number and variety of natural stimuli that activate the reward circuitry of the brain is relatively large, the number of substances which can "hijack" this circuitry, thereby mimicking the effects of natural rewards, is actually relatively small, limited to just "the psychostimulants (cocaine and amphetamine), the opiates, nicotine, ethyl alcohol, and the cannabinoids..." (Hyman, Malenka, and Nestler, 2006, p. 567). Interestingly, however, once they are exposed to these drugs, humans and other animals rapidly learn the cues that predict their availability, and (as is further discussed in previous blog posts) exposure to these conditioned stimuli initiates strong cravings for the drugs and may lead to relapse among recovering addicts. Hyman et al. (2006) go on to describe this in terms of a conditioned place preference model: if rats or mice have previously been exposed to a pleasurable unconditioned stimulus (such as a drug) in a particular location, they will come to prefer that location over other locations within the same enclosure or other general space, presumably because of the association they have formed between the conditioned stimulus of being in that location and the unconditioned stimulus of receiving the drug (Hyman et al., 2006). An interesting study of this kind was conducted by Akins (1998, Experiment 1). In that study, which involved male quail, the stimuli associated with being in a particular location within the cage were conditioned to predict the availability of sexual reinforcement, in the form of the presence of a female quail. The male quail were placed in an enclosure which had two compartments that looked very different from one another. 
After a test designed to establish the male quail's baseline preference between the two compartments, the compartment which the quail preferred least was chosen to be the one for which preference would be conditioned (with the visual cues related to being in this compartment serving as the conditioned stimuli). The quail were then divided into two groups, an experimental group and a control group. The conditioning itself consisted of having one male quail from the experimental group at a time enter the chamber for which the male quail had originally demonstrated less preference and remain there for five minutes, after which a sexually receptive female quail was placed into the chamber with him, also for five minutes. Thus, for the male quail in the experimental group, the conditioned stimulus (the visual and other cues denoting the previously less-preferred conditioning chamber) was paired with the unconditioned stimulus (the appearance of a sexually receptive female bird). For the quail in the control group, by contrast, the sexually receptive female quail was presented in a different location and a full 120 minutes before their subsequent exposure to the previously less-preferred conditioning chamber and its related contextual stimuli. Thus, for the quail in the control group, the conditioned stimulus (the visual and other cues of the previously less-preferred chamber) and the presence of the sexually receptive female quail were presented in what was essentially an explicitly unpaired fashion, preventing the conditioning of an association between these two cues. 
As was predicted, members of the experimental group of quail came to develop a preference for being in the previously-less-preferred chamber following the pairings of the visual and other cues representative of this chamber with the unconditioned stimulus reinforcer of the presence of the sexually-receptive female quail (Akins, 1998, Experiment 1).
Furthermore, Hyman et al. (2006) go on to mention that while repeated self-administrations of any drug will result in homeostatic adaptations within the circuitry of the regions of the brain stimulated by the drug, the specific means by which this happens and the resulting likelihood of the development of drug tolerance and addiction, and the ways in which such phenomena might become manifest vary markedly between different types of addictive substances, "... depending on the expression patterns of each drug's receptors and the signaling mechanisms engaged by the drug stimulation in relevant cells," (Hyman, Malenka, and Nestler, 2006, p. 567). Similarly, they note that substance dependence- which they characterize as the unmasking of the changes in the brain that have been caused by the self-administration of a particular drug which become evident as soon as regular drug self-administrations cease- can also vary greatly depending upon the drug taken and the specific neural circuitry involved. They go on to give the examples of how "... withdrawal from opiates or ethanol can produce serious physical symptoms, including flu-like symptoms and abdominal cramps (opiates), or hypertension, tremor, and seizures," (Hyman, Malenka, and Nestler, 2006, p. 568).
This variability in the withdrawal symptoms that follow an individual's ceasing to self-administer a drug to which they have become accustomed is consistent with Siegel's Conditioning Model of Drug Tolerance. According to this model, over repeated self-administrations of a drug, the cues related to the impending administration come to be conditioned to predict the physiological effects of the drug. This in turn leads to the earlier and increased activation of a "b process," which attenuates the "high" or other pleasurable sensation experienced following an individual's self-administration of the drug, leading to the buildup of drug tolerance (Domjan 2009). As Domjan (2009) also explains, over the course of repeated conditioning trials, the stimuli related to the impending self-administration of a drug come to elicit a conditioned response. Pavlov's Stimulus Substitution model argues that, in the development of conditioned responding, the conditioned stimulus comes to operate in exactly the same way as the unconditioned stimulus did previously, activating the same neural connections that the unconditioned stimulus previously activated (Domjan 2009). According to this model, cues related to the impending administration of a drug should come to produce the same physiological effects as the administration of the drug itself. 
In many cases, the conditioned response to drug-related cues is indeed just like the unconditioned response to the drug itself. An example of this is a study conducted by Ehrman, Robbins, Childress, and O'Brien (1992), in which two groups of participants, one a group of former cocaine users and the other a group of men with no history of using cocaine, were exposed to three experimental conditions: one in which they were exposed to cues related to the use of cocaine, one in which they were exposed to cues related to the use of heroin (none of the participants had any experience with using heroin), and one in which cues unrelated to the use of either drug were presented. Interestingly, in that study, the former cocaine users experienced an increase in heart rate specifically when exposed to the cocaine-related cues (and not in response to the heroin-related or neutral stimuli), a result exactly in line with Pavlov's Stimulus Substitution Model! In that particular case, the cocaine-related conditioned stimuli came to evoke the same response as the unconditioned response the former cocaine users would normally have to cocaine, that is, an increase in heart rate from baseline. This is not always the case, however; in fact, in many cases, the form of the conditioned response to environmental cues which have come to be associated with the self-administration of a drug is just like the form of the compensatory unconditioned response, rather than the primary unconditioned response. The opioid drug heroin is a great example of this. 
When heroin is administered, the drug itself is an unconditioned stimulus (US) which elicits two distinct unconditioned responses. One is the primary unconditioned response (mentioned above in the context of the opponent-process theory of motivation as the "primary" or "a" process), which moves the system out of homeostasis. The other is the compensatory unconditioned response (discussed above as the "b process" in the opponent-process theory of motivation), which counteracts the effects of the primary unconditioned response and returns the system to homeostasis. In the case of heroin and many other drugs, the form of the conditioned response to drug-related cues is just like the form of the compensatory response; thus, while the primary unconditioned response to a self-administration of heroin would be a lower heart rate, lower blood pressure, and analgesia (decreased sensitivity to pain), the compensatory unconditioned response is an increase in heart rate and blood pressure, and an unpleasant increase in sensitivity to pain. This stands in sharp contrast to what Pavlov's Stimulus Substitution Model would predict, but goes some distance in explaining why the withdrawal symptoms of many drugs (e.g., heroin and alcohol) appear to be almost "the opposite of" the primary unconditioned response to the self-administration of these drugs: in the case of these drugs, the conditioned response an individual has to cues related to the consumption of the drug is similar to the compensatory unconditioned response, rather than the primary unconditioned response! The fact that this conditioned response may be similar to or different from the primary unconditioned response might also go some distance in explaining the extent of the variation in symptoms, as Hyman et al. (2006) describe them, which individuals withdrawing from different types of drugs might experience.
However, as Hyman et al. (2006) note, "Whereas avoidance of withdrawal likely contributes to ongoing drug use (especially with opiates, alcohol, and tobacco), it does not explain the most frustrating characteristic, from a clinical point of view, of addiction: the persistence of a relapse risk long after a person has ceased taking drugs," (Hyman, Malenka, and Nestler, 2006, p. 569). Instead, Hyman et al. argue that "... the primary neural substrates of persistent compulsive drug use are not homeostatic adaptations leading to dependence and withdrawal, but rather long-term associative memory processes occurring in several neural circuits that receive input from midbrain dopamine neurons," (Hyman, Malenka, and Nestler, 2006, p. 569). Such a model of drug dependence can also go further to explain why drug addiction can be so persistent and difficult to overcome; in the words of Hyman et al. (2006), "Long-term memories, unlike most homeostatic adaptations, can last for many years or even a lifetime," (Hyman, Malenka, and Nestler, 2006, p. 569). Given this, and the fact that, as they also mention, many instances of relapse are fueled by exposure to stimuli previously associated with drug use, it becomes easier to understand why individuals might suddenly resume a drug-taking habit, even years after attempting to quit (or perhaps even after quitting entirely).
Furthermore, while noting a possible role for emotional stress as well as the initiation of drug cravings following exposure to drug-related stimuli, Hyman et al. (2006) nonetheless go on to emphasize the role of brain pathways in drug addiction. Crucially, they emphasize that while drugs and other addictive substances do operate upon the same circuitry as natural rewards, the "hijacking" of natural reward circuitry in which addictive substances engage is particularly detrimental, both because, "unlike natural rewards, drug rewards tend to become overvalued," (Hyman, Malenka, and Nestler, 2006, p. 574), and because "... unlike natural rewards, addictive drugs do not serve any beneficial homeostatic or reproductive function, but instead often prove detrimental to health and functioning," (Hyman, Malenka, and Nestler, 2006, p. 571).
Furthermore, they note that evidence suggests that, in the case of both natural rewards and those related to the self-administration of drugs, it is the increase in synaptic dopamine in the nucleus accumbens region of the brain that appears to be central to their rewarding effects. As Hyman, Malenka, and Nestler (2006) go on to note, many of the neurons within the nucleus accumbens have dendritic projections which allow for the binding of many molecules at once through synapses, with these neurons receiving signals from neurons that themselves release an opioid, made within the brain, which attaches to what is called a "mu receptor." In addition, dopamine neurons in the ventral tegmental area of the midbrain supply neurons within the nucleus accumbens with dopamine. Psychoactive drugs can influence the activity of neurons in the nucleus accumbens by influencing the release of both opioids and dopamine made within the brain, by directly influencing the dopamine-sensitive neurons within the nucleus accumbens, or by having an impact upon the actions of the neurons which generate the inhibitory neurotransmitter GABA (and thereby regulate neural activity). Neurons within the cortex can also influence the neurons within the nucleus accumbens, through their release of the excitatory neurotransmitter glutamate. Significantly, changes in how the post-synaptic (receiving) cell responds to the release of glutamate can trigger long-term changes in how a particular neural circuit operates, which in turn can have important implications for learning processes (Hyman et al., 2006). In the specific case of the cortex and its release of glutamate onto the nucleus accumbens, the release of glutamate from the cortex is believed to provide the nucleus accumbens with crucial information regarding the engagement of particular sensory systems. 
To complete the picture, the dopamine neurons of the midbrain (as noted above) release dopamine onto the neurons in the nucleus accumbens, with this neurotransmitter providing neurons within the nucleus accumbens with crucial information regarding the motivational state of the organism. When these two signals are taken together (and indeed, they often occur at the same time), the dopamine signal can be seen as assigning a reward value of sorts to the level of engagement of various sensory systems (which is conveyed to the nucleus accumbens by the current level of glutamate release from the cortex) (Hyman et al. 2006). Crucially, what the animal seems to actually learn from this pairing of glutamate and dopamine release onto the nucleus accumbens appears to be the difference between the level of reward it expected and the level of reward it actually obtained (Schultz, 2006). As Domjan (2009) notes, this learning-by-surprise parallels the way learning is hypothesized to occur in the Rescorla-Wagner equation, wherein, if a US is no longer surprising for any reason, including if it is already being perfectly predicted by a conditioned stimulus previously conditioned to asymptote at the time when an additional conditioned stimulus is added, then no additional learning will take place. In the case of dopamine release from the ventral tegmental area onto the nucleus accumbens, the same principle applies; if a forthcoming burst of dopamine is already being accurately predicted by one stimulus, then another stimulus predicting the same phenomenon will fail to result in any additional learning taking place. 
In the case of drug abuse, psychoactive drugs have an unfair advantage in this process: since they can artificially ramp up dopamine release, and since this unexpected release of dopamine results in surprise, which stimulates the learning of a pairing between the sensory cues associated with the drugs of abuse and the dopamine release, drugs can artificially acquire motivational value. Tragically, this makes the habit of using them all the harder to break.
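The learning-by-surprise principle discussed above can be made concrete with a small simulation of the Rescorla-Wagner rule, in which the change in a cue's associative strength on each trial is proportional to the prediction error (the maximum strength the US can support, lambda, minus the summed strength of all cues present). The sketch below is only illustrative: the function names, the cue labels "A" and "B", the single learning_rate parameter (standing in for the model's separate salience parameters), and the trial counts are all hypothetical choices, not values from Domjan (2009) or Schultz (2006).

```python
def rescorla_wagner_trial(strengths, present_cues, lam, learning_rate=0.3):
    """Update associative strengths for one conditioning trial.

    strengths: dict mapping cue name -> current associative strength (V)
    present_cues: cues presented on this trial
    lam: the maximum strength the US supports (1.0 if the US occurs, 0.0 if not)
    learning_rate: hypothetical single parameter replacing alpha * beta
    Returns the prediction error ("surprise") computed for the trial.
    """
    predicted = sum(strengths[cue] for cue in present_cues)
    error = lam - predicted
    for cue in present_cues:
        strengths[cue] += learning_rate * error
    return error


def blocking_group(pretrain_trials=50, compound_trials=20):
    """Cue A is trained alone to asymptote, then A and B are trained together."""
    strengths = {"A": 0.0, "B": 0.0}
    for _ in range(pretrain_trials):
        rescorla_wagner_trial(strengths, ["A"], lam=1.0)
    for _ in range(compound_trials):
        rescorla_wagner_trial(strengths, ["A", "B"], lam=1.0)
    return strengths


def control_group(compound_trials=20):
    """A and B are trained together from the start, with no pretraining."""
    strengths = {"A": 0.0, "B": 0.0}
    for _ in range(compound_trials):
        rescorla_wagner_trial(strengths, ["A", "B"], lam=1.0)
    return strengths


if __name__ == "__main__":
    print("blocking group:", blocking_group())
    print("control group: ", control_group())
```

In the blocking group, cue A already predicts the US almost perfectly by the time cue B is added, so the prediction error is close to zero and B acquires almost no strength. In the control group, the US is still surprising when B first appears, so B acquires substantial strength.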

References

Akins, C.K. (1998). Context excitation and modulation of conditioned sexual behavior. Animal Learning & Behavior, 26, 416-426.

Domjan, M. (2009). Learning and behavior. (6 ed., pp. 107, 115). Belmont, CA: Wadsworth, Cengage Learning.

Ehrman, R. N., Robbins, S. J., Childress, A. R., & O'Brien, C. P. (1992). Conditioned responses to cocaine-related stimuli in cocaine abuse patients. Psychopharmacology, 107(4), 523-529. doi: 10.1007/BF02245266

Hyman, S. E., Malenka, R. C., & Nestler, E. J. (2006). Neural mechanisms of addiction: The role of reward-related learning and memory. Annual Review of Neuroscience, 29, 565-598. doi: 10.1146/annurev.neuro.29.051605.113009

Schultz, W. (2006). Behavioral theories and the neurophysiology of reward. Annual Review of Psychology, 57, 87-115. doi: 10.1146/annurev.psych.56.091103.070229

Speert, D. (2012, February 02). Neuroeconomics: Money and the brain. Retrieved from http://www.brainfacts.org/In-Society/In-Society/Articles/2012/Neuroeconomics-Money-and-the-Brain

Saturday, May 12, 2012

Another way of understanding substance abuse is through looking at how the consumption of illicit substances progresses from an initial point, during which a drug is consumed voluntarily and the motivation for doing so is tied to the drug being perceived as having pleasurable and reinforcing effects, to an ultimate end point at which an addict consumes a drug in a compulsive manner and complains of being "unable to quit," and how this transition is correlated with changes in activity in specific areas of the brain. A study by Everitt and Robbins called "Neural systems of reinforcement for drug addiction: from actions to habits to compulsion" did just that.
Interestingly, Everitt and Robbins (2005) argued that such transitions depend upon "interactions between pavlovian and instrumental learning processes," (Everitt and Robbins, 2005, p. 1481). As Domjan (2009) explains, the difference between pavlovian (or "classical") conditioning processes and instrumental (or "operant") conditioning processes is that, while undergoing classical conditioning, an organism develops an understanding of relationships between processes in its environment which it cannot directly control, and develops appropriate responses to these processes. Thus, classical conditioning can be seen as more "passive." By contrast, during instrumental conditioning (sometimes called "operant" conditioning, in the sense that an organism "performs operations" on its environment in the pursuit of a specific goal), "responding is necessary to produce a desired environmental outcome," (Domjan, 2009, p. 144).
At the neural level, Everitt and Robbins hypothesized that such a change from voluntary and purposeful self-administration of a drug to the more compulsive drug-seeking and self-administration which most people would see as characterizing an addict "... represents a transition at the neural level from prefrontal cortical to striatal control over drug seeking and drug taking behavior as well as a progression from ventral to more dorsal domains of the striatum, involving its dopaminergic innervation," (Everitt and Robbins, 2005, p. 1481). In unpacking this statement, it's important to note that while the prefrontal cortex, near the front of the brain, is an area involved with higher-order thinking and planning ("Mapping the brain," 2012), the striatum is an area lying deeper within the brain that is involved with reward and the anticipation of pleasurable outcomes (Speert, 2012).
As Everitt and Robbins go on to explain, the striatum as a whole has increasingly been implicated in both drug abuse and subsequent drug addiction and dependence. As they note, this view has gained traction due to increased understanding both of how various parts of the striatum are linked to one another, and of how behavioral output (i.e., the actions an organism engages in) is a product of both classical and instrumental conditioning. They argue that the two types of learning initially occur in parallel, but that, as drug-seeking and self-administration continue, instrumental learning comes to dominate. They also argue that, as a whole, these processes eventually result in "action-outcome and stimulus-response ('habit') learning," (Everitt and Robbins, 2005, p. 1481).
It is interesting to look at how the "action-outcome and stimulus-response ('habit') learning," (Everitt and Robbins, 2005, p. 1481) mentioned by Everitt and Robbins relates to the pioneering psychologist Thorndike's conceptualization of instrumental behavior and its outcomes as "... reflecting the learning of an S-R association," (Domjan, 2009, p. 146). As Domjan explains, Thorndike's Law of Effect "states that if a response in the presence of a stimulus is followed by a satisfying event, the association between the stimulus (S) and the response (R) is strengthened. If the response is followed by an annoying event, the S-R association is weakened," (Domjan, 2009, p. 146). Furthermore, as Domjan is careful to note, "... the consequence of the response is not one of the elements in the association. The satisfying or annoying consequence simply serves to strengthen or weaken the association between the preceding stimulus and response," (Domjan, 2009, p. 147).
Thus, in order for Thorndike's Law of Effect to successfully map onto the increasingly compulsive drug-seeking and drug-taking behavior of addicts, the addicts' behavior would have to function as follows. The response (self-administration of the drug) necessarily takes place in the presence of particular stimuli (e.g., the inside of a particular apartment, building, or other environment, or the presence of the addicted individual's friends who share their interest in taking the drug). The association between those stimuli and the response would be strengthened if self-administration of the drug led to a desirable consequence (e.g., a pleasurable "high") and weakened if self-administration of the drug led to a negative consequence (e.g., foolishly smoking marijuana while onboard an airplane and setting off the airplane's smoke detectors, which might lead to a large fine or even arrest for the offender).
It is important to note that, while the setting of "onboard an airplane" in the above example may seem a bit contrived, it illustrates something important about how such "S-R" associations (as Thorndike saw them) function: in that case, the addict's association between smoking and being on an airplane would be weakened, but there would be no change in any associations the addict may have formed between smoking and positive or negative consequences in general.
This is an important point, as it might help to explain the results of a previous study performed by Caddy and Lovibond (1976). In that study, participants, in the context of a highly controlled therapeutic setting, were exposed to a discriminated aversive conditioning procedure. The participants entered a specific room in which the bookshelves and other surfaces held numerous empty containers from various alcoholic beverages, along with other stimuli relating to the consumption of alcohol. While in this setting, the participants were fitted with shock electrodes (these were attached to their larynx). The participants were then encouraged (by the therapeutic personnel) to continue consuming alcohol beyond a pre-prescribed limit of 0.065% (as measured by a breath analysis test). If the participants did indeed continue drinking above this limit, electric shock was administered to them through the electrodes with which they had been outfitted. While the idea in that study was to get the participants to form an association between consuming alcohol above a certain pre-prescribed limit and the aversive experience of shock, what might perhaps have actually happened is that the participants formed an association, in line with Thorndike's Law of Effect, between the environmental stimulus of "being in the highly controlled therapeutic setting" (i.e., being in that room with all the bottles on display and other stimuli related to alcohol and its consumption) and the response of "consuming alcohol." Since the consumption of alcohol was, in that study, followed by the aversive stimulus of shock [according to the criteria described by Domjan (2009), this would be called "punishment" or "positive punishment," (Domjan, 2009, p. 154)], the participants' association between that setting and the consumption of alcohol was then weakened. 
However, upon completion of the study, the participants returned to the real world, wherein the settings were quite different from the highly controlled therapeutic setting which they had just left. Because the criteria for "success" in treatment in Caddy and Lovibond's (1976) study allowed the participants to still engage in moderate drinking, the participants were likely to keep frequenting the same bars and other alcohol-centered environments which they had previously gone to. And because the weakened association they had formed in therapy did not extend to the environments and settings of the outside world (a new "S") paired with the response of drinking (the same "R"), they were far less successful in abstaining from drinking when they returned to the outside world than they had been while in therapy; while Caddy and Lovibond (1976) cite success rates as high as 85% of the participants showing some type of improvement at 12 months follow-up, this figure drops to a far lower 59% at 24 months follow-up. Perhaps Thorndike's Law of Effect and the specific types of associations the participants formed during the experiment can go some distance in explaining this, especially given that the "S-R" association which (according to Thorndike's Law of Effect) is formed during conditioning includes the stimulus and the response, but not the eventual consequence of making that response (in which case, the results of that particular type of learning might be limited to a particular type of setting).
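The setting-specific nature of S-R associations under Thorndike's Law of Effect, as discussed above, can be shown with a toy sketch. Everything here is a hypothetical illustration: the function names, the setting labels, and the fixed step size are assumptions made for the example, and nothing in it is drawn from Caddy and Lovibond's (1976) data or procedure.

```python
from collections import defaultdict


def new_associations():
    """Map (stimulus, response) pairs to a strength, starting at 0.0."""
    return defaultdict(float)


def apply_consequence(associations, stimulus, response, satisfying, step=0.2):
    """Law of Effect as a toy update rule (hypothetical step size).

    Only the (stimulus, response) pair is stored. The consequence is not
    part of the association; it merely strengthens or weakens that pair.
    """
    associations[(stimulus, response)] += step if satisfying else -step
    return associations[(stimulus, response)]


def therapy_example():
    """Drinking is reinforced in a bar, then punished in a therapy room."""
    associations = new_associations()
    for _ in range(5):
        apply_consequence(associations, "bar", "drink", satisfying=True)
    for _ in range(5):
        apply_consequence(associations, "therapy room", "drink", satisfying=False)
    return associations


if __name__ == "__main__":
    for (stimulus, response), strength in therapy_example().items():
        print(f"{stimulus!r} -> {response!r}: {strength:+.1f}")
```

In this sketch, punishment delivered in the therapy room lowers only the ("therapy room", "drink") association and leaves the ("bar", "drink") association untouched, which is one way of picturing why learning acquired in one setting might not carry over to another.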
In their study, Everitt and Robbins (2005) argue that drugs act as 'instrumental reinforcers' (Everitt and Robbins, 2005, p. 1481), thus making them the "reinforcers" in Thorndike's Law of Effect. They thus "increase the likelihood of the responses that produce them, resulting in drug self-administration or 'drug-taking,'" (Everitt and Robbins, 2005, p. 1481). Furthermore, the "S" or "stimuli" portion of Thorndike's Law of Effect would be covered by the stimuli that have contiguity (close association in time and space) with the administration of the drugs. According to Everitt and Robbins, these would "gain incentive salience through pavlovian conditioning," (Everitt and Robbins, 2005, p. 1481). They argue that the "rewarding" effects of a drug likely result from the increased attention to interoceptive physiological cues which drugs produce, as well as (in the case of hallucinogens and the like) the changes in perception of the environment and other external cues that drug taking results in. They go on to mention that this can be particularly reinforcing if it occurs in relation to conditioned stimuli (those which are reliably paired with the occurrence of important environmental events). Crucially, Everitt and Robbins (2005) argue that it is the sense of control over environmental and interoceptive cues, which an individual feels that using the drug of choice allows them to obtain, that acts as the instrumental reinforcer in the context of Thorndike's Law of Effect, as applied to drug abuse and addiction.
Interestingly, Everitt and Robbins (2005) mention that conditioned stimuli (CSs) which act as signals for the impending delivery of positive reinforcers can have several other effects, aside from simply eliciting approach and consummatory behaviors. For instance, when conditioned stimuli are presented unexpectedly, increased rates of responding often result, which, according to Everitt and Robbins (2005), implies that conditioned stimuli can have motivational effects. This effect is also consistent with the results of an experiment on the effects of a shift in the quantity of a reinforcer conducted by Mellgren (1972). In his experiment, Mellgren tested the performance of four groups of rats on a runway apparatus. Initially, rats in two of the four groups received a reward of two food pellets for every successful completion of the runway task, while rats in the other two groups received a much larger reward of twenty-two food pellets for each successful trip down the runway. Then, in the second phase of the experiment, Mellgren took one group of rats from each of the two reward conditions and switched it to the opposite reward condition (i.e., one group of rats that had previously been in the two-pellet condition suddenly found itself in the twenty-two pellet condition, while another group that had previously been in the twenty-two pellet condition suddenly found itself in the two-pellet condition). The other two groups of rats continued to receive the same amount of food reinforcement for each successful completion of the runway task as they had before.
Interestingly, at the end of the first phase of the experiment, rats that were initially assigned to the twenty-two pellet condition ran only slightly faster than rats that had been assigned to the two-pellet condition. Following the switch, however, the rats that had been moved from the two-pellet condition into the twenty-two pellet condition ran much faster than they had before, while the rats that had been moved from the twenty-two pellet condition into the two-pellet condition ran much more slowly than they had before. The increase in running speed following an increase in reward from the baseline level is called positive contrast (Mellgren, 1972, p. 185), and the decrease in running speed following a decrease in reward from the baseline level is called negative contrast (Mellgren, 1972, p. 185). These phenomena led Mellgren to conclude that there are emotional and motivational aspects to conditioned responding. This is similar to the conclusion drawn by Everitt and Robbins (2005) that there is sometimes an emotional component to conditioned responding, such as might become evident upon the unexpected presentation of a conditioned stimulus.
Everitt and Robbins also argue that, on a neural level, it is the "midbrain dopamine neurons" that "show fast phasic burst firing in response to such CSs..." (Everitt and Robbins, 2005, p. 1481); dopamine is a neurotransmitter involved in pleasure and reward (Kullmann and Jennings, 2011). Everitt and Robbins (2005) also report that unexpected presentations of conditioned stimuli that were normally paired with the administration of drugs "resulted in dopamine release in the core but not in the shell region of the nucleus accumbens" (Everitt and Robbins, 2005, p. 1482). Furthermore, they report that disabling the sensitivity of the nucleus accumbens to dopamine (through either selective lesioning of core areas of the nucleus accumbens, or administration of NMDA or dopamine receptor antagonists during conditioning) results in greatly attenuated conditioned responding, while infusing either NMDA or dopamine receptor antagonists into the core region of the nucleus accumbens after conditioning results in the subject having poor memory of the conditioning procedure (Everitt and Robbins, 2005).
However, Everitt and Robbins (2005) caution against reading too much into these data. It may seem from the above that drug addiction depends upon these factors (i.e., that the presentation of a conditioned stimulus previously paired with a drug results in approach to that stimulus, or that drug-seeking, and thus self-administration, can be made more frequent via the unexpected presentation of a conditioned stimulus). Yet Everitt and Robbins (2005) argue that while these effects have been shown to be causally involved in the conditioning of animals with natural rewards, they have yet to be demonstrated definitively for the drug-seeking behaviors of humans.
Everitt and Robbins (2005) do note, however, that "In certain circumstances, CSs can also function as conditioned reinforcers" (Everitt and Robbins, 2005, p. 1482). As they go on to explain, this occurs when certain stimuli which were "initially motivationally neutral" [i.e., they did not have anything about them which made them immediately relevant to an organism's biological needs] become reinforcing in their own right via association with primary reinforcers such as food or drugs. "These stimuli help to maintain instrumental responding by bridging delays to the ultimate goal..." (Everitt and Robbins, 2005, p. 1482). Such "bridging reinforcement" via conditioned reinforcers that have come to function in the manner of primary reinforcers is important, because conditioned responding can be severely disrupted by even seemingly insignificant temporal delays in the delivery of a reinforcer. Domjan (2009) explains that this occurs for several reasons, including the fact that a delay can make it difficult for a participant to determine which of several responses actually led to the delivery of the reinforcer, since they have likely performed several different actions since the delivery of the previous reinforcer (e.g., pressing a lever, walking about the cage in a circle, sniffing the food delivery magazine). Using a conditioned stimulus that was previously trained with the reward, and has thus come to be associated with it, is one way to overcome this potential difficulty (Domjan, 2009).
A study by Cronin (1980) involving pigeons and a visual discrimination task supports this explanation. In her study, Cronin divided pigeons into four groups, and varied between the groups the conditions to which they were exposed during the delay interval between an instrumental response and reinforcement. One group, called the "nondifferential" group (Cronin, 1980, p. 352), was presented with the same type of stimulus between an instrumental response and a reward, whether they had made a correct or an incorrect response. The other three groups of pigeons, by contrast, were exposed to different stimuli during the delay period, depending on whether they had made a correct or an incorrect choice. The first of these, the "differential" group (Cronin, 1980, p. 352), received the differential stimuli over the course of the entire delay period. The second, the "reinstatement" group (Cronin, 1980, p. 352), received the differential stimulus during the ten seconds immediately following their response and again during the ten seconds immediately preceding the reinforcer. The last, the "reversed cue" group (Cronin, 1980, p. 352), was treated in the same way as the "reinstatement" group, in that the birds were exposed to a differential stimulus during the ten seconds immediately following a response and during the ten seconds prior to the end of the delay. For this group, however, the cues were indeed reversed: the cue presented during the ten seconds following an incorrect response was also the cue presented during the ten seconds before the reinforcer following correct responses. Likewise, the stimulus normally presented during the ten seconds after a correct response was also presented during the ten seconds before the lack of reward on nonreinforced trials (trials on which the birds had failed to make the correct response).
The results of the study indicated that pigeons in the "nondifferential" group failed to learn how to make appropriate instrumental responses to the task, while birds in both the "differential" and "reinstatement" conditions had no difficulty learning the task (Cronin, 1980). Finally, the "reversed cue" birds learned to make a response, but the wrong response! Thus, this study highlights the importance of the delay interval between an instrumental response and the presentation of a reinforcer (or nonreinforcement), and of the stimuli presented during it. Those stimuli can aid the formation of conditioned responding, prevent it, or even condition the animal to respond incorrectly.
In their discussion of how CSs can sometimes come to act as conditioned reinforcers, Everitt and Robbins (2005) specify that "The effects of conditioned reinforcers, especially drug-related conditioned reinforcers, are pervasive and profound. For example, they support the learning of new drug-seeking responses, an effect that persists for at least two months without any further experience of self-administered cocaine and that is resistant to the extinction of the original CS-drug association. Drug-associated conditioned reinforcers also help maintain responding under second-order schedules of reinforcement" (Everitt and Robbins, 2005, p. 1483). Thus, it would appear that the action of drug-associated stimuli as conditioned reinforcers feeds on itself, leading to further increases in drug-seeking behavior. The fact that this effect persists even after extinction of the original CS-drug association is truly remarkable, and indeed sheds some light on some of the factors making drug addictions so difficult to overcome!
Everitt and Robbins (2005) do qualify these claims about the potency of drug-associated stimuli as secondary reinforcers, however, with the statement that "The CSs must be presented as conditioned reinforcers (that is, their presentation must depend on the animal's behavior); merely presenting them unexpectedly fails to increase drug seeking. This seems to contradict the 'incentive salience' model of drug-seeking behavior, which would predict enhancement from pavlovian, or unexpected presentations of the CS" (Everitt and Robbins, 2005, p. 1483). This latter point underscores the importance of contingency, as well as contiguity, in instrumental conditioning, and fits nicely with Staddon and Simmelhag's (1971) attempted replication (and resulting re-interpretation) of Skinner's famous "superstitious behavior" experiment. In their study, Staddon and Simmelhag (1971) used more systematic measures of pigeon behaviors than Skinner had utilized. Crucially, they also specified when each type of behavior occurred in relation to prior and subsequent deliveries of free food. They labeled responses that occurred just prior to the next delivery of food "terminal responses" (Staddon and Simmelhag, 1971, p. 4), and responses that occurred closer to the middle of the interval between deliveries of food "interim activities" (Staddon and Simmelhag, 1971, p. 4). Furthermore, they found the effects of accidental reinforcement (i.e., reinforcement occurring in the absence of the target response, as if "by accident") to be minimal; according to them, presentations of reinforcement in the absence of the target response merely strengthened the terminal responses, but had little other influence (Staddon and Simmelhag, 1971).

References

1). Caddy, G. R., & Lovibond, S. H. (1976). Self-regulation and discriminated aversive conditioning in the modification of alcoholic's drinking behavior. Behavior Therapy, 7(2), 223-230. doi: 10.1016/S0005-7894(76)80279-1

2). Cronin, P. B. (1980). Reinstatement of postresponse stimuli prior to reward in delayed-reward discrimination learning by pigeons. Animal Learning & Behavior, 8(3), 352-358. Retrieved from http://www.springerlink.com/content/c016725275680566/

3). Domjan, M. (2009). Learning and behavior (6th ed., pp. 59-60). Belmont, CA: Wadsworth, Cengage Learning.

4). Everitt, B. J., & Robbins, T. W. (2005). Neural systems of reinforcement for drug addiction: from actions to habits to compulsion. Nature Neuroscience, 8(11), 1481-1489. doi: 10.1038/nn1579

5). Kullmann, D., & Jennings, A. (2011, June 01). Dopamine and addiction. Retrieved from http://www.brainfacts.org/Diseases-Disorders/Addiction/Articles/2011/Dopamine-and-Addiction

6). Mapping the brain. (2012, April 01). Retrieved from http://www.brainfacts.org/Brain-Basics/Neuroanatomy/Articles/2012/Mapping-the-Brain

7). Mellgren, R. L. (1972). Positive and negative contrast effects using delayed reinforcement. Learning and Motivation, 3(2), 185-193. doi: 10.1016/0023-9690(72)90038-0

8). Staddon, J. E., & Simmelhag, V. L. (1971). The "superstition" experiment: A reexamination of its implications for the principles of adaptive behavior. Psychological Review, 78(1), 3-43. doi: 10.1037/h0030305

9). Speert, D. (2012, February 02). Neuroeconomics: Money and the brain. Retrieved from http://www.brainfacts.org/In-Society/In-Society/Articles/2012/Neuroeconomics-Money-and-the-Brain