Skinner studied, in detail, how animals changed their behaviour through reinforcement and punishment, and he developed terms that explained the processes of operant learning (Table 8). Skinner used the term reinforcer to refer to any event that strengthens or increases the likelihood of a behaviour, and the term punisher to refer to any event that weakens or decreases the likelihood of a behaviour.
He used the terms positive and negative to refer to whether a stimulus was presented or removed, respectively. Thus, positive reinforcement strengthens a response by presenting something pleasant after the response, and negative reinforcement strengthens a response by reducing or removing something unpleasant.
For example, giving a child praise for completing his homework represents positive reinforcement, whereas taking Aspirin to reduce the pain of a headache represents negative reinforcement. In both cases, the reinforcement makes it more likely that the behaviour will occur again in the future.
Reinforcement, either positive or negative, works by increasing the likelihood of a behaviour. Punishment, on the other hand, refers to any event that weakens or reduces the likelihood of a behaviour. Positive punishment weakens a response by presenting something unpleasant after the response, whereas negative punishment weakens a response by reducing or removing something pleasant.
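The four terms defined above form a two-by-two grid: whether something is presented or removed, and whether the behaviour is strengthened or weakened. The sketch below is only an illustration of that grid; the names `Consequence` and `classify` are hypothetical and do not come from Skinner's work.

```python
from enum import Enum


class Consequence(Enum):
    POSITIVE_REINFORCEMENT = "positive reinforcement"
    NEGATIVE_REINFORCEMENT = "negative reinforcement"
    POSITIVE_PUNISHMENT = "positive punishment"
    NEGATIVE_PUNISHMENT = "negative punishment"


def classify(stimulus_presented: bool, behaviour_strengthened: bool) -> Consequence:
    """Name a consequence from the two distinctions drawn in the text.

    stimulus_presented: True if something is presented, False if removed.
    behaviour_strengthened: True if the behaviour becomes more likely.
    """
    if behaviour_strengthened:
        return (Consequence.POSITIVE_REINFORCEMENT if stimulus_presented
                else Consequence.NEGATIVE_REINFORCEMENT)
    return (Consequence.POSITIVE_PUNISHMENT if stimulus_presented
            else Consequence.NEGATIVE_PUNISHMENT)


# Praise is presented after homework, and homework becomes more likely.
print(classify(True, True).value)
# Headache pain is removed by Aspirin, and taking Aspirin becomes more likely.
print(classify(False, True).value)
```

The grid deliberately ignores the ambiguous cases: a real consequence has to be judged by its observed effect on behaviour, which two booleans cannot capture.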
A child who is grounded after fighting with a sibling (positive punishment) or who loses out on the opportunity to go to recess after getting a poor grade (negative punishment) is less likely to repeat these behaviours. Although the distinction between reinforcement (which increases behaviour) and punishment (which decreases it) is usually clear, in some cases it is difficult to determine whether a reinforcer is positive or negative.
On a hot day, a cool breeze could be seen as a positive reinforcer (because it brings in cool air) or a negative reinforcer (because it removes hot air).
In other cases, reinforcement can be both positive and negative. One may smoke a cigarette both because it brings pleasure (positive reinforcement) and because it eliminates the craving for nicotine (negative reinforcement). It is also important to note that reinforcement and punishment are not simply opposites. The use of positive reinforcement in changing behaviour is almost always more effective than using punishment. This is because positive reinforcement makes the person or animal feel better, helping create a positive relationship with the person providing the reinforcement.
Types of positive reinforcement that are effective in everyday life include verbal praise or approval, the awarding of status or prestige, and direct financial payment. Punishment, on the other hand, is more likely to create only temporary changes in behaviour because it is based on coercion and typically creates a negative and adversarial relationship with the person providing the punishment.
When the person who provides the punishment leaves the situation, the unwanted behaviour is likely to return. Perhaps you remember watching a movie or being at a show in which an animal — maybe a dog, a horse, or a dolphin — did some pretty amazing things.
The trainer gave a command and the dolphin swam to the bottom of the pool, picked up a ring on its nose, jumped out of the water through a hoop in the air, dived again to the bottom of the pool, picked up another ring, and then took both of the rings to the trainer at the edge of the pool. The animal was trained to do the trick, and the principles of operant conditioning were used to train it.
But these complex behaviours are a far cry from the simple stimulus-response relationships that we have considered thus far. How can reinforcement be used to create complex behaviours such as these?
In order to avoid future punishment, an individual may change his or her behavior. For example, a positive punishment is a stimulus imposed on a person when they behave in a particular way. Over time, the person learns to avoid the positive punishment by altering their behavior.
Negative punishment is the removal of a benefit or privilege in response to undesirable behavior. A person wants to retain the benefits that they previously enjoyed, and avoids behavior which may lead to their rights being revoked. As with its classical counterpart, operant conditioning depends on the repetition of a stimulus in order to maintain the association between behavior and a reinforcement. Initial conditioning is repeated in order to create an association, and must then be periodically repeated so that the link between the two is not lost.
If, after initial conditioning, the reinforcement is removed, the association can fade, a process known as extinction. Extinction can result in the person or animal resuming their original behavior. Skinner was curious to find out what variables affected the effectiveness of operant conditioning. He conducted research into the effect of timing on conditioning with Charles B. Ferster, a fellow behavioral psychologist who worked at the Yerkes Laboratories of Primate Biology in Florida.
Ferster and Skinner found that schedules of reinforcement (the rate at which a reinforcement is repeated) can greatly influence operant conditioning. A number of types of schedules of reinforcement have been proposed by Skinner, Ferster and others, including continuous reinforcement and partial reinforcement. Under continuous reinforcement, a reward or punishment is provided every time an individual exhibits a particular mode of behavior. Through continuous reinforcement, the subject learns that the result of their actions will always be the same.
However, the dependability of continuous reinforcement can lead to it becoming too predictable. A subject may learn that a reward will always be provided for a type of behavior, and only carry out the desired action when they need the reward.
For instance, a rat may learn that pushing a lever will always lead to food being provided. Given the security that this schedule of reinforcement provides, the rat may decide to save energy by only pressing the lever when it is sufficiently hungry.
Instead of responding every time a person behaves in a particular way, partial reinforcement involves rewarding behavior only on some occasions. A subject must then work harder to receive a reinforcement and may take longer to learn using this type of operant conditioning.
Partial reinforcement can be used following a period of initial continuous reinforcement to prolong the effects of operant conditioning. For example, an animal trainer might give a treat to a dog every time it sits on command. Once the animal has learnt that a reward is provided for obeying the trainer, partial reinforcement may be used. The dog may receive a treat only every fifth time it obeys a command, but the conditioned behavior continues to be reinforced and extinction is avoided.
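The dog-training example can be written out as a small ratio-based schedule: a ratio of 1 rewards every response (continuous reinforcement), while a ratio of 5 rewards every fifth response (partial reinforcement). This is a minimal sketch; the function name `ratio_schedule` and the choice of ten trials are assumptions made for illustration.

```python
from typing import List


def ratio_schedule(n_responses: int, ratio: int) -> List[bool]:
    """Return, for each response in order, whether it earns a reward.

    ratio=1 models continuous reinforcement; ratio=5 rewards every
    fifth response, as in the dog-treat example.
    """
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    return [response % ratio == 0 for response in range(1, n_responses + 1)]


continuous = ratio_schedule(10, ratio=1)  # treat after every sit
partial = ratio_schedule(10, ratio=5)     # treat after every fifth sit
print(sum(continuous), sum(partial))      # number of treats under each schedule
```

Under the partial schedule the same ten responses cost the trainer far fewer treats, which is the trade-off the passage describes.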
Partial reinforcement modifies the ratio between the conditioned response and reinforcement, or the interval between reinforcements.
Although classical and operant conditioning share similarities in the way that they influence behavior and assist in the learning process, there are important differences between the two types of conditioning. During classical conditioning, a person learns by observation, associating two stimuli with each other. A neutral stimulus is presented in conjunction with another, unconditioned, stimulus. Through repetition, the person learns to associate the first, seemingly unrelated stimulus with the second.
In operant conditioning, by contrast, a person behaves in a particular manner and is subsequently rewarded or punished. They eventually learn to associate their original behavior with the reinforcement, and either increase, maintain or avoid their behavior in future in order to achieve the most desirable outcome. Operant conditioning explains why reinforcements can be used so effectively in the learning process, and how schedules of reinforcement can affect the outcome of conditioning. An advantage of operant conditioning is its ability to explain learning in real-life situations.
Praise following an achievement, for instance, acts as positive reinforcement. When a child misbehaves, punishments in the form of verbal discouragement or the removal of privileges are used to dissuade them from repeating their actions.
Through the first part of the 20th century, behaviorism became a major force within psychology. The ideas of John B. Watson dominated this school of thought early on.
Watson focused on the principles of classical conditioning, once famously suggesting that he could take any person regardless of their background and train them to be anything he chose. Early behaviorists focused their interests on associative learning.
Skinner was more interested in how the consequences of people's actions influenced their behavior. Skinner used the term operant to refer to any "active behavior that operates upon the environment to generate consequences." His theory was heavily influenced by the work of psychologist Edward Thorndike, who had proposed what he called the law of effect. Operant conditioning relies on a fairly simple premise: actions that are followed by reinforcement will be strengthened and more likely to occur again in the future.
If you tell a funny story in class and everybody laughs, you will probably be more likely to tell that story again in the future. If you raise your hand to ask a question and your teacher praises your polite behavior, you will be more likely to raise your hand the next time you have a question or comment.
Because the behavior was followed by reinforcement, or a desirable outcome, the preceding action is strengthened. Conversely, actions that result in punishment or undesirable consequences will be weakened and less likely to occur again in the future.
If you tell the same story again in another class but nobody laughs this time, you will be less likely to repeat the story again in the future. If you shout out an answer in class and your teacher scolds you, then you might be less likely to interrupt the class again. Skinner distinguished between two different types of behaviors.
While classical conditioning could account for respondent behaviors, Skinner realized that it could not account for a great deal of learning. Instead, Skinner suggested that operant conditioning held far greater importance. Skinner invented different devices during his boyhood and he put these skills to work during his studies on operant conditioning. He created a device known as an operant conditioning chamber, often referred to today as a Skinner box.
The chamber could hold a small animal, such as a rat or pigeon. The box also contained a bar or key that the animal could press in order to receive a reward.
In order to track responses, Skinner also developed a device known as a cumulative recorder. The device recorded responses as an upward movement of a line so that response rates could be read by looking at the slope of the line. There are several key concepts in operant conditioning. Reinforcement is any event that strengthens or increases the behavior it follows. There are two kinds of reinforcers, positive and negative.
Pavlovian and operant conditioning can differ in the behaviors they produce, their underlying learning processes, and the role of reinforcement in establishing conditioned behavior.
The scientific study of operant conditioning is thus an inquiry into perhaps the most fundamental form of decision-making. There is also phylogenetic selection, that is, selection during the evolution of the species. Behavior selected in this way emerges full-blown as the animal matures and may be relatively insensitive to immediate consequences.
Even humans who should know better! The selecting consequences that guide operant conditioning are of two kinds: behavior-enhancing reinforcers and behavior-suppressing punishers, the carrot and the stick, tools of parents, teachers (and rulers) since humanity began.
When the dog learns a trick for which he gets a treat, he is said to be positively reinforced. If a rat learns to avoid an electric shock by pressing a lever, he is negatively reinforced. There is often ambiguity about negative reinforcement, which is sometimes confused with punishment — which is what happens when the dog learns not to get on the couch if he is smacked for it.
In general, a consequence is called a reinforcer if it strengthens the behavior that led to it, and it is a punisher if it weakens that behavior. The scientific study of operant conditioning dates from the beginning of the twentieth century with the work of Edward L. Thorndike in the U.S. and C. Lloyd Morgan in the U.K. Thorndike summarized his findings in the Law of Effect:
"Of several responses made to the same situation, those which are accompanied or closely followed by satisfaction to the animal…will, other things being equal, be more firmly connected with the situation…; those which are accompanied or closely followed by discomfort…will have their connections with the situation weakened…The greater the satisfaction or discomfort, the greater the strengthening or weakening of the bond." (Thorndike)
Thorndike soon gave up work with animals and became an influential educator at Columbia Teachers College. But the Law of Effect, which is a compact statement of the principle of operant reinforcement, was taken up by what became the dominant movement in American psychology in the first half of the twentieth century: Behaviorism. The founder of behaviorism was John B. Watson at Johns Hopkins University. Clark Hull and his followers sought mathematical laws for learned behavior. Soon, B. F. Skinner, at Harvard, reacted against Hullian experimental methods (group designs and statistical analysis) and theoretical emphasis, proposing instead his radical a-theoretical behaviorism.
The best account of Skinner's method, approach and early findings can be found in a readable article -- "A case history in scientific method" -- that he contributed to an otherwise almost forgotten multi-volume project "Psychology: A Study of a Science" organized on positivist principles by editor Sigmund Koch.
A third major behaviorist figure, Edward Chace Tolman, on the West coast, was close to what would now be called a cognitive psychologist and stood rather above the fray.
Skinner opposed Hullian theory and devised experimental methods that allowed learning animals to be treated much like physiological preparations. It was nevertheless valuable because it introduced an important distinction between reflexive behavior, which Skinner termed elicited by a stimulus, and operant behavior, which he called emitted because, when it first occurs, it is not elicited by any identifiable stimulus. Skinner and several others noted the connection with Darwinian selection, which has become the dominant view of operant conditioning.
Reinforcement is the selective agent, acting via temporal contiguity (the sooner the reinforcer follows the response, the greater its effect), frequency (the more often these pairings occur the better) and contingency (how well does the target response predict the reinforcer). It is also true that some reinforcers are innately more effective with some responses: flight is more easily conditioned as an escape response in pigeons than pecking, for example.
Contingency is easiest to describe by example. Suppose we reinforce with a food pellet every 5th occurrence of some arbitrary response such as lever pressing by a hungry lab rat. The rat presses at a certain rate, say 10 presses per minute, on average getting a food pellet twice a minute. Suppose we now also deliver free pellets at other times, independently of lever pressing. Will he press more, or less? The answer is less. Lever pressing is less predictive of food than it was before, because food sometimes occurs at other times.
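The arithmetic in this example, and the way food arriving at other times makes lever pressing less predictive, can be sketched in a few lines. The share of pellets that are earned by pressing is used here only as a crude stand-in for contingency; it is an assumption of this sketch, not a measure taken from the text, and the free-pellet rate of 2 per minute is likewise hypothetical.

```python
def pellets_per_minute(presses_per_minute: float, ratio: int) -> float:
    """Pellets earned per minute when every `ratio`-th press is reinforced."""
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    return presses_per_minute / ratio


def earned_share(earned_rate: float, free_rate: float) -> float:
    """Fraction of all pellets that follow a lever press (illustrative index)."""
    total = earned_rate + free_rate
    if total <= 0:
        raise ValueError("no pellets are being delivered")
    return earned_rate / total


earned = pellets_per_minute(10, ratio=5)       # 10 presses/min, every 5th reinforced
print(earned)                                  # pellets earned per minute
print(earned_share(earned, free_rate=0.0))     # all food depends on pressing
print(earned_share(earned, free_rate=2.0))     # hypothetical: 2 free pellets/min
```

As the free rate grows, the earned share falls toward zero, which matches the qualitative point that pressing becomes a weaker predictor of food.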
Exactly how all this works is still not understood in full theoretical detail, but the empirical space, that is, the effects on response strength (rate, probability, vigor) of reinforcement delay, rate and contingency, is well mapped.
What happens during operant conditioning? The experimenter intervened no further, allowing the animal to do what it would until, by chance, it made the correct response. The result was that, according to what has sometimes been called the principle of postremity, the tendency to perform the act closest in time to the reinforcement — opening of the door — is increased.
Notice that this account emphasizes the selective aspect of operant conditioning, the way the effective activity, which occurs at first 'by chance,' is strengthened or selected until, within a few trials, it becomes dominant. The nature of how learning is shaped and influenced by consequences has also remained at the focus of current research. Omitted is any discussion of where the successful response comes from in the first place.
It is something of a historical curiosity that almost all operant-conditioning research has been focused on the strengthening effect of reinforcement and almost none on the question of origins, where the behavior comes from in the first place, the problem of behavioral variation, to pursue the Darwinian analogy.