Theory of Operant Conditioning

Theory of Instrumental or Operant Conditioning

The operant conditioning theory of learning was formulated by B.F. Skinner, an American psychologist. His theory arose from the gaps he found in classical conditioning theory. Skinner believed that classical conditioning explained only how behaviour that has already been acquired can occur in the presence of a new stimulus.

Skinner, however, believed that most learning consists of acquiring new behaviour. He held that behaviour is shaped by the consequence that follows an action. A learner is likely to repeat an action or a particular behaviour if it is followed by a pleasant consequence (positive reinforcement).

Skinner described two types of responses in his theory. The first is elicited by a specific stimulus (reflex response). For example, stepping on a sharp object or touching hot metal will produce a reflex response. The second type is the response that an individual emits by his/her own decision.

This second type of response is called operant behaviour. It is so named because the behaviour operates upon the environment to generate consequences. Operant behaviour is voluntary: it is emitted by the individual rather than elicited by a stimulus. In operant conditioning, behavioural responses become connected to environmental stimuli largely as a result of what happens after the response occurs.

To establish his claims, Skinner performed many experiments with pigeons and white rats in the laboratory. He constructed a box (the Skinner box) with a small lever inside it. The lever released food to the animal whenever it was pressed. In one of the experiments, a hungry rat was placed in the box, and whenever the rat pressed the lever, food dropped for it. The lever was mechanically connected to a device that automatically recorded every press the rat made.

Skinner’s box and Operant Conditioning

In the box, the rat moved around tirelessly, and each time it pressed the lever, food fell for it. The rat became persistent in pressing the lever so that the food would fall. The food reinforced the action, and lever pressing became a conditioned response for the rat.

In contrast, if pressing the lever is no longer followed by food, the number of presses falls gradually to its lowest point. In this theory, it is the result or consequence of a behaviour that makes that behaviour more likely to be repeated or learned. If the result of a behaviour is gratifying, one is likely to respond the same way the next time one encounters that stimulus. In the above experiment, pressing the lever is instrumental in obtaining the food, hence the name instrumental conditioning.
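The rise and fall in lever pressing described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Skinner's own model: the function name `simulate_lever_pressing`, the starting probability, the learning rate and the update rule (a reinforced press moves the press probability towards 1, an unreinforced press moves it towards 0) are all hypothetical choices made for illustration.

```python
import random


def simulate_lever_pressing(trials, food_available, start_prob=0.1,
                            learning_rate=0.2, seed=0):
    """Return the probability of a lever press after each trial.

    Toy model (an assumption, not taken from the article):
    - a press followed by food makes pressing more likely;
    - a press followed by nothing makes pressing less likely;
    - a trial without a press leaves the probability unchanged.
    """
    rng = random.Random(seed)
    prob = start_prob
    history = []
    for _ in range(trials):
        pressed = rng.random() < prob
        if pressed and food_available:
            # Reinforced press: move the probability towards 1.
            prob += learning_rate * (1.0 - prob)
        elif pressed:
            # Unreinforced press: move the probability towards 0.
            prob -= learning_rate * prob
        history.append(prob)
    return history


# Phase 1: every press is followed by food.
acquisition = simulate_lever_pressing(200, food_available=True)

# Phase 2: food is withheld, starting from the learned probability.
decline = simulate_lever_pressing(200, food_available=False,
                                  start_prob=acquisition[-1])

print("after reinforcement:", round(acquisition[-1], 3))
print("after food is withheld:", round(decline[-1], 3))
```

In the first phase the press probability can only rise, and in the second it can only fall, which mirrors the pattern reported for the rat in the box.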

Skinner in this theory identified two types of reinforcers: positive and negative. For example, a reward given to a student who promptly does his/her homework is a positive reinforcer. By this action, it is likely that such a student will want to continue doing his/her assignments promptly. However, a student who is punished for misbehaving in the classroom is less likely to repeat the action that brought the unpleasant consequence.


TYPES OF OPERANT CONDITIONING PROCEDURES

Five procedures are defined by the presentation or removal of a reinforcement or punishment. In this context, the term positive is used to imply the addition of a stimulus, and negative to denote its subtraction. The procedures are:

1. Positive reinforcement (Reinforcement)

This occurs when a behaviour (response) of the subject is followed by a rewarding stimulus, which increases the frequency of that behaviour. Positive reinforcement usually takes the form of a favourable event presented to the subject after it displays a desirable behaviour.

In an experiment involving a rat, for instance, a stimulus such as food or a sugar solution could be delivered when the rat engages in a target behaviour, such as pressing a lever. Besides food, other forms of positive reinforcement include praise, rewards, smiles and so on. Positive reinforcement essentially aims to increase the likelihood of certain behaviours.

2. Negative reinforcement

A negative reinforcer is any stimulus whose removal or withdrawal increases the likelihood of a particular behaviour. Electric shock, loud noise and so on are said to be negative reinforcers (Shah, 2009). Negative reinforcement occurs when a behaviour is followed by the removal of an aversive stimulus, thereby increasing that behaviour’s frequency. This kind of negative reinforcement is called escape. A similar procedure, called avoidance, occurs when the desired behaviour allows the organism to avoid the aversive stimulus altogether, for example by preventing an electric shock before it is delivered.

Negative reinforcement is typically characterized by the removal of an undesired or unpleasant outcome after the desired behaviour. A response is strengthened as something considered negative is removed. In the Skinner box experiment, negative reinforcement can take the form of a loud noise sounding continuously inside the rat’s cage until it engages in the target behaviour, such as pressing a lever, upon which the loud noise is removed.

As noted above, reinforcement is a central concept in Behaviourism and is seen as a key mechanism in the shaping and control of behaviour. A common and pervasive misconception, however, is that negative reinforcement is synonymous with punishment.

To be clear, while positive reinforcement is the strengthening of behaviour by the application of some event (e.g., praise after some behaviour is performed), negative reinforcement is the strengthening of behaviour by the removal or avoidance of some aversive event (e.g., opening and raising an umbrella over your head on a rainy day is reinforced by the cessation of rain falling on you).

The key aspect to note in reinforcement is that both types of reinforcement strengthen behaviour, or increase the probability of a behaviour reoccurring; the difference is in whether the reinforcing event is something applied (positive reinforcement) or something removed or avoided (negative reinforcement).

Schedules of reinforcement


Part of Skinner’s analysis of behaviour involved not only the power of a single instance of reinforcement, but also the effects of particular schedules of reinforcement over time. Munsaka (2011:11) and Karen (1975) both identify two main categories of schedules of reinforcement: continuous and intermittent. These are described below.

Continuous reinforcement refers to the constant delivery of reinforcement for an action: every time the specific action is performed, the subject immediately receives reinforcement. This method is often impractical to sustain, and the reinforced behaviour is prone to extinction once reinforcement stops.

Under intermittent reinforcement, we have interval and ratio schedules.

(i) Interval Schedules: based on the time intervals between reinforcements. Interval schedules are further subdivided into:

Fixed Interval Schedule (FI): An operant conditioning principle in which reinforcement is presented at fixed time intervals, provided that the appropriate response is made.
Variable Interval Schedule (VI): An operant conditioning principle in which behaviour is reinforced after varying periods of time, set around an average, have elapsed since the last reinforcement.

Both FI and VI tend to produce slow, methodical responding because the reinforcements follow a time scale that is independent of how many responses occur.

(ii) Ratio Schedules: based on the ratio of responses to reinforcements

Fixed Ratio Schedule (FR): An operant conditioning principle in which reinforcement is delivered after a specific number of responses have been made.
Variable Ratio Schedule (VR): An operant conditioning principle in which the delivery of reinforcement is based on a particular average number of responses (e.g., slot machines).


VR schedules produce slightly higher rates of responding than FR schedules because the organism cannot predict when the next reinforcement will come. The higher the ratio, the higher the response rate tends to be.
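The four intermittent schedules can be written down as simple decision rules. The following is a minimal sketch, assuming each schedule is an object that is told when a response occurs and answers whether that response is reinforced. The class names, the `respond` method and the way the variable requirements are drawn (uniformly around the stated average) are illustrative assumptions rather than part of the theory.

```python
import random


class FixedRatio:
    """FR: reinforce every n-th response."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, now):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """VR: reinforce after a varying number of responses averaging `mean`."""

    def __init__(self, mean, rng):
        self.mean = mean
        self.rng = rng
        self.count = 0
        self.required = self._draw()

    def _draw(self):
        # Whole numbers from 1 to 2*mean - 1 average out to `mean`.
        return self.rng.randint(1, 2 * self.mean - 1)

    def respond(self, now):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False


class FixedInterval:
    """FI: reinforce the first response once `period` time units have passed."""

    def __init__(self, period):
        self.period = period
        self.last = 0.0

    def respond(self, now):
        if now - self.last >= self.period:
            self.last = now
            return True
        return False


class VariableInterval:
    """VI: like FI, but the waiting time varies around `mean`."""

    def __init__(self, mean, rng):
        self.mean = mean
        self.rng = rng
        self.last = 0.0
        self.wait = self._draw()

    def _draw(self):
        return self.rng.uniform(0, 2 * self.mean)

    def respond(self, now):
        if now - self.last >= self.wait:
            self.last = now
            self.wait = self._draw()
            return True
        return False


# Same 30 time units, slow versus fast responding on FI with period 10.
slow = FixedInterval(10)
fast = FixedInterval(10)
print(sum(slow.respond(t) for t in range(30)))      # one response per unit
print(sum(fast.respond(t / 2) for t in range(60)))  # two responses per unit
```

Responding twice as fast on the fixed interval schedule earns no extra reinforcement, whereas on a ratio schedule every additional response counts towards the next reinforcement. This is one way to see why interval schedules are associated with slower responding than ratio schedules.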

3. Positive punishment (Punishment)

Simply put, punishment is the opposite of reinforcement. Weber (1991:72) noted that, “punishment is any operation that decreases the rate of response. For example, when a rat presses the lever, shock is presented.” If the shock leads to a decrease in lever pressing, punishment has occurred.

Positive punishment is sometimes referred to as punishment by application. When shock or another unpleasant stimulus is applied to decrease a behaviour, the consequence is referred to as positive punishment. This involves the presentation of an unfavourable event in order to weaken the response it follows. Positive punishment is sometimes a confusing term, because “positive” here denotes the “addition” of an aversive stimulus, or an increase in its intensity (such as spanking or an electric shock).

4. Negative punishment (Penalty)

This kind of punishment involves the removal of a desirable stimulus, such as taking away a child’s toy or withdrawing a privilege following an undesired behaviour. This results in a reduction of the unwanted behaviour. The procedure is considered negative because something is removed or taken away from the child (in the case of a human subject).

5. Extinction

Extinction occurs when a behaviour (response) that had previously been reinforced no longer produces reinforcement. For example, a rat is first given food many times for lever presses. Then, in “extinction”, no food is given. Typically, the rat continues to press more and more slowly and eventually stops, at which time lever pressing is said to be “extinguished.”

Extinction may mean the loss of an acquired response or the failure to make a learned response. Usually, extinction is brought about when, following a period of reinforcement for responding, reinforcement is no longer offered. We can thus conclude that the goal of extinction is to curtail or completely eliminate a certain response by withholding the reinforcing stimulus. In due course, this eliminates the behaviour.

As noted from the above descriptions on punishment and extinction, these two have the effect of weakening behaviour, or decreasing the future probability of a behaviour’s occurrence, by the application of an aversive stimulus/event (positive punishment or punishment by contingent stimulation), removal of a desirable stimulus (negative punishment or punishment by contingent withdrawal), or the absence of a rewarding stimulus, which causes the behaviour to stop (extinction).

The aim of punishment and extinction is to reduce the likelihood of a particular behaviour recurring.
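The five procedures described in this section differ along two lines: what happens to the stimulus, and whether that stimulus is pleasant or aversive. The lookup table below is a small sketch of that summary. The labels "added", "removed", "withheld", "pleasant" and "aversive", and the function name `classify_procedure`, are hypothetical names chosen for illustration.

```python
# (what happens to the stimulus, kind of stimulus) -> (procedure, effect)
PROCEDURES = {
    ("added", "pleasant"): ("positive reinforcement", "increases"),
    ("removed", "aversive"): ("negative reinforcement", "increases"),
    ("added", "aversive"): ("positive punishment", "decreases"),
    ("removed", "pleasant"): ("negative punishment", "decreases"),
    # A reward that used to follow the behaviour is no longer delivered.
    ("withheld", "pleasant"): ("extinction", "decreases"),
}


def classify_procedure(stimulus_change, stimulus_kind):
    """Return (procedure name, effect on the behaviour's frequency)."""
    try:
        return PROCEDURES[(stimulus_change, stimulus_kind)]
    except KeyError:
        raise ValueError(
            f"no procedure for {stimulus_change!r} + {stimulus_kind!r}"
        ) from None


# Loud noise stops when the rat presses the lever.
print(classify_procedure("removed", "aversive"))
# A child's toy is taken away after an undesired behaviour.
print(classify_procedure("removed", "pleasant"))
```

Laid out this way, the difference between negative reinforcement and punishment is easy to see: both reinforcement rows increase behaviour, while punishment and extinction decrease it.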

Classroom Implications of Instrumental/Operant Conditioning Theory

The teacher should know that the environment or the conditions in which the students learn are very significant to the learning outcomes. Hence, the teacher should provide a conducive learning environment and conditions for his/her students.

Reinforcement is an essential factor if the students are to perform well in a given task. To this end, the teacher should not neglect the use of motivation that can adequately propel the students into action.
If a student engages in disruptive behaviour, the teacher should not reinforce such behaviour; rather, he/she should endeavour to tell the student the dire consequences of that action.
When there is interference in the transfer of experiences by the learners, the teacher may use explanations and reinforcement to strengthen the desired facts and weaken the undesired ones.

Conclusion

Skinner’s theory of Instrumental/Operant Conditioning shows that behaviour is shaped by the consequence that follows an action. Skinner divided responses into two types: involuntary (reflex) behaviour and operant responses, i.e. behaviour that the learner emits in interacting with his/her environment. The relevance of reinforcement/motivation and punishment to students’ learning was also discussed in this article.