We get in the car, make sure the seat belt is on, start the engine, and drive away. If we didn’t buckle up, that annoying warning sound would keep getting louder, making the drive unpleasant. It’s a beautiful sunny day, so we open the windows to enjoy the spring warmth, and the air in our eyes makes us blink.
Why does that make us more like our animals than we might believe? Because the rules governing learning are the same for animals and humans alike. In the first case, we learn to wear a seat belt through what is called operant conditioning; in the second case, blinking is a reflex governed by the laws of classical conditioning. We learn to wear a seat belt to stop an unpleasant stimulus (the alarm), which means that we learn from the consequences of our behavior. Specifically, this is what we call negative reinforcement. In the second case, we can’t do anything, because we can’t control reflexes: they are something that comes from our genes. But let’s see in detail what this is all about.
Classical conditioning and operant conditioning
Respondent conditioning is better known as classical or Pavlovian conditioning, named after the Russian scientist who first described the phenomenon in 1927: Ivan Pavlov. During his experiments, he discovered that dogs began to salivate not only when they saw food or took food in their mouths, but also before either happened. This phenomenon was already known as psychic salivation, but its importance for behavior had not yet been understood. The animals began to respond in an anticipatory manner: stimuli other than meat could cause salivation, for example the attendant’s clothing, the sound of a metronome, or the sound of the bowl.
In respondent conditioning, a neutral stimulus can become a conditioned stimulus if it is paired a certain number of times with an unconditioned stimulus. That is, by pairing the sound of the metronome (neutral stimulus) with meat (unconditioned stimulus) enough times, the metronome starts to elicit the salivation that meat alone originally produced (the unconditioned response). The metronome thus becomes a conditioned stimulus, and the salivation it elicits is the conditioned response.
Therefore, organisms can associate stimuli that occur together in the same period, and by association, one or more stimuli can be replaced by another stimulus.
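To make the idea of "enough pairings" more concrete, here is a minimal sketch in Python. It borrows the Rescorla-Wagner update rule, a common textbook formalization that this article does not itself rely on, and the function name, learning rate, and maximum strength are illustrative assumptions, not measured values.

```python
def pair_stimuli(n_pairings, learning_rate=0.3, max_strength=1.0):
    """Return the associative strength after each metronome + meat pairing.

    Uses a Rescorla-Wagner style update: each pairing closes a fixed
    fraction (learning_rate) of the gap between the current strength
    and the maximum the unconditioned stimulus can support.
    All parameter values are hypothetical.
    """
    strength = 0.0  # the metronome starts as a neutral stimulus
    history = []
    for _ in range(n_pairings):
        strength += learning_rate * (max_strength - strength)
        history.append(strength)
    return history


history = pair_stimuli(10)
for trial, value in enumerate(history, start=1):
    print(f"pairing {trial:2d}: associative strength = {value:.3f}")
```

Under these assumptions, the association grows quickly over the first pairings and then levels off, which is one way to picture why a certain number of repetitions is needed before the metronome alone elicits salivation.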
Thanks to respondent (classical) conditioning, it is possible to modify the stimulus for most reflexes, while it is practically impossible to change the response. An example of this type of conditioning comes from the first phase of clicker training, when you want to introduce the meaning of the clicker: the sound of the clicker is repeatedly paired with the appearance of food so that the click itself becomes reinforcing. Keep in mind, however, that the click will never be equivalent to the food.
Many trainers try to establish precisely whether a given case is respondent or operant conditioning, but a clear distinction between the two is not possible, because there is always a respondent basis in operant behavior.
Operant conditioning
Not all behavior is elicited by a stimulus, as in respondent conditioning; often, behavior is motivated by its consequences, which may increase or decrease the frequency with which it is emitted. In this second case, we speak of operant behavior because the response operates on the environment to produce consequences.
This type of conditioning is called instrumental conditioning because the response is instrumental in producing consequences. It is also known as operant conditioning.
The scientists who described this type of learning were Thorndike and Skinner. In this case too, we do not speak of a “theory” but of a description of the natural phenomenon of how animals learn.
Similarities between classical and operant conditioning
Animals learn to interact with the environment, and operant conditioning is a powerful tool for behavior modification.
It is essential to stress that the goal is not merely to teach the dog a sufficient set of skills but to teach each behavior thoroughly: using operant conditioning superficially is as easy as it is risky.
Respondent learning is composed, as we have seen, of two elements: stimulus (antecedent) and response (behavior). It is the stimulus that guides the behavior. That is, the antecedent is the cause of the behavior. In operant learning, there are instead three elements: an antecedent, an action, and a consequence.
The consequence is the element that most influences and guides the behavior. As we have said, respondent and operant behavior are not dichotomous categories; instead, they are arranged along a continuum: all operant behaviors have a respondent component. Simplifying the concept as much as possible, we can say that voluntary action is modified through operant conditioning, while reflexes follow the laws of respondent conditioning.
Operant conditioning is based on simple principles validated by experimental data. According to the Brelands, the principles on which operant conditioning is based are:
- Stimulation: animals respond to stimuli. Responses can be both instinctive and learned.
- Reinforcement: it increases the frequency of the behavior. Reinforcement is a procedure in which a behavior is followed by a consequence that increases or maintains the frequency of that behavior. Reinforcement is the result of biological evolution. Be careful: it is worse to reinforce unwanted behavior than to fail to reinforce desired behavior.
- Extinction: it decreases the frequency of the behavior.
- Punishment: it reduces the frequency of the behavior.
- Stimulus generalization and response generalization: the tendency of a learned behavior to spread to similar stimuli and to similar responses.
What are the consequences of behavior?
Reinforcement is an environmental change that follows a response and increases or maintains the future frequency of that behavior.
Positive reinforcement is an environmental change in which a stimulus is added following a response that increases or maintains the future frequency of that response. For example, I ask my dog to sit (antecedent); he sits (behavior). I give him a treat (consequence – positive reinforcement).
Negative reinforcement is an environmental change in which a stimulus is removed following a response, and which increases or maintains the future frequency of that behavior. For example, a mosquito bites me (antecedent), I scratch myself (behavior), and the itch subsides (consequence – negative reinforcement).
Punishment is an environmental change that follows a response and decreases the future frequency of that behavior.
Positive punishment is an environmental change in which a stimulus is added following a response, and which decreases the future frequency of that behavior. For example, the traffic light is red (antecedent), you don’t stop your car (behavior), and the police stop you and give you a ticket (consequence – positive punishment).
Negative punishment is an environmental change in which an (appetitive) stimulus is removed following a response, and which decreases the future frequency of that behavior. For example, you have 20 points on your driver’s license (appetitive antecedent), you drive too fast (behavior), and you lose 10 points (consequence – negative punishment, also called response cost).
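The four consequences described above differ along just two questions: is a stimulus added or removed, and does the behavior become more or less frequent afterwards? As a rough sketch (the class and function names are invented for illustration), the classification can be written in Python like this:

```python
from enum import Enum


class Consequence(Enum):
    POSITIVE_REINFORCEMENT = "positive reinforcement"
    NEGATIVE_REINFORCEMENT = "negative reinforcement"
    POSITIVE_PUNISHMENT = "positive punishment"
    NEGATIVE_PUNISHMENT = "negative punishment"


def classify_consequence(stimulus_added: bool, frequency_maintained_or_increased: bool) -> Consequence:
    """Classify a consequence from its two defining features.

    stimulus_added: True if a stimulus is added, False if one is removed.
    frequency_maintained_or_increased: True if the behavior's future
    frequency is increased or maintained, False if it decreases.
    """
    if frequency_maintained_or_increased:
        if stimulus_added:
            return Consequence.POSITIVE_REINFORCEMENT
        return Consequence.NEGATIVE_REINFORCEMENT
    if stimulus_added:
        return Consequence.POSITIVE_PUNISHMENT
    return Consequence.NEGATIVE_PUNISHMENT


# The article's own examples, encoded as (stimulus_added, frequency up?)
examples = {
    "treat after sitting": (True, True),
    "itch subsides after scratching": (False, True),
    "ticket after running a red light": (True, False),
    "license points lost after speeding": (False, False),
}
for description, (added, increased) in examples.items():
    print(f"{description}: {classify_consequence(added, increased).value}")
```

Note that "positive" and "negative" here only mean added and removed; whether something counts as reinforcement or punishment is decided by what happens to the behavior afterwards.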
We will not go into detail on the reinforcement schedules; here, it is enough to remember three key concepts in the planning of training:
- Timing: When
- Criteria: What
- Frequency: How often
It is necessary to know precisely when to reinforce; in general, it is essential to reinforce only the desired behavior and to set only one criterion at a time. Remember that a high frequency of reinforcement reduces distractibility. For professional training for your animal, see the animal training platform Tromplo.com.
It is essential to set criteria during training planning to decide precisely what to reinforce and what not to reinforce. If the criterion is set too high, the reinforcement frequency will be too low; if the criterion is too low, you will end up simply feeding the animal. It is crucial to establish a reinforcement frequency that is beneficial for the animal.
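The link between criterion and reinforcement frequency can be illustrated with a small Python sketch. The sit durations below are made-up numbers, and the function name is hypothetical; the point is only to show how raising the criterion lowers the share of trials that earn reinforcement.

```python
def reinforcement_frequency(trial_scores, criterion):
    """Return the fraction of trials that meet the criterion.

    A trial is reinforced when its score is at least the criterion.
    """
    if not trial_scores:
        raise ValueError("at least one trial is required")
    reinforced = sum(1 for score in trial_scores if score >= criterion)
    return reinforced / len(trial_scores)


# Hypothetical data: how long (in seconds) the dog held a sit on ten trials
sit_durations = [1, 2, 2, 3, 3, 4, 5, 5, 6, 8]

for criterion in (1, 3, 8):
    rate = reinforcement_frequency(sit_durations, criterion)
    print(f"criterion {criterion} s: {rate:.0%} of trials reinforced")
```

With these invented numbers, a criterion of 1 second reinforces every trial (the animal is simply being fed), a criterion of 8 seconds reinforces only one trial in ten, and a criterion of 3 seconds sits in between.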