Rewarding an agent when they behave a certain way makes them more likely to repeat this behavior. Partially randomizing the rewards makes them even more likely to repeat it. The uncertainty of the reward schedule makes it more attractive and is even the basis of addiction.

This is commonly called “intermittent variable rewards” in tech; its scientific name in psychology is intermittent reinforcement on a variable-ratio schedule.
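To make the term concrete, here is a minimal Python sketch contrasting a fixed-ratio schedule with a variable-ratio one. It assumes the simplest form of variable ratio, where each action is rewarded independently with probability 1/mean_ratio (sometimes called a random-ratio schedule). The function names and numbers are illustrative, not taken from any product.

```python
import random

def fixed_ratio(n_actions, ratio):
    """Reward every `ratio`-th action: the next reward is fully predictable."""
    return [(i + 1) % ratio == 0 for i in range(n_actions)]

def variable_ratio(n_actions, mean_ratio, rng):
    """Reward each action with probability 1/mean_ratio (assumed model).

    On average one action in `mean_ratio` is rewarded, but the agent
    cannot tell which one it will be.
    """
    p = 1.0 / mean_ratio
    return [rng.random() < p for _ in range(n_actions)]

def gaps(rewards):
    """Number of actions needed to reach each successive reward."""
    out, count = [], 0
    for rewarded in rewards:
        count += 1
        if rewarded:
            out.append(count)
            count = 0
    return out

rng = random.Random(0)  # seeded so the run is repeatable
fr = fixed_ratio(1000, 5)
vr = variable_ratio(1000, 5, rng)

print("fixed ratio gap lengths:   ", sorted(set(gaps(fr))))
print("variable ratio gap lengths:", sorted(set(gaps(vr))))
```

Both schedules hand out roughly the same number of rewards. Under the fixed schedule every gap has the same length, while under the variable one the gaps differ, so each action carries a chance of being the winning one.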

I prefer the term “partially randomized rewards” because people outside psychology, design, and related fields usually understand it much better than the other terms.

The classic example of this concept is the slot machine. But the same mechanism is at work in email notifications, mobile device notifications, etc. Each notification is a chance to either win (an interesting email, message, etc.) or lose (a notification of no interest). The same mechanism is also at work behind YouTube autoplay, personalized recommendations, or scrolling down your Twitter stream: each new element is an opportunity to win or lose, and you usually reach a win before giving up.

Positive or Negative Reinforcer?

You could classify this Reinforcer as either a Positive or a Negative one. Perhaps the best way to describe it would be as an Addictive Reinforcer. At its core, it provides the agent with a burst of pleasure each time there is an occasion to win or lose. This is what makes it so addictive. Whether it is Positive or Negative (see how I define the difference here) then depends on the context.

If the repeated behavior occurs during limited and pre-allocated moments (“I’m going to waste time on YouTube for a few mins”), then it could be classified as Positive. But if it spills over into other moments and impacts an agent’s lifestyle (“Wow, I just spent 3 hours up to 1am watching cat videos”), then it’s definitely a Negative Reinforcer, i.e. it is the cost of stopping the behavior (no more pleasure, no more dopamine) that prevents the agent from stopping the repetitions.

How to use partially randomized rewards as a positive reinforcer?

Try to find opportunities in your customer journey to provide partially randomized rewards to your users. You can engineer them as part of product design, through notifications for example. Or you can piggyback on rewards that others provide and surface them through your own product, as Uber is now doing.

Things to watch for: designing them well is a difficult craft. Too low a win ratio will turn users off fast. Too high a win ratio will negate the randomized aspect and its effects. Testing ratios should be seen as a necessary step.
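As a rough illustration of this trade-off, the Python sketch below computes two simple quantities for a few candidate win ratios: the average number of tries needed per win (1/p), and the uncertainty of each outcome measured as Bernoulli entropy in bits. This assumes each event is an independent win with probability p. It is not a model of user behavior, and the candidate ratios are arbitrary; real tuning would still have to come from testing with users.

```python
import math

def expected_tries_per_win(p):
    """Average number of events until a win when each wins with probability p."""
    return 1.0 / p

def outcome_entropy_bits(p):
    """Uncertainty of a single win/lose outcome, in bits (0 = fully predictable)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# Hypothetical candidate win ratios, chosen only for illustration.
for p in (0.02, 0.1, 0.3, 0.5, 0.9, 0.99):
    print(
        f"win ratio {p:.2f}: "
        f"~{expected_tries_per_win(p):.0f} tries per win, "
        f"{outcome_entropy_bits(p):.2f} bits of uncertainty per event"
    )
```

Very low ratios mean many losing tries before each win. Very high ratios drive the uncertainty toward zero, at which point the reward is effectively no longer randomized.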

Another consideration is ethics: this reinforcer is one of the most addictive ones for users and can lead to serious lifestyle impacts. Use it responsibly and consider introducing some safeguards against addictive behavior.

