American Society of Naturalists

A membership society whose goal is to advance and to diffuse knowledge of organic evolution and other broad biological principles so as to enhance the conceptual unification of the biological sciences.

“Reinforcement learning theory reveals the cognitive requirements for solving the cleaner fish market task”


Andrés E. Quiñones, Arnon Lotem, Olof Leimar, and Redouan Bshary (Apr 2020)

Reinforcement learning theory unveils the cognitive mechanisms cleaner fish use to deal with their social environment


What cleaner fish need to make better economic decisions

The cleaner fish Labroides dimidiatus fulfils the important task of ridding other coral reef fish of parasites. Demand for cleaning services can be so high that cleaners often have to choose between several potential clients seeking service at the same time. Experimental work carried out at the Lizard Island Research Station, on the Great Barrier Reef, shows that some cleaners can be strategic when making these decisions. They more often prioritize clients that, because of their large range sizes, can switch to a different cleaner if not served immediately. Cleaners using this strategy get access to both impatient and patient clients, and hence more food. This strategic decision-making is, however, not innate. Juveniles, and some adults, do not prefer clients with more leverage in the economic transaction. Why, then, do some cleaners fail to learn the more profitable preference? Using a computational model based on the building blocks of machine learning algorithms, Quiñones and collaborators show that cleaners need two important adaptations in the learning process. First, they need to account for the future effects of their choices when making decisions, because the extra food arrives well after the decision of whom to clean first. Second, they must be able to develop different preferences for a client type depending on which other client is available. A particular client type should get priority only when it appears alongside another type that has less access to alternative cleaners. This boils down to making decisions according to the context in which they are made. Interestingly, researchers in the past considered these two cognitive adaptations to be exclusively human, as they are involved in human cognitive processes such as language acquisition. This raises the question of whether these seemingly similar processes are implemented in comparable ways.
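The payoff logic behind this preference can be made concrete with a little arithmetic. The Python sketch below is an illustration only, not the authors' model. The labels "visitor" (a client that can leave) and "resident" (a client that waits), the function name expected_food, and the parameters leave_prob and food_per_client are assumptions introduced here.

```python
def expected_food(first_choice, leave_prob=1.0, food_per_client=1.0):
    """Expected food from one encounter with a visitor and a resident together.

    All names and default values are illustrative assumptions.
    """
    if first_choice == "visitor":
        # The resident waits its turn, so both clients end up being cleaned.
        return 2 * food_per_client
    if first_choice == "resident":
        # The ignored visitor leaves with probability leave_prob.
        return food_per_client + (1.0 - leave_prob) * food_per_client
    raise ValueError("first_choice must be 'visitor' or 'resident'")


visitor_first = expected_food("visitor")
resident_first = expected_food("resident", leave_prob=0.5)
```

Under these assumptions, serving the visitor first never yields less food than serving the resident first. The second meal, however, arrives only after the choice has been made, which is why a learner that ignores delayed consequences has nothing to distinguish the two options.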


Learning is an adaptation that allows individuals to respond to environmental stimuli in ways that improve their reproductive outcomes. The degree of sophistication in learning mechanisms potentially explains variation in behavioral responses. Here, we present a model of learning that is inspired by documented intra- and interspecific variation in performance in a simultaneous two-choice task, the ‘biological market task’. The task presents a problem that cleaner fish often face in nature: choosing between two client types, one that is willing to wait for inspection and one that may leave if ignored. The cleaners’ choice hence influences the future availability of clients, i.e., it influences food availability. We show that learning the preference that maximizes food intake requires subjects to represent in their memory different combinations of pairs of client types rather than just individual client types. In addition, subjects need to account for future consequences of actions, either by estimating expected long-term reward or by experiencing a client leaving as a penalty (negative reward). Finally, learning is influenced by the absolute and relative abundance of client types. Thus, cognitive mechanisms and ecological conditions jointly explain intra- and interspecific variation in the ability to learn the adaptive response.
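The two requirements named in the abstract, representing pairs of clients and valuing future consequences, can be illustrated with a toy learner. The sketch below uses plain tabular Q-learning, which is a technique swapped in here for illustration and should not be read as the model used in the article. The arrival weights, learning rate, exploration rate, and all function names are hypothetical. States are the tuples of client types present at the station, and the discount factor gamma controls how much future food counts.

```python
import random

# Hypothetical arrival distribution for a vacated cleaning station:
# empty half of the time, otherwise a lone client or a mixed pair.
STATES = [(), ("resident",), ("visitor",), ("resident", "visitor")]
WEIGHTS = [4, 1, 1, 2]


def new_arrivals(rng):
    return rng.choices(STATES, weights=WEIGHTS)[0]


def actions(state):
    # With nobody present, the only option is to wait.
    return state if state else ("wait",)


def step(state, action, rng):
    """Return (reward, next_state) after the cleaner acts."""
    if action == "wait":
        return 0.0, new_arrivals(rng)
    remaining = list(state)
    remaining.remove(action)
    # An unserved resident waits for its turn; an unserved visitor leaves.
    if remaining == ["resident"]:
        return 1.0, ("resident",)
    return 1.0, new_arrivals(rng)


def train(gamma, steps=200_000, alpha=0.01, epsilon=0.2, seed=1):
    """Tabular Q-learning with values keyed by (pair of clients, choice)."""
    rng = random.Random(seed)
    q = {}
    state = new_arrivals(rng)
    for _ in range(steps):
        acts = actions(state)
        if rng.random() < epsilon:
            action = rng.choice(acts)  # occasional exploration
        else:
            action = max(acts, key=lambda a: q.get((state, a), 0.0))
        reward, nxt = step(state, action, rng)
        best_next = max(q.get((nxt, a), 0.0) for a in actions(nxt))
        old = q.get((state, action), 0.0)
        # gamma weights the value of whatever situation follows the choice.
        q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
        state = nxt
    return q


def visitor_preference(q):
    """Learned value of serving the visitor first minus the resident first."""
    pair = ("resident", "visitor")
    return q.get((pair, "visitor"), 0.0) - q.get((pair, "resident"), 0.0)


far_sighted = visitor_preference(train(gamma=0.9))
myopic = visitor_preference(train(gamma=0.0))
```

In this toy setting, the learner with gamma = 0.9 should end up valuing the visitor more highly when both clients are present, because serving the visitor guarantees a waiting resident on the next step. With gamma = 0.0 every served client is worth the same single unit of food, so the preference should stay near zero. Because the table is keyed by the whole pair, the value of choosing a visitor next to a resident is stored separately from the value of cleaning a lone visitor, which mirrors the abstract's point about representing combinations of client types.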