Delayed-response tasks (DRTs) have been used to assess working
memory (WM) processes in humans and nonhuman animals. Experimental studies
have shown that the basal ganglia and prefrontal cortex (PFC) subserve DRT
performance. We hypothesize that the basal ganglia subserve selection of
both motor- and cognitive-related information (i.e., the uniform selection
hypothesis) to perform these tasks. We propose an Actor-Critic model (where
the matrisomes represent the Actor and the striosomes represent the Critic)
that simulates reward-based acquisition of these functions. The model
incorporates both closed- and open-loop pathways between the basal ganglia
and dorsolateral prefrontal cortex (DLPFC): the closed loop selects
task-relevant cognitive information to be maintained in WM, and the open
loop selects appropriate motor responses. Both types of selection are
trained with the temporal difference algorithm.
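As a minimal illustration of temporal difference training in an Actor-Critic arrangement, the sketch below shows one tabular TD(0) update in which a single prediction error adjusts both the Critic's value estimate and the Actor's action preference. It is not the model's implementation: the function names, the state labels ("cue", "delay", "go"), the learning rates, and the discount factor are all assumptions chosen for illustration.

```python
import math


def td_error(reward, value_next, value_now, gamma=0.9):
    """TD(0) prediction error: delta = r + gamma * V(s') - V(s)."""
    return reward + gamma * value_next - value_now


def softmax(prefs):
    """Turn action preferences into selection probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic_step(values, prefs, state, action, reward, next_state,
                      terminal=False, alpha_critic=0.1, alpha_actor=0.1,
                      gamma=0.9):
    """Apply one TD(0) update to the Critic (values) and Actor (prefs).

    All hyperparameters are illustrative assumptions.
    """
    v_next = 0.0 if terminal else values[next_state]
    delta = td_error(reward, v_next, values[state], gamma)
    values[state] += alpha_critic * delta          # Critic update
    prefs[state][action] += alpha_actor * delta    # Actor update
    return delta


# Hypothetical three-phase trial: cue -> delay -> go (response).
values = {"cue": 0.0, "delay": 0.0, "go": 0.0}
prefs = {s: [0.0, 0.0] for s in values}

# A rewarded response at the go phase ends the trial.
delta_go = actor_critic_step(values, prefs, "go", 1, 1.0, None, terminal=True)

# On a later pass, value propagates back from "go" to the delay phase.
delta_delay = actor_critic_step(values, prefs, "delay", 0, 0.0, "go")

probs_go = softmax(prefs["go"])
```

After the rewarded trial, the preference for the rewarded action at the go phase rises, and the value of the go phase begins to propagate back to the delay phase, which is the basic mechanism by which reward at the end of a trial can shape selection earlier in the trial.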
A novel feature of the model is the incorporation of delay-active neurons in
the striatum (as well as in the DLPFC). Another is the subdivision of the
matrisomal neurons into delay-active and transiently active populations,
which respectively maintain cognitive information and select motor
actions.
The model accounts for DRT performance, as evidenced by tracking the changes
in connection strength during learning. Further, the significance of the
uniform selection hypothesis is tested against lesion and reward-based
behavioral studies related to WM processes of the basal ganglia and DLPFC.