Step 4: PCA-based Subspace Matching in the Right Inferior Frontal Gyrus
Together, the early vision system and the LOC form a feature extraction subsystem, with the early vision system computing Gabor features and the LOC transforming them into non-accidental feature vectors, as shown in Figure 1. Similarly, the fusiform gyrus and right inferior frontal gyrus combine to form a feature-based appearance matching subsystem.

The appearance-based matching system is divided into two components: an unsupervised clustering system and a subspace projection system. This division is motivated by the psychological observation that categorical and instance-level recognition cannot be dissociated, and by the mathematical observation that subspace projection methods compress data by exploiting the commonality among images. If the images are too diverse, for example pictures of faces, pets, and chairs, then there is no commonality for the subspaces to exploit.

To avoid this, we model the fusiform gyrus as an unsupervised clustering system, and the right inferior frontal gyrus as a subspace matching system. This anatomical mapping is partly for simplicity; the exact functional division between these structures is not clear. Lesion studies associate the right inferior frontal lobe with visual memory [20], and rTMS and PET data suggest that these memories are compressed images [16]. Since compressed memories are stored in the frontal gyrus, it is easy to imagine that they are matched there as well, perhaps using an associative network. At the same time, clustering is the first step that is unique to expert recognition, and the fusiform gyrus is the first anatomically unique structure on the expert pathway, so it makes sense to associate clustering with the fusiform gyrus. It is not clear, however, where images are projected into cluster-specific subspaces; it could be in either location, or both.

It is important to note that the categories learned by the clustering mechanism in the fusiform gyrus are non-linguistic. The images in a cluster do not need to be of the same object type or viewpoint, nor do all images of one object need to appear in one cluster. Clustering simply divides the training data into small groups of similar samples, so that PCA can fit a unique subspace to each group. This is similar to the localized subspace projection models in [7, 13]. We have implemented K-Means and an EM algorithm for mixtures of PCA analyzers similar to [32]. Surprisingly, so far we get the best results by using K-Means and overestimating the number of clusters K, possibly because non-symmetric Gaussians can be approximated by collections of symmetric ones.
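The cluster-then-project idea can be illustrated with a minimal sketch. It is not the implementation described in this paper: it assumes NumPy, a plain Lloyd-style K-Means, and PCA computed by SVD. All function names (kmeans, fit_local_pca, reconstruction_error, best_subspace) are hypothetical, and selecting a subspace by smallest reconstruction error is an assumption made for illustration, since the text does not specify how a probe image is assigned to a subspace.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd-style K-Means (illustrative, not the authors' code)."""
    rng = np.random.default_rng(seed)
    # Initialise centers from k distinct training samples.
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Squared Euclidean distance from every sample to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # leave empty clusters where they are
                centers[j] = members.mean(axis=0)
    return labels, centers

def fit_local_pca(X, labels, k, n_components):
    """Fit one PCA subspace (mean, basis rows) per non-empty cluster."""
    models = {}
    for j in range(k):
        members = X[labels == j]
        if len(members) == 0:
            continue
        mean = members.mean(axis=0)
        # Rows of vt are the principal directions of the centered cluster.
        _, _, vt = np.linalg.svd(members - mean, full_matrices=False)
        models[j] = (mean, vt[:n_components])
    return models

def reconstruction_error(x, model):
    """Distance from x to its projection onto a cluster's subspace."""
    mean, basis = model
    centered = x - mean
    residual = centered - basis.T @ (basis @ centered)
    return float(np.linalg.norm(residual))

def best_subspace(x, models):
    """Assumed selection rule: the cluster whose subspace reconstructs x best."""
    return min(models, key=lambda j: reconstruction_error(x, models[j]))
```

In this sketch, overestimating K simply means calling kmeans with more clusters than the expected number of categories, so that each local subspace is fit to a smaller, more homogeneous group of feature vectors.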