Different FEMs emphasize different aspects of a sample. For example, some attend more to contour information, whereas others emphasize texture information. A single-head feature is only a one-sided representation of the sample. In addition to the negative influence of the cross-domain setting (e.g., the trained FEM cannot adapt perfectly to novel classes), the distribution of the novel data may deviate to some degree from the ground-truth distribution, which we dub the distribution-shift problem (DSP). To address the DSP, we propose the Multi-Head Feature Collaboration (MHFC) algorithm, which attempts to project the multi-head features into a unified space and fuse them to capture more discriminative information.
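
To make the project-then-fuse idea concrete, the following is a minimal sketch and not the MHFC algorithm itself. It assumes that each head yields a feature matrix of shape (n_samples, d_head), uses per-head PCA as a stand-in projection to a common dimensionality, and fuses by concatenation. The function names `project_head` and `fuse_heads`, the PCA projection, the L2 normalization, and the concatenation step are all illustrative assumptions that the text above does not specify.

```python
import numpy as np


def project_head(features, dim):
    """Project one head's features (n_samples, d_head) down to `dim` dimensions.

    PCA is an assumed stand-in for the projection; it requires
    dim <= min(n_samples, d_head).
    """
    centered = features - features.mean(axis=0, keepdims=True)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    projected = centered @ vt[:dim].T
    # L2-normalize each sample so heads contribute on a comparable scale.
    norms = np.linalg.norm(projected, axis=1, keepdims=True)
    return projected / np.maximum(norms, 1e-12)


def fuse_heads(head_features, dim):
    """Project every head to `dim` dimensions and concatenate the results.

    Concatenation is an assumed fusion rule: the per-head PCA bases are not
    aligned with each other, so element-wise averaging would not be meaningful.
    """
    projected = [project_head(f, dim) for f in head_features]
    return np.concatenate(projected, axis=1)


# Hypothetical usage: two heads with different feature dimensions.
rng = np.random.default_rng(0)
heads = [rng.normal(size=(20, 64)), rng.normal(size=(20, 32))]
fused = fuse_heads(heads, dim=8)
print(fused.shape)
```

In this sketch the fused representation has `dim` columns per head, so a downstream classifier sees information from every head at once rather than a single one-sided feature.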