Abstract:
The state-of-the-art total variability factor analysis in language recognition can only preserve the global structure of speech data, without mining the local structure information, and the language of training speech data is not considered. To solve the problem, neighborhood preserving embedding (NPE) algorithm is introduced into the language recognition system. NPE can preserve the local neighborhood structure of speech data by constructing a neighborhood graph. As well,NPE can effectively use the language label information of speech data by supervised training. The proposed method is compared with the total variability factor analysis in 30 s and 10 s tasks of the NIST 2011 language recognition evaluation (LRE) dataset. The experimental results indicate that the proposed NPE method can overcome the deficiency of total variability factor analysis and improve the system performance significantly.