Speaker-Independent Mandarin Polysyllabic Word Recognition
Student: Berlin Chen    Advisor: Dr. Chi-Min Liu
Institute of Computer Science and Information Engineering
National Chiao Tung University
This thesis considers the design of a speaker-independent Mandarin polysyllabic word recognition system from two main viewpoints: phonetic modeling and recognition speed. We first establish a baseline system that builds on a system developed last year. The baseline system improves recognition performance by increasing the amount of training data and adopting another feature. Building on the baseline system, we consider the acoustic characteristics of Mandarin speech for phonetic modeling. We design and experiment with three phonetic models: a context-independent INITIAL model, a right-context-dependent INITIAL model, and a right-context-dependent null-INITIAL model. With the most accurate model, the system achieves average recognition rates of 99.1%, 93.7%, and 83.6% for the top 1 word, and 99.8%, 98.5%, and 95.2% for the top 3 words, on the 500-word, 5000-word, and 25000-word tasks, respectively. On the basis of these recognition results, we consider search algorithms to increase search efficiency. Since the tree-trellis search has the potential to greatly reduce computation time without degrading the recognition rate of the baseline system, we adopt it as the basic framework and investigate four implementation techniques. The results show that the search time of the tree-trellis search depends only slightly on the vocabulary size. For comparison with the tree-trellis search, we further develop a beam search algorithm, which we call the fast-match search, for our recognition system. A real-time demo system has been implemented on a Pentium-90 PC for extensive testing.
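As an illustration of how top 1 and top 3 recognition rates such as those quoted above are conventionally computed, the following minimal Python sketch counts an utterance as correct when its reference word appears among the first N ranked candidates. The function name `top_n_rate` and the toy word lists are hypothetical and are not taken from the thesis; the thesis's own scoring procedure may differ in detail.

```python
from typing import Sequence


def top_n_rate(ranked: Sequence[Sequence[str]], refs: Sequence[str], n: int) -> float:
    """Fraction of utterances whose reference word is among the first n candidates.

    ranked[i] is the recognizer's candidate list for utterance i, best first.
    refs[i] is the correct word for utterance i.
    (Hypothetical helper for illustration; not the thesis's code.)
    """
    if len(ranked) != len(refs):
        raise ValueError("ranked and refs must have the same length")
    if not refs:
        raise ValueError("at least one utterance is required")
    if n < 1:
        raise ValueError("n must be at least 1")
    hits = sum(1 for cands, ref in zip(ranked, refs) if ref in cands[:n])
    return hits / len(refs)


# Toy data (made up): four utterances, three ranked candidates each.
ranked = [
    ["taipei", "taibei", "tainan"],
    ["hsinchu", "xinzhu", "hsinchuang"],
    ["kaohsiung", "gaoxiong", "keelung"],
    ["tainan", "taiwan", "taipei"],
]
refs = ["taipei", "xinzhu", "keelung", "hualien"]

top1 = top_n_rate(ranked, refs, 1)  # only the first utterance is correct at rank 1
top3 = top_n_rate(ranked, refs, 3)  # the first three utterances are correct within rank 3
print(f"top 1: {top1:.2%}, top 3: {top3:.2%}")
```

By construction the top 3 rate can never be lower than the top 1 rate, which matches the pattern in the figures reported above.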