This paper presents an approach to isolated-word automatic speech recognition (ASR) based on mel-spectrogram features and a recurrent neural network model. The mel-spectrogram features are used to capture the significant characteristics of the speech signal. A Long Short-Term Memory (LSTM) architecture for classifying variable-length feature sequences is presented. Although it is evaluated here on speech, the architecture can be applied to any sequence modelling or classification task. The experimental setup includes words of the Chinese language collected from five speakers. These words were spoken in an acoustically balanced, noise-free environment. The experimental results show an improvement of about 16.44% over a Dynamic Time Warping (DTW) based method.
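For orientation, the DTW baseline mentioned above can be illustrated with a minimal sketch. It is not the implementation used in the experiments: the Euclidean local cost, the unconstrained warping path, the nearest-template decision rule, and the names `dtw_distance` and `classify` are all assumptions made for illustration. The sketch shows how two feature sequences of different lengths, such as the frame sequences of two utterances, can be compared directly.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors.

    Assumption: Euclidean local cost and the classic three-way step pattern;
    the baseline in the paper may use a different variant.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def classify(query, templates):
    """Return the label of the template with the smallest DTW cost (hypothetical helper)."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))
```

In this sketch each word is represented by a single stored template, and a query utterance receives the label of the closest template, whereas the LSTM approach learns a classifier from the variable-length sequences themselves.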