This study examined the impact of three kinds of caption instruction (full-word, keyword, and no-word captions) on students’ listening comprehension at a local university. The researchers interviewed the students about what they experienced and learned under each kind of caption instruction. The results showed that the audio-visual textbook can provide important contextual information and more readily stimulate students’ interest in learning. Of the three kinds of caption instruction, keyword captions were the most effective at improving students’ listening comprehension. The reason is that keyword captions can help students adjust their listening strategies, reduce their cognitive overload, complement their linguistic knowledge, and prompt them to confirm audio and visual information against each other.