Multimodal systems have previously been used to improve quality and safety inspection in various domains, though few studies have evaluated these systems for accuracy and user comfort. Our research aims to combine our software interface, designed for high usability, with multimodal hardware configurations and to evaluate the resulting systems to determine their benefits for user performance and their level of user acceptance. We present two multimodal systems that use a novel system-directed interface to aid in inspecting vehicles along the assembly line: (1) a wearable monocular display with speech input and audio output and (2) a large-screen display with speech input and audio output. We conducted two evaluations: (a) an experimental evaluation with novice users, yielding accuracy, timing, user preference, and other performance results, and (b) an expert-based usability evaluation conducted on and off the assembly line, providing insight into user acceptance, preferences, and performance potential in the production environment. We also compared these systems to the technology currently used in the production environment: a handheld display without speech input or audio output. Our results show that, for visual and tactile tasks, the benefits of system-directed interfaces are best realized when they are used with multimodal systems that reduce visual and tactile interaction per item and instead deliver system-directed information on the audio channel. Interface designers who combine system-directed interfaces with multimodal systems can expect faster and more efficient user performance when the delivery channel differs from the channels needed for task completion.