Robuta

https://www.mdpi.com/1424-8220/23/4/2284
Audio-visual speech recognition (AVSR) is one of the most promising solutions for reliable speech recognition, particularly when audio is corrupted by noise....
audio visual, gesture recognition, speech, sensors, mobile
https://openreview.net/forum?id=vWSll6M9pj&referrer=%5Bthe%20profile%20of%20Stavros%20Petridis%5D(%2Fprofile%3Fid%3D~Stavros_Petridis1)
Research in auditory, visual, and audiovisual speech recognition (ASR, VSR, and AVSR, respectively) has traditionally been conducted independently. Even recent...
speech recognition, unified, single, model, auditory
https://openreview.net/forum?id=bJuLbTmkeR&referrer=%5Bthe%20profile%20of%20Stavros%20Petridis%5D(%2Fprofile%3Fid%3D~Stavros_Petridis1)
Multimodal large language models (MLLMs) have recently become a focal point of research due to their formidable multimodal understanding capabilities. For...
large language models, visual speech recognition, strong, audio, learners