A Whisker of Truth: A Multimodal Interdisciplinary Machine Learning Approach to Vocal, Visual, and Tactile Signals in the Domestic Cat
We propose a multimodal deep learning framework for automated analysis of cat–human communication, integrating acoustic, visual, and tactile signals through transformer-based fusion. Using the largest expert-annotated dataset of its kind and interdisciplinary collaboration, we combine semi-supervised learning with ethological and phonetic expertise to detect subtle behavioural and phonetic cues, enable
