This technology relates to a method for emotion classification using speech signals, and a recording medium and apparatus for performing the same. Specifically, it is a speech-based emotion classification technology that combines multiple emotion classification models.
Existing emotion classification methods that use audio signals are less accurate than those that use facial expressions, so a way to improve the accuracy of audio-based emotion classification is needed. To address this, the present technology comprises the steps of: constructing multiple emotion classification models trained on a large number of audio signals containing various emotions; calculating an emotion discrimination certainty factor using these models; and comparing the calculated certainty factor with a reference value.
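The three steps above can be illustrated with a minimal Python sketch. The source does not state how the certainty factor is computed, so this sketch assumes it is the highest class probability after averaging the outputs of all models. The names `certainty_factor`, `classify`, and `threshold`, the value 0.6, and the two stand-in models are all hypothetical and are used for illustration only.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

# Hypothetical interface: a trained model maps a speech feature vector
# to a probability for each emotion label.
EmotionModel = Callable[[Sequence[float]], Dict[str, float]]


def certainty_factor(
    models: Sequence[EmotionModel], features: Sequence[float]
) -> Tuple[str, float]:
    """Return the top emotion and its certainty factor.

    Assumption: the certainty factor is the largest class probability
    after averaging over all models. The source does not give a formula.
    """
    if not models:
        raise ValueError("at least one model is required")
    totals: Dict[str, float] = {}
    for model in models:
        for emotion, prob in model(features).items():
            totals[emotion] = totals.get(emotion, 0.0) + prob
    averaged = {emotion: total / len(models) for emotion, total in totals.items()}
    best = max(averaged, key=lambda emotion: averaged[emotion])
    return best, averaged[best]


def classify(
    models: Sequence[EmotionModel],
    features: Sequence[float],
    threshold: float = 0.6,  # hypothetical value to compare against
) -> Optional[str]:
    """Accept the top emotion only if its certainty factor reaches the threshold."""
    label, factor = certainty_factor(models, features)
    return label if factor >= threshold else None


# Stand-in models with fixed outputs, in place of trained classifiers.
def model_a(features: Sequence[float]) -> Dict[str, float]:
    return {"happy": 0.7, "sad": 0.2, "angry": 0.1}


def model_b(features: Sequence[float]) -> Dict[str, float]:
    return {"happy": 0.9, "sad": 0.05, "angry": 0.05}


example_features = [0.1, 0.2, 0.3]  # placeholder speech features
example_label = classify([model_a, model_b], example_features)
```

In this sketch, a result of `None` means the certainty factor fell below the comparison value. What the actual method does in that case is not specified in the text above.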
Accordingly, this technology can improve the accuracy of emotion classification by using all of the speech feature data extracted from the audio signals and by combining multiple machine-executable classifiers. It is applicable to the software, IT, internet, gaming, and entertainment fields.