Emotion Recognition from Speech Signals Based on Filter Methods

Article Type: Research Article

Authors

1 MSc, Faculty of Electrical Engineering, Najafabad Branch, Islamic Azad University, Najafabad, Iran

2 Assistant Professor, Faculty of Electrical Engineering, Najafabad Branch, Islamic Azad University, Najafabad, Iran

Abstract

Speech is the primary means of human communication. With the growth of interaction between humans and machines, automating the dialogue between the two and removing the human operator have attracted attention. The aim of this research is to determine a set of features that are effective for emotion recognition based on the speech signal. In this paper, a system was designed consisting of three main stages: feature extraction, feature selection, and classification. First, widely used features were extracted: mel-frequency cepstral coefficients (MFCC), linear prediction coefficients (LPC), perceptual linear prediction (PLP) coefficients, formant frequencies, zero-crossing rate, cepstral coefficients, pitch frequency, mean, jitter, shimmer, energy, Fourier transform coefficients, the minimum and maximum value in each window, the amplitude of each signal, and standard deviation. Next, filter methods, namely the Pearson correlation coefficient, the t-test, Relief, and information gain, were used to select and rank the features most effective for emotion recognition. The resulting feature subsets were then given as input to a classifier; at this stage, a multiclass support vector machine was used to classify seven emotion classes. According to the results, Relief feature selection combined with the multiclass support vector machine achieves the highest classification accuracy, with an emotion recognition rate of 93.94%.
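Several of the features listed in the abstract (zero-crossing rate, energy, mean, minimum, maximum, amplitude, standard deviation) are simple per-window statistics. As an illustrative sketch only (the window length, hop size, and the peak-to-peak reading of "amplitude" are assumptions, not the authors' stated configuration), they can be computed as:

```python
import numpy as np

def frame_features(signal, frame_len=256, hop=128):
    """Per-window statistics named in the abstract: zero-crossing rate,
    energy, mean, minimum, maximum, amplitude (taken here as peak-to-peak),
    and standard deviation. Frame and hop sizes are illustrative defaults."""
    rows = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        zcr = np.mean(np.abs(np.diff(np.sign(frame))) > 0)  # zero-crossing rate
        energy = np.sum(frame ** 2)                         # short-time energy
        rows.append([zcr, energy, frame.mean(), frame.min(), frame.max(),
                     frame.max() - frame.min(), frame.std()])
    return np.array(rows)

# Toy input: a 1 kHz tone sampled at 16 kHz.
t = np.arange(16000) / 16000.0
feats = frame_features(np.sin(2 * np.pi * 1000 * t))
```

Spectral features such as MFCC, LPC, and PLP need a proper speech-processing pipeline (pre-emphasis, mel filter bank, cepstral transform) and are omitted from this sketch.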

Article Title [English]

Emotion Recognition of Speech Signals Based on Filter Methods

Authors [English]

  • Narjes Yazdanian 1
  • Hamid Mahmoodian 2
1 MSc - Department of Electrical Engineering, Najafabad Branch, Islamic Azad University, Najafabad, Iran
2 Assistant Professor - Department of Electrical Engineering, Najafabad Branch, Islamic Azad University, Najafabad, Iran
Abstract [English]

Speech is the basic means of communication among human beings. With the increase of interaction between humans and machines, the need for automatic dialogue between the two and for removing the human operator has received attention. The aim of this study was to determine a set of features that are effective for emotion recognition based on the speech signal. A system was designed that comprises three main stages: feature extraction, feature selection, and classification. After extracting widely used features such as mel-frequency cepstral coefficients (MFCC), linear prediction coefficients (LPC), perceptual linear prediction (PLP) coefficients, formant frequencies, zero-crossing rate, cepstral coefficients, pitch frequency, mean, jitter, shimmer, energy, minimum, maximum, amplitude, and standard deviation, filter methods such as the Pearson correlation coefficient, the t-test, Relief, and information gain were applied to rank and select the features effective for emotion recognition. The resulting feature subsets were then given as input to a classifier; at this stage, a multiclass support vector machine was used to classify seven emotion classes. According to the results, Relief feature selection combined with the multiclass support vector machine achieves the highest classification accuracy, with an emotion recognition rate of 93.94%.
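Of the four filter methods named in the abstract, Relief is the one reported as performing best. The following is a minimal sketch of the basic binary-class Relief weighting, assuming features pre-scaled to [0, 1] and Manhattan distance for neighbour search; the paper's exact Relief variant and parameters are not specified here.

```python
import numpy as np

def relief(X, y, n_iter=None, seed=0):
    """Basic Relief: for each sampled instance, find its nearest same-class
    neighbour (hit) and nearest other-class neighbour (miss), and reward
    features that separate the miss more than the hit."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    idx = rng.integers(0, n, n_iter or n)
    for i in idx:
        dists = np.abs(X - X[i]).sum(axis=1)   # Manhattan distance to all points
        dists[i] = np.inf                      # exclude the sample itself
        same = (y == y[i])
        hit = np.argmin(np.where(same, dists, np.inf))    # nearest same-class
        miss = np.argmin(np.where(~same, dists, np.inf))  # nearest other-class
        w += np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])
    return w / len(idx)

# Toy data: feature 0 tracks the class label, feature 1 is pure noise.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 200)
X = np.column_stack([0.8 * y + rng.uniform(0, 0.2, 200),   # informative
                     rng.uniform(0, 1, 200)])              # noise
w = relief(X, y)
```

Ranking features by descending weight then yields the ordered subsets that the abstract describes feeding to the classifier.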

Keywords [English]

  • Speech emotion recognition
  • Feature extraction
  • Feature selection
  • Filter methods
  • Support vector machine
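The classifier named above is a multiclass support vector machine over seven emotion classes. Since the paper's kernel and hyperparameters are not given here, the following is only a sketch: a one-vs-rest linear SVM trained with the Pegasos sub-gradient method, demonstrated on toy three-class data (`lam`, `epochs`, and the data itself are illustrative assumptions).

```python
import numpy as np

def train_ovr_svm(X, y, lam=0.01, epochs=200, seed=0):
    """One-vs-rest linear SVM via the Pegasos sub-gradient method: one
    hyperplane per class, trained on 'this class vs. the rest'."""
    rng = np.random.default_rng(seed)
    classes = np.unique(y)
    Xb = np.hstack([X, np.ones((len(X), 1))])  # absorb the bias term
    W = np.zeros((len(classes), Xb.shape[1]))
    for k, c in enumerate(classes):
        yy = np.where(y == c, 1.0, -1.0)       # binary labels for this class
        w, t = np.zeros(Xb.shape[1]), 0
        for _ in range(epochs):
            for i in rng.permutation(len(Xb)):
                t += 1
                eta = 1.0 / (lam * t)          # decaying step size
                if yy[i] * (w @ Xb[i]) < 1.0:  # hinge-loss violation
                    w = (1.0 - eta * lam) * w + eta * yy[i] * Xb[i]
                else:
                    w = (1.0 - eta * lam) * w
        W[k] = w
    return classes, W

def predict(classes, W, X):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return classes[np.argmax(Xb @ W.T, axis=1)]  # highest-scoring class wins

# Toy three-class, well-separated data (stand-in for real emotion features).
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [10.0, 0.0]])
X = np.vstack([c + 0.3 * rng.standard_normal((10, 2)) for c in centers])
y = np.repeat([0, 1, 2], 10)
classes, W = train_ovr_svm(X, y)
acc = float(np.mean(predict(classes, W, X) == y))
```

For seven emotion classes the scheme is identical, with seven one-vs-rest hyperplanes; a kernel SVM would replace the linear score with a kernel expansion.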