A Text Preprocessing Mechanism for Sentiment Analysis

A. P. Beltiukov; M. M. Abbasi

Information Technologies for Intelligent Decision Making Support, 2018

Last modified: 2018-06-20

Abstract

This paper examines approaches to preprocessing text before applying methods for identifying and analyzing the emotions expressed in it. Data preprocessing is recognized as a very important phase in all machine learning methods, yet it has not received due attention in sentiment analysis. We review the work of various researchers in the field of text preprocessing and show why such processing matters and which factors or mechanisms it requires. For different types of text, we show how the type and purpose of a text affect its preprocessing. In the conclusions section, we summarize the material reviewed and draw some conclusions about text preprocessing for emotion analysis.

Keywords

text processing; data processing; text analysis
