Textual Content and Engagement Correlation Analysis with Naive Bayes

Tomislav Krištof, Vanja Šebek, Mario Fraculj


With the steady improvement of sentiment analysis software, it has become possible to determine whether the sentiment of a piece of content correlates with the engagement it receives. By combining two platforms, we show that there is a moderate correlation between content sentiment and content engagement. Furthermore, there are other correlations involving numeric variables that describe properties of the content, such as content length and title length, compared with content consumption and engagement. The computed values show a strong negative correlation between content length and content consumption. The content platform was the Medium.com social network, and the software platform for sentiment determination was an online tool based on an enhanced Naïve Bayes model. To find correlations we used Pearson’s correlation coefficient, because it conveys both the magnitude of the association and the direction of the relationship.


natural language processing, sentiment analysis, Pearson’s correlation coefficient, enhanced Naïve Bayes model
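
As a brief illustration of the statistic named above, the following is a minimal sketch of Pearson’s correlation coefficient in plain Python. The function name `pearson_r`, the variable names, and the sample numbers are hypothetical and made up purely for illustration; they are not data or results from this study.

```python
import math


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (proportional to the covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (proportional to the variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical, invented values: word counts and the fraction of readers
# who finished each article. Not taken from the paper.
content_length = [300, 800, 1500, 2500, 4000]
read_ratio = [0.90, 0.75, 0.60, 0.40, 0.25]

r = pearson_r(content_length, read_ratio)
print(f"r = {r:.3f}")  # the sign gives the direction, |r| the magnitude
```

The sign of the returned value indicates the direction of the relationship and its absolute value the strength, which is the property the abstract cites as the reason for choosing this coefficient.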

