World’s Opinion Page — and COVID 19!

Text Mining & Analysis of project-syndicate.com articles related to COVID19 using NLP.

Rutvij Bhutaiya
Analytics Vidhya

Photo by United Nations COVID-19 Response on Unsplash

Project Syndicate is a prominent opinion website. Its tagline is ‘World’s Opinion Page’, and the description fits: world leaders, economists, and scholars from around the globe publish their views there on topics ranging from macroeconomics to world politics.

For this study, we decided to analyze their views, especially on the COVID19 crisis, with the help of NLP and text analysis. We specifically chose only articles related to the COVID19 topic.

We studied articles published from 1st April 2020 to 14th May 2020, which comes to 91 articles.

We analyzed the articles using three approaches.

  1. Sentiment Analysis and Term Frequency of all the articles in a given period.
  2. Term Frequency of all articles with clear text in a given period.
  3. Author and his/her particular article Sentiment Analysis and Term Frequency analysis.

Study 1: Sentiment Analysis and Term Frequency of all the articles

Study 1 focuses on all the articles collected from the project-syndicate website. Here, we chose only COVID-related articles from authors around the world. In this study, we computed TF-IDF (term frequency-inverse document frequency).
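As a rough illustration of what TF-IDF measures, here is a minimal Python sketch. It is not the code used in this study: the function names (`tokenize`, `tf_idf`) and the three sample texts are hypothetical stand-ins, and the weighting shown is one common textbook variant, assumed here for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tf_idf(documents):
    """Return one {term: score} dict per document.

    tf  = term count / total number of terms in the document
    idf = log(number of documents / number of documents containing the term)
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical stand-in texts, not taken from the scraped articles.
articles = [
    "The crisis demands a global response to the pandemic",
    "A vaccine is the only exit from the pandemic",
    "The crisis will reshape global trade",
]
scores = tf_idf(articles)

# Three highest-scoring terms of the second text.
top_terms = sorted(scores[1].items(), key=lambda kv: kv[1], reverse=True)[:3]
print(top_terms)
```

A term that occurs in every text (here, "the") gets a score of zero, while a term confined to one text (here, "vaccine") is weighted up. This is why TF-IDF surfaces terms that are distinctive for a document, not just frequent.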

Word Cloud

As we can see in the output, the most prominent terms in the articles are COVID and crisis, as expected. Based on this, we can argue that the collection of articles is good and meets our objective. The next important part of our study is sentiment analysis.

Sentiment Analysis

Analysis of the articles indicated positivity and trust. This is an important factor in these unprecedented times.
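For readers unfamiliar with how categories such as positivity and trust can be scored, the following Python sketch shows one simple lexicon-based approach. This is an assumption for illustration only: the article does not state which lexicon or tool produced the result, and the tiny `LEXICON` below is made up. A real analysis would use a published word-emotion lexicon.

```python
import re
from collections import Counter

# Tiny hypothetical lexicon for illustration only; it maps a word to the
# sentiment categories it is assumed to signal.
LEXICON = {
    "hope": {"positive", "trust"},
    "recovery": {"positive"},
    "cooperation": {"positive", "trust"},
    "crisis": {"negative", "fear"},
    "risk": {"negative", "fear"},
    "death": {"negative", "fear"},
}


def sentiment_counts(text, lexicon=LEXICON):
    """Count how many tokens of the text fall into each sentiment category."""
    counts = Counter()
    for token in re.findall(r"[a-z]+", text.lower()):
        # Words missing from the lexicon contribute nothing.
        counts.update(lexicon.get(token, ()))
    return counts


# Hypothetical sample sentence, not taken from the scraped articles.
sample = "Cooperation gives hope for recovery despite the crisis"
result = sentiment_counts(sample)
print(result)
```

The category with the highest count is then read as the dominant sentiment of the text. The outcome depends heavily on which words the chosen lexicon covers.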

Study 2: Term Frequency of all articles with clear text in a given period

This study is interesting, and we came across it by accident. At the time of web scraping, the format of the data was not clean, so we decided to remove white spaces and separate the Date and Author fields for better analysis. We used the gsub() function for this, and in the process we found many interesting phrases in the articles. We also studied term frequency for these, and the following is the output.

Term Frequency

A few phrases are ads, such as bookreview and andinterviews.

We ignored these terms and focused on a few important ones: thecrisisofalifetime, which has a count of 182; howwillthegreatcessationend and howtodevelopacovid-19vaccineforal, which again have counts of 182; and ‘aboutthepandemic’sthreattotheruleoflaw’, which has a count of 91.

This study points to the authors' uncertainty about a COVID vaccine, the economic consequences after COVID, and recession.
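Fused phrases such as thecrisisofalifetime are what one would expect when white space is stripped from a text field before counting. The Python sketch below illustrates that effect with `re.sub`, a rough counterpart of R's gsub(). It is not the original script: the `date | author | headline` record layout, the helper names, and the sample records are all hypothetical.

```python
import re
from collections import Counter


def strip_whitespace(text):
    # Rough Python counterpart of calling gsub() in R with a whitespace
    # pattern and an empty replacement string.
    return re.sub(r"\s+", "", text)


def split_record(record):
    """Split a hypothetical 'date | author | headline' record into fields.

    The headline is lowercased and stripped of all white space, which fuses
    its words into a single token.
    """
    date, author, headline = (part.strip() for part in record.split("|"))
    return date, author, strip_whitespace(headline.lower())


# Hypothetical records; the real scraped data is not reproduced here.
records = [
    "May 6, 2020 | Author A | The Crisis of a Lifetime",
    "Apr 22, 2020 | Author B | The Crisis of a Lifetime",
    "Apr 22, 2020 | Author B | Book Review",
]

# Term frequency over the fused headline tokens.
phrase_counts = Counter(split_record(r)[2] for r in records)
print(phrase_counts.most_common())
```

Because each fused phrase is counted as a single term, text that repeats across many pages produces high counts.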

Study 3: Author and his/her particular article Sentiment Analysis and Term Frequency analysis

In this study, we analyzed individual articles by two authors, running term frequency and sentiment analysis on each.

First, we analyzed Kaushik Basu’s article, which he published on PS on May 6th, 2020. Second, we analyzed Andrew Scott’s article, published on April 22nd, 2020.

Term Frequency

Based on the TF-IDF analysis, we can say that Kaushik Basu frequently used the terms people and risks in his article, whereas Andrew Scott is trying to convey a message from 1920, the Great Depression.

But let’s see what the sentiment analysis says.

Sentiment Analysis

Based on the sentiment analysis, we can say that Kaushik Basu’s article reflects positivity, but at the same time it is also negative and fearful. Andrew Scott’s article, on the other hand, shows significant positivity.
