Country Risk Sentiment Analysis

Tuesday, 28 July 2016

Fernando Pratama  InfoTrie, Singapore

This report explores methods for building time series of country risk sentiment. The United States is chosen as the example country. Four risk topics are discussed: economic risk, legal risk, security risk, and political risk.

First, the keywords for each risk topic need to be extracted. Labeled US articles from Reuters and The Guardian are collected in order to generate these keywords. Tf-idf term weighting is then applied to the article set of each topic, and the 200 words with the highest weights in each topic are chosen as its keywords. Here are the top 10 words for each risk topic.
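The keyword step can be sketched roughly as follows. This is a minimal illustration, not the code used for this report. It assumes that idf is computed over the articles of all topics together and that a term's weight for a topic is the sum of its tf-idf over that topic's articles. The names `tokenize`, `top_keywords` and `articles_by_topic` are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def top_keywords(articles_by_topic, k=200):
    """Return the k highest-weighted terms for each topic.

    Assumption: idf is computed over all articles from all topics, and a
    term's topic weight is the sum of its tf-idf over that topic's articles.
    """
    # One (topic, term counts) pair per article.
    docs = [(topic, Counter(tokenize(text)))
            for topic, texts in articles_by_topic.items()
            for text in texts]
    n_docs = len(docs)

    # Document frequency: in how many articles each term occurs.
    doc_freq = Counter()
    for _, counts in docs:
        doc_freq.update(counts.keys())

    # Accumulate tf-idf per topic.
    weights = defaultdict(Counter)
    for topic, counts in docs:
        total = sum(counts.values())
        for term, count in counts.items():
            idf = math.log(n_docs / doc_freq[term])
            weights[topic][term] += (count / total) * idf

    return {topic: [term for term, _ in w.most_common(k)]
            for topic, w in weights.items()}
```

With this weighting, a word that occurs in every article (such as "the") gets an idf of zero and drops to the bottom of every topic's ranking, so topic-specific words rise to the top.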

These keywords are then used to query articles from the InfoTrie website. For this report, 10,000 articles were downloaded for each topic. The acquired data is then pre-processed into a time series of articles with sentiment scores. Here is a summary of the economic article time series.

NLP | Natural Language Processing | Sentiment Analysis

One way to measure topic risk is to perform sentiment analysis on each topic, where low sentiment implies high risk and vice versa. Past papers have examined the relationship between article sentiment and stock prices (1), (2). Since InfoTrie already provides a sentiment indicator, we can use it to create a sentiment time series to estimate country risk. The sentiment score ranges from 1 to 10. Here are the time series plots of average daily sentiment for each topic:
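Averaging the sentiment per day can be sketched as below. The input format, a list of (date, sentiment) pairs, is an assumption made for illustration; the actual structure of the InfoTrie data is not shown in this report, and `average_daily_sentiment` is a hypothetical helper name.

```python
from collections import defaultdict
from datetime import date


def average_daily_sentiment(articles):
    """Map each date to the mean sentiment of the articles published that day.

    `articles` is an iterable of (date, sentiment) pairs (assumed format).
    The result is ordered by date so it can be plotted directly.
    """
    buckets = defaultdict(list)
    for day, score in articles:
        buckets[day].append(score)
    return {day: sum(scores) / len(scores)
            for day, scores in sorted(buckets.items())}


# Toy input, made up for illustration only.
sample = [
    (date(2016, 7, 2), 3.0),
    (date(2016, 7, 1), 4.0),
    (date(2016, 7, 1), 6.0),
]
daily = average_daily_sentiment(sample)
```

Since low sentiment implies high risk, a falling daily average in this series would be read as rising risk for that topic.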

Country Risk Time Series | Sentiment Analysis

Another approach is to track specific keywords that relate to a particular risk event. First, the articles are grouped by date and converted into a document-term matrix with tf-idf weighting. For each day, the value of a term is then the sum of its weights across all of that day's documents.
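A rough sketch of this term time series follows. It assumes each article is one document, that idf is computed over the whole article set, and that 1 is added to the idf so a term found in every article still carries weight. These choices, and the name `daily_term_values`, are illustrative assumptions rather than the exact method used here. Days are given as ISO date strings so that sorting them also orders them chronologically.

```python
import math
import re
from collections import Counter


def daily_term_values(dated_articles, term):
    """Return {day: summed tf-idf weight of `term`}, with None for days without the term.

    `dated_articles` is a list of (day, text) pairs (assumed format).
    """
    docs = [(day, Counter(re.findall(r"[a-z]+", text.lower())))
            for day, text in dated_articles]
    series = {day: None for day, _ in docs}

    doc_freq = sum(1 for _, counts in docs if term in counts)
    if doc_freq == 0:
        return dict(sorted(series.items()))

    # Assumed idf variant: log(N / df) + 1.
    idf = math.log(len(docs) / doc_freq) + 1.0

    for day, counts in docs:
        if term in counts:
            tf = counts[term] / sum(counts.values())
            # Sum the term's weight over all documents of the same day.
            series[day] = (series[day] or 0.0) + tf * idf

    return dict(sorted(series.items()))
```

Days on which the term never occurs are kept as `None` rather than zero, so "no mention" stays distinguishable from "low weight".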

Examples of Country Risk Sentiment Analysis

Here are examples for the terms “greek” and “bailout” in the economic time series and “nuclear” in the political time series.

Economic Risk Time Series | Alternative Data

Economic Risk Time Series | Alternative Data

Environmental | Nuclear | Alternative Data

Note that a term which does not appear on a given day has no value for that day, which results in disconnected lines in the time series graphs.

