Last year the Federal Reserve Board analyzed over 900,000 news stories. They weren’t interested in who would be the next president, the latest gadgets to be released, or what Miley Cyrus had named her cat. In fact, they weren’t interested in news at all. They were interested in prices.
What exactly does the news have to do with prices? According to the Federal Reserve Board, the secrets of prices yet to come may be bundled up in the text of news reports. Using neural-network-based artificial intelligence algorithms and controlling for neutral news stories, the board was able to predict stock market returns up to thirteen weeks after the stories had been published. It is a brave new world of big data, and we finally have the tools necessary to put our ever-increasing information into action. One of these tools is natural language processing (NLP)-based sentiment analysis, a form of text analysis.
At its most basic level, NLP-based sentiment analysis filters through the language used in a body of work, assigning sentiment values to a subject of interest. These values are then used to estimate the general sentiment towards this subject, which, in turn, provides information on its investment risk. Not all sentiment analysis is the same, however, and depending on the situation a polarity analysis or a subjectivity analysis will be preferable. A polarity analysis homes in on the negative and/or positive feelings associated with a given subject, while a subjectivity analysis gauges how much of what is said about the subject is opinion rather than fact. Either tool can be a powerful predictor of consumer preference and likely future activity.
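The scoring step described above can be sketched in a few lines of Python. This is a minimal illustration only, not Vector's or the Federal Reserve Board's method: the word list, its scores, and the function name `score_text` are all hypothetical, and a production system would use a learned model rather than a hand-written lexicon.

```python
import re

# Hypothetical toy lexicon (an assumption for illustration):
# word -> (polarity in [-1, 1], subjectivity in [0, 1]).
LEXICON = {
    "surge": (0.6, 0.3),
    "gain": (0.5, 0.2),
    "strong": (0.5, 0.6),
    "excellent": (0.9, 0.9),
    "loss": (-0.5, 0.2),
    "weak": (-0.5, 0.6),
    "plunge": (-0.7, 0.3),
    "terrible": (-0.9, 0.9),
}


def score_text(text):
    """Return (polarity, subjectivity), each averaged over the lexicon words found in text."""
    words = re.findall(r"[a-z']+", text.lower())
    hits = [LEXICON[w] for w in words if w in LEXICON]
    if not hits:
        # No known sentiment-bearing words: treat the text as neutral and objective.
        return 0.0, 0.0
    polarity = sum(p for p, _ in hits) / len(hits)
    subjectivity = sum(s for _, s in hits) / len(hits)
    return polarity, subjectivity


if __name__ == "__main__":
    print(score_text("Shares surge on strong earnings"))
    print(score_text("A terrible quarter ends in a heavy loss"))
```

The sign of the polarity score gives the direction of sentiment and its absolute value gives the magnitude. A lookup this simple ignores word order, negation, and irony entirely.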
The idea is simple, but its implementation is tricky. Language cannot be analyzed with a simple Ruby on Rails application. Seemingly small changes in syntax, word choice, and even punctuation result in significant changes to meaning. As the Oxford comma proponents remind us, “We invited the strippers, JFK, and Stalin,” differs significantly from “We invited the strippers, JFK and Stalin.” In addition to the complexity of small language choices, meaning and connotation change over time, trends develop and fade away, and humor can distort meaning in a way that people recognize but machines often miss.
How then can we build a reliable system of NLP-based sentiment analysis? One answer is machine learning tools like Vector. By incorporating a learning component into our text parsing algorithms, we are able to produce an intelligence of sorts. This intelligent, machine-learning program is the herald of our brave new world of data processing.
As NLP-based sentiment analysis becomes more efficient, its predictive power becomes more useful. The Federal Reserve Board isn’t the only organization experimenting with how to harness and use this information. Data scholars are exploring the possibilities from all angles. Selene Yue Xu at the University of California, Berkeley, for example, has successfully forecast stock prices with an algorithm that uses Google Trends as its body of work. Others are using Twitter. Wesley S. Chan at M.I.T. is looking into how return patterns differ for stocks that have recently been in the public news as compared to those that have not made the headlines. There is still much to study, but many features of NLP-based sentiment analysis can already be utilized.
Beyond economic governance and academia, NLP-based sentiment analysis has a role to play in business. This tool can already help investors mitigate risk by providing a quantitative assessment of public opinion. Public opinion, when calculated in this way, has been shown in studies such as those above to help predict stock market movements, which suggests that it will be an invaluable tool for investors. A number of traders and hedge-fund firms are already taking advantage of this information by purchasing broad reports on stock market expectations. The price tag can be high (iSentium charges its clients 15,000 USD/month for a Twitter-based daily +/- indicator on stocks), but the rewards to be reaped can be well worth the cost. With Vector’s faceted search capability, investors will not need to purchase full packages of data. Instead, they will be able to target the specific information that is relevant to them, pulling sentiment from broad bases of public information.
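To make the idea of a daily +/- indicator concrete, here is a rough Python sketch that collapses per-post polarity scores into one sign per day. It is an assumption-laden illustration, not a description of iSentium's or Vector's product: the function name `daily_indicator`, the neutral band `threshold`, and the sample data are all hypothetical.

```python
from collections import defaultdict


def daily_indicator(scored_posts, threshold=0.05):
    """Map each day to '+', '-', or '0' based on the mean polarity of that day's posts.

    scored_posts: iterable of (day, polarity) pairs, with polarity in [-1, 1].
    threshold: hypothetical dead band around zero that is reported as neutral.
    """
    by_day = defaultdict(list)
    for day, polarity in scored_posts:
        by_day[day].append(polarity)

    result = {}
    for day, values in sorted(by_day.items()):
        mean = sum(values) / len(values)
        if mean > threshold:
            result[day] = "+"
        elif mean < -threshold:
            result[day] = "-"
        else:
            result[day] = "0"
    return result


if __name__ == "__main__":
    # Made-up sample scores for a single stock.
    posts = [
        ("2017-01-02", 0.4),
        ("2017-01-02", -0.1),
        ("2017-01-03", -0.6),
        ("2017-01-04", 0.02),
    ]
    print(daily_indicator(posts))
```

Averaging treats every post equally. A real indicator would likely weight posts by reach or source reliability, which this sketch leaves out.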
The NLP marketplace is forecast to grow from 7.63 billion USD in 2016 to 16.07 billion USD by 2021. With the market growing alongside increasingly advanced and precise NLP technology, access to that technology is becoming essential for investors. Investors who are savvy enough to take advantage of NLP-based sentiment analysis will have a distinct advantage over their competitors.
Vector is a company built to help investors do just that. With state-of-the-art software that reports numerical sentiment subjectivity and sentiment polarity data on almost any subject in today’s newsfeed, Vector presents vital information in an easy-to-use, consumer-friendly format. Scan the bulk of the internet for indicators of both sentiment direction and magnitude. Access millions of news sites with greater efficiency than ever before. The future of investment is here, and it’s all about data.