What can a fundamental investment manager learn from machine learning? Conversely, what can a data scientist learn from fundamental investment? Plenty, it turns out, for both. Over recent years, BALANSTONE has conducted deep-dive technology research on machine learning and deep neural networks. This research has led to the design and development of a new AI for fundamental investment. In this article, we share what we have learned along the way: the keys to developing AI for fundamental investment research.
Evolution of AI (ML and DNN)
AI has made considerable progress in the last several years. Machine learning (ML), including text analysis and recommender systems, has advanced significantly over the last decade.
In particular, deep neural networks (DNNs) have advanced markedly, and many successes have been reported in applications such as natural language processing, image recognition, and text analysis. The recent success of deep reinforcement learning has been remarkable: it beat one of the world’s top-ranked professional Go players. That event became a milestone in the birth of super-human-level AI.
Several catalysts enabled this success: the elements and design of the models, the size of the data available for training, and improvements in computing capability. Together they gave rise to the theme “AI replaces humans.” https://goo.gl/FBX5j3
Fundamental Active Investment
The story of AI’s evolution is nothing new, but most professionals in the fundamental investment community must have asked themselves: will AI technology transform fundamental active investment, and if so, how? Everyone in the industry is aware of its potential influence, as the equity market already indicates. At the same time, the fundamental investment industry is undergoing a structural shift. In the search for higher return potential and accuracy (i.e., lower risk), bottom-up information gathering has become deeper and more intensive. Focused investing and engagement have become popular strategies. Some have even pushed the early acquisition of legitimate information to its limit, which has ended in several insider-information cases. Meanwhile, new rule-based, low-cost funds are gaining share, and the elevated fees of hedge funds (HF) and private equity (PE) are coming under scrutiny. All in all, the applications of the fundamental approach have diversified, but the core approach of fundamental research remains largely unchanged, and the same textbooks have been used for many years. Innovation in fundamental investment research does not seem to be progressing at the same velocity as in other industries, even as the industry seeks new opportunities in ESG and other initiatives.
We asked the same question before and after our work to assess the current feasibility of AI for fundamental investment, which is structurally and conceptually different from what is traditionally called ‘quantitative investment.’ The development process gave us several clues that we could not have found otherwise.
Artificial Fundamental Investment Research Intelligence (AFIRI)
Algorithmic program trading, by contrast, has been progressing for years, and applying AI to such algorithms has already produced good results. The algorithm rapidly evaluates trading positions and continuously selects the best next move without human intervention; no human can execute trading at such ultra-high speed. There are even platforms that compare the performance of AI trading: they provide a development environment, let participants compete, reward them, and gain information from their activity. Continued innovation will make trading more and more automated, and human traders will find new roles as developers and managers of the trading machines.
However, AI for fundamental investment research is “fundamentally” different from an automated AI trading system. Our objective was to model the process of intrinsic-value-oriented investment, not to learn from successful AI trading strategies and apply them. Intrinsic-value-oriented fundamental investment is not a sequence of consecutive successful trades. The following description is useful.
“Intrinsic value dominates the process of fundamental investment management and price is just an attached tag to the business as a system to deliver an excess return. Price fluctuates by its nature according to the speculative activities and the market making function to balance demand and supply continuously. Price movement must be understood as a crucial factor of action, but it is not the dominant and primary factor that governs the fundamental investing decision.”
To clarify further, consider what would happen if a portfolio’s objective were set to maximize the short-term risk-adjusted return on the following rationale.
“Because long-term return is simply a sum (compounded) of short-term returns, to be successful in the long term is the same as to be successful in the short term. As such, we have to control short-term return and measure our results as the most important performance indicator. That makes the investment process successful in the long term.”
What is wrong? It looks as if there were no gap between short-term trading and long-term investment.
Most of you will have seen through this rhetoric immediately. It ignores the fact that the process and approach of short-term trading are entirely different from those of fundamental investment. Time horizon forms the distinct border between the two.
The best investment is not the same as a continuous series of successfully executed trading ideas. A trading strategy needs to be oriented toward price prediction; it is essentially a price-oriented decision generator. Intrinsic value does not play a key role in determining the decision. AI is typically deployed to analyze and predict relative price behavior across many assets.
As a result, a great AI framework developed and optimized for trading is not compatible with fundamental investment research, which has a different objective, although AI for trading may (depending on the execution process) help an investment team’s traders when executing orders.
A distinct name tells what specific task an AI performs. Automated AI algorithmic trading can, for instance, be called ATI, Artificial Trading Intelligence. For fundamental investment management and research, it should be called AFIRI, Artificial Fundamental Investment Research Intelligence.
This makes it clear that an AI is tied to the specific target process that is being intellectualized. That process must be explicit to be modeled accurately.
Framing the General Process of AI
Understanding the process flow of AI at a high level clarifies what data should be created and how it should be processed in each element of the pipeline. The typical five steps are:
- Representation of Environment
- ML/DL Process
The environment selects the targeted activity and sets how the whole AI task is defined against it. Representation is the conversion of the targeted activity into a format that can be processed in the training of ML frameworks or deep neural networks. Hyperparameters are set and tuned for the ML/DL process, and the dataset is ingested during training so that the model develops knowledge while minimizing the penalty for error. The completed network is tested and then used to provide predictions on new input data.
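The flow described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not our production pipeline: the toy environment (deciding whether a point lies above the line y = x), every function name, and the hyperparameter values are hypothetical, and a simple perceptron stands in for the ML/DL process.

```python
import random

# Environment (hypothetical toy task): the label is 1 when y > x, else 0.
def sample_environment(n, seed):
    rng = random.Random(seed)
    points = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    return [(p, 1 if p[1] > p[0] else 0) for p in points]

# Representation: convert a raw observation into a feature vector.
def represent(point):
    x, y = point
    return [x, y, 1.0]  # the trailing 1.0 acts as a bias input

def predict_from_features(weights, features):
    return 1 if sum(w * f for w, f in zip(weights, features)) > 0 else 0

# ML process: perceptron training; weights change only when a prediction is wrong.
# learning_rate and epochs are the hyperparameters (values chosen arbitrarily).
def train(dataset, learning_rate=0.1, epochs=50):
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for point, label in dataset:
            features = represent(point)
            error = label - predict_from_features(weights, features)
            weights = [w + learning_rate * error * f
                       for w, f in zip(weights, features)]
    return weights

# Prediction on a new input.
def predict(weights, point):
    return predict_from_features(weights, represent(point))

# Test: share of held-out examples that are classified correctly.
def accuracy(weights, dataset):
    hits = sum(predict(weights, p) == label for p, label in dataset)
    return hits / len(dataset)

train_set = sample_environment(200, seed=0)
test_set = sample_environment(100, seed=1)
weights = train(train_set)
test_accuracy = accuracy(weights, test_set)
print(f"held-out accuracy: {test_accuracy:.2f}")
```

Note that the choice of environment and representation fixes what the training step can ever learn, which is why the first step deserves the most care.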
Because the first step defines the rest of the pipeline, selecting the right environment in the most representative manner is crucial. Specifically, for AFIRI, the first thing to do is to frame the fundamental investment process explicitly and identify its distinct characteristics.
In our experience, this step should be done very carefully and thoughtfully. Contrary to the general notion that larger datasets give better results, the right configuration at this step significantly reduces the time needed to fine-tune the hyperparameters, which is a requisite for developing a robust neural network. Data should not be expanded merely on the bet that it may contain a hidden causality. A ‘the larger, the better’ approach ends up as an uncontrollable data lake.
As touched on above, the challenge of artificial intellectualization combines two challenges: the cognitive functions of artificial intelligence, and the human intelligence that serves as its role model. The road from legacy fundamental investment research toward AFIRI is also a journey to rediscover the details of the current process of value-oriented investment decisions.
As such, the targeted decision-making model needs to be understood and analyzed carefully so that it gives the AI’s intellectualization approach a solid and consistent framework. The next step is to understand the gap between the decision process of fundamental investment research and an ML/DNN task. We will compare and contrast two steps of the decision-making process that are similar but distinct.
Difficult Judgment and Complex Decision
The two steps that must be understood are difficult judgment and complex decision. At first glance, there seems to be no obvious difference. For AFIRI, the question to be answered becomes:
“Is fundamental investment research a difficult judgment or a complex decision?”
Describing actual AI cases is the best way to convey what difficult judgment and complex decision each mean.
First, consider the tasks of image recognition and AlphaGo.
For the image recognition task, the difficulty comes from building a pipeline whose parameters represent the original data effectively enough to train the network. The two key developments were 1) the convolutional neural network architecture and 2) other components of the network structure, including activation functions, regularization, and residual learning. Two-dimensional image data is expressed digitally as a vector of pixel data.
Why was a convolutional neural network required instead of a plain vanilla deep neural network? Simply because the network would become too big to handle. A 1920 × 1080 HD image has just over 2 million pixels, so the first layer of a plain vanilla DNN would need over 100,000 neurons (assuming a 1/20 reduction from 2 million to 100,000 in the first layer), followed by many more layers of neurons (although generally with a declining number of neurons). Training such a network on millions of pictures becomes practically impossible without unrealistically large computing capacity.
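A quick back-of-the-envelope calculation makes the size problem concrete. The sketch below counts first-layer weights only. The grayscale input, the 3 × 3 kernel size, and the 64 filters are assumptions chosen for illustration; the 100,000-neuron figure comes from the paragraph above.

```python
# First-layer weight counts for a single 1920 x 1080 frame.
width, height = 1920, 1080
pixels = width * height              # one value per pixel (grayscale assumed)

# Fully connected layer: each of the 100,000 neurons connects to every pixel.
dense_neurons = 100_000
dense_weights = pixels * dense_neurons

# Convolutional layer: a small filter is shared across all image positions,
# so the weight count does not depend on the image size.
kernel_size = 3                      # assumed 3 x 3 filters
num_filters = 64                     # assumed number of filters
conv_weights = kernel_size * kernel_size * num_filters

print(f"pixels:        {pixels:,}")         # 2,073,600
print(f"dense weights: {dense_weights:,}")  # 207,360,000,000
print(f"conv weights:  {conv_weights:,}")   # 576
```

Weight sharing is what removes the dependence on image size: the dense layer needs roughly 2 × 10^11 weights before any deeper layer is added, while the convolutional layer needs a few hundred.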
For a board game such as Go, the difficulty is that the number of possible games is exponentially large: there are roughly b^d possible sequences of moves, where b is the game’s breadth (number of legal moves per position) and d is its depth (game length). For Go, b ≈ 250 and d ≈ 150, giving about 250^150 possible sequences. Even the largest and most advanced computer cannot deal with such a huge tree of moves.
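To get a feel for the scale, the estimate b^d can be evaluated directly, since Python integers have arbitrary precision. The values of b and d are the approximate figures quoted above; the variable names are ours.

```python
import math

breadth = 250   # approximate number of legal moves per position in Go
depth = 150     # approximate game length in moves

sequences = breadth ** depth                    # exact integer value of b^d
digits = len(str(sequences))                    # number of decimal digits
log10_sequences = depth * math.log10(breadth)   # same magnitude via logarithms

print(f"250^150 has {digits} decimal digits")   # 360
print(f"log10(250^150) = {log10_sequences:.2f}")
```

A number with 360 digits is far beyond any exhaustive enumeration, which is why the search has to be pruned rather than completed.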
The solution is to train the model to determine the best possible move without evaluating every possible position. When the first article was published in 2016, AlphaGo overcame the difficulty with a hybrid approach that combined Monte Carlo Tree Search with trained neural networks acting as an intuitive predictive engine for future legal positions. It also used convolutional layers, to which the board position was passed as a 19 × 19 image to construct a representation of the position, reducing the effective depth and breadth of the search tree.
In 2017 AlphaGo evolved into AlphaGo Zero, a version trained without human game records, which was further developed into a generalized version for board games: AlphaZero. The network starts from random play and is trained by reinforcement learning through self-play, with knowledge of the game rules but without human domain knowledge.
When training a network, a larger training dataset generally leads to better results, with lower variance and bias. For image recognition, the training data cannot always be generated by the training algorithm itself, so the variety and volume of data can become a constraint. However, one of the special features of a board game is its simple and static rules in a closed and controlled environment. It is the perfect task for a training algorithm to run unlimited numbers of self-play simulations and generate its own data for reinforcement learning. In other words, reinforcement learning without human-labeled data has proven its great potential through AlphaZero (AZ). However, this was not accomplished without the special and unique conditions of board games. The real world is so inconvenient and complex that it does not allow simulations with true results for reinforcement learning.
How should such tasks of difficult judgment be defined?
As seen in the two tasks, success came from cracking the key bottleneck of extremely large processing volume, in addition to the benefits of greater computing capability. AlphaGo used the hybrid network architecture as well as convolutional layers. Image processing used several new architectural approaches, including ReLU activation functions, dropout, shortcut connections, and, most importantly, convolutional layers. These were not general techniques for handling large volumes of data, however. New approaches were developed specifically to suit the type of dataset in the pipeline. Difficult judgment arose when the process required a massive volume of processing, and it was solved by an original approach to designing the pipeline.
Another key feature common to these two cases is that the process ingested all the variables with a possible causal link to the result. The process could be isolated and was complete without any other data. Success required neither finding and representing missing relevant variables nor adding them to the dataset.
In other words, difficult judgment works on a vector of complete relevant variables that have been identified with high accuracy. Executing the judgment was difficult, but ingesting the complete set of variables was not.
The third feature is that the dataset for difficult judgment was homogeneous. Go was represented by the locations of black and white stones. Image recognition used pixel data as its representation. Of the three Vs of big data (volume, velocity, and variety), the datasets for difficult judgment had two, lacking only variety. Neither case needed a new technique to deal with a heterogeneous dataset.
How, then, should difficult judgment be defined for artificial intelligence? We think it is judgment with the three attributes mentioned above: 1) extremely large processing volume, 2) complete representation, and 3) a homogeneous dataset.
While 2) and 3) are given, 1) becomes the main challenge of difficult judgment. The challenge of large processing volume has been conquered by the impressive development of new pipelines and techniques running on new hardware and semiconductors.
These developments came not only from advances in purely mathematical approaches but also from engineered pipelines and new functions that were initially inspired by the cognitive and logical reasoning processes of human beings. We will touch on this later. Taking advantage of the uniquely closed and controlled conditions was another driver. As such, understanding both the approaches and the conditions that solved difficult judgment can indicate which types and directions of new ideas are promising for the future.
Complex decisions are different from difficult judgments. First of all, a judgment is followed by a decision; that is the fundamental sequence of the whole decision process. It is also important to recognize that the judgment is only a subset of the input required to complete the decision process.
There are several essential characteristics of a complex decision in ML/DNN, which include the following elements:
- Incomplete Representation
For complex decisions, there are two types of incompleteness. Type 1 relates to the lack of explicit breadth in the input data. Recall that the second feature of difficult judgment is complete representation, the antithesis of this incompleteness. Because a complex decision relies on aggregating dots of fragmented information and views, the input can rarely be defined explicitly, whereas complete input data would cover all the information and views that have any causal link to the final decision.
For tasks that require complex decisions in the real world, however, it is impossible to gather data explicit and complete enough to cover everything, and to process it ideally so that it can train an ML/DNN model.
Type 2 relates to unstructured, descriptive information that is difficult to convert into a dataset and has limited consistency and availability. Think about the impression formed in an interview with a person. If the interviewer reports it using a template of scores in stereotypical categories, how accurately and effectively can that stylized format capture all the relevant nuances of the impression for the best decision-making? Is there a proven system to measure its effectiveness and improve it continuously?
Some might ask whether unstructured text-mining techniques, such as sentiment analysis or topic models like LDA, could be applied to type 2 to perform the conversion. They do convert analog, descriptive information, and they have already produced successful cases of unstructured data processing by AI. However, those technologies have a limited scope of application. Because they are intended to classify large volumes of unstructured data or find commonalities in it, significant information that appears infrequently is hard to identify as such. They are especially good at finding what is popular in big data. Human professionals with domain insight, by contrast, can very well identify a small but significant logical clue in a small piece of unstructured data within the big data lake, presumably by connecting multiple independent cognitive networks.
- Requirement for a Body of Highly Intellectual Knowledge
A complex decision deals with uncertainty. To lessen unfavorable uncertainty and improve the quality of decisions, humans engage in an activity called education. We build a system of intellectual knowledge by learning multiple subjects. In investment management, for instance, a certified professional is required to acquire the body of knowledge (BoK) of investment management and research, covering economics, accounting, financial management, derivatives, and corporate governance. The owner of the BoK must integrate these subjects and apply them to form predictive views.
The BoK is also shaped by professional experience; it is not a static but a continuously evolving system. Although academics increasingly question the actual cognitive ability of humans to learn from experience and gain expert insight, a complex decision by nature advances its network through a kind of intellectual reinforcement learning, namely professional work experience.
What does this mean from the perspective of neural network structure? There are a couple of possible impacts on the network. The first is the existence of multiple neural networks that are combined and work together, with each subject processed by a specific network. Because the result requires integrating multiple subjects, each with a different and independent network, the neural network should take the form of an ensemble learning structure. This creates a new challenge: the processing pipeline becomes complicated and difficult to set up and compute.
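The ensemble idea can be sketched as follows. This is a deliberately simplified illustration, not AFIRI’s architecture: the three sub-models are hand-written stand-ins for trained networks, and all function names, input fields, scoring rules, and weights are hypothetical.

```python
# Hypothetical subject-specific sub-models. Each maps raw inputs to a score
# in [0, 1]; in a real system each would be a separately trained network.
def clamp(value):
    return min(1.0, max(0.0, value))

def accounting_model(inputs):
    return clamp(inputs["operating_margin"] * 2)

def industry_model(inputs):
    return clamp(0.5 + inputs["market_growth"] * 5)

def governance_model(inputs):
    return clamp(inputs["board_independence"])

# Ensemble step: combine the independent sub-model outputs by weighted average.
def ensemble(inputs, models, weights):
    total_weight = sum(weights.values())
    weighted = sum(weights[name] * model(inputs) for name, model in models.items())
    return weighted / total_weight

models = {
    "accounting": accounting_model,
    "industry": industry_model,
    "governance": governance_model,
}
weights = {"accounting": 0.5, "industry": 0.3, "governance": 0.2}  # assumed
company = {"operating_margin": 0.2, "market_growth": 0.04, "board_independence": 0.6}

score = ensemble(company, models, weights)
print(f"combined score: {score:.2f}")
```

Even in this toy form the added complexity is visible: every sub-model needs its own inputs and training, and the combining weights become one more set of parameters to justify.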
The network also needs a capability for self-improvement so that it can learn from experience, reflecting the dynamic nature of the BoK. There is growing interest in new architectural approaches, such as progressive and sequential machine learning, to improve neural network training.
- Uncertainty
Uncertainty is what a complex decision must handle. In the investment industry, investors know that risk is not the same as uncertainty. Risk is measurable; price volatility is called risk. The major market crashes of the past were triggered by a rise in fear of uncertainty. Nassim Nicholas Taleb developed the theory of black swan events, but we propose the ‘Haunted House Theory’ to separate uncertainty from risk. In a haunted house, you are blind and unable to measure or predict what will happen, even using all available information and knowledge. Uncertainty is the state in which one faces events with unmeasurable outcomes, the antithesis of the deterministic viewpoint.
A complex decision is forced to choose an action when it faces an uncertain event of an unmeasurable nature, without a logical causality between incident and result. For prediction using artificial intelligence, this is a fundamental challenge. By analogy, it is like walking in total darkness and having to decide which way to go where the floor may disappear. The input data is small in volume, low in reliability, and subject to a large margin of error. In the end, that is why large market crashes occur.
Back to the Original Question
“Is fundamental investment research a difficult judgment or a complex decision?”
It is time to review the actual fundamental research activity leading up to the investment decision, a much easier task if you are an experienced bottom-up analyst or portfolio manager with an investment, not trading, time horizon.
While the final decision depends on the valuation and the reliability of the forecast, the investment decision is neither science nor art. Making financial forecasts alone depends on many fragmented elements, including qualitative/subjective information, quantitative data, and the framework of the financial model. The qualitative information is descriptive and is formed by combining insights from various activities, including meetings with company executives and evaluations of the competitiveness and strategic risks of individual products/services with reference to their financial results. Each investment case requires a different mix of information and data.
As such, for investment judgment, it is virtually impossible to define a complete data representation. Even if someone could assemble a complete dataset sometime in the future, it would need significant cleansing and continual updating with new information/data. I believe this will not happen, because the cost outweighs the benefits. Investment research therefore does not meet the condition of complete representation and exhibits type 1 incompleteness.
For type 2, think of a fundamental investment case presentation to the team. The objective of the presentation is to articulate and share the recommended action (or rating), valuation, and risk assessment, and to get approval from the team or the portfolio manager. However, the real input to the investment decision process is not only the figures but also, more importantly, the structured logic that supports the certainty of those figures.
This information is descriptive, unstructured, and combined to form the logic. It carries many subtle nuances. Replacing it with scores in stereotypical, standardized categories is an approximation that may nullify the distinctiveness of fundamental investment research and lose the invaluable essence of the descriptive information that substantiates intrinsic value. There is a conflict between being explicit and being nuance-filled, and data representation is unreliable and inaccurate for the latter. As such, incomplete representation of data is an inevitable and major challenge for advanced fundamental investors who must think and act explicitly when utilizing cutting-edge information technology.
Note that LDA has been applied to the analysis of unstructured data from investment articles and market news. In my view, this is a new application that enhances ATI, not AFIRI. It is ATI because machine learning analyzes market herding activity, and the trained algorithm then tells what is influencing the market trend and predicts the move. This has nothing to do with the analysis of intrinsic value. Data from the herd is processed through ML to develop insight into the market’s beauty-contest voting, not to build an AI perspective on fundamental value.
The BoK of investment research consists of many categories. It requires both knowledge of and working experience with the tools and analytical frameworks of investment research, in addition to in-depth, up-to-date insight into specific industries and the business and financial models of companies. Continuous learning is required to keep up with the dynamic landscape of industries and companies as well as with analytical techniques.
For uncertainty, investment decisions are not only made under a wide variety of uncertainty. More importantly, investors evaluate and investigate uncertainty (through a certain brain network) and use the results as the input vector of another decision network. Uncertainty is not just a condition of the decision but a target of its analysis. For substantial uncertainty, as in a major market crash, evaluation becomes very difficult due to a lack of relevant information and data.
All in all, it is clear that fundamental investment research is a complex decision. This has a profound implication for developing DNN/ML as AFIRI. While difficult judgment can be, and has been, solved by new AI that adopted new network architectures and functions, a complex decision cannot be made easier by the same approach. Different approaches and development expertise are required.
Is it an impossible target to develop AFIRI? If not, what needs to be done to ease or overcome the challenge?
To search for clues, we can draw invaluable inspiration from the history of AI.
Catalysts for Success: Cognitive and Logical Process
David Hubel and Torsten Wiesel, who received the 1981 Nobel Prize in Physiology or Medicine for their discoveries concerning information processing in the visual system, made the key findings that led to solving the difficult image recognition task in AI.
The first finding was that the visual cortex has many small local receptive fields, which work independently to gather information, and the whole image is recognized as the aggregate of these small fields. The second finding was that visual neurons respond to specific line orientations: some react to one particular orientation, and others to another. In addition, visual neurons have different functions and work together to recognize complex patterns.
Together, those discoveries led in the 1980s to the neocognitron, a hierarchical, multilayered artificial neural network that was used for handwritten character recognition. The model needed further time for development. In 1998, the LeNet-5 architecture, a much more polished convolutional neural network, was introduced. Specifically, the use of convolutional layers, as well as feature maps from vertical and horizontal filters in the LeNet-5 architecture, was clearly influenced by the findings on the visual neuron mechanism.
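The link between orientation-selective neurons and convolutional filters can be shown with a tiny example. The sketch below is not LeNet-5: it applies two hand-written 3 × 3 filters (assumed values, similar to classic edge detectors) to a 5 × 5 toy image that contains a single vertical edge. In a real CNN the filter values are learned rather than fixed.

```python
def correlate2d(image, kernel):
    """Slide the kernel over the image (no padding) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# Toy image: dark on the left, bright on the right, i.e. one vertical edge.
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Each filter responds to one line orientation, like an orientation-tuned neuron.
vertical_filter = [[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]]
horizontal_filter = [[-1, -1, -1],
                     [0, 0, 0],
                     [1, 1, 1]]

vertical_map = correlate2d(image, vertical_filter)
horizontal_map = correlate2d(image, horizontal_filter)

print(vertical_map)    # [[3, 3, 0], [3, 3, 0], [3, 3, 0]]
print(horizontal_map)  # [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
```

The vertical filter fires where its small window covers the edge and stays silent elsewhere, while the horizontal filter never fires: each feature map reports one orientation within small local fields.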
Today’s CNNs have their origin in LeNet-5. It is no exaggeration to say that image recognition AI initially developed from analysis inspired by the human cognitive process, starting from the visual cortex as a sensing device.
The original AlphaGo, which uses CNNs, was built to a lesser extent on discoveries about the human logical process. It used two independent networks to build the pipeline. The first is the policy network, which intelligently selects the next moves. The second is the value network, which evaluates the whole board position precisely. This dual-process architecture is similar to the logical architecture that human players use to play Go.
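The division of labor between the two networks can be illustrated with a toy example. This is not AlphaGo: the “position” is just an integer, both functions are hand-written stubs rather than trained networks, and the search looks only one move ahead instead of running Monte Carlo Tree Search. All names and rules are hypothetical.

```python
def policy(position, legal_moves):
    """Stub policy: a prior probability per move (here it favors larger steps)."""
    total = sum(legal_moves)
    return {move: move / total for move in legal_moves}

def value(position, target=10):
    """Stub value: 1.0 at the target position, falling off with distance."""
    return 1.0 / (1.0 + abs(target - position))

def choose_move(position, legal_moves, top_k=2):
    priors = policy(position, legal_moves)
    # Policy step: keep only the top_k most promising moves (reduces breadth).
    candidates = sorted(legal_moves, key=lambda m: priors[m], reverse=True)[:top_k]
    # Value step: score each resulting position directly instead of
    # searching deeper (reduces depth).
    return max(candidates, key=lambda m: value(position + m))

best = choose_move(position=7, legal_moves=[1, 2, 3, 4, 5])
print(best)  # 4: the policy kept moves 5 and 4, and the value stub prefers 4
```

The example also shows the price of pruning. Move 3 would land exactly on the target, but the stub policy discarded it before the value function could see it, so the quality of the policy bounds the quality of the final choice.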
Source: DeepMind Technologies Limited
Analysis of the Logical & Cognitive Process of Human
LeNet-5 took inspiration from human visual neurons and applied it to DNNs to solve the problem of “difficult judgment” successfully. The original AlphaGo took a structure similar to a human logical process.
Although these two tasks deal with difficult judgment, the same principle is likely to help solve complex decision problems. This is because the human logical process has developed through evolution to process tasks efficiently under the constraints of the human brain, which has tens of billions of neurons, still more than AI systems have.
A neural network is not an artificial replica of the human brain. That is, the neurons of the human brain do not work in the same way as the neurons of a DNN. We do not yet know precisely what a human brain neuron does.
However, learning from and being inspired by what humans do is a very effective approach that proved successful in solving difficult judgment, one of the major problems of AI. It is therefore reasonable to assume that the same approach should also work for complex decisions. This should be even more so because complex decisions pose many challenges not only to AI but also to human decision-makers. Compared with tasks such as image or voice recognition, complex decisions, even those made by top experts, have much lower accuracy. Moreover, top experts do not all make similar decisions. To one expert the situation looks green, while to another, who has a different objective, it looks red. In aggregate, they form the market itself. Since the general objective of ML/DNN is to achieve human-level accuracy, given these features of complex decisions, development should be based deeply on analysis of the logical process of human experts.
The implication for fundamental investment research is clear. BOTH fundamental investment managers and data scientists need to work together to architect and develop AFIRI. AlphaGo’s approach does not scale to complex decisions, and thus innovations in the elements of the ML/DNN computational framework alone do not mitigate the severe challenge.
There are ways to disentangle the fundamental investment research process in a manner that best fits ML and/or DNN, and they require skill in abstracting fundamental research. Professionals on both sides, the fundamental investment and programming communities, need to think strategically and acquire the missing expertise from the other side to grow the algorithm so that it can advance fundamental investment.
However, this may not be what investment managers are good at, since the bottom-up research process works in the opposite manner, through differentiated individualization, to discover investment opportunities. Every element of a qualitative investment decision is unique, and abstraction does not work when it is applied to obtain an agreeable average (consensus). The challenge for modernizing fundamental investment is to migrate from being individually descriptive to being globally explicit.
It is also a challenge for data scientists with a computer science background. Using big data and developing new algorithms to predict price movement can be done under certain conditions, but as discussed, a complex decision poses a new challenge. As stated earlier, value orientation should be defined in the process so that it is immune to price orientation and the AI works as a purely fundamental-focused process, never as a quantitative tool with various market forecasting techniques. Domain expertise is essential.
As such, the dialogue to merge the two fields of expertise may not be simple, but in our experience it is sure to be fruitful in the end. As with other innovations that require rewriting the recipe, the major hurdle is the silo mentality, which tends to be dominant in business models optimized for large investment operations.
The growth of a new breed of professionals with both in-depth investment research domain knowledge and technology expertise can advance the fundamental investment business. To close this preface, we note that we continue to research, develop, and present our capabilities to partners for the future of next-generation active fundamental investment.