A MARKOV MODEL FOR INVESTIGATING THE STOCK MARKET VOLUME BEHAVIOR

In recent decades, stock market prediction has become an active research area because of its importance not only for every profit-making industry but also for shareholders and investors who need to make confident decisions about investing in the stock market. This paper provides a discrete-time stochastic model, a Markov model, for analyzing the behavior of stock market volume. The proposed model is validated in terms of its assumptions before it is used to predict stock market behavior. As an illustration, the stock market behaviors of the top ten largest global banks are discussed through their steady-state distributions and expected numbers of transitions. For each bank, a secondary dataset of 505 daily volumes covering the two years from 1 January 2014 to 31 December 2015 is used.


INTRODUCTION
The stock market is a marketplace that allows everyone to participate in both national and international economies through investment. The prediction of stock market movement has become an active research area because of its importance not only for every profit-making industry but also for shareholders and investors who need to make confident investment decisions [1]. Because prediction from publicly available past data is challenging, one question has persisted for decades: how can past information be used to give a meaningful prediction of the future behavior of the stock market? Initially, the answer came from technical and fundamental analysis, both of which assume that past information is rich enough to predict future behavior [2]. A statistician will recognize that such techniques rest on the assumption that successive changes are dependent. Technical analysis is the study of the historical price movement of securities, using charts of price patterns and price data in various calculations to predict future price trends (Turner 2007). Fundamental analysis, on the other hand, follows the traditional approach of studying an industry's fundamentals, such as profit, expenses, growth rates, and company performance (Murphy 1999).
In the opposite direction, whether past information is incorporated quickly enough to predict the future behavior of the stock market is a moot point. The Efficient Market Hypothesis (EMH) concerns the degree of association between stock prices and all available information in a timely manner [3]. The pioneer of this concept is Fama (1960, cited in [4]), who published the most comprehensive empirical study on Dow Jones Industrial stocks over the period 1956 to 1961. According to the degree to which stock prices reflect the information set, the EMH may be classified into forms, one of which is weak-form efficiency: security prices fully reflect past information, and future prices cannot be predicted from the past. In this situation, the EMH is commonly known as a "no memory market" and is also described as a "random walk". Random walk theory says that past information cannot be used to predict future behavior; that is, successive price changes are independent and random, and are not influenced by past price movement. Many empirical studies have investigated random walk behavior in stock prices using statistical tools such as autocorrelation and the runs test on successive stock price changes of the same sign. The results of these studies show that a considerable number of markets follow a random walk; see, for example, Choi (1999), Mlambo and Biekpe (2007), Cooray and Wickremasinghe (2007), Halime and Sevinc (2015), and Singh, Leepsa and Kushwaha (2016) [5], [6], [7], [8], [9].
Because stock market behavior is random over a time frame, that is, the behavior becomes a random process over time, the principal goal of this paper is to propose a discrete-time stochastic model for predicting the long-term volume behavior of a stock market. We quantify the behavior by stock volume, based on the concept of low and high commitments; this is the main difference between our work and previous research [10], in which the stock price was used. The application of stochastic processes to the exploration of stock market behavior is not new. Various research papers have used stochastic models to predict and analyze stock market behavior at different times. Hassan and Baikunth Nath (2005) used a hidden Markov model to forecast stock prices for interrelated markets [11]. Doubleday and Esunge (2010) applied a discrete-time Markov chain model to the Dow Jones Industrial Average (DJIA) to study a diverse portfolio of stocks and the market as a whole [12]. Huang (2015) developed an absorbing Markov chain model to analyze the stock price variation of the Taiwanese company HTC [13]. In contrast to these studies, we use stock volume rather than share price.
The remainder of this paper is organized as follows. Section 2 discusses the structure of the stochastic model and describes the proposed model, including how we estimate the model parameters and evaluate the model assumptions. Section 3 presents the results for the ten banks; among them, four are in China, three are in America, and the remaining three are in Japan, France, and the United Kingdom [14]. Finally, Section 4 summarizes our study.

Understanding the structure of the stochastic model
A stochastic process $\{X_t, t \in T\}$ is a family of random variables. The index $t$ is time and $X_t$ is the state of the process at time $t$. The set $T$ is called the index set of the process; it may be a countable set or an interval of the real line. When $T = \{0, 1, 2, \ldots\}$, we define the transition probability from state $i$ to state $j$ as
$$P_{ij} = P(X_{n+1} = j \mid X_n = i, X_{n-1} = i_{n-1}, \ldots, X_0 = i_0) = P(X_{n+1} = j \mid X_n = i). \quad (1)$$
Equation (1) says that the conditional distribution of the future state $X_{n+1}$, given the past states $X_0, \ldots, X_{n-1}$ and the present state $X_n$, is independent of the past states and depends only on the present state. Such a stochastic process is known as a discrete-time Markov chain (Ross 2010) [15].
Thus, the one-step transition probabilities between the possible states can be depicted by a transition probability matrix $P = [P_{ij}]$. The probability of moving from state $i$ to state $j$ in $n$ steps is denoted by $P^n_{ij} = P(X_{n+k} = j \mid X_k = i)$. It is given explicitly by the Chapman-Kolmogorov equation (4):
$$P^{n+r}_{ij} = \sum_{k} P^n_{ik} P^r_{kj}, \quad n, r \ge 0, \quad (4)$$
where $P^n_{ik} P^r_{kj}$ represents the probability that the process, starting in state $i$, goes to state $j$ in $n + r$ transitions through a path that takes it into state $k$ at the $n$th transition.
Equation (4) then asserts that $P^{(n+r)} = P^{(n)} \cdot P^{(r)}$, where $P^{(n)}$ denotes the matrix of $n$-step transition probabilities and the dot represents matrix multiplication; by induction, $P^{(n)} = P^n$ for $n \ge 1$. If state $j$ is reachable from state $i$, denoted by $i \to j$, then there exists an integer $n \ge 0$ such that $P^n_{ij} > 0$. If $j$ is reachable from $i$ and $i$ is reachable from $j$, then states $i$ and $j$ are said to communicate, denoted by $i \leftrightarrow j$. When the Markov chain contains only one class, it is said to be irreducible; that is, all states communicate with each other. Let $f_i$ be the probability that the process, starting in state $i$, ever returns to state $i$. If $f_i = 1$, state $i$ is called recurrent; if, in addition, the expected time until the process returns to state $i$ is finite, it is called positive recurrent. The state is called transient if $f_i < 1$. State $i$ is said to have period $d$ if $P^n_{ii} = 0$ whenever $n$ is not divisible by $d$, and $d$ is the largest integer with this property. In particular, if $d = 1$ the state is called aperiodic. Positive recurrent and aperiodic states are called ergodic.
For an irreducible ergodic Markov chain, the limit $\lim_{n \to \infty} P^n_{ij}$ exists and is independent of $i$. Defining $\pi_j = \lim_{n \to \infty} P^n_{ij}$, the $\pi_j$ are the unique nonnegative solution of
$$\pi_j = \sum_{i} \pi_i P_{ij} \quad \text{and} \quad \sum_{j} \pi_j = 1.$$
Here $\pi_j$ is the long-run proportion of time that the Markov chain is in state $j$. If $N_j = \min\{n > 0 : X_n = j\}$ is the number of transitions until the chain, starting in state $j$, returns to that state, then the expected number of transitions $\mu_j = E[N_j \mid X_0 = j]$ is given by
$$\mu_j = \frac{1}{\pi_j}.$$
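The relations in this subsection can be checked numerically. The following Python sketch is illustrative only: the two-state matrix `P` and the helper names `mat_mul`, `n_step` and `stationary` are assumptions made for this example and are not taken from the study. It computes $n$-step transition matrices by repeated multiplication, approximates the stationary distribution by iterating $\pi \leftarrow \pi P$ (which converges for an irreducible ergodic chain), and takes reciprocals to obtain the expected return times.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def n_step(p, n):
    """Return the n-step transition matrix P^n for n >= 1."""
    result = p
    for _ in range(n - 1):
        result = mat_mul(result, p)
    return result


def stationary(p, tol=1e-12, max_iter=100000):
    """Approximate pi satisfying pi = pi P and sum(pi) = 1 by iteration."""
    m = len(p)
    pi = [1.0 / m] * m  # start from the uniform distribution
    for _ in range(max_iter):
        nxt = [sum(pi[i] * p[i][j] for i in range(m)) for j in range(m)]
        if max(abs(x - y) for x, y in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


# Hypothetical two-state chain, not estimated from the paper's data.
P = [[0.5, 0.5],
     [0.25, 0.75]]

P5 = n_step(P, 5)                     # five-step transition probabilities
pi = stationary(P)                    # long-run proportions of time
mean_return = [1.0 / x for x in pi]   # expected number of transitions to return
```

For this hypothetical chain the iteration settles near $\pi = (1/3, 2/3)$, so the expected return times are close to 3 and 1.5 transitions.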

Definition and description of the proposed stochastic model
As stated in the objective of this study, the Markov chain model is applied to predict stock market volumes. Secondary data on the end-of-day volumes of the top ten largest global banks were gathered from the Historical Prices section of Yahoo Finance for 505 days over the two years from 1 January 2014 to 31 December 2015 [16]. Let $V_n$ be the volume on the $n$th day and $D_n$ the change of the volume from the previous day on the $n$th day, so that $D_n = V_n - V_{n-1}$.
The probability model that describes the evolution of stock market behavior, which changes randomly from day to day, is then defined as follows:
$$X_n = \begin{cases} 1, & D_n > 2w \\ 2, & w < D_n < 2w \\ 3, & 0 < D_n < w \\ 4, & -w < D_n < 0 \\ 5, & -2w < D_n < -w \\ 6, & D_n < -2w \end{cases}$$
where the threshold value $w$ is the average of the absolute daily changes of volume, $|D_n|$. Changes up to the average change of volume are therefore treated as regular changes rather than striking ones, and the bin length is taken as $w$. Changes in the second class of loss or increment, between $w$ and $2w$, are regarded as considerably significant. By these definitions, a change with $|D_n| > 2w$ is considered a striking one. The purpose of this construction is to maintain homogeneity across the banks' rates of change: because the daily rate of change may differ from bank to bank, scaling by each bank's own threshold lets us compare the banks' behaviors without bias. So $X_n$ is a stochastic process that takes a value from 1 to 6 on the $n$th day. That is, $\{X_n\}$ is a discrete-time stochastic process taking finitely many nonnegative integer values, with state space $S = \{1, 2, 3, 4, 5, 6\}$. Our focus is to find a probability model for the sequence of successive values $X_0, X_1, X_2, \ldots$, and we assume that the distribution of $X_{n+1}$ depends on the history only through $X_n$. Thus, the transition probability from state $i$ to state $j$ is $P_{ij} = P(X_{n+1} = j \mid X_n = i)$, $i, j \in S$.
The states 1 to 6 are named as follows:
State 1 (high increment): today's volume increase over the previous day's volume is greater than $2w$.
State 2 (moderate increment): today's volume increase over the previous day's volume is greater than $w$ and less than $2w$.
State 3 (small increment): today's volume increase over the previous day's volume is less than $w$.
State 4 (small loss): today's volume decrease from the previous day's volume is less than $w$.
State 5 (moderate loss): today's volume decrease from the previous day's volume is greater than $w$ and less than $2w$.
State 6 (high loss): today's volume decrease from the previous day's volume is greater than $2w$.
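The state definitions can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function names and the sample `volumes` are hypothetical, and the handling of boundary values (a change of exactly 0, $w$ or $2w$ in magnitude), which the definitions leave open, is an assumption stated in the comments.

```python
def daily_changes(volumes):
    """Change of volume from the previous day, D_n = V_n - V_(n-1)."""
    return [b - a for a, b in zip(volumes, volumes[1:])]


def threshold(volumes):
    """Threshold w: the average of the absolute daily changes of volume."""
    changes = daily_changes(volumes)
    return sum(abs(d) for d in changes) / len(changes)


def classify_change(d, w):
    """Map a daily change d to a state 1..6 using bins of width w.

    Assumption: a change of exactly w or 2w in magnitude falls in the
    smaller-magnitude bin, and a zero change counts as a small increment.
    """
    if d > 2 * w:
        return 1  # high increment
    if d > w:
        return 2  # moderate increment
    if d >= 0:
        return 3  # small increment
    if d >= -w:
        return 4  # small loss
    if d >= -2 * w:
        return 5  # moderate loss
    return 6      # high loss


def states_from_volumes(volumes):
    """Convert a series of daily volumes into a sequence of states."""
    w = threshold(volumes)
    return [classify_change(d, w) for d in daily_changes(volumes)]


# Hypothetical volumes, chosen only to exercise several states.
volumes = [100, 110, 105, 140, 90, 95, 60, 130]
states = states_from_volumes(volumes)
```

Because the threshold is computed from each series separately, the same code applied to two banks with very different trading volumes yields state sequences on a comparable scale.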

Estimation and verification of the model
Transition probabilities are estimated by the maximum likelihood criterion formulated by Anderson and Goodman (1957), assuming a multinomial distribution with probabilities $P_{ij}$ and outcomes $n_{ij}$:
$$\hat{P}_{ij} = \frac{n_{ij}}{\sum_{j} n_{ij}},$$
where $n_{ij}$ is the number of observed transitions from state $i$ to state $j$.
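As a minimal sketch of this estimator, assuming the observations are already coded as a sequence of states 1 to 6, the counts $n_{ij}$ and the row-normalized estimates can be computed as below. The function names and the short `states` sequence are hypothetical and serve only as an illustration.

```python
def transition_counts(states, m=6):
    """Count observed transitions n_ij between consecutive states (1-based)."""
    counts = [[0] * m for _ in range(m)]
    for i, j in zip(states, states[1:]):
        counts[i - 1][j - 1] += 1
    return counts


def mle_transition_matrix(counts):
    """Row-normalize the counts: P_ij estimate = n_ij / sum_j n_ij."""
    matrix = []
    for row in counts:
        total = sum(row)
        # Assumption: a state with no observed departures keeps a zero row.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Hypothetical state sequence, not taken from the paper's data.
states = [3, 4, 3, 3, 4, 1, 6, 4, 3, 4]
counts = transition_counts(states)
P_hat = mle_transition_matrix(counts)
```

Each row of `P_hat` that corresponds to an observed state sums to one, as a transition probability matrix requires.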
To verify the Markov property for our data set, namely that the chain is of first order and that the transition probabilities are homogeneous, the likelihood ratio tests proposed by Anderson and Goodman (1957) were used [17].
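1) The successive events form a first-order Markov chain: that is, the probability of an increment or loss of volume does not depend on the earlier history, only on the previous day's volume. To test this, the hypotheses can be defined as $H_0: P_{ijk} = P_{jk}$ for all $i, j, k = 1, \ldots, m$ versus $H_1: P_{ijk} \ne P_{jk}$ for some $i, j, k$, where $P_{ijk}$ is the transition probability of the process being in state $i$ at time $t-2$, in state $j$ at time $t-1$ and in state $k$ at time $t$. For this test, the likelihood ratio criterion is given as
$$-2\ln\lambda = 2\sum_{i=1}^{m}\sum_{j=1}^{m} n_{ij} \ln\left(\frac{\hat{P}_{ij}}{\hat{P}_{j}}\right),$$
where $n_{ij}$ is the total number of observed transitions from state $i$ to state $j$, $m$ is the total number of states and $\hat{P}_{j} = \sum_{i} n_{ij} / \sum_{i}\sum_{j} n_{ij}$ is the marginal probability for the $j$th column. The value $-2\ln\lambda$ is asymptotically distributed as $\chi^2$ with $(m-1)^2$ degrees of freedom.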
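2) Stationarity assumption: the estimated transition probabilities are constant throughout the study period. To test this, the hypotheses can be defined as $H_0: P_{ij}(t) = P_{ij}$ versus $H_1: P_{ij}(t) \ne P_{ij}$, for $t = 1, \ldots, T$ and $i, j = 1, 2, 3, 4, 5, 6$. For this test, the likelihood ratio criterion is
$$-2\ln\lambda = 2\sum_{t=1}^{T}\sum_{i=1}^{m}\sum_{j=1}^{m} n_{ij}(t) \ln\left(\frac{\hat{P}_{ij}(t)}{\hat{P}_{ij}}\right),$$
where $m$ is the number of states, $T$ is the number of sub-intervals, $n_{ij}(t)$ is the frequency count for the transition from state $i$ to state $j$ in the $t$th sub-interval and $\hat{P}_{ij}(t)$ is the transition probability estimated from the $t$th sub-interval alone. The value $-2\ln\lambda$ is distributed as $\chi^2$ with $(T-1)(m(m-1))$ degrees of freedom. In addition, the classification and period of the states are investigated to confirm whether or not the chain is an irreducible and ergodic Markov chain.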
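The two likelihood ratio statistics can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function names and the small two-state count tables are hypothetical, the first statistic is computed in the marginal-probability form with $(m-1)^2$ degrees of freedom, and cells with zero counts are taken to contribute nothing to the sums.

```python
import math


def lr_first_order(counts):
    """Return 2 * sum n_ij * ln(P_ij / P_j) and its degrees of freedom."""
    m = len(counts)
    total = sum(sum(row) for row in counts)
    row_tot = [sum(row) for row in counts]
    col_tot = [sum(counts[i][j] for i in range(m)) for j in range(m)]
    stat = 0.0
    for i in range(m):
        for j in range(m):
            n_ij = counts[i][j]
            if n_ij > 0:  # assumption: empty cells add zero to the sum
                p_ij = n_ij / row_tot[i]
                p_j = col_tot[j] / total
                stat += 2.0 * n_ij * math.log(p_ij / p_j)
    return stat, (m - 1) ** 2


def lr_stationarity(counts_by_period):
    """Return 2 * sum n_ij(t) * ln(P_ij(t) / P_ij) and its degrees of freedom."""
    T = len(counts_by_period)
    m = len(counts_by_period[0])
    pooled = [[sum(c[i][j] for c in counts_by_period) for j in range(m)]
              for i in range(m)]
    pooled_row = [sum(row) for row in pooled]
    stat = 0.0
    for c in counts_by_period:
        for i in range(m):
            row_t = sum(c[i])
            for j in range(m):
                if c[i][j] > 0:  # assumption: empty cells add zero to the sum
                    p_t = c[i][j] / row_t
                    p_pooled = pooled[i][j] / pooled_row[i]
                    stat += 2.0 * c[i][j] * math.log(p_t / p_pooled)
    return stat, (T - 1) * m * (m - 1)


# Hypothetical two-state count tables, not taken from the paper's data.
dependent = [[18, 2], [2, 18]]        # next state strongly tied to current state
period_a = [[10, 10], [5, 15]]
period_b = [[12, 8], [4, 16]]

stat_order, df_order = lr_first_order(dependent)
stat_stat, df_stat = lr_stationarity([period_a, period_b])
```

A p-value would then come from the upper tail of the chi-square distribution with the returned degrees of freedom, for instance with `scipy.stats.chi2.sf(stat, df)` if SciPy is available.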

RESULTS AND DISCUSSION
In total, 505 end-of-day volumes were collected for each bank from 1 January 2014 to 31 December 2015. The comprehensive analysis of the stock market behavior of Bank of America (BOA) is given below. Table 1 shows the counts of transitions from each state to every other state, and Figure 1 illustrates the time series pattern of the states $X_n$. Both show that states 3 and 4 occurred far more often than the others.
For the stationarity test, the test statistic value is $-2\ln\lambda = 113.1453$ with p-value = 0.4220. We do not have enough statistical evidence to reject $H_0$ at the 5% significance level: the estimated transition probabilities in each sub-period are not significantly different from the pooled transition probabilities estimated from the entire two years of observations. That is, the estimated transition probabilities are constant throughout the study period. Thus we can conclude that the daily volume data satisfy the Markov property and the stationarity assumption. In addition, higher-order transition probabilities were calculated with the statistical software R to observe the behavior of the volume, and some steps are given in Table 7 to show the distribution.
The behavior of the increments and decrements of the volume will eventually reach a steady state in which around 7% of the daily changes are "high increments" (more than 47,493,124 additional shares), 9% are "moderate increments" (between 23,745,562 and 47,493,124 additional shares), 31% are "small increments" (fewer than 23,745,562 additional shares), 34% are "small losses" (fewer than 23,745,562 fewer shares), 12% are "moderate losses" (between 23,745,562 and 47,493,124 fewer shares), and 7% are "high losses" (more than 47,493,124 fewer shares) relative to the previous day.
The expected times of first return to "high increment", "moderate increment", "small increment", "small loss", "moderate loss", and "high loss" are approximately 15 [...]. Figure 2 shows explicitly that the volume behaviors of all banks have a greater chance, around 30%, of transitioning to the states of "small increment" or "small loss", around a 10% chance of transitioning to "moderate loss" or "moderate increment", and only around a 5 to 7% chance of transitioning to "high loss" or "high increment". Further, on average, the probabilities of losses are a little higher than those of increments in the volume behavior of all banks.
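As a rough arithmetic check, the relation between long-run proportions and expected return times can be applied to the rounded steady-state percentages quoted above for BOA. The sketch below is illustrative: the variable names are hypothetical, and because the inputs are rounded to whole percentages, the results only approximate values computed from the unrounded distribution.

```python
labels = ["high increment", "moderate increment", "small increment",
          "small loss", "moderate loss", "high loss"]

# Rounded steady-state proportions quoted in the text for BOA.
pi = [0.07, 0.09, 0.31, 0.34, 0.12, 0.07]

# Expected number of days until first return to each state: 1 / pi_j.
expected_return_days = {name: 1.0 / p for name, p in zip(labels, pi)}

# Long-run share of days with an increment versus a loss of volume.
increment_share = sum(pi[:3])
loss_share = sum(pi[3:])
```

With these rounded inputs, the return time to either extreme state is a little over 14 days, while the small-increment and small-loss states recur about every 3 days; the loss states together account for slightly more than half of the days.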

CONCLUSION
To sum up, exploring stock market behavior is challenging because of its unpredictable nature. We therefore cannot directly apply any technique, whether charting, fundamental analysis, or any statistical tool, to the investigation of stock market behavior without first examining the past information. In this research, we were able to apply the proposed Markov model, after validating its assumptions, to the volume behavior of eight of the ten largest global banks, all but two, and to draw precise conclusions. Overall, the long-run proportions of time indicate that, on a given day, there is a somewhat greater chance of transitioning to a state of loss than to a state of increment in each bank's stock volume. Focusing on individual states, a total share exchange large enough to reach high increment or high loss on a given day is very rare, and the two chances are nearly equal. The noticeable phenomenon is that, with high probability, the volume transitions to the state of small increment or small loss on a given day. This result would encourage investors who look for small share price movements, in the sense that the market will not be affected by a large loss or increment. Basically, the stock market behaviors of all the largest global banks mostly swing a little up or down.