ESTIMATION AND FILTERING FOR STOCHASTIC VOLATILITY MODEL
Name
Course
Professor
University
City
Date
Abstract
This study focuses on the basic concepts of stochastic modelling, and in particular on the filtering and estimation of the stochastic volatility model. The fundamental ideas of stochastic modelling are examined in depth, together with the statistical techniques suitable for filtering and estimating the stochastic volatility model. Definitions and examples of stochastic models, and their applications in financial services and markets, are also treated. Concepts are borrowed from time series analysis to support the study, since stochastic volatility models are commonly applied to model the changing variance of stock returns over time. There is also an exploration of the Markov chain process, covering its definition, related terminology, and the statistical criteria for carrying it out in an organized and systematic manner. Together, these topics make the estimation and filtering of stochastic volatility models more significant and easier to understand.
Keywords: Stochastic modeling, time series, stochastic volatility model, stock returns, Markov chain
Table of Contents
The Moving Average model (MA)
3.2 Definition and a general understanding of a stochastic process
Independent Increment Processes
The Stochastic Volatility Model
Introduction
The exploration of Markov chain Monte Carlo is also considered necessary in this project. In addition, different statistical techniques for filtering and estimating stochastic volatility models are explored, in both discrete and continuous time. Applications of stochastic volatility in financial markets and related fields are also examined. The R statistical software will be used for the analysis and implementation in this project (Giillet 2015). The dissertation is organized systematically into an Introduction, the body of the study, a summary and conclusion, Acknowledgements, References, and Appendices, all captured systematically in the table of contents.
The Introduction gives basic definitions and an understanding of the various terms used in the study of filtering and estimation of the stochastic volatility model. The body of the dissertation develops a full understanding of estimation and filtering techniques in stochastic volatility modelling from a broader perspective, in the form of diagrams and mathematical and statistical derivations with proofs where necessary (Beffetov 2018). The Acknowledgements section recognizes the supervisor of this project and the authors whose work was significant in carrying out the research on stochastic volatility modelling and time series. The dissertation closes with the relevant References and Appendices for the study of estimation and filtering of stochastic volatility models.
In this mathematical research project, the focus is mainly on general knowledge of stochastic processes, including the statistical methods and techniques used to filter and estimate the stochastic volatility model. Some basic material from time series analysis is also studied, since it supports the construction of stochastic volatility models. The theory of stochastic processes is a statistical discipline directly related to probability theory (Roy 2020). A stochastic process can therefore be defined simply as a mathematical or statistical object consisting of a family of random variables indexed by a set of numbers, usually interpreted as points in time. This interpretation in terms of time makes the stochastic process useful for representing the numerical values of a system that changes randomly over time.
The stochastic process provides a statistical description of phenomena such as the movement of gas molecules, the growth of a bacterial population, and the fluctuation of an electric current due to thermal noise. This study also considers some applications of the stochastic process, which include solving and understanding mathematical models that are random. Stochastic processes play a crucial role in many sciences, such as physics, chemistry, neuroscience, biology, and ecology. They can also be applied in engineering and technology-related fields, including computer science, telecommunications, image processing, information theory, and signal processing. The stochastic process is likewise relevant for modelling random changes, and their observations, in financial markets.
A stochastic process can also be interpreted as a random element of a function space, and hence can be referred to as a random function. The terms stochastic process and random function are often used interchangeably. The actual values of a stochastic process are not always numbers; they can also be vectors or other mathematical objects. Many important stochastic processes were inspired by observations of physical phenomena (Ross 1996). Examples include Brownian motion, which arose in physics and which we can model by the Wiener process, and the Poisson process, which Erlang applied to study the number of phone calls occurring within a given period. These two stochastic processes were developed independently and repeatedly, and they are central to the theoretical study of stochastic processes.
Many stochastic processes are grouped according to their mathematical properties. The various categories include Lévy processes, branching processes, random walks, Markov processes, martingales, renewal processes, and Gaussian processes. The Markov chain is one of the main stochastic processes of interest in this research. It can be defined simply as a stochastic model for a sequence of possible outcomes in which the probability of each outcome depends only on the state attained in the previous step. The Markov chain is a statistical model with many real-world applications (Beffetov 2018). It is used to study queues of customers at banks, supermarkets, and airports, and it can also be applied to animal population dynamics, cruise control systems in vehicles, and currency exchange rates. Markov chain Monte Carlo is a stochastic simulation method based on Markov chains, and it is usually applied to draw samples from probability distributions that are very complex.
The Markov chain Monte Carlo (MCMC) method consists of algorithms for sampling from a probability distribution. By constructing a Markov chain whose equilibrium distribution is the desired distribution, a sample from that distribution can be obtained by recording the states of the chain. Various algorithms can be used to construct such a Markov chain, including the Metropolis-Hastings algorithm. Taking more steps in the Markov chain produces samples that match the desired distribution more closely.
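As an illustration of these ideas, the following is a minimal sketch of the Metropolis-Hastings algorithm in Python; the standard normal target, the step size, and the sample counts are illustrative choices for this sketch, not prescriptions from the study.

```python
import math
import random

def metropolis_hastings(log_target, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings sampler (illustrative sketch)."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)      # symmetric random-walk proposal
        log_alpha = log_target(proposal) - log_target(x)
        if math.log(rng.random()) < log_alpha:   # accept with probability min(1, ratio)
            x = proposal
        samples.append(x)                        # record the current state of the chain
    return samples

# Target: standard normal density, known only up to a constant
log_normal = lambda x: -0.5 * x * x

chain = metropolis_hastings(log_normal, x0=0.0, n_samples=20000)
burned = chain[2000:]                            # discard burn-in
mean = sum(burned) / len(burned)
var = sum((v - mean) ** 2 for v in burned) / len(burned)
```

Because the proposal is symmetric, the acceptance ratio reduces to a ratio of target densities, and longer chains give samples that match the target more closely, as described above.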
Stochastic volatility models are fundamental in statistics. Stochastic volatility means that the variance of a stochastic process is itself randomly distributed. The concept is widely used in mathematical finance to evaluate derivative securities such as options, which give the buyer the right to purchase an underlying instrument or asset at a specified price before a specified date. In these models, the volatility of the underlying security is treated as a random process, governed by state variables that include the price of the underlying security.
The shortcomings of the Black-Scholes model motivate the stochastic volatility approach. Models of the Black-Scholes form assume that the volatility of the underlying asset is constant, and therefore that it cannot be affected by changes in the price of the underlying security. By instead treating volatility as a stochastic process, it becomes possible to come up with more accurate derivative models.
There are various methods and techniques for filtering and estimating the stochastic volatility model. One recent technique combines the Gaussian Kalman filter with an ellipsoidal filter model for discrete-time non-linear systems (Blamberg 2020). This technique produces estimates for discrete-time non-linear systems in circumstances where, for some components of the uncertainty, the probability distributions are well known, while for the other components only bounds are known, not the corresponding probability distributions. Following the same scheme as the Kalman filter, the algorithm is iterative. At each iteration, the state at the incoming moment of time is predicted (Blamberg 2020). The next step uses the measurement to correct the corresponding estimates. At each revision, a convex optimization problem is solved to find the optimal estimate of the state of the system. Testing this algorithm on various non-linear problems shows that it can outperform the extended Kalman filter.
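The filters described above build on the basic linear Kalman predict-correct recursion. As a hedged illustration of that recursion only, here is a minimal one-dimensional Kalman filter in Python, tracking a constant level observed in Gaussian noise; the noise variances q and r and the simulated data are assumptions for this sketch and are not part of the cited method.

```python
import random

def kalman_1d(observations, q=0.01, r=1.0, x0=0.0, p0=1.0):
    """One-dimensional Kalman filter: random-walk state, noisy observations.

    q: process-noise variance, r: measurement-noise variance (assumed known).
    """
    x, p = x0, p0
    estimates = []
    for z in observations:
        # Predict: the state carries over, and its uncertainty grows by q
        p = p + q
        # Correct: blend prediction and measurement via the Kalman gain
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates

# Simulated data: a constant level of 5.0 observed in unit-variance noise
rng = random.Random(1)
true_level = 5.0
obs = [true_level + rng.gauss(0.0, 1.0) for _ in range(200)]
est = kalman_1d(obs)
```

Each pass through the loop is one predict-correct iteration of the kind described above; the estimate settles near the true level while the gain k balances the prediction against each new measurement.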
It is also necessary to study the basic concepts of time series in order to get a proper understanding of time series processes. In simple terms, a time series can be defined as a sequence of data points listed or graphed in time order, typically taken at successive, equally spaced points in time. A time series is therefore a sequence of data indexed by discrete time (Blomberg 2020). There are several examples of time series, including counts of sunspots, the daily closing value of a stock index, and the heights of ocean tides. There are various time series models, including the additive model, the multiplicative model, and the mixed model. In statistics, and especially in econometrics, there are also various time series processes. These include the moving average (MA) model, the autoregressive (AR) model, and the purely random process, or white noise (WN) (Blomberg 2020), as well as the autoregressive moving average (ARMA) and autoregressive integrated moving average (ARIMA) models.
The Time Series
A time series simply refers to a sequence of data points ordered in time. The time series concept is mainly applied to capture the dependence between observations, and a time series dataset usually consists of a series of observations used in forecasting or predicting future values. Time series can further be divided into two basic kinds: regular and irregular. In a regular time series, there is an assumption that the data points are generated or recorded at regularly spaced intervals of time, as with a temperature record (Beffetov 2018). In an irregular time series, the observations do not arrive at uniform intervals of time; examples include data that occur in bursts, and the withdrawals and deposits made on a given bank account.
A time series consists of one or more variables that change over time, and can be classified further as univariate or multivariate. A univariate time series has only one variable that varies with time, while a multivariate time series consists of more than one variable changing over time; for example, three variables may be recorded against the same time axis. There are two main aspects to the study of time series: the analysis and the modelling of the observed data. The main objective of time series analysis is to summarize and characterize the essential features of the given series over time. The study of a time series can be carried out either in the frequency domain or in the time domain. The frequency domain focuses on the cyclical movements under study.
In contrast, the time domain focuses on the relationships between the observations over time. A further objective of time series analysis is to enable forecasting or prediction of the future values of one or more variables (Beffetov 2018). Time series models fall into three groups: additive models, multiplicative models, and mixed models. These models are explained and elaborated below:
Multiplicative Models
A time series has four main components: the trend, cyclical variations, seasonal variations, and random variations. The trend describes the long-run movement of the series over time; a trend is conventionally measured over a duration of up to eight years. The cyclical component is a periodic variation over time; cyclical variation is usually assessed over at least two years of data in a time series analysis (Roy 2020), and appears as a sustained rise or fall in the observed data points. The random component consists of irregular variations that do not follow any fixed pattern; these occur over short periods and are unpredictable because of their irregular movements.
These components of the time series are usually analysed to identify and estimate the effects of seasonal variation, trend, and random variation. The multiplicative model expresses the time series as the product of its components, describing the data in terms of trend, seasonal effect, cycles, and residual. An example of a multiplicative model is shown below:
Data = T * St * ε
Where T is the trend,
St is the seasonal variation at a given time t and,
ε is the Error term or the residual of the time series.
Additive Time Series Model
Unlike the multiplicative time series model, the additive time series model adds the components of the time series together rather than multiplying them. The multiplicative model is usually applied when the seasonal variation increases over time, while the additive model is useful when the seasonal variation remains the same over the given time. An example of the additive time series model expression is:
Data = T + St + ε
Where T is the trend,
St is the seasonal variation over time t and,
ε is the value of the residual or the error term.
The Mixed Model
The mixed model in time series involves both addition and multiplication of the components of the time series. Mixed models combine fixed effects and random effects, and are applied in various fields or disciplines such as biology, physics, and the social sciences. Mixed models are also relevant as a statistical alternative to repeated-measures ANOVA. An example of the mixed model expression is shown below:
Data = T*St + ε
Where T is the value of the trend,
St is the value of the seasonal variation at a given time t and,
ε is the value of the residual or the error term.
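To make the three models concrete, the following Python sketch builds a toy series from an assumed linear trend and sinusoidal seasonal component (with the residual suppressed), and combines them additively, multiplicatively, and in mixed form; note how the seasonal swing grows with the trend only when the seasonal term enters multiplicatively.

```python
import math

n = 24  # two years of monthly data (illustrative)
trend = [10.0 + 0.5 * t for t in range(n)]                                   # T: linear trend
seas_add = [2.0 * math.sin(2 * math.pi * t / 12) for t in range(n)]          # St as an additive term
seas_mult = [1.0 + 0.2 * math.sin(2 * math.pi * t / 12) for t in range(n)]   # St as a factor near 1

additive = [trend[t] + seas_add[t] for t in range(n)]          # Data = T + St (+ e, taken as 0 here)
multiplicative = [trend[t] * seas_mult[t] for t in range(n)]   # Data = T * St (* e, taken as 1 here)
mixed = [trend[t] * seas_mult[t] + 0.0 for t in range(n)]      # Data = T * St + e
```

Detrending each series shows the point made above: the additive seasonal swing has the same amplitude in both years, while the multiplicative swing is larger in the second year because the trend has risen.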
This covers the basic concepts of time series: its components (trend, seasonal variation, cyclical variation, and random variation) and its models (additive, multiplicative, and mixed). It will also be helpful to discuss a number of time series processes, since they make the study of stochastic processes easier to understand. Some examples of time series processes include:
Purely Random Process
The purely random process is also known as the white noise process (WN). A stochastic process {εt} is said to be purely random when it has zero mean and constant variance δ², and all its random variables are independent, identically distributed, and normally distributed. The terms of the series are therefore uncorrelated. That is, each random variable satisfies
εt ~ N(0, δ²).
Random Walk process
The random walk is a discrete-time process consisting of a sequence of random variables X1, X2, X3, …, Xn indexed by the natural numbers. The random walk process is simply defined as a process in which the value of the variable equals its previous value plus an error term εt. The residual εt is a purely random process following a normal distribution with mean zero and constant variance δ². A simple random walk yt is given by:
yt = yt-1 + εt
The above expression implies that the change yt − yt-1 = εt cannot be forecast. It can also be shown that the mean of this random walk process is constant, unlike its variance, which is not constant: the variance grows linearly with t.
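These two facts can be checked by simulation. The Python sketch below (with illustrative parameters) generates many independent random walk paths and compares the sample mean and variance at an early date and a late date: the mean stays near the starting value while the variance grows roughly in proportion to t.

```python
import random

def simulate_random_walk(steps, sigma=1.0, rng=None):
    """y_t = y_{t-1} + e_t with e_t ~ N(0, sigma^2), starting from y_0 = 0."""
    rng = rng or random.Random()
    y = 0.0
    path = []
    for _ in range(steps):
        y += rng.gauss(0.0, sigma)
        path.append(y)
    return path

rng = random.Random(42)
paths = [simulate_random_walk(100, rng=rng) for _ in range(2000)]

def sample_var(values):
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

v10 = sample_var([p[9] for p in paths])        # should be near 10 * sigma^2
v90 = sample_var([p[89] for p in paths])       # should be near 90 * sigma^2
m90 = sum(p[89] for p in paths) / len(paths)   # mean stays near y_0 = 0
```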
When a drift term is included in the model, the random walk with drift is given by:
yt = y0 + at + Σ εi (summing εi over i = 1, …, t)
The term at represents the deterministic trend of the process, and y0 is its initial value. The random walk model is fundamental because many economic time series have patterns similar to this trend model. Moreover, if two time series are independent random walks, any apparent relationship between them has no economic meaning. If a regression model is estimated between two independent random walks, it will often produce a very high squared correlation coefficient, usually denoted R², and a high t-ratio on the slope coefficient; such a regression is spurious. Re-running the regression in first differences provides evidence against the earlier spurious results. The differenced regression is:
∆yt = a0 + β1∆xt + νt
When the estimated coefficient β1 in this differenced regression is not significant, this is a clear indication that the relationship between variable X and variable Y in levels is spurious.
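The spurious-regression phenomenon can be illustrated numerically. In the Python sketch below, two independent random walks are generated; the R² between their levels is often sizeable purely by chance, while the R² between their first differences, which is the meaningful comparison, stays close to zero. The sample size and seed are arbitrary choices for the sketch.

```python
import random

rng = random.Random(7)
n = 500
x = [0.0]
y = [0.0]
for _ in range(n - 1):
    x.append(x[-1] + rng.gauss(0.0, 1.0))  # two independent random walks
    y.append(y[-1] + rng.gauss(0.0, 1.0))

def r_squared(a, b):
    """Squared sample correlation between two series."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a)
    vb = sum((bi - mb) ** 2 for bi in b)
    return cov * cov / (va * vb)

dx = [x[i + 1] - x[i] for i in range(n - 1)]
dy = [y[i + 1] - y[i] for i in range(n - 1)]

r2_levels = r_squared(x, y)    # often large despite there being no real relationship
r2_diffs = r_squared(dx, dy)   # near zero, reflecting the true independence
```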
The Moving Average model (MA)
A moving average process of order q is given by
yt = εt + θ1εt-1 + … + θqεt-q,
where t = 1, …, T,
yt is the value of the moving average process at time t,
θ1, …, θq are the population parameters,
q is the order of the moving average process, which can be 1, 2, 3, …, and
εt is the error term, or residual, of the moving average process.
The moving average model of order q is written MA(q). Because a finite-order moving average process is a finite linear combination of white noise terms, the MA(q) model is always stationary. The covariances of a moving average process exist without placing any restrictions on the population parameters θ1, θ2, …, θq. In some situations it is useful to transform a moving average process MA(q) into an autoregressive form; when this transformation is possible, the moving average process is said to be invertible.
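A characteristic feature of an MA(q) process is that its autocorrelations cut off beyond lag q, and this can be verified by simulation. The following Python sketch simulates an MA(1) process with an assumed parameter θ = 0.8 and estimates the autocorrelations at lags 1 and 5; theory gives ρ1 = θ/(1 + θ²) ≈ 0.488 and ρk = 0 for any lag k > 1.

```python
import random

def simulate_ma1(n, theta, sigma=1.0, seed=3):
    """y_t = e_t + theta * e_{t-1}, with e_t ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    eps = [rng.gauss(0.0, sigma) for _ in range(n + 1)]
    return [eps[t + 1] + theta * eps[t] for t in range(n)]

def autocorr(series, lag):
    """Sample autocorrelation of the series at the given lag."""
    m = sum(series) / len(series)
    num = sum((series[t] - m) * (series[t - lag] - m) for t in range(lag, len(series)))
    den = sum((v - m) ** 2 for v in series)
    return num / den

y = simulate_ma1(50000, theta=0.8)
rho1 = autocorr(y, 1)   # theory: 0.8 / 1.64, about 0.488
rho5 = autocorr(y, 5)   # theory: 0, since the lag exceeds q = 1
```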
Autoregressive Process
A simple time-series expression of the autoregressive process is given by
yt = ϕyt-1 + εt,
and the autoregressive process of order p, denoted AR(p), is given by:
yt = ϕ1yt-1 + … + ϕpyt-p + εt
where t = 1, …, T,
ϕ1, …, ϕp are the population parameters of the autoregressive process, and
yt is the value of the autoregressive process at time t.
The autoregressive process is more commonly used than the mixed and moving average processes, simply because it is easier to estimate and interpret in applications of the time series concept. The first-order autoregressive model is:
yt = ϕyt-1 + εt, where t = 1, …, T
Even though observation of the series begins at time t = 1, the process itself starts at some earlier time. Substituting repeatedly for the lagged values of yt (and assuming |ϕ| < 1, so that the process is stationary) gives the moving average representation:
yt = εt + ϕεt-1 + ϕ²εt-2 + … = Σ ϕ^i εt-i (summing over i = 0, 1, 2, …)
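The stationarity condition |ϕ| < 1 and the resulting moments can likewise be checked by simulation. The Python sketch below simulates an AR(1) process with an assumed ϕ = 0.6 and compares the sample variance with the theoretical stationary variance δ²/(1 − ϕ²), which equals 1/0.64 = 1.5625 for unit-variance errors.

```python
import random

def simulate_ar1(n, phi, sigma=1.0, seed=5):
    """y_t = phi * y_{t-1} + e_t; stationary when |phi| < 1."""
    rng = random.Random(seed)
    y = 0.0
    out = []
    for _ in range(n):
        y = phi * y + rng.gauss(0.0, sigma)
        out.append(y)
    return out

y = simulate_ar1(100000, phi=0.6)
mean = sum(y) / len(y)
var = sum((v - mean) ** 2 for v in y) / len(y)
# Theoretical stationary moments: mean 0, variance sigma^2 / (1 - phi^2) = 1.5625
```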
3.2 Definition and a general understanding of a stochastic process
The stochastic process, also known as the random process, belongs to the discipline of probability theory and other related statistical fields. A stochastic process can be defined simply as a mathematical or statistical object consisting of a family of random variables indexed by a given set of numbers, usually interpreted as points in time (Ross 1996). In simple terms, a stochastic process is a family of random variables Xt defined on a probability space, for example {Xt : t > 0}. The index t ranges over the parameter set T, and the state space S is the set in which all possible values of the random variables Xt lie.
The parameter set T can be either discrete or continuous. The process has a discrete parameter when T is countable, and a continuous parameter when T is not countable; combined with a discrete or continuous state space S, this gives the four main classifications of stochastic processes. A continuous-time stochastic process is a very important object in statistics and probability theory. It involves a family of infinitely many random variables, whose values usually range from negative infinity to infinity, and the integration of distribution functions is an important mathematical technique for handling it; such a process takes values in continuous sets. A discrete-time stochastic process, by contrast, is a family of random variables indexed by distinct points of a given collection. Another essential construction linking the two is the continuous-time random walk, which is built from a discrete random walk through a waiting-time distribution.
The stochastic process has many uses and applications in different fields. It plays a vital role in describing relationships between variables over time, since random phenomena typically develop and arise with time. The theory of stochastic processes extends probability distributions and the laws of probability to a broader setting, and is therefore a dynamic part of probability theory. The study of stochastic processes has supported the rapid development of a large number of fields, including astronomy, operations research, biology, and the study of noise and fluctuations in physical systems. The stochastic process is also very relevant to quantitative research, since it involves the study of models in mathematics and statistics.
Stochastic process modelling is very useful for describing and monitoring queues. The study of a queue concerns a waiting line that must be analysed over time. Large numbers of individuals routinely find themselves waiting in a queue for some essential service: patients arriving at hospitals, customers entering a bank for financial assistance, students waiting at the cashier to settle fee issues, or customers at a supermarket purchasing commodities. A queue is a significant random quantity, since at any moment some customers are joining the line, others are at the service chamber being served, and others are leaving.
The stochastic process framework supports the study of the flow of customers through a queue to obtain some essential service. Three aspects are critical to the nature of a queue: the queue discipline, the service mechanism, and the input process, which is the pattern of arrivals. The service mechanism gives a full description of the service available in the queue: the number of servers, the approximate number of customers that can be served at a time, and the duration of service. Examples of service-time assumptions include the constant service-time distribution and the exponential service-time distribution. The queue discipline is the rule for selecting which waiting customer is served next; first come, first served is one common discipline. The input process, or arrival pattern, is the probability law describing the statistical pattern of customer arrivals, including the average arrival rate. Queueing theory is concerned with quantities of interest such as the distribution of service times and the average waiting time of customers.
Stochastic processes can also be described by the relationships that hold among the members of the family. Special classes of stochastic processes include the following:
Martingale Process
Consider a stochastic process {Xt : t > 0} with finite mean. The process is a martingale if the conditional mean of X(tn+1), given the values X(t1), X(t2), X(t3), …, X(tn), equals the most recent observation X(tn). That is, for
t1 < t2 < t3 < … < tn:  E[X(tn+1) | X(t1), X(t2), …, X(tn)] = X(tn)
Similarly, a discrete-time stochastic process {Xn}, where n is an element of 1, 2, 3, …, with finite expectation is a martingale when, for each integer n, E[Xn+1 | X1, X2, X3, …, Xn] = Xn.
Stationary Process
Consider a stochastic process Xt, where t is an element of the parameter set T. The process {Xt} has stationary increments when increments over intervals of the same length have the same distribution. That is, for any s, t ∈ T, the increment X(t + s) − X(t) has the same distribution as X(s) − X(0).
Independent Increment Processes
A stochastic process {Xt} is an independent increment process (also called an additive process) when its increments over disjoint time intervals are independent of one another. For t′ > t, the increment X(t′) − X(t) is independent of the past of the process, and in the stationary case its distribution depends only on t′ − t, the length of the interval.
Markov Process
A stochastic process {Xt} is a Markov process when, given the present value X(t) at time t, any future value X(s) with s > t depends only on the present state and not on the remote past. This characteristic is the memoryless, or Markov, property.
Point Process
When events occur at points of continuous one-dimensional time, the process is a point process. Such a process arises when interest concentrates on the occurrences of the individual events over time.
Markov Chain
Consider the stochastic process {Xn}, n = 1, 2, 3, …, taking a finite or countable set of possible values, which we take to be the set of non-negative integers {0, 1, 2, 3, …}. When Xn = i, there is a fixed probability Pij that the next state will be j. That is:
P{Xn+1 = j | Xn = i, Xn-1 = in-1, …, X1 = i1, X0 = i0}
= Pij
for all states i0, i1, …, in-1, i, j and all n ≥ 0. Such a stochastic process is a Markov chain. The equation above states that, for a Markov chain, the conditional distribution of Xn+1 given the present state Xn = i is independent of the past states X0, X1, …, Xn-1; this is the Markov property. The probability Pij represents the probability that the process, when in state i, next makes a transition to state j. Since the probabilities are non-negative and the process must make a transition to some state, Pij satisfies:
Pij ≥ 0, where i, j ≥ 0;
∑j Pij = 1, for every i = 0, 1, 2, …
Suppose that P denotes the matrix of one-step transition probabilities Pij; the matrix P will be as shown below:
P =
| P00 P01 P02 … |
| P10 P11 P12 … |
| P20 P21 P22 … |
| …   …   …     |
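As an illustration, a transition matrix and a trajectory simulated from it can be sketched in Python. The three-state matrix below is hypothetical, chosen only so that each row sums to one:

```python
import numpy as np

# Hypothetical 3-state transition matrix P: entry P[i, j] is the
# probability P_ij of moving from state i to state j, so each row sums to 1.
P = np.array([
    [0.7, 0.2, 0.1],
    [0.3, 0.4, 0.3],
    [0.2, 0.3, 0.5],
])

def simulate_chain(P, start, n_steps, rng):
    """Simulate n_steps one-step transitions of the chain from state `start`."""
    states = [start]
    for _ in range(n_steps):
        states.append(int(rng.choice(len(P), p=P[states[-1]])))
    return states

rng = np.random.default_rng(0)
path = simulate_chain(P, start=0, n_steps=10, rng=rng)
print(path)              # one sample trajectory of states
print(P.sum(axis=1))     # each row of P sums to 1
```

The only inputs are the matrix P and a starting state; the next state is always drawn from the row of P indexed by the current state, which is exactly the Markov property.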
Let us consider an example of a Markov chain arising from a queue. One of the standard models is:
The M/G/1 queue
This model is the M/G/1 queueing system. The parameter M represents the arrivals of customers in the form of an exponential (memoryless) process, G stands for a general distribution of service times, and 1 gives a clear indication that there is only one server. Let us consider an example of customers that arrive at the service centre following a Poisson process with rate λ. There is only one server, and a customer that finds the server unoccupied moves directly into service (Sinyavskiy 205). The service times of the successive customers are assumed to be independent random variables with a common distribution G, and they are also assumed to be independent of the arrivals.
When the random variable X(t) denotes the number of customers within the system at time t, the process {X(t), t ≥ 0} will not have the Markov property that the conditional distribution of future events depends only on the present circumstances. When there are customers within the system at time t, the prediction of future behaviour requires not only the number of previous arrivals but also how long the customer currently in service has already been served (Sinyavskiy 205). This occurs simply because the distribution G is arbitrary and hence need not have the memoryless property.
Looking at the system only at the moments when a customer departs is a suitable technique of handling this mathematical problem. Therefore let Xn represent the number of customers left behind by the nth departure, where n ≥ 1. Also, let the parameter Yn represent the number of customers that arrive while the service of the (n + 1)st customer is taking place. When Xn > 0, the next customer enters service immediately and the other Xn − 1 wait in the queue; the next departure therefore leaves behind those customers, less the one served, together with any other customers that arrive while the (n + 1)st customer is in service. When Xn = 0, the next departure leaves behind exactly the customers that arrive during its own service.
Therefore:
Xn+1 = Xn − 1 + Yn, if Xn ≥ 1
Xn+1 = Yn, if Xn = 0
Given that Yn (n ≥ 1) stands for the number of arrivals during a service period, and that arrivals follow a Poisson process with rate λ, conditioning on the length x of the service time gives:
P{Yn = j} = ∫ e−λx (λx)j / j! dG(x)
Where j = 0, 1, 2, …
Therefore, from the above two expressions, {Xn, n = 1, 2, 3, …} is a Markov chain process with transition probabilities as shown below:
P0j = ∫ e−λx (λx)j / j! dG(x), j ≥ 0
Pij = ∫ e−λx (λx)j−i+1 / (j − i + 1)! dG(x),
where j ≥ i − 1, i ≥ 1, and Pij = 0 otherwise.
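The arrival probabilities above can be checked numerically. The Python sketch below (with hypothetical rates λ = 1 and µ = 2) estimates aj = ∫ e−λx(λx)j/j! dG(x) by Monte Carlo for an exponential service distribution G, for which the integral has the closed geometric form (µ/(λ+µ))(λ/(λ+µ))j:

```python
import math
import numpy as np

# Monte Carlo check of a_j = ∫ exp(-lam*x) (lam*x)^j / j! dG(x), the
# probability of j Poisson(lam) arrivals during one service time.
# Hypothetical rates; G is taken exponential with rate mu, for which the
# exact value is geometric: a_j = (mu/(lam+mu)) * (lam/(lam+mu))^j.
lam, mu = 1.0, 2.0
rng = np.random.default_rng(1)
service = rng.exponential(1.0 / mu, size=200_000)   # draws from G

def a(j):
    return float(np.mean(np.exp(-lam * service) * (lam * service) ** j
                         / math.factorial(j)))

for j in range(4):
    exact = (mu / (lam + mu)) * (lam / (lam + mu)) ** j
    print(j, round(a(j), 4), round(exact, 4))
```

The same Monte Carlo recipe works for any service distribution G one can sample from, which is the point of the general M/G/1 formula.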
Suppose instead that customers arrive at a single-server service chamber according to a renewal process with an arbitrary inter-arrival distribution G, and assume that the rate of the service, which follows an exponential process, is µ. When Xn denotes the number of customers in the system as seen by the nth arrival, the process Xn (n ≥ 1) follows a Markov chain process. For computational purposes, the transition probabilities Pij of this Markov chain can be obtained from the Poisson process: as long as there exist customers to be served in the service chamber, the number of services completed by time t follows a Poisson process with rate µ, as shown below:
Pi,i+1−j = ∫ e−µt (µt)j / j! dG(t), j = 0, 1, …, i
If an arrival finds ‘i’ customers in the system, then the next arrival will find i + 1 minus the number of customers served in between the two arrivals. The probability that j of them are served during an inter-arrival time is equal to the right-hand side of the above expression.
The Stochastic Volatility Model
A stochastic process, also known as a random process, belongs to the discipline of probability theory and other related statistical fields. A stochastic process can be simply defined as a mathematical or statistical object consisting of a family of random variables indexed by a given set, usually interpreted as points in time (Ross 1996). The stochastic volatility model combines this idea with the time series concept and is appropriate in finance, especially for exchange rates and stock returns. Stock returns and exchange rates experience changes in variance over time.
The changes in the stochastic volatility model usually occur concurrently. The variations of the stock returns and exchange rates in the financial market can lead to uncertainty; when the disturbance is due to an international crisis, the market price needs some time to stabilize. There are various ways of modeling the variance changes. The series of interest is generally written in terms of a sequence of independent and identically distributed random variables {εt} scaled by a time-varying standard deviation δt, so that the variance at time t is δ2t. That is:
yt = δtεt, εt ~ IID(0, 1)
One of the possible ways is the direct approach, in which δt itself follows a stochastic process model. The diagram below shows the differences of the daily exchange rate against the dollar in October 1985:
The stochastic volatility approach assumes that the variance is not directly observable and follows a stochastic process of its own. The stochastic model fits significantly into the theoretical framework within which finance theory has developed; in particular, the model enhances the generalization of the Black–Scholes option-pricing formula. The critical challenge in the stochastic volatility technique is that the exact likelihood function cannot be written down in closed form. On the other hand, the model can overcome the drawbacks of the GARCH model. The stochastic process for the variance δ2t is not modeled directly; instead, the model is formulated so that the variance δ2t is always positive:
yt = δtεt, δ2t = exp(ht), where t = 1, …, T
The value of ht follows an autoregressive process of order one (ht ~ AR(1)):
ht = ɤ + Φht-1 + ηt, where ηt ~ NID(0, δ2η)
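This model can be simulated directly. The Python sketch below (with hypothetical parameter values, and assuming independent Gaussian εt and ηt) generates yt = exp(ht/2)εt with ht an AR(1) process, and compares the sample moments of ht with its stationary mean and variance:

```python
import numpy as np

# Simulation sketch of the stochastic volatility model
#   y_t = exp(h_t / 2) * eps_t,  h_t = gamma + phi * h_{t-1} + eta_t,
# with hypothetical parameters and independent Gaussian eps_t, eta_t.
gamma, phi, sigma_eta = -0.05, 0.95, 0.3
T = 50_000
rng = np.random.default_rng(2)

h = np.empty(T)
h[0] = gamma / (1 - phi)                    # start at the stationary mean
for t in range(1, T):
    h[t] = gamma + phi * h[t - 1] + sigma_eta * rng.normal()
y = np.exp(h / 2) * rng.normal(size=T)

# Stationary moments of h: mean gamma/(1-phi), variance sigma_eta^2/(1-phi^2)
print(h.mean(), gamma / (1 - phi))
print(h.var(), sigma_eta**2 / (1 - phi**2))
print(y.mean())                             # y_t is white noise with mean zero
```

Note that the returns yt average out to zero even though their conditional variance exp(ht) moves persistently over time.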
The value of ηt can either be dependent on or independent of the error term εt. The suitable way of handling this model is in state space form. Before implementing this method, we have to derive the stochastic properties of the variance model. We assume that the error terms εt and ηt are independent, to simplify the handling of the model. Let us consider the stochastic process below:
When |Φ| < 1, we know from standard theory that ht is stationary with variance δ2h = δ2η / (1 − Φ2) and mean ɤh = ɤ / (1 − Φ). Since yt is the product of two stationary processes, it must itself be stationary; the restriction |Φ| < 1 therefore ensures the stationarity of the process generating yt. Since yt follows a white noise process (yt ~ WN) when ηt and εt are independent, its mean is zero. Moreover:
E(ytyt-τ) = E(εtεt-τ) E[exp(ht/2) exp(ht-τ/2)] = 0, for τ ≠ 0,
since E(εtεt-τ) = 0.
All the odd moments of yt are equal to zero by the symmetry of the error term. We can make use of the standard log-normal result to obtain the even moments when the error term εt is Gaussian: exp(ht) is log-normally distributed and its jth moment is exp{jɤh + ½j2δ2h}.
Hence Var(yt) = E(ε2t) E{exp(ht)}
= exp{ɤh + ½δ2h}
Hence, when δ2h is positive, the kurtosis, 3 exp{δ2h}, will be bigger than three. Therefore the stochastic volatility model exhibits kurtosis in excess of the normal distribution. Taking the logarithm of the squares of the model yt we get:
log y2t = ht + log ε2t
The mean and variance of log ε2t are known when εt is Gaussian: its mean is −1.27 and its variance is 4.93 (that is, π2/2). Therefore:
log y2t = −1.27 + ht + et, where et = log ε2t + 1.27
so that log y2t is the sum of an autoregressive process of order one and white noise, and its autocorrelation function is:
ρ(τ; log y2t) = Φτ / (1 + 4.93/δ2h), where τ = 1, 2, …
Because log y2t is equivalent to an autoregressive moving average process of order (1, 1), its properties are similar to those of the generalized autoregressive conditional heteroskedasticity model of order (1, 1). The autocorrelation function above is deduced from log y2t = −1.27 + ht + et and the properties of ht.
We can generalize the model by letting the error term εt follow a t-distribution. The generalization is essential because the kurtosis observed in financial series is typically greater than the kurtosis attainable from a Gaussian process. One can show that yt is then still stationary white noise; its unconditional variance becomes {v / (v − 2)} exp(ɤh + ½δ2h), where v represents the degrees of freedom, and the kurtosis of the process is 3{(v − 2) / (v − 4)} exp(δ2h).
Now that the error term εt is a t-variable, we write it as:
εt = ζt / κt1/2, where ζt is a standard normal variate independent of κt, and vκt follows a chi-squared distribution with v degrees of freedom. Therefore:
log ε2t = log ζ2t − log κt, and
E(log κt) = Ψ(v/2) − log(v/2)
Var(log κt) = Ψ′(v/2), so that
log y2t = −1.27 − Ψ(v/2) + log(v/2) + ht + et
where et has a mean of zero and a variance of 4.93 + Ψ′(v/2).
The diagram below shows the daily pound to dollar exchange rate in October 1985:
Markov chain Monte Carlo
Two major categories of numerical problems arising in statistical inference are integration problems and optimization problems. Many Bayesian inferential calculations require integration; the Bayes estimators, in particular, are defined through integrals. In general, the Bayes estimate associated with the loss function L(θ, δ) and the prior π is the solution of the minimization program:
minδ ∫ L(θ, δ) π(θ) f(x/θ) dθ
When the loss function is quadratic, the solution is the posterior mean. Stationary states in dynamic systems, as in economics and physics, likewise require simulation of the successive states of the system. As an example, for the loss L(θ, δ) = /θ − δ/, the estimator associated with π is the posterior median δπ of π(θ/x), that is, the solution of the equation:
∫(−∞, δπ) π(θ/x) dθ = ∫(δπ, ∞) π(θ/x) dθ
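This characterization can be checked numerically: under absolute-error loss, the value of δ minimizing the expected loss is the median. The Python sketch below uses draws from a gamma distribution as a hypothetical stand-in posterior (the distribution and search grid are illustrative choices):

```python
import numpy as np

# Numerical check that the posterior median minimizes the expected
# absolute loss E|theta - delta|.  The "posterior" here is a hypothetical
# stand-in: draws from a skewed Gamma(2, 1) distribution.
rng = np.random.default_rng(3)
theta = rng.gamma(shape=2.0, scale=1.0, size=100_000)

def risk(delta):
    return np.mean(np.abs(theta - delta))   # empirical expected loss

grid = np.linspace(0.5, 3.5, 301)
best = grid[np.argmin([risk(d) for d in grid])]
median = np.median(theta)
print(best, median)   # the minimizer coincides with the (sample) median
```

The same grid search with a squared loss would land on the sample mean instead, illustrating how the loss function selects the Bayes estimator.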
Markov chain Monte Carlo (MCMC) enables the simulation of draws from a complex distribution of interest, and these random numbers can then be used for the computation of integrals. The concept of the Markov chain process is fundamental here, since it enables the creation of samples from continuous random variables whose probability density function is known only up to a constant of proportionality. The samples from the continuous distribution then allow the evaluation of an integral over that variable, such as a mean or a variance (Harvey 1989). The Markov chain Monte Carlo method consists of a class of algorithms for sampling from continuous probability distributions. Suppose that we need to compute the integral:
∫ h(x) dx, where the value of x ranges from a given point ‘a’ to a point ‘b.’
We can therefore decompose the function h(x) into the product of a function f(x) and a probability density function p(x) defined on the interval (a, b). Hence:
∫ h(x) dx = ∫ f(x) p(x) dx, a ≤ x ≤ b
= Ep(x)[f(x)]
This integral is therefore expressible as the expectation of f(x) with respect to the density p(x). Hence, if we use the density p(x) to draw a large number (x1, x2, x3, …, xn) of random variables, then:
∫ h(x) dx = Ep(x)[f(x)] ≈ (1/n) ∑ f(xi). This is a Monte Carlo integration process.
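A minimal Python sketch of this Monte Carlo integration, with p taken as the uniform density on (a, b) and a toy integrand h(x) = x2 (an assumption for illustration) whose integral over (0, 1) is 1/3:

```python
import numpy as np

# Monte Carlo integration sketch: write the integral of h over (a, b) as
# E_p[f(X)] with p the uniform density on (a, b), so f(x) = (b - a) * h(x).
rng = np.random.default_rng(4)
a, b, n = 0.0, 1.0, 200_000

h = lambda x: x**2                      # toy integrand, exact integral 1/3
x = rng.uniform(a, b, size=n)           # draws x_1, ..., x_n from p
estimate = np.mean((b - a) * h(x))      # (1/n) * sum of f(x_i)
print(estimate)                         # close to 1/3
```

The error of the estimate shrinks at the usual 1/sqrt(n) Monte Carlo rate, regardless of the dimension of the integral.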
Monte Carlo integration is essential in the approximation of marginal distributions in Bayesian analysis. Consider a marginal I(y) = ∫ f(y/x) p(x) dx, approximated by:
Î(y) = (1/n) ∑ f(y/xi)
where the xi represent draws from the probability density function p(x). The estimated Monte Carlo standard error is:
SE2[Î(y)] = (1/n) [ (1/(n − 1)) ∑ (f(y/xi) − Î(y))2 ]
Monte Carlo integration is also beneficial when sampling from p(x) itself is difficult. Suppose there is a density q(x) that approximates p(x); then:
∫ f(x) p(x) dx = ∫ f(x) [p(x)/q(x)] q(x) dx
Therefore the importance-sampling estimate is:
∫ f(x) p(x) dx ≈ (1/n) ∑ f(xi) p(xi)/q(xi), where we draw the xi from the distribution q(x).
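A small Python sketch of this importance-sampling identity, using a hypothetical pair of densities: the target p is standard normal, the proposal q is N(0, 2²), and f(x) = x², so the true value Ep[f(X)] is 1:

```python
import numpy as np

# Importance sampling sketch: estimate E_p[f(X)] for a standard normal
# target p using draws from a wider proposal q = N(0, 2^2), reweighting
# each draw by p(x)/q(x).  Both densities are illustrative choices.
rng = np.random.default_rng(5)
n = 200_000

x = rng.normal(0.0, 2.0, size=n)             # draws from the proposal q
p = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)   # target density N(0, 1)
q = np.exp(-x**2 / 8) / np.sqrt(8 * np.pi)   # proposal density N(0, 4)
w = p / q                                    # importance weights

estimate = np.mean(x**2 * w)                 # E_p[X^2] = 1 for N(0, 1)
print(estimate, w.mean())                    # both should be close to 1
```

A proposal with heavier tails than the target, as here, keeps the weights bounded; a too-narrow proposal would make the weights explode in the tails.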
An example with a piecewise weighted quadratic loss is as follows:
Suppose L(θ, δ) = ώi(θ − δ)2
where θ − δ ϵ [ai, ai+1), ώi > 0.
Differentiating the posterior expected loss with respect to δ gives the associated Bayes estimator; for a weighted quadratic loss of the form L(θ, δ) = ώ(θ)(θ − δ)2, it is the weighted posterior mean:
δπ(x) = E[ώ(θ)θ / x] / E[ώ(θ) / x]
The performance of best unbiased estimators, moment estimators, and maximum likelihood estimators can be evaluated within classical decision theory, where comparisons of the estimators are made through their expected losses. Consider a variable X that follows a normal distribution X ~ Np(θ, Ip), with prior θ ~ Np(μ, λIp), and take the hyperparameter μ to be equivalent to zero (μ = 0). Using the empirical Bayes approach, the hyperparameter λ is replaced by its maximum likelihood estimator λ̂ (Robert 2013). The estimation of λ is done through the marginal distribution of x, which is X ~ Np(0, (λ + 1)Ip); the maximum likelihood estimator is λ̂ = (//x//2/p − 1)+. The posterior distribution of θ when λ is given is Np(λx/(λ + 1), (λ/(λ + 1))Ip). When the quantity of interest //θ//2 is evaluated under quadratic loss, the empirical Bayes estimator is:
δeb(x) = E(//θ//2 / x)
= (λ̂/(λ̂ + 1))2//x//2 + pλ̂/(λ̂ + 1)
which, on substituting λ̂, reduces to
= (//x//2 − p)+
We can also apply the simulation technique by developing some essential properties in order to evaluate the integral:
Ef[h(X)] = ∫ h(x) f(x) dx
Suppose the probability density function f generates the sample (X1, X2, X3, …, Xm); then the empirical average is:
ћm = (1/m) ∑ h(Xj)
which converges almost surely to Ef[h(X)]. If h2 has a finite expectation under f, the variance of ћm is:
Var(ћm) = (1/m) ∫ (h(x) − Ef[h(X)])2 f(x) dx
Suppose that H0 is a null hypothesis imposing r independent constraints on the parameter θ ϵ Ɍk, with constrained maximum likelihood estimator θ0 and unconstrained estimator θ̂; the likelihood ratio statistic then satisfies:
2 log [ℓ(θ̂ \ x) / ℓ(θ0 \ x)] = 2{log ℓ(θ̂ \ x) − log ℓ(θ0 \ x)} → χr2
Considering the algorithm of the Markov chain Monte Carlo, the construction of the Markov chain is done using a transition kernel K: the conditional density of Xn+1 given Xn is K(Xn, ·), written Xn+1 ~ K(Xn, Xn+1). For example, a random walk process consists of a sequence of random variables Xn that satisfies:
Xn+1 = Xn + ɛn
where ɛn is generated independently of Xn, Xn-1, …
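A random walk of this form is easy to generate; the Python sketch below builds the chain by cumulative summation of independent Gaussian increments ɛn (the Gaussian choice is illustrative):

```python
import numpy as np

# A Gaussian random walk: X_0 = 0 and X_{n+1} = X_n + eps_n with the
# increments eps_n generated independently of the past of the chain.
rng = np.random.default_rng(6)
n = 1000
eps = rng.normal(size=n)
x = np.concatenate(([0.0], np.cumsum(eps)))   # X_0, X_1, ..., X_n
print(len(x), x[:3])
```

Each step depends on the past only through the current position, so the walk is a Markov chain with kernel K(x, ·) = N(x, 1).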
The Kalman Filter
Suppose ‘at’ represents the optimal estimator of the state vector ‘αt,’ and denote the covariance matrix of the associated estimation error by Pt, defined as follows:
Pt = E[(αt − at)(αt − at)′]
This is equivalent to the mean square error (MSE) matrix of ‘at’; it is not the covariance matrix of ‘at’ itself, since it is the estimation error that contains the random variables (Robert 2013). Let us assume that at time t − 1 the quantities at-1 and Pt-1 are available. The prediction equations for the optimal estimator of αt are then:
at/t-1 = Ttat-1 + ct, hence
Pt/t-1 = TtPt-1T′t + RtQtR′t, where t = 1, 2, …, T.
The corresponding predictor of yt is:
Ӯt/t-1 = Ztat/t-1 + dt, t = 1, …, T
Therefore the prediction error vector and its mean square error matrix are:
vt = yt − Ӯt/t-1 = Zt(αt − at/t-1) + εt,
where t = 1, …, T, and
Ft = ZtPt/t-1Z′t + Ht
Now that the observation yt is available, we can update the state estimator. The updating equations are:
at = at/t-1 + Pt/t-1Z′tF-1t(yt − Ztat/t-1 − dt)
Pt = Pt/t-1 − Pt/t-1Z′tF-1tZtPt/t-1, t = 1, …, T
The prediction error vector vt plays a very important role in the update of the observation: the change to the state estimator increases with the deviation of the observation from its prediction. Together, these equations establish the Kalman filter (KF). The Kalman filter provides the optimal estimator of the state when the initial conditions P0 and a0 are given.
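The recursions above can be sketched for the simplest special case, the local level model, where Zt = Tt = Rt = 1 and ct = dt = 0; the noise variances H = 1 and Q = 0.1 below are hypothetical choices. Filtering should then track the state more closely than the raw observations do:

```python
import numpy as np

# Kalman filter sketch for the local level model
#   y_t = alpha_t + eps_t,  alpha_t = alpha_{t-1} + eta_t,
# i.e. Z_t = T_t = R_t = 1 and c_t = d_t = 0 in the notation above.
# The noise variances H (observation) and Q (state) are hypothetical.
rng = np.random.default_rng(7)
n_obs, H, Q = 200, 1.0, 0.1

alpha = np.cumsum(np.sqrt(Q) * rng.normal(size=n_obs))   # true state path
y = alpha + np.sqrt(H) * rng.normal(size=n_obs)          # observations

a, P = 0.0, 1e4                     # vague initial conditions a_0, P_0
filtered = []
for obs in y:
    a_pred, P_pred = a, P + Q       # prediction equations
    F = P_pred + H                  # prediction-error variance F_t
    v = obs - a_pred                # innovation v_t
    a = a_pred + P_pred / F * v     # updating equations
    P = P_pred - P_pred**2 / F
    filtered.append(a)
filtered = np.array(filtered)

# The filtered estimates track the state more closely than the raw data.
print(np.mean((filtered - alpha)**2), np.mean((y - alpha)**2))
```

The scalar updates here are exactly the matrix recursions above with every system matrix set to 1, which makes the roles of vt, Ft, and the gain Pt/t-1/Ft easy to see.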
Metropolis Sampling
The Metropolis–Hastings algorithm generates samples from a Markov chain whose stationary density is P(x). From the current point x, a candidate y is drawn from a proposal density q(x, y), which is equivalent to Pr(x → y \ q), and the candidate is accepted with probability α(x, y) (Robert 2013). Therefore the transition kernel of the chain is:
Pr(x → y) = q(x, y) α(x, y)
= q(x, y) · min[P(y)q(y, x) / P(x)q(x, y), 1]
Hence the Metropolis–Hastings kernel satisfies the detailed balance condition:
P(x → y) P(x) = P(y → x) P(y)
Therefore P is the stationary distribution of the chain: for all values of x and y, the draws from the Metropolis kernel converge to draws from the target distribution, since integrating P(y → x) P(y) over y returns P(x).
The integral of a function h with respect to the distribution f,
J = ∫ h(x) f(x) dx,
can then be approximated using a Markov chain Monte Carlo algorithm with stationary distribution f, which produces a chain (X(t)), where:
JT = (1/T) ∑ h(X(t))
The target density f is the key ingredient of the Metropolis–Hastings algorithm. The algorithm can be implemented whenever simulation from a conditional density q(·\x) is easy and the ratio f(y)/q(y\x) is known up to a constant independent of x. To advance the chain, we generate Yt ~ q(y \ x(t)) and take:
X(t+1) = Yt with probability ρ(x(t), Yt)
X(t+1) = X(t) with probability 1 − ρ(x(t), Yt),
where ρ(x, y) = min[f(y)q(x\y) / f(x)q(y\x), 1]
Therefore q is the proposal distribution and ρ(x, y) is the Metropolis–Hastings acceptance probability.
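A minimal Python sketch of this algorithm, using a random-walk proposal q(y\x) = N(x, 1), which is symmetric, so the acceptance probability reduces to min{f(y)/f(x), 1}; the target f here is an unnormalized standard normal density (both choices are illustrative):

```python
import numpy as np

# Random-walk Metropolis-Hastings sketch.  The proposal q(y|x) = N(x, 1)
# is symmetric, so rho(x, y) reduces to min{f(y)/f(x), 1}.  The target f
# is an unnormalized standard normal density, chosen for illustration.
rng = np.random.default_rng(8)
f = lambda z: np.exp(-z**2 / 2)
n = 100_000

x = 0.0
chain = np.empty(n)
for t in range(n):
    y = x + rng.normal()                       # propose Y_t ~ q(.|x)
    if rng.uniform() < min(f(y) / f(x), 1.0):  # accept with prob. rho(x, y)
        x = y
    chain[t] = x

print(chain.mean(), chain.var())   # roughly 0 and 1 for the N(0, 1) target
```

Only the ratio f(y)/f(x) is needed, so the normalizing constant of the target never has to be computed, which is the practical appeal of the method.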
Gibbs Sampling
The Gibbs sampling technique is a special case of Metropolis–Hastings sampling in which every draw is accepted (the acceptance probability is always α = 1). The Gibbs sampling technique only requires the univariate conditional distributions (Robert 2013). The simulation of univariate conditional distributions is elementary as compared to the simulation of complex joint distributions.
Consider a bivariate random variable (x, y) for which we wish to compute one or both of the marginals P(x) and P(y). It is often straightforward to sample from the sequence of conditional distributions P(x/y) and P(y/x), as compared to obtaining a marginal by integration of the joint density P(x, y) (Carlo 2004); the function P(x) is equivalent to ∫ P(x, y) dy. The sampler begins with an initial value y0 for y and draws x0 from the conditional distribution P(x / y = y0). Therefore the samples will be as follows:
xi ~ P(x \ y = yi-1)
yi ~ P(y \ x = xi)
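These two conditional draws are exactly the ones used in the appendix R code. A Python counterpart of that beta-binomial Gibbs sampler, alternating x | y ~ Beta(alpha + y, beta + trials − y) and y | x ~ Binomial(trials, x), can be sketched as:

```python
import numpy as np

# Python counterpart of the beta-binomial Gibbs sampler in the appendix
# R code: alternate the two univariate conditional draws
#   x | y ~ Beta(alpha + y, beta + trials - y),
#   y | x ~ Binomial(trials, x).
rng = np.random.default_rng(9)

def gibbs_bin_beta(n, trials, alpha, beta):
    xs = np.empty(n)
    ys = np.empty(n, dtype=int)
    x, y = 0.0, 0                      # starting values, as in the R code
    for i in range(n):
        x = rng.beta(alpha + y, beta + trials - y)
        y = rng.binomial(trials, x)
        xs[i], ys[i] = x, y
    return xs, ys

xs, ys = gibbs_bin_beta(10_000, trials=80, alpha=2, beta=3)
# Marginally x ~ Beta(2, 3) (mean 0.4) and y is beta-binomial (mean 32).
print(xs.mean(), ys.mean())
```

Neither conditional is ever rejected, which is what distinguishes the Gibbs sampler from a generic Metropolis–Hastings scheme.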
We can repeat the process k times to generate a Gibbs sequence of length k. A subset of points (xj, yj) for 1 ≤ j ≤ m < k constitutes the simulated draws; one can obtain a sample of size m after discarding enough of the initial values to remove the effect of the starting point. Now let us consider random variables X and Y with joint probability function f(x, y) (Robert 2013). The simulation region of the joint distribution (the slice sampler) is: ƪ(f) = {(x, y, μ): 0 ≤ μ ≤ f(x, y)}
From a starting point (x′, y′), the three components are generated as follows:
- μ along the u-axis, uniformly distributed on [0, f(x′, y′)];
- Y along the y-axis, uniformly distributed on {y: μ ≤ f(x′, y)};
- X along the x-axis, uniformly distributed on {x: μ ≤ f(x, y′)}.
There is likewise a uniform generation from the set {x: fX\Y(x\y) ≥ μ / fY(y)},
where fY denotes the marginal distribution of Y and fX\Y denotes the conditional distribution of X given Y.
Let us consider the example below, where x takes the values 0, 1, 2, …, n and y lies between 0 and 1. A joint density with these margins (the beta-binomial pair) is:
P(x, y) ∝ C(n, x) yx+a-1(1 − y)n-x+b-1, x = 0, 1, …, n, 0 ≤ y ≤ 1
Here X represents a discrete random variable while Y represents a continuous random variable. The marginals and the conditional densities are simple even though the joint density is awkward to work with directly: the conditional density of y given x is proportional to a beta density,
P(y \ x, n) ∝ yx+a-1(1 − y)n-x+b-1, for 0 ≤ y ≤ 1,
while x given y follows a binomial distribution with parameters n and y (x \ y ~ Bin(n, y)), where n represents the number of trials and y plays the role of the success parameter (Robert 2013). The concept of Gibbs sampling thus reduces the computation to a sequence of univariate conditional draws of the random variables.
Summary and Conclusions
As per the study, the concept of stochastic volatility has a vital role to play in the field of financial markets. The stochastic volatility model enhances the evaluation and monitoring of essential parameters in financial marketing: it models the changes in variance of stock returns and exchange rates in the market over time. The model is also applicable in the generalization of the Black–Scholes option-pricing formula. Four estimation and filtering methods for stochastic volatility models were considered (Carlo 2004): the Kalman filter, Metropolis sampling, Gibbs sampling, and the Markov chain Monte Carlo method.
The Kalman filtering technique is beneficial in handling models observed in discrete time. The method of Kalman filtering makes use of the prediction error vector vt to update the estimate of the state as each discrete-time observation arrives. The Markov chain Monte Carlo method is also very significant in terms of simulating from compound distributions. Monte Carlo integration involves the decomposition of an integral over a given interval (a, b) into an expectation under a density, and it is applicable in the approximation of marginals in Bayesian analysis (Robert 2013). The precision of a Monte Carlo estimate is assessed through its standard error. The sampling process using Monte Carlo consists of drawing the xi's from the probability density function P(x).
Gibbs sampling is another method that is very important for sampling from a given random variable distribution. This technique is a special case of Metropolis–Hastings sampling in which every draw is accepted (α = 1). The Gibbs sampling method is a substitute for marginal integration since it involves only the simulation of univariate conditional distributions (Robert 2013). The selection or drawing of samples is in the form of xi and yi from the conditional distributions, such as P(y/x = x0).
Metropolis sampling is another significant method of handling a density function P(x). The Metropolis method involves a transition kernel with acceptance probability α(x, y), and the resulting kernel satisfies the detailed balance condition P(x → y) P(x) = P(y → x) P(y).
The analysis and implementation of the Gibbs sampling are done using code in R. The output of the Gibbs sampling program in the R console is as follows:
> gibbsBinBeta=function(n,trials,alpha,beta)
+ {
+ mat=matrix(ncol=2,nrow=n)
+ x=0
+ y=0
+ mat[1,]=c(x,y)
+ for(i in 2:n){
+ x=rbeta(1,alpha+y,beta+trials-y)
+ y=rbinom(1,trials,x)
+ mat[i,]=c(x,y)
+ }
+ mat
+ }
> bvn=gibbsBinBeta(10000,80,2,3)
> par(mfrow=c(3,2))
> plot(ts(bvn[,1]))
> plot(ts(bvn[,2]))
> hist(bvn[,2],40)
> acf(bvn[,2],100)
> par(mfrow=c(1,1))
The diagnosis and analysis of the Gibbs sampling output took the form of plotting the trace of the sampled values, a histogram, and the autocorrelation function of the data.
Figure 8.3
As per the analysis of the Gibbs sampling, the output of the implementation is displayed in figure 8.3, which consists of four diagrams. The output clearly shows that the Gibbs samples from this model are approximately normally distributed, as displayed in the histogram. The histogram has two axes: the frequency, which is the dependent variable, and the bvn values, which form the independent variable. The frequency reaches its highest level at about 400 units, while the bvn variable reaches a highest value of about 80 units. The data distribution in the histogram follows an approximately bell-shaped curve, with the bulk of the Gibbs sampling data at a peak in the frequency range of 300 to 400 units and corresponding tails on both sides of the histogram. The Gibbs sample also exhibits an autocorrelation function that decays steadily, as shown in the plot in figure 8.4. The displays from the R commands plot(ts(bvn[,1])) and plot(ts(bvn[,2])) show that the Gibbs samples vary randomly over time. Figures 8.4 and 8.5 are screenshots of the entire output of the Gibbs sampling from the R script.
Figure 8.4
Figure 8.5
Acknowledgments
I would like to take this opportunity to thank the instructor for the willingness and goodwill to provide us with the correct measures, guidelines, and directions to handle this assignment appropriately. My sincere appreciation also goes to the lecturer, who was always there to provide us with all the necessary support in our daily academic endeavours, and who did a lot to ensure that the unit under study was covered adequately and sufficiently with all the essential skills and mathematical concepts. My gratitude also goes to all the authors behind the research and writing of the academic content and articles connected with stochastic volatility modeling and other related fields. These professional materials on the internet, and other written mathematical books, have been of great importance in the research of stochastic volatility modeling together with the different methods of estimation and filtering in both discrete-time and continuous-time settings.
List of References
Blomberg, S.P., Rathnayake, S.I. and Moreau, C.M., 2020. Beyond Brownian motion and the Ornstein-Uhlenbeck process: Stochastic diffusion models for the evolution of quantitative characters. The American Naturalist, 195(2), pp.000-000.
Bufetov, A.I., Dabrowski, Y. and Qiu, Y., 2018. Linear rigidity of stationary stochastic processes. Ergodic Theory and Dynamical Systems, 38(7), pp.2493-2507.
Carlo, C.M., 2004. Markov chain Monte Carlo and Gibbs sampling. Lecture notes for EEB, 581.
Gillet, N., Barrois, O., and Finlay, C.C., 2015. Stochastic forecasting of the geomagnetic field from the COV-OBS. x1 geomagnetic field model, and candidate models for IGRF-12. Earth, Planets and Space, 67(1), p.71.
Harvey, A.C. and Fernandes, C., 1989. Time series models for count or qualitative observations. Journal of Business & Economic Statistics, 7(4), pp.407-417.
Ross, S.M., Kelly, J.J., Sullivan, R.J., Perry, W.J., Mercer, D., Davis, R.M., Washburn, T.D., Sager, E.V., Boyce, J.B. and Bristow, V.L., 1996. Stochastic processes (Vol. 2). New York: Wiley.
Roy, D.P., and Yan, L., 2020. Robust Landsat-based crop time series modeling. Remote Sensing of Environment, 238, p.110810.
Sinyavskiy, O. and Coenen, O.J., Brain Corp, 2015. Stochastic apparatus and methods for implementing generalized learning rules. U.S. Patent 9,104,186.
Wang, B., Liu, L., Huang, G.H., Li, W., and Xie, Y.L., 2016. Forecast-based analysis for regional water supply and demand relationship by hybrid Markov chain models: a case study of Urumqi, China. Journal of Hydroinformatics, 18(5), pp.905-918.
Robert, C.P. and Casella, G., 2013. Monte Carlo statistical methods. Springer Science & Business Media.
Wu, H., and Noé, F., 2020. A variational approach for learning Markov processes from time-series data. Journal of Nonlinear Science, 30(1), pp.23-66.
Appendix
gibbsBinBeta=function(n,trials,alpha,beta)
{
  # Gibbs sampler for the beta-binomial pair:
  #   x | y ~ Beta(alpha + y, beta + trials - y)
  #   y | x ~ Binomial(trials, x)
  mat=matrix(ncol=2,nrow=n)   # one row per iteration: (x, y)
  x=0
  y=0
  mat[1,]=c(x,y)
  for(i in 2:n){
    x=rbeta(1,alpha+y,beta+trials-y)
    y=rbinom(1,trials,x)
    mat[i,]=c(x,y)
  }
  mat
}
# Run the sampler, then plot the traces, histogram and autocorrelation
bvn=gibbsBinBeta(10000,80,2,3)
par(mfrow=c(3,2))
plot(ts(bvn[,1]))
plot(ts(bvn[,2]))
hist(bvn[,2],40)
acf(bvn[,2],100)
par(mfrow=c(1,1))