Archive for the ‘Teaching: PDS’ Category
Black-Scholes PDE – I: 1st (Original) Derivation
The original CAPM-based derivation of the Black-Scholes PDE
Ingredients required:
 Ito’s Lemma: Given the stochastic process $dS = \mu S\,dt + \sigma S\,dW$ for the stock price $S$, Ito’s lemma gives the stochastic process for a derivative $V(S, t)$ as:

$dV = \left(\frac{\partial V}{\partial t} + \mu S \frac{\partial V}{\partial S} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2}\right)dt + \sigma S \frac{\partial V}{\partial S}\,dW$
 CAPM: The expected return from a stock is the sum of the reward for waiting (the risk-free rate $r$) and the reward for bearing risk over and above the risk-free rate, i.e.:

$E[r_i] = r + \beta_i\,(E[r_m] - r)$

where $r_m$ is the return on the market portfolio and $\beta_i = \mathrm{Cov}(r_i, r_m)/\mathrm{Var}(r_m)$.
…
Given CAPM, the instantaneous return on the underlying follows:

$E\left[\frac{dS}{S}\right] = r\,dt + \beta_S\,(E[r_m] - r)\,dt$

And, similarly, the instantaneous return on the derivative follows:

$E\left[\frac{dV}{V}\right] = r\,dt + \beta_V\,(E[r_m] - r)\,dt$

Rewriting Ito’s Lemma in terms of $dS$ and dividing by $V$ gives:

$\frac{dV}{V} = \frac{1}{V}\left(\frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2}\right)dt + \frac{1}{V}\frac{\partial V}{\partial S}\,dS$

Dividing and multiplying by $S$ in the last term, and writing $dV/V$ and $dS/S$ respectively as $r_V$ and $r_S$, implies:

$r_V = \frac{1}{V}\left(\frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2}\right)dt + \frac{S}{V}\frac{\partial V}{\partial S}\,r_S$

Noting that the only random term on the RHS is $r_S$, plus the fact that for random variables $Y$, $Z$ and constants $a$, $b$ we have $\mathrm{Cov}(a + bY, Z) = b\,\mathrm{Cov}(Y, Z)$, taking the covariance of both sides with the market return $r_m$ allows us to write:

$\mathrm{Cov}(r_V, r_m) = \frac{S}{V}\frac{\partial V}{\partial S}\,\mathrm{Cov}(r_S, r_m)$

Finally, dividing both sides by the variance of market returns gives the following relationship between the option beta and the stock beta:

$\beta_V = \frac{S}{V}\frac{\partial V}{\partial S}\,\beta_S$

Coming back to Ito’s Lemma, we can take expectations on both sides of the expression for $r_V$ to write:

$E[r_V] = \frac{1}{V}\left(\frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2}\right)dt + \frac{S}{V}\frac{\partial V}{\partial S}\,E[r_S]$

Using the CAPM expressions for $E[r_V]$ and $E[r_S]$ in the above gives:

$r\,dt + \beta_V(E[r_m] - r)\,dt = \frac{1}{V}\left(\frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2}\right)dt + \frac{S}{V}\frac{\partial V}{\partial S}\left(r\,dt + \beta_S(E[r_m] - r)\,dt\right)$

The last step now is substituting the expression for $\beta_V$ in terms of $\beta_S$, and cancelling terms to show that:

$\frac{\partial V}{\partial t} + r S \frac{\partial V}{\partial S} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} = r V$

which is the Black-Scholes PDE.
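The PDE can be checked numerically. The sketch below (parameter values are illustrative) prices a call with the closed-form Black-Scholes formula and verifies by central finite differences that it satisfies the PDE just derived:

```python
import math

def norm_cdf(x):
    # Standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, t, K, T, r, sigma):
    # Closed-form Black-Scholes call price at time t
    tau = T - t
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)

def pde_residual(S, t, K=100.0, T=1.0, r=0.05, sigma=0.2, h=1e-3):
    # V_t + r S V_S + 0.5 sigma^2 S^2 V_SS - r V, approximated numerically;
    # it should vanish if V solves the Black-Scholes PDE
    V = lambda s, u: bs_call(s, u, K, T, r, sigma)
    V_t = (V(S, t + h) - V(S, t - h)) / (2 * h)
    V_S = (V(S + h, t) - V(S - h, t)) / (2 * h)
    V_SS = (V(S + h, t) - 2 * V(S, t) + V(S - h, t)) / h**2
    return V_t + r * S * V_S + 0.5 * sigma**2 * S**2 * V_SS - r * V(S, t)

print(abs(pde_residual(100.0, 0.5)))  # close to zero
```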
…
[PDS] Probability in Finance – Key Ideas: IV
Having defined random variables using the measure-theoretic language, to complete the basic setup we can now define familiar things like the ‘expectation/expected value’ and ‘variance’ of a random variable. Expected values are understood as weighted averages – simply sums or integrals – and in the world of probability, it turns out we need a specific kind of integral, called the Lebesgue integral.
…
Measurable Functions can be Integrated
Like Riemann integrals, the intuitive way to understand Lebesgue integrals is to think of them as the ‘area under a curve’. The way the Lebesgue integral differs from its Riemann counterpart is that it calculates area by dividing along the range of the function. Recall that the Riemann integral works by taking limits of the ‘lower sum’ and the ‘upper sum’, where the lower and upper sums are calculated as the sum of the areas of rectangles formed by considering intervals along the domain (the x-axis). The following pictures, borrowed from Shreve, show the difference:
[Source: Steven Shreve, Stochastic Calculus for Finance, Vol. II, Chapter 1; Click to zoom]
Extending the intuition from the Riemann integral then allows us to write the area under the curve taking intervals along the range (y-axis) as:

$\sum_i y_i\,\mu(A_i)$

where $A_i = f^{-1}([y_i, y_{i+1})) = \{x : y_i \le f(x) < y_{i+1}\}$ for some partition $y_0 < y_1 < \ldots$ of the range. Note that the above Lebesgue sum is defined iff one can talk about $\mu(A_i)$ meaningfully – that is, one can ‘measure’ the inverse image of the function $f$.
More formally, then, one can write the area under the curve in the Lebesgue sense iff the inverse image of $f$ is measurable. It is in this sense that Lebesgue integrals are defined.
The need for the Lebesgue integral arises when finding things like ‘expectation’ and ‘variance’. Finding the expectation or expected value involves summing over the values a random variable takes, weighted by probability. Now recall that probability is defined for events in the sample space, but random variables are functions defined on the sample space. So finding this sum is like integrating a function, i.e. the values taken by the random variable (y-axis) over the probabilities (measure) defined on events in the sample space, i.e. the σ-field (x-axis).
So measurability is a natural requirement when talking about random variables. We can find probabilities (measure) of only those values of the random variable which can happen, i.e. which belong to the σ-field generated by the sample space. Also, note that argued this way it is clear (why?) that there is no obvious way to partition the x-axis à la Riemann (probabilities of events in the sample space corresponding to the values taken by the random variable), and the only way one can integrate random variables is by starting on the y-axis (the values taken by the random variable).
A formal definition of Lebesgue integral is more than what we need at this stage, so with the intuition in place we can now move to defining expected value.
…
Expected Values as Lebesgue Integrals
Expected Value: Given a random variable $X$ on the probability space $(\Omega, \mathcal{F}, P)$, the expected value is defined as:

$E[X] = \int_\Omega X\,dP$

and it can be shown that, for a discrete random variable, it is equivalent to our familiar notion:

$E[X] = \sum_i x_i\,P(X = x_i)$

and if $X$ is continuous this changes to the familiar formula:

$E[X] = \int_{-\infty}^{\infty} x\,dF_X(x) = \int_{-\infty}^{\infty} x\,f_X(x)\,dx$

where $F_X$ is the probability distribution and $f_X$ is the probability density function associated with the random variable $X$.
At this stage a natural question is how we compute Lebesgue integrals in practice. Well, as it turns out, for most ‘nice’ and ‘well-defined’ functions the value of a Lebesgue integral is the same as that obtained by finding the integral the Riemann way (relieved?). So for most practical purposes nothing needs to change as far as our intuitive notion of expected value is concerned.
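The ‘slice along the y-axis’ idea can be seen numerically. The sketch below (illustrative, for a standard exponential random variable with $E[X] = 1$) computes the expectation both the Riemann way – integrating $x\,f(x)$ along the x-axis – and the Lebesgue-flavoured ‘layer cake’ way, adding up the measure of $\{X > y\}$ in thin horizontal slices:

```python
import math

# Riemann-style: integrate x * f(x) along the x-axis, with density f(x) = exp(-x)
dx = 1e-4
riemann = sum(x * math.exp(-x) * dx for x in [i * dx for i in range(1, 200000)])

# Lebesgue-style ("layer cake"): slice along the y-axis and add up the
# measure of {x : X > y}, which here is P(X > y) = exp(-y), in thin layers
dy = 1e-4
lebesgue = sum(math.exp(-y) * dy for y in [i * dy for i in range(1, 200000)])

print(riemann, lebesgue)  # both close to E[X] = 1
```

For this ‘nice’ function the two answers agree, as the text promises.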
[PDS] Probability in Finance – Key Ideas: III
When we do elementary probability, one of the most common setups used is the coin-toss game with the outcomes being $H$ or $T$. While it remains one of the most useful thought experiments to think systematically about chance, with the abstract outcomes as “$H$” and “$T$”, there is not much one can do with that.
For example, if one were to toss the coin many times, it would be good to get a sense of the “expected outcome” and “variations” in outcome from the coin-toss game. But with an abstract sample space such as $\{H, T\}$, it is not possible to do so.
From elementary probability, however, we also know how to get around that. The way is to assign the abstract outcomes $H$ and $T$ some numbers. Say, whenever $H$ comes, assign the number $1$ to it, and whenever $T$ comes, assign the number $0$ to it. This way, because the abstract outcomes have been converted to numbers, one can now do math with them and find things like the “expectation” and “variance” of outcomes. And these are useful things to have, as they help to summarize more complex experiments/models.
Mathematically, one can think of assigning numbers to abstract outcomes as “carrying out a function” – that of mapping abstract outcomes to “real” numbers. It turns out there is a name for this kind of “function”. Mathematicians call it a random variable. (Yes, “variable” is perhaps not the best word for “carrying out a function”, but that’s how it is for historical reasons, and we have to live with it.)
In the world of Lebesgue measure that we have been considering, it turns out random variables in probability are just an example of what are called measurable functions.
…
Random Variables
If the sample space is known, knowing the σ-field is equivalent to knowing all possible numbers associated with an experiment (random variables); the converse, however, does not hold. That is, while assigning numbers to outcomes of experiments is useful, knowing just the random variables associated with an experiment is not the same as knowing the σ-field. Consider the following examples.
Example 1 (Coin-toss): The associated sample space and σ-field are respectively $\Omega_1 = \{H, T\}$ and $\mathcal{F}_1 = \{\emptyset, \{H\}, \{T\}, \Omega_1\}$. Let the random variable $X$ assign numbers to outcomes of a single coin-toss game such that $X(H) = 1$ and $X(T) = 0$. Knowing the value of the random variable in this case is enough to tell us everything about the underlying game.
Example 2 (Die-toss): The associated sample space is $\Omega_2 = \{1, 2, 3, 4, 5, 6\}$ and the associated σ-field is the collection of all subsets of $\Omega_2$. Let the random variable $Y$ assign numbers to outcomes of the die-toss such that if the outcome is odd-numbered, the random variable assigns the value $1$ to it, and $0$ otherwise. That is, the random variable $Y$ is such that $Y(1) = Y(3) = Y(5) = 1$ and $Y(2) = Y(4) = Y(6) = 0$. Clearly, knowing the value of the random variable in this case is simply not enough to tell us about the underlying experiment, because there is no way to distinguish between, for example, the outcomes $1$ and $3$. The random variable $Y$ is just too “coarse”.
Not only that, the random variables $X$ and $Y$ are indistinguishable from each other. If only the random variables are reported, there is no way to know if the underlying experiment is a coin-toss game or a die-toss game.
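A quick simulation (illustrative; a fair coin and a fair die are assumed) makes the point: if only the 0/1 values of $X$ and $Y$ are reported, the two games produce statistically identical streams:

```python
import random

random.seed(42)
N = 100_000

# Coin toss: H -> 1, T -> 0
coin = [1 if random.choice("HT") == "H" else 0 for _ in range(N)]
# Die toss: odd -> 1, even -> 0
die = [1 if random.randint(1, 6) % 2 == 1 else 0 for _ in range(N)]

# Both report only 0s and 1s with frequency near 0.5: the reported
# values alone cannot tell the two games apart
print(sum(coin) / N, sum(die) / N)
```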
That said, in both examples one thing is clear – values of the random variables must correspond to some elements of the σ-field. This is the idea behind “measurability” – that random variable values must correspond to “something that can happen” (English-speak for members of the σ-field).
Now we are ready to introduce the idea of random variables and measurability more formally.
…
Random Variables as Lebesgue-Measurable Functions
Measurable Functions: Definition
Given a measurable set $E \in \mathcal{M}$, a function $f : E \to \mathbb{R}$ is said to be measurable if for any interval $I \subseteq \mathbb{R}$:

$f^{-1}(I) = \{x \in E : f(x) \in I\} \in \mathcal{M}$

That is, a function is measurable if its inverse image belongs to the collection $\mathcal{M}$ of Lebesgue-measurable subsets of $\mathbb{R}$. Put simply, a measurable (“nice”) function is one which is obtained from a measurable (“nice”) set.
In probability, instead of $\mathcal{M}$, as mentioned earlier, we typically encounter Borel-measurable sets. So if $f^{-1}(I) \in \mathcal{B}$, we call $f$ Borel-measurable, or simply a Borel function.
Random Variables: Definition
A random variable is a measurable function $X : \Omega \to \mathbb{R}$ such that, given a probability space $(\Omega, \mathcal{F}, P)$ and an interval $I \subseteq \mathbb{R}$:

$X^{-1}(I) = \{\omega \in \Omega : X(\omega) \in I\} \in \mathcal{F}$

What this says is that random variables are obtained from sets that belong to the σ-field – or alternatively, values of a random variable are obtained by assigning numbers to “all possible things that may happen in a game” (English-speak for subsets/elements of the σ-field $\mathcal{F}$) – of course, this is simply the act of assigning numbers to abstract outcomes as in the examples above. This definition formalizes that notion.
When there is no confusion about the underlying σ-field $\mathcal{F}$, such $\mathcal{F}$-measurable functions are often simply referred to as measurable.
…
σ-field Generated by Random Variables
The fact that random variables can often be “coarse” (like assigning only odd/even numbers to outcomes of the die-toss) gives rise to the notion of the σ-field associated with a random variable.
The σ-field associated with a random variable is the collection of subsets that can be identified by the random variable. So for the random variable $Y$ described above, the σ-field generated by $Y$ would be $\{\emptyset, \{1, 3, 5\}, \{2, 4, 6\}, \Omega_2\}$ – that is, the random variable $Y$ can only identify outcomes up to whether they are odd or even numbers.
We can make this idea more formal by technically defining the notion of the σ-field generated by a random variable.
σ-field Generated by a Random Variable: Definition
Given a probability space $(\Omega, \mathcal{F}, P)$ and a random variable $X$, the family of sets $A$ such that $A = X^{-1}(B)$ for some $B \in \mathcal{B}$:

$\sigma(X) = \{X^{-1}(B) : B \in \mathcal{B}\}$

is a σ-field, where $\mathcal{B}$ is the Borel σ-field.
In English-speak, the σ-field generated by a random variable is the smallest sub-σ-field of $\mathcal{F}$ (i.e. the smallest σ-field) that describes the random variable.
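On a finite sample space the generated σ-field can be enumerated directly: take the preimage of every set of values the random variable can take. A sketch for the die-toss variable $Y$ above (odd → 1, even → 0):

```python
from itertools import chain, combinations

omega = {1, 2, 3, 4, 5, 6}
Y = lambda w: w % 2  # odd -> 1, even -> 0

# sigma(Y): preimages Y^{-1}(B) of every subset B of the values Y can take
values = {Y(w) for w in omega}

def preimage(B):
    return frozenset(w for w in omega if Y(w) in B)

sigma_Y = {preimage(B) for B in chain.from_iterable(
    combinations(sorted(values), k) for k in range(len(values) + 1))}

print(sorted(sorted(s) for s in sigma_Y))
# [[], [1, 2, 3, 4, 5, 6], [1, 3, 5], [2, 4, 6]]
```

The result is exactly $\{\emptyset, \{1, 3, 5\}, \{2, 4, 6\}, \Omega_2\}$, as argued above.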
The last piece of formalization we need now is to describe systematically the probabilities associated with different values of the random variable.
…
Probability Distribution
The probability distribution of a random variable is the probability of elements in the σ-field generated by the random variable. (Remember that probabilities are assigned to events and not directly to random variables. So, the probability of a random variable taking some value or lying in a certain interval must correspond to some events in the σ-field.)
Consider the random variable $Y$ in our example above. The σ-field generated by it is $\{\emptyset, \{1, 3, 5\}, \{2, 4, 6\}, \Omega_2\}$, and, for a fair die, the probabilities associated with those are $0$, $1/2$, $1/2$ and $1$ – i.e. the probability is $1/2$ when the random variable takes the value $1$, $1/2$ when it takes the value $0$, and so on.
It turns out one can summarize the distribution of these probabilities associated with different values taken by the random variable simply as:

$P_X(B) = P(X^{-1}(B))$

where $B$ is any member of the Borel σ-field $\mathcal{B}$.
For the random variable $Y$, then, we can use this concise definition to again write the distribution of probabilities associated with the values taken by the random variable as:

$P_Y(B) = 1$ if the Borel set $B$ contains both $0$ and $1$;

$P_Y(B) = 0$ if the Borel set $B$ contains neither $0$ nor $1$;

$P_Y(B) = 1/2$ if the Borel set $B$ contains only $1$ but not $0$;

$P_Y(B) = 1/2$ if the Borel set $B$ contains only $0$ but not $1$;

which is what we argued intuitively.
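These four cases can be checked mechanically. The sketch below (assuming a fair die) implements $P_Y(B) = P(Y^{-1}(B))$ by summing the probabilities of the outcomes mapped into $B$:

```python
from fractions import Fraction

omega = [1, 2, 3, 4, 5, 6]
P = {w: Fraction(1, 6) for w in omega}  # fair die assumed
Y = lambda w: 1 if w % 2 == 1 else 0    # odd -> 1, even -> 0

def P_Y(B):
    """Distribution of Y: probability of the preimage Y^{-1}(B)."""
    return sum(P[w] for w in omega if Y(w) in B)

# B containing both values, neither value, only 1, only 0:
print(P_Y({0, 1}), P_Y({7}), P_Y({1}), P_Y({0}))
# 1 0 1/2 1/2
```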
[PS: Definitions above taken from Capinski and Kopp]
[PDS] Feynman-Kac Representation of BSM PDE
In practice, most people price financial derivatives by Monte Carlo simulation. However, when Black, Scholes and Merton (BSM) gave their famous (or notorious) option pricing formula, they came up with it after solving a Partial Differential Equation (PDE). So, just from that point of view, it is not immediately obvious that the price obtained via Monte Carlo simulation should be the same as that achieved by solving the PDE.
One can, of course, come up with the same result by approaching the option pricing problem from a probabilistic point of view – what is known as the ‘risk-neutral’ method – according to which the option price is the discounted expected payoff (which ultimately justifies pricing options by Monte Carlo simulation, in turn relying on the Law of Large Numbers).
So while there are these two very theoretically sound approaches to option pricing, it would be good to know if there is an underlying mathematical connection between the two approaches. It turns out there is, and in fact the result that there is a connection between the two precedes much of the development of option pricing theory.
This result was given by the famous physicist Richard Feynman and the probabilist Mark Kac, who showed that solutions to parabolic (read ‘nice’) PDEs are intimately related to conditional expectations. In what follows we lay out that connection for the specific case of the BSM PDE.
…
Feynman-Kac Representation of BSM PDE
Given the drift rate $\mu$ and the volatility $\sigma$, the Geometric Brownian Motion (GBM) for the stock price process is given by:

$dS_t = \mu S_t\,dt + \sigma S_t\,dW_t$

where $dW_t$ represents the increment of a standard Brownian Motion $W_t$. The above SDE for the stock price process can be said to be in the ‘real world’ probability measure.
Then, given a financial derivative, say a call option $V(S, t)$, Ito’s lemma gives us the Stochastic Differential Equation (SDE) for $V$ as:

$dV = \left(\frac{\partial V}{\partial t} + \mu S \frac{\partial V}{\partial S} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2}\right)dt + \sigma S \frac{\partial V}{\partial S}\,dW_t$

Setting up a hedging portfolio with one unit in $V$ and $-\Delta$ units of the stock, with $\Delta = \partial V/\partial S$, gives us the Black-Scholes-Merton PDE with the drift $\mu$ replaced by the risk-free rate $r$, i.e.:

$\frac{\partial V}{\partial t} + r S \frac{\partial V}{\partial S} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} = r V$

After delta hedging has ‘removed’ the risk of the portfolio of one unit in $V$ and $-\Delta$ units in the stock $S$, the ‘right’ stock price process to consider is the one in the ‘risk-neutral’ measure:

$dS_t = r S_t\,dt + \sigma S_t\,d\widetilde{W}_t$

(The Girsanov theorem implies that the diffusion term in the GBM for stock prices does not change when we move from the ‘real world’ to the ‘risk-neutral’ world.)
The Feynman-Kac representation tells us that PDEs of the BSM kind have an equivalent probabilistic representation. That is, Feynman-Kac assures us that one can solve for the price of the derivative either by discretizing the BSM PDE using finite difference methods, or by exploiting the probabilistic interpretation and using Monte Carlo methods.
With this as the backdrop, we are now set to write the Feynman-Kac representation for the specific case of the BSM PDE.
We start by considering the following functions:

$X_t = e^{-rt}, \qquad Y_t = V(S_t, t)$

and their differentials:

$dX_t = -r e^{-rt}\,dt$

$dY_t = dV = \left(\frac{\partial V}{\partial t} + r S \frac{\partial V}{\partial S} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2}\right)dt + \sigma S \frac{\partial V}{\partial S}\,d\widetilde{W}_t$

Recall that since in the BSM PDE the risk has been ‘hedged away’, our stock price process is in the risk-neutral world, and that is why the drift term is $r$ (instead of $\mu$) in the SDE for $Y_t$.
Next we consider the differential of the product $X_t Y_t$, i.e. $d(e^{-rt} V)$:

$d(e^{-rt} V) = e^{-rt}\,dV - r e^{-rt} V\,dt = e^{-rt}\left(\frac{\partial V}{\partial t} + r S \frac{\partial V}{\partial S} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} - r V\right)dt + e^{-rt}\sigma S \frac{\partial V}{\partial S}\,d\widetilde{W}_t$

where the last step in the above equation follows directly from Ito’s lemma (the product rule; the cross term $dX_t\,dY_t$ vanishes because $dX_t$ has no diffusion part). Now the BSM PDE tells us that:

$\frac{\partial V}{\partial t} + r S \frac{\partial V}{\partial S} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} - r V = 0$

With this we can simplify $d(e^{-rt} V)$ as:

$d(e^{-rt} V) = e^{-rt}\,\sigma S \frac{\partial V}{\partial S}\,d\widetilde{W}_t$

That is, the change in the function $e^{-rt} V$ is a driftless SDE. We integrate both sides from $t$ to $T$ to give:

$e^{-rT} V(S_T, T) - e^{-rt} V(S_t, t) = \int_t^T e^{-ru}\,\sigma S_u \frac{\partial V}{\partial S}\,d\widetilde{W}_u$

Then taking expectations on both sides w.r.t. the filtration at time $t$, and using the fact that stochastic integrals are martingales (so the RHS of the equation above has conditional expectation zero), gives:

$e^{-rt} V(S_t, t) = \widetilde{E}\left[e^{-rT} V(S_T, T)\,\middle|\,\mathcal{F}_t\right]$

Substituting back the original expressions for $X_t Y_t$ at times $t$ and $T$, i.e. $e^{-rt} V(S_t, t)$ and $e^{-rT} V(S_T, T)$, gives:

$V(S_t, t) = e^{-r(T - t)}\,\widetilde{E}\left[V(S_T, T)\,\middle|\,\mathcal{F}_t\right]$

and we are done!
That is, the BSM PDE implies that the price of the derivative at time $t$ is equivalent to the discounted value of the expected payoff at expiration (time $T$). This is the famous Feynman-Kac representation.
And this is why pricing derivatives via Finite Difference methods (by discretizing the PDE) is mathematically equivalent to pricing them using Monte Carlo methods (taking expectations).
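The equivalence can be illustrated numerically: below, a hypothetical call option is priced once with the closed-form solution of the BSM PDE and once as a discounted expected payoff by Monte Carlo under the risk-neutral GBM (parameter values are illustrative):

```python
import math, random

def bs_call(S0, K, T, r, sigma):
    # Closed-form solution of the BSM PDE for a European call
    d1 = (math.log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    N = lambda x: 0.5 * (1 + math.erf(x / math.sqrt(2)))
    return S0 * N(d1) - K * math.exp(-r * T) * N(d2)

def mc_call(S0, K, T, r, sigma, n=200_000, seed=1):
    # Discounted expected payoff under the risk-neutral measure
    rng = random.Random(seed)
    payoff = 0.0
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        # terminal stock price: exact solution of the risk-neutral GBM
        ST = S0 * math.exp((r - 0.5 * sigma**2) * T + sigma * math.sqrt(T) * z)
        payoff += max(ST - K, 0.0)
    return math.exp(-r * T) * payoff / n

pde_price = bs_call(100, 100, 1.0, 0.05, 0.2)
mc_price = mc_call(100, 100, 1.0, 0.05, 0.2)
print(pde_price, mc_price)  # the two agree up to Monte Carlo error
```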
[PDS] Probability in Finance: Key Ideas – II
σ-field on $\Omega$
Just like the collection $\mathcal{M}$ of Lebesgue-measurable sets represented ‘nice’ subsets of $\mathbb{R}$, for a general sample space $\Omega$ (as in probability ‘experiments’) one can also think of a collection of ‘nice’ subsets of $\Omega$ in a similar vein.
The analogue of $\mathcal{M}$ on $\mathbb{R}$ is $\mathcal{F}$ on $\Omega$. $\mathcal{F}$ has the same properties as $\mathcal{M}$, i.e. it is closed under ‘complements’ and ‘(countable) unions’:
1. $\Omega \in \mathcal{F}$
2. $A \in \mathcal{F} \implies \Omega \setminus A \in \mathcal{F}$
3. $A_1, A_2, \ldots \in \mathcal{F} \implies \bigcup_{n} A_n \in \mathcal{F}$
and defines what is called a σ-field on $\Omega$.
A σ-field need not comprise all subsets of $\Omega$ – it could be defined even for a sub-collection of the subsets of $\Omega$, as long as it satisfies all the above properties.
For example, in a ‘die-toss’ experiment with sample space $\Omega = \{1, 2, 3, 4, 5, 6\}$, the collection $\mathcal{G} = \{\emptyset, \{1, 3, 5\}, \{2, 4, 6\}, \Omega\}$ is an example of a σ-field generated by the subsets $\{1, 3, 5\}$ and $\{2, 4, 6\}$.
Needless to say, one can come up with other σ-fields on $\Omega$ which are ‘larger’ than $\mathcal{G}$. For example, consider, say, the collection $\{\{1\}, \{2, 4, 6\}\}$ and the σ-field generated by it. Clearly it would be a larger σ-field than $\mathcal{G}$, because it would contain not only the elements of $\mathcal{G}$ (e.g. $\{1\} \cup \{3, 5\} = \{1, 3, 5\}$) but also some more (e.g. $\{1\}$ and $\{3, 5\}$ themselves).
There is a nice result pertaining to σ-fields which says that for any given collection $\mathcal{C}$ of subsets of $\Omega$, there exists a smallest σ-field that contains $\mathcal{C}$. For example, $\mathcal{G}$ above is the smallest σ-field containing $\{\{1, 3, 5\}, \{2, 4, 6\}\}$.
The smallest σ-field containing $\mathcal{C}$ is then referred to as the σ-field generated by $\mathcal{C}$.
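For a finite $\Omega$, the smallest σ-field containing a collection can be computed by brute force: keep closing under complements and unions until nothing new appears. A sketch, applied to the die-toss example:

```python
def generated_sigma_field(omega, collection):
    """Smallest family containing `collection`, closed under complement
    and union; on a finite omega this is the generated sigma-field."""
    omega = frozenset(omega)
    F = {frozenset(), omega} | {frozenset(A) for A in collection}
    while True:
        new = {omega - A for A in F} | {A | B for A in F for B in F}
        if new <= F:          # closed: nothing new was produced
            return F
        F |= new

F = generated_sigma_field({1, 2, 3, 4, 5, 6}, [{1, 3, 5}])
print(sorted(sorted(A) for A in F))
# [[], [1, 2, 3, 4, 5, 6], [1, 3, 5], [2, 4, 6]]
```

Starting from $\{1, 3, 5\}$ alone, the closure already produces the four-element σ-field described in the text.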
…
Borel σ-field and Borel Sets
Although $\mathcal{M}$ is a collection of ‘nice’ subsets of $\mathbb{R}$, it turns out it is often still too large for our purpose (of measuring probabilities). What is often required is not $\mathcal{M}$, but some nice σ-field like $\mathcal{M}$ – perhaps smaller than $\mathcal{M}$ – that contains all intervals (closed, open, semi-open/semi-closed – all kinds).
As pointed out earlier, $\mathcal{M}$ contains all intervals and also all null sets. If we apply the result mentioned above to the collection of all intervals, then we know that there must exist a smallest σ-field containing all intervals. Since all intervals are part of $\mathcal{M}$, that σ-field is ‘included’ in $\mathcal{M}$.
Indeed, such a σ-field exists; the smallest σ-field containing all intervals is called the Borel σ-field $\mathcal{B}$. The elements of $\mathcal{B}$ are called Borel sets.
For most purposes in probability, the Borel σ-field $\mathcal{B}$, it turns out, is good enough. So we may get by defining measures on this ‘smaller’ σ-field instead of $\mathcal{M}$.
…
Restricting Lebesgue Measure
We have so far defined measures of the kind $m : \mathcal{M} \to [0, \infty]$, but we know from our intuitive understanding of probability that a probability measure must lie between $0$ and $1$. So the last piece of machinery we need is something that allows us to ‘restrict’ Lebesgue measure to a Lebesgue-measurable subset $\Omega \in \mathcal{M}$.
Given the measure space $(\mathbb{R}, \mathcal{M}, m)$, the following construction ‘restricts’ Lebesgue measure to a Lebesgue-measurable subset $\Omega \in \mathcal{M}$:

$\mathcal{M}_\Omega = \{A \cap \Omega : A \in \mathcal{M}\}$

such that

$m_\Omega(A) = m(A) \quad \text{for } A \in \mathcal{M}_\Omega$

The triple $(\Omega, \mathcal{M}_\Omega, m_\Omega)$ is then a (complete) measure space.
…
Probability Space
The fact that the above ‘restriction’ of Lebesgue measure results in a measure space now allows us to define a probability measure over arbitrary spaces without worrying about whether we can ‘restrict’ that measure to $[0, 1]$.
Probability Space: Definition
A probability space is a triple $(\Omega, \mathcal{F}, P)$, where $\Omega$ is an arbitrary set (the ‘sample space’), $\mathcal{F}$ is a σ-field of subsets of $\Omega$ (i.e. elements of $\mathcal{F}$ are all possible ‘events’), and $P$ is a measure on $\mathcal{F}$:

$P : \mathcal{F} \to [0, 1], \qquad P(\Omega) = 1$

called the probability measure or simply probability.
(Definitions in this post, as in the previous one, are taken from Capinski and Kopp.)
With an abstract measure space, one can always assign the measure to lie between $0$ and $1$ depending on the nature of the experiment.
But in the case when $\Omega$ is a Lebesgue-measurable subset of $\mathbb{R}$, the measures/lengths may indeed be larger than $1$. The fact that one can ‘restrict’ measures to Lebesgue-measurable subsets of $\mathbb{R}$ affords us a way out. This is done by writing the probability for any subset $A \in \mathcal{M}_\Omega$ as:

$P(A) = \frac{m(A)}{m(\Omega)}$

where $0 < m(\Omega) < \infty$.
The measure $P$ as defined above is a (rescaled) restriction of $m$ to $\mathcal{M}_\Omega$, and we are guaranteed that $(\Omega, \mathcal{M}_\Omega, P)$ is a measure/probability space.
Note that as defined above, a probability measure need not have any physical meaning attached to it. But the construction above ensures that it can handle all kinds of events and sample spaces that we may encounter when dealing with arbitrary (and often infinite) sample spaces.
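A concrete instance of this construction: take $\Omega = [0, 2]$, whose Lebesgue measure is $2 > 1$, and rescale by $m(\Omega)$. The sketch below (illustrative) computes $P([0.5, 1.25]) = 0.75/2$ both exactly and by sampling uniformly from $\Omega$:

```python
import random

# Omega = [0, 2]: Lebesgue measure m(Omega) = 2 > 1, so normalize:
# P(A) = m(A) / m(Omega) -- the 'restricted' and rescaled measure
def P(a, b, omega_len=2.0):
    return (b - a) / omega_len

# Monte Carlo sanity check of P([0.5, 1.25]) = 0.75 / 2 = 0.375
random.seed(0)
N = 100_000
hits = sum(1 for _ in range(N) if 0.5 <= random.uniform(0.0, 2.0) <= 1.25)
print(P(0.5, 1.25), hits / N)  # both near 0.375
```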
…
In the following post we take a look at the idea of Lebesgue-measurable functions, which will take us to the important notion of random variables.
[PDS] Probability in Finance: Key Ideas – I
Need for a Mathematical Theory of Probability
One of the reasons we need a mathematical (read: measure-theoretic) foundation of probability is that when dealing with infinite sample spaces (choosing a number at random in $[0, 1]$, for example), there is no immediately obvious way of assigning probabilities to ‘not very likely’ events. Let me elaborate.
While it is not difficult to understand that when selecting a number at random from $[0, 1]$, the probability that it lies in $[0, 1/2]$ would be $1/2$, it is not immediately obvious what would be the probability that the number selected is one of a given countable sequence of points $x_1, x_2, x_3, \ldots$ or, say, a rational number in $[0, 1]$.
Both sets – the sequence of points and the rationals $\mathbb{Q} \cap [0, 1]$ – are countable and, while seemingly big, are yet ‘too small’ compared to all the points in $[0, 1]$. Also, the set of rational numbers is not even an interval in the way, say, $[0, 1/2]$ is.
The fact that there are subsets of the real line which are not intervals, or are not ‘nice’ subsets (e.g. the Cantor set), means that the notion of length as the distance between two points is not enough. We need a ‘better scale’, if you will, that allows us to measure the length of sets like the Cantor set and identify the ‘smallness’ of countable sets.
This ‘better scale’ that we are looking for is known as the Lebesgue Measure. (Of course, there are many other advantages of using the idea of measure than to just assign probabilities, but that needn’t concern us for now.)
But before we lay out the properties of this scale, we need some new machinery.
…
Null Sets
Just like the development of numbers begins with defining the number $0$, the development of a theory of measure begins with defining sets which are negligible – alternatively, null sets.
Null sets are those which have measure $0$ according to our new scale. Knowing the sets that have measure $0$ then allows us to identify sets that have a ‘finite length’.
Null Sets: Definition
A set $A \subseteq \mathbb{R}$ is null if it can be covered by a sequence of intervals of arbitrarily small total length, i.e. $A$ is null if, given an arbitrarily small $\varepsilon > 0$, there is a sequence of intervals $I_1, I_2, \ldots$ such that:

$A \subseteq \bigcup_{n=1}^{\infty} I_n \quad \text{and} \quad \sum_{n=1}^{\infty} l(I_n) < \varepsilon$

The definition says that an arbitrary set is null if the total length of the sets which cover it is ‘very small’. This implies that any countable set $A = \{x_1, x_2, x_3, \ldots\}$ is null, as:

$A \subseteq \bigcup_{n=1}^{\infty} [x_n, x_n]$

Since the length of the closed interval $[x_n, x_n]$ is $0$, the length of the set $A$ must also be zero (it is a countable sum of zero-length closed intervals). This suggests that, given the way we have defined null sets, all countable sets – including the set of rational numbers $\mathbb{Q}$ – are null sets, that is, they have measure $0$.
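The cover-by-small-intervals argument can also be made concrete with non-degenerate intervals: cover the $n$-th point of a countable set with an interval of length $\varepsilon/2^{n+1}$, so the total length stays below $\varepsilon$ no matter how many points are covered. A sketch over (some of) the rationals in $[0, 1]$:

```python
from fractions import Fraction

# Enumerate some rationals in [0, 1]; cover the n-th one by an interval
# of length eps / 2^(n+1). The total length stays below eps regardless
# of how many points we cover (the geometric series sums to eps).
eps = Fraction(1, 1000)
rationals = [Fraction(p, q) for q in range(1, 30) for p in range(q + 1)]

total = sum(eps / 2**(n + 1) for n in range(len(rationals)))
print(len(rationals), float(total), float(eps))  # total < eps
```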
…
Outer Measure
The notion of covering/approximating a set by a sequence of intervals turns out to be an important and very useful step in constructing a theory of measures. Continuing with the notion of covers, first we define what is called the Outer Measure.
Outer Measure: Definition
The Outer Measure of a set $A \subseteq \mathbb{R}$ is denoted by $m^*(A)$ and is given by:

$m^*(A) = \inf\left\{\sum_{n=1}^{\infty} l(I_n) : A \subseteq \bigcup_{n=1}^{\infty} I_n\right\}$

Intuitively, this definition says the length (Outer Measure) of a set is the smallest total length of all the intervals that cover the set.
The reason for developing a ‘new scale’ (measure) was so that we could measure all kinds of arbitrary subsets of $\mathbb{R}$. But having done so, the least we would expect is that for intervals the measure gives us the same answer as their ‘length’. Indeed – all the expected/intuitive properties of length are preserved by Outer Measure. Below we list its important properties (taken directly from Capinski and Kopp, including the notation):
Outer Measure: Properties
1. $A$ is null iff $m^*(A) = 0$
2. If $A \subseteq B$ then $m^*(A) \le m^*(B)$ (monotonicity)
3. The Outer Measure of an interval equals its length, i.e. $m^*([a, b]) = b - a$
4. Outer Measure is countably subadditive, i.e. for a sequence of sets $A_1, A_2, \ldots$ (not necessarily disjoint): $m^*\left(\bigcup_n A_n\right) \le \sum_n m^*(A_n)$
While this is great, the fourth property of Outer Measure gives us some cause for concern. The problem is that it doesn’t guarantee that the measure of a union of disjoint sets adds up to the sum of the measures of the individual sets. And what do we mean by that?
Consider two disjoint intervals $A$ and $B$; then we should expect the set $A \cup B$ to have length $m^*(A) + m^*(B)$. While for this example the value of the Outer Measure respects our intuition of length, in general this additivity property is not true of the Outer Measure for all subsets of $\mathbb{R}$. And we do not like that!
Ok, admittedly, this is only the case for really nasty sets, but for now (and in probability) we do not want to work with such nasty sets. So we are looking for those subsets of $\mathbb{R}$ for which the Outer Measure is additive for disjoint sets, i.e. we only want to work with those subsets $A$, $B$ of $\mathbb{R}$ for which, if $A \cap B = \emptyset$, then:

$m^*(A \cup B) = m^*(A) + m^*(B)$

The collection of all subsets of $\mathbb{R}$ for which this additivity property holds (along with all the other properties of Outer Measure) is called the Lebesgue-measurable sets and is denoted by $\mathcal{M}$.
…
Lebesgue Measure
The collection of subsets $\mathcal{M}$ is important enough to warrant a different notion of measure for which the additivity property holds. The Outer Measure with the additivity property for sets in $\mathcal{M}$ is called the Lebesgue Measure on $\mathbb{R}$. We use the notation $m$ for the Lebesgue measure and write:

$m(A) = m^*(A) \quad \text{for } A \in \mathcal{M}$

The technical definition of Lebesgue Measure is more than what we need at this stage and can be found, for example, in Capinski and Kopp. For us, thinking of the Lebesgue measure as an Outer Measure with the additivity property is about enough. Needless to say, all the other properties of the Outer Measure carry through to the Lebesgue Measure.
…
Lebesgue-Measurable Sets
The collection $\mathcal{M}$ is clearly a ‘nice’ collection of subsets of $\mathbb{R}$. Other than the additivity property of the Lebesgue measure for subsets in $\mathcal{M}$, its members have some other notable ‘nice’ properties:
1. $\mathbb{R} \in \mathcal{M}$
2. $A \in \mathcal{M} \implies \mathbb{R} \setminus A \in \mathcal{M}$
3. $A_1, A_2, \ldots \in \mathcal{M} \implies \bigcup_{n} A_n \in \mathcal{M}$
The above properties just say that when one does simple operations (taking ‘complements’ and ‘unions’) on sets belonging to $\mathcal{M}$, we remain in $\mathcal{M}$, i.e. doing simple operations on sets in $\mathcal{M}$ does not take us ‘out of’ $\mathcal{M}$ – we remain in the ‘nice’ world of $\mathcal{M}$.
(Those familiar with the notion of σ-fields would recognize that $\mathcal{M}$ forms a σ-field on $\mathbb{R}$.)
…
Measure Space
We began with the set $\mathbb{R}$. We considered ‘nice’ subsets of $\mathbb{R}$ which were Lebesgue-measurable (in the sense above) and called the collection of all such subsets the Lebesgue-measurable sets $\mathcal{M}$. So we have three things now: the underlying set $\mathbb{R}$, the collection of Lebesgue-measurable subsets $\mathcal{M}$, and the Lebesgue measure $m$. For the sake of brevity we often call this ‘triple’:

$(\mathbb{R}, \mathcal{M}, m)$

the measure space.
At this stage it is also useful to explicitly identify the Lebesgue measure as a ‘scale’ that assigns each subset in $\mathcal{M}$ a ‘length’ (measure). Given our understanding of a function, this is what the Lebesgue measure is: $m$ is a function that assigns to each subset in $\mathcal{M}$ a number between $0$ (for null sets) and $\infty$ (for unbounded sets like $\mathbb{R}$ itself), i.e.:

$m : \mathcal{M} \to [0, \infty]$
…
Next we extend the idea of a measure space to abstract spaces, replacing $\mathbb{R}$ by $\Omega$, $\mathcal{M}$ by $\mathcal{F}$ (a σ-field on $\Omega$), and the Lebesgue measure $m$ by a measure $P$ called the probability measure. The resulting space $(\Omega, \mathcal{F}, P)$ is then called a probability space.
[PS: Much of the discussion in this post summarizes the treatment of measuretheoretic ideas as in Capinski and Kopp]
[PDS] Chain Rule in Ito Calculus
Chain Rule in Ito Calculus
Given two stochastic processes $X_t$ and $Y_t$ driven by different Brownian Motions $W_1$ and $W_2$ as:

$X_t = X_0 + \int_0^t a_1\,ds + \int_0^t b_1\,dW_1(s), \qquad Y_t = Y_0 + \int_0^t a_2\,ds + \int_0^t b_2\,dW_2(s)$

or alternatively, writing them in ‘shorthand’ (their SDE form) as:

$dX_t = a_1\,dt + b_1\,dW_1, \qquad dY_t = a_2\,dt + b_2\,dW_2$

Then Ito’s lemma in 2D tells us that a function $f(X_t, Y_t)$ will satisfy:

$df = \frac{\partial f}{\partial x}\,dX_t + \frac{\partial f}{\partial y}\,dY_t + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}\,(dX_t)^2 + \frac{1}{2}\frac{\partial^2 f}{\partial y^2}\,(dY_t)^2 + \frac{\partial^2 f}{\partial x \partial y}\,dX_t\,dY_t$

Given $f(x, y) = xy$, the following will hold:

$\frac{\partial f}{\partial x} = y, \quad \frac{\partial f}{\partial y} = x, \quad \frac{\partial^2 f}{\partial x^2} = \frac{\partial^2 f}{\partial y^2} = 0, \quad \frac{\partial^2 f}{\partial x \partial y} = 1$

With this we can now simplify the expression for $d(X_t Y_t)$ as:

$d(X_t Y_t) = Y_t\,dX_t + X_t\,dY_t + dX_t\,dY_t$

This describes the Chain Rule (product rule) in Ito calculus.
…
We can, of course, further simplify the above. Using $dt\,dt = dt\,dW_i = 0$ and $dW_1\,dW_2 = \rho\,dt$, the cross term becomes $dX_t\,dY_t = \rho\,b_1 b_2\,dt$, and so:

$d(X_t Y_t) = Y_t\,dX_t + X_t\,dY_t + \rho\,b_1 b_2\,dt$

where $\rho$ is the correlation between the two Brownian Motions $W_1$ and $W_2$.
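The rule $dW_1\,dW_2 = \rho\,dt$ can be checked by simulation: the sketch below ($\rho = 0.6$ is illustrative) accumulates the products of correlated Brownian increments over $[0, 1]$ and recovers $\rho T$:

```python
import math, random

random.seed(7)
rho, T, n = 0.6, 1.0, 200_000
dt = T / n

cov_sum = 0.0
for _ in range(n):
    # Build correlated increments from two independent normals
    z1 = random.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1 - rho**2) * random.gauss(0.0, 1.0)
    dW1 = math.sqrt(dt) * z1
    dW2 = math.sqrt(dt) * z2
    cov_sum += dW1 * dW2  # accumulate dW1 * dW2 over [0, T]

print(cov_sum)  # close to rho * T = 0.6
```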