It is assumed that the reader more or less understands the continuation-passing style. The first thing that the Cont monad does is to abstract away the need to pass continuations explicitly in continuation-passing style.
The Cont monad abstracts away the handling of continuations, by creating a series of “partially applied” (I’m using this quite loosely) functions that are all waiting for their continuation. The add function was originally add x y, and add_cps became add_cps x y k. This meant that add_cps explicitly took a continuation function k.
What happens if we have this?
-- add add :: Int -> Int -> Int add x y = x + y add_cps :: Int -> Int -> (Int -> r) -> r add_cps x y k = k (add x y) add_abstracted_cps :: Int -> Int -> ((Int -> r) -> r) add_abstracted_cps x y = (\k -> k (add x y))
We may see add_abstracted_cps as a “partially applied” function of add_cps, and the missing argument is the continuation function. Hence, we can define a bunch of such “abstracted” functions which are partially applied functions.
Notice that the type signature for the “abstracted” version of the function is exactly the same as that of the cps version, only that due to partial application, it returns a function that takes a continuation and returns a type r. This is made explicit using parentheses, though it’s not necessary (the parentheses part). Put another way, we are just converting between two equivalent forms using lambda expressions, as follows:
f :: Int -> Int f x = x+1 -- This is equivalent to this, both f = \x -> x+1 -- in the function body and in the type signatureThe above are equivalent.
Now, back to partial application, suppose we have the following:
add :: Int -> Int -> Int add x y = x + y add_cps :: Int -> Int -> (Int -> r) -> r add_cps x y k = k (add x y) add_abstracted_cps :: Int -> Int -> ((Int -> r) -> r) add_abstracted_cps x y = (\k -> k (add x y))
Now, add_abstracted_cps is just add_cps partially applied, and is just missing the continuation before it can be evaluated (i.e. it returns a function that expects k, the continuation). However, so far the use of the words “partially applied” here is a bit loose, in the sense that you’ll notice that add_abstracted_cps still has the variables x and y, and those have certainly not been applied. We’ll see why I keep saying partially applied in just a moment. However, for now, let’s do the same for all of the functions and put them all together:
-- add
add :: Int -> Int -> Int
add x y = x + y
add_cps :: Int -> Int -> (Int -> r) -> r
add_cps x y k = k (add x y)
add_abstracted_cps :: Int -> Int -> ((Int -> r) -> r)
add_abstracted_cps x y = (\k -> k (add x y))
-- square
square :: Int -> Int
square x = x * x
square_cps :: Int -> (Int -> r) -> r
square_cps x k = k (square x)
square_abstracted_cps :: Int -> ((Int -> r) -> r)
square_abstracted_cps x = (\k -> k (square x))
-- pythagoras
pythagoras :: Int -> Int -> Int
pythagoras x y = add (square x) (square y)
pythagoras_cps :: Int -> Int -> (Int -> r) -> r
pythagoras_cps x y k = square_cps x (\x_squared ->
square_cps y (\y_squared ->
add_cps x_squared y_squared (\sum_of_squares ->
k sum_of_squares)))
pythagoras_abstracted_cps :: Int -> Int -> ((Int -> r) -> r)
pythagoras_abstracted_cps x y = (\k ->
square_cps x (\x_squared ->
square_cps y (\y_squared ->
add_cps x_squared y_squared (\sum_of_squares ->
k sum_of_squares))))
Now, suppose that we fully apply the *_abstracted_cps functions, we are going to get a function that expects k, the continuation. If we apply k to that, then we can “evaluate” the function body. Hence, *_abstracted_cps when fully applied, is just *_cps that has been partially applied just up to (but not including) the continuation parameter, k. Hence my use of the words “partially applied” earlier. Thank you for tolerating my loose use of those words.
Hence, to reiterate, the *_abstracted_cps functions, after that have been evaluated (by applying all the required parameters), returns a computation (a function) that requires a continuation function in order to fully evaluate. That continuation is received as the first (and only) paramter to the function. We’re going to be seeing this kind of function a lot, hence, we’ll give it a more intuitive name. Let’s call them waiting functions. Thus, waiting functions are functions that are waiting on a continuation, as the first and only parameter.
The Cont monad
Clearly, the fact that we have to handle the continuation function explicitly can be abstracted cleanly away using a monad. This is implemented in the Cont monad. Let’s take a look at the definition first. We are going to see some very big similarities. However, the first thing that we need to constantly keep in mind, is what exactly the monad represents. Here goes: the Cont monad “stores”, as the inner value of the monadic value, the fully applied *_abstracted_cps functions. That’s our waiting function. Here is it again:
The monadic value “contains” the computation that is just waiting for its continuation (the waiting function).
The Cont monad is defined in Control.Monad.Cont, hence the following line is necessary:
import Control.Monad.Cont
Here’s the data definition of Cont:
newtype Cont r a = Cont { runCont :: ((a -> r) -> r) }
First off, the type of the monadic value of the Cont monad represents ((a -> r) -> r), which looks very familiar, since it’s the type signature of the waiting function. It’s pretty much the rear end of all our cps function type signatures. The continuation itself is given by the type (a -> r), and as you can see, represents the first and only parameter of the waiting function. The waiting function, with a continuation as its argument evaluates to a value of type r, since the continuation returns a result of type r. Thus, a Cont “contains” a waiting function.
Hence, this would work:
Cont (pythagoras_abstracted_cps 100 200)
return and >>=
Now let’s take a look at how the Cont monad defines return and >>=.
instance Monad (Cont r) where return a = Cont (\k -> k a) mv >>= f = Cont (\k -> runCont mv (\a -> runCont (f a) k))
return is simply the continuation version of identity. Consider the following:
id :: Int -> Int id x = x id_cps :: Int -> (Int -> r) -> r id_cps x k = k x id_abstracted_cps :: Int -> (Int -> r) -> r id_abstracted_cps x = (\k -> k x)
That’s exactly return in the Cont monad, with the addition of applying the monad’s constructor to the result, of course. It creates a computation that is going to return just plain x (its argument), once it has its continuation supplied. Sounds familiar? :)
>>= is just a little bit more complicated. Remember that the monadic values contains the computations just waiting for their continuation. Now suppose we have more than one of these functions. For instance, we want to combine the continuation versions of both add and square together.
square_cont 4 >>= add_cont 200
Before examining how >>= works in the Cont monad, remember that mv >>= f embodies the concept of extracting the inner value of mv, and somehow relate f to the inner value. The inner value is the computation waiting for its continuation (call this c), and f is a function that is going to take the result of the computation of c (which is bound to a, by the definition of continuations), and produce a new computation that is waiting for its continuation (call this c’).
Okay that was a mouthful. Let’s go through this step by step. First of all, remember that the mechanism, or the convention if you will, of functions written in continuation-passing style, is that we have a function that contains some code, and that function takes, as an “extra” argument, a continuation, to which it passes the result. This means that the function’s code itself, intuitively, will not contain any references to the continuation function passed into it, with the exception of the last “line” of the function which calls that continuation function. Look at the following function yet again:
add :: Int -> Int -> Int add x y = x + y add_cps :: Int -> Int -> (Int -> r) -> r add_cps x y k = k (add x y)
The only use of “k” is to call the continuation, and this is the last action. Continuations are defined as such, since the idea is to pass control to the continuation function. We can thus think of the function body being completely independent of the continuation, except for the final pass.
Again, by convention, the continuation is going to take the “result” of the function body as its argument. In this case, k is going to take a single argument, with the result of add x y.
Now we are ready to look back at the definition for >>=:
newtype Cont r a = Cont { runCont :: ((a -> r) -> r) }
instance Monad (Cont r) where
return a = Cont (\k -> k a)
mv >>= f = Cont (\k -> runCont mv (\a -> runCont (f a) k))
mv contains the computation (waiting for its continuation), which we extract using the runCont helper function. Hence, runCont mv produces that computation function (as we just said). It’s waiting for its continuation, so we pass it the continuation (\a -> runCont (f a) k). Remember that the definition of continuations is such that body of the computation can be “evaluated” to produce a “result”, independent of the continuation? That “result”, again by definition, is going to be passed into the continuation, which takes it in as a single argument. In other words, the “result” of the body of the computation is going to be bound to a in the continuation. This represents the notion of “evaluating” mv to produce its “result”.
Then we have the monadic function f. We recall that monadic functions are going to take a value somehow extracted (and perhaps tweaked a little) from the monadic value mv, and produce a monadic value. In our case of the Cont monad, what f is expecting is the result of the computation in mv. We hence give it exactly that, and that result was bound to a. The output, and therefore the result of (f a) is a monadic value, which is a new computation waiting for its continuation, again. We extract that computation using runCont (again), give it a continuation, k (which won’t evaluate because it’s waiting for k, see the lambda), and that completes the process. This represents the notion of “evaluating” f to produce the final “result”.
Note that we’ve been using the word “evaluate” and “result” again quite (okay, very) loosely. Nothing gets “evaluated” in mv >>= f, because we’re dealing with computations, or the notion of computations, depending on how you want to see it. Remember, Haskell is lazy. All that’s there is a promise to evaluate, when it’s required. Hence, the final “result” is nothing more than a large computation that is still waiting on a continuation — the top level continuation. Once we supply this does the entire computation have what is required to finally and truly evaluate, and when it does, it actually produces a result.
Let’s finally use the Cont monad:
-- add
add_cont :: Int -> Int -> Cont r Int
add_cont x y = return (add x y)
{- This creates the computation waiting for its continuation:
Cont (\k -> k (add x y)) -}
-- square
square_cont :: Int -> Cont r Int
square_cont x = return (square x)
{- This creates the computation waiting for its continuation:
Cont (\k -> k (square x)) -}
-- pythagoras
pythagoras_cont :: Int -> Int -> Cont r Int
pythagoras_cont x y =
square_cont x >>= (\x_squared ->
square_cont y >>= (\y_squared ->
add_cont x_squared y_squared >>= (\sum_of_squares ->
return sum_of_squares)))
-- pythagoras (with do-notation)
pythagoras_cont':: Int -> Int -> Cont r Int
pythagoras_cont' x y = do
x_squared <- square_cont x
y_squared <- square_cont y
sum_of_squares <- add_cont x_squared y_squared
return sum_of_squares
test1 = runCont (pythagoras_cont 3 4) print
test2 = runCont (pythagoras_cont' 3 4) (+100)
So far, we've understood how to write in continuation-passing style, and how the Cont monad takes care of turning functions into continuation-passing style by turning all functions in the Cont monad into computations that expect a continuation. Hence, all the functions that we define in the Cont monad are such expect-a-continuation functions. It also defines the >>= operator to compose such computations, and the >>= operator takes care of passing the right continuations through the composed functions, making everything nice and implicit.
A note on the "right" continuations
Observe that the "right" continuations change (obviously) as the computations "progress". Because we are composing computations using the monadic >>= bind operator, and because of how the Cont monad defines >>=, the "right" continuation is always going to contain the computations on the right hand side of >>=. In other words, given mv >>= f, when "evaluating" mv, the continuation to mv is going to contain the computation represented by f. Given mv >>= f >>= g, then the continuation to mv contains the computation of f (and not g), and the continuation to f contains the computation of g. What then is contained in the continuation of g? It's going to be the top-level continuation, the one that we pass in when we say runCont (...) print. In this example, the top-level continuation is print.
Note that if we write our code slightly differently (though equivalently), we get something a little bit different (but equivalent). Consider the code:
mv >>= (\x -> (f x) >>= (\y -> (g y)))
Now, the continuation to mv is going to contain the computation of (\x -> (f x) >>= (\y -> (g y))), and the continuation to (f x) is going to contain the computation of (\y -> (g y)). In this way, we can see things in a slightly different light. The continuation at any point is always the current-continuation, where the current-continuation represents "the rest of the program". In the case of Haskell, because we're trapped in the Cont monad, it represents "the rest of the monadic computation". However, the two forms are no different in practice.
Gaining even more control
Now, can we obtain even more control over our continuations, "jumping" to continuations whenever we want (and not just, as we've seen, at the end of a function under the Cont monad), and pass whatever we want into the continuation (and not just, as we've again seen, the currently computed value)? Indeed so, after all, what's stopping us from writing code that calls the continuation, k, at any point that we like? This is trivially simple in the continuation-passing style, for we have direct access to the continuations itself. For instance, we could call k under a condition, and do something else when the condition is false, as follows:
pythagoras_cps' :: Int -> Int -> (Int -> r) -> r
pythagoras_cps' x y k = square_cps x (\x_squared ->
square_cps y (\y_squared ->
add_cps x_squared y_squared (\sum_of_squares ->
if (sum_of_squares > 100) then k (sum_of_squares-10)
else k sum_of_squares)))
test3 = pythagoras_cps' 10 10 print
test4 = pythagoras_cps' 1 1 print
Hence, a close analogy in imperative languages is the return statement (not the monadic return function), whereby we use return to "exit" a function, and pass return an argument to, well, return. Similarly, in continuations, since we are "returning" to the continuation, we can simply call the continuation to exit, and pass the continuation an argument to, well, continue with.
However, how about when using the Cont monad? Remember that the Cont monad "hides" the continuation, by turning the computation (e.g. the pythagoras function) into what we called a waiting function. That means that we don't have direct access to the continuation, which is the whole purpose of the monad in the first place. How then can we access the continuation so that we have the power and flexibility we just saw?
We do that using a helper function called callCC (okay, whether you see it as a helper function or not is your choice). What is callCC? It's the commonly called call-with-current-continuation. This is a queer little fella, and we should get some intuitive idea about callCC first. So, what is it?
callCC takes a waiting function, and it is going to create a Cont (which contains that waiting function), and it's going to run it. But with what? It's waiting for a continuation. The obvious answer is that it's going to run it with the continuation it's given, as it's being composed via the >>= operator. See the section on "A note on the 'right' continuations" (above) for why. And the "right" continuation is -- the current-continuation. So what's the big deal? Well, the big deal is that without callCC, all the "right" continuations were implicit. Your code could not access the implicitly created continuations (all the \a -> ... stuff in the definition of >>=). With callCC, it exposes the continuation, by passing the current-continuation explicitly into the waiting function, and that being a waiting function, has an explicit argument that is expecting a continuation. First consider this code:
addOne :: Int -> Cont r Int addOne n = return (n+1) doSomething :: Int -> Cont r Int doSomething n = return n >>= (\x -> return (x-2)) test5 = runCont (doSomething 5) print
Nothing special above. Now let's put in a callCC, but not invoke the current-continuation passed to callCC.
doSomething' :: Int -> Cont r Int doSomething' n = return n >>= (\x -> callCC (\breakOutOfHere -> return (x-2))) test6 = runCont (doSomething' 5) print
Looks good, but doesn't do anything cool. Now let's invoke it conditionally.
doSomething'' :: Int -> Cont r Int
doSomething'' n = return n >>= (\x -> callCC (\breakOutOfHere ->
if (x>5) then breakOutOfHere 100000
else return (x-2)))
test7 = runCont (doSomething'' 5) print
test8 = runCont (doSomething'' 10) print
Now, we literally get to break out when we want, and pass (or throw) out whatever value we want when we break out. Now we have a fully hidden and implicit implementation of continuations, with the option to expose the continuation whenever we want, via callCC.
Thus, you saw that invoking the current-continuation (it has a name now) at any point in time inside the code of the waiting function, breaks out of the waiting function. Break out to where, you may ask. Again, the continuation is the current-continuation from the point of the callCC call. Hence, it breaks just straight out of the waiting function, to the point right after.
Type and definition of callCC
Now let's take a look at the type and definition of callCC:
arg = waiting function result
| |
v---------------------------v v------v
callCC :: ((a -> Cont r b) -> Cont r a) -> Cont r a
callCC f = Cont $ \k -> runCont (f (\a -> Cont $ \_ -> k a)) k
So, how does callCC do what it claims to do? First, we see that it takes f, a waiting function. Like all waiting functions, it needs to take in a continuation. We already figured that out. It then says that it wants to "run" f, so it has to call runCont on f. Again, that is quite simple. However, what does it pass to f? Normally (not in callCC), we would pass in the "right" continuation. However, for callCC, it doesn't want to do that. It wants to let the programmer break out of the waiting function, and where is that point exactly? It's the in "outer" continuation, which is in k. Now the function f is a waiting function that, when "evaluated", generates a result. As per our normal way of passing results into continuations, the "a" in our example here is going to take the result of f. But this time, callCC explicitly passes that result into k, which is the "outer" continuation. It breaks out with the result of f.
Now, the type. The argument to callCC is of type ((a -> Cont r b) -> Cont r a), which we can see is a function, f. We recall that this function is a waiting function, and that it takes a continuation. The continuation therefore has the type (a -> Cont r b). Why is the type different from the type of the previous continuations that we saw? That's because callCC is going to pass the current continuation, and the current continuation is going to represent the rest of the program. Remember also that we're in the Cont monad. So what does the "rest of the program" return? Well, it has to return something in the Cont monad! Hence the type of ((a -> Cont r b) -> Cont r a) for the function f.
This article is on what is loosely but oft referred to as “monadic functions”. It contains my personal and intuitive way of thinking of monadic functions, and it may not suit you well. However, for who out there who this may help, I hope it makes that little difference for you.
I notice that in most monad tutorials, there is tremendous detail in how monadic composition works, as well as examples on how the >>= bind operator is implemented in various commonly seen monads (Maybe, [], State, etc), but there is very little what exactly are monadic functions. The key idea that can be gleamed is that monadic functions are the stuff that act on monads, through the help of the >>= bind operator (which essentially finds a way to meaningfully “extract” the value that the monad contextualizes), and sometimes the return function as well. Also, we often see that monadic functions, say f and g, can be suitably composed as follows:
f, g :: a -> m b f >>= g
We also come across examples, usually just before introducing the beautiful do-notation, that look like this:
f, g :: a -> m b
someFunction x = return x >>= (\a ->
... some other code ...)
And then, somewhere in our journey, we come across code that looks like this (this example is in the IO monad, but is not peculiar to the IO monad):
someFunction x = do a <- getLine b <- getLine putStrLn (show x ++ a ++ b)
And then we may, perhaps, wonder: getLine is a function, but it's type signature is not that of a -> m b, or for the IO monad, a -> IO b. In fact, its signature is:
getLine :: IO String
So how is that a monadic function? And yet our intuition tells us that it's fine. Firstly, it works. Secondly, it's producing a monadic value, which gets "assigned" to a. We then mentally translate this to the de-sugared notation:
someFunction x = getLine >>= (\a ->
getLine >>= (\b ->
putStrLn (show x ++ a ++ b)))
And then things become clearer, because the type signature that we're looking for is embodied in the two anonymous functions. I don't know about you, but this bothered me for a while. Not from the formal correctness of it all, as it's obvious that's not a problem, but the intuitive sense of a "monadic function".
So let me try to sort of resolve the issue in a structured way in this article. If all this is already crystal clear to you, and you're thinking "this guy doesn't get it", then you may want to not read this, as this contains my personal and intuitive way of thinking of monadic functions, and it may not suit you well. However, for who out there who this may help, I hope it makes that little difference for you.
First off, some terminology:
- A monad is a typeclass that defines >>= and return (minimally).
- A monadic value is a value contextualized by a type constructor.
(Examples would be (Just 4), or [100, 200])- The inner value of a monadic value is the actual value that was contextualized by a type constructor.
(Examples would be 4, and each of (100, 200))- A monadic function is a function that produces a monadic value.
(Note that we said nothing about its input type)
Given mv >>= f, the main reason for requiring that >>= has the type of >>= :: m a -> (a -> m b) -> m b, and hence, f having the type of f :: a -> m b, is because we want a means to pass the inner value of mv to the monadic function f. However, this doesn't make everything apparent.
Hence, creating an example out of the Maybe monad, suppose we have the following monadic functions:
mf1, mf2 :: Int -> Maybe Int mf1 x = return (x*2) mf2 x = return (x*3)
And we want to compose them:
cmf :: Int -> Maybe Int cmf x = Just x >>= mf1 >>= mf2
Now suppose that we have some more "elaborate" monadic functions:
mf1, mf2 :: Bool -> Int -> Maybe Int mf1 b x = if b then Just (x*2) else Just (x*4) mf2 b x = if b then Just (x*3) else Just (x*9)
Oh now these functions take some extra stuff, and are no longer nice and simple like the original mf1 and mf2. Certainly (and very obviously so), direct composition does not work, as the type signatures are all messed up:
cmf' x = return x >>= mf1 >>= mf2
However, it doesn't mean that this doesn't work:
cmf x = do a <- mf1 True x b <- mf2 False a return (a*b)
Which is the same as this:
cmf x = mf1 True x >>= (\a ->
mf2 False a >>= (\b ->
return (a*b)))
The issue here is that our "definition" of a monadic function is a tad loose. Often, you will find that where the words monadic function is used, it can either refer to:
- Functions of the form f :: a -> m b, where a is the type of the inner value of the monad. (Call these classic monadic functions)
- Functions of the form f :: anything -> m b, where the input of the function really doesn't matter. (Call these loose monadic functions)
We note that for composability, as in cmf' x = return x >>= mf1 >>= mf2, our monadic functions mf1 and mf2 needed to be in classic form, where the input is only the type of the inner value of the monad.
However, any function that produces the monadic value can indeed be used "inside" the monad. By "inside" I mean code that acts within a monad, which is the code enclosed in the do-notation, and its corresponding de-sugared form. We can do this by making a wrapper around this loose monadic function, whereby the wrapper is a classic monadic function itself. I.e. a wrapper that takes in a single value of the type of the monad's inner value.
In our example with our two loose monadic functions mf1 and mf2, we successfully "composed" them as follows:
cmf x = mf1 True x >>= (\a ->
mf2 False a >>= (\b ->
return (a*b)))
which is:
cmf x = mf1 True x >>=
(\a -> mf2 False a >>=
(\b -> return (a*b)))
Again, the anonymous functions are the wrappers which are, themselves, classic monadic functions. How so? Take a look at the anonymous function that has a parameter "a". It takes a single input value, named "a", and the type of "a" (by type inference) is the inner value of mf1 (by definition of >>=). It's result is the result of the call of mf2, which we know is a loose monadic function and hence will produce a monadic value. Thus the result of the anonymous function is a monadic value. Hence, (\a -> mf2 False a) :: a -> mb. The exact same argument goes for the second anonymous function.
This leads to the following conclusion. The fundamental reason for requiring the f in mv >>= f to be of the type f :: a -> m b, with emphasis on the "a", is because in general, this function is designed to receive the value produced by another (may be "loose") monadic function, or simply stated by some monadic value, or created by a monadic function that does "nothing" but return a monadic value (think getLine :: IO String). Hence, all of these work (we're in the Maybe monad):
classicMonadicFunction x = Just (x+2) looseMonadicFunction x y z = Just (x+y+z) looseMonadicFunction' = Just 4 f = Just 4 >>= classicMonadicFunction -- Needs a wrapper f = Just 4 >>= (\x -> looseMonadicFunction x 100 200) -- Don't need a wrapper f = do classicMonadicFunction 4 f = do Just 4 >>= classicMonadicFunction f = do a <- looseMonadicFunction 10 100 200 classicMonadicFunction a f = do a <- classicMonadicFunction 4 looseMonadicFunction a 100 200 f = do a <- Just 4 classicMonadicFunction a f = do a <- looseMonadicFunction' looseMonadicFunction a 100 200
And you can work out all the rest of the combinations yourself. This gives more flexibility in designing and implementing monadic functions, as they may be loose, classic, or just monadic values themselves (Just 4).
Hence remember, the only true constraint in designing monadic functions (or when using them), is that they have to return a monadic value. Failure to do so will result in a type error under all conditions.
Why you can forget about loose and classic monadic functions:
Because the do-notation creates the necessary "wrappers" for you, you often don't have to think about classic or loose monadic functions. Any function which is going to throw back a monadic value (and do whatever side-effects it wants to), is going to work inside the do-block. It's after all, the same as you creating wrappers around every single monadic function (classic or loose), and thus guaranteeing that the f in mv >>= f is going to be of the correct type, and then calling your monadic functions (classic or loose) with the correct set of parameters. Hence, the wrappers take care of the fact that the assumption of the >>= bind operator is that it needs to find a meaningful way to pass in the "extracted" inner value, leaving everything else you write nice and simple. Put another way. since the do-notation handles the creation of all wrappers, you can use any monadic function (loose or classic), without ever thinking of whether they're loose or classic anymore (meaning you can forget about this article! ;)). Hence, all you have to know, intuitively, is that any function that returns a monadic value is a monadic function, and is good to go!
Going back to our getLine, a real example of where all these occurs is in indeed the getLine function of the IO monad. Recall that the type of the getLine function is as follows:
getLine :: IO String
This function is of the same type signature as our looseMonadicFunction'. It takes in nothing, because all of its functionality is in its side-effects: Grabbing a line from the terminal. It's only result is a monadic value, which contextualizes the string that it grabs inside the IO monad.
The putStrLn function, on the other hand, is a classic monadic function. It should be obvious why, from it's type signature. It returns a monadic value, and takes a single parameter of the type of the inner value:
putStrLn :: String -> IO ()
The previous post on Fundamentals of Monads suggested that monads empower the programmer with abstraction of context (algebraic data type) details, as well as order of evaluation, with the latter bringing us one step closer to the model of imperative languages when we need it, while still maintaining all the beautiful properties of the functional language that Haskell is. If you haven’t read the previous post, you may want to, as some of the terminology and intuition described there is used here.
However, for Haskell to be truly useful, not only do we need to guarantee order of execution (when it’s needed), but we also need a way to maintain state (again, when it’s needed). Too many problems are stateful (i.e. needs side-effects) by nature, and pure functional languages don’t quite handle the notion of state (as least in the primitive language constructs) very intuitively.
A simple solution that is often given to Haskell beginners, is that state can be captured by literally passing the state around functions, whereby each function has access to the state, and can modify the state.
However, that approach is tedious, and has a whole bunch of boilerplate. Again, can we find a way to abstract away that whole notion of state, so that a programmer can ignore the fact that he needs to pass state around, while still having the side-effect’ing notion of a state?
Indeed we can. Enter the State monad.
Note: At this point, because we’re introducing the state monad, we’re going to focus purely on the state monad. Haskell offers other ways of handling state, as well as more powerful monads that do this. This is just the very beginning.
As usual, a monad is characterized by the data type definition, and the definition for >>=. Here’s what the state monad looks like:
data State s a = State s -> (a, s)
Alternatively, to get a “getter” function for free, and using newtype instead, we can define it as follows (this is the formal definition):
newtype State s a = State { runState :: s -> (a, s) }
Now, this looks very different from the Maybe monad, and the List monad. The key difference is that the so-called monadic “value” (or rather, a more general and precise word would be “computation”, and since that’s what it truly is we’ll use that here, properly) is not a “value”, as in the intuitive sense that 4, or 100, is, but rather a function. See why it’s a computation? By the way, a computation can result in a value, but in this case, it’s a function.
In short, remember that monads give context to it’s value (or computation). In this case, our value (or computation) is a function, and we’re going to give it a context. That context is the state. Also, since the function acts on a state, it has to receive the state, logically. Hence, the function takes a state, of type s, and produces a result and a new state, represented in a tuple of types (a, s). Hence, remember, the state monad gives a function a context of state. Since those functions act on state (hey, it’s a state monad), we’ll call them state-transformation functions.
So, we have a state monad, that contextualizes a function with the context of state. This implies that the function we’re talking about manipulates state in some way. We’ll call them state-transformation functions.
runState gives us a “reference” to that function for free. In other words, runState “extracts” the function for us.
Is State a Monad? Not quite.
Looking at the type of State, we notice that it is:
State s a
That’s two free variables. If you were alert while figuring out monads, you’d realize that monads take only one free variable. Look:
class Monad m where return :: a -> m a (>>=) :: m a -> (a -> m b) -> m b (>>) :: m a -> m b -> m b fail :: String -> m a
Everywhere, you see “m a” and “m b”, and by type inference, that means that m is a type constructor that takes one parameter. So how can State be a monad?
Indeed, State itself is not, but (State s) is. And that is why the type declaration for State seems to reverse the a and s. It needs to partially apply and get (State s), so that (State s) is a monad, and the parameter is a. What this means is that the type of the state, s, is constant, while the type of the result, a, is not, and indeed by change as we >>= stuff. Put another way, from the Monad typeclass above, m is (State s).
As such, for the state monad, the type definitions looks like:
class Monad (State s) where return :: a -> State s a (>>=) :: State s a -> (a -> State s b) -> State s b (>>) :: State s a -> State s b -> State s b fail :: String -> State s a
This is actually more important than it looks. The monad is (State s), which means that the monadic function expects a type of a (which is not a function, nor a state, but a resultant output value), and acts on that value, and returns a state-transformation function. Yours truly was, for a while, confused because I thought that, like simpler monads, the monadic function takes a state-transformation function (wrapped in the monad), acts on it, and returns a state-transformation function — just like how a monadic function for Maybe takes a value, acts on it, and returns a Maybe type. It does not!
The monadic function for state monads take a resultant output value, acts on it, and returns a state-transformation function.
Getting an Intuitive Sense of State
At this point, it would do us some good to revise the intuitive concept of a monad, and see exactly how it applies to state. To recall, a monad, in general, comprises of the following ideas:
- A monad gives a context to a value.
- It also allows us to abstract away details of the context.
- We also had the concept of a monadic value, which is the value which the monad contextualizes
- And we also had the concept of a monadic function, which is a function that acts on a monadic value.
How does that specifically apply to the state monad?
- A state monad gives a context to a function.
- It also allows us to abstract away details of the context.
- We also had the concept of a monadic value, which is the function which the state monad contextualizes
- And we also had the concept of a monadic function, which is a function that acts on a monadic value.
The tricky bit is the last point. For simple monads like Maybe, recall that the monadic function is a function that acts on the inner value of the monadic value (the 4 in Just 4), does something with it, and outputs a monadic value. Hence, this would be right:
monadicFunction :: Int -> Maybe Int monadicFunction x = Just (x + 1) Just 4 >>= monadicFunction ==> Just 5
How about for our state monad? Our monadic value represents a function. Hence, what is our monadic function? We’ll use exactly the same words here, to see the parallel.
Our monadic function is a function that takes a resultant output value, does something with it, and outputs a state-transformation function.
Yes, I’m belabouring the same point as in the previous section, because it’s important.
Exploring return
Here’s the definition of return, for the state monad:
return a = State (\s -> (a, s))
Quite simple. The constructor expects a function that takes a state and returns a result and a modified state, as explained earlier. When we give a “simple” value to return, we wrap it up in a state, and generate a function that, when given some state, just throws back the simple value, with no state modification. Can’t be simpler than that.
However, a common point of confusion here is what is the “state” that the function expects? We’re so used to wrapping values that sometimes we forget this is a function, so I’ll say it here again, this is a function. The state that the function expects, we do not know. It expects that state as a parameter so that it can act on the state! If this is obvious to you, please ignore it.
Exploring >>=
How about the >>= bind operator? At this point, let’s not confuse ourselves with the official definition of >>=. That will come very soon. But let us get some more intuition first: the >>= operator is a mechanism for composing state-transformation functions. That’s the outcome we want out of >>=.
So what, when dealing with state, what do we want the bind operator >>= to mean? Remember, monads are all about giving some context, and >>= is all about abstracting away that context (plus extracting values for sequenced computations). So, again, what does >>= mean?
Intuitively, and very accurately so, it’s just function composition! Yes, like head . tail composes a function that returns the second element of a list. But things are a bit special here: we are composing state-transformation functions, not ordinary (pure) functions like head and tail. Since state is sequential, we need to be careful to produce the correct intermediate state when composing functions.
Once again, we must remember not to get confused between state-transformation functions, which lives inside the state monad as part of the monadic value, and the monadic function, which is the f in m >>= f.
The state-transformation function, f’, looks like this:
f' :: s -> (a, s)
This state-transformation function is, again to repeat, wrapped up in our state monad, as the “value”. So, if we really have such a function, f, as defined above, we could certainly do this:
State f'
and happily wrap f up in a state monad. But that’s beside the point here.
The monadic function, f, is up to what the programmer wants to do with the resultant value. An example may look like this:
f :: a -> State s a -- which you can think of as f :: a -> (s -> (a, s)) f x = State (\s -> (x*2, s))
Now, remember the type of the >>= operator:
m >>= f :: m a -> (a -> m b) -> m b
What the bind operator >>= is doing, is it’s going to take a monadic value (containing our state-tranformation function), strips out the context, get the result of the state-transformation function, feed that as the resultant output value into the monadic function (f), which is going to produce another monadic value (containing a state-transformation function). It’s then going to give the new state-transformation function the new state produced by the first state-transformation function.
If that sounds like a mouthful, it is. Let’s go through this step by step.
First and foremost, here’s what >>= wants to achieve:
- >>= wants to compose state-transformation functions.
- It wants to take a state-transformation function wrapped in a state monad, and a monadic function.
- It knows that the monadic function is going to act on the result of the state-transformation function, and produce yet another state-transformation function
- It knows that at the end, it has two state-transformation functions to handle
- It wants to join the state-transformation functions, and correctly thread the state through them.
- It produces a new state-transformation function, which is the composition of the two functions, with properly threaded state.
The >>= bind operator looks like this:
m >>= f = State (\s ->
let (a, s') = (runState m) s
in runState (f a) s')
Let’s unpack that. m is the monad that contains a state-transformation function. Remember that runState basically pulls out that function. we can see it as a helper function that just accesses the inner value of the monad). Certainly, we could have patterned matched as well, like this:
State f' >>= f = State (\s ->
let (a, s') = f' s
in runState (f a) s')
And here’s how it achieves what it wants to achieve:
- (runState m) extracts the state-aware function, and applies a state to it. Whatever that state-aware function was supposed to do, it was waiting for a state, and hence was (and is still) just a computation (think of it as a piece of code, a function). We apply some state to it, and hence we can “run” it, but not quite. The state that we are applying to it is still a parameter of the new function that we are composing. So (f’ s), or (runState m) s, still remains a computation (a function) that is waiting on some state.
- Anyway, assuming that at some point that gets evaluated (when we provide the state), we bind that result to a, and its production of a new state, to s, in the tuple (a, s’).
- We then apply the monadic function, f, to the result of the previous computation, a. Remember that the monadic function is expecting the resultant output value. It’s also going to produce, as output, a state-transformation function.
- Hence, the new function produced by (f a) is yet another state-transformation function, waiting on a state, wrapped in a state monad. Hence we use runState to pull that out. However, going by the way we need to thread states, we give it the new state, s’, produced by the previous computation.
- The final result is a composition of the function wrapped in m, and the new function, f.
We can continue along this path, “composing” functions together which each modify the state (s gets modified to s’), into one huge function that contains all the modifications (think many nested lets), all waiting on intermediate states, whereby the outermost function is waiting on the initial state. We provide that initial state, and the whole gigantic function evaluates, and produces a new state, and a result.
Also, remember that monads provide a guarantee of order of execution, by means of lambda functions (see the post on Fundamentals of Monads. Note also that our >>= operator is doing exactly that. Let’s see what happens when we compose more than one function.
m >>= f >>= g
==> (State (\s1 ->
let (a1, s1') = (runState m) s1
in runState (f a1) s1')) >>= g
==> (State (\s2 ->
let (a2, s2') = (runState (State (\s1 ->
let (a1, s1') = (runState m) s1
in runState (f a1) s1'))) s2
in runState (g a2) s2'))
Notice how the lambda expressions are forcing the correct order of evaluation, and how the state is correctly threaded through the final expression? s2, in this case, is the initial state that, when provided, triggers the whole evaluation of the composed state-aware functions inside m, of f and of g:
runState (State (\s2 ->
let (a2, s2') = (runState (State (\s1 ->
let (a1, s1') = (runState m) s1
in runState (f a1) s1'))) s2
in runState (g a2) s2')) initState
Every (algebraic) data type is created for a very specific purpose. A data type isn’t made because you feel like making one (even if you can). Those data types which exist for no reason, in general, will have difficulty being generalized (into monads, as we will see later). However, let’s assume we’re all keen and astute programmers who won’t do that for the heck of it, and that we’re actually trying to accomplish something.
With that, let’s take a whirl around the idea of algebraic data types, and what they really mean. As mentioned, when a data type is created, it’s because it is trying to represent a particular concept or property.
Assumptions: You understand the Haskell type system (typeclasses, algebraic data types, type declarations, etc), and Haskell syntax in general.
For instance, suppose we want to represent failure. Haskell’s Maybe type does that. The very reason we type something as a Maybe, is because we want to represent failure on that something. As an example, if a function f decides whether or not an integer is a perfect square, the intuitive result that the function should return is a boolean — true or false. Either something is a perfect square, or is not. This should be simple.
Hence, a C function may look something like this:
bool isPerfectSquare (int i) {
if (code_to_check_for_perfect_square)
return true;
else
return false;
}
However, what if we pass that function a negative number? By definition, over integers, negative numbers cannot be perfect squares. Hence, we would want to say that the function failed.
In most imperative languages (C/C++, Java, etc), we would use a failure value to represent failure. Perhaps we would also like to abort the “execution” of that function. However, we cannot use a boolean here anymore, because we essentially need a third result, and there are only two possible values for a boolean. Hence, a common (and ugly) solution in C would be to “upgrade” (essentially a downgrade) the return type to an integer. Hence, a C function may look something like this:
int isPerfectSquare (int i) {
if (i < 0)
return -1;
if (code_to_check_for_perfect_square)
return 1;
else
return 0;
}
There are similar strategies, but basically they all involve using a "magic" value to represent failure. That may be -1, NULL, 0, or whatever other value pleases the programmer.
However, such strategies place the implicit meaning of the "magic" value on the programmer -- the human. The programmer says that -1 means failure, and humanly enforces it. The compiler (or the code) doesn't know that, and hence cannot enforce it. If another programmer comes along and takes isPerfectSquare() for a ride, and decides to alter it and state that -1 means something else, and -999 now means failure, then any code that relies on isPerfectSquare() is going to break. This is not a good approach.
Haskell solves this by allowing the programmer to define new custom types that gives a context to various things (values). For example, in the isPerfectSquare function, Haskell lets a programmer state, through a type, that the function has failed, or succeeded, and if it succeeded, what the result is. The custom data type here is the Maybe type, and is defined as follows:
data Maybe a = Just a | Nothing
Hence, suppose we have a result of True (or False), we can type the result (a boolean) with the Maybe type, by saying "Just True" (or "Just False"). Note that here, it is assumed that you understand the Haskell type system, at least where type constructors, value constructors and polymorphic/parameterized types are concerned. And if the function failed, we can just return a Nothing.
Giving a Context
The idea here is not Haskell's type system, but the idea of typing a value. Alternatively, you can see it as encapsulating, or wrapping, a value. Regardless of how you see it, the key thing is that we are now placing a value (the boolean, in this example), into a context. The context is failure. Hence, in the example above, we just placed the resultant True or False into a context. There is always a very specific reason for creating a particular custom type (Maybe, in this case), and for Maybe, the reason is to represent failure.
If there is another data type, then it must have a reason to have been created.
For instance, if you created a data type called Person:
data Person = Person String String String deriving (Show)
Then the reason for the Person data type is that you, the programmer, has decided that a Person has three string fields (maybe first name, last name, middle name).
If you create a data type called Tree, as follows:
data Tree = Empty | Tree a (Tree a) (Tree a) deriving (Show)
then a possible reason is that you are representing indecision, whereby the tree illustrates all possible decisions -- all outcomes for a given input.
Similarly, for a list:
data List = Empty | List a (List a) deriving (Show)
For all these data types, whenever we type a value as that data type, we are giving that value a context. And that context has a meaning in which you can articulate. For instance, a Tree 4 has type Tree Int, and what it means is that you have a decision with input 4. Since the tree has no branches, there are no decisions and outcomes yet. If you have a Tree 4 (Tree 5) (Tree 6), it represents a decision with input 4, that can yield a 5 or a 6 (if you "traverse" the tree). Similarly, if you have a list like List 5 (List 6 (List 7)), you may be representing non-determinism, whereby a value can take either 5, 6 or 7.
Note: it is important to note that data types have a reason for their existence. If not, it becomes difficult to see how they can be generalized into a monad.
Abstracting Away (Each and Every Different) Context
Now, what happens with our isPerfectSquare function? Let's write the proper Haskell version of isPerfectSquare, complete with the Maybe type to represent failure. We also modify it a bit, because we don't just want a boolean, we want the value of the function succeeds. In other words, if it's not a perfect square, we return Nothing, if it is, we return the number itself. Let's also add another similar function that decides if something is a big number a not. We decide that numbers larger than 1024 are "big":
isPerfectSquare :: Integer -> Maybe Integer
isPerfectSquare n =
let sq = floor (sqrt (fromIntegral n :: Double)) in
case (sq * sq == n) of
True -> Just n
False -> Nothing
isBigNumber :: Integer -> Maybe Integer
isBigNumber n
| n < 1024 = Nothing
| otherwise = Just n
Great, nice and simple. Now what if we want to treat those two functions as primitives, and develop a isBigAndPerfectSquare function, which tests if an integer is larger than 1024, and is a perfect square? One way to do that would be to check if an integer is a perfect square (isPerfectSquare), and if it's bigger than 1024 (isBigNumber). Not the most efficient, but just for illustration purposes. Hence we can write this:
isBigAndPerfectSquare :: Integer -> Maybe Integer
isBigAndPerfectSquare n =
case (isPerfectSquare n) of
Nothing -> Nothing
Just x -> case (isBigNumber n) of
Nothing -> Nothing
Just y -> Just y
Again, there are a ton of better ways to write it. It's for illustration. Let's go further.
isMagicNumber :: Integer -> Maybe Integer
isMagicNumber n
| n == 262144 = Just n
| otherwise = Nothing
isBigAndPerfectSquareAndMagic :: Integer -> Maybe Integer
isBigAndPerfectSquareAndMagic n =
case (isPerfectSquare n) of
Nothing -> Nothing
Just x -> case (isBigNumber n) of
Nothing -> Nothing
Just y -> case (isMagicNumber n) of
Nothing -> Nothing
Just z -> Just z
By now something is becoming very apparent. The reason for the custom data type is worming its way into your code. Everytime you have a "primitive" function (such is isPerfectSquare) which returns a result (or computes to a value) of a data type that has a context (i.e. Maybe), you have to check that context (is it a Just x, or a Nothing?) should you wish to use the result.
To state that again, when a result has a context (i.e. Maybe, to represent failure), using that result means unwrapping the context to (a) extract the value out so that you can actually act on the value, and (b) checking the entire context to see if you need to handle special cases. The special case is failure, the value is the x in Just x. We performed the value extraction by pattern matching against Just False and Just True, and we checked the special case by pattern matching against Nothing. Once again, in this case it's simple, because Maybe is quite simple. If you have a more complex data type, you have to check and extract more stuff. Very quickly, this becomes a pain. Also notice one more thing: The point at which things became a pain was when two "primitives" were sequenced together. If we have a function that just calls isPerfectSquare once, it would be fine. The function could declare a return type of Maybe Bool, and call isPerfectSquare, and return what isPerfectSquare returns, which is a Maybe Bool. All's fine and dandy. Things got annoying the moment we sequenced calls to isPerfectSquare. In other words, when we want to act on the value of isPerfectSquare, and maybe call isPerfectSquare again, or even another function that will take the value we extracted and act on it, that all that special case and extraction stuff occurred. Therein lies a key point: when we sequence these calls, things get messy.
Key idea:
Hence, whenever we have a context that we build up or assign, because we want to represent something, it means we have to deal with the context. Since every context is different (represents different things), the manner in which to:
(b) Handle special cases
(a) Extract the value
is going to be different.
There are two ways we can handle (a) and (b). We can make the programmer (who uses the type) responsible for value extraction and special case handling. Or, we can make the type creator responsible for it. Of course, they may be the same person, but that's the fundamentals of abstraction/generalization anyway, right?
What we've seen above is making the programmer (who uses the type) responsible for it. He had to write code that checks for the special case (Nothing) and the value extraction (True and False). Not only is that a pain (lots of typing!), but it also distracts away from the real thing the code is doing (check for power of 4 and power of 8). It's hard to read, annoying to debug, and so on. Basically, it violates the principle of abstraction.
So, how do we make the type creator responsible for all that? Guess what? We abstract the process of sequencing, and define how (a) values can be extracted, and (b) special cases can be handled, when such functions (Integer -> Maybe Bool, in this example) are sequenced. It's very important to note that the idea of abstraction here acts on sequenced functions.
Hence, we define a special operator, which we call >>=, which represents sequencing. Hence, for the above example, when dealing with Maybe, we define >>= for Maybe as follows:
(>>=) :: Maybe a -> (a -> Maybe b) -> Maybe b (>>=) Nothing f = Nothing (>>=) Just x f = f x
Responsibilities of >>=:
The bind operator >>= has a few jobs to do. As alluded to above, it certainly has to define how to extract the value, and handle special cases. However, more generally, this is what is has to do or can do:
(a) Handle special cases
(b) Extract the value
(c) Modify the value (if it wants to)
(d) Find a way to meaningfully interact with function f
So, the operator takes a Maybe type value, and a function, and somehow feeds the value to the function. The value is simple: It's just that -- Just 16, Just 4, Nothing. The function f is interesting. It represents the function(s) that we want to use in sequence. A possible function, in our example, is isPerfectSquare. Note that this is not limited to sequencing the same function.
Hence, the idea is this:
isBigAndPerfectSquare :: Integer -> Maybe Integer
isBigAndPerfectSquare n =
Just n >>= isPerfectSquare >>= isBigNumber
isBigAndPerfectSquareAndMagic :: Integer -> Maybe Integer
isBigAndPerfectSquareAndMagic n =
Just n >>= isPerfectSquare >>= isBigNumber >>= isMagicNumber
What a MASSIVE reduction in code and improvement in readability. All that value extraction (Just x) and special cases (Nothing) were all taken care of by the beautiful >>= operator. Sequencing is now easy, and clean.
Notice also that we "fed" Just n into the various functions, and not plain n. That's because the >>= operator abstracts away the unwrapping (value extraction) process, and it needs something that is wrapped to unwrap. The type signature of >>='s first parameter confirms this. It's a Maybe a. Hence, we have to wrap our first n. This is good, if you think about it. n should exist with it's context, not floating around. The only time it "floats" as a plain value, is when it's inside a function, such as isPerfectSquare and isBigNumber. At all other times, it is safely wrapped in it's data type -- Maybe. That keeps things clean.
However, having the programmer deal with wrapping things up is all fine and dandy, but would it be nice if there was a generic way to just wrap things up, no matter what data type the programmer is using? Would it be possible to make the type creator responsible for that aspect as well? Turns out there is. And since we're making the type creator responsible for more and more of these abstraction stuff, maybe all of this behaviour can be made part of a typeclass, and again, indeed it is. That's your monad typeclass, which we are ready to introduce.
class Monad m where return :: a -> m a (>>=) :: m a -> (a -> m b) -> m b (>>) :: m a -> m b -> m b (>>) m m' = m >>= \_ -> m' fail :: String -> m a
First thing to note: The return types of all the functions defined in the typeclass are monads. We've already introduced >>=, and the meaning here is exactly the same (intuitively at least, since we haven't formally introduced it) as what we've been doing above. Basically, it takes a monadic value (our Just x, above), and feeds it into a function that expects the actual value (the inner value, or x, in our example), does something to the value, and spits out yet another monadic value.
Hence, Just x >>= isPerfectSquare feeds Just n into isPerfectSquare, and the definition for >>= results in the extraction of x and calling of isPerfectSquare x (remember when we made the type creator do all that extraction work and special case handling work?). isPerfectSquare x produced either a Nothing or a Just y, which is of our monadic type. The type creator must define >>=. In other words, what it means to extract and handle the edge cases (yes, again! this is important!)
>> is a convenience function, which does exactly what >>= does, but it doesn't care about the input. It just spits out the second parameter, a monadic value, m'.
fail is a function, which takes an error string and gives a monad, which represents failure, possibly including the given error string. We'll ignore this because we hardly define it.
Hence, if we wanted to make Maybe a monad (it already is), then we would do the following:
instance Monad Maybe where return x = Just x Nothing >>= f = Nothing Just x >>= f = f x
We should, by now, understand how and why >>= is defined. Return is where we said we wanted to make the type creator responsible for wrapping something into a type (a monad). In the case of Maybe, the simplest way to wrap something (actually the only way, for Maybe) is to put it into a Just. Nothing is not a good idea, because Nothing means failure. If we want to wrap the value 4, we don't want to generate failure, we want Just 4.
Hence, writing monadically, we should do this:
isBigAndPerfectSquare :: Integer -> Maybe Integer
isBigAndPerfectSquare n =
return n >>= isPerfectSquare >>= isBigNumber
isBigAndPerfectSquareAndMagic :: Integer -> Maybe Integer
isBigAndPerfectSquareAndMagic n =
return n >>= isPerfectSquare >>= isBigNumber >>= isMagicNumber
Creating Order Of Evaluation
Remember that in a lazily evaluated language like Haskell, there is no guarantee if and when expressions (or computations) are evaluated. For instance, given the following code:
myFunction :: Int -> Bool myFunction x = if x > 100 then True else False
We have no guarantee if and when True, or False, will be evaluated. Everything is lazy. Similarly:
myFunction :: Integer -> [Integer] myFunction x = take 10 [x..]
does not generate the entire list, only the first ten elements. In short, we cannot determine when evaluation occurs.
However, while this has a LOT of benefits (performance, conciseness, infinitity, etc), when dealing with the real, external world, many times evaluation order matters, and matters a lot. For instance, it matters when writing to a file, and reading from one. Same with databases (they can be viewed as fancy files). And so on. Hence, we need a way to guarantee that the order of evaluation can be predictable when we need it to be predictable.
How can we do this in a purely functional, lazily evaluated language? By the observation that we can force evaluation order by ensuring that a computations X require a computation Y to compute. Hence, Y must evaluate before X.
If we express X and Y as anonymous functions, we can do something like this:
((\y -> y / ((\x -> x + 1) 3)) 12)
This guarantees that the (x+1) function must run first, because the (y/x) function relies on the computed result (the value) of the (x+1) function for it to be evaluated. Hence (x+1) returns 4, given its argument, which is fed to the (y/x) function to becomes (y/4). the (y/4) function takes the parameter 12, and computes the result of 3.
Note how similar this is to assignment in imperative languages! In fact, this is the exact equivalent of the imperative code:
x = 3 x = x + 1 y = 12 y = y / x
Imperative languages determine order of execution simply by order of listing. The compiler MUST generate code that evaluates those 4 statements in that order (optimizing compilers can rearrange, but the meaning must not change). Since declarative languages cannot have such multi-statement "programs", and can only consist of functions which consist of expressions which evaluate, we then observe that execution "flow" can be simply expressed by a bunch of interdependent nested functions, exactly as above. In essence, we are forcing the compiler (or the evaluator) to evaluate the (x+1) computation before the (y/x) function, because we created a scenario in which it simply had no choice but to do so, no matter how lazy it wanted to be.
How does this relate to monads? Or rather, how do monads help to create order of evaluation in a nice and simple way? Make the following observation:
This function (exactly the same as above):
isBigAndPerfectSquareAndMagic :: Integer -> Maybe Integer
isBigAndPerfectSquareAndMagic n =
return n >>= isPerfectSquare >>= isBigNumber >>= isMagicNumber
is computationally identical to this function:
isBigAndPerfectSquareAndMagic n =
return n >>= (\x ->
isPerfectSquare x >>= (\y ->
isBigNumber y >>= (\z ->
isMagicNumber z)))
The ONLY difference is that now the "fed" and "unpacked" values (from the contextualized monadic value) are given nice names (x, y and z), and the functions are called with those names, instead of being implicit. To be obvious:
return n >>= isPerfectSquare ==> Just n >>= isPerfectSquare -- by definition of return ==> isPerfectSquare n -- by defintion of >>=
which is exactly the same as:
return n >>= (\x -> isPerfectSquare x) ==> Just n >>= (\x -> isPerfectSquare x) -- by definition of return ==> (\x -> isPerfectSquare x) n -- by definition of >>= ==> isPerfectSquare n -- by function application
Hence, when we sequence expressions using monads, via >>=, we are in effect, guaranteeing their order of evaluation! Hence:
return n >>= (\x ->
isPerfectSquare x >>= (\y ->
isBigNumber y >>= (\z ->
isMagicNumber z)))
is identical to the imperative code:
x = construct(n) y = isPerfectSquare x z = isBigNumber y return(z)
neglecting the Maybe stuff, and construct means the monad's "return", and return means the imperative style "return" of actually returning a value from a function explicitly.
Since we now have order of evaluation guarantee, we can essentially "write" imperative looking code, in Haskell! Now this is not going to break referential transparency, because at this point, we are just forcing order of evaluation, not dealing with state, or variables. In other words, it would be nice if we could write the above Haskell code something as follows, right?
x = return n y = isPerfectSquare x z = isBigNumber y isMagicNumber z
Since the = operator in Haskell already has a meaning, we give it another operator instead. How about:
x <- return n y <- isPerfectSquare x z <- isBigNumber y isMagicNumber z
Looks good. Haskell provides exactly that in its do-notation, as in the following real Haskell code:
isBigAndPerfectSquareAndMagic n = do x <- return n y <- isPerfectSquare x z <- isBigNumber y isMagicNumber z
Wonderful looking syntax. It's just syntatic sugar, so it really and precisely means the previous looking code with all the lambda expressions (anonymous functions). One thing to note, is that the last expression in the do-notation cannot have an "assignment" type thing. That's because, if you translate the original code with all the lambda expressions, you realize that the last lambda is the expression to be evaluated and returned. Making an assignment there does not make sense, and is not permitted. If you try, Haskell tells you this:
The last statement in a 'do' construct must be an expression.
So what if we want to return something else? Say, for some weird reason, we want isBigAndPerfectSquareAndMagic to always return 1000. Then we do this:
isBigAndPerfectSquareAndMagic n = do x <- return n y <- isPerfectSquare x z <- isBigNumber y w <- isMagicNumber z return 1000
Of course! We use return to pack up 1000 as a monadic value (Just 1000, in this case), and since it's the last expression, we return it. See the complete abstraction over the monad (Maybe, again, in this case) at work here?
And of course, all this is possible only because Maybe is a monad, which has the wonderful >>= operator implemented by the conscientious! With >>=, we get:
Properties of Monads:
1. Abstraction from context (unpacking, special cases)
2. Guaranteed order of evaluation
That's the fundamental property of monads (there's loads more).
A Queer Example
However, just to clarify things and hopefully get a better mental picture of how Haskell really treats types, let’s play with a really weird typeclass, and find a way to make an instance of it.
class Weird p where weird :: j k -> p k j
The weird function takes one parameter, and returns a single, concrete type. We look for -> to determine parameters and return values. However, the type of the argument and the return value is a little more complicated. Let’s work it out.
Since (j k) must be a concrete type, we can derive the kinds of j and k quite easily, because k itself must be a concrete type, and j is clearly a type constructor, that takes a single concrete type (represented by k).
k :: * j :: * -> *
Since we have derived the kinds of j and k, we can figure out p, since p is also clearly a type constructor, taking two types, k and j. And we know the kinds of k and j.
p :: * -> (* -> *) -> *
What this Weird typeclass is saying, is that instances of it, represented by p, must be of kind: * -> (* -> *) -> *. This is no different from having a typeclass as follows:
class Eq a where (==) :: a -> a -> Bool
In this case, a is obviously a concrete type, and has kind of a :: *.
Similarly,
class Functor f where fmap :: (a -> b) -> f a -> f b
In this case, f is clearly a type constructor taking one concrete type, and hence f :: * -> *. Hence, instances of the Functor typeclass, as you are probably already familiar with by now, expects a type constructor taking one concrete type. Examples of those are [] and Maybe.
Hence, instances of Weird must be TYPE CONSTRUCTORS, taking one conrete type, and another type constructor (that takes one concrete type). Hence, the kind is
p :: * -> (* -> *) -> *.
Let’s construct everything ourselves.
Since k is a concrete type, we can construct one trivially. Note that we don’t really have to do this, we could just as well use any existing such type, such as Int. We can test this using:
:t SomeK
data SomeK = SomeK deriving (Show)
Since j is a type constructor that expects a concrete type we can also construct j quite easily. Again note that we don’t really have to do this construction, we could just use an existing such type, such as Maybe (I’m using this loosely). We can test this using:
:t SomeJ SomeK
data SomeJ a = SomeJ a deriving (Show)
And finally, we know that p is a type constructor that expects two parameters: first a concrete type, and then a type constructor that is going to take a single concrete type. We construct that as well. This we probably have to do, because there isn’t an easily available type that looks like this. In order to ensure that “b” is treated as a type constructor, we “apply” it to “a”. We also reverse the resulting type (constructing with SomeP b a results in type SomeP a b) because that’s what the weird function expects. We want to match it. We can test this using:
:t SomeP (SomeJ SomeK)
data SomeP a b = SomeP (b a) deriving (Show)
Now let’s make an instance of Weird. Remember that the weird function takes one argument, of type (j k). However, in the function itself, we don’t have to specify the type (since it’s already been specified in the class, obviously). We’ll just call that argument x, in Haskell style. Next, we want to create the function’s body such that it returns a (p k j). We don’t really care what the function body does, because this is a play on types. We just care about the return type.
class Weird where weird :: j k -> p k j
The easiest, immediate thing that is going to return a (p k j) is the SomeP that we constructed (again obviously because we were basing the construction on the requirements of weird. Hence, the SomeP value constructor is going to create a type of SomeP a b, where a represents k, and b represents j. In order for the SomeP value constructor to produce SomeP a b (or equivalently SomeP k j), it’s going to take SomeP b a (or equivalently SomeP j k). (j k) is exactly the type of x. Hence, SomeP x will trigger the type constructor of SomeP (j k), and create a TYPE of SomeP k j. That’s exactly (p k j).
instance Weird SomeP where weird x = SomeP x
We have an instance of Weird! Not saying it’s useful, but well, it’s just to help understanding.
Types of Types
We’ve seen, at an observer level, how Haskell deals with types and type classes. But how does it actually formalize that so that it can get the checking right? While values have types (and expressions generate values), and we can use types to guarantee that functions which expect a type gets exactly that type. How then does Haskell guarantee that type constructors get the type it expects? The answer is elegant and beautiful — Haskell gives types, types. That means that each type, has a meta-type, and it’s called a “kind”.
:k StringString ==> StringString :: * :k StringAnything ==> StringString :: * -> *
What this means is that StringString is going to return a concrete type all on its own (remember that constructors with no type variables create concrete types directly). StringAnything, on the other hand, is a type constructor that takes a concrete type and returns a concrete type. That means that plain old StringAnything is not a concrete type (again, similar to what we discussed earlier). Similarly, Maybe is a type constructor that takes a concrete type to produce a concrete type, and Maybe itself is not a concrete type.
We’re used to seeing functions as follows:
myFunction1 :: Int -> Bool
And also functions that take type variables that represent a concrete type
myFunction2 :: a -> Bool (usually with class constraints added) myFunction3 :: a -> b (usually with class constraints added)
What if we have a function that takes a type constructor?
myFunction4 :: c a -> Bool myFunction4 x = False
Look at this function. It takes one parameter, of type “c a”. Not just “a”, but “c a”. It’s obvious that “c” is being used as a type constructor, and in fact, it takes exactly one concrete type, “a”. An example for “c” would be “Maybe”. And we could apply myFunction4 as follows:
myFunction4 (Just 4) ==> False :t myFunction4 ==> c a -> Bool :t myFunction4 (Just 4) ==> Bool
What’s the kind of “a” and “c” then? We know that (c a) must produce a concrete type, because function parameters have to take a concrete type, not some airy-fairy type. Hence:
(c a) :: *
We also know that “c” is a type constructor that takes one concrete type, hence this follows:
c :: * -> * a :: *
Let’s try one more:
myFunction5 :: d c a -> Bool myFunction5 x = False
Can you derive the following?
Since (d c a) :: * and a :: * Therefore c :: * -> * and d :: (* -> *) -> * -> *
Well, that’s my summary of Haskell types.
TypeClasses
And then you have typeclasses, which is basically a class (absolutely nothing to do with OOP classes in C/C++ or Java or similar languages) to which types may belong, and the typeclass defines certain behaviors that types belonging to it will exhibit. For instance, in the standard library, Eq is a typeclass, and the behavior that Eq defines is that two values can be “equated”, whatever the meaning of equating two values of a type may mean. Int is a type. Since Int belongs to Eq, it means that all values of Int (e.g. 100, 423, -200, etc) have a meaning when they are equated. This meaning is familiar to all of us, 100 == 100 is true, but 100 == 423 is not. The manner in which typeclasses define behaviors, is through a set of one or more functions that the typeclass defines, that all members of the typeclass must implement. For instance, in our venerable Eq typeclass, the operators (which are functions) “==” and “/=” are defined. Hence all members of the Eq typeclass must be able to be “==”‘ed or “/=”‘ed.
And now let’s make a typeclass for our strings:
class ReturnSame a where returnSame :: a -> a
This is a rather useless typeclass, and the behavior is that any member of which must implement returnSame. As the name suggests, we are going to return exactly what we get, and just that. Note that “a” here is similar to the above, it’s a type variable. In this case, it represents the instance that we are going to create, and how that instance behaves/is used. To make that clearer, let’s use an example.
Let’s make our StringString and StringAnything instances of the typeclass:
instance BecomesFalse StringString where turnToFalse x = x instance BecomesFalse (StringAnything a) where turnToFalse x = x
Let’s play with it:
turnToFalse (StringString "John" "Peter") ==> StringString "John" "Peter" :t turnToFalse (StringString "John" "Peter") ==> StringString turnToFalse (StringAnything "John" ["Peter"]) ==> StringAnything "John" ["Peter"] :t turnToFalse (StringAnything "John" ["Peter"]) ==> StringAnything [[Char]]
There we go, we have our own type, and our own typeclass.
Understanding the Haskell type system goes a long way, and this is just touching on the tip of the iceberg. Hence, this is not meant to be an utterly complete and formal introduction to typeclasses, or types. For that, there are great resources out there. This is one of them. There are also missing bits, such as type subclasses (type constraints), and so on.
Once again, remember:
- Value constructors are functions, and they take parameters, and produce a value
- You cannot write StringAnything Int, in code, that’s a type constructor. You can only write StringAnything 4, that’s a value constructor.
- The value has a type, and is defined by the type constructor
- Type constructors that have no variables are concrete types by themselves (e.g. StringString)
- Type constructors that have variables are polymorphic types, and are NOT types by themselves (e.g. StringAnything)
- For these, you have to give the variable a value (a concrete type), and then the type constructor creates a concrete type (e.g. StringAnything Int, StringAnything [Char])
The Basics
Haskell has an incredibly expressive and powerful type system, but in understanding it there needs to be some mental concepts that one needs to get right. The key one (for me) is the very conscious and internalized distinction between what is a type constructor and what is a value (or data) constructor. I’ll use value constructor here though both terms represent the meaning just as well to me.
Here’s a very simple introduction. Basically, when we define a new data type, using the familiar “data” keyword, we are defining both a type constructor and a value constructor.
Haskell lets us define our own typeclasses and types.
Types
Here’s a very simple example of a custom type (just ignore the deriving (Show) bit if you don’t already know it):
data StringString = StringString String String deriving (Show)
Let’s play with it:
:t StringString "aaa" "bbb" ==> StringString :t StringString "John" "Paul" ==> StringString
Note that the StringString on the right of the = sign is the value constructor. Hence when we say StringString “aaa” “bbb”, we are calling upon the value constructor to create a value, which “calls” (I’m being loose here) the type constructor at compile time to construct the type StringString String String. Hence, StringString is NOT a type, it’s a type constructor. StringString String String is indeed a type (often called a concrete type), which was constructed via the type constructor StringString. Also, note that we purposely chose to give the type constructor and value constructor the same name. It’s style, and it’s not necessary. It’s usually done only when there is only one value constructor for a given type.
We can also define a polymorphic type, where “a” can mean anything. The term for “a” is a type variable. That means it’s a variable, which holds a type — any type:
data StringAnything a = StringAnything String a deriving (Show)
Let’s play with it:
:t StringAnything "aaa" 123 ==> Num a => StringAnything a :t StringAnything "John" ["Jane"] ==> StringAnything [[Char]]
Note (this is important): The value constructor references the type constructor to generate the type, using any type variables that are present in the value constructor to “pass” to the type constructor. Note that the value constructor need not use all of the type variables that are in the type constructor. But any type variables used in the value constructor must be seen by the type constructor, else it is out of scope and will result in a compile-time error. We actually saw some of these stuff in the first type variable in StringAnything already. Notice that the type of StringAnything is StringAnything a, where a is the type of the second parameter to its value constructor.
To make this really obvious, these are possible, and valid:
data SomeType1 a b c = SomeType1 deriving (Show) data SomeType2 a b c = SomeType2 a deriving (Show) data SomeType3 a b c = SomeType3 a b deriving (Show) data SomeType4 a b c = SomeType4 b a deriving (Show) data SomeType5 a b c = SomeType5 b a c deriving (Show) data SomeType6 c = SomeType6 Int String c deriving (Show)
I’ve been thinking about the way Apple presents itself to its users, and their “it just works” philosophy. Basically, whether that’s marketing, or is it true to some extent? Now there’s already been much thought, books even, on the matter of why Apple is so successful as a company, so this isn’t some attempt to rehash those same things.
So, Apple is famous for not releasing something when the product isn’t complete, or up to their satisfaction. I mean, come on, we all remember the copy-and-paste “feature” on iOS 3 back in 2009. That’s copy and paste functionality for you, in the third version of the OS offering. For those who want to reminisce, there’s a link to engadget.
However, here’s the thing, and since this is casual, forgive the generalization. Most other companies have a tendency ship products that are in “beta”, or half-done, following the strategy of who-ships-first-gets-the-first-catch. As Jeff Atwood says in his blog, Coding Horror, in this article.
And as a result, they put a place-marker on the expectations of their users. However, they make the fatal mistake of then keeping absolutely quiet, while they go about their internal business of fixing things, and this is also where Jeff makes his key point — it’s what you do after you release that matters orders of magnitude more than the release itself. To quote:
“Instead of spending three months fixing up this version in a sterile, isolated lab, you could be spending that same three month period listening to feedback from real live, honest-to-god, dedicated users of your software.”
- Jeff Atwood, Coding Horror
So the bottom line is this, if the first beta is a piece of junk, and you don’t engage your users and show them and convince them that you are taking them seriously and that you are working hard to make the next release great, then many people are going to get instantly turned off. Now, it would take a whole new level of encouragement and even goading to make those users try the product again. This is even after the product is now at version 3, and is pretty close to fantastic.
Apple, on the other hand, gives the impression that whatever they touch will become gold, not because it really does (no offence, the guys at Apple are probably some seriously smart guys), but because of the whole concept of expectation management. They hold back, they tempt, and when they deliver, it’s just right, it just works — true to their tag phrase. The result of that? People are willing to place their bets on Apple products which they haven’t seen.
Windows has long moved to 64-bit, and the kernel is 64-bit. However, we also notice that 64-bit Windows can run 32-bit code with seemingly no issues at all. The reason for this is that Microsoft implemented a translation layer between the 32-bit code and the 64-bit OS, and this is (largely) implemented in wow64cpu.dll and Wow64.dll.
Put very simply, the core task of wow64cpu.dll is to save the 32-bit state, convert everything to 64-bit, and call Wow64SystemServiceEx(), which is found in wow64.dll. Wow64SystemServiceEx() then re-packages the call into a 64-bit version (converting the parameters to 64-bit), making the actual system call into the kernel, grabs the 64-bit returned results, re-packages them back to 32-bit, and passes it back to the 32-bit program.
This is made more obvious by comparing how system calls are made in a native x86 code (32-bit Windows), versus a Wow64 piece of code.
Native x86 (e.g. ntdll.dll in 32-bit Windows):
mov eax, 30h ; Some system call number mov edx, offset SharedUserData!SystemCallStub call dword ptr [edx]
Wow64 (e.g. ntdll.dll in 64-bit Windows):
mov eax, 0A5h ; I used ntdll!NtCreateThreadEx xor ecx, ecx ; It's a 64-bit register lea edx, [esp+4] call dword ptr fs:[0C0h]
The key difference is the call, where in Wow64, it’s call fs:[0C0h]. Since fs:[0] points to the TEB, examining the TEB reveals the following:
0:001> dt ntdll!_TEB +0x000 NtTib : _NT_TIB +0x01c EnvironmentPointer : Ptr32 Void +0x020 ClientId : _CLIENT_ID +0x028 ActiveRpcHandle : Ptr32 Void +0x02c ThreadLocalStoragePointer : Ptr32 Void +0x030 ProcessEnvironmentBlock : Ptr32 _PEB +0x034 LastErrorValue : Uint4B +0x038 CountOfOwnedCriticalSections : Uint4B +0x03c CsrClientThread : Ptr32 Void +0x040 Win32ThreadInfo : Ptr32 Void +0x044 User32Reserved : [26] Uint4B +0x0ac UserReserved : [5] Uint4B +0x0c0 WOW32Reserved : Ptr32 Void <----- +0x0c4 CurrentLocale : Uint4B +0x0c8 FpSoftwareStatusRegister : Uint4B +0x0cc SystemReserved1 : [54] Ptr32 Void ...
The name is WOW32Reserved, and its the piece of code that was described above (re-packaging, kernel call, re-packaging).
System calls are but one reason where Wow64 code differs from the 32-bit native code. There are others, including code that communicate with 64-bit drivers, and so on.