The Blog

  • Lens-indexed lenses

    In many functional programming languages, lenses are an increasingly popular, highly composable, way of structuring bidirectional data access, i.e., operations to both read and functionally update parts of a composite object. There are many introductory tutorials about lenses on the web: one that I found to be particularly gentle is by Gabriel Gonzalez. I’ll borrow some of his ideas to introduce lenses briefly here, although, of course, I’ll work in F* rather than Haskell.

    Doing it in F* raises a couple of interesting challenges. First, programming with mutable references and destructive updates is common in F*: so, unlike some other lens libraries, ours must support mutation. Second, like everything else in F*, our lens library focuses on verification: we need a way to specify and prove properties about lenses in a way that does not compromise the inherent composability of lenses—composability being their primary appeal. My solution, the lens-indexed lens, is analogous to the monad-indexed monad, a structure at the core of the design of F* itself.

    Updating nested records: Clumsy, even with primitive syntax

    Consider the following simple representation of a circle:

    // A simple 2d point defined as a pair of integers
    type point = {
      x:int;
      y:int;
    }
    
    // A circle is a point and a radius
    type circle = {
      center: point;
      radius: nat
    }
    

    Using the record notation to define our types gives us some primitive syntax to access the fields of a circle and point. But, despite the primitive support, it’s not very convenient to use. Here’s some code to move the circle in the x-direction:

    let move_x (delta:int) (c:circle) = 
        { c with center = {c.center with x = c.center.x + delta} }
    

    Not pretty. But, by default, that’s basically what it looks like in most ML-like languages, including OCaml, Haskell etc. We’d much prefer to write something like c.center.x <- c.center.x + delta. Lenses are a clever way of designing a library of abstract “getters” and “setters” that allow you to do just that.

    Pure lenses for composable, bidirectional access to data structures

    A lens a b is pair of a getter and a setter: given an a-typed value v, a lens a b allows you to

    1. get a b-typed component out of v

    2. create another a-typed value by updating the same b-typed component in v

    // A `lens a b` focuses on the `b` component of `a`
    // It provides a `get` to access the component
    // And a `put` to update and `a` by updating its `b` component
    type lens a b = {
      get: a -> b;
      put: b -> a -> a
    }
    

    For instance, given a point, we can define two lenses, x and y, to read and write each of its fields, and lenses center and radius to focus on each of the fields of a circle.

    let x : lens point int = {
        get = (fun p -> p.x);
        put = (fun x' p -> {p with x=x'})
    }
    let y : lens point int = {
        get = (fun p -> p.y);
        put = (fun y' p -> {p with y=y'})
    }
    let center : lens circle point = {
        get = (fun c -> c.center);
        put = (fun p c -> {c with center=p}}
    }
    let radius : lens circle int = {
        get = (fun c -> c.radius);
        put = (fun r c -> {c with radius=r}}
    }
    

    This four definitions are rather boring: one could imagine automatically generating them with a bit of meta-programming (maybe we’ll have a future post about that).

    But, now comes the fun part. Lenses are easily composable: l |. m: composes lenses, building an “access path” that extends the focus of l with m. (Note #a, #b and #c below bind implicit type arguments.)

    let ( |. ) #a #b #c (m:lens a b) (l:lens b c) : lens a c = {
      get = (fun x -> l.get (m.get x));
      put = (fun x y -> m.put (l.put x (m.get y)) y)
    }
    

    We can now define:

    let move_x (delta:int) (c:circle) = 
        (center |. x).put ((center |. x).get c + delta) c
    

    That may not look like much of an improvement, but even with F*’s poor support for custom operators, it’s easy to re-define a couple of common infix operators to make it better.

    // `x.(|l|)`: accesses the l-focused component
    let ( .(||) ) #a #b (x:a) (l:lens a b) : b = l.get x
    
    // `x.(|l|) <- v`: updates the l-focused component of x with v
    let ( .(||)<- ) #a #b (x:a) (l:lens a b) (v:b) : a = l.put v x
    
    

    This lets us write:

    let move_x (delta:int) (c:circle) = 
        c.(| center |. x |) <- c.(| center |. x |) + delta
    

    which is pretty close to what we were aiming for.

    Lenses on mutable data

    Just to be clear, despite appearances, c.(| center |. x |) <- c.(| center |. x |) + delta does not actually mutate c. It creates a new circle that differs from c in the x field of its center, leaving the original c unchanged.

    That’s nice, but we would also like a way to access and update in place the mutable fields of a record. In a language like OCaml, an object’s fields may contains mutable references to heap-allocated values. In such a setting, it’s easy to define a lens like:

    let deref : lens (ref a) a = {
        get = (fun r -> !r);
        put = (fun v r -> r := v)
    }
    

    But, this won’t do in F*: for starters, the get and put fields of a lens are expected to be pure, total functions and the fields of deref are certainly not pure: they read or write to the heap. F*, like Haskell, forces us to confess to our impurities.

    If we were in Haskell, we could define the type of a stateful lens (in F* pseudo-syntax) like so:

    type st_lens a b = {
         st_get : a -> ST b;
         st_put:  b -> a -> ST a
    }
    

    And we could give deref the type st_lens (ref a) a. But, this still won’t do in F*. With verification in mind, stateful computations must

    1. be accompanied by a precise specification, and

    2. whatever specification we give a stateful lens, it needs to be stable under composition: the composition of stateful lenses needs to also be a stateful lens (e.g., we should be able to define a double dereference to read an write an a from/to a ref (ref a), by composing lens from ref (ref a) to a ref a and again from a ref a to an a).

    A precise specification for the deref lens

    Let’s look again at the get and put of the deref lens.

    let get (r:ref a) = !r
    let put (v:a) (r:ref a) = r := v
    

    These two functions are the most primitive operations available on references and, of course, we can give them precise specifications in F*.

    val get (r:ref a) 
        : ST a
        (requires (fun h -> True))
        (ensures  (fun h0 x h1 -> h0==h1 /\ x == sel h0 r))
    
    val put (v:a) (r:ref a)
        : ST unit 
        (requires (fun h -> True))
        (ensures  (fun h0 () h1 -> h1 == upd h0 r v))
    

    Here ST a (requires pre) (ensures post) is the type of a stateful computation whose pre-condition on the input heap is pre, and whose post-condition post relates the initial heap h0 to the result and the final heap h1.

    Viewing the spec of a stateful lens as a pure lens

    The specification of get describes the return value by getting it from the initial heap h0 and the reference r; the spec of put describes the final heap h1 by putting a new value into the initial heap h0. Viewed differently, the specification of get and put is itself a lens, a lens between a pair of heap * ref a and an a. Let’s call such a lens an hlens.

    let hlens a b = lens (heap * a) b
    let ref_hlens #a : hlens (ref a) a = {
        get = (fun (h,r) -> sel h r);
        put = (fun v (h, r) -> upd h r v)
    }
    

    We can then specify stateful lenses that perform destructive updates based on their underlying hlenses.

    // st_lens l: a stateful lens between a and b
    type st_lens #a #b (l:hlens a b) = {
       st_get : (x:a
             -> ST b
                (requires (fun h -> True))
                (ensures  (fun h0 x h1 -> h0==h1 /\ y == l.get (h0, x))));
    
       st_put : (y:b 
             -> x:a 
             -> ST a
                (requires (fun h -> True))
                (ensures  (fun h0 x' h1 -> h1, x' == l.put y (h0, x)))
    }
    

    Given an l:hlens a b, an st_lens l is a stateful lens between a and b that can perform destructive updates and whose behavior is fully specified by l.

    And we can finally specify the deref stateful lens:

    let deref : st_lens ref_hlens = {
        st_get = (fun r -> !r);
        st_put = (fun v r -> r := v)
    }
    

    Composing stateful lenses

    Happily, stateful lenses compose nicely: the composition of an st_lens l and an st_lens m is fully specified by l |. m, the composition of l and m.

    let ( |.. ) #a #b #c (#l:hlens a b) (#m:hlens b c)
                         (sl:st_lens l) (sm:st_lens m)
        : st_lens (l |. m) = {
          st_get = (fun (x:a) -> sm.st_get (sl.st_get x));
          st_put = (fun (z:c) (x:a) -> sl.st_put (sm.st_put z (sl.st_get x)) x)
    }
    

    And any pure lens l is easily lifted to a stateful lens whose specification is the trivial lifting of l itself to an hlens.

    let as_hlens #a #b (l:lens a b) : hlens a b = {
        get = (fun (h, x) -> x.(|l|));
        put = (fun y (h, x) -> h, (x.(|l|) <- y));
    }
    
    let as_stlens #a #b (l:lens a b) : stlens (as_hlens l) = {
        st_get = (fun (x:a) -> x.(|l|));
        st_put = (fun (y:b) (x:a) -> x.(|l|) <- y)
    }
    

    These liftings of pure lenses to stateful lenses are convenient to fold in as part of the lens composition, lifting pure lenses as they are composed with stateful lenses, either on the left or the right.

    let ( |^. ) #a #b #c (m:lens a b) (#l:hlens b c) (sl:stlens l) =
        (as_stlens m) |.. sl
    
    let ( |.^ ) #a #b #c (#l:hlens a b) (sl:stlens l) (m:lens b c) =
        sl |.. as_stlens m
    

    We can also define some shorthands, x.[sl] and x.[sl] <- v, to apply a stateful lens sl to access or mutate an object x.

    let ( .[]   ) #a #b (#l:hlens a b) (x:a) (sl:stlens l) =
        sl.st_get x
    let ( .[]<- ) #a #b (#l:hlens a b) (x:a) (sl:stlens l) (y:b) = 
        let _ = sl.st_put y x in ()
    

    Stateful lenses at work

    We can now revisit our original example of circles and points, this time defining them to support destructive updates.

    type point = {
      x:ref int;
      y:ref int;
    }
    type circle = {
      center: ref point;
      radius: ref nat
    }
    

    These are just the types we had before, but now with mutable references in each field. Pure lenses to access and mutate each field are easy to define, as before.

    let center : lens circle (ref point) = {
      get = (fun c -> c.center);
      put = (fun p c -> {c with center = p})
    }
    let x : lens point (ref int) = {
      get = (fun p -> p.x);
      put = (fun x p -> {p with x = x})
    }
    ...
    

    Using a combination of pure and stateful lenses, we can mutate the x field of a circle’s center easily.

    let move_x delta c
       = let l = center |^. deref |.^ x |.. deref in
         c.[l] <- c.[l] + delta
    

    Notice that since a lens is a first-class value, the access path from a circle to the contents of the x-field of its center is a value that can be constructed once, bound to a variable (l) and re-used as needed.

    Of course, we also want to be able to specify move_x. But, since every stateful lens is fully specified by a pure lens, specifying move_x is now pretty easy in terms of its action on the heap. Here’s the type of move_x:

    val move_x (delta:int) (c:circle) : ST unit
      (requires (fun _ -> True))
      (ensures  (fun h0 _ h1 ->
                 let l = center |^. deref |.^ x |.. deref in
                 (h1, c) == (c.(h0, l) <- c.(h0, l) + delta)))
    
    

    We can specify it using just the same lens that we used in its implementation, except using the hlens that’s the pure counterpart of the stateful lens l. Since the type of a stateful lens always mentions its corresponding hlens, we can easily define functions to to specify a stateful lenses effect.

    let ( .() )   #a #b (#l:hlens a b) (x:a) (hs:(heap * stlens l)) = l.get (fst hs, x)
    let ( .()<- ) #a #b (#l:hlens a b) (x:a) (hs:(heap * stlens l)) (y:b) = l.put y (fst hs, x)
    

    Without these lenses, here’s the mouthful we’d have had to write instead.

    let move_x2 (delta:int) (c:circle) : ST unit
      (requires (fun _ -> True))
      (ensures  (fun h _ h' ->
                 let rp = c.center in
                 let p = sel h rp in
                 let rx = p.x in
                 let h1 = upd h rx (sel h rx + delta) in
                 let p = sel h1 rp in
                 let h2 = upd h1 rp ({p with x = rx}) in
                 h' == h2))
       = c.[center |^. v |.^ x |.. v] <- (c.[center |^. v |.^ x |.. v] + delta)
    

    Worse than the verbosity, it actually took me nearly half an hour to write this specification, requiring peeling back the layers of abstraction to figure out the exact order in which the reads and writes were occurring!

    Takeaways

    Lenses are a powerful way to structure data access and mutation. With stateful lens-indexed lenses, we can use them to implement and specify destructive updates of nested data structures in a compact and composable manner.

    You can see a fully detailed code listing for this blog post here.

    Lens libraries in other languages are much more extensive—I’ve merely scratched the surface ground out the most basic lens combinators.

  • printf*

    It must have been around 2002 when I first read a paper by Lennart Augustsson about a language called Cayenne. It was the first time I had come across dependent types. Lennart’s first example motivating the need for (and the power of) dependent types is printf: a function used by legions of programmers that is, in an essential way, very dependently typed. Having worked since then on many programming languages with various forms of dependent types, now, 15 years later, a language that I contribute to has a proper, dependently typed printf. Well, a sprintf, but Cayenne’s example was also a sprintf. This post is about explaining how sprintf works in F*.

    What is sprintf?

    Some variant of sprintf appears in the standard library of many programming languages—it allows for a string to be formatted and assembled from a number of programmer-provided values in a convenient way. A basic usage is like this:

       let format_date (month:string) (day:nat) (year:nat) =
           sprintf "%s %d, %d" month day year
    

    Where, format_date "Nov." 22 2017 returns the string "Nov. 22, 2017".

    Or:

       let format_date (dow:string) (month:string) (day:nat) (year:nat) =
           sprintf "%s, %s %d, %d" dow month day year
    

    The curious thing here is that sprintf’s first argument is a “format string”, and the number and types of its remaining arguments depends on the format string: i.e., sprintf is the quintessential dependently typed function.

    A sketch of sprintf in F*

    Here’s one type for a simplified version of sprintf in F*.

    val sprintf: s:string{valid_format_string s} -> sprintf_type s
    

    Given a first argument s, a valid_format_string, the return value has a type sprintf_type s that is a function of the format string s.

    Defining valid_format_string s is relatively straightforward, e.g., if one wanted to only allow %d and %s as format specifiers (with %% as an escape sequence for printing “%” itself), one could do it like this:

    let rec valid_format_string = function
        | [] -> true
        | '%'::'d'::s 
        | '%'::'s'::s 
        | '%'::'%'::s -> valid_format_string s    
        | '%'::_ -> false
        | _::s -> valid_format_string s
    

    sprintf_type s is a little more unusual: it is a function that computes a type from a format string s. Here’s one version of it:

    let rec sprintf_type (s:string{valid_format_string s}) =
        match s with
        | [] -> string
        | '%'::'d'::s -> (int -> sprintf_type s)
        | '%'::'s'::s -> (string -> sprintf_type s)
        | '%'::'%'::s -> sprintf_type_chars s
        | _::s -> sprintf_type s
    

    Look at that closely: in the first case, if the format string is empty, sprintf "" just returns a string. If the format string begins with "%d", then sprintf expects one more argument, an int, followed by whatever arguments are needed by the rest of the format string, and so on.

    Note: I cut a corner there, treating a string as a list of characters, whereas in F* you have to explicitly coerce a string to a list of characters. You can see the whole program, corners restored, at the end of this post.

    That was simple! Why did it take so long for F* to get sprintf?

    The version of sprintf provided in F*’s standard library (FStar.Printf.sprintf) is somewhat more complex than the simple example we’ve just seen. For one, it supports many more format specifiers than just %d and %s. But, that’s just a detail. More essentially, in order for callers of sprintf to be checked efficiently, the type-checker has to compute the recursive function sprintf_type s at type-checking time.

    This kind of type-level computation is a distinctive feature of many dependently typed languages, a feature that F* acquired in full generality about 2 years ago. Unlike other dependently typed languages, in addition to type-level computation, the F* type-checker also makes use of algebraic and logical reasoning via an SMT solver. Balancing these features has required quite a lot of care: e.g., compute too much in the type-checker and the SMT solver is faced with reasoning about enormous, partially reduced terms; compute too little and the SMT solver has to do a lot of computation, which is not its strong point.

    For sprintf, we don’t want to rely on SMT reasoning at all, since everything can be checked simply using type-level computation. For this, F* requires a hint in the type of sprintf:

    val sprintf : s:string{normalize (valid_format_string s)}
               -> normalize (sprintf_type s)
    

    Notice the additional calls to normalize, which signals to F* to fully reduce valid_format_string s and sprintf_type s before resorting to SMT reasoning.

    Additional material

    See the actual library (based on F* code that Catalin Hritcu initially modeled after Arthur Azevedo de Amorim’s Coq sprintf) for more details about how this works, including how we compile away uses of sprintf to a bunch of string concatenations.

    Here’s a small standalone version. Compare it to the version in Cayenne … it’s pretty similar, but it took (well, at least it took me) 15 years to get there! : )

    module SimplePrintf
    open FStar.Char
    open FStar.String
    
    let rec valid_format_chars = function
        | [] -> true
        | '%'::'d'::s 
        | '%'::'s'::s 
        | '%'::'%'::s -> valid_format_chars s    
        | '%'::_ -> false
        | _::s -> valid_format_chars s
    
    let valid_format_string s =
        valid_format_chars (list_of_string s)
    
    let rec sprintf_type_chars (s:list char) =
        match s with
        | [] -> string
        | '%'::'d'::s -> (int -> sprintf_type_chars s)
        | '%'::'s'::s -> (string -> sprintf_type_chars s)
        | '%'::'%'::s -> sprintf_type_chars s
        | _::s -> sprintf_type_chars s
    
    let sprintf_type s = sprintf_type_chars (list_of_string s)
    
    let rec sprintf_chars 
            (s:list char{normalize (valid_format_chars s)})
            (res:string)
        : normalize (sprintf_type_chars s)
        = match s with
          | [] -> res
          | '%'::'d'::s -> fun (x:int) -> sprintf_chars s (res ^ string_of_int x)
          | '%'::'s'::s -> fun (x:string) -> sprintf_chars s (res ^ x)
          | '%'::'%'::s -> sprintf_chars s (res ^ "%")
          | x::s -> sprintf_chars s (res ^ string_of_char x)
    
    let sprintf (s:string{normalize (valid_format_string s)})
       : normalize (sprintf_type s)
       = sprintf_chars (list_of_string s) ""
    
  • F* v0.9.5.0 released

    On 24 August 2017 we released F* v0.9.5.0. This is another big release with lots of changes and new features compared to v0.9.4.0.

    Main new features

    • Proofs by reification (see this paper)
    • A revision of the libraries based on a new formal account of monotonic state (see this paper)
    • Extraction of programs with user-defined effects
    • Experimental support for tactics
    • New IDE protocol and new IDE features: autocompletion, evaluation, real-time syntax checking, jump-to-definition, type-at-point, etc.

    Changes and other improvements

    • A reorganization of the library and a single fstarlib.cmxa against which to link all F* programs compiled to OCaml (this change is incompatible with previous versions of F*)
    • A new printer of source terms
    • Revised error reporting from failed SMT queries
    • Improved support for separate compilation via a binary format for checked modules
    • Fixed a ton of bugs (179 closed GitHub issues)
  • Teaching F*

    Having to teach F* provides strong motivation to dust off the cobwebs and tidy away long forgotten bread crumbs hidden deep down in remote directories to make the language easier to install and use. It is thus no coincidence that major releases have been aligned with some of us going back to school after a long summer of coding to step out there and present the newest features of the language to a crowd of rowdy students.

    The latest two occasions:

    It is no coincidence that both courses had a decidedly cryptographic focus given that F* is the language of choice of the Everest project.

    What features would you like to see included in future research schools? And what do you think are the biggest stumbling blocks when learning and teaching a hot-off-the-press research language like F*?

  • F* v0.9.4.0 released

    The F* team is pleased to announce the release of F* v0.9.4.0. This is the culmination of exactly one year of hard work from a very quickly expanding F* team. We’re not very good at keeping precise change lists, but here are the main highlights of this release:

    • Predicative hierarchy of universes with universe polymorphism
    • Uniform syntax between expressions and types allowing rich type-level computation
    • Dijkstra Monads for Free
    • Extraction to C via KreMLin
    • New parser based on Menhir
    • New pretty printer for surface syntax and fstar --indent
    • Changed default effect to Tot
    • Strict positivity check for inductives
    • New synatax for inductive type projectors and discriminators
    • Better semantics for module open and support for local opens
    • Better dependency analysis
    • Better error locations for Z3 failures
    • Replaced Z3 timeouts with machine independent resource limits
    • Cleaned up libraries and examples (a bit)
    • Improvements to interactive mode
    • Docker builds
    • Fixed a ton of bugs (262 closed GitHub issues)

    Enjoy the best F* release ever!

  • Introducing KreMlin

    The work we do these days on F* is often in service of Project Everest. The goal of Everest is to verify and deploy a drop-in replacement for the HTTPS stack, the protocol using which you are probably reading this page, securely (hopefully). So far, we’ve been focusing most of our efforts in Everest on TLS, the protocol at the heart of HTTPS.

    Right now, I’m stuck in the Eurostar back from our week-long meeting in Cambridge, UK, so it feels like a good time to write down some thoughts about KreMLin, a new compiler backend that we’re using in Everest, that several of us have been working on over the summer, at MSR and INRIA.

    As a reminder, Everest sets out to verify and deploy secure cryptographic protocols, starting with TLS 1.3. Deploy is the salient part: in order to see people adopt our code, we not only need to write and prove our TLS library, but also to

    • make sure it delivers a level of performance acceptable for browser vendors, and
    • package it in a form that’s palatable for a hardcode Windows developer that started writing C before I was born.

    A TLS library can, roughly speaking, be broken down into two parts: the protocol layer that performs the handshake (“libssl”) and the cryptographic layer that actually encrypts the data to be transmitted (“libcrypto”). The handshake connects to the server, says hi, agrees on which algorithms to use, and agrees on some cryptographic parameters. Once parameters have been setup, the cryptographic layer is responsible for encrypting the stream of data.

    ...Read more
  • Welcome to F*!

    After many discussions, and in the spirit of the Gallium Blog (where I was a regular), the F* team is happy to announce the F*-blog! Expect a variety of posts, ranging from technical digressions about Dijkstra Monads to engineering discussions about parsing technology, and pretty much anything in between.

    One of our stated goals is to make F* more accessible to beginners; this means making the setup easier, but also writing more documentation, so that people who are not in the vicinity of the F* team can write programs, too. We’ve started an effort on the wiki; the goal of this blog is to complement the wiki and make it easy for F* enthusiasts to keep up with the development; be notified about breaking changes on the master branch, and more generally make the development process more open.

    I expect that this blog will also cover related projects, such as KreMLin, our F*-to-C translator, and miTLS, our ongoing implementation of TLS 1.3 in F*. Stay tuned! ☭

subscribe via RSS