| Posted by Ki Yung Ahn

A Prolog Specification of Extensible Records based on Row Polymorphism

Finally, I got this to work. First, I'll go over the specifications of polymorphic type systems needed for the specification of extensible records. Then, I'll try to explain the specification of a type system supporting extensible records based on row polymorphism. All the examples in this article are available on an online IDE (follow the links on the TIPER homepage), along with the slides for the invited talk at CoALP-Ty '16.

Review on Polymorphic Type Systems

I wrote about this in an earlier blog article, and it is also published as a paper at FLOPS 2016. So, I won't go into detailed explanations but just list the Prolog specifications of STLC, HM, and type-constructor polymorphism. We need at least the flexibility of type-constructor polymorphism in the type system to support row polymorphism, because row variables need to be classified by a separate kind, distinct from the kind of types.

Let's start with STLC:

:- set_prolog_flag(occurs_check,true).
:- op(500,yfx,$).

type(C,var(X),     A) :- first(X:A,C).
type(C,lam(X,E),A->B) :- type([X:A|C], E,  B).
type(C,E1 $ E2,    B) :- type(C,E1,A->B), type(C,E2,A).
 
first(K:V,[K1:V1|_]) :- K = K1, V = V1.
first(K:V,[K1:_|Xs]) :- K\==K1, first(K:V, Xs).
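
For example, assuming SWI-Prolog (with the occurs-check flag set as above), we can load the clauses and query the types of the identity and application combinators; the variable names below are cleaned up for readability:

?- type([], lam(x,var(x)), T).
T = (A->A).

?- type([], lam(f,lam(x,var(f)$var(x))), T).
T = ((A->B)->A->B).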

Then, HM:

:- set_prolog_flag(occurs_check,true).
:- op(500,yfx,$).

type(C,var(X),     A) :- first(X:S,C), inst_ty(S,A).
type(C,lam(X,E),A->B) :- type([X:mono(A)|C],E,B).
type(C,E1 $ E2,    B) :- type(C,E1,A->B), type(C,E2,A).
type(C,let(X=E0,E),T) :- type(C,E0,A), type([X:poly(C,A)|C],E,T).

inst_ty(poly(C,T),T1) :- copy_term(t(C,T),t(C,T1)).
inst_ty(mono(T),   T).

first(K:V,[K1:V1|_]) :- K = K1, V = V1.
first(K:V,[K1:_|Zs]) :- K\==K1, first(K:V, Zs).
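
Again, as a quick sanity check (assuming SWI-Prolog, variable names cleaned up as before): the let-bound identity can be applied to itself, because its polymorphic type is freshly instantiated at each use site, something STLC rejects:

?- type([], let(id=lam(x,var(x)), var(id)$var(id)), T).
T = (A->A).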

Then, HM + Type Constructor Polymorphism:

:- set_prolog_flag(occurs_check,true).
:- op(500,yfx,$).
 
kind(KC,var(Z), K) :- first(Z:K,KC).
kind(KC,F $ G, K2) :- kind(KC,F,K1->K2), kind(KC,G,K1).
kind(KC,A -> B, o) :- kind(KC,A,o), kind(KC,B,o).
 
type(KC,C,var(X),     A) --> { first(X:S,C) }, inst_ty(KC,S,A).
type(KC,C,lam(X,E),A->B) --> type(KC,[X:mono(A)|C],E,B),
                             [ kind(KC,A->B,o) ].
type(KC,C,E1 $ E2,    B) --> type(KC,C,E1,A->B), type(KC,C,E2,A).
type(KC,C,let(X=E0,E),T) --> type(KC,C,E0,A),
                             type(KC,[X:poly(C,A)|C],E,T).
 
inst_ty(KC,poly(C,T),T2) --> { copy_term(t(C,T),t(C,T1)),
                               free_variables(T,Xs),
                                     free_variables(T1,Xs1) },
                             samekinds(KC,Xs,Xs1), { T1=T2 }.
inst_ty(_, mono(T),   T) --> [].
 
samekinds(KC,[X|Xs],[Y|Ys]) --> { X\==Y },
                                [ kind(KC,X,K), kind(KC,Y,K) ],
                                samekinds(KC,Xs,Ys).
samekinds(KC,[X|Xs],[X|Ys]) --> [], samekinds(KC,Xs,Ys).
samekinds(_ ,[],    []    ) --> [].

first(K:V,[K1:V1|_]) :- K = K1, V = V1.
first(K:V,[K1:_|Zs]) :- K\==K1, first(K:V, Zs).
 
variablize(var(X)) :- gensym(t,X).

What's going on here is that we do type and kind inference in two stages, where kinds are either o or K1 -> K2 with both K1 and K2 kinds. Because there are type constructor variables with different kinds, we need to make sure that all types are 'well-kinded' by satisfying the kind predicate, just as we make sure that all terms are 'well-typed' by the type predicate. First, we do type inference and collect all the queries (or constraints) for kind inference as a yet uninterpreted list, using Definite Clause Grammars (DCGs). The type predicate is defined as a DCG, and one can use the phrase predicate to invoke DCG queries. Then, we run the kind queries generated by the DCG query over the type predicate. For example, we can use the code above as follows:

main:-
    process,
    halt.

type_and_print(KC,C,E,T) :-
  phrase(type(KC,C,E,T),Gs), print(success_type), nl,
  (bagof(Ty, X^Y^member(kind(X,Ty,Y),Gs), Tys); Tys = []),
  free_variables(Tys,Xs), maplist(variablize,Xs),
  maplist(call,Gs),
  write("kind ctx instantiated as: "), print(KC), nl, print(E : T), nl.
 
process:-
    type_and_print(_,[],lam(x,var(x)),_), nl,
    type_and_print(_,[],lam(x,lam(y,var(y)$var(x))),_), nl,
    type_and_print(_,[],let(id=lam(x,var(x)),var(id)$var(id)),_), nl,
    KC0 = [ 'Nat':o, 'List':(o->o) | _ ],
    Nat = var('Nat'), List = var('List'),
    C0 = [ 'Zero':mono(Nat)
         , 'Succ':mono(Nat -> Nat)
         , 'Nil' :poly([], List$A)
         , 'Cons':poly([], A->((List$A)->(List$A))) ],
    type_and_print(KC0,C0,lam(x,lam(n,var(x)$var('Succ')$var(n))),_),
    true.

:- main.

On top of this, we could also easily add kind polymorphism (which is basically HM at the type level) as in our previous work (FLOPS 2016), but this much is enough for row polymorphism; kind polymorphism is not needed for it.

Extensible Records based on Row Polymorphism

Here is how we extend HM + tycon poly with row polymorphism for extensible records:

:- set_prolog_flag(occurs_check,true).
:- op(500,yfx,$).

kind(KC,var(X),    K) :- first(X:K,KC).
kind(KC,F $ G,    K2) :- K2\==row, kind(KC,F,K1->K2),
                         K1\==row, kind(KC,G,K1).
kind(KC,A -> B,    o) :- kind(KC,A,o), kind(KC,B,o).
kind(KC,{R},       o) :- kind(KC,R,row).
kind(_ ,[],      row).
kind(KC,[X:T|R], row) :- kind(KC,T,o), kind(KC,R,row).

type(KC,C,var(X),     A) --> { first(X:S,C) }, inst_ty(KC,S,A).
type(KC,C,lam(X,E),A->B) --> type(KC,[X:mono(A)|C],E,B),
                             [ kind(KC,A->B,o) ].
type(KC,C,X $ Y,      B) --> type(KC,C,X,A->B), type(KC,C,Y,A1),
                             !, { eqty(A,A1) }. % note the cut !
type(KC,C,let(X=E0,E),T) --> type(KC,C,E0,A),
                             type(KC,[X:poly(C,A)|C],E,T).
type(KC,C,{XEs},    {R}) --> { zip_with('=',Xs,Es,XEs) },
                             type_many(KC,C,Es,Ts),
                             { zip_with(':',Xs,Ts,R) }.
type(KC,C,sel(L,X),   T) --> { first(X:T,R) }, type(KC,C,L,{R}).

first(K:V,[K1:V1|_]) :- K = K1, V = V1.
first(K:V,[K1:_|Xs]) :- K\==K1, first(K:V, Xs).

inst_ty(KC,poly(C,T),T2) --> { copy_term(t(C,T),t(C,T1)), 
                               free_variables(T,Xs),
                               free_variables(T1,Xs1) },
                             samekinds(KC,Xs,Xs1), { T1=T2 }.
inst_ty(_ ,mono(T),   T) --> [].

samekinds(_ ,[],    []    ) --> [].
samekinds(KC,[X|Xs],[Y|Ys]) --> { X\==Y },
                                [ kind(KC,X,K), kind(KC,Y,K) ],
                                samekinds(KC,Xs,Ys).
samekinds(KC,[X|Xs],[X|Ys]) --> [], samekinds(KC,Xs,Ys).

zip_with(_,[],    [],    []      ).
zip_with(F,[X|Xs],[Y|Ys],[FXY|Ps]) :- FXY=..[F,X,Y],
                                      zip_with(F,Xs,Ys,Ps).

type_many(_ ,_,[],    []    ) --> [].
type_many(KC,C,[E|Es],[T|Ts]) --> type(KC,C,E,T),
                                  type_many(KC,C,Es,Ts).

The kind predicate is extended with the row kind. The row kind is inhabited by row configurations: a row configuration is either a row variable or an extension of a row configuration with a name–type pair. In addition, we use a more advanced notion of type equality on application terms to overcome the limits of structural unification in Prolog. The type predicate is extended with typing rules for records accordingly, using some helper predicates (zip_with and type_many), which is rather straightforward.
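
As a quick illustration (a sketch; it assumes the eqty and unify_oemap helpers defined below are also loaded, and variables are renamed for readability), the specification infers a row-polymorphic type for a field selector, i.e., a function accepting any record that contains at least a field x, whatever other fields it may have:

?- phrase(type(KC,[],lam(r,sel(var(r),x)),T), Gs).
T = {[x:A|R]}->A.

The kind goals collected in Gs can then be run afterwards, as in the type_and_print predicate from the previous section.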

A more subtle part of the change, worth explaining further, is the use of a more advanced notion of type equality (eqty) in the clause for the application term, which is defined as follows:

eqty(A1,A2) :- (var(A1); var(A2)), !, A1=A2.
eqty({R1},{R2}) :- !, unify_oemap(R1,R2). % permutation(R1,R2) is not enough
eqty(A1->B1,A2->B2) :- !, eqty(A2,A1), !, eqty(B1,B2). % in case of subtyping
eqty(A,A).

The eqty predicate extends structural unification with a notion of open-ended map unification (unify_oemap) over name–value pairs, where names are atomic terms in Prolog and values are unifiable objects modulo eqty (thus, eqty and unify_oemap are mutually recursive definitions). Because we represent an extensible record as a possibly open-ended list terminated by a row variable, a Prolog standard library predicate such as permutation, which works only for finite lists, is not going to work well. In addition, we use cuts to prevent backtracking out of eqty, because eqty is intended to be a decision procedure that needs no backtracking. The use of eqty at application terms is in sync with the common wisdom of performing more advanced notions of type matching at applications (e.g., the argument type being a subtype of the function's formal argument type, in type systems with subtyping) when developing syntax-directed rules for type systems.

In terms of constraint generation and solving, we can also understand this as forcing the collected record constraints to be solved whenever the type system tries to type-infer (or type-check) an application term. Ordinary structural unification is handled natively by Prolog, where no manual control over constraint generation and solving is needed (except for delaying kind inference until after type inference using DCGs). But supporting open-ended records via membership constraints amounts to genuine constraint handling, so some manual control is inevitable. What !, { eqty(A,A1) } does in the type predicate is prevent the generation of infinitely many hopeless candidates for open-ended map unification (e.g., hopelessly trying to unify [x:_|_], [_,x:_|_], [_,_,x:_|_], ... with []) and decide to fail instead.
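
To see eqty in action (it needs the unify_oemap predicate defined below): fields may come in any order, and unifying two open-ended records merges their fields and shares a fresh common tail (shown as T here):

?- eqty({[x:int,y:bool]}, {[y:bool,x:int]}).
true.

?- eqty({[x:int|R1]}, {[y:bool|R2]}).
R1 = [y:bool|T],
R2 = [x:int|T].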

The predicate for open-ended map unification (unify_oemap) is defined as follows:

% related paper:
%
% Membership-Constraints and Complexity in Logic Programming with Sets,
% Frieder Stolzenburg (1996).
% http://link.springer.com/chapter/10.1007%2F978-94-009-0349-4_15
% http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.54.8356

% unify finite maps
unify_map(A,B) :- submap_of(A,B), submap_of(B,A).

submap_of([], _).
submap_of([X:V|R],M) :- first(X:V1,M), eqty(V,V1), submap_of(R,M).


% finite map minus
mapminus(A,[],A).
mapminus([],_,[]).
mapminus([X:V|Ps],B,C) :- first(X:V1,B), !, eqty(V,V1), mapminus(Ps,B,C).
mapminus([X:V|Ps],B,[X:V|C]) :- mapminus(Ps,B,C).



% unify open ended maps with possibly uninstantiated variable tail at the end
unify_oemap(A,B) :- ( var(A); var(B) ), !, A=B.
unify_oemap(A,B) :-
        split_heads(A,Xs-T1), make_map(Xs,M1),
        split_heads(B,Ys-T2), make_map(Ys,M2),
        unify_oe_map(M1-T1, M2-T2).

make_map(L,M) :- setof(X:V,first(X:V,L),M). % remove duplicates
make_map([],[]).

split_heads([],[]-[]).
split_heads([X:V|T],[X:V]-T) :- var(T), !.
split_heads([X:V|Ps],[X:V|Hs]-T) :- split_heads(Ps,Hs-T).

% helper predicate for unify_oemap
unify_oe_map(Xs-T1,Ys-T2) :- T1==[], T2==[], unify_map(Xs,Ys).
unify_oe_map(Xs-T1,Ys-T2) :- T1==[], submap_of(Ys,Xs), mapminus(Xs,Ys,T2).
unify_oe_map(Xs-T1,Ys-T2) :- T2==[], submap_of(Xs,Ys), mapminus(Ys,Xs,T1).
unify_oe_map(Xs-T1,Ys-T2) :- 
        mapminus(Ys,Xs,L1), append(L1,T,T1),
        mapminus(Xs,Ys,L2), append(L2,T,T2).

%% ?- unify_oemap([z:string,y:bool|M1],[y:T,x:int|M2]).
%% M1 = [x:int|_G1426],
%% T = bool,
%% M2 = [z:string|_G1426].

You can run the full example code at http://ideone.com/x13a5O .

| Posted by Ki Yung Ahn

About scope extrusion of pi-calculus

Looking into the pi-calculus (\(\pi\)-calculus) recently because it is likely to be my day job soon. Some basic questions came up while looking into it, and one of them concerns scope extrusion. I'll assume the reader is familiar with the basic definitions of the pi-calculus. If not, you may consult the Wikipedia article on the pi-calculus and many other materials freely available online.

In standard formulations of the pi-calculus, a fresh channel name is created by the \(\nu\)-binder, as in \((\nu x)P\), where \(x\) is the new name bound in process \(P\). This informal description in English is already confusing, because \(x\) is mentioned as a fresh channel name, suggesting that \(x\) is a globally unique identifier, but \(x\) is also described as a bound variable whose scope is restricted to \(P\). IMHO, this confusion is not merely because one is not used to the pi-calculus; there is a good reason to be confused. In addition, I think there is a rather well-known way of describing this without such confusion. In a nutshell, the \(\nu\)-binder is not really a binder; rather, it is an action that creates a fresh channel identifier, which should be treated analogously to the ref expression that creates a fresh reference in ML. It really is exactly like that. I really wonder why they had to define the pi-calculus like this. What's strange is that the pi-calculus was created by Robin Milner, the same famous person who designed ML. Now, let me give some more details below.

Scope extrusion is a "feature" that sets the pi-calculus apart from earlier process calculi such as CSP and CCS (Abadi & Gordon 1996). This "improvement" provided by scope extrusion allows the channel name freshly created by the \(\nu\)-binder to escape its scope; more precisely, the scope can change according to the axiom that relates restriction (\(\nu\)) and parallel composition (\(\mid\)):

$$ ((\nu x)P) \mid Q \;\equiv\; (\nu x)(P \mid Q) $$

where \(x\) does not appear free in \(Q\). On the left-hand side, \(x\) is bound only in \(P\). On the right-hand side, \(x\) is bound in both \(P\) and \(Q\). So, why do we extend the scope of \(x\)? It is because \(P\) may send \(x\) to \(Q\), and \(x\) may then appear free in \(Q\) after a certain number of reductions. The parallel processes \(P\) and \(Q\) in \(P \mid Q\) can send and receive terms, including variables. In the pi-calculus, a matching pair of send and receive actions on the same channel by two parallel processes forms a redex. The reduction rule of the pi-calculus captures this idea:

$$ \bar{x}\langle z\rangle.P \mid x(y).Q \;\longrightarrow\; P \mid Q[z/y] $$

The send action \(\bar{x}\langle z\rangle\) before \(P\) is not a binder: both the channel name \(x\) and the sent name \(z\) are free variables. On the other hand, the receive action \(x(y)\) before \(Q\) binds \(y\) in \(Q\). That is, \(y\) is a local variable whose scope is within \(Q\).
Consider the following process, a composition of two parallel processes:

$$ ((\nu x)\,\bar{y}\langle x\rangle.P) \mid y(z).Q $$

Here, \(x\) is bound only in the left-hand side of the parallel composition, but not in the right-hand side. Therefore, sending \(x\) from the left-hand side to the right-hand side without any preparation would suddenly introduce a new free variable on the right-hand side. To remedy this problem, we use the axiom mentioned earlier to pre-process the process syntax as follows, and then the reduction step can happen:

$$ (\nu x)\,(\bar{y}\langle x\rangle.P \mid y(z).Q) $$

where the scope of \(x\) now extends over the right-hand side of the parallel composition as well. In contrast to the local variable binding of the receive action \(y(z)\), which is a true binder analogous to \(\lambda\) in the lambda calculus, the \(\nu\)-binder is a fake binder, or at least a very strange one, somewhat like dynamic scoping: its scope extends as needed by the reduction semantics. FYI, the send action \(\bar{y}\langle x\rangle\) does not introduce any bound variables; both \(y\) and \(x\) are free in it, because it is sending the already-known term \(x\).

This is just so unnatural to me. Instead, we can make \(\nu\) a true binder, just like the bound variable of the receive action, and make the reduction semantics reduce the binder by substituting the bound variable with a fresh channel name, like this:

$$ (\nu x)P \;\longrightarrow\; P[c/x] $$

where \(c\) is a fresh channel name. As simple as that, just like allocating fresh locations (or addresses) for reference cells in ML. Then, everything became quite clear to me. The \(\nu\)-action is a simple command that creates a globally fresh channel and binds it to the variable following \(\nu\). So, what we need is a clear distinction between variables and channel names, which is not present in the original calculus. This messy confusion, I think, comes from abusing variables for both local variable binding and global channel names in the pi-calculus.
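
To make the proposed semantics concrete, here is a toy Prolog sketch (my own invented encoding, not taken from the pi-calculus literature): processes are terms built from zero, par/2, send/3, recv/3, and nu/2; channel values are ch(N) terms; and nu allocates a globally fresh channel via gensym, just like ML's ref allocating a fresh location:

% substitution of a value V for a variable X in this tiny syntax
subst(zero,        _,_, zero).
subst(par(P,Q),    X,V, par(P1,Q1))     :- subst(P,X,V,P1), subst(Q,X,V,Q1).
subst(send(C,Z,P), X,V, send(C1,Z1,P1)) :- substn(C,X,V,C1), substn(Z,X,V,Z1),
                                           subst(P,X,V,P1).
subst(recv(C,Y,P), X,V, recv(C1,Y,P1))  :- substn(C,X,V,C1),
                                           ( Y == X -> P1 = P ; subst(P,X,V,P1) ).
subst(nu(Y,P),     X,V, nu(Y,P1))       :- ( Y == X -> P1 = P ; subst(P,X,V,P1) ).

substn(N,X,V,V) :- N == X, !.   % names occurring in send/recv positions
substn(N,_,_,N).

% nu simply creates a globally fresh channel name; no scope extrusion needed
step(nu(X,P), P1) :- gensym(ch,C), subst(P,X,ch(C),P1).
% a matching send/receive pair on the same channel is a redex
step(par(send(C,Z,P), recv(C,Y,Q)), par(P,Q1)) :- subst(Q,Y,Z,Q1).

%% ?- step(nu(x,par(send(y,x,zero),recv(y,z,zero))), P), step(P,Q).
%% P = par(send(y,ch(ch1),zero), recv(y,z,zero)),  % gensym counter may differ
%% Q = par(zero, zero).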

I am not the first to observe the need for this distinction. In fact, several people who worked on variations of the pi-calculus have made a distinction between channel names and variables. For instance, in the spi-calculus (Abadi & Gordon 1996), processes can send and receive terms, and within terms there is a distinction between variables and channel names. However, Abadi and Gordon's spi-calculus still relies on the scope extrusion of the pi-calculus and builds upon it, and several other variations of the pi-calculus seem to be designed similarly; the distinction between variables and names was made to ease certain proofs of properties, but those authors did not seem to find any problem with, or have strange feelings about, the \(\nu\)-binder itself like I have. FYI, the scoping of a \(\nu\)-bound variable has nothing to do with channel access rights or anything like security information-flow purposes; scope extrusion just lets all the parallel processes around the \(\nu\)-binder have access to the new channel. In fact, restricting channel access to certain processes is exactly the motivation of the spi-calculus.

So, my concluding question in this post is: what is the real benefit of having this strange scope extrusion in the pi-calculus, which makes \(\nu\) a very strange binder and requires an extra axiom to manage its scope manually? It seems much more understandable, and closer to a real implementation, to treat \(\nu\) as fresh channel creation, just like reference cells in ML. Is there really any benefit to manually extending the scope before reduction, or are we simply formulating it this way for backwards compatibility with prior literature on the pi-calculus?

P.S. I also posted a question on CS Theory Stack Exchange:
http://cstheory.stackexchange.com/questions/36108/is-scope-extrusion-necessary-in-pi-calculus

| Posted by Ki Yung Ahn

Moved from Tumblr

Moving here because MathJax support and syntax highlighting for code are built in.
But the main reason was that I got really annoyed at Markdown parsing errors in Tumblr's web editor. If you need to visit my previous blog posts, they are still on Tumblr:
http://kyagrd.tumblr.com/

Bye Tumblr, Hello Logdown :)

| Posted by Ki Yung Ahn

n-Queens using MiniZinc

MiniZinc is a high-level Constraint Programming (CP) language (or a frontend for constraint programming solvers). Being distributed with its own IDE, it is a most friendly environment for getting a taste of what constraint programming is. Here is the n-Queens problem, which is like a hello world for CP, in MiniZinc.

include "globals.mzn";

int: n; % fixed integer parameter

% (xs[y],y) is the coordinate of a queen
array[1..n] of var 1..n: xs;

% constraints for the n-queens problem on an nxn board
constraint alldifferent(xs);
constraint forall (i,j in 1..n where i<j) ( j-i != abs(xs[i]-xs[j]) );

solve satisfy;                % find satisfying solution

output ["Solution:\n", "xs = \(xs) \n",
        "To pretty-print the chessboard config, use ghci to run:\n", "let xs = \(xs) in putStrLn $ concat $ Data.List.intersperse \"\\n\" [ [if i==k then 'Q' else '#'  | k <- ys ] | i<-[1..length xs] ]"];

Note that fixed parameters, which must be initialized before running the solver, are declared without the var keyword, and the variables whose values are searched for as solutions are declared with the var keyword. This var is in fact part of the type.

And in the IDE it looks like this.

n-Queens code in MiniZinc IDE

Since we have not initialized the fixed parameter n, the IDE will prompt for its value when you run the model with Ctrl-R.

prompt for fixed parameter in MiniZinc IDE

The output looks like this for n = 8.

output in MiniZinc IDE

Solution: xs = [3, 6, 4, 2, 8, 5, 7, 1] To pretty-print the chessboard config, use ghci to run: let xs = [3, 6, 4, 2, 8, 5, 7, 1] in putStrLn $ concat $ Data.List.intersperse "\n" [ [if i==k then 'Q' else '#' | k <- ys ] | i<-[1..length xs] ]

Try pretty-printing the result as a 2D chessboard configuration using your favorite programming language, which in my case is Haskell.

ghci output

Seems like MiniZinc got a correct solution :)

You actually have to use an external mechanism to print this last step, because you cannot use var types where normal types are required. var int is different from int. The output command only prints strings, not var strings. var is like a stage annotation that contaminates the result once you use it: if you try to print out something that relies on any var value, MiniZinc will complain that it is a var string, not a string. If a high-level constraint programming language supported multi-staged computation, we could actually do this kind of thing in the last step of printing out a result depending on the solution value.

| Posted by Ki Yung Ahn

Why Prolog does not have (open-ended) set unification built-in

In recent months I have been wondering why Prolog and other similar logic programming implementations do not generally support (open-ended) set unification. There have been many studies on finite set unification, which still do not seem to have made it into standardized libraries or common idioms in logic programming, but there are only a few studies on open-ended set unification. The reason I started to wonder about this is that I needed map unification for describing type inference involving records; for instance, {x:int, y:bool} should be equivalent to {y:bool, x:int}. A simplified version of the problem, ignoring the type annotations, is set unification, where {x,y} and {y,x} should be considered equivalent.

More generally, in order to handle extensible records, we need open-ended unification, where the tail (or the rest) of the set is a variable. For example, consider unifying a set containing at least x, denoted by {x|S1}, and a set containing at least y, denoted by {y|S2}. We intuitively know that the most general unifier for the query ?- {x|S1} = {y|S2}. is S1 = {y|S}, S2 = {x|S}, where the substitutions for S1 and S2 share the same tail S. This idea of open-ended set unification has already been explored and published by Frieder Stolzenburg (1996). I was able to write this down in Prolog at https://github.com/kyagrd/ExtensibleRecordsWithSetMembLP (see setunify_naive.pl and setunify.pl in the repository), and it seems to work well in Prolog implementations. So, in a way, this is a solved problem. However, this kind of open-ended set unification is not wired into Prolog or other logic programming implementations as a primitive built-in unification operation.
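
For concreteness, here is a condensed sketch of such open-ended set unification over Prolog lists, modeled on the unify_oemap predicate from the extensible-records post above (the predicate names unify_oeset, unify_oe, and setminus are my own, and this simplified version assumes the set elements are ground atoms; the versions in the repository are more careful):

% open-ended sets are lists whose tail may be an unbound variable
unify_oeset(A,B) :- ( var(A) ; var(B) ), !, A = B.
unify_oeset(A,B) :- split_heads(A,Xs-T1), split_heads(B,Ys-T2),
                    unify_oe(Xs-T1,Ys-T2).

split_heads([],[]-[]).
split_heads([X|T],[X]-T) :- var(T), !.
split_heads([X|Ps],[X|Hs]-T) :- split_heads(Ps,Hs-T).

setminus(A,[],A).
setminus([],_,[]).
setminus([X|Xs],B,C)     :- memberchk(X,B), !, setminus(Xs,B,C).
setminus([X|Xs],B,[X|C]) :- setminus(Xs,B,C).

% both tails closed: mutual subsets; one tail closed: the open tail takes
% the difference; both tails open: each takes the difference plus a shared
% fresh tail T
unify_oe(Xs-T1,Ys-T2) :- T1==[], T2==[], setminus(Xs,Ys,[]), setminus(Ys,Xs,[]).
unify_oe(Xs-T1,Ys-T2) :- T1==[], setminus(Ys,Xs,[]), setminus(Xs,Ys,T2).
unify_oe(Xs-T1,Ys-T2) :- T2==[], setminus(Xs,Ys,[]), setminus(Ys,Xs,T1).
unify_oe(Xs-T1,Ys-T2) :- setminus(Ys,Xs,L1), append(L1,T,T1),
                         setminus(Xs,Ys,L2), append(L2,T,T2).

%% ?- unify_oeset([x|S1],[y|S2]).
%% S1 = [y|T], S2 = [x|T].     % shared fresh tail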

So, why haven’t open-ended sets been wired in as primitive values that can be unified in the logic programming paradigm? The short answer might be that it is hard to come up with an efficient internal representation of open-ended sets, especially for nested open-ended sets, for instance, sets of sets. We can deal with unification of an open-ended list of open-ended lists (here, again, open-ended means that the tail may be an uninstantiated logic variable) simply by applying the recursive definition of structural unification, and we get the most general unifier as the first answer when executing it as a logic program. However, when we are dealing with sets of sets, this becomes a bit irritating. Consider the following unification:

?- { {x|S1} | SS1 } = { {y|S2} | SS2 }.

What is the most general unifier of this unification? We can think of two candidates.

  • First, try the top-level open-ended set unification without taking into consideration that the elements are open-ended sets:
    SS1 = { {y|S2} | SS }, SS2 = { {x|S1} | SS }

  • Second, considering that the elements are open-ended sets, try unifying the known elements, and if that unification of elements succeeds, then unify the tails:
    S1 = {y|S}, S2 = {x|S}, SS1 = SS2

Both are unifiers. Although the latter looks more compact, the former is actually the more “general” unifier in the traditional sense, because it represents more possibilities. When we instantiate S1 and S2 in the former as in the latter unifier, we effectively get the latter. That is, further instantiating the former unifier by S1 = {y|S}, S2 = {x|S} makes it equivalent to the latter, because duplication is ignored in sets. Still, when we are in a situation where we eventually want the latter answer, the internal representation of sets could be far from compact if we follow an algorithm that generates the former. So, there may be more things to consider in order to implement such set unification efficiently once we start supporting unification of open-ended sets in addition to the usual structural unification in a general-purpose logic programming system. And, from my experience of writing down the algorithms for set unification, it is quite difficult to predict whether an algorithm will behave like the former or the latter (or even differently from both).

Fortunately, for the problem of type inference involving extensible records, we do not need to consider this set-of-sets situation, because the field names of a record are always atomic labels (e.g., x and y in the examples above), not sets, not even compound structures. But the problem is interesting in itself, and logic programming enthusiasts might want to solve it. Since my current priority is expressing type inference involving record types, I only need one level of set unification on the labels; the type annotation of a label may itself be another record, and so on. That is, there can be a pattern of alternating structural unification and set unification (which is essentially map unification), but not set-of-sets unification, when handling record types. So, I don’t think I can take the initiative on this set-of-sets unification problem, with efficient implementation in mind, right now, but I am happy to discuss it with whoever might take the initiative to push forward on this particular problem.

I still need a generic unification operator covering both structural values and set values for type inference specifications, because a type system with records has both record types and non-record types (e.g., basic types, function types, …). In the implementation (internally hacking a logic programming implementation based on microKanren), the tricky part is tying the knot to provide a generic unification operator that applies to both structural values and set values. The implementation should start from structural recursion and add a hook that calls open-ended set unification when it meets a set constructor. However, the set operations are themselves defined in terms of a unification operator, so carelessly tying the knot could easily run into an infinite loop. I think I am getting close to tying the knot: I was able to implement set-of-sets unification by tying the knot at the membership constraint, so that set unification can call set unification on its elements if needed. I haven’t gotten to the last step of tying the knot at the structural equality to invoke set unification when it meets the set constructor.

| Posted by Ki Yung Ahn

Getting started with PureScript 0.7.1.0

Some Trivia about PureScript

PureScript is a purely functional language specifically targeting JavaScript as its backend. If you have tasted Haskell and are lamenting the current untyped mess in scripting languages, especially regarding web programming, probably your only choice out there is PureScript.

PureScript is the only language with a sane approach to statically supporting duck typing on records, by using Row Polymorphism.

In addition, PureScript supports Extensible Effects with the same mechanism, using Row Polymorphism. Monads are labeled with effects (e.g., console for console I/O) in PureScript. This greatly reduces the problem of excessive transformer stack-up when composing Haskell monadic libraries simply to combine effects in obvious ways.

Short note for new PureScript users (like me)

This is a short note for those who want to try PureScript and are already familiar with GHC and cabal. PureScript is relatively well documented, but for newbies it can be confusing, because updated documents for the recent version 0.7.1.0 are scattered around several URLs, and the e-book and several other documents on the web are written based on older versions.

Installing PureScript is easy if you have cabal (for installing PureScript) and npm (for installing JavaScript packages, especially pulp, the package dependency management/sandboxing tool for PureScript). Since PureScript is implemented in Haskell, I felt it most natural to use cabal because I am a GHC user. But, in fact, you don’t have to have GHC and cabal installed: there are also binary distributions, directly downloadable or available via other binary package managers, from the PureScript homepage; see http://www.purescript.org/download/ for more information. I’ll just stick to cabal in this document.

You can install PureScript on any system where cabal runs.

cabal install purescript


For npm and pulp,

You’d do something like this on Mac OS X:

brew install npm
npm install -g pulp


and on Debian-based Linux systems:

sudo apt-get install npm
npm install pulp


Most users of Mac OS X have admin rights, so it is most reasonable to install pulp globally in the system. The PureScript binaries should be installed in ~/.cabal/bin, and you should already have added ~/.cabal/bin to your PATH if you are a sensible GHC user. Since you’ve installed pulp globally with the -g option, it should also be in your PATH. Note that installing pulp can take a while, especially when you have freshly installed npm, because it installs all of pulp’s dependencies.

Although you can sudo npm install -g pulp on Linux systems if you are in the sudo group, that is usually not recommended on Linux. So, just install it locally and set up an additional PATH entry in .bashrc or whatever shell configuration file you use. In some Linux distros, such as Debian, the executable for “node.js” is named nodejs, rather than node as on most other systems, so you’d have to make a symbolic link somewhere in your PATH anyway; the pulp package manager for PureScript also tries to find node. Anyway, this is what I did on Debian after installing pulp locally. The npm package manager for JavaScript on Debian installs user packages into the ~/node_modules directory, and their binary/executable files are symlinked into the ~/node_modules/.bin directory.

kyagrd@debair:~$ ls -al node_modules/
total 16
drwxr-xr-x  4 kyagrd kyagrd 4096 Aug  2 22:30 .
drwxr-xr-x 33 kyagrd kyagrd 4096 Aug  3 00:01 ..
drwxr-xr-x  2 kyagrd kyagrd 4096 Aug  3 00:00 .bin
drwxr-xr-x  4 kyagrd kyagrd 4096 Aug  2 22:29 pulp
kyagrd@debair:~$ ls -al node_modules/.bin
total 8
drwxr-xr-x 2 kyagrd kyagrd 4096 Aug  3 00:00 .
drwxr-xr-x 4 kyagrd kyagrd 4096 Aug  2 22:30 ..
lrwxrwxrwx 1 kyagrd kyagrd   15 Aug  3 00:00 node -> /usr/bin/nodejs
lrwxrwxrwx 1 kyagrd kyagrd   16 Aug  2 22:30 pulp -> ../pulp/index.js


So, this is a good place to stick in the renamed symlink for “node.js”. (If you don’t like it here, you can put it anywhere else, say ~/bin, and add that to your PATH.)

kyagrd@debair:~$ ln -s /usr/bin/nodejs ~/node_modules/.bin/node
kyagrd@debair:~$ ls -al node_modules/.bin
total 8
drwxr-xr-x 2 kyagrd kyagrd 4096 Aug  3 00:00 .
drwxr-xr-x 4 kyagrd kyagrd 4096 Aug  2 22:30 ..
lrwxrwxrwx 1 kyagrd kyagrd   15 Aug  3 00:00 node -> /usr/bin/nodejs
lrwxrwxrwx 1 kyagrd kyagrd   16 Aug  2 22:30 pulp -> ../pulp/index.js


Now that we’re done with all the tedious stuff, let’s have some fun.

kyagrd@debair:~$ mkdir pursproj
kyagrd@debair:~$ cd pursproj/
kyagrd@debair:~/pursproj$ pulp init
* Generating project skeleton in /home/kyagrd/pursproj
bower cached        git://github.com/purescript/purescript-console.git#0.1.0    
bower validate      0.1.0 against git://github.com/purescript/purescript-console.git#^0.1.0                                                                     
bower cached        git://github.com/purescript/purescript-eff.git#0.1.0        
bower validate      0.1.0 against git://github.com/purescript/purescript-eff.git#^0.1.0                                                                         
bower cached        git://github.com/purescript/purescript-prelude.git#0.1.1    
bower validate      0.1.1 against git://github.com/purescript/purescript-prelude.git#^0.1.0
bower install       purescript-console#0.1.0
bower install       purescript-eff#0.1.0
bower install       purescript-prelude#0.1.1

purescript-console#0.1.0 bower_components/purescript-console
└── purescript-eff#0.1.0

purescript-eff#0.1.0 bower_components/purescript-eff
└── purescript-prelude#0.1.1

purescript-prelude#0.1.1 bower_components/purescript-prelude


The pulp init command creates a template PureScript project for you. To see the other commands of pulp, you can run pulp --help.

Usage: /home/kyagrd/node_modules/.bin/pulp [options] <command> [command-options]

Global options:
  --bower-file -b <file> Read this bower.json file instead of autodetecting it.
  --help -h              Show this help message.
  --monochrome           Don't colourise log output.
  --then <string>        Run a shell command after the operation finishes.
                         Useful with `--watch`.
  --version -v           Show current pulp version
  --watch -w             Watch source directories and re-run command if
                         something changes.

Commands:
  browserify Produce a deployable bundle using Browserify.
  build      Build the project.
  dep        Invoke Bower for package management.
  docs       Generate project documentation.
  init       Generate an example PureScript project.
  psci       Launch a PureScript REPL configured for the project.
  run        Compile and run the project.
  test       Run project tests.

  Use `/home/kyagrd/node_modules/.bin/pulp <command> --help` to learn about
  command specific options.


We can try the run and test commands:

kyagrd@debair:~/pursproj$ pulp run
* Building project in /home/kyagrd/pursproj
* Build successful.
Hello sailor!
kyagrd@debair:~/pursproj$ cat src/Main.purs 
module Main where

import Control.Monad.Eff.Console

main = do
  log "Hello sailor!"
kyagrd@debair:~/pursproj$ pulp test
* Building project in /home/kyagrd/pursproj
* Build successful. Running tests...
You should add some tests.
* Tests OK.
kyagrd@debair:~/pursproj$ cat test/Main.purs 
module Test.Main where

import Control.Monad.Eff.Console

main = do
  log "You should add some tests."
kyagrd@debair:~/pursproj$ 


The run and test commands invoke the build command before running or testing if the project has not been built yet.

One last thing is about editing the “bower.json” file. The pulp package manager for PureScript uses bower (a local sandboxing package manager for JavaScript packages, also used for PureScript because PureScript packages seem to be designed to be compatible with the JavaScript package versioning/dependency conventions) to locally sandbox all dependencies in the bower_components directory. I think that for beginners like me, the only thing to edit is the dependencies section.

kyagrd@debair:~/pursproj$ ls
bower.json  bower_components  output  src  test
kyagrd@debair:~/pursproj$ cat bower.json 
{
  "name": "pursproj",
  "version": "1.0.0",
  "moduleType": [
    "node"
  ],
  "ignore": [
    "**/.*",
    "node_modules",
    "bower_components",
    "output"
  ],
  "dependencies": {
    "purescript-console": "^0.1.0"
  }
}


The dependencies section of the default “bower.json” created by pulp init only contains the purescript-console package. As you need more library packages, you can list them in the dependencies section. There is a list of the most commonly used basic packages (of course including purescript-console) combined as a sort of meta-package called purescript-base. We can edit the dependencies section and change purescript-console to purescript-base. Because purescript-base depends on purescript-console, it is redundant to specify purescript-console when purescript-base is there. The current version of purescript-base is 0.1.0, which happens to be the same as the version of purescript-console.

So, change it like this, and then install all the packages it depends on using pulp dep install (which invokes bower install). It sure does install more packages than before.

kyagrd@debair:~/pursproj$ vi bower.json  ### edit with text editor
kyagrd@debair:~/pursproj$ cat bower.json 
{
  "name": "pursproj",
  "version": "1.0.0",
  "moduleType": [
    "node"
  ],
  "ignore": [
    "**/.*",
    "node_modules",
    "bower_components",
    "output"
  ],
  "dependencies": {
    "purescript-base": "^0.1.0"
  }
}
kyagrd@debair:~/pursproj$ pulp dep install 
bower cached        git://github.com/purescript-contrib/purescript-base.git#0.1.0
bower validate      0.1.0 against git://github.com/purescript-contrib/purescript-base.git#^0.1.0
bower cached        git://github.com/purescript/purescript-arrays.git#0.4.1
bower validate      0.4.1 against git://github.com/purescript/purescript-arrays.git#^0.4.0
bower cached        git://github.com/purescript/purescript-control.git#0.3.0
bower validate      0.3.0 against git://github.com/purescript/purescript-control.git#^0.3.0
bower cached        git://github.com/purescript/purescript-either.git#0.2.0
bower validate      0.2.0 against git://github.com/purescript/purescript-either.git#^0.2.0
bower cached        git://github.com/purescript/purescript-enums.git#0.5.0
bower validate      0.5.0 against git://github.com/purescript/purescript-enums.git#^0.5.0
bower cached        git://github.com/purescript/purescript-foldable-traversable.git#0.4.0
bower validate      0.4.0 against git://github.com/purescript/purescript-foldable-traversable.git#^0.4.0
bower cached        git://github.com/paf31/purescript-lists.git#0.7.0
bower validate      0.7.0 against git://github.com/paf31/purescript-lists.git#^0.7.0
bower cached        git://github.com/purescript/purescript-math.git#0.2.0
bower validate      0.2.0 against git://github.com/purescript/purescript-math.git#^0.2.0
bower cached        git://github.com/purescript/purescript-maybe.git#0.3.3
bower validate      0.3.3 against git://github.com/purescript/purescript-maybe.git#^0.3.0
bower cached        git://github.com/purescript/purescript-monoid.git#0.3.0
bower validate      0.3.0 against git://github.com/purescript/purescript-monoid.git#^0.3.0
bower cached        git://github.com/purescript/purescript-strings.git#0.5.5
bower validate      0.5.5 against git://github.com/purescript/purescript-strings.git#^0.5.0
bower cached        git://github.com/purescript/purescript-tuples.git#0.4.0
bower validate      0.4.0 against git://github.com/purescript/purescript-tuples.git#^0.4.0
bower cached        git://github.com/paf31/purescript-unfoldable.git#0.4.0
bower validate      0.4.0 against git://github.com/paf31/purescript-unfoldable.git#^0.4.0
bower cached        git://github.com/purescript/purescript-integers.git#0.2.1
bower validate      0.2.1 against git://github.com/purescript/purescript-integers.git#^0.2.0
bower cached        git://github.com/purescript/purescript-st.git#0.1.0
bower validate      0.1.0 against git://github.com/purescript/purescript-st.git#^0.1.0
bower cached        git://github.com/purescript-contrib/purescript-bifunctors.git#0.4.0
bower validate      0.4.0 against git://github.com/purescript-contrib/purescript-bifunctors.git#^0.4.0
bower cached        git://github.com/purescript/purescript-lazy.git#0.4.0
bower validate      0.4.0 against git://github.com/purescript/purescript-lazy.git#^0.4.0
bower cached        git://github.com/purescript/purescript-invariant.git#0.3.0
bower validate      0.3.0 against git://github.com/purescript/purescript-invariant.git#^0.3.0
bower install       purescript-base#0.1.0
bower install       purescript-arrays#0.4.1
bower install       purescript-either#0.2.0
bower install       purescript-control#0.3.0
bower install       purescript-enums#0.5.0
bower install       purescript-foldable-traversable#0.4.0
bower install       purescript-lists#0.7.0
bower install       purescript-math#0.2.0
bower install       purescript-maybe#0.3.3
bower install       purescript-monoid#0.3.0
bower install       purescript-strings#0.5.5
bower install       purescript-tuples#0.4.0
bower install       purescript-unfoldable#0.4.0
bower install       purescript-st#0.1.0
bower install       purescript-integers#0.2.1
bower install       purescript-bifunctors#0.4.0
bower install       purescript-lazy#0.4.0
bower install       purescript-invariant#0.3.0

purescript-base#0.1.0 bower_components/purescript-base
├── purescript-arrays#0.4.1
├── purescript-console#0.1.0
├── purescript-control#0.3.0
├── purescript-either#0.2.0
├── purescript-enums#0.5.0
├── purescript-foldable-traversable#0.4.0
├── purescript-integers#0.2.1
├── purescript-lists#0.7.0
├── purescript-math#0.2.0
├── purescript-maybe#0.3.3
├── purescript-monoid#0.3.0
├── purescript-strings#0.5.5
├── purescript-tuples#0.4.0
└── purescript-unfoldable#0.4.0

purescript-arrays#0.4.1 bower_components/purescript-arrays
├── purescript-foldable-traversable#0.4.0
├── purescript-st#0.1.0
└── purescript-tuples#0.4.0

purescript-either#0.2.0 bower_components/purescript-either
└── purescript-foldable-traversable#0.4.0

purescript-control#0.3.0 bower_components/purescript-control
└── purescript-prelude#0.1.1

purescript-enums#0.5.0 bower_components/purescript-enums
├── purescript-either#0.2.0
├── purescript-strings#0.5.5
└── purescript-unfoldable#0.4.0

purescript-foldable-traversable#0.4.0 bower_components/purescript-foldable-traversable
├── purescript-bifunctors#0.4.0
└── purescript-maybe#0.3.3

purescript-lists#0.7.0 bower_components/purescript-lists
├── purescript-foldable-traversable#0.4.0
├── purescript-lazy#0.4.0
└── purescript-unfoldable#0.4.0

purescript-math#0.2.0 bower_components/purescript-math

purescript-maybe#0.3.3 bower_components/purescript-maybe
└── purescript-monoid#0.3.0

purescript-monoid#0.3.0 bower_components/purescript-monoid
├── purescript-control#0.3.0
└── purescript-invariant#0.3.0

purescript-strings#0.5.5 bower_components/purescript-strings
└── purescript-maybe#0.3.3

purescript-tuples#0.4.0 bower_components/purescript-tuples
└── purescript-foldable-traversable#0.4.0

purescript-unfoldable#0.4.0 bower_components/purescript-unfoldable
├── purescript-arrays#0.4.1
└── purescript-tuples#0.4.0

purescript-st#0.1.0 bower_components/purescript-st
└── purescript-eff#0.1.0

purescript-integers#0.2.1 bower_components/purescript-integers
├── purescript-math#0.2.0
└── purescript-maybe#0.3.3

purescript-bifunctors#0.4.0 bower_components/purescript-bifunctors
└── purescript-control#0.3.0

purescript-lazy#0.4.0 bower_components/purescript-lazy
└── purescript-monoid#0.3.0

purescript-invariant#0.3.0 bower_components/purescript-invariant
└── purescript-prelude#0.1.1


There are several tutorial examples online. Many of them should be doable by editing the src/Main.purs file. Ones that actually run in browsers need pulp browserify, which outputs a single file containing self-contained JavaScript code. We can do this for the template project too (you need to check the developer console for the output, though).

kyagrd@debair:~/pursproj$ pulp browserify > index.js                           
* Browserifying project in /home/kyagrd/pursproj
* Building project in /home/kyagrd/pursproj
* Build successful.
* Browserifying...
kyagrd@debair:~/pursproj$ cat > index.html
<html>
 <head>
  <title>Example purescript app</title>
  <meta charset="UTF-8">
 </head>
 <body>
  Check the console
  <script type="text/javascript" src="index.js"></script>
 </body>
</html>
kyagrd@debair:~/pursproj$ ls
bower.json  bower_components  index.html  index.js  output  src  test


Load “index.html” in a browser that supports a developer console, like Firefox. Every time you load “index.html”, you can see the console log printing “Hello sailor!”, just as it did when we ran pulp run on the command-line console.

Happy PureScript-ing!

| Posted by Ki Yung Ahn

An example of higher-kinded polymorphism not involving type classes

We almost always explain higher-kinded polymorphism (a.k.a. type constructor polymorphism) in Haskell with an example involving type classes such as Monad. It is natural to involve type classes, because type classes were the motivation for supporting type constructor polymorphism in Haskell. However, it is good to have an example that highlights the advantage of having higher-kinded polymorphism on its own, because type constructor variables are not only useful for defining type classes. Here is an example I used in a recent draft of a paper to describe what type constructor polymorphism is and why it is useful.

data Tree c a = Leaf a | Node (c (Tree c a))

newtype Pair a = Pair (a,a)
type BTree a = Tree Pair a

type RTree a = Tree [] a


The ability to abstract over type constructors empowers us to define more generic data structures, such as Tree, that can be instantiated to well-known data structures such as BTree (binary tree) or RTree (rose tree). The Tree datatype is parametrized by c :: * -> *, which determines the branching structure at the nodes, and a :: *, which determines the type of elements hanging on the leaves. By instantiating the type constructor variable c, we get concrete tree data structures, which, with only ordinary type polymorphism and no type constructor polymorphism, we would have to define separately as follows:

data BTree a = BLeaf a | BNode (BTree a, BTree a)

data RTree a = RLeaf a | RNode [RTree a]

| Posted by Ki Yung Ahn

HM+HigherKindedPoly+KindPoly in 25 lines of hacky Prolog

Previously, I posted about a 7-liner Hindley–Milner type inference written in Prolog.

Today I’ve managed to extend it to support higher-kinded polymorphism (a.k.a. type constructor polymorphism) and kind polymorphism (of course, both rank 1).

It is only 25 lines, but I left out pattern-matching expressions because there could be several different design choices (e.g., nested pattern support). As long as we are dealing with regular datatypes only, type inference for pattern matching shouldn’t be fundamentally complicated.

One trick I had to come up with is delaying the kind sanity check for the function type introduction. If you just call kind/3 inside type(..., A->B, ...), one or both of A and B may contain variables that will later be specialized further, so there is not yet enough information for kind checking.
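
A sketch of the delaying trick, in the style of the DCG specification from the extensible-records post above: instead of calling kind/3 eagerly in the clause for lambda, emit the kind goal into the DCG's output list and run the collected goals only after type inference has finished instantiating A and B:

type(KC,C,lam(X,E),A->B) --> type(KC,[X:mono(A)|C],E,B),
                             [ kind(KC,A->B,o) ].    % delayed kind goal

% later:  phrase(type(KC,C,E,T), Gs), maplist(call, Gs).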

| Posted by Ki Yung Ahn

Hindley–Milner in 7 lines of hacky Prolog

A while ago, I wrote an 8-liner Curry program for STLC type inference (6 lines excluding the datatype declaration).

There is a 7-liner in Prolog for Hindley–Milner!
(I didn’t write it, but stumbled across it while web searching.)

This 7-liner is not pure Prolog ’cause it uses copy_term/2 to implement instantiation of polymorphic types.
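
For reference, it is essentially the same set of clauses as the HM specification reproduced in the extensible-records post above (modulo the first/2 helper and the operator directives):

type(C,var(X),     T) :- first(X:S,C), inst_ty(S,T).
type(C,lam(X,E),A->B) :- type([X:mono(A)|C],E,B).
type(C,E1 $ E2,    B) :- type(C,E1,A->B), type(C,E2,A).
type(C,let(X=E0,E),T) :- type(C,E0,A), type([X:poly(C,A)|C],E,T).

inst_ty(poly(C,T),T1) :- copy_term(t(C,T),t(C,T1)).
inst_ty(mono(T),   T).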

I wonder how far this would scale up. I think it should work up to type constructor polymorphism for regular ADTs in functional languages, and one may also be able to add top-level rank-1 kind polymorphism on top of that. But it might get messy once you start to throw in non-regular datatypes, and even worse when GADTs come along.

| Posted by Ki Yung Ahn

Some quick thoughts on a restricted form of kind polymorphism

It is well known that adding unrestricted kind polymorphism to the higher-order polymorphic lambda calculus, a.k.a. System \(F_\omega\), leads to logically inconsistent paradoxes. For those in the dependently typed theorem proving community, this is none of their concern, since dependently typed proof assistants are usually based on predicative calculi with stratified universes. However, there is an alternative approach that has not been well explored yet.

Consider the alternative approach of starting from a more traditional functional language based on impredicative polymorphism and turning that into a theorem proving system. For instance, consider a certain subset of Haskell with GHC extensions such as RankNTypes and ImpredicativeTypes (after all, the core of GHC is an extension of \(F_\omega\)), and try to carve out a logically consistent language. This is exactly the motivation of my Ph.D. thesis, and its resulting artifact is the Nax language. To make a long story short, we should be able to have a logically consistent subset of, say, Haskell with GHC extensions including GADTs and PolyKinds. The benefit of such a design is that we can reuse all the facilities that have already been built around functional languages like Haskell, which has far more sophisticated and well-optimized compilers than most theorem proving systems.

If we take this route, we should figure out a safe way to extend impredicative polymorphism with a restricted form of kind polymorphism. I believe that extending \(F_\omega\) with rank-1 polymorphic kind bindings for a finite number of globally defined type constructor variables is safe; that is, the resulting calculus is logically consistent. The proof of logical consistency of this extended calculus should be quite straightforward, by mapping the extended calculus back into \(F_\omega\) by inlining.

Here is the syntax of the extended calculus: $$ \sigma ::= \kappa \mid \forall \chi.\kappa $$ $$ \kappa ::= \chi \mid * \mid \kappa \to \kappa' $$ $$ A, B, F, G ::= X \mid A \to B \mid \lambda X^{\kappa}.F \mid F~G \mid \forall X^{\kappa}.B $$ $$ \Sigma ::= \cdot \mid T^\sigma,\Sigma $$ $$ \Gamma ::= \cdot \mid \Gamma, X^{\kappa} \mid \Gamma, x:A $$ $$ t, r, s ::= x \mid \lambda x.t \mid r~s $$

A type constructor name \(T\) is no different from a type variable \(X\); it is just a notational convention to emphasize that the variable stands for a globally defined name of a type constructor. From a list of globally defined type constructors \( T_1 = F_1, T_2 = F_2, \ldots, T_n = F_n \), a global context \(\Sigma\) can be obtained such that \(\Sigma = T_1^{\sigma_1}, T_2^{\sigma_2}, \ldots, T_n^{\sigma_n}\), where each \(\sigma_i\) is the polymorphic kind of \(F_i\). For instance, the polymorphic kind scheme of \(F_1 = \lambda X^{\chi}.X\) is \(\sigma_1 = \forall \chi.\chi \to \chi\).

For simplicity, let us consider just one global definition, \(T_1 = F_1\). Assuming \(F_1\) is well kinded, we get \(\Sigma = T_1^{\sigma_1}\). In the derivation tree of a typing judgment \(T_1^{\sigma_1};\Gamma \vdash t : A\), we inline each occurrence of \(T_1\) with an instantiation of its definition \(F_1\), instantiating all the free kind variables of \(F_1\) appropriately for each occurrence. For instance, if we had \(F_1 = \lambda X^{\chi}.X\), we might instantiate it to \(\lambda X^{*}.X\), \(\lambda X^{ * \to * }.X\), or any appropriate instantiation of \(\chi\) as needed. Once we have inlined every occurrence of \(T_1\), the global context \(\Sigma\) becomes irrelevant. Therefore, we can simply ignore \(\Sigma\), and the resulting derivation tree is exactly a derivation tree for a typing judgment in \(F_\omega\).

Generalizing the inlining to \(n\) definitions is not so hard. The only additional thing to consider is that previously defined type constructor names can be used in later definitions. Consider \(T_1 = F_1, T_2 = F_2, \ldots, T_n = F_n\): the first name \(T_1\) can appear in any of \(F_2, \ldots, F_n\). So, inlining should be performed in a cascading manner starting from \(T_1\), inlining inside \(\Sigma\) as well as in the other parts of the derivation tree.

The next step is to generalize this idea to System \(F_i\), which extends System \(F_\omega\) with erasable type indices. This will get more complicated, because type variables and term-index variables, as well as kind variables, can appear in kinds.