Planet Scheme

July 31, 2015

Programming Praxis

Incremental Sieve Of Eratosthenes

The standard Sieve of Eratosthenes requires that you specify an upper bound before you start. But often, programs that use prime numbers don’t know the upper bound in advance, or don’t want to pre-compute and store all of the primes up to their bound. In those cases, an incremental Sieve of Eratosthenes can be used:

Algorithm PRIMEGEN: Create a generating function that returns the next prime number each time it is called, starting from the first prime.
1. [Prime the pump] Yield 2, then yield 3.
2. [Initialization] Set ps ← PRIMEGEN, D ← {}, p ← next(ps), p ← next(ps) again, q ← p × p, and c ← p.
3. [Next candidate] Set c ← c + 2.
4. [Composite candidate] If c ∉ D, skip to Step 5. Otherwise, set s ← D{c}, m ← c + s. Remove c from D. While m ∈ D, set m ← m + s. Set D{m} ← s. Go to Step 3.
5. [Prime candidate] If c = q, skip to Step 6. Otherwise, yield c and go to Step 3.
6. [Next sieving prime] Set s ← p + p, m ← c + s. While m ∈ D, set m ← m + s. Set D{m} ← s. Set p ← next(ps), q ← p × p. Go to Step 3.

The PRIMEGEN algorithm creates the sequence of prime numbers, returning the next prime in the sequence each time it is called; the exact mechanism to do that depends on the implementation language. The first step primes the pump (sorry!) by issuing the first two primes: 2 is a special case because it is even, and 3 is a special case because the pump only considers odd numbers as prime candidates.

The second step initializes two important data structures: ps is the list of sieving primes, which is determined by calling PRIMEGEN recursively, and D is a dictionary of composite/stride pairs, one for each sieving prime less than the square root of the current candidate; the dictionary will most likely be implemented as a hash table, but other data structures such as a balanced binary tree could also be used. The other initializations are the current sieving prime p, its square q, and the initial prime candidate c.

The third step begins the main loop of the function by calculating the next prime candidate; eventually, all odd numbers starting from 5 will be prime candidates. Then there are three possibilities, each handled by a separate step.

The fourth step handles the case that the candidate c is composite, resetting the dictionary entry for the appropriate sieving prime. The fifth step handles the case that the candidate c is not in the dictionary and is less than the square q of the current sieving prime, which indicates that the candidate is prime. The sixth step occurs when the candidate is not in the dictionary but has reached the square of the current sieving prime; then a new sieving prime is added to the dictionary, and the current sieving prime is updated in variables p and q. After the appropriate option has been processed, the algorithm returns to the top of the main loop to consider the next candidate.

In the fourth and sixth steps, the while loop calculates the new dictionary entry for the current sieving prime: the stride s = 2p is the distance between odd multiples of the sieving prime, and m is the smallest multiple of the sieving prime p greater than the current candidate c that is not already in the dictionary. The dictionary is keyed by m, which is a multiple of a sieving prime, with the corresponding stride s as its value.

There are several points to note about this algorithm. First, it is recursive, and there will eventually be a stack of sieving prime sequences, which sounds bizarre but actually makes sense. Second, by postponing the addition of new sieving primes to the dictionary until reaching their squares, only primes less than the square root of the current candidate need to be stored. And third, eliminating duplicate dictionary entries with the while loop of steps 4 and 6 keeps the size of the dictionary at exactly the number of sieving primes already processed. The whole algorithm is very efficient, making it useful whenever processing primes incrementally.

Time complexity of the algorithm is O(n log log n), the same as any other implementation of the Sieve of Eratosthenes, and space complexity is O(sqrt n) to store the dictionary of sieving primes.
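
To make the algorithm concrete, here is a minimal sketch in Racket (an illustration, not the suggested solution), using racket/generator for the yields and a mutable hash table for the dictionary D:

#lang racket
(require racket/generator)

;; PRIMEGEN as a generator: yield 2 and 3, then sieve odd candidates
;; against D, a dictionary mapping composites to strides. The sieving
;; primes come from a recursive call to primegen itself.
(define (primegen)
  (generator ()
    (yield 2)
    (yield 3)
    (define ps (primegen))          ; recursive sequence of sieving primes
    (ps)                            ; discard 2
    (define p (ps))                 ; p = 3
    (define q (* p p))              ; q = p × p
    (define D (make-hash))          ; composite -> stride
    ;; store stride s at the first multiple >= m not already in D
    (define (insert! m s)
      (if (hash-has-key? D m)
          (insert! (+ m s) s)
          (hash-set! D m s)))
    (let loop ([c p])
      (define c* (+ c 2))           ; step 3: next odd candidate
      (cond
        [(hash-ref D c* #f)         ; step 4: composite candidate
         => (lambda (s)
              (hash-remove! D c*)
              (insert! (+ c* s) s))]
        [(= c* q)                   ; step 6: reached p², new sieving prime
         (insert! (+ c* (* 2 p)) (* 2 p))
         (set! p (ps))
         (set! q (* p p))]
        [else (yield c*)])          ; step 5: prime candidate
      (loop c*))))

;; (define g (primegen))
;; (for/list ([i 10]) (g)) ; => '(2 3 5 7 11 13 17 19 23 29)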

Your task is to implement the incremental Sieve of Eratosthenes. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.


by programmingpraxis at July 31, 2015 09:00 AM

July 28, 2015

Programming Praxis

Next To Last Item In A List

Today’s exercise is a reminder about the importance of writing good test code. We have two tasks. The first is to extract the next-to-last item from a linked list; for instance, given the list (1 2 3 4 5) the next-to-last item is 4. The second task is to extract the nth-from-last item from a linked list; for instance, given the same list, the second-from-last item is 4. In addition to writing the two functions, you should write test code that exercises the functions thoroughly.
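
A minimal sketch of the two-pointer approach in standard Scheme (one common technique, not necessarily the suggested solution):

;; Walk a lead pointer n conses ahead of a lag pointer; when the lead
;; runs off the end, the lag pointer sits at the nth-from-last pair.
(define (nth-from-last xs n)
  (let loop ([lead (list-tail xs n)] [lag xs])
    (if (null? lead)
        (car lag)
        (loop (cdr lead) (cdr lag)))))

(define (next-to-last xs) (nth-from-last xs 2))

(next-to-last '(1 2 3 4 5))    ; => 4
(nth-from-last '(1 2 3 4 5) 2) ; => 4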

Your task is to write the two functions and test code. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.


by programmingpraxis at July 28, 2015 09:00 AM

Andy Wingo

loop optimizations in guile

Sup peeps. So, after the slog to update Guile's intermediate language, I wanted to land some new optimizations before moving on to the next thing. For years I've been meaning to do some loop optimizations, and I was finally able to land a few of them.

loop peeling

For a long time I have wanted to do "loop peeling". Loop peeling means peeling off the first iteration of a loop. If you have a source program that looks like this:

while foo:
  bar()
  baz()

Loop peeling turns it into this:

if foo:
  bar()
  baz()
  while foo:
    bar()
    baz()

You wouldn't think that this is actually an optimization, would you? Well on its own, it's not. But if you combine it with common subexpression elimination, then it means that the loop body is now dominated by all effects and all loop-invariant expressions that must be evaluated for the expression to loop.

In dynamic languages, this is most useful when one source expression expands to a number of low-level steps. So for example if your language runtime implements top-level variable references in three parts, one where it gets a reference to a mutable box, another where it checks if the box has a value, and a third where it unboxes it, then we would have:

if foo:
  bar_location = lookup("bar")
  bar_value = dereference(bar_location)
  if bar_value is null: throw NotFound("bar")
  call(bar_value)

  baz_location = lookup("baz")
  baz_value = dereference(baz_location)
  if baz_value is null: throw NotFound("baz")
  call(baz_value)

  while foo:
    bar_value = dereference(bar_location)
    call(bar_value)

    baz_value = dereference(baz_location)
    call(baz_value)

The result is that we have hoisted the lookups and null checks out of the loop (if a box can never transition from full back to empty). It's a really powerful transformation that can even hoist things that traditional loop-invariant code motion can't, but more on that later.

Now, the problem with loop peeling is that usually values will escape your loop. For example:

while foo:
  x = qux()
  if x then return x
...

In this little example, there is a value x, and the return x statement is actually not in the loop. It's syntactically in the loop, but the underlying representation that the compiler uses looks more like this:

function qux(k):
  label loop_header():
    fetch(foo) -> loop_test
  label loop_test(foo_value):
    if foo_value then -> body else -> exit
  label body():
    fetch(x) -> have_x
  label have_x(x_value):
    if x_value then -> return_x else -> loop_header
  label return_x():
    values(x) -> k
  label exit():
    ...

This is the "CPS soup" I described in my last post. Only the loop_header, loop_test, body, and have_x labels are in the loop; notably, the return is outside the loop. Point being, if we peel off the first iteration, then there are two possible values for x that we would return:

if foo:
  x1 = qux()
  if x1 then return x1
  while foo:
    x2 = qux()
    if x2 then return x2
  ...

I have them marked as x1 and x2. But I've also duplicated the return x terms, which is not what we want. We want to peel off the first iteration, which will cause code growth equal to the size of the loop body, but we don't want to have to duplicate everything that's after the loop. What we have to do is re-introduce a join point that defines x:

if foo:
  x1 = qux()
  if x1 then join(x1)
  while foo:
    x2 = qux()
    if x2 then join(x2)
  ...
label join(x)
  return x

Here I'm playing fast and loose with notation because the real terms are too gnarly. What I'm trying to get across is that for each value that flows out of a loop, you need a join point. That's fine, it's a bit more involved, but what if your loop exits to two different points, but one value is live in both of them? A value can only be defined in one place, in CPS or SSA. You could re-place a whole tree of phi variables, in SSA parlance, with join blocks and such, but it's just too hard.

However we can still get the benefits of peeling in most cases if we restrict ourselves to loops that exit to only one continuation. In that case the live variable set is the intersection of all variables defined in the loop that are live at the exit points. Easy enough, and that's what we have in Guile now. Peeling causes some code growth but the loops are smaller so it should still be a win. Check out the source, if that's your thing.

loop-invariant code motion

Usually when people are interested in moving code out of loops they talk about loop-invariant code motion, or LICM. Contrary to what you might think, LICM is complementary to peeling: some things that peeling+CSE can hoist are not hoistable by LICM, and vice versa.

Unlike peeling, LICM does not cause code growth. Instead, for each expression in a loop, LICM tries to hoist it out of the loop if it can. An expression can be hoisted if all of these conditions are true:

  1. It doesn't cause the creation of an observably new object. In Scheme, the definition of "observable" is quite subtle, so in practice in Guile we don't hoist expressions that can cause any allocation. We could use alias analysis to improve this.

  2. The expression cannot throw an exception, or the expression is always evaluated for every loop iteration.

  3. The expression makes no writes to memory, or if it writes to memory, other expressions in the loop cannot possibly read from that memory. We use effects analysis for this.

  4. The expression makes no reads from memory, or if it reads from memory, no other expression in the loop can clobber those reads. Again, effects analysis.

  5. The expression uses only loop-invariant variables.

This definition is inductive, so once an expression is hoisted, the values it defines are then considered loop-invariant, so you might be able to hoist a whole chain of values.

Compared to loop peeling, this has the gnarly aspect of having to explicitly reason about loop invariance and manually move code, which is a pain. (Really LICM would be better named "artisanal code motion".) However it causes no code growth, which is a plus, though like peeling it can increase register pressure. But the big difference is that LICM can hoist effect-free expressions that aren't always executed. Consider:

while foo:
  x = qux() ? "hi" : "ho"

Here for some reason it could be faster to cache "hi" or "ho" in registers, which is what LICM allows:

hi, ho = "hi", "ho"
while foo:
  x = qux() ? hi : ho

On the other hand, LICM alone can't hoist the if baz is null checks in this example from above:

while foo:
  bar()
  baz()

The issue is that the call to bar() might not return, so the error that might be thrown if baz is null shouldn't be observed until bar is called. In general we can't hoist anything that might throw an exception past some non-hoisted code that might throw an exception. This specific situation happens in Guile but there are similar ones in any language, I think.

More formally, LICM will hoist effectful but loop-invariant expressions that postdominate the loop header, whereas peeling hoists those expressions that dominate all back-edges. I think? We'll go with that. Again, the source.

loop inversion

Loop inversion is a little hack to improve code generation, and again it's a little counterintuitive. If you have this loop:

while n < x:
  n++

Loop inversion turns it into:

if n < x:
  do
    n++
  while n < x

The goal is that instead of generating code that looks like this:

header:
  test n, x;
  branch-if-greater-than-or-equal done;
  n = n + 1
  goto header
done:

You make something that looks like this:

  test n, x;
  branch-if-greater-than-or-equal done;
header:
  n = n + 1
  test n, x;
  branch-if-less-than header;
done:

The upshot is that the loop body now contains one branch instead of two. It's mostly helpful for tight loops.

It turns out that you can express this transformation on CPS (or SSA, or whatever), but that like loop peeling the extra branch introduces an extra join point in your program. If your loop exits to more than one label, then we have the same problems as loop peeling. For this reason Guile restricts loop inversion (which it calls "loop rotation" at the moment; I should probably fix that) to loops with only one exit continuation.

Loop inversion has some other caveats, but probably the biggest one is that in Guile it doesn't actually guarantee that each back-edge is a conditional branch. The reason is that usually a loop has some associated loop variables, and it could be that you need to reshuffle those variables when you jump back to the top. Mostly Guile's compiler manages to avoid shuffling, allowing inversion to have the right effect, but it's not guaranteed. Fixing this is not straightforward, since the shuffling of values is associated with the predecessor of the loop header and not the loop header itself. If instead we reshuffled before the header, that might work, but each back-edge might have a different shuffling to make... anyway. In practice inversion seems to work out fine; I haven't yet seen a case where it doesn't work. Source code here.

loop identification

One final note: what is a loop anyway? Turns out this is a somewhat hard problem, especially once you start trying to identify nested loops. Guile currently does the simple thing and just computes strongly-connected components in a function's flow-graph, and says that a loop is a non-trivial SCC with a single predecessor. That won't tease apart loop nests but oh wells! I spent a lot of time last year or maybe two years ago with that "Loop identification via D-J graphs" paper but in the end simple is best, at least for making incremental steps.
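
For flavor, here is a rough sketch of that simple approach in Racket (an illustration, not Guile's actual code): compute SCCs with Kosaraju's algorithm over a flow graph represented as a hash from label to successor labels.

#lang racket

;; Kosaraju's algorithm: one DFS for finish order, then a DFS of the
;; transposed graph; returns a hash mapping each label to its SCC's
;; representative label.
(define (sccs graph)
  (define order '())
  (define seen (make-hash))
  (define (dfs1 n)
    (unless (hash-ref seen n #f)
      (hash-set! seen n #t)
      (for-each dfs1 (hash-ref graph n '()))
      (set! order (cons n order))))
  (for ([n (in-hash-keys graph)]) (dfs1 n))
  (define transpose (make-hash))
  (for* ([(n succs) (in-hash graph)] [s (in-list succs)])
    (hash-update! transpose s (lambda (preds) (cons n preds)) '()))
  (define comp (make-hash))
  (define (dfs2 n root)
    (unless (hash-has-key? comp n)
      (hash-set! comp n root)
      (for-each (lambda (m) (dfs2 m root)) (hash-ref transpose n '()))))
  (for ([n (in-list order)]) (dfs2 n n))
  comp)

(sccs (hash 'a '(b) 'b '(c) 'c '(b d) 'd '()))
;; 'b and 'c share a representative: they form the loop {b, c}

A loop in the sense above is then a non-trivial SCC (more than one label, or a label with an edge to itself) whose members have a single predecessor outside the component.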

Okeysmokes, until next time, loop on!

by Andy Wingo at July 28, 2015 08:10 AM

July 27, 2015

Andy Wingo

cps soup

Hello internets! This blog goes out to my long-time readers who have followed my saga hacking on Guile's compiler. For the rest of you, a little history, then the new thing.

In the olden days, Guile had no compiler, just an interpreter written in C. Around 8 years ago now, we ported Guile to compile to bytecode. That bytecode is what is currently deployed as Guile 2.0. For many reasons we wanted to upgrade our compiler and virtual machine for Guile 2.2, and the result of that was a new continuation-passing-style compiler for Guile. Check that link for all the backstory.

So, I was going to finish documenting this intermediate language about 5 months ago, in preparation for making the first Guile 2.2 prereleases. But something about it made me really unhappy. You can catch some foreshadowing of this in my article from last August on common subexpression elimination; I'll just quote a paragraph here:

In essence, the scope tree doesn't necessarily reflect the dominator tree, so not all transformations you might like to make are syntactically valid. In Guile 2.2's CSE pass, we work around the issue by concurrently rewriting the scope tree to reflect the dominator tree. It's something I am seeing more and more and it gives me some pause as to the suitability of CPS as an intermediate language.

This is exactly the same concern that Matthew Fluet and Stephen Weeks had back in 2003:

Thinking of it another way, both CPS and SSA require that variable definitions dominate uses. The difference is that using CPS as an IL requires that all transformations provide a proof of dominance in the form of the nesting, while SSA doesn't. Now, if a CPS transformation doesn't do too much rewriting, then the partial dominance information that it had from the input tree is sufficient for the output tree. Hence tree splicing works fine. However, sometimes it is not sufficient.

As a concrete example, consider common-subexpression elimination. Suppose we have a common subexpression x = e that dominates an expression y = e in a function. In CPS, if y = e happens to be within the scope of x = e, then we are fine and can rewrite it to y = x. If however, y = e is not within the scope of x, then either we have to do massive tree rewriting (essentially making the syntax tree closer to the dominator tree) or skip the optimization. Another way out is to simply use the syntax tree as an approximation to the dominator tree for common-subexpression elimination, but then you miss some optimization opportunities. On the other hand, with SSA, you simply compute the dominator tree, and can always replace y = e with y = x, without having to worry about providing a proof in the output that x dominates y (i.e. without putting y in the scope of x).

[MLton-devel] CPS vs SSA

To be honest I think all this talk about dominators is distracting. Dominators are but a lightweight flow analysis, and I usually find myself using full-on flow analysis to compute the set of optimizations that I can do on a piece of code. In fact the only use I had for dominators in the nested CPS language was to rewrite scope trees! The salient part of Weeks' observation is that nested scope trees are the problem, not that dominators are the solution.

So, after literally years of hemming and hawing about this, I finally decided to remove nested scope trees from Guile's CPS intermediate language. Instead, a function is now a collection of labelled continuations, with one distinguished entry continuation. There is no more $letk term to nest continuations in each other. A program is now represented as a "soup" -- basically a map from labels to continuation bodies, again with a distinguished entry. As an example, consider this expression:

function(x):
  return add(x, 1)

If we rewrote it in continuation-passing style, we'd give the function a name for its "tail continuation", ktail, and annotate each expression with its continuation:

function(ktail, x):
  add(x, 1) -> ktail

Here the -> ktail means that the add expression passes its values to the continuation ktail.

With nested CPS, it could look like:

function(ktail, x):
  letk have_one(one): add(x, one) -> ktail
    load_constant(1) -> have_one

Here the label have_one is in a scope, as is the value one. With "CPS soup", though, it looks more like this:

function(ktail, x):
  label have_one(one): add(x, one) -> ktail
  label main(x): load_constant(1) -> have_one

It's a subtle change, but it took a few months to make so it's worth pointing out what's going on. The difference is that there is no scope tree for labels or variables any more. A variable can be used at a label if it flows to the label, in a flow analysis sense. Indeed, determining the set of variables that can be used at a label requires flow analysis; that's what Weeks was getting at in his 2003 mail about the advantages of SSA, which are really the advantages of an intermediate language without nested scope trees.

The question arises, though, now that we've decided on CPS soup, how should we represent a program as a value? We've gone from a nested term to a graph term, and we need to find a way to represent it somehow that facilitates looking up labels by name, and facilitates tree rewrites.

In Guile's IR, labels and variables are both integers, so happily enough, we have such a data structure: Clojure-style maps specialized for integer keys.

Friends, if there has been one realization or revolution for me in the last year, it has been Clojure-style data structures. Here's why. In compilers, I often have to build up some kind of analysis, then use that analysis to transform data. Often I need to keep the old term around while I build a new one, but it would be nice to share state between old and new terms. With a nested tree, if a leaf changed you'd have to rebuild all surrounding terms, which is gnarly. But with Clojure-style data structures, more and more I find myself computing in terms of values: build up this value, transform this map to that set, fold over this map -- and yes, you can fold over Guile's intmaps -- and so on. By providing an expressive data structure for which I can control performance characteristics by using transients if needed, these data structures make my programs more about data and less about gnarly machinery.
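
Here is a rough illustration of that style (written with Racket's immutable hashes standing in for Guile's intmaps, so the API shown is an assumption, not Guile's): the program is a persistent map from labels to terms, and a rewrite just builds a new map that shares structure with the old one.

#lang racket

;; A toy "program": a persistent map from integer labels to terms.
(define old-program
  (hash 0 '(fetch foo -> 1)
        1 '(if 1 then -> 2 else -> 3)
        2 '(call bar -> 0)
        3 '(return)))

;; "Rewrite" label 2 without touching old-program:
(define new-program (hash-set old-program 2 '(call baz -> 0)))

;; Fold over the map, e.g. to collect the labels:
(sort (for/fold ([ls '()]) ([(k v) (in-hash new-program)]) (cons k ls)) <)
;; => '(0 1 2 3)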

As a concrete example, consider the old contification pass in Guile: I didn't have the mental capacity to understand all the moving parts in such a way that I could compute an optimal contification from the beginning; instead I had to iterate to a fixed point, as Kennedy did in his "Compiling with Continuations, Continued" paper. With the new CPS soup language and with Clojure-style data structures, I could actually fit more of the algorithm into my head, with the result that Guile now contifies optimally while avoiding the fixed-point transformation. Also, the old pass used hash tables to represent the analysis, which I found incredibly confusing to reason about -- I totally buy Rich Hickey's argument that place-oriented programming is the source of many evils in programs, and hash tables are nothing if not a place party. Using functional maps let me solve harder problems because they are easier for me to reason about.

Contification isn't an isolated case, either. For example, we are able to do the complete set of optimizations from the "Optimizing closures in O(0) time" paper, including closure sharing, which I think makes Guile unique besides Chez Scheme. I wasn't capable of doing it on the old representation because it was just too hard for me to think about, because my data structures weren't right.

This new "CPS soup" language is still a first-order CPS language in that each term specifies its continuation, and that variable names appear in the continuation of a definition, not the definition itself. This effectively makes every variable a phi variable, in the sense of SSA, and you have to do some work to get to a variable's definition. It could be that still this isn't the right number of names; consider this function:

function foo(k, x):
  label have_y(y): bar(y) -> k
  label y_is_two(): load_constant(2) -> have_y
  label y_is_one(): load_constant(1) -> have_y
  label main(x): if x then -> y_is_one else -> y_is_two

Here there is no distinguished name for the value load_constant(1) versus load_constant(2): both are possible values for y. If we ended up giving them names, we'd have to reintroduce actual phi variables for the joins, which would basically complete the transformation to SSA. Until now though I haven't wanted those names, so perhaps I can put this off. On the other hand, every term has a label, which simplifies many things compared to having to contain terms in basic blocks, as is usually done in SSA. Yet another chapter in CPS is SSA is CPS is SSA, it seems.

Welp, that's all the nerdery for right now. Talk at yall later!

by Andy Wingo at July 27, 2015 02:43 PM

July 24, 2015

Programming Praxis

One-Swappable Array

Today’s exercise helps somebody with their homework:

Given an array of unique integers, determine if it is possible to sort the array by swapping two elements of the array. For instance, the array [1,2,6,4,5,3,7] can be sorted by swapping 3 and 6, but there is no way to sort the array [5,4,3,2,1] by swapping two of its elements. You may use O(n) time, where the array has n integers, and constant additional space.
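
Here is a minimal sketch of one O(n)-time, O(1)-space approach in Racket (an illustration, not the suggested solution): find the first and last positions where the order breaks, swap the two offending elements in place, test sortedness, then swap back to restore the input.

#lang racket

;; Whether v can be sorted by swapping one pair of elements. An
;; already-sorted vector is accepted here; reject it instead if the
;; assignment requires a genuine swap of two distinct positions.
(define (one-swappable? v)
  (define n (vector-length v))
  (define (descent-at? k) (> (vector-ref v k) (vector-ref v (+ k 1))))
  (define (swap! i j)
    (let ([tmp (vector-ref v i)])
      (vector-set! v i (vector-ref v j))
      (vector-set! v j tmp)))
  (define i (for/first ([k (in-range (- n 1))] #:when (descent-at? k)) k))
  (if (not i)
      #t ; already sorted
      (let ([j (for/last ([k (in-range (- n 1))] #:when (descent-at? k)) k)])
        (swap! i (+ j 1))
        (let ([sorted? (for/and ([k (in-range (- n 1))]) (not (descent-at? k)))])
          (swap! i (+ j 1)) ; undo, restoring the input
          sorted?))))

(one-swappable? (vector 1 2 6 4 5 3 7)) ; => #t (swap 6 and 3)
(one-swappable? (vector 5 4 3 2 1))     ; => #f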

Your task is to write a program that determines if an array can be sorted with a single swap. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.


by programmingpraxis at July 24, 2015 09:00 AM

July 22, 2015

GNU Guix

GNU Guix 0.8.3 released

We are pleased to announce the next alpha release of GNU Guix, version 0.8.3.

The release comes with USB installation images to install the standalone Guix System Distribution (GuixSD), and with tarballs to install the package manager on top of a running GNU/Linux system, either from source or from binaries.

For the highlights of this release, see http://lists.gnu.org/archive/html/guix-devel/2015-07/msg00585.html.

About GNU Guix

GNU Guix is a functional package manager for the GNU system. The Guix System Distribution or GuixSD is an advanced distribution of the GNU system that relies on GNU Guix and respects the user's freedom.

In addition to standard package management features, Guix supports transactional upgrades and roll-backs, unprivileged package management, per-user profiles, and garbage collection. Guix uses low-level mechanisms from the Nix package manager, except that packages are defined as native Guile modules, using extensions to the Scheme language. GuixSD offers a declarative approach to operating system configuration management, and is highly customizable and hackable.

GuixSD can be used on an i686 or x86_64 machine. It is also possible to use Guix on top of an already installed GNU/Linux system, including on mips64el and armv7.

by Ludovic Courtès at July 22, 2015 09:25 AM

July 21, 2015

Programming Praxis

A Number Puzzle

In last week’s exercise we had a word puzzle, so today’s exercise will be a number puzzle:

Find a 10-digit number, with all digits unique, such that the first n digits of the number are divisible by n. For instance, in the 3-digit number 345, the 1-digit prefix, 3, is divisible by 1, the 2-digit prefix, 34, is divisible by 2, and the 3-digit prefix, 345, is divisible by 3.
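
A brute-force sketch in Racket (an illustration, not the suggested solution): grow prefixes digit by digit, keeping only those whose n-digit prefix is divisible by n and whose digits are all distinct.

#lang racket

;; digits of a nonnegative integer, least significant first
(define (digits n)
  (if (< n 10)
      (list n)
      (cons (remainder n 10) (digits (quotient n 10)))))

(define (polydivisible)
  (let loop ([prefixes '(0)] [n 1])
    (if (> n 10)
        prefixes
        (loop (for*/list ([p (in-list prefixes)]
                          [d (in-range 10)]
                          [q (in-value (+ (* 10 p) d))]
                          #:when (and (zero? (modulo q n))
                                      (not (member d (digits p)))))
                q)
              (+ n 1)))))

(polydivisible) ; => '(3816547290)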

Your task is to write a program that finds a 10-digit number as defined above. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.


by programmingpraxis at July 21, 2015 09:00 AM

July 19, 2015

GNU Guix

GSoC update

This year Guix was lucky to have 3 GSoC projects, and they have made rather good progress so far:

  • Manolis successfully completed the recipes to get a cross-compilation toolchain to GNU/Hurd, with part of the work already in the main branch. This allowed him to produce statically-linked bootstrap binaries (stumbling upon nasty ld.so issues on the way.) Manolis is now running Guix and building packages natively on GNU/Hurd, which will constitute a large part of the remainder of his project.
  • Rémi has written Guile bindings to crucial parts of the GNUnet API, including the file sharing API. This will allow him to move to the next step: Writing tools to publish and retrieve Guix substitutes (pre-built binaries.)
  • Rohan laid the foundations of the DHCP client. The current code can send packets on all the configured network interfaces. Rohan hopes to have working code to establish leases in the following weeks.

Happy hacking!

About GNU Guix

GNU Guix is a functional package manager for the GNU system. The Guix System Distribution or GuixSD is an advanced distribution of the GNU system that relies on GNU Guix and respects the user's freedom.

In addition to standard package management features, Guix supports transactional upgrades and roll-backs, unprivileged package management, per-user profiles, and garbage collection. Guix uses low-level mechanisms from the Nix package manager, except that packages are defined as native Guile modules, using extensions to the Scheme language. GuixSD offers a declarative approach to operating system configuration management, and is highly customizable and hackable.

GuixSD can be used on an i686 or x86_64 machine. It is also possible to use Guix on top of an already installed GNU/Linux system, including on mips64el and armv7.

by Ludovic Courtès at July 19, 2015 08:57 AM

July 17, 2015

Programming Praxis

The Gas Station Problem

Today’s exercise is a classic problem in computer science courses:

There is a truck driving clockwise on a circular route; the truck has a gas tank with infinite capacity, initially empty. Along the route are gas stations with varying amounts of gas available: you are given a list of gas stations with the amounts of gas they have available and the amounts of gas required to drive to the next gas station. You must find a gas station from which the truck can complete the full circuit and return to that same gas station.

For instance, consider a route with eight gas stations having 15, 8, 2, 6, 18, 9, 21, and 30 gallons of gas; from each of those gas stations to the next requires 8, 6, 30, 9, 15, 21, 2, and 18 gallons of gas. Obviously, you can’t start your trip at the third gas station, with 2 gallons of gas, because getting to the next gas station requires 30 gallons of gas and you would run out of gas reaching it.
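
Here is a sketch of the classic one-pass solution in standard Scheme (an illustration, not necessarily the suggested solution): track the running tank level, and whenever it would go negative, restart the candidate at the next station; if the total gas is at least the total cost, the final candidate works.

;; gas and cost are lists; returns a zero-based starting index, or #f.
(define (starting-station gas cost)
  (let loop ([gs gas] [cs cost] [i 0] [start 0] [tank 0] [total 0])
    (if (null? gs)
        (and (>= total 0) start)
        (let ([delta (- (car gs) (car cs))])
          (if (< (+ tank delta) 0)
              (loop (cdr gs) (cdr cs) (+ i 1) (+ i 1) 0 (+ total delta))
              (loop (cdr gs) (cdr cs) (+ i 1) start (+ tank delta)
                    (+ total delta)))))))

(starting-station '(15 8 2 6 18 9 21 30)
                  '(8 6 30 9 15 21 2 18)) ; => 6, the station with 21 gallons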

Your task is to write a program that determines a suitable starting point for the truck; your algorithm should be linear in the number of gas stations and require constant space. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.


by programmingpraxis at July 17, 2015 10:38 AM

July 14, 2015

Programming Praxis

Ordered Words

An ordered word is one in which the letters in the word appear in alphabetic order; for instance, dirt and knotty are ordered words, praxis is not.
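
A minimal sketch in Racket (the dictionary file path is an assumption):

#lang racket

(define (ordered-word? w)
  (define cs (string->list (string-downcase w)))
  (or (null? cs) (null? (cdr cs)) (apply char<=? cs)))

(define (longest-ordered-word words)
  (for/fold ([best ""]) ([w (in-list words)])
    (if (and (ordered-word? w)
             (> (string-length w) (string-length best)))
        w
        best)))

(longest-ordered-word '("dirt" "knotty" "praxis")) ; => "knotty"
;; With a real word list, e.g.:
;; (longest-ordered-word (file->lines "/usr/share/dict/words"))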

Your task is to write a program to find the longest ordered word in a dictionary. When you are finished, you are welcome to read a suggested solution, or to post your own solution or discuss the exercise in the comments below.


by programmingpraxis at July 14, 2015 09:00 AM

July 10, 2015

Programming Praxis

Partitioning The Telephone Book

The city where I live used to publish a telephone directory; the “white pages” were distributed to all telephone customers once a year. Now my city no longer prints the directory; it is available on the internet, or you can pay an operator to look up a telephone number for you.

But there are still cities that print telephone directories, and some of those cities are big enough that the directory must be partitioned into multiple volumes. Consider this distribution of the first letters of customers’ last names:

A  B C  D  E  F G H I J  K L  M  N O P Q R  S  T U V W X Y Z
16 4 17 10 15 4 4 6 7 14 9 17 27 6 1 9 0 12 20 8 0 3 4 0 3 4

I’m not sure what the units are (probably tens of thousands of telephone customers), but the total is 220, and the telephone company has decided to print 4 volumes, so each should be about 55 units. One possible partitioning is A-D, E-J, K-O, P-Z, with counts of 47, 50, 60 and 63 units, differences of 8, 5, 5 and 8 from the ideal of 55, and a total difference of 26. Another partitioning is A-E, F-K, L-O, P-Z, with counts of 62, 44, 51, and 63 units, differences of 7, 11, 4 and 8, and a total difference of 30, which is worse. Before continuing, you might want to work out the optimal solution by hand, finding a minimum score of 18.
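
Since 4 volumes mean only three cut points, brute force is cheap; here is a sketch in Racket (an illustration, not the suggested solution) using prefix sums to score each partition in constant time.

#lang racket

(define counts '(16 4 17 10 15 4 4 6 7 14 9 17 27 6 1 9 0 12 20 8 0 3 4 0 3 4))

;; Try every way to cut the 26 counts into 4 contiguous volumes and
;; keep the cut points (a b c) minimizing the total difference.
(define (best-partition counts)
  (define v (list->vector counts))
  (define n (vector-length v))
  (define ideal (/ (apply + counts) 4)) ; 220/4 = 55 units per volume
  ;; prefix sums, so each volume's size is O(1) to compute
  (define prefix (make-vector (+ n 1) 0))
  (for ([i (in-range n)])
    (vector-set! prefix (+ i 1) (+ (vector-ref prefix i) (vector-ref v i))))
  (define (sum lo hi) (- (vector-ref prefix hi) (vector-ref prefix lo)))
  (for*/fold ([best #f] [best-score +inf.0])
             ([a (in-range 1 (- n 2))]
              [b (in-range (+ a 1) (- n 1))]
              [c (in-range (+ b 1) n)])
    (define score (+ (abs (- (sum 0 a) ideal))
                     (abs (- (sum a b) ideal))
                     (abs (- (sum b c) ideal))
                     (abs (- (sum c n) ideal))))
    (if (< score best-score)
        (values (list a b c) score)
        (values best best-score))))

;; (define-values (cuts score) (best-partition counts))
;; score is 18; one optimal partition is A-D, E-K, L-Q, R-Z
;; (sizes 47, 59, 60, 54; differences 8, 4, 5, 1).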

Your task is to write a program that determines the partitioning of the telephone book that minimizes the total difference. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.


by programmingpraxis at July 10, 2015 09:00 AM

July 07, 2015

Programming Praxis

Powerset

Sets are ubiquitous in programming; we have discussed sets in several previous exercises. One of the operations that can be performed on a set is to compute its powerset, which is a set of all the subsets of the original set. For instance, the powerset of (1 2 3) is the set (() (1) (2) (3) (1 2) (1 3) (2 3) (1 2 3)) where neither the order of the subsets nor the order of the elements of the individual subsets matters.
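
The classic recursive formulation is short enough to sketch in Scheme: the powerset of a set is the powerset of its rest, plus those same subsets with the first element added to each.

(define (powerset xs)
  (if (null? xs)
      '(())
      (let ([rest (powerset (cdr xs))])
        (append rest
                (map (lambda (s) (cons (car xs) s)) rest)))))

(powerset '(1 2 3))
; => (() (3) (2) (2 3) (1) (1 3) (1 2) (1 2 3))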

Your task is to write a program that computes the powerset of a set. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.


by programmingpraxis at July 07, 2015 09:00 AM

July 05, 2015

Grant Rettke

A Scheme Interpreter in Forth

This is an interpreter for a really simple dynamically scoped Scheme dialect. It only runs with Gforth, because it uses Gforth’s structs to implement its data structures.

by Grant at July 05, 2015 05:18 PM

Mastery, Questions, Hardware, Software, LISP, Forth, TI-99/4A

Over the last two years a few questions and ideas have visited me and the following is my attempt to piece them together…

This claim by Alan Kay has haunted me for the past seven months. So have these two works by Charles Moore. They make me wonder:

  • Can you be a master software craftsman without having completely mastered your tools?
    • Is this a natural part of the evolution of a programmer?
  • What are the surprisingly intuitive and non-obvious intersections between the Theory of Computation, Number Theory, and a CPU that help on that path to mastery of software development?
    • Does every mathematician require an intuitive sense of the value of the transistor?

It is time to start studying the bare metal more closely. What is the best place to start? What are my requirements? There are only two requirements. First, I want to go through the process of learning and exploration interactively and quickly. Those are the traits of a LISP system. Second, I want it to be super inexpensive. Everyone with a television set and a keyboard should be able to follow the same approach. They shouldn’t even require the Internet; if they have Internet at work or school then they can use the Sneakernet. It should be trivial to sell this system for next to nothing. The computer will have video and sound too. It even has a beautiful HD screen. That is it. With that in mind I started looking.

The relationship between the programming language and hardware is tightly woven. Most of us don’t consider this today because we own machines that spend 99% of their time idle. Looking at languages and inexpensive hardware is a real treat because you start caring real quick! Quickly, too, I ended up on Armpit Scheme.

Every LISP programmer implements their own LISP! Or so they say; I never did. I still want to. This seems like the perfect project: C language, bare metal, and LISP, on some serious hardware. Doing some light reading about LISP on small devices, my imagination took over and quickly concluded that the next logical step was to rebuild a Lisp Machine from scratch! No, that is a little too far out of scope. Armpit seems like a fine place to start, so I read about the development boards where it runs. Then two things slowly changed this vector.

One of my best friends, Bruce, loved to share with me his delight in programming FORTH. Scheme was my enlightenment tool, and his was FORTH. We would spend hours talking about both of them. Our conversations went something like this: “Me: In Scheme I explored X, and it was fun!” and then “Bruce: I explored that very idea in FORTH and this is how I did it and it was fun!”. FORTH was built to run on small CPUs. That got me learning more about FORTH.

There are a lot of distributions. There are great books about it. The community is passionate. One of the rites of passage is to write your own FORTH. It runs on a lot of CPUs. That was enough to convince me. Around the same time, something else was on my mind: Vintage Computers.

As a kid we had an Apple 2e. It was a delightful machine. Perhaps that is the right place to start? Watching craigslist and estate sales, there weren’t very many. The market is demanding them again, and on ebay they are making some money. Still, it has all the right traits now: simple, keyboard, video and sound and disk, and constrained. It ought to be inexpensive, but isn’t right now. That is OK. Months go by and I keep poking around at things. Two things jump out as possible options.

The TI Launchpad is a $9.99USD computer. It is 16-bit, has memory, and it is fast. It runs at least CamelForth, 4E4th, eForth, and 430eForth. Pay attention to the names involved. The community is small and tight knit. Implementing FORTH seems like a great way to learn/master Assembly, C, hardware, and FORTH. The source code and hardware are out there. I will go with 430eForth and the LaunchPad machine. Around the same time while learning more about FORTH I ended up back on a vintage machine option for FORTH.

The old personal computers all either included FORTH or had it available. Most of them are available free in source form today. It could be fun to use and learn on a vintage box because that is all bare metal. Lo and behold, I ended up watching this video to learn about TurboForth.

TurboForth is a 2015 FORTH implementation for the TI-99/4A. Perfect. Perfect! Using a real PC, you get the fun of exploring a machine with video memory and making sounds. That is just included because it is a personal computer. TurboForth lets you explore the hardware, the machine, and even the implementation itself. That is just wonderful. There is another cool part: the TI-99/4A has a very active community.

They’ve got an active user group right down in Chicago. They’ve got a conference this October! There is hardware and software to connect your box to USB storage or a hard drive. There are loads of youtube videos about it. It helps that the machine is still available at a very reasonable price. To top it off, there are first class emulators out there. People are still writing games for this machine today, in FORTH nonetheless. There are lots of good games for it, too.

The TI-99/4A and the TI Launchpad seem like fine places to start. They meet all of the requirements. Without having dug into the details, the above are assumptions, and that is a fine place to start.

by Grant at July 05, 2015 04:17 PM

July 04, 2015

Greg Hendershott

Keyword structs, revisited

This revises my Keyword structs post to fix some mistakes, discuss the struct* match pattern, and rewrite the macro to use syntax-parse and support default arguments.


A good rule of thumb in Racket is to use a struct instead of list when you’re juggling more than two or three items.

For ad-hoc prototyping, you can use a list:

;; Return first name, last name, and age
(define (get-person)
  (list "John" "Doe" 32))

(define p (get-person))
(first p)  ; => "John"
(second p) ; => "Doe"
(third p)  ; => 32

Getting the stuff out is a bit cleaner using match, which lets you “destructure” the list and bind to identifiers in one swell foop:

(match (get-person)
  [(list first last age)
   (values first last age)])
; "John"
; "Doe"
; 32

But what if you need to add or delete list members later? It’s error-prone.

That’s where a real struct can help:

(struct person (first last age))
(define (get-person)
  (person "John" "Doe" 32))

(define p (get-person))
(person-first p)
; "John"
(person-last p)
; "Doe"
(person-age p)
; 32

;; Or using `match`:
(match (get-person)
  [(person first last age)
   (values first last age)])
; "John"
; "Doe"
; 32

Now let’s say you add a social security number field, ssn:

(struct person (first last ssn age))
(define (get-person)
  (person "John" "Doe" "xxx-xx-xxxx" 32))

Everything still works fine when you access the fields by-name:

(define p (get-person))
(person-first p)
; "John"
(person-last p)
; "Doe"
(person-age p)
; 32

Although if you used match, which is by-position, that needs to be updated:

(match (get-person)
  [(person first last age)
   (values first last age)])
; match: wrong number for fields for structure person: expected 4 but got 3
;  at: (first last age)
;  in: (person first last age)

So you need to fix it:

(match (get-person)
  [(person first last ssn age)
   (values first last ssn age)])
; "John"
; "Doe"
; "xxx-xx-xxxx"
; 32

struct*

This is where the struct* match pattern can help. By getting the fields by-name, it is insulated from the addition of new fields:

(match (get-person)
  [(struct* person ([first first] [last last] [age age]))
   (values first last age)])
; "John"
; "Doe"
; 32

This needs to be updated only if/when you need the new ssn field. So although it’s more verbose, using struct* is more resilient.

We could reduce the verbosity, by allowing either [field pat] or just field — where the latter expands to use the same symbol for both the field and pattern, as we wrote out in the example above. This would be a nice enhancement to the official struct* in racket/match. Meanwhile here’s a struct** match expander that wraps struct* to do so:

#lang racket/base

(require racket/match
         (for-syntax racket/base
                     syntax/parse))

(define-match-expander struct**
  (λ (stx)
    (define-syntax-class field
      (pattern [id:id pat:expr])
      (pattern id:id #:with pat #'id))
    (syntax-parse stx
      [(_ struct-id:id (field:field ...))
       #'(struct* struct-id ([field.id field.pat] ...))])))

(module+ test
  (require rackunit)
  (struct foo (a b c))
  (define x (foo 1 2 3))
  (check-equal? (match x [(struct** foo (a b [c x])) (list a b x)])
                x)
  (check-equal? (match x [(struct*  foo ([a a][b b][c c])) (list a b c)])
                (match x [(struct** foo (a    b    [c c])) (list a b c)])))

Making structs

Creating an instance of a struct has exactly the same form/shape as creating a list:

(list   "John" "Doe" 32)
(person "John" "Doe" 32)

It’s just person instead of list. Either way, you’re specifying the fields by-position, not by-name. If you have a struct with more than a few fields:

(struct foo (a b c d e f g h))

Then creating the struct is itself error-prone. You will probably start jotting down comments to help you keep track of what field you’re on:

(foo 10     ;a
     "foo"  ;b
     13     ;c
     "bar"  ;d
     "baz"  ;e
     #f     ;f
     "x"    ;g
     42)    ;h

It would help if we could turn those comments into actual keywords. Using keyword arguments is helpful for any function with more than a few arguments. We’d like to write:

(foo #:a 10
     #:b "foo"
     #:c 13
     #:d "bar"
     #:e "baz"
     #:f #f
     #:g "x"
     #:h 42)

That way, Racket could help us catch mistakes. Even better, we’re free to supply the arguments in a different order, and it’s OK. It’s by-name, not by-position.

As a bonus, it would be great to have optional arguments, with a default value. (Especially since a struct’s #:auto option requires all fields to share the same default value.)

Certainly we could define a foo/keyword function like this, which calls the plain foo struct constructor. I’ve done this many times. Admittedly, if you change the foo struct, you have to change this function, too. But usually they’re adjacent in the source code, and anyway it’s only the one place to make the mistake.
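
A reconstruction of that hand-written wrapper (the field names and the default for #:h are illustrative, following the eight-field foo struct above):

;; Keyword front end that calls the plain positional constructor.
(define (foo/keyword #:a a #:b b #:c c #:d d
                     #:e e #:f f #:g g #:h [h 42])
  (foo a b c d e f g h))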

Even so, it would be neat if Racket had an option to create such keyword argument constructors for structs automatically.

A macro

Well, this is Racket. Any sentence that starts with, “It would be neat if Racket could ___”, can be answered with, “And I can add that to Racket myself!”

Here’s what we’d be writing by hand:

(struct foo (a b c ...))

(define (foo/kw #:a a #:b b #:c [c 42] ...)
  (foo a b c ...))

We’re defining a function whose name is the struct name with "/kw" appended. For each struct field, we want a keyword argument, where the keyword is similar to the field name. Also, we’d like to support optional arguments.

So here’s a macro:

 1  #lang racket/base
 2  
 3  (require (for-syntax racket/base
 4                       racket/list
 5                       racket/syntax
 6                       syntax/parse))
 7  
 8  (begin-for-syntax
 9   (define syntax->keyword (compose1 string->keyword symbol->string syntax->datum)))
10  
11  (define-syntax (struct/kw stx)
12    (define-syntax-class field
13      (pattern id:id
14               #:with ctor-arg #`(#,(syntax->keyword #'id) id))
15      (pattern [id:id default:expr]
16               #:with ctor-arg #`(#,(syntax->keyword #'id) [id default])))
17    (syntax-parse stx
18      [(_ struct-id:id (field:field ...) opt ...)
19       (with-syntax ([ctor-id (format-id #'struct-id "~a/kw" #'struct-id)]
20                     [((ctor-arg ...) ...) #'(field.ctor-arg ...)]) ;i.e. append*
21         #'(begin
22             (struct struct-id (field.id ...) opt ...)
23             (define (ctor-id ctor-arg ... ...) ;i.e. append*
24               (struct-id field.id ...))))]))
25  
26  ;;; Example usage:
27  
28  ;; Define a struct type
29  (struct/kw foo (a b [c 42]) #:transparent)
30  
31  ;; Use normal ctor
32  (foo 1 2 3)                ; => (foo 1 2 3)
33  
34  ;; Use keyword ctor
35  (foo/kw #:a 1 #:b 2 #:c 3) ; => (foo 1 2 3)
36  
37  ;; Use keyword ctor, taking advantage of default arg for #:c field
38  (foo/kw #:a 1 #:b 2)       ; => (foo 1 2 42)

Lines 3–6 require some modules that aren’t part of the racket/base environment that macros run in.

Lines 8–9 define a helper function that can be used by a macro. To do that, the function must be defined in a begin-for-syntax form.

Line 11 onward is the macro definition.

Lines 12–16 define a syntax class to use with syntax-parse. The class matches struct fields, which can be either an identifier alone or an [identifier default-value] form. In both cases, the syntax class defines an extra bit of syntax, ctor-arg. For each field, this is the arg spec to use in the definition of our special constructor function. This will be something like #:id id in the first case or #:id [id default] in the second case.

Lines 17–24 are the syntax-parse form. The pattern is:

    [(_ struct-id:id (field:field ...) opt ...)

This means there will be an identifier for the struct, followed by a list of zero or more fields, and finally zero or more options.

Lines 19–20 use with-syntax to create a couple pattern variables:

     (with-syntax ([ctor-id (format-id #'struct-id "~a/kw" #'struct-id)]
                   [((ctor-arg ...) ...) #'(field.ctor-arg ...)]) ;i.e. append*

The first, ctor-id, is simply the name of our constructor function — append /kw to the user’s struct identifier.

The second, ctor-arg, is our list of arg specs for the constructor function. We’ll need to append* these — “flatten” them one level, from a list of lists into a list. That’s the reason for the funny nested ellipses: ((ctor-arg ...) ...) — it sets us up to say ctor-arg ... ... down on line 23.

Finally lines 21–24 are the template — the syntax we’re returning. This is simply a struct definition plus the definition of our special constructor function. Again, the business with the double ellipses is how we append* a list of lists like this:

'((#:a a) (#:b b) (#:c [c 42]))

down to:

'(#:a a #:b b #:c [c 42])

Which is the argument list we want for our constructor.

And that’s it. Although this macro doesn’t exhaustively cover all possible struct options, it’s an example of something you could use in a project to write code that is less repetitive and more resilient.

by Greg Hendershott at July 04, 2015 05:15 PM