Planet Scheme

December 16, 2014

Programming Praxis

Find Two Added Integers

I saw this question at a popular interview-question site and got mad at the suggested answer:

You are given two lists containing identical sets of integers except that the first list contains two additional integers not present in the second set. For instance, with a = {1 4 9 2 7 3} and b = {9 7 4 1}, the two additional integers are 2 and 3. Write a program to find the two additional integers.

Your task is to write the requested program. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.


by programmingpraxis at December 16, 2014 09:00 AM

December 12, 2014

Programming Praxis

Magic Squares

Magic squares are an arrangement of consecutive counting numbers starting from 1 into rows and columns in which every row, column and the two main diagonals all sum to the same number. Magic squares are an important element of recreational mathematics, and have been known since antiquity; this magic square of order 3, with magic sum 15, was published in China in 650BC:

4 9 2
3 5 7
8 1 6

That magic square was constructed by the “start bottom, move down and right, else up” rule: Starting from either of the center cells on the bottom row, enter a 1, then move to the next square diagonally down and right (wrapping around the sides of the square if necessary), then enter the next number in sequence, 2, and so on. However, if the next square is occupied, move instead to the next square up from the current square, then continue the sequence. Various rules like “start left, move down and right, else up” and “start top, move up and right, else down” also work, but rules like “start top, move down and left, else up” don’t; I’ll let you have the pleasure of figuring out which rules work and which don’t. This method works only for odd orders; there are other methods for even orders, which we may examine at some other time.

In the example square, the starting cell is the center cell on the bottom edge, where you see 1. Moving down and right and wrapping to the top of the next column, enter 2. Then moving down and right and wrapping to the left of the next row, enter 3. The next cell down and right already contains 1, so instead move up from the current cell and enter 4. Then moving down and right, enter 5. Then moving down and right, enter 6. The next cell down and right, wrapping both the row and column, already contains 4, so instead move up from the current cell and enter 7. Then moving down and right and wrapping to the left of the next row, enter 8. Finally, moving down and right and wrapping to the top of the next column, enter 9.

Your task is to write a program that takes an odd order n, one of the four starting cells and a rule and generates the indicated magic square. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.


by programmingpraxis at December 12, 2014 09:00 AM

December 10, 2014

The Racket Blog

The Racket package system and Planet

We have recently moved the majority of Racket's code base into packages and repositories separate from the main core repository. This has given the Racket package system another cycle of attention. Whenever this happens, there are often questions and confusion about how to solve various distribution problems with the package system. A natural point of comparison is the older Planet system provided by Racket that appears to solve similar problems. In this blog post, I attempt to explain the purpose of the package system and its relation to Planet.


The package system and Planet do not solve the same problem and don't exist for the same reason.


Planet is:

  1. A file distribution mechanism for source code.


    Via .plt files that are installed into a particular place on your machine and then raco setup'd.

  2. A mechanism for automatically downloading and installing source code just before it is needed by programs.


    Via the (planet ...) require form.

  3. A centralized database of libraries


    Via the Planet website and its server & protocol which were undocumented and proprietary for the majority of Planet's life

  4. A prescriptive model of how programs and libraries should be composed.


    Specifically the system of major/minor versions, tagging packages by author name, and embedding the names of packages in source code.


In contrast, the package system is:

  1. A file distribution mechanism for source code, byte code, and documentation.

    Via the raco pkg command.


In this way, the package system is almost identical to an operating system package system like Debian's dpkg and apt systems. The problem is very finely tailored and becomes more flexible as a result (notice that we can now distribute byte code and documentation.) This design aspires to follow the admonition of holy writ: "Programming languages should be designed not by piling feature on top of feature, but by removing the weaknesses and restrictions that make additional features appear necessary."


Furthermore, it was intended to solve practical problems throughout the Racket ecosystem. In particular, one of the common complaints people had and have about Planet is the very long install times because of long builds. The package system allows this problem to be solved by distributing pre-built code.


Since the package system specifically does not address jobs 2, 3, or 4 of Planet, we have to ask, "Do they need to be solved?" and if so, "How can we solve them on top of the package system, i.e. as a library in honor of the design principle?".


In particular, 2 and 3 are very painful for people wanting to just use the file distribution mechanism of Planet. 2 causes unpredictability, because you don't know if running a program will start a long invocation of "raco setup", require Internet access, and start running un-vetted code. 3 requires you to share your code if you want to use the file distribution mechanism and is a single point of failure for doing installation.


By not mandating how to address 2 and 3 in the package system, we offer flexibility. Here is where the solutions to these jobs are now:


2. There is currently no way to get automatic installs of packages. However, both DrRacket and xrepl offer advice about which packages you might want to install to compile and run the program. It would be natural to extend this advice to be automatic and patches are welcome. Given the experiences of operating systems which merely make suggestions (nethack: command not found, provided by nethack-console), I personally feel like we are at the sweet spot.


3. The file distribution mechanism's flexible package sources combine with a very simple protocol for package catalogs (Take a URL, add/pkg/, add a string, get a read-able hash table) to look up packages you don't yet have. As a service, we run a few catalogs (one for each release, plus pkgs.r-l.o). But we expect that users with special needs (such as sensitive installations that need exactly certain tested and trusted versions, especially with proprietary software) will build their own catalogs on private Web sites.


Clearly, however, job 4 is where Planet and the package system differ the most.


With the package system, we follow the precedent of operating systems. An OS package's job is to get files into the right spot. An OS package contains a binary and instructions to install it as /usr/bin/zsh. It is not typical in OSes to be able to install multiple packages (such as different "versions" of the "same" package) that both provide /usr/bin/zsh. When you're at a Unix prompt, you don't have to write zsh-5.0.5/usr/bin/zsh. It's possible that many consider this is a big problem with OSes and indeed we do observe that it is fairly common to provide packages that provide binaries and libraries with embedded names such as how on my machine I have python2.6, python2.7, and python3.2 all in my $PATH. It is important to realize, however, that the deb format and the apt tool didn't need to change to support this change or future changes in perspective in how to compose code.


I hope this analogy helps understand the Racket package system. In the package system, a package doesn't install "binaries", "man pages", and "init scripts", but installs similar things, such as "module paths", "documentation manuals", and "raco commands". Each of these has a notion of conflict: can't have two zshs or two racket/lists; can't have two zsh.1 pages or two docs named doc; can't have two modules trying to provide raco neo-tokyo-is-about-to-explode. If you find a random .deb on the Internet, can you predict what binaries it will contain from its name? No. The same goes for Racket packages. However, if you are egregiously weird, then people probably won't want to install your packages, just like for random debs.


However, clearly rules are helpful. In the world of operating systems, you know that basically all packages distributed by Debian can be installed at the same time, except for "virtual packages" that do stuff like selecting whether postfix or sendmail should be responsible for the sendmail command. These rules are not enforced through technology, though. Instead, the Debian maintainers have a social process that enforces them, with information being provided by technology (such as regression systems that identify unintended conflicts.) The catalog server that the Racket team provides helps facilitate a similar process with the concentric rings (all ring <=1 packages can be installed at once and ring 1< packages can do anything.)


Non-conflicting sets of packages is the simplest rule to define and enforce. Other rules about backwards compatibility are much more complicated to define and enforce. I do not believe there is much precedent in the world of OSes, although we can see a little bit of what they do through things like libgtk, libgtk2, and libgtk3, where generally code written for one libgtk2 package is compatible with all libgtk2 packages made in the future, but libgtk3 is effectively a totally different package and introduces totally separate binaries like gtk3-config.


The most that the Racket team attempts to do here is to say, "Here are the rules we will follow and we think you should follow them too." Specifically, that we will maintain backwards compatibility or make a new package. We can't and won't enforce this, nor do we always live up to it with our own work (but we feel really bad about it when we do.)


Although my main goal of this section has been to explain my solution to (4), a great thing about the package system is that it is not binding at all. You can decide to follow the same rules as Planet. It is easy to do so:


  • Always name your packages $AUTHOR-$PACKAGE-$MAJOR
  • Always provide modules from only the collection, $AUTHOR-$PACKAGE-$MAJOR
  • Maintain backwards compatibility within releases of $AUTHOR-$PACKAGE-$MAJOR
  • Update the 'version metadata in the package info.rkt to reflect the $MINOR version.

And, boom!, you've recreated the rules of Planet to a T except for two things: (a) you'll still need to put a dependency on $AUTHOR-$PACKAGE-$MAJOR on the outside of code in a package info.rkt file rather than just inside files and (b) you can't use $AUTHOR-$PACKAGE to refer to "whatever the current $MAJOR" is.


The first compromise of adding something to the info.rkt is fairly modest, as it requires O(1) line modifications.


The second compromise is more severe, although actually you could just maintain such a package and deal with the breakage that occurs when you try to upgrade. Such breakage, however, was present in Planet too, as when a package was installed based on $AUTHOR-$PACKAGE only the local machine would cache the version used, so if you took the requiring module to another machine, it would download a new version and, potentially, have a backwards incompatibility problem. Using the package system in the most naive way (i.e. installing the $AUTHOR-$PACKAGE at some point and programming to that) would work exactly the same as Planet, except that the package system was designed to be able to port installations from one machine to another with raco pkg migrate.


I hope this blog post has helped explain the package system and shown that it does not prevent you from doing anything that Planet let you do, it only allows you to do more.

by Jay McCarthy (noreply@blogger.com) at December 10, 2014 05:58 PM

December 09, 2014

Programming Praxis

Every Possible Fraction

When I saw this web page a few days ago, I knew it would make a great exercise, simple but fascinating. Starting with x = 1, every fraction in lowest terms is generated by x ′ = 1 / (ny + 1), where n = ⌊ x ⌋ and y = x − n. The underlying math is known as the Stern-Brocot tree, and has been known since the ancient Greeks, who used terms of the Stern-Brocot tree to design the gear ratios in the Antikythera mechanism.

Your task is to write a program that generates the fractions in lowest terms. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.


by programmingpraxis at December 09, 2014 09:00 AM

December 08, 2014

Ben Simon

Trigraphs, Diana Pads and Zombies

I'm browsing this guy's zombie response kit when I noticed this item here:

This naturally raises the questions: (a) what's a tri-graph and (b) how does it help with "clandestine" communications?

Of course, the Internet explains: a tri-graph (or is it trigraph?) is a lookup table used in One Time Pad (OTP) encryption. One example of an OTP are pages and pages of random text. If you and I have copies of the same one time pad, then we can use a tri-graph as follows:

1. Pick a line in the OTP (known as our key text). Say:

  VAXPM IPIXU QUXIP MAXIU

2. Write the clear text below it:

  VAXPM IPIXU QUXIP MAXIU
  MEETA TTENT ONIGH TXXXX

(that's: meet at 10 tonight)

3. For each letter, look up the key text and plain text character in the trigraph and find the resulting encrypted text.

  VAXPM IPIXU QUXIP MAXIU
  MEETA TTENT ONIGH TXXXX
  SVYRN YRNPM VSULD UCFUI

Because of the symmetric nature of the tri-graph, I can decode the text using the same procedure. That is, if I look up a key and encrypted letter, I will arrive and the decoded letter.

That may seem like a neat parlor trick, but the fact is, the above encryption is actually quite strong. In fact, assuming that the pads are truly randomly generated, never reused and never compromised the system is unbreakable.

One use case for the above system was during the Vietnam War:

Special Forces were one of (if not the only) units in Vietnam to utilize Morse code on a regular basis. We used a method of encryption called the Diana Cryptosystem.

The basis of these "One-Time Pads", is that there were only two matching pads in existence, and they would only be used one time. They were booklets that contained randomly generated groups of 5-letter "words;” 30 words to a page. The person sending a message would first write the letters to the message, over these random groups of words. Included in the front of each one-time pad was a one-page encryption table. If I wanted to send the letter "P", and the letter under the "P" was an "A", then I would send a "K". The person listening on the frequency at the other end, would have the other matching pad. They would write the letter they received (a "K") over the letter in their one-time pad (an "A"), and decipher it based on the table, yielding the original letter "P".

Each communication site in Vietnam (we had over 100 A-Camps along the Cambodian / Laotian border, and some 20 B-detachment sites spread over the country) had a different pad, depending on the location they were having the commo-check with. It obviously was very important that both people were using the appropriate matching pads, or the deciphered messages would not make any sense.

After a while, most of us became so proficient with the system, that we actually learned the deciphering matrix by heart. No matter what pads anyone had, the combinations always were the same. i.e. Any 3 letters always went together, regardless of the order; "BKO"/"KOB"/"OBK"/"BOK". After listening to thousands and thousands of transmissions, it really got quite simple. If I was listening to code, and a letter "B" was sent (now remember, we usually sent around 20-25 "words" (5 letters per word) a minute, hence the importance of the "speed" keys!), and the letter it was associated with was an "O", most of us would decipher as we heard it, and just write the "K". That may sound like quite a yarn, but it is absolutely true.

That's my kind of solution: simple enough that a a soldier can manually execute the encryption and send it over Morse Code, yet sophisticated enough that the code was effectively unbreakable.

Naturally, I had to experiment with this encryption, and of course, write some code to make it easier to play with. You can find the code here, or copied below. The code provides two top level function: make-pad which generates characters for a one time pad and tri-lookup which does the tri-graph lookup:

(In the session above, k is the key text and p is the plain text)

Thinking more about this form of encryption, it's remarkable how practical it could be. For example, if both my wife and I had a business card sized piece of paper packed with random characters, we could send dozens of short messages to each other using the above technique. Furthermore, any stream of text that we agree upon could be used as a key. While random text would be ideal, technically any string of characters would work. We could use song lyrics, the 5th paragraph from the 2nd story on the 3rd page of the New York Times, street names from a map, ingredients on a shampoo bottle, etc. Heck, you could even incorporate the clue to the key in the messages. For example:

 K541 8151 Z781 0742 AI06 PU72 EBXBH HGOXU

Where K541 8151 Z781 0742 AI06 PU72 corresponds to tweet 541815178107420672 (the spaces and letters are meant to throw off attackers).

Of course, straying from randomized text definitely weakens the setup. I wouldn't want to protect state secrets or call in an air strike using these short-cuts. But, this would work for leaving a romantic note to my wife or other less than sensitive message.

At the end of the day, I suppose the most important question is: does a tri-graph belong in your zombie fighting kit? Heck yeah! It also belongs in your kids play fort setup, and teenager's clandestine communication kit.


;;
;; Implement a version of One Time Pad encryption.
;; Use a trigraph / diana pad method
;; http://danmorgan76.wordpress.com/2013/09/30/encryption-via-a-one-time-pad/
;; http://home.earthlink.net/~specforces/spdiana.htm
;; 

(define (a->i letter)
 (- (char->integer letter) 65))
  
(define (i->a index)
 (integer->char (+ 65 index)))
 
(define (rand-char)
 (i->a (random-integer 26)))

(define (range low high)
 (if (> low high) '() (cons low (range (+ 1 low) high))))
 
(define (head items)
 (let loop ((i 0) (items items) (accum '()))
  (cond ((or (null? items) (= i 5)) (reverse accum))
        (else
         (loop (+ i 1) (cdr items) (cons (car items) accum))))))

(define (tail items)
 (reverse (head (reverse items))))
 
(define (make-pad rows cols)
 (define (make-block)
  (apply string (map (lambda (i) (rand-char)) (range 1 5))))
 (define (make-row)
  (for-each (lambda (i)
             (if (> i 1) (display " ")) 
             (display (make-block)))
            (range 1 cols)))
 (for-each (lambda (i)
             (make-row) (newline))
           (range 1 rows)))

(define (tri-row i)
 (reverse (map (lambda (pos)
                 (i->a (modulo (- pos i) 26)))
          (range 0 25))))

(define (tri-lookup key plain)
 (let loop ((key (string->list key))
            (plain (string->list plain))
            (coded '()))
  (cond ((null? plain) (apply string (reverse coded)))
        ((equal? #\space (car plain))
         (loop (cdr key) (cdr plain)
               (cons #\space coded)))
        (else
         (let ((row (a->i (car plain)))
               (col (a->i (car key))))
           (loop (cdr key) (cdr plain) 
                 (cons (list-ref (tri-row row) col) coded)))))))          
(define k "ASDFA POUYK")
(define p "HELLO WORLD")

by Ben Simon (noreply@blogger.com) at December 08, 2014 01:07 PM

December 05, 2014

Programming Praxis

Free Time

We have a very practical task today: Given the set of busy-time intervals of two people, as in a calendar, find the free-time intervals of both people so they can arrange a meeting.

Your task is to write a program to find free time. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the
comments below.


by programmingpraxis at December 05, 2014 09:00 AM

December 02, 2014

Andy Wingo

there are no good constant-time data structures

Imagine you have a web site that people can access via a password. No user name, just a password. There are a number of valid passwords for your service. Determining whether a password is in that set is security-sensitive: if a user has a valid password then they get access to some secret information; otherwise the site emits a 404. How do you determine whether a password is valid?

The go-to solution for this kind of problem for most programmers is a hash table. A hash table is a set of key-value associations, and its nice property is that looking up a value for a key is quick, because it doesn't have to check against each mapping in the set.

Hash tables are commonly implemented as an array of buckets, where each bucket holds a chain. If the bucket array is 32 elements long, for example, then keys whose hash is H are looked for in bucket H mod 32. The chain contains the key-value pairs in a linked list. Looking up a key traverses the list to find the first pair whose key equals the given key; if no pair matches, then the lookup fails.

Unfortunately, storing passwords in a normal hash table is not a great idea. The problem isn't so much in the hash function (the hash in H = hash(K)) as in the equality function; usually the equality function doesn't run in constant time. Attackers can detect differences in response times according to when the "not-equal" decision is made, and use that to break your passwords.

Edit: Some people are getting confused by my use of the term "password". Really I meant something more like "secret token", for example a session identifier in a cookie. I thought using the word "password" would be a useful simplification but it also adds historical baggage of password quality, key derivation functions, value of passwords as an attack target for reuse on other sites, etc. Mea culpa.

So let's say you ensure that your hash table uses a constant-time string comparator, to protect against the hackers. You're safe! Or not! Because not all chains have the same length, "interested parties" can use lookup timings to distinguish chain lookups that take 2 comparisons compared to 1, for example. In general they will be able to determine the percentage of buckets for each chain length, and given the granularity will probably be able to determine the number of buckets as well (if that's not a secret).

Well, as we all know, small timing differences still leak sensitive information and can lead to complete compromise. So we look for a data structure that takes the same number of algorithmic steps to look up a value. For example, bisection over a sorted array of size SIZE will take ceil(log2(SIZE)) steps to get find the value, independent of what the key is and also independent of what is in the set. At each step, we compare the key and a "mid-point" value to see which is bigger, and recurse on one of the halves.

One problem is, I don't know of a nice constant-time comparison algorithm for (say) 160-bit values. (The "passwords" I am thinking of are randomly generated by the server, and can be as long as I want them to be.) I would appreciate any pointers to such a constant-time less-than algorithm. However a bigger problem is that the time it takes to access memory is not constant; accessing element 0 of the sorted array might take more or less time than accessing element 10. In algorithms we typically model access on a more abstract level, but in hardware there's a complicated parallel and concurrent protocol of low-level memory that takes a non-deterministic time for any given access. "Hot" (more recently accessed) memory is faster to read than "cold" memory.

Non-deterministic memory access leaks timing information, and in the case of binary search the result is disaster: the attacker can literally bisect the actual values of all of the passwords in your set, by observing timing differences. The worst!

You could get around this by ordering "passwords" not by their actual values but by their cryptographic hashes (e.g. by their SHA256 values). This would force the attacker to bisect not over the space of password values but of the space of hash values, which would protect actual password values from the attacker. You still leak some timing information about which paths are "hot" and which are "cold", but you don't expose actual passwords.

It turns out that, as far as I am aware, it is impossible to design a key-value map on common hardware that runs in constant time and is sublinear in the number of entries in the map. As Zooko put it, running in constant time means that the best case and the worst case run in the same amount of time. Of course this is false for bucket-and-chain hash tables, but it's false for binary search as well, as "hot" memory access is faster than "cold" access. The only plausible constant-time operation on a data structure would visit each element of the set in the same order each time. All constant-time operations on data structures are linear in the size of the data structure. Thems the breaks! All you can do is account for the leak in your models, as we did above when ordering values by their hash and not their normal sort order.

Once you have resigned yourself to leaking some bits of the password via timing, you would be fine using normal hash tables as well -- just use a cryptographic hashing function and a constant-time equality function and you're good. No constant-time less-than operator need be invented. You leak something on the order of log2(COUNT) bits via timing, where COUNT is the number of passwords, but since that's behind a hash you can't use it to bisect on actual key values. Of course, you have to ensure that the hash table isn't storing values in sorted order and short-cutting early. This sort of detail isn't usually part of the contract of stock hash table implementations, so you probably still need to build your own.

Edit: People keep mentioning Cuckoo hashing for some reason, despite the fact that it's not a good open-hashing technique in general (Robin Hood hashes with linear probing are better). Thing is, any operation on a data structure that does not touch all of the memory in the data structure in exactly the same order regardless of input leaks cache timing information. That's the whole point of this article!

An alternative is to encode your data structure differently, for example for the "key" to itself contain the value, signed by some private key only known to the server. But this approach is limited by network capacity and the appropriateness of copying for the data in question. It's not appropriate for photos, for example, as they are just too big.

Edit: Forcing constant-time on the data structure via sleep() or similar calls is not a good mitigation. This either massively slows down your throughput, or leaks information via side channels. Remote attackers can measure throughput instead of latency to determine how long an operation takes.

Corrections appreciated from my knowledgeable readers :) I was quite disappointed when I realized that there were no good constant-time data structures and would be happy to be proven wrong. Thanks to Darius Bacon, Zooko Wilcox-O'Hearn, Jan Lehnardt, and Paul Khuong on Twitter for their insights; all mistakes are mine.

by Andy Wingo at December 02, 2014 10:01 PM

Programming Praxis

Gray Code Neighbors

Our exercise today is based on Gray code sequences, where each number in the sequence differs from its predecessor in only one bit. We studied Gray code sequences in a previous exercise. Here is an interview question base on Gray code sequences:

Given two integers, determine if they are two consecutive numbers in a Gray code sequence.

Your task is to write a program that determines if two numbers appear consecutively in a Gray code sequence. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.


by programmingpraxis at December 02, 2014 09:00 AM

December 01, 2014

Jeremy Kun

Linear Programming and the Simplex Algorithm

In the last post in this series we saw some simple examples of linear programs, derived the concept of a dual linear program, and saw the duality theorem and the complementary slackness conditions which give a rough sketch of the stopping criterion for an algorithm. This time we’ll go ahead and write this algorithm for solving linear programs, and next time […]

by j2kun at December 01, 2014 04:00 PM

November 28, 2014

Programming Praxis

A Prime Number Puzzle

Today’s exercise comes from the world of recreational mathematics.

For each number n from 1 to 9 inclusive, find a number m that begins with the digit n, has n digits, has each two-digit sequence within m a different prime number, and is as small as possible.

When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.


by programmingpraxis at November 28, 2014 09:00 AM

November 27, 2014

Andy Wingo

scheme workshop 2014

I just got back from the US, and after sleeping for 14 hours straight I'm in a position to type about stuff again. So welcome back to the solipsism, France and internet! It is good to see you on a properly-sized monitor again.

I had the enormously pleasurable and flattering experience of being invited to keynote this year's Scheme Workshop last week in DC. Thanks to John Clements, Jason Hemann, and the rest of the committee for making it a lovely experience.

My talk was on what Scheme can learn from JavaScript, informed by my work in JS implementations over the past few years; you can download the slides as a PDF. I managed to record audio, so here goes nothing:


55 minutes, vorbis or mp3

It helps to follow along with the slides. Some day I'll augment my slide-rendering stuff to synchronize a sequence of SVGs with audio, but not today :)

The invitation to speak meant a lot to me, for complicated reasons. See, Scheme was born out of academic research labs, and to a large extent that's been its spiritual core for the last 40 years. My way to the temple was as a money-changer, though. While working as a teacher in northern Namibia in the early 2000s, fleeing my degree in nuclear engineering, trying to figure out some future life for myself, for some reason I was recording all of my expenses in Gnucash. Like, all of them, petty cash and all. 50 cents for a fat-cake, that kind of thing.

I got to thinking "you know, I bet I don't spend any money on Tuesdays." See, there was nothing really to spend money on in the village besides fat cakes and boiled eggs, and I didn't go into town to buy things except on weekends or later in the week. So I thought that it would be neat to represent that as a chart. Gnucash didn't have such a chart but I knew that they were implemented in Guile, as part of this wave of Scheme consciousness that swept the GNU project in the nineties, and that I should in theory be able to write it myself.

Problem was, I also didn't have internet in the village, at least then, and I didn't know Scheme and I didn't really know Gnucash. I think what I ended up doing was just monkey-typing out something that looked like the rest of the code, getting terrible errors but hey, it eventually worked. I submitted the code, many years ago now, some of the worst code you'll read today, but they did end up incorporating it into Gnucash and to my knowledge that report is still there.

I got more into programming, but still through the back door, so to speak. I had done some free software work before going to Namibia, on GStreamer, and wanted to build a programmable modular synthesizer with it. I read about Supercollider, and decided I wanted to do something like that but with the "unit generators" defined in GStreamer and orchestrated with Scheme. If I knew then that Scheme could be fast, I probably would have started on an entirely different course of things, but that did at least result in gainful employment doing unrelated GStreamer things, if not a synthesizer.

Scheme became my dominant language for writing programs. It was fun, and the need to re-implement a bunch of things wasn't a barrier at all -- rather a fun challenge. After a while, though, speed was becoming a problem. It became apparent that the only way to speed up Guile would be to replace its AST interpreter with a compiler. Thing is, I didn't know how to write one! Fortunately there was previous work by Keisuke Nishida, jetsam from the nineties wave of Scheme consciousness. I read and read that code, mechanically monkey-typed it into compilation, and slowly reworked it into Guile itself. In the end, around 2009, Guile was faster and I found myself its co-maintainer to boot.

Scheme has been a back door for me for work, too. I randomly met Kwindla Hultman-Kramer in Namibia, and we found Scheme to be a common interest. Some four or five years later I ended up working for him with the great folks at Oblong. As my interest in compilers grew, and it grew as I learned more about Scheme, I wanted something closer there, and that's what I've been doing in Igalia for the last few years. My first contact there was a former Common Lisp person, and since then many contacts I've had in the JS implementation world have been former Schemers.

So it was a delight when the invitation came to speak (keynote, no less!) the Scheme Workshop, behind the altar instead of in the foyer.

I think it's clear by now that Scheme as a language and a community isn't moving as fast now as it was in 2000 or even 2005. That's good because it reflects a certain maturity, and makes the lore of the tribe easier to digest, but bad in that people tend to ossify and focus on past achievements rather than future possibility. Ehud Lamm quoted Nietzche earlier today on Twitter:

By searching out origins, one becomes a crab. The historian looks backward; eventually he also believes backward.

So it is with Scheme and Schemers, to an extent. I hope my talk at the conference inspires some young Schemer to make an adaptively optimized Scheme, or to solve the self-hosted adaptive optimization problem. Anyway, as users I think we should end the era of contorting our code to please compilers. Of course some discretion in this area is always necessary but there's little excuse for actively bad code.

Happy hacking with Scheme, and au revoir!

by Andy Wingo at November 27, 2014 05:48 PM

November 25, 2014

Programming Praxis

Thou Impertinent Urchin-Faced Miscreant!

In 1968, Newsweek published an article “How to Win at Wordsmanship” by Philip Broughton. You pick a random number from 000 to 999, then look up a three word phrase in a table coded by the three digits of the random number:

Column 1 Column 2 Column 3
0. integrated 0. management 0. options
1. total 1. organizational 1. flexibility
2. systematized 2. monitored 2. capability
3. parallel 3. reciprocal 3. mobility
4. functional 4. digital 4. programming
5. responsive 5. logistical 5. concept
6. optional 6. transitional 6. time-phase
7. synchronized 7. incremental 7. projection
8. compatible 8. third-generation 8. hardware
9. balanced 9. policy 9. contingency

For instance, the random number 031 leads to the phrase “integrated reciprocal flexibility,” which you can use in a technical conversation or report. Broughton said “No one will have the remotest idea of what you are talking about, but the important thing is that they’re not about to admit it.”

Since then, many buzz-phrase generators have been developed, including corporate-speak “holistically embrace customer-directed imperatives” and the Shakespearean insult generator that gave us the title of our exercise. We have a couple of word lists on the next page, or you can Google for “buzz-phrase generator” to find lots of them on the web, but it’s most fun to build your own.

Your task is to write a program that generates random buzz-phrases. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.


by programmingpraxis at November 25, 2014 09:00 AM

November 23, 2014

The Racket Blog

800!

800
(Racket Tasks On Rosetta Code)

Since (and even before) Asumu Takikawa’s post “200!” at the beginning of March 2013, folk have been beavering away, implementing tasks on Rosetta Code. And on November 15th 2014:

800 tasks have now been Implemented1 in Racket on the Rosetta Code website!

Before I go any further it must be said that, without a doubt… this is awesome! This achievement represents a lot of work, and a lot of code. And everyone who has participated should be thanked and congratulated for getting this far.

So thank you. And well done!

What is Rosetta Code?

Rosetta Code (RC) describes itself as:

.. “a programming chrestomathy site. The idea is to present solutions to the same task in as many different languages as possible, to demonstrate how languages are similar and different, and to aid a person with a grounding in one approach to a problem in learning another. Rosetta Code currently has 758 tasks, 134 draft tasks, and is aware of 560 languages, though we do not (and cannot) have solutions to every task in every language.”2

Of these tasks, 800 have been implemented in Racket… some tasks, like Hello World/Text, have been implemented in hundreds of languages. Some, however, like Time-based One-time Password Algorithm, have only been implemented in 3 (including Racket).

If you haven’t already, I suggest you take a quick look about the site to get a feel of what that means in practice.

WARNING: Rosetta Code is a wiki. Like any wiki it will steal your time from you as you browse tasks, algorithms, languages and the occasional link to Wikipedia. Don’t say I didn’t warn you.

What Can You Do With Rosetta Code?

Learn From It

Rosetta Code is a valuable resource with plenty of material to absorb and ideas to be had from. If you’re new to Racket, there are tasks like Loops/For which will get you on your way with fundamental programming tasks.

If you want something juicier there are other tasks (like Nonogram solver which runs to over 400 lines of Racket) for you to pick over.

And there’s everything in between.

Write Code!

Each task gives you a chance to think, “Is this how I would do this?”

Even if I don’t submit something, I find it’s fun to write some code around the task. In fact, I don’t even have to type code into a REPL, the thought exercise is often fun enough!

Some tasks, like (“Chess Player”), are shall we say, very challenging. But don’t let even that put you off thinking of, tinkering around or coding a solution to them.

If there isn’t a Racket implementation for a task you like the looks of, have a go.

Someone might have a better idea of how to do it in Racket later. But if there isn’t an implementation now — change that now!

Remember that others will be reading your code to understand Racket all the better. So please try to adhere to the Racket Style Guide as best you can3. Again, don’t worry about getting that perfect. Like anything, learning Racket style takes practice, and nobody expects perfection. And the experienced contributors/documentors are always at hand to help correct style.

Hints:

  • I always have a Wikimedia Cheatsheet to hand. I can never remember its markdown syntax (which unfortunately doesn’t really support <code/>, either)
  • Pick whatever task you want… but it would be good to clear all the “Complete Tasks” (as opposed to “Draft Tasks”) if you have a choice.
  • Don’t add code until it’s running and producing the output you expect. This isn’t Project Euler, you don’t have to guess the answer in most cases — it’s likely someone has some sample output to compare to.4
  • If you can’t get your head around the algorithm in the task description then try to translate another language into Racket. You’ll learn Racket, you’ll learn the other language, and in working it through for yourself you’ll also see how the algorithm takes shape and works.
  • My workflow for posting a new solution is this: Once I have something to submit, I add a “Stub” Racket implementation. I edit the whole Task page (because that is all that is available to edit at that point), find whatever is after Racket alphabetically, and insert boilerplate code. I check this template code as a “minor edit” described as “Racket stub added — implementation later”. This then allows is for me to be able to edit the Racket section in isolation and keeps the rest of the task (everyone else’s hard work) safe from any, er, silliness.
{{header|Racket}}
<lang racket>
</lang racket>
{{out}}
<pre>
</pre>
  • I have started to make it my habit (especially when showcasing one or two functions) to use #lang racket/base and requireing the salient individual functions.
  • It is often not appropriate to fully document the functionality of Racket functions in the RC task implementation. You can, however, point to the canonical documentation on the Racket website. So I also include a link to http://docs.racket-lang.org/reference/... when I need to.
  • The RC administrators have switched off image uploading (or I, at least, cannot find out how). Even though Racket can produce images as results, think hard about whether you want the hassle of trying to present images to the reader. If you find out a method that works for you please tell me, I’d love to know.

I suppose you could also extend all of the above to coding in another language — if you really have to.

Improve What’s Already There

Rosetta Code is a wiki.

It is open to anyone to edit.

Don’t be afraid to.

If you see something that could be implemented, styled or documented better — work to improve it.

Once you have your improved entry together, show it to the author of the original post. Besides being courteous, he or she might have an opinion on what else you might do. Often, there is something bugging them, and you are scratching an itch of theirs!

I have never had anyone react badly to a change request. Everyone appreciates that you have made an effort to produce your change (and that you’re not just standing in the aisles complaining that it doesn’t look right).

Teach Through It

If there is an aspect of Racket, algorithm or other “CS task” (in the broadest sense) that you want to share: see to it that it is adequately illustrated on Rosetta Code. If it is not, then create a task to demonstrate it. Not only will you show how something is done properly (i.e. in Racket), but you will also be inviting others to implement the task in their own favourite languages.

The Competition5

Back at 200, Racket was the 54th most popular language. But for some time now, it has been sitting at #2 in the popularity1 ranking for quite some time now. For a while, it has been placed a long way behind TCL, and being hotly pursued by Python (never more than 10 tasks behind).

One of my motivators is that having seen Racket get to #2 — I don’t want to see it any lower in the rankings. I’m sure there’s something in the Python lot that wants to overtake us! This healthy competition has kept both of the communities pushing ahead with implementing the tasks.

The Intro Projects page of the racket wiki has “Implement a Rosetta code task” as a “Small Project”. I think of it as slightly more of a “Recreational” project (this at least justifies to myself the element of competition that has crept in.)

The Rallying Call

or “What Specifically Would Help Racket on Rosetta Code?”

RC is a good way to present Racket as a most general programming language. So as a tool for Racket advocacy, as well as for the purposes of RC, we need to:

  • Implement more tasks in Racket to keep a high profile: 800 tasks, #2 in the popularity stakes. This keeps Racket visible; and proves it capable of (almost) anything. I would so love to give TCL a run for its money — so there’s 41 tasks to go before we can even think of taking a breather!
    • We have implemented 800 tasks in Racket. The quote above says there are 892 (758+134) tasks in total. That means that there are 92 more tasks to get to grips with.

  • Suggest new tasks: Especially tasks that will demonstrate the latest shiny feature of the latest shiny versions of Racket!

    Personally I can’t believe that there are less than 900 things that you would want to do with a programming language. If you think of a task, add it. Even impossible tasks provoke thought and imagination — and interesting solutions!

  • Improve those tasks that have been implemented in Racket: We want to maintain a body of good, useful code, to allow us to teach and demonstrate Racket. There are a number of reasons why existent tasks need revisiting:
    • Racket technology has moved on (and moves on) apace. What was unavailable and experimental even 18 months ago is now available and reliable. This new technology needs to be demonstrated.
    • Code that is even older is very “schemey” (I have in some cases simply copied the Scheme implementation and stuck a #lang racket tag on the front). Although compatible, Racket has moved quite a way on from Scheme.
    • Some implementors (not a million miles from where I’m standing, for example), were not as au fait with the language and/or style guide as they might be now. It’s a housekeeping job, I know, but giving the examples as consistent a style as possible will help satisfy this aspiration from the Racket Style Guide:

    “Doing so will help us … and our users, who use the open source code … as an implicit guide to Racket programming.”

  • Document tasks: see my hint about documentation and links above for what I now think is good practice. If some code seems utterly heiroglyphic, see if it can be made clearer. Remember this is Racket, not APL.
  • General tidying up never goes amiss.
  • RC is run by someone outside the Racket community. At the bottom of the “Small Projects” section of the Racket wiki is a suggestion to collect the RC examples into something “owned” by the Racket community. I’ve been thinking about this… if anyone has suggestions, let me know. There are limits to what we can put on RC (defined by the purpose of RC itself). It would be good to remove those limits by implementing something along RC’s lines oursleves.
  • Very specifically… anyone with a joystick, drivers and some spare time - please could you do “Joystick Position”. The possession of a joystick puts you in a position of great power with respect to that task. Exercise your responsibility.

And Finally…

Well done everyone again! Keep up the good work. And see you at 1000!



  1. You can track Racket (and everyone else’s) progress on the Popular Programming Languages report, which is updated hourly or so. 


  2. Rosetta Code’s Front Page 


  3. The style guide is actually the chapter called “How to Program Racket” in the main Racket documentation. One of the RC “style” rules is that code should be 80 characters wide. Personally, I ignore that in favour of Racket’s more generous 102. Sometimes someone on RC objects. Sometimes I then care enough to put the required newlines in. 


  4. Even if there are example results don’t necessarily trust them. e.g. in The ISAAC Cipher, the cypher engine isn’t reset between test runs in the Pascal implementation. That error is propagated through all other implementations. Mine (Racket) conforms to show that I’m doing the same thing as everyone else; but I also do what I think to be a more correct test later. 


  5. Hold on a mo… this is meant to be a pedagogical exercise, not a competition 

by Tim Brown (noreply@blogger.com) at November 23, 2014 03:40 PM

November 21, 2014

Programming Praxis

An Array Of Zeroes

Beware! Today’s exercise, which derives from an interview question asked at Facebook, is trickier than it looks:

You are given an array of integers. Write a program that moves all non-zero integers to the left end of the array, and all zeroes to the right end of the array. Your program should operate in place. The order of the non-zero integers doesn’t matter. As an example, given the input array [1,0,2,0,0,3,4], your program should permute the array to [1,4,2,3,0,0,0] or something similar, and return the value 4.

Your task is to write the indicated program. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.


by programmingpraxis at November 21, 2014 09:00 AM

November 18, 2014

Jeremy Kun

Learning a single-variable polynomial, or the power of adaptive queries

Problem: Alice chooses a secret polynomial with nonnegative integer coefficients. Bob wants to discover this polynomial by querying Alice for the value of for some integer of Bob’s choice. What is the minimal number of queries Bob needs to determine  exactly? Solution: Two queries. The first is , and if we call , then the second query […]

by j2kun at November 18, 2014 03:00 PM