Planet Scheme

November 27, 2015

Programming Praxis

Happy Thanksgiving

There will be no exercise today. To all my readers in the United States: Happy Thanksgiving!

by programmingpraxis at November 27, 2015 09:00 AM

November 24, 2015

Programming Praxis

Reversible Random Number Generator

Some applications of random number generators. games, for instance, require that the random sequence be available to run in reverse.

This is easy to do with a simple linear congruential random number generator, which is characterized by the formula next = a * prev + c (mod m). With a little bit of algebra, that formula can be written prev = a−1 * (next - c) (mod m), where a−1 is the inverse of a (mod m); that modular inverse always exists because the linear congruential generator requires that a and m are coprime.

Your task is to write a reversible random number generator. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

by programmingpraxis at November 24, 2015 09:00 AM

The Racket Blog

Racket v6.3

Racket version 6.3 is now available from
  • Racket's macro expander uses a new representation of binding called "set of scopes". The new binding model provides a simpler explanation of how macros preserve binding, especially across module boundaries and in hygiene-bending expansions. The new expander is mostly compatible with existing Racket macros, but there are some incompatibilities. For the formally inclined, a research paper on this macro system will appear at POPL next year:
  • Racket's GUI library now uses Gtk+ 3 when available, instead of Gtk+ 2. Set the PLT_GTK2 environment variable to select Gtk+ 2.
  • Added a new Redex tutorial based on a week-long workshop in SLC.
  • Better syntax error checking for Redex patterns that do not use holes correctly.
  • The blueboxes are more agressive about finding names to look up in the docs, meaning they are useful much more often.
  • Submodules are now fully supported in Typed Racket. Previously, some uses of submodules would produce internal errors, making it hard to module+ test and module+ main effectively in Typed Racket. The switch to the set-of-scopes expander fixed these problems, and submodules are now happily at home in Typed Racket.
  • The typed/racket/unsafe library provides import and export forms that circumvent contract generation. This improves performance for typed-untyped interaction at the cost of safety and debuggability.
  • Typed Racket provides experimental support for units (from racket/unit).
  • The experimental define-new-subtype form allows overlaying finer distinctions between otherwise identical types, similar to Haskell's new type.
  • The Promise type constructor changes in a backwards-incompatible way to exclude promises created with promise/name.
  • The unstable-* packages are out of the main distribution. Most of their contents have been either merged with established Racket libraries or spun off as their own packages. This change is backwards compatible for packages that properly list their dependencies. Full details:
  • edu: big-bang supports a display-mode clause so that world programs can take over the entire screen.
Feedback welcome

by Ryan Culpepper ( at November 24, 2015 04:30 AM

November 23, 2015

Ben Simon

Reading Racket Documentation on Android, No Internet Needed

I've got emacs and racket running nicely on my Android phone (thanks to the awesome Gnuroot). But that left me with a new wrinkle: what's the best way to access Racket Documentation while programming on my phone? Obviously, I can open up a browser to but what if I'm a context where I don't have access to the Net?

By default, apt-get appears to install racket documentation in /usr/share/doc/racket, which means that my phone had the documentation, I just needed a way of viewing it.

Method 1: w3m. A quick and easy way to view the documentation is to run apt-get install w3m, and install w3m. w3m is an extremely handy tool as it lets you view html files from the command line. For example:

Method 2: launch a local web server. While w3m is a fast and functional solution, I wanted to be able to view the documentation in a standard browser, too. I figured I could accomplish this if I kicked off a local webserver. I could then point my regular Android browser to this server and I'd be all set. While I'm sure I could have installed apache or another full featured server, I found an easier and lighter weight solution: wbox. wbox is a testing tool that happens to offer the capability of being a stand alone, zero configuration, web server. Installing wbox is as easy as running apt-get install wbox. And here it is in action:

(Note the URL is: http://localhost:8081)

wbox worked perfectly, and I should be able to leverage this same approach to hosting any local documentation.

Combine this local web browser approach with Android Multi Window, and you've got a remarkably functional environment.

I'm one step closer to a full self contained, no-Internet-needed, development environment.

by Ben Simon ( at November 23, 2015 11:53 AM

Guile News

Give a talk in GNU Guile's track at FOSDEM!

GNU Guile will have its own developer room at FOSDEM 2016, which will take place in Brussels, Belgium, 30–31 January 2016.

Now is the time to submit a proposal for a talk in our devroom! We welcome all kinds of talks, be it about functional programming with Guile, experience reports on embedding Guile in your application, Web development with Guile, GNU Guix development, and more—Guile has very diverse use cases, and this devroom should reflect that.

We look forward to reading your proposal and to meeting you! :-)

by Ludovic Courtès at November 23, 2015 09:32 AM

November 20, 2015

Programming Praxis

External Sorting

Sorting the lines of a file that fits in memory is a simple matter of loading the file and calling the sort function. When the file doesn’t fit in memory, however, it must be sorted in pieces, then combined into a single output, which is harder.

Your task is to write a program that sorts the lines of a file that is too large to fit into memory into ascending ascii order. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

by programmingpraxis at November 20, 2015 09:00 AM

November 19, 2015

Ben Simon

Thermaroid: Android + a Thermal Printer = Super Cheap, Super Low Quality Photos on the Fly

Thermaroid lives! Today I finished smooshing together my Camera API progress with my past success printing to a Bluetooth Thermal Printer. The result is a drop dead simple app. The app starts up with absolutely no UI, it's just a live camera preview. Short press on the screen and the camera focuses. Long press the image is captured and sent to the Bluetooth printer. Here's an action shot:

(Yes, that's a screenshot of me using Thermaroid to take a picture of a picture that I just printed out with Thermaroid. Got that?)

The app needs quite a bit of polishing before it's ready for prime time, including: some indication that your photo is being printed and proper code for backgrouding the printing process. But functionality wise, it works.

I think I'm finally getting the hang of using Kawa Scheme to build apps. Apparently I needed to let go of this notion that I could somehow write Java code using Scheme's terse conventions. Kawa lets you do this a bit, but at the end of the day, things seem to work better (or compile with fewer complaints) when you annotate your code with types. I'm definitely interested in seeing if I can do some refactoring to eek out a bit more elegance than purely translating Java to Scheme. We'll see.

Check out the code here. And happy photo'ing!

by Ben Simon ( at November 19, 2015 05:21 PM

GNU Guix

Service composition in GuixSD

GuixSD provides a declarative, stateless approach to operating system configuration management. In this context, the mechanism offered to select and compose system services is a crucial one. This post presents the new service framework introduced in the 0.9.0 version of GNU Guix.

Declarative Configuration Management

GuixSD is not like your parents’ distro. Instead of fiddling with configuration files all around, or running commands that do so as a side effect, the system administrator declares what the system will be like. This takes the form of an operating-system declaration, which specifies all the details: file systems, user accounts, locale, timezone, system services, etc.

If you’re familiar with it, this may remind you of what deployment tools like Ansible and Puppet provide. There is an important difference though: GuixSD takes a stateless—or “purely functional”—approach. This means that instantiating the system with guix system always produces the same result, without modifying the current system state. This is what makes it possible to test new system configurations, roll-back to previous ones, and so on. The guix system command allows system configurations to be instantiated on the bare metal, in virtual machines, or in containers, which makes it easy to test them.

In GuixSD, operating-system declarations are first-class objects in the host language. They can be inspected at the REPL:

It is also possible to write functions that take or return OS configurations. For instance, the virtualized-operating-system function returns a variant of the given OS where the set of file systems and the initrd are changed so that the resulting OS can be used in a lightweight virtual machine environment. Likewise for containerized-operating-system.

Services Beyond Daemons

System services are specified in the services field of operating-system declarations, which is a list of service objects. As a user, we want to be able to ideally add one line specifying the system service we want to add, possibly with several instances of a service, and have GuixSD do the right thing.

Before 0.9.0, GuixSD had a narrow definition of what a “system service” is. Each service in the operating-system configuration had to map to exactly one dmd service—GNU dmd is the init system of GuixSD. This would work well in many cases: an SSH server or a log-in daemon is indeed a service that dmd has to take care of, even a file system mount is an operation that can be usefully inserted into dmd’s service dependency graph.

However, this simple mapping failed to capture more complex service composition patterns. A striking example is “super-daemons”—daemons that can spawn other daemons, such as dbus-daemon or inetd. From the user viewpoint, it does not matter whether a daemon is started by dmd, or by dbus-daemon, or by inetd; this should be transparent. If it’s a D-Bus service, then dbus-daemon’s configuration file should be told about the service; if it’s an inetd service, then inetd.conf should be augmented accordingly; if it’s a dmd service, information on how to start and stop it should go to dmd’s configuration file. Unfortunately, the pre-0.9.0 services could not express such things.

Worse, this approach did not capture the more general pattern of service extension. In the examples above, the super-daemons are effectively extended by other services that rely on them. But there are many cases where services are similarly extended: eudev can be passed new device rules, polkit can be extended with new rules and actions, the Pluggable authentication module system (PAM) can be extended with new services, and so on. At that point it was clear that GuixSD’s naive approach wouldn’t scale.

Composing System Services

The lesson learned from these observations is that system services extend each other in various way. The new service composition framework is built around this model: “system services”, broadly defined, can extend each other, and services and their “extends” relationships form a graph. The root of the graph is the operating system itself.

We can see that this pattern applies to services that are not daemons. PAM is one such example. Accounts are another example: GuixSD provides an “account service” that can be extended with new user accounts or groups; for example, the Network time protocol (NTP) daemon needs to run under the unprivileged “ntp” user, so the NTP service extends the account service with an “ntp” user account. Likewise, the “/etc” service can be extended with new files to be added to /etc; the “setuid” service can be extended with new programs to be made setuid-root. See the manual for more examples.

The nice thing is that composition of services is made explicit: extensions can only happen where explicit extension relationships have been declared. By looking at the extension graph, users can see how services fit together. The guix system extension-graph command, for instance, takes an operating-system declaration and renders the extension graph in the Graphviz format, making it easy to inspect the OS configuration structure.

The API makes it easy to see how services contributed to a specific service’s configuration. For instance, the following expression shows the PAM service as extended by other declared services:

The result is a service object whose value is a list of pam-service objects. Likewise, the following expression returns the /etc service, whose value is a list of entries to be added to /etc:

This contrasts with the approach taken by NixOS, GuixSD’s cousin, and described in this 2010 paper. In NixOS, the whole system configuration is described in an “attribute set”—a list of key/value associations, similar to JavaScript objects or Python dictionaries. Each NixOS service is passed the whole system configuration, allowing it to inspect and change any part of it.

This form of ambient authority gives a lot of flexibility, but it makes it harder to reason about service composition—all a service implementation does is inspect, add, or modify attributes of the global configuration, which may or may not affect other services. The use of a loose key/value dictionary also prevents good error reporting; for instance, a typo in a service name may go undetected. Lastly, NixOS services are enabled by writing service.enable = true stanzas, which leads to complications for services that may have several instances, each with its own configuration.

Wrapping Up

The new service composition framework in GuixSD 0.9.0 addresses shortcomings found in previous versions of GuixSD. It simplifies operating-system declarations for users, and provides a highly extensible framework that clearly exposes the way services are composed.

This new framework has already allowed us to integrate Freedesktop and GNOME services in a convenient way. We hope it will prove fruitful as we address other types of services, such as Web services.

About GNU Guix

GNU Guix is a functional package manager for the GNU system. The Guix System Distribution or GuixSD is an advanced distribution of the GNU system that relies on GNU Guix and respects the user's freedom.

In addition to standard package management features, Guix supports transactional upgrades and roll-backs, unprivileged package management, per-user profiles, and garbage collection. Guix uses low-level mechanisms from the Nix package manager, except that packages are defined as native Guile modules, using extensions to the Scheme language. GuixSD offers a declarative approach to operating system configuration management, and is highly customizable and hackable.

GuixSD can be used on an i686 or x86_64 machine. It is also possible to use Guix on top of an already installed GNU/Linux system, including on mips64el and armv7.

by Ludovic Courtès at November 19, 2015 10:05 AM

November 18, 2015

Ben Simon

Doing Battle With, and Losing To, the Android Camera2 API

I'm back to fiddling with with my Android Bluetooth Thermal Printer. Specifically, I'm building a small camera app that outputs to the printer directly, rather than saving files on your phone.

Looking at the Camera API, I realized that it was now deprecated, so I started building the app using Camera2. I'm not proud to admit it, but Camera2 beat me. I give up. That's one complete and crazy API to use. That's not a bad thing, it's just not ideal for quick little apps.

Oh, and all of this is written in Kawa Scheme. So if you're looking for Scheme Android examples, or how you might invoke the camera from a Scheme app, check out my project's repository, hopefully it'll be of some use.

For now, the app does little more than open up the camera, switch into preview mode and focus when you press the screen. But hopefully with the Camera functionality in place, hooking in my existing Thermal Printer Code shouldn't be too painful.

Two gotchas that I overcame:

1. I was getting the message: A TextureView or a subclass can only be used with hardware acceleration enabled, and of course the camera didn't work. I tried turning on Hardware Acceleration in the AndroidManifest.xml. That made no difference. Ultimately, it was this note that solved the issue:

The default value is "true" if you've set either minSdkVersion or targetSdkVersion to "14" or higher; otherwise, it's "false".

Turns out, I wasn't setting minSdkVersion at all. I did so (setting it to 22), and that fixed the issue.

2. I fired up the camera preview but the screen remained black. There were no messages in the Android monitor so I was baffled. Then it hit me: the camera is lying down on its lens, of course the screen is black. I picked it up, and the camera view showed the correct image. What the heck was I thinking?

Here's the project if you want to follow along: Thermaroid

by Ben Simon ( at November 18, 2015 04:30 PM

November 17, 2015

Programming Praxis

File Reversal

Given a file containing lines of text that fits into memory, it is easy to write the file with lines in reverse order: read the lines of text into memory, then write them in reverse. It is harder to reverse the lines of a file if the file is too big to fit into memory.

Your task is to write a program that reverses the lines of a text file that is too big to fit in memory. When you are finished you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

by programmingpraxis at November 17, 2015 09:00 AM

November 13, 2015

Programming Praxis

Head And Tail

Today’s exercise is a simple file-handling task for beginning programmers: take the name of a text file as input and write as output the first and last lines of the file.

Your task is to write the file-handling program described above. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

by programmingpraxis at November 13, 2015 09:00 AM

November 11, 2015

GNU Guix

Reproducible builds: a means to an end

What we stand for

GNU Guix is committed to improving the freedom and autonomy of computer users. This obviously manifests in the fact that GuixSD is a fully free distro, and this is what GNU stands for. All the packages in Guix are built from source, including things like firmware where there is an unfortunate tendency to use pre-built binaries; that way, users can know what software they run. On the technical side, Guix also tries hard to empower users by making the whole system as hackable as possible, in a uniform way—making Freedom #1 practical, à la Emacs.

Guix provides pre-compiled binaries of software packages as a service to its users—these are substitutes for local builds. This is a convenient way to save time, but it could become a threat to users if they cannot establish that those substitutes are authentic—that their Corresponding Source really is what it claims to be.

Reproducible builds

We view “reproducible builds” as a technical means to an end: that of guaranteeing user autonomy and safety. What matters here is that, if package build processes are reproducible, then users actually have a chance to verify that the substitutes (pre-compiled binaries) they download correspond to the source code that supposedly produced them.

Guix builds packages in a fully isolated environment to maximize reproducibility—a crucial feature inherited from Nix. Thus, by construction, very few variations are possible between separate instances of a build environment; the set of files accessible in the environment, the host name, environment variables, locale, and so on are fully under control and cannot change. This eliminates a whole class of possible discrepancies between independent builds.

The only things that may vary are the kernel, and the hardware. The most prominent example of how ‘hardware’ details can leak into a build process are timestamps: it’s unfortunately quite common for build processes to query the system clock and record it in build outputs. Eelco Dolstra, Andres Löh, and Nicolas Pierron described sources of non-determinism in their 2010 JFP paper about NixOS, along with a study on how this affects packages of the distribution in practice. The Reproducible Debian project has since made a similar evaluation but at a larger scale, and with a larger number of independent builds, thereby providing more insight.

Reproducible Debian has demonstrated one thing: contrary to what one might expect, sources of non-determinism are common in build processes. To eliminate the sources of non-determinism that remain in spite of the isolation techniques used in Nix and Guix, the most viable approach appears to be to fix upstream projects that suffer from these problems—one by one.

The project is a great effort to try and address that collectively. We are glad Guix hackers were invited to participate in the first Reproducible Build Summit organized by the project, which will take place in December.

Going further

How can we take advantage of the fact that builds are reproducible, when they are, to actually empower users? There are several things we can do.

First, users must be able to build software locally in the same way distribution developers would do it. This possibility is an integral part of the transparent source/binary deployment model provided by functional package management; Guix users can use the --no-substitutes command-line option to force a local build.

Second, we must make it easy for users to use the binary provider of their choice, and possibly to use several of them, something that Guix allows.

Third, it must be equally simple for any user to publish their locally-built binaries as a way to improve diversity and avoid any single point of failure or trust. The guix publish command is a simple way to serve signed binaries over HTTP. A fully peer-to-peer approach based on GNUnet was tackled as part of GSoC 2015; the code needs more work before it can be integrated into Guix, but the approach is promising.

Last but not least, users must be able to challenge binary providers by themselves. The ability to verify binaries should not be the privilege of power developers. To address that, the just-released 0.9.0 version of GNU Guix provides a new command called guix challenge. The command allows users to automatically compare the build results of their local builds against those served by one or more binary providers. It allows both to find out about non-reproducible builds—and indeed, has already proved to be fruitful, and possibly to find out about compromised servers.

This and other matters were discussed in a Guix talk earlier this week (slides). We strongly believe in a future where the ability to authenticate distribution-provided binaries will be commonplace. Let’s build it!

About GNU Guix

GNU Guix is a functional package manager for the GNU system. The Guix System Distribution or GuixSD is an advanced distribution of the GNU system that relies on GNU Guix and respects the user's freedom.

In addition to standard package management features, Guix supports transactional upgrades and roll-backs, unprivileged package management, per-user profiles, and garbage collection. Guix uses low-level mechanisms from the Nix package manager, except that packages are defined as native Guile modules, using extensions to the Scheme language. GuixSD offers a declarative approach to operating system configuration management, and is highly customizable and hackable.

GuixSD can be used on an i686 or x86_64 machine. It is also possible to use Guix on top of an already installed GNU/Linux system, including on mips64el and armv7.

by Ludovic Courtès at November 11, 2015 02:44 PM

November 10, 2015

Programming Praxis

Triangle Of The Gods

The nth element of the Smarandache consecutive number sequence (A007908) is the sequence of the numbers from 1 to n concatenated in order:

1, 12, 123, 1234, 12345, 123456, 1234567, 12345678, 123456789, 12345678910, 1234567891011, 123456789101112, …

The sequence is sometimes called the “Triangle of the Gods,” and the story goes that anyone who can specify the smallest prime number in the sequence is admitted to heaven.

Your task is to find the smallest prime number in the sequence. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

by programmingpraxis at November 10, 2015 09:00 AM

November 09, 2015

Andy Wingo

embracing conway's law

Most of you have heard of "Conway's Law", the pithy observation that the structure of things that people build reflects the social structure of the people that build them. The extent to which there is coordination or cohesion in a system as a whole reflects the extent to which there is coordination or cohesion among the people that make the system. Interfaces between components made by different groups of people are the most fragile pieces. This division goes down to the inner life of programs, too; inside it's all just code, but when a program starts to interface with the outside world we start to see contracts, guarantees, types, documentation, fixed programming or binary interfaces, and indeed faults as well: how many bug reports end up in an accusation that team A was not using team B's API properly?

If you haven't heard of Conway's law before, well, welcome to the club. Inneresting, innit? And so thought I until now; a neat observation with explanatory power. But as aspiring engineers we should look at ways of using these laws to build systems that take advantage of their properties.

in praise of bundling

Most software projects depend on other projects. Using Conway's law, we can restate this to say that most people depend on things built by other people. The Chromium project, for example, depends on many different libraries produced by many different groups of people. But instead of requiring the user to install each of these dependencies, or even requiring the developer that works on Chrome to have them available when building Chrome, Chromium goes a step further and just includes its dependencies in its source repository. (The mechanism by which it does this isn't a direct inclusion, but since it specifies the version of all dependencies and hosts all code on Google-controlled servers, it might as well be.)

Downstream packagers like Fedora bemoan bundling, but they ignore the ways in which it can produce better software at lower cost.

One way bundling can improve software quality is by reducing the algorithmic complexity of product configurations, when expressed as a function of its code and of its dependencies. In Chromium, a project that bundles dependencies, the end product is guaranteed to work at all points in the development cycle because its dependency set is developed as a whole and thus uniquely specified. Any change to a dependency can be directly tested against the end product, and reverted if it causes regressions. This is only possible because dependencies have been pulled into the umbrella of "things the Chromium group is responsible for".

Some dependencies are automatically pulled into Chrome from their upstreams, like V8, and some aren't, like zlib. The difference is essentially social, not technical: the same organization controls V8 and Chrome and so can set the appropriate social expectations and even revert changes to upstream V8 as needed. Of course the goal of the project as a whole has technical components and technical considerations, but they can only be acted on to the extent they are socially reified: without a social organization of the zlib developers into the Chromium development team, Chromium has no business automatically importing zlib code, because the zlib developers aren't testing against Chromium when they make a release. Bundling zlib into Chromium lets the Chromium project buffer the technical artifacts of the zlib developers through the Chromium developers, thus transferring responsibility to Chromium developers as well.

Conway's law predicts that the interfaces between projects made by different groups of people are the gnarliest bits, and anyone that has ever had to maintain compatibility with a wide range of versions of upstream software has the scar tissue to prove it. The extent to which this pain is still present in Chromium is the extent to which Chromium, its dependencies, and the people that make them are not bound tightly enough. For example, making a change to V8 which results in a change to Blink unit tests is a three-step dance: first you commit a change to Blink giving Chromium a heads-up about new results being expected for the particular unit tests, then you commit your V8 change, then you commit a change to Blink marking the new test result as being the expected one. This process takes at least an hour of human interaction time, and about 4 hours of wall-clock time. This pain would go away if V8 were bundled directly into Chromium, as you could make the whole change at once.

forking considered fantastic

"Forking" sometimes gets a bad rap. Let's take the Chromium example again. Blink forked from WebKit a couple years ago, and things have been great in both projects since then. Before the split, the worst parts in WebKit were the abstraction layers that allowed Google and Apple to use the dependencies they wanted (V8 vs JSC, different process models models, some other things). These abstraction layers were the reified software artifacts of the social boundaries between Google and Apple engineers. Now that the social division is gone, the gnarly abstractions are gone too. Neither group of people has to consider whether the other will be OK with any particular change. This eliminates a heavy cognitive burden and allows both projects to move faster.

As a pedestrian counter-example, Guile uses the libltdl library to abstract over the dynamic loaders of different operating systems. (Already you are probably detecting the Conway's law keywords: uses, library, abstract, different.) For years this library has done the wrong thing while trying to do the right thing, ignoring .dylib's but loading .so's on Mac (or vice versa, I can't remember), not being able to specify soversions for dependencies, throwing a stat party every time you load a library because it grovels around for completely vestigial .la files, et cetera. We sent some patches some time ago but the upstream project is completely unmaintained; the patches haven't been accepted, users build with whatever they have on their systems, and though we could try to take over upstream it's a huge asynchronous burden for something that should be simple. There is a whole zoo of concepts we don't need here and Guile would have done better to include libltdl into its source tree, or even to have forgone libltdl and just written our own thing.

Though there are costs to maintaining your own copy of what started as someone else's work, people who yammer on against forks usually fail to recognize their benefits. I think they don't realize that for a project to be technically cohesive, it needs to be socially cohesive as well; anything else is magical thinking.

not-invented-here-syndrome considered swell

Likewise there is an undercurrent of smarmy holier-than-thou moralism in some parts of the programming world. These armchair hackers want you to believe that you are a bad person if you write something new instead of building on what has already been written by someone else. This too is magical thinking that comes from believing in the fictional existence of a first-person plural, that there is one "we" of "humanity" that is making linear progress towards the singularity. Garbage. Conway's law tells you that things made by different people will have different paces, goals, constraints, and idiosyncracies, and the impedance mismatch between you and them can be a real cost.

Sometimes these same armchair hackers will shake their heads and say "yeah, project Y had so much hubris and ignorance that they didn't want to bother understanding what X project does, and they went and implemented their own thing and made all their own mistakes." To which I say, so what? First of all, who are you to judge how other people spend their time? You're not in their shoes and it doesn't affect you, at least not in the way it affects them. An armchair hacker rarely understands the nature of value in an organization (commercial or no). People learn more when they write code than when they use it or even when they read it. When your product has a problem, where will you find the ability to fix it? Will you file a helpless bug report or will you be able to fix it directly? Assuming your software dependencies model some part of your domain, are you sure that their models are adequate for your purpose, with the minimum of useless abstraction? If the answer is "well, I'm sure they know what they're doing" then if your organization survives a few years you are certain to run into difficulties here.

One example. Some old-school Mozilla folks still gripe at Google having gone and created an entirely new JavaScript engine, back in 2008. This is incredibly naïve! Google derives immense value from having JS engine expertise in-house and not having to coordinate with anyone else. This control also gives them power to affect the kinds of JavaScript that gets written and what goes into the standard. They would not have this control if they decided to build on SpiderMonkey, and if they had built on SM, they would have forked by now.

As a much more minor, insignificant, first-person example, I am an OK compiler hacker now. I don't consider myself an expert but I do all right. I got here by making a bunch of mistakes in Guile's compiler. Of course it helps if you get up to speed using other projects like V8 or what-not, but building an organization's value via implementation shouldn't be discounted out-of-hand.

Another point is that when you build on someone else's work, especially if you plan on continuing to have a relationship with them, you are agreeing up-front to a communications tax. For programmers this cost is magnified by the degree to which asynchronous communication disrupts flow. This isn't to say that programmers can't or shouldn't communicate, of course, but it's a cost even in the best case, and a cost that can be avoided by building your own.

When you depend on a project made by a distinct group of people, you will also experience churn or lag drag, depending on whether the dependency changes faster or slower than your project. Depending on LLVM, for example, means devoting part of your team's resources to keeping up with the pace of LLVM development. On the other hand, depending on something more slow-moving can make it more difficult to work with upstream to ensure that the dependency actually suits your use case. Again, both of these drag costs are magnified by the asynchrony of communicating with people that probably don't share your goals.

Finally, for projects that aim to ship to end users, depending on people outside your organization exposes you to risk. When a security-sensitive bug is reported on some library that you use deep in your web stack, who is responsible for fixing it? If you are responsible for the security of a user-facing project, there are definite advantages for knowing who is on the hook for fixing your bug, and knowing that their priorities are your priorities. Though many free software people consider security to be an argument against bundling, I think the track record of consumer browsers like Chrome and Firefox is an argument in favor of giving power to the team that ships the product. (Of course browsers are terrifying security-sensitive piles of steaming C++! But that choice was made already. What I assert here is that they do well at getting security fixes out to users in a timely fashion.)

to use a thing, join its people

I'm not arguing that you as a software developer should never use code written by other people. That is silly and I would appreciate if commenters would refrain from this argument :)

Let's say you have looked at the costs and the benefits and you have decided to, say, build a browser on Chromium. Or re-use pieces of Chromium for your own ends. There are real costs to doing this, but those costs depend on your relationship with the people involved. To minimize your costs, you must somehow join the community of people that make your dependency. By joining yourself to the people that make your dependency, Conway's law predicts that the quality of your product as a whole will improve: there will be fewer abstraction layers as your needs are taken into account to a greater degree, your pace will align with the dependency's pace, and colleagues at Google will review for you because you are reviewing for them. In the case of Opera, for example, I know that they are deeply involved in Blink development, contributing significantly to important areas of the browser that are also used by Chromium. We at Igalia do this too; our most successful customers are those who are able to work the most closely with upstream.

On the other hand, if you don't become part of the community of people that makes something you depend on, don't be surprised when things break and you are left holding both pieces. How many times have you heard someone complain the "project A removed an API I was using"? Maybe upstream didn't know you were using it. Maybe they knew about it, but you were not a user group they cared about; to them, you had no skin in the game.

Foundations that govern software projects are an anti-pattern in many ways, but they are sometimes necessary, born from the need for mutually competing organizations to collaborate on a single project. Sometimes the answer for how to be able to depend on technical work from others is to codify your social relationship.

hi haters

One note before opening the comment flood: I know. You can't control everything. You can't be responsible for everything. One way out of the mess is just to give up, cross your fingers, and hope for the best. Sure. Fine. But know that there is no magical first-person-plural; Conway's law will apply to you and the things you build. Know what you're actually getting when you depend on other peoples' work, and know what you are paying for it. One way or another, pay for it you must.

by Andy Wingo at November 09, 2015 01:48 PM

November 06, 2015

Programming Praxis

Self-Organizing Lists

There are many applications where you need to keep a small number of items, a set, and check for membership in the set, and there are many algorithms for doing that: a linked list keeps the items in random order to be retrieved by linear search, an ordered array keeps the items in sequence to be retrieved by binary search, a hash table performs some kind of math on the item, various kinds of trees can store and search the items, and so on.

Today’s exercise looks at a simple algorithm that is appropriate for search in a set that isn’t too big, doesn’t change too often, and isn’t searched too many times, where the definition of “too” depends on the needs of the user. The idea is to store the items of the set in a linked list, search the list on request, and each time an item is found, move it to the front of the list. The hope is that frequently-accessed items will stay near the front of the list, so that instead of an average search that requires inspection of half the list, frequently-accessed items are found much more quickly.

Your task is to write functions that maintain a set as a self-organizing list: you should provide functions to insert and delete items from the set as well as to search within the set. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

by programmingpraxis at November 06, 2015 03:00 AM