We have seen the n−1 primality prover in one previous exercise and the n+1 primality prover in another previous exercise. In today’s exercise we combine the two tests. There are two versions, one using a factoring bound and one without. The basic idea of both provers is to express n − 1 = f1 * r1 and n + 1 = f2 * r2 with r1 and r2 odd, gcd(f1, r1) = gcd(f2, r2) = 1 and the prime factorizations of f1 and f2 known, then perform various tests that prove n prime when the partial factorizations are sufficient. This is often called the BLS primality prover because the 1975 paper by John Brillhart, Derrick Lehmer and John Selfridge proves all the necessary theorems.
For the unbounded version of the test, factor n−1 and n+1 by any convenient method until max(f1*f1*f2/2, f1*f2*f2/2) > n. Then n is prime if all the factors of f1 and f2 are prime and the following conditions hold:
Condition 1: For each prime p dividing f1 there is an integer a such that a^(n−1) = 1 (mod n) and gcd(a^((n−1)/p)−1, n) = 1. A different a may be used for each prime p.
Condition 2: For each prime p dividing f2 there is a Lucas sequence U(P,Q) with discriminant D = P^2 − 4Q ≠ 0 and Jacobi symbol (D/n) = −1 such that U(n+1) = 0 (mod n) and gcd(U((n+1)/p), n) = 1. The same discriminant D must be used for each prime p, but P and Q may vary. Given P and Q, a new P' and Q' with the same D can be calculated as P' = P + 2 and Q' = P + Q + 1.
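As a rough sketch of how Conditions 1 and 2 might look in code, here is a minimal Python illustration; the function names are my own invention, not the paper's, and this is not a complete prover:

```python
# A sketch of the two BLS conditions; names are mine, not from the paper.
from math import gcd

def condition1(n, p, a):
    """Condition 1 for witness a and a prime p dividing f1."""
    return (pow(a, n - 1, n) == 1
            and gcd(pow(a, (n - 1) // p, n) - 1, n) == 1)

def lucas_u(P, Q, k, n):
    """U_k of the Lucas sequence with parameters P, Q, modulo odd n."""
    if k == 0:
        return 0
    D = P * P - 4 * Q
    inv2 = pow(2, -1, n)                 # n is odd, so 2 is invertible
    U, V, qk = 1 % n, P % n, Q % n       # U_1, V_1, Q^1
    for bit in bin(k)[3:]:               # binary method from the top bit
        U, V = (U * V) % n, (V * V - 2 * qk) % n      # index doubling
        qk = (qk * qk) % n
        if bit == '1':                   # index increment by one
            U, V = ((P * U + V) * inv2) % n, ((D * U + P * V) * inv2) % n
            qk = (qk * Q) % n
    return U

def condition2(n, p, P, Q):
    """Condition 2 for parameters P, Q and a prime p dividing f2."""
    return (lucas_u(P, Q, n + 1, n) == 0
            and gcd(lucas_u(P, Q, (n + 1) // p, n), n) == 1)
```

The Lucas values are computed with the standard index-doubling formulas U(2m) = U(m)V(m) and V(2m) = V(m)^2 − 2Q^m, so U(n+1) mod n takes only O(log n) steps. For example, with n = 13 the sequence U(3,1) has discriminant D = 5 and (5/13) = −1; condition2(13, 7, 3, 1) holds, while p = 2 needs a different P and Q with the same discriminant.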
The bounded version is similar. Factor n−1 and n+1 by trial division until max(b*f1+1, b*f2−1) * (b*b*f1*f2/2 + 1) > n, where b is the smallest number that is known to be a factor of neither r1 nor r2; if you’re clever, you can perform the two trial divisions simultaneously, looking for remainders of either 0 or 2 when dividing n+1 by the current prime (a remainder of 0 means the prime divides n+1, a remainder of 2 means it divides n−1). Then n is prime if Conditions 1 and 2 and the following additional conditions hold:
Condition 3: There is an integer a such that a^(n−1) = 1 (mod n) and gcd(a^((n−1)/r1)−1, n) = 1.
Condition 4: There is a Lucas sequence U(P,Q) with the same discriminant D as in Condition 2 above such that U(n+1) = 0 (mod n) and gcd(U((n+1)/r2), n) = 1.
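The simultaneous trial division and the stopping criterion might be sketched like this (a Python illustration under my own naming; it tries every candidate divisor rather than only primes, which is harmless because once a prime's factors are exhausted no multiple of it can divide the residuals):

```python
# A sketch of the bounded version's factoring loop; names are mine.

def bls_trial_division(n, limit):
    """Factor n-1 and n+1 by trial division up to limit.
    Returns (f1, r1, f2, r2) with n-1 = f1*r1 and n+1 = f2*r2."""
    f1, r1 = 1, n - 1
    f2, r2 = 1, n + 1
    for q in range(2, limit + 1):
        # One division of n+1 serves both sides: remainder 0 means
        # q divides n+1, remainder 2 means q divides n-1 (q = 2
        # divides both, since n is odd).
        rem = (n + 1) % q
        if rem == 0:
            while r2 % q == 0:
                f2 *= q
                r2 //= q
        if rem == 2 or (q == 2 and rem == 0):
            while r1 % q == 0:
                f1 *= q
                r1 //= q
    return f1, r1, f2, r2

def enough_factors(n, f1, f2, b):
    """The bounded stopping criterion, with b the smallest number not
    known to divide either r1 or r2."""
    return max(b * f1 + 1, b * f2 - 1) * (b * b * f1 * f2 // 2 + 1) > n
```

For n = 199, trial division to 11 completely factors both neighbors: n−1 = 198 = 2·3^2·11 and n+1 = 200 = 2^3·5^2.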
In practice, it is common to write a single program that includes both versions. First perform trial division up to some convenient limit, using the bounded formula to determine if you have enough factors. If that is unsuccessful, use some other method (Pollard rho or elliptic curves work well, because they are good at finding small factors of large numbers) to find more factors, recursively prove any such factors prime, and continue until you have enough.
Your task is to implement the BLS primality prover. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
In today’s exercise you are challenged to write a program in a language you’ve never used before. We’ve done this before, and it’s fun; the idea is to get you out of your comfort zone, so you’re thinking about programming, not just blindly following habit.
Your task is to write a program in a language you’ve never used before. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
This is a guest post by my colleague Adam Lelkes. The goal of this primer is to introduce an important and beautiful tool from probability theory, a model of fair betting games called martingales. In this post I will assume that the reader is familiar with the basics of probability theory. For those that need […]
I was reading Jon Bentley’s book More Programming Pearls the other day; it’s a superb book, and I re-read it every few years. In the first chapter Bentley uses a profiler to improve a small function to identify prime numbers by trial division; in today’s exercise, we will do the same.
Your task is to use a profiler to improve a program you have written; depending on the facilities provided by your language, you may have to write your own profiler. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
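As an illustration of the sort of starting point you might profile, here is a trial-division prime counter run under Python's built-in cProfile; the code and names are mine, not Bentley's:

```python
# A toy version of Bentley's setup: trial division under a profiler.
# cProfile and pstats are Python's standard profiling modules.
import cProfile
import io
import pstats

def isprime(n):
    """Primality by naive trial division."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def count_primes(limit):
    return sum(isprime(n) for n in range(limit))

pr = cProfile.Profile()
pr.enable()
count_primes(10000)
pr.disable()

report = io.StringIO()
pstats.Stats(pr, stream=report).sort_stats('cumulative').print_stats(10)
# The report shows isprime dominating the run time, which is the
# function a profiler would tell you to improve first.
```

The interesting part is not the counter but the report: the profiler points you at the hot spot before you start optimizing.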
This year again Guix participates in the Google Summer of Code under the umbrella of the GNU Project.
If you are an eligible student, your contributions to GNU's package manager and to the GNU system are welcome!
We have collected project ideas touching a variety of topics. If you are a free software hacker passionate about GNU/Linux packaging, Scheme, functional programming, Emacs, or peer-to-peer networking, check out the proposed projects. The list is far from exhaustive, so feel free to bring your own!
DrRacket includes a new package manager GUI, available via the File|Package Manager ... menu item. The GUI is also available as a stand-alone program via the "gui-pkg-manager" package.
The main Racket distribution has been separated into about 200 packages. The Racket installer combines the core system with bundled versions of these packages.
Alternatively, you may now install a Minimal Racket distribution — which is about 1/10 the size of the main distribution — and add only those packages that you need.
Package installation supports pre-built packages that include compiled byte code and rendered documentation, meaning packages can be installed quickly when built versions are available. All packages in the main distribution are available in pre-built form.
The recent 5.92 and 5.93 releases served as release candidates for 6.0, and 6.0 includes a few additional repairs related to the package system.
Further improvements to the package system are in the works, notably including package documentation on the package-catalog web site.
COMPATIBILITY NOTE: PLaneT, the previous Racket package system, will remain in place for the foreseeable future, but we expect all package work to shift to the new system.
Beyond the package system, this release brings a number of other changes:
Racket's HTML documentation has a new and improved look, thanks to Matthew Butterick.
So far on this blog we’ve given some introductory notes on a few kinds of algebraic structures in mathematics (most notably groups and rings, but also monoids). Fields are the next natural step in the progression. If the reader is comfortable with rings, then a field is extremely simple to describe: they’re just commutative rings […]
The hands of an analog clock occasionally cross as they revolve around the dial.
Your task is to write a program that determines how many times the hands cross in one twelve-hour period, and compute a list of those times. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
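One way to see the answer: the minute hand moves at 360 degrees per hour and the hour hand at 30, so the minute hand laps the hour hand every 12/11 hours, giving 11 crossings per twelve-hour period. A sketch in Python (crossings is my own name):

```python
# The hands coincide every 12/11 hours; exact arithmetic via Fraction.
from fractions import Fraction

def crossings():
    """Times in a twelve-hour period when the hands coincide."""
    times = []
    for k in range(11):               # the hands cross 11 times
        t = Fraction(12, 11) * k      # hours after 12:00
        hours = int(t) % 12
        minutes = float((t - int(t)) * 60)
        times.append((hours, minutes))
    return times
```

The first few crossings are 12:00:00, then about 1:05:27, 2:10:54, and so on.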
Last time we saw a geometric version of the algorithm to add points on elliptic curves. We went quite deep into the formal setting for it (projective space), and we spent a lot of time talking about the right way to define the “zero” object in our elliptic curve so that our issues with […]
I’m pleased to announce that another paper of mine is finished. This one is submitted to ICALP, which is being held in Copenhagen this year (this whole research thing is exciting!). This is joint work with my advisor, Lev Reyzin. As with my first paper, I’d like to explain things here on my blog a […]
Given two words, determine if the first word, or any anagram of it, appears in consecutive characters of the second word. For instance, cat appears as an anagram in the first three letters of actor, but car does not appear as an anagram in actor even though all the letters of car appear in actor.
Your task is to write a function to determine if an anagram is present in a word. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
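A sliding window of letter counts gives a linear-time approach; here is a sketch in Python (my own naming, not the suggested solution):

```python
# Slide a window of len(word) characters across text, keeping letter
# counts; any window whose counts match word's is an anagram of it.
from collections import Counter

def anagram_in(word, text):
    """Does any anagram of word appear as consecutive characters of text?"""
    k = len(word)
    if k == 0 or k > len(text):
        return k == 0
    need = Counter(word)
    window = Counter(text[:k])
    if window == need:
        return True
    for i in range(k, len(text)):
        window[text[i]] += 1          # slide the window right by one
        window[text[i - k]] -= 1
        if window[text[i - k]] == 0:  # drop zero counts so == works
            del window[text[i - k]]
        if window == need:
            return True
    return False
```

Each step updates only two counts, so the whole scan is O(len(text)) dictionary operations.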
Last week was Guile 2.0's third anniversary. Guile 2 was a major milestone for Guile and so, like in previous years, we organized a birthday potluck---a hack fest where Guilers brought their freshly cooked dishes.
The Potluck Dishes
This year again we got a variety of fine dishes. Here's the menu:
Alex Sassmannshausen's dish actually relates to food: food-guile is a program that suggests meals from a collection of recipes, according to various constraints.
Doug Evans came up with a patch to improve SIGINT handling in GDB's Guile support---the ability to extend GDB with Guile code, which Doug committed in GDB a couple of weeks ago.
Another GDB-related dish is Ludovic Courtès' GDB pretty-printer for Guile's SCM values. This GDB extension, written in Guile, allows GDB to display the SCM values manipulated by libguile in a human-friendly way.
This has been another pleasant potluck. Thanks to all the participants, and happy birthday Guile 2!
What's that out by the woodshed? It's a steaming pile -- it's full of bugs -- it's compost, a leaf function compiler for Guile!
Around this time last year, a few of us cooked up some hack-dishes to bring to a potluck for Guile 2.0's release anniversary. Last year, mine was a little OpenGL particle demo.
That demo was neat but it couldn't be as big as I would have liked it to be because it was too slow. So, this year when the potluck season rolled around again I sat down to make a little compiler for the subset of Scheme that you see in inner numeric loops -- bytevector access, arithmetic, and loops.
The result is compost. Compost compiles inner loops into native x86-64 machine code that operates on unboxed values.
As you would imagine, compost-compiled code is a lot faster than code interpreted by Guile's bytecode interpreter. I go from being able to compute and render 5K particles at 60 fps up to 400K particles or so -- an 80-fold improvement. That's swell but it gets sweller. The real advantage is that with fast inner loops, I can solve more complicated problems.
Like this one!
(Videos on the internet are a surprisingly irritating problem. Maybe because it's not a cat? Check wingolog.org/pub/ for various other versions of 1300-bodies-512x416 if that doesn't play for you.)
Last year's demo hard-coded a gravitational attractor at (0, 0, 0). This one has no hard-coded attractor -- instead, each particle attracts each other. This so-called n-body simulation is an n-squared problem, so you need to be really efficient with the primitives to scale up, and even then the limit approaches quickly.
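In miniature, the n-squared inner loop looks something like this (a toy Python model with unit masses and unit gravitational constant, nothing like the demo's actual compost code):

```python
# Every particle attracts every other: one step touches all pairs.
# A small softening term avoids dividing by zero at close encounters.

def accelerations(pos, eps=1e-3):
    """Pairwise gravitational acceleration on each 2-D particle."""
    acc = []
    for i, (xi, yi) in enumerate(pos):
        ax = ay = 0.0
        for j, (xj, yj) in enumerate(pos):
            if i == j:
                continue
            dx, dy = xj - xi, yj - yi
            r2 = dx * dx + dy * dy + eps * eps   # softened distance squared
            inv_r3 = r2 ** -1.5
            ax += dx * inv_r3                    # accumulate force/mass
            ay += dy * inv_r3
        acc.append((ax, ay))
    return acc
```

That is n(n−1) inner evaluations per step, so 1650 particles at 60 frames per second already means on the order of 160 million pairwise force evaluations every second.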
With compost, I can get to about 1650 particles at 60 frames per second, using 700% CPU on this 4-core 2-thread-per-core i7-3770 machine, including display with the free software radeon drivers. Without compost -- that is to say, just with Guile's bytecode virtual machine -- I max out at something more like 120 particles, and only 200% CPU.
The rest of this post describes how compost works. If compilers aren't your thing, replace the rest of the words with cat noises.
meow meow meow meow meow meow meow meow
The interface to compost is of course a macro, define/compost. Here's a little loop to multiply two vectors into a third, element-wise:
(use-modules (compost syntax) (rnrs bytevectors))
(define/compost (multiply-vectors (dst bytevector?)
                                  (a bytevector?)
                                  (b bytevector?)
                                  (start exact-integer?)
                                  (end exact-integer?))
(let lp ((n start))
(define (f32-ref bv n)
(bytevector-ieee-single-native-ref bv (* n 4)))
(define (f32-set! bv n val)
(bytevector-ieee-single-native-set! bv (* n 4) val))
(when (< n end)
(f32-set! dst n (* (f32-ref a n) (f32-ref b n)))
(lp (1+ n)))))
It's low-level but that's how we roll. If you evaluate this form and all goes well, it prints out something like this at run-time:

;;; loading /home/wingo/.cache/guile/compost/rmYZoT-multiply-vectors.so
This indicates that compost compiled your code into a shared object at macro-expansion time, and then printed out that message when it loaded it at runtime. If composting succeeds, compost writes out the compiled code into a loadable shared object (.so file). It then residualizes a call to dlopen to load that file at run-time, followed by code to look up the multiply-vectors symbol and create a foreign function. If composting fails, it prints out a warning and falls back to normal Scheme (by residualizing a plain lambda).
In the beginning of the article, I called compost a "leaf function compiler". Composted functions should be "leaf functions" -- they shouldn't call other functions. This restriction applies only to the low-level code, however. The first step in composting is to run the function through Guile's normal source-to-source optimizers, resulting in a CPS term. The upshot is that you can use many kinds of abstraction inside the function, like the little f32-ref/f32-set! helpers above, but in the end Guile should have inlined or contified them all away. It's a restriction, but hey, this is just a little hack.
Let's look at some assembly. We could get disassembly just by calling objdump -d /home/wingo/.cache/guile/compost/rmYZoT-multiply-vectors.so, but let's do it a different way. Let's put that code into a file, say "/tmp/qux.scm", add a call to multiply-vectors at the end, and run it under GDB:
$ gdb --args guile /tmp/qux.scm
(gdb) b 'multiply-vectors'
Function "multiply-vectors" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 ('multiply-vectors') pending.
Starting program: /opt/guile/bin/guile /tmp/qux.scm
[New Thread 0x7ffff604b700 (LWP 13729)]
;;; loading /home/wingo/.cache/guile/compost/Kl0Xpc-multiply-vectors.so
Breakpoint 1, 0x00007ffff5322000 in multiply-vectors () from /home/wingo/.cache/guile/compost/Kl0Xpc-multiply-vectors.so
multiply-vectors () at /tmp/qux.scm:12
12 (when (< n end)
Word! GDB knows about the symbol, multiply-vectors. That's top. We are able to step into it, and it prints Scheme code!
Both of these swell things are because compost residualizes its compiled code as ELF files, and actually re-uses Guile's linker. The ELF that we generate can be loaded by dlopen, and its symbol tables and DWARF debugging information are known to GDB.
(In my last article on ELF I mentioned that Guile had no plans to use the system dynamic library loader (dlopen). That's still true; Guile has its own loader. I used the system loader in this place, though, just because I thought it was a neat hack.)
We can tell GDB to disassemble the next line:
(gdb) set disassemble-next-line on
9 (bytevector-ieee-single-native-ref bv (* n 4)))
=> 0x00007ffff532201d <multiply-vectors+29>: 4c 0f af c9 imul %rcx,%r9
13 (f32-set! dst n (* (f32-ref a n) (f32-ref b n)))
=> 0x00007ffff532203b <multiply-vectors+59>: f2 0f 59 c1 mulsd %xmm1,%xmm0
0x00007ffff532203f <multiply-vectors+63>: 49 b9 04 00 00 00 00 00 00 00 movabs $0x4,%r9
GDB does OK with these things, but it doesn't have special support for Scheme, and really you would want column pointers, not just lines. That data is in the DWARF but it's not processed by GDB. Anyway, back to the disassembly.
Now if you know assembly, this is pretty lame stuff -- it saves registers it doesn't use, it multiplies instead of adds to get the bytevector indexes, it loads constants many times, etc. It's a proof of concept. Sure beats heap-allocated floating-point numbers, though.
safety and semantics
Compost's goal is to match Guile's semantics, while processing values with native machine operations. This means that it needs to assign concrete types and representations to all values in the function. To do this, it uses the preconditions, return types from primitive operations, and types from constant literals to infer other types in the function. If it succeeds, it then chooses representations (like "double-precision floating point") and assigns values to registers. If the types don't check out, or something is unsupported, compilation bails and runtime will fall back on Guile's normal execution engine.
There are a couple of caveats, however.
One is that compost assumes that small integers do not promote to bignums. We could remove this assumption with better range analysis. Compost does do some other analysis, like sign analysis to prove that the result of sqrt is real. Complex numbers will cause compost to bail.
Compost also doesn't check bytevector bounds at run-time. This is terrible. I didn't do it though because to do this nicely you need to separate the bytevector object into two variables: the pointer to the contents and the length. Both should be register-allocated separately, and range analysis would be nice too. Oh well!
Finally, compost is really limited in terms of the operations it supports. In fact, if register allocation would spill on the function, it bails entirely :)
If it's your thing, have fun over on yon gitorious. Compost needs Guile from git, and the demos need Figl, the GL library. For me this project is an ephemeral thing; a trial for future work, with some utility now, but not something I really want to support. Still, if it's useful to you, have at it.
I woke up this morning at 5 thinking about the universe, and how friggin big it is. I couldn't go back to sleep. All these particles swirling and interacting in parallel, billions upon zillions, careening around, somehow knowing what forces are on them, each somehow a part of the other. I studied physics and I never really boggled at the universe in the way I did this morning thinking about this n-body simulation. Strange stuff.
I remember back in college when I was losing my catholicism and I would be at a concert or a dance show or something and I would think "what is this, this is just particles vibrating, bodies moving in the nothing; there is no meaning here." It was a thought of despair and unmooring. I don't suffer any more over it, but mostly because I don't think about it any more. I still don't believe in some omniscient skydude or skylady, but if I did, I know he or she has got a matrix somewhere with every particle's position and velocity.
Swirl on, friends, and until next time, happy hacking.
It’s been a while since we had any interview questions. We have two today, since they are so simple:
1) Given a number n, find the smallest 3-digit number such that the product of its digits is equal to n. For example, given n = 100, the solution is 455.
2) Given two arrays, one with n items and one with n+2 items, where the larger array contains all the items of the smaller array plus two additional items, find the two additional items. Assume none of the items are duplicates.
Your task is to write solutions to the two interview questions given above. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
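Hedged sketches of both answers in Python (my own naming; the second exploits the no-duplicates assumption with a simple set difference):

```python
# Sketches of both interview answers; names are mine.

def smallest_3digit_product(n):
    """Smallest three-digit number whose digits multiply to n, or None."""
    for m in range(100, 1000):       # ascending, so first hit is smallest
        p = 1
        for d in str(m):
            p *= int(d)
        if p == n:
            return m
    return None

def two_extras(small, big):
    """The two items of big missing from small; since the items are
    guaranteed distinct, a set difference suffices."""
    return sorted(set(big) - set(small))
```

If the items were numbers and you wanted constant extra space, comparing sums and sums of squares of the two arrays would also recover the missing pair.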
Last time we looked at the elementary formulation of an elliptic curve as the solutions to the equation y^2 = x^3 + ax + b, where a, b are such that the discriminant 4a^3 + 27b^2 is nonzero. We have yet to explain why we want our equation in this form, and we will get to that, but first we want to take our idea of intersecting […]