wingolog -- a mostly dorky weblog by Andy Wingo

an in-depth look at the performance of guile's web server (2012-03-08)

What ho, ladies! And what ho, gentlemen! The hack is on and apace. Today, the topic is performance: of Guile and of its web server, in microseconds and kiloinstructions. Brew up a pot of tea; this is a long article.

the problem

I have been poking at Guile's web server recently. To recap, Guile is an implementation of Scheme. It is byte-compiled, and has a set of runtime libraries written in C and, increasingly, in Scheme itself.

Guile includes some modules for dealing with the web: HTTP, clients, servers, and such. It's all written in Scheme. It runs this blog you're reading right now.

To be precise, this web log is served by tekuti, an application written on top of Guile's web server; and actually, in this case there is Apache in front of everything right now, so you're not talking directly to Guile. Perhaps that will change at some point.

But anyway, a few months ago, this blog was really slow to access. That turned out to be mostly due to a bug in tekuti, the blog application, in which generating the "related links" for a post would always end up invoking git. (The database for the blog is implemented in git.) Spawning another process was slow. Fellow hacker and tekuti user Aleix Conchillo Flaqué fixed the problem a year ago, but it took a while for me to finally review and merge the fix.

So then, at that point, things were tolerable, but I had already contracted the performance bug, so I went on to spend a couple months drilling down, optimizing Guile and its web server -- the layers below tekuti.

10K requests per second: achievement unlocked!

After a lot of tweaking, to the compiler, runtime, HTTP libraries, and to the VM, Guile can now serve over 10000 requests per second on a simple "Hello world" benchmark. This is out of the box, so to speak, if the master branch in git were a box. You just check out Guile, build it using the normal autoreconf -vif && ./configure && make dance, and then run the example uninstalled:

$ meta/guile examples/web/hello.scm

In another terminal, you can connect directly to the port to see what it does. Paste the first paragraph of the transcript below and press return, and you should see the second part come back:

$ nc localhost 8080
GET / HTTP/1.0
Host: localhost:8080
User-Agent: ApacheBench/2.3
Accept: */*

HTTP/1.0 200 OK
Content-Length: 13
Content-Type: text/plain;charset=utf-8

Hello, World!

I should say a little more about what the server does, and what the test application is. It reads the request, and parses all the headers to native Scheme data types. This is strictly more work than is needed for a simple "pong" benchmark, but it's really useful for applications to have all of the headers parsed for you already. It also filters out bad requests.
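To see what that parsing gives you, here is a little REPL sketch, feeding a canned request to read-request, the same procedure the server uses; the printed forms of the parsed values here are from memory and may differ in detail:

(use-modules (web request) (web uri)
             (ice-9 binary-ports) (rnrs bytevectors))

;; parse a raw request off a port, as the server does
(define req
  (read-request
   (open-bytevector-input-port
    (string->utf8 "GET /foo HTTP/1.1\r\nHost: localhost:8080\r\n\r\n"))))

(request-method req)                    ; => GET, a symbol
(uri-path (request-uri req))            ; => "/foo", from a parsed URI record
(assq-ref (request-headers req) 'host)  ; the Host header, parsed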

Guile then passes the request and request body to a handler, which returns the response and body. This example's handler is very simple:

(define (handler request body)
  (values '((content-type . (text/plain)))
          "Hello, World!"))

It uses a shorthand: instead of building a proper response object, it just returns an association list of headers along with the body as a string, and relies on the web server's sanitize-response to produce a response object and encode the body as a bytevector. Again, this is more work for the server, but it's a nice convenience.
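For comparison, here is a rough sketch of the same handler without the shorthand, building the response object itself and encoding the body by hand; the header value is written in the parsed representation that the (web http) machinery uses:

(use-modules (web response)
             (rnrs bytevectors))

;; no shorthand: build the response object ourselves, and encode
;; the body to a bytevector
(define (handler request body)
  (values (build-response
           #:code 200
           #:headers '((content-type . (text/plain (charset . "utf-8")))))
          (string->utf8 "Hello, World!")))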

Finally, the server writes the response and body to the client, and either closes the port, as in this case for HTTP/1.0 with no keep-alive, or moves the client back to the poll set if we have a persistent connection. The reads and writes are synchronous (blocking), and the web server runs in one thread. I'll discuss this a bit more later.

You can then use ApacheBench to test it out:

$ ab -n 100000 -c100 http://localhost:8080/
Server Software:        
Server Hostname:        localhost
Server Port:            8080

Document Path:          /
Document Length:        13 bytes

Concurrency Level:      100
Time taken for tests:   9.631 seconds
Complete requests:      100000
Failed requests:        0
Write errors:           0
Total transferred:      9200000 bytes
HTML transferred:       1300000 bytes
Requests per second:    10383.03 [#/sec] (mean)
Time per request:       9.631 [ms] (mean)
Time per request:       0.096 [ms] (mean, across all concurrent requests)
Transfer rate:          932.85 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       3
Processing:     1   10   1.7      9      20
Waiting:        1   10   1.7      9      20
Total:          3   10   1.7      9      20

Percentage of the requests served within a certain time (ms)
  50%      9
  66%      9
  75%     12
  80%     12
  90%     12
  95%     12
  98%     13
  99%     13
 100%     20 (longest request)

I give you the whole printout because I find it interesting. There's no huge GC pause anywhere. This laptop has an Intel i7-2620M 2.70GHz CPU. It was bought last year, so it's recent but not fancy. There are two cores; part of one is used by ApacheBench, and the whole of the other by Guile.

Of course, this is a flawed benchmark. If you really care about this sort of thing, Mark Nottingham wrote a nice article on HTTP benchmarking last year that shows all the ways in which this test is wrong.

However, flawed though it is, this test serves my purposes: to understand the overhead that Guile and its web server impose on a web application.

So in that light, I'd like to take apart this test and try to understand its performance. I'll look at it from three directions: bottom-up (using low-level profiling), top-down (using a Scheme profiler), and transverse (profiling the garbage collector).

bottom-up: clock cycles, instructions retired

The best way to measure the performance of an application is with a statistical profiler. Statistical profiling samples what's really happening, without perturbing the performance characteristics of your application.

On GNU/Linux systems, we have a few options. None are really easy to use, however. Perf comes the closest. You just run perf record -g guile examples/web/hello.scm, and it records its information. Then you run perf report and dive into the details through the mostly pleasant curses application.

However, for some reason perf could not capture the call graphs in my tests. My machine is x86-64, whose calling convention does not include frame pointers in call frames, so perhaps perf's stack-walking algorithm is naive; the associated DWARF information does include the necessary stack-walking data.

Anyway, at least you can get a good handle on what individual functions are hot, and indeed what source lines and instructions take the most time. These lowest-level statistical profilers typically sample based on number of clock cycles, so they correspond to real time very well.

So much for measurement. For actually understanding and improving performance, I find valgrind's callgrind tool much more useful. You run it like this:

valgrind --tool=callgrind --num-callers=20 /path/to/guile examples/web/hello.scm

Valgrind can record the execution of your program as it runs, tracking every call and every instruction that is executed. You can then explore the resulting log with the kcachegrind graphical tool. It's the best thing there is for exploring low-level execution of your program.

Now, valgrind is not a statistical profiler. It slows down and distorts the execution of your program, as you would imagine. Beyond those caveats, though, you have to keep two things in mind: firstly, that a count of instructions executed does not correspond directly to clock cycles spent. There are memory latencies, cache misses, branch mispredictions, and a whole host of randomnesses that can affect your runtime. Secondly, valgrind only records the user-space behavior of your program. I'll have more to say on that later.

If I run the "hello.scm" benchmark under valgrind, and hit it with a hundred thousand connections, I can get a pretty good idea what the aggregate behavior of my program is. But I can do better than big-picture, for this kind of test. Given that this test does the same thing a hundred thousand times, I can use valgrind's accurate call and instruction counts to give me precise information on how much a single request takes.

So, looking at the total instruction and call counts, and dividing by the number of requests (100K), I see that when Guile handles one request, the algorithmic breakdown is as follows:

instructions per request   percentage of total   procedure                                      calls per request
250K                       100                   (total)                                        -
134K                       52.5                  bytecode interpretation                        -
47K                        18.5                  port buffer allocation                         1
18K                        7.2                   display                                        1
7.6K                       3                     allocation within VM (closures, pairs, etc)   83
7.2K                       2.84                  read-line                                      5
3.6K                       1.40                  substring-downcase                             3
3.2K                       1.24                  string->symbol                                 4
3.1K                       1.23                  accept                                         1
2.7K                       1.06                  substring                                      11
2.6K                       1.03                  string-index                                   8
2.5K                       1.00                  hashq-ref                                      14
23K                        9                     other primitives < 1% each                     -

Guile's ports can be buffered, like C's stdio FILE* streams, and the web server does turn on buffering. The 18.5% of the time spent allocating port buffers seems like a ripe place for optimization, but digging into it, almost all of the time within those routines is spent in the garbage collector, and most of that marking the heap. Switching to a generational collector could help here, but I'm not sure how much, given that port buffers are probably 4 KB each for input and output, and thus might not fit into a young generation. Marking from more threads at once could help -- more on that in some future essay.

There are some primitives that can be optimized as well, but with the VM taking up 52% of the runtime, and 23% going to allocation and the garbage collector, Amdahl's law is against us: making the primitives twice as fast would result in only a 15% improvement in throughput.

Turning Amdahl's argument around, we can predict the effect of native compilation on throughput. If Guile 2.2 comes out with native compilation, as it might, and that makes Scheme code run 5 times faster (say), then the 50% of the instructions that are currently in the VM might drop to 10% -- leading to an expected throughput improvement of 67%.
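Spelled out at the REPL, with the 5x figure being, of course, just a guess:

;; the non-VM half of the work stays put; the VM half shrinks to a fifth
(define new-cost (+ 0.5 (/ 0.5 5)))   ; 0.6 of the original work per request
(/ 1 new-cost)                        ; => about 1.67, i.e. 67% more throughput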

top-down: scheme-level profiling with statprof

But what is going on in the VM? For that, I need a Scheme profiler. Fortunately, Guile comes with one, accessible at the REPL:

$ guile --no-debug
> ,profile (load "examples/web/hello.scm")
%     cumulative   self             
time   seconds     seconds      name
 15.42      1.55      1.55  close-port
 11.61      1.17      1.17  %after-gc-thunk
  6.20      1.65      0.62  setvbuf
  4.29      0.43      0.43  display
  3.34      0.35      0.34  accept
  2.86      0.35      0.29  call-with-error-handling
  2.38      0.24      0.24  hashq-ref
  2.23      9.16      0.22  with-default-trap-handler
  2.07      0.67      0.21  build-response
  1.75      1.17      0.18  sanitize-response
...
Sample count: 629
Total time: 10.055736686 seconds (1.123272351 seconds in GC)

(I use the --no-debug argument to avoid some per-VM-instruction overhead imposed by running Guile interactively; see VM Hooks for more.)

Here I get a really strange result. How is close-port taking all the time? It's implemented as a primitive, not in Scheme, and valgrind thought it took only 0.40% of the instructions. How can that be?

To answer this, we need to remember a couple things. First of all, we recall that Guile's statistical profiler uses setitimer to get signals delivered periodically, after some amount of time spent on the program's behalf, including time spent in the kernel. Valgrind doesn't account for system time. So here we are seeing that close-port is indeed taking time, specifically to flush out the buffered writes. The time is really being spent in the write system call.

So this is good! We know now that we should perhaps look at tuning the kernel to buffer our writes better.

We can also use the profiling data counts to break down the time spent in serving one request from a high level. For example, http-read handles traversing the poll set and accepting connections, and tail-calls read-request to actually read the request. Looking at the cumulative times in the chart tells me that out of each request, the time spent breaks down like this:

microseconds per request   procedure
100                        (total)
29                         poll and accept
16                         reading request
0                          request handler
12                         sanitize-response
8                          writing response headers
0                          writing response body
15                         close-port
20                         other

Of course, since this is statistical, there is some uncertainty about the whole thing. Still, it seems sensible enough.

transverse: who is doing all the allocating?

Often when you go to optimize a Scheme program, you find that it's spending a fair amount of time in garbage collection. How do you optimize that? The rookie answer to this is to try to patch the collector to allocate faster, or less frequently, or something. Veterans know that the solution to GC woes is usually just to allocate less. But how do you know what is allocating? GC is a fairly transverse cost, in that it can charge the good parts of your program for the expenses of the bad.

Guile uses the Boehm-Demers-Weiser conservative collector. For what it is, it's pretty good. However its stock configuration does not provide very much insight into the allocation patterns of your program. One approximation that can be made, though, is that the parts of the program that cause garbage collection to run are the parts that are allocating the most.

Based on this insight, Guile includes a statistical profiler that samples when the garbage collector runs. One thing to consider is that GC probably doesn't run very often, so for gcprof tests, one might need to run the test for longer. In this example, I increased the load to a million requests.

$ guile --no-debug
> (use-modules (statprof))
> (gcprof (lambda () (load "examples/web/hello.scm")))
%     cumulative   self             
time   seconds     seconds      name
 86.55     82.75     82.75  setvbuf
  9.37      8.96      8.96  accept
  1.70      1.63      1.63  call-with-error-handling
  1.31      1.25      1.25  %read-line
  0.29      0.28      0.28  substring
  0.24      0.23      0.23  string-downcase
  0.10     91.80      0.09  http-read
  0.10      0.09      0.09  parse-param-list
  0.10      0.09      0.09  write-response
---
Sample count: 2059
Total time: 95.611532937 seconds (9.371730448 seconds in GC)

Here we confirm the result that we saw in the low-level profile: that the setvbuf Scheme procedure, which can cause Guile to allocate buffers for ports, is the primary allocator in this test.

Another interesting question to answer is, "how much are we allocating, anyway?" Using the statistics REPL command, I can see that the 100K requests entailed a total allocation of about 1.35 GB, which divides out to 13.5 KB per request. That sounds reasonable: about 4 KB each for the read and write buffers, and some 4 KB of various other allocation: strings, pairs, the final bytevector for output, etc.

The test incurred 220 stop-the-world garbage collections. So, about 1 out of 450 (0.2%) of requests trigger a GC. The average GC time seems to be about 5 ms (1100 ms / 220 times). That squares fairly well with our last-percentile ApacheBench results.

The total heap size is modest: 14 megabytes. It does not leak memory, thankfully. If I run mem_usage.py on it, I get:

Mapped memory:
               Shared            Private
           Clean    Dirty    Clean    Dirty
    r-xp    1236        0     1028        0  -- Code
    rw-p      24        0       28      160  -- Data
    r--p      60        0     1580      112  -- Read-only data
    ---p       0        0        0        0
    r--s      24        0        0        0
   total    1344        0     2636      272
Anonymous memory:
               Shared            Private
           Clean    Dirty    Clean    Dirty
    r-xp       4        0        0        0
    rw-p       0        0        0    11652  -- Data (malloc, mmap)
    ---p       0        0        0        0
   total       4        0        0    11652
   ----------------------------------------
   total    1348        0     2636    11924

Native ahead-of-time compilation would allow for more shareable, read-only memory. Still, as it is, this seems acceptable.

a quick look at more dynamic tests

That's all well and good, you say, but it's a fairly static test, right?

For that I have a couple of data points. One is the SXML debugging test in examples/test/debug-sxml.scm, which simply spits back the headers that it receives, wrapped in an HTML table. The values are printed as the Scheme objects that they parse to. Currently, I have a version of this script running on my site. (You can see the headers that Apache adds on, there.)

Testing it as before tells me that Guile can serve 6000 of these pages per second, on my laptop. That's pretty respectable for an entirely dynamic page that hasn't been optimized at all. You can search around the net for comparable tests in other languages; I think you'll find Guile's performance to be very good.

The other point to mention is Tekuti itself. Tekuti includes a caching layer in the application, so after the first request, it's not really a dynamic test. It does check to make sure its caches are fresh on every round, though, by checking the value of the refs/heads/master ref. But still, it is a test of pushing a lot of data; for my test, the page is 50 KB, and Guile still reaches 5700 requests per second on this one core, serving 280 megabytes per second of... well, of my blather, really. But it's a powerful blather-pipe!

related & future work

Here's a similarly flawed test from a year ago of static servers, serving small static files over localhost. Flawed, but it's a lot like this "hello world" test in semantics. We see that nginx gets up to about 20K requests per second, per core. It also does so with a flat memory profile, which is nice.

Guile's bundled web server is currently single-threaded and blocking, which does not make it a good frontend server. There's room for a project to build a proper web server on top of Guile, I think, but I probably won't do it myself. In the meantime though, I do want to offer the possibility for the built-in web server to be multithreaded, with some number of I/O threads and some more limited number of compute threads. I've been testing out some code in that direction -- in fact, this server is running that code -- but as yet Guile's synchronization primitives have too much overhead for it to be a real win. There's more runtime and compiler work to do here.

As far as web servers written in safe languages go, it would be remiss not to mention Warp, a Haskell web server. Again, their tests effectively utilize multiple cores, but it seems that 20K per core is the standard.

Unlike Haskell, Guile lacks a proper event manager. I'm not sure whether to work on such a thing; Havoc seems to think it's necessary, and who am I to argue.

Finally, I mention a benchmark of python WSGI servers from a couple years ago. (Is March the month of benchmarks?) The python performance is notably worse; hopefully PyPy has improved things in the meantime. On the other hand, GEvent's use of greenlets is really nice, and makes me envious.

conclusions

Hey, you read the thing! Congratulations to you! Good thing you didn't just skip down to the end :)

If I have a message to send, it's this: you should consider Guile perfectly acceptable for implementing dynamic web applications with high performance requirements.

It's a modest point, I know. There are all kinds of trade-offs here, but hey: Guile is plucky, and still a little bit shy, but it would love it if you would ask it to the hack.

If it works for you, boast about it to your colleagues! And if it doesn't, let us know, over at the usual places (guile-user@gnu.org, and #guile on freenode). Happy hacking!

fscons 2011: free software, free society (2011-11-28)

Good morning, hackersphere! Time and space are moving, in the egocentric coordinate system at least, but before their trace is gone, I would like to say: FSCONS 2011 was fantastic!

FSCONS is a conference unlike any other I know. I mean, where else can you go from a talk about feminism in free software, to a talk about the state of the OpenRISC chip design project, passing through a hallway-track conversation on the impact of cryptocurrency on the welfare state, approached from an anarchist perspective?

Like many of you, I make software because I like to hack. But I make Free Software in particular because I value all kinds of freedom, as part of the "more beautiful world our hearts know is possible". We make the material conditions of tomorrow's social relations, and I want a world of sharing and mutual aid.

But when we reflect on what our hands are making, we tend to do so in a context of how, not why. That's why I enjoyed FSCONS so much: it created a space for joining the means of production to their ends -- a cons of Free Software, Free Society.

As a GNU hacker, I'm especially honored by the appreciation that FSCONS participants have for GNU. FSCONS has a tithe, in which a portion of the entry fees is donated to some project, and this year GNU was chosen as the recipient. It's especially humbling, given the other excellent projects that were nominated for the tithe.

So thank you very much, FSCONS organizers and participants. I had a great time!

are you bitter about it?

I gave a talk at FSCONS, GNU Guile: Free Software Means of Production (slides, notes).

Unlike many of my other talks, this one was aimed at folks that didn't necessarily know very much about Guile. It was also different from other talks in that it emphasized Guile as a general programming environment, not as an extension language. Guile is both things, and as the general-purpose side gets a lot less publicity, I wanted to emphasize it in this talk. Hopefully the videos will be up soon.

In the last 20 minutes or so, we did a live-hack. Inspired by a tweet by mattmight, we built Bitter, a one-bit Twitter. I tried to convey what it's like to hack in Guile, with some success I think. Source code for the live-hack, such as it is, is linked to at the end of the page.

For a slightly more extended example of a web application, check out Peeple, originally presented in a talk at FOSDEM, back in February. Peeple has the advantage of being presented as a development of separate git commits. Slides of that talk, Dynamic Hacking with Guile, are also available, though they are not as developed as the ones from FSCONS.

Finally, for the real documentation, see the Guile manual.

Happy hacking, and hopefully see you at FSCONS next year!

types and the web (2011-01-11)

An essay expanding on the theme of types and the web. I wrote this for Guile's manual, but it applies generally, I think. The point about SXML can apply fruitfully to python as well. -- Andy

It is a truth universally acknowledged, that a program with good use of data types, will be free from many common bugs. Unfortunately, the common practice in web programming seems to ignore this maxim. This subsection makes the case for expressive data types in web programming.

By "expressive data types", I mean that the data types say something about how a program solves a problem. For example, if we choose to represent dates using SRFI 19 date records (see SRFI-19), this indicates that there is a part of the program that will always have valid dates. Error handling for a number of basic cases, like invalid dates, occurs on the boundary in which we produce a SRFI 19 date record from other types, like strings.

With regards to the web, data types help in the two broad phases of HTTP messages: parsing and generation.

Consider a server, which has to parse a request, and produce a response. Guile will parse the request into an HTTP request object (see Requests), with each header parsed into an appropriate Scheme data type. This transition from an incoming stream of characters to typed data is a state change in a program---the strings might parse, or they might not, and something has to happen if they do not. (Guile throws an error in this case.) But after you have the parsed request, "client" code (code built on top of the Guile HTTP stack) will not have to check for syntactic validity. The types already make this information manifest.

This state change on the parsing boundary makes programs more robust, as they themselves are freed from the need to do a number of common error checks, and they can use normal Scheme procedures to handle a request instead of ad-hoc string parsers.

The need for types on the response generation side (in a server) is more subtle, though not less important. Consider the example of a POST handler, which prints out the text that a user submits from a form. Such a handler might include a procedure like this:

;; First, a helper procedure
(define (para . contents)
  (string-append "<p>" (string-concatenate contents) "</p>"))

;; Now the meat of our simple web application
(define (you-said text)
  (para "You said: " text))

(display (you-said "Hi!"))
-| <p>You said: Hi!</p>


This is a perfectly valid implementation, provided that the incoming text does not contain the special HTML characters <, >, or &. But this provision is not reflected anywhere in the program itself: we must assume that the programmer understands this, and performs the check elsewhere.

Unfortunately, the short history of the practice of programming does not bear out this assumption. A cross-site scripting (XSS) vulnerability is just such a common error in which unfiltered user input is allowed into the output. A user could submit a crafted comment to your web site which results in visitors running malicious Javascript, within the security context of your domain:

(display (you-said "<script src=\"http://bad.com/nasty.js\" />"))
-| <p>You said: <script src="http://bad.com/nasty.js" /></p>


The fundamental problem here is that both user data and the program template are represented using strings. This identity means that types can't help the programmer to make a distinction between these two, so they get confused.

There are a number of possible solutions, but perhaps the best (in the Guile context) is to treat HTML not as strings, but as native s-expressions: as SXML. The basic idea is that HTML is either text, represented by a string, or an element, represented as a tagged list. So foo becomes "foo", and <b>foo</b> becomes (b "foo"). Attributes, if present, go in a tagged list headed by @, like (img (@ (src "http://example.com/foo.png"))). See the SXML Wikipedia page for more info.

The good thing about SXML is that HTML elements cannot be confused with text. Let's make a new definition of para:

(define (para . contents)
  `(p ,@contents))

(use-modules (sxml simple))
(sxml->xml (you-said "Hi!"))
-| <p>You said: Hi!</p>

(sxml->xml (you-said "<i>Rats, foiled again!</i>"))
-| <p>You said: &lt;i&gt;Rats, foiled again!&lt;/i&gt;</p>


So we see in the second example that HTML elements cannot be unwittingly introduced into the output. However it is now perfectly acceptable to pass SXML to you-said; in fact, that is the big advantage of SXML over everything-as-a-string.

(sxml->xml (you-said (you-said "<Hi!>")))
-| <p>You said: <p>You said: &lt;Hi!&gt;</p></p>

The SXML types allow procedures to compose. The types make manifest which parts are HTML elements, and which are text. (Using types to disallow nested paragraphs is an exercise for the reader.)

So you needn't worry about escaping user input; the type transition back to a string handles that for you. XSS vulnerabilities are a thing of the past.

doing it wrong (2010-12-23)

Intrepid hacker Aleix Conchillo Flaqué writes in to say that, against all odds, he actually managed to install the Tekuti blog software on his server. Rock on!

Of course, it was not without a couple of problems, and indeed the important one points to something more fundamentally wrong with existing internet technologies.

in which mod_rewrite fails the author

Whoa, Mr. Wingo, calm down there! Blaming the internet for a bug in your software! Hubris is a virtue and all that, but perhaps this is taking it too far? I'm sure my readers will let me know in the comments, but first, some background.

The symptom of the problem is like this. To refer to a post in a URL, tekuti has an identifier, the post key. For example, the path to edit a post is /admin/posts/key.

Tekuti doesn't have post numbers though, so the easiest way to generate the post key is to serialize its location in the data store, like 2010/12/22/doing-it-wrong. As you can see, the post key can include any character, and indeed typically does include slashes, so the actual text representation of the URL to edit a post has to be percent-encoded, like /admin/posts/2010%2f12%2f22%2fdoing-it-wrong.
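As an aside, the (web uri) module can show the round trip at the REPL; uri-encode escapes reserved characters, slashes included:

(use-modules (web uri))

(display (uri-decode "2010%2f12%2f22%2fdoing-it-wrong"))
-| 2010/12/22/doing-it-wrong

(uri-encode "2010/12/22/doing-it-wrong")  ; the slashes come back percent-encoded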

This scheme works fine, and indeed when accessing the Tekuti web server directly, everything works. But if you put it behind Apache using a mod_rewrite proxying rule, as Aleix did:

RewriteRule ^/blog(/?.*)$ http://localhost:8080/$1 [P]

Then trying to edit posts doesn't work! What's the deal?

parsing it wrong

The deal is, mod_rewrite does a textual match on decoded paths. So the string that mod_rewrite sees isn't the string given to it -- in this case, it sees /admin/posts/2010/12/22/doing-it-wrong. Then it does some arbitrary transformation on that decoded string, and somehow constructs the result.

But you cannot produce the desired result from the intermediate, decoded string!

There are various options to re-encode parts of the output string, or to not encode parts of it -- but you can't determine which slash in the intermediate string is to be re-encoded.

The problem is that mod_rewrite treats slashes specially, not re-encoding them. This problem is fundamental to any technology that processes URL paths using textual comparisons.

To parse a URL path properly, you must first split it according to the delimiters you are interested in (/, in this case). Then you percent-decode the path components into a list, and match on that list.

Query parameters need to be parsed similarly -- first you split on &, then split on the first = in each component, then percent-decode the resulting keys and values into an ordered list of key-value pairs.
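In Guile, split-and-decode-uri-path does exactly this for paths. For query strings there is, as far as I know, no equivalent built-in procedure, but the same split-then-decode order is easy to sketch by hand (parse-query-string below is my illustration, not a library procedure, and it assumes each parameter contains an =):

(use-modules (web uri))

;; split first, then decode: the encoded slashes in a post key
;; survive as part of a single component
(split-and-decode-uri-path "/admin/posts/2010%2f12%2f22%2fdoing-it-wrong")
;; => ("admin" "posts" "2010/12/22/doing-it-wrong")

;; query strings: split on &, then on the first = of each component,
;; then percent-decode the keys and values
(define (parse-query-string qs)
  (map (lambda (param)
         (let ((idx (string-index param #\=)))
           (cons (uri-decode (substring param 0 idx))
                 (uri-decode (substring param (1+ idx))))))
       (string-split qs #\&)))

(parse-query-string "key=2010%2f12%2f22%2fdoing-it-wrong&edit=yes")
;; => (("key" . "2010/12/22/doing-it-wrong") ("edit" . "yes"))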

Quoth the RFC:

When a URI is dereferenced, the components and subcomponents significant to the scheme-specific dereferencing process (if any) must be parsed and separated before the percent-encoded octets within those components can be safely decoded, as otherwise the data may be mistaken for component delimiters.

RFC 3986, section 2.4: When to encode or decode

This problem isn't specific to regular expressions -- it also occurs, for example, when dispatching an HTTP request to a URI handler. The dispatch should be based on path components, not on the path as a string.

it's a programming language problem

So why does this problem occur, even in technologies as venerable as mod_rewrite? Because it's one that can only be solved by programming languages. You need some sort of sequence data type to parse paths. You need some sort of map to parse query strings, conventionally at least. And then to reconstitute a URI, you need to do so from those data types (lists, maps, &c).

Most contexts in which you do dispatch, like in your .htaccess, aren't equipped with the expressivity to do it right. Regular expressions are only generally applicable on URI subcomponents like path components -- they can only be used in limited situations on full paths.

workarounds

In Aleix's case, the workaround is to use mod_proxy directly:

ProxyPass /blog http://localhost:8080/

It's unfortunate, as you can't use mod_proxy from an .htaccess file -- only from the main configuration, and it requires a restart of the server. Oh well, though.

You could still use mod_rewrite, but with special rules for the specific URI paths that might include escaped slashes, like this:

RewriteRule ^/blog/admin/posts/(.+)$ http://localhost:8080/admin/posts/$1 [P,B]

But such a special solution is just that -- special, i.e., not general.

en brevis

If you're hacking something for the web that is intended for general use, and if you parse or generate URIs, do your users a favor and give them an appropriate programming language. Your users will thank you, or more likely, continue in blissful ignorance, which is just as well.

on the new posix (2010-12-17)

Here's some stuff I've been preparing for Guile's 1.9.14 release, which should come out later today. Readers interested in further web hackery should check out Tekuti for a larger example. Happy hacking!

When Guile started back in the mid-nineties, the GNU system was still focused on producing a good POSIX implementation. This is why Guile's POSIX support is good, and has been so for a while.

But times change, and in a way these days the web is the new POSIX: a standard and a motley set of implementations on which much computing is done. So today's Guile also supports the web at the programming language level, by defining common data types and operations for the technologies underpinning the web: URIs, HTTP, and XML.

It is particularly important to define native web data types. Though the web is text in motion, programming the web in text is like programming with goto: muddy, and error-prone. Most current security problems on the web are due to treating the web as text instead of as instances of the proper data types.

In addition, common web data types help programmers to share code.

Well. That's all very nice and opinionated and such, but how do I use the thing? Read on!

Hello, World!

The first program we have to write, of course, is "Hello, World!". This means that we have to implement a web handler that does what we want. A handler is a function of two arguments and two return values:

(define (handler request request-body)
  (values response response-body))

In this first example, we take advantage of a short-cut, returning an alist of headers instead of a proper response object. The response body is our payload:

(define (hello-world-handler request request-body)
  (values '((content-type . ("text/plain")))
          "Hello World!"))

Now let's test it. Load up the web server module if you haven't yet done so, and run a server with this handler:

(use-modules (web server))
(run-server hello-world-handler)

By default, the web server listens for requests on localhost:8080. Visit that address in your web browser to test. If you see the string, Hello World!, sweet!
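If something else is already listening on port 8080, you can pass options through to the underlying server implementation; for the default http backend, the port is one of them:

;; arguments after the implementation name are passed to its open
;; procedure; here, listen on port 8081 instead of the default
(run-server hello-world-handler 'http '(#:port 8081))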

Inspecting the Request

The Hello World program above is a general greeter, responding to all URIs. To make a more exclusive greeter, we need to inspect the request object, and conditionally produce different results. So let's load up the request, response, and URI modules, and do just that.

(use-modules (web server)) ; you probably did this already
(use-modules (web request)
             (web response)
             (web uri))

(define (request-path-components request)
  (split-and-decode-uri-path (uri-path (request-uri request))))

(define (hello-hacker-handler request body)
  (if (equal? (request-path-components request)
              '("hacker"))
      (values '((content-type . ("text/plain")))
              "Hello hacker!")
      (not-found request)))

(run-server hello-hacker-handler)

Here we see that we have defined a helper to return the components of the URI path as a list of strings, and used that to check for a request to /hacker/. Then the success case is just as before -- visit http://localhost:8080/hacker/ in your browser to check.

You should always match against URI path components as decoded by split-and-decode-uri-path. The above example will work for /hacker/, //hacker///, and /h%61ck%65r.
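You can check that at the REPL; all three spellings decode to the same list of components:

(split-and-decode-uri-path "/hacker/")
;; => ("hacker")
(split-and-decode-uri-path "//hacker///")
;; => ("hacker")
(split-and-decode-uri-path "/h%61ck%65r")
;; => ("hacker")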

But we forgot to define not-found! If you are pasting these examples into a REPL, accessing any other URI in your web browser will drop your Guile console into the debugger:

<unnamed port>:38:7: In procedure module-lookup:
<unnamed port>:38:7: Unbound variable: not-found

Entering a new prompt.  Type `,bt' for a backtrace or `,q' to continue.
scheme@(guile-user) [1]> 

So let's define the function, right there in the debugger. As you probably know, we'll want to return a 404 response.

;; Paste this in your REPL
(define (not-found request)
  (values (build-response #:code 404)
          (string-append "Resource not found: "
                         (unparse-uri (request-uri request)))))

;; Now paste this to let the web server keep going:
,continue

Now if you access http://localhost:8080/foo/, you get this error message. (Note that some popular web browsers won't show server-generated 404 messages, showing their own instead, unless the 404 message body is long enough.)

Higher-Level Interfaces

The web handler interface is a common baseline that all kinds of Guile web applications can use. You will usually want to build something on top of it, however, especially when producing HTML. Here is a simple example that builds up HTML output using SXML.

First, load up the modules:

(use-modules (web server)
             (web request)
             (web response)
             (sxml simple))

Now we define a simple templating function that takes a list of HTML body elements, as SXML, and puts them in our super template:

(define (templatize title body)
  `(html (head (title ,title))
         (body ,@body)))

For example, the simplest Hello HTML can be produced like this:

(sxml->xml (templatize "Hello!" '((b "Hi!"))))
-| <html><head><title>Hello!</title></head><body><b>Hi!</b></body></html>

Much better to work with Scheme data types than to work with HTML as strings. Now we define a little response helper:

(define* (respond #:optional body #:key
                  (status 200)
                  (title "Hello hello!")
                  (doctype "<!DOCTYPE html>\n")
                  (content-type-params '(("charset" . "utf-8")))
                  (content-type "text/html")
                  (extra-headers '())
                  (sxml (and body (templatize title body))))
  (values (build-response
           #:code status
           #:headers `((content-type
                        . (,content-type ,@content-type-params))
                       ,@extra-headers))
          (lambda (port)
            (if sxml
                (begin
                  (if doctype (display doctype port))
                  (sxml->xml sxml port))))))

Here we see the power of keyword arguments with default initializers. By the time the arguments are fully parsed, the sxml local variable will hold the templated SXML, ready for sending out to the client.

Instead of returning the body as a string, here we give a procedure, which will be called by the web server to write out the response to the client.

Now, a simple example using this responder, which lays out the incoming headers in an HTML table.

(define (debug-page request body)
  (respond
   `((h1 "hello world!")
     (table
      (tr (th "header") (th "value"))
      ,@(map (lambda (pair)
               `(tr (td (tt ,(with-output-to-string
                               (lambda () (display (car pair))))))
                    (td (tt ,(with-output-to-string
                               (lambda ()
                                 (write (cdr pair))))))))
             (request-headers request))))))

(run-server debug-page)

Now if you visit any local address in your web browser, you'll actually see some HTML, finally.

Conclusion

Well, this is about as far as Guile's built-in web support goes, for now. There are many ways to make a web application, but hopefully by standardizing the most fundamental data types, users will be able to choose the approach that suits them best, while also being able to switch between implementations of the server. This is a relatively new part of Guile, so if you have feedback, let us know, and we can take it into account. Happy hacking on the web!

meta data (2010-12-13)

Hey tubes! Long time no type, in this direction at least.

It seems that most of my writing energy these past few months has been directed towards Guile. For example, right now I should be writing documentation for new hacks, but instead I am typing at another part of the ether.

It's good and bad, this thing. The good thing is the hack-cadence in Guile is high. The bad thing is that not many learn about it, because, well, code doesn't blog about itself, does it?

Except in this case, perhaps. The tin can jiggling the electrons at the other end of this blogline has been my hack, of late. What you are reading is words about Scheme web servers, served by a Scheme web server.

That's right, I ported Tekuti to Guile 2.0. Delicious dogfood, yum!

In the process, I decided that mod-lisp, which I had been using, was stupid. There is already a simple, standard way of serving HTTP requests over a socket, and it is HTTP. So I wrote pieces of a web server, and put them in Guile. I'll probably write more about that later, so no more words about that for now, except to request that folks with spiders, bots, odd rss grabbers and such send me bug reports if things aren't legit.

ciao slicehost, ciao linode

About the same time, my bank decided to change my credit card, so all my old subscriptions stopped working. It was just the thing I needed to make me jump ship, finally, from slicehost to linode.

If you're still on slicehost, I heartily recommend that you switch. (Heartily! Strange word. Like gravy and meatballs or something.) Linode feels faster to me, it's half the price, and otherwise the quality is about the same or perhaps a little better. And from what I hear, the linode offerings continue to improve, while slicehost hasn't changed for the 2+ years that I was with them.

Anyway, rap at yall soon, and keep your parentheses warm in this at-times cold Northern winter. Peace!

Images courtesy of the excellent Hyperbole and a Half.