9 de enero de 2015

Ignored expressions in Racket bytecode

Introduction

Following the idea of the last articles, we will count the occurrences of expressions whose result is ignored in the bytcode of Racket. In general, Racket programs are almost functional, as a matter of style. Almost always the useful part of each function is the result that it returns and usually they have no side effects. But when it's convenient you can use side effects locally, enclosed inside a function that is functional when you looks from the outside.
It is important to remember that when analyzing bytecode, we are looking at what remains after multiple optimizations steps and multiple macro expansions steps. So the results we get are very different from those obtained by analyzing the original code. These steps include functions inlining and constant propagation, so expressions that would be ridiculous in the source code as (box? (List (f 1) (f 2) (box 7)) can occur. Moreover, such cases are optimized because it is sure that the result is #f and then it's reduced to (begin (f1) (f2) (box 7) #f). And that (box 7) is superfluous, because it never fails nor has any side effects, then in the same step it is optimized to (begin (f 1) (f 2) #f). (Actually, maybe (begin (values (f 1)) (values (f 2)) #f) if the optimizer is not certain that f returns a single value.)
The subrutine that optimizes the expressions that are ignored, only eliminate the calls to unnecessary functions. This is good because it eliminates a lot of constructors, so at run time it's not necessary to create objects that are going to be destroyed immediately. For example, in (begin (cons (f) 'z) (if (h) 0 1)) #f) it eliminates the cons, but it doesn't removed the if, so the final expression is (begin (f) (if (h) 0 1) #f) because the 'z can be ignored. It would be better it it also eliminates the if, so we would get (begin (f) (h) #f) because we will ignore the result if then it doesn't matter if it produces either 0 or 1. But every additional reduction rule makes the optimization slower and more complex.
The big question is how many of these things remain after all the optimizations, to see if it is worth adding an optimization for this kind of if or they are very few and we can just ignore them. Also we will count what kind of functions are in expressions that are ignored, in order to see if there are interesting things that are not optimized away.

Program to count the expressions

The program is a shortened version of decompile. It's similar to the one we used in the previous articles. The interesting part is how it handles the begin (represented by seq), the begin0 (represented beg0) and the let where the variable is ignored (represented with a let-one, with a #t in the unused? flag).
{define (explore-expr expr globs stack closed)
  (match expr
    [(struct let-one (rhs body type unused?))
     (let ([id (or (extract-id rhs) (gensym (or type (if unused? 'unused 'local))))])
       (when unused?
         ((current-explore-found) 
          (list '|let-unused;| (decompilex-expr rhs globs stack closed))))
       (explore-expr rhs globs (cons id stack) closed)
       (explore-expr body globs (cons id stack) closed))]
    [(struct seq (exprs))
     (begin
       (for ([expr (in-list (reverse (cdr (reverse exprs))))])
         ((current-explore-found) 
          (list '|begin;| (decompilex-expr expr globs stack closed)))
         (explore-expr expr globs stack closed))
       (explore-expr (last exprs) globs stack closed))]
    [(struct beg0 (exprs))
     (begin
       (explore-expr (car exprs) globs stack closed)
       (for ([expr (in-list (cdr exprs))])
         ((current-explore-found) 
          (list '|begin0;| (decompilex-expr expr globs stack closed)))
         (explore-expr expr globs stack closed)))]
    [...]
    [else (void)])}
When it finds an expression whose result will be ignored, it calls decompilex-expr (with an extra x because it's not the real decompile-expr) which translates it to something more friendly. Then it passes the result to the main program calling the function that is in the parameter current-explore-found , which is responsible for storing, classifying and counting the expressions.
{define (decompilex-expr expr globs stack closed)
  (match expr
    [(struct toplevel (depth pos const? ready?))
     (if ready? (if const? '#%topcr '#%topr) (if const? '#%topc '#%top))
     #;(list-ref/protect globs pos 'toplevel)]
    [(struct primval (id))
     (hash-ref primitive-table id (lambda () (error "unknown primitive")))]
    [(struct localref (unbox? offset clear? other-clears? type))
     (if clear? '%clear-localref '#%localref)]
    [(struct branch (test then else))
     (list 'branch (filter-simple-values then) (filter-simple-values else))]
    [(struct application (rator rands))
     (list 'application (decompilex-var rator globs stack closed) (length rands))]
    [(struct apply-values (proc args-expr))
     (list 'applicationv (decompilex-var proc globs stack closed) 'v)]
    [else (car (or (prefab-struct-key expr) (list expr)))])}
This function tries to decompile the expression just a little, only to understand and classify it better. For example, for the functions calls, it adds the number of parameters.
In the case were there is a local variable accesses, this function marks whether the access also cleans the variable reference. This cleaning step is necessary  to allow the heavily recursive programs (aka normal) to be run. Before calling a function it tries to remove the references to variables that will not be used again. In this way, the garbage collector can recover the used memory. In general this is done in the last access to the variable, but in some cases one of the steps in the compilation have to add these marks. In these cases, the access to the variables has a subtle side effect that is not visible internally in the program, but allow the program to make more recursions. For example in
{define (too-much-zeros n)
  (let ([z (random n)])
    (if (zero? z)
      (display z)
      (too-much-zeros n)))}  
the z would accumulate even though they will never be used again. After the compilation, the program is transformed in something equivalent to (version cleaned for clarity):
{define (too-much-zeros n)
  (let ([z (random n)])
    (if (zero? z)
      (display z)
      (begin (#%sfs-clear z) (too-much-zeros n))))}  

Results

To count the expressions I used the repository Racket, from the last days of November just before the Great Split. First, let's count the ignored expressions that are not functions applications. The table shows the number of times that certain types of expression and a representative decompiled example. In the examples, v is a local variable and u is an unused variable.
Count Example
2827 (begin (#%sfs-clear v) ... _)
1504 (begin0 _ (#%sfs-clear v) ...)
87 (begin (if ? #<void> ?) ... _)
20 (begin (if ? ? #<void>) ... _)
18 (begin (if ? ? ?) ..._)
81 (begin (let ([u ?]) ...) ... _)
Most of the things that appear are the cleaning of  variables references. There are only a few if, they are probably due to the expansion of when and unless, but in all the cases at least one of its branches seems to be interesting, so they are not good candidates for simplification. (Actually, since a few months, expressions like (if v 0 1), which can not generate errors nor have side effects, are eliminated during the optimization when the result is going to be ignored.)
For simplicity, the method that I described to detect the expressions that have results that are ignored is slightly naive and it doesn't detect nested constructions. For example in (begin (begin x Y) z) it is clear that it doesn't matter the result of Y, but the previous program doesn't know it. Another example is (begin (if x Y Z) w), where we will not not take into account either the result of Y nor Z To fix this, we must add an extra argument to explore-expr to indicates whether the expression we are analyzing will be ignored and recursively propagate this argument. This extends the program a little, but it's nothing surprising. With this modification, we can count more ignored expressions. In the next table we compare the amount of expression found with each method, with a representative example of the decompiled code. In the examples, v is a local variable and u is an unused variable.
Pos Count Dir Count Exp Example Direct
1 202 202 (begin (set-box! ? ?) ... _)
2 85 85 (let ([u (_tok-val:ref@(lib "parser-tools/cfg-parser.rkt") ?)]) ... )
3 72 80 (begin (v ?) ... _)
4 0 32 (begin (raise-syntax-error ? ? ? ?) ... _)
5 0 29 (begin (raise-type-error ? ? ?) ... _)
6 29 29 (begin (set-mcdr! ? ?) ... _)
7 29 29 (let ([u (_stx-car:P@(lib "racket/private/stx.rkt") ?)]) ... )
8 18 24 (begin (v ? ?) ... _)
9 0 23 (begin (error ? ? ?) ... _)
10 17 17 (begin (namespace-variable-value ? ? ?) ... _)
11 11 17 (begin (fprintf ? ? ?) ... _)
12 11 17 (begin (fprintf ? ?) ... _)
13 0 17 (begin (printf ? ?) ... _)
14 5 15 (begin (_check-type ? ? ?) ... _)
15 5 15 (begin (hash-set! ? ? ?) ... _)
16 11 11 (begin (vector-set! ? ? ?) ... _)
17 0 8 (begin (raise-type-error ? ? ? ? ?) ... _)
18 5 8 (begin (? ? ?) ... _)
19 6 7 (begin (? ? ? ? ?) ... _)
20 7 7 (begin (apply ? ? ?) ... _)
21 0 7 (begin (error ? ?) ... _)
22 0 7 (begin (for-each ? ?) ... _)
23 1 6 (begin (? ? ? ?) ... _)
24 5 5 (begin (_check-sigs:P@(lib "racket/private/unit-runtime.rkt") ? ? ? ?) ... _)
25 0 5 (begin (_idY21.62:f@(lib "syntax/boundmap.rkt") ? ? ? ?) ... _)
26 5 5 (begin (_check-unit:P@(lib "racket/private/unit-runtime.rkt") ? ?) ... _)
27 5 5 (let ([u (cadr ?)]) ... )
Most are some version of set! for some structure, for example set-box! or set-mdr!, which are called for its side effect that stores a value in the structure. It's not surprisingly to find them here and clearly they can not be deleted. Other functions are obviously called for its side effects, such as fprintf. There are also many expressions that generate explicitly an error, so they can't be eliminated.
With both counting methods the kind of thing we find are similar. The biggest difference is that in the second version, there many more functions that generate error messages. I should analyze this more carefully, but I guess that they correspond to code like
{define (f l)
  (unless (list? l)
    (error 'error))
  (something-useful-with l))}
which expands to something like (version cleaned for clarity )
{define (f l)
  (begin
    (if (list? l)
      #<void>
      (error 'error))
    (something-useful-with l))}
The first method do not realize that either of the results of both branches of the if  are going to be ignored, but the second method does. This is a very usual construction in Racket source code, so I guess this is the cause of most of the differences.
The moral is that (almost?) all expressions that which results are ignored have side effects, and there isn't an obvious idea to optimize any of them.