My favourite JavaScript bug

Published May 27, 2013

To be clear, this isn't a bug *in* JavaScript; it's a bug in my own code. But various properties of JavaScript made it both easy to write and very hard to detect later.

I am currently writing a LOLCODE interpreter in JavaScript. It's not overly complex - it just recursively evaluates an AST directly.

LOLCODE allows for user-defined functions:

HOW DUZ I ADD YR NUM1 AN NUM2
  SUM OF NUM1 AN NUM2
OIC

VISIBLE ADD 1 AN 3 MKAY  BTW => prints 4

The AST for the function definition looks something like:

{
    "_name": "FunctionDefinition",
    "name": "ADD",
    "args": [
        "NUM1",
        "NUM2"
    ],
    "body": {
        "_name": "Body",
        "lines": [
            {
                "_name": "FunctionCall",
                "name": "SUM OF",
                "args": {
                    "_name": "ArgList",
                    "values": [
                        {
                            "_name": "Identifier",
                            "name": "NUM1"
                        },
                        {
                            "_name": "Identifier",
                            "name": "NUM2"
                        }
                    ]
                }
            }
        ]
    }
}

The easiest way to evaluate this is not to compile the function in any special way, but to use the magic of JS: treat the evaluation of the FunctionDefinition node as an interpreter action in its own right, just as we'd evaluate an identifier or any other construct.

If we can create a function which represents the evaluation of 'ADD', then we have a nice consistency with built-in functions like SUM OF, which are implemented natively, e.g.:

var lol = function() {
    var self = this;

    // Built-in functions live in the symbol table as native JS functions.
    this.symbols = {
      'SUM OF': function(a, b) { return a + b; }
    };

    // User-defined functions go into the same table, as closures over `self`.
    var evalFuncDef = function(node) {
        self.symbols[node.name] = function() {
            return self.evaluate(node.body);
        };
    };

    this.evaluate = function(node) {
       // delegate to appropriate sub function
       if (node._name === 'FunctionDefinition') {
           return evalFuncDef(node);
       }
    };
};

I've omitted setting up the argument list, figuring out the return value, etc., but the basic point is that 'SUM OF' (a native function) and 'ADD' (a user-supplied LOLCODE function) both exist in the symbol table in the form of an executable JavaScript function.
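For illustration, here is a fuller sketch that does run the ADD example end to end. It is not the real interpreter: MiniLol is a made-up name, and it assumes arguments are bound by writing them straight into the symbol table and that a body returns the value of its last line. Both are simplifications chosen for brevity.

```javascript
// Hypothetical sketch; MiniLol and its argument/return handling are assumptions.
function MiniLol() {
    var self = this;
    this.symbols = {
        'SUM OF': function(a, b) { return a + b; }
    };

    this.evaluate = function(node) {
        switch (node._name) {
        case 'FunctionDefinition':
            self.symbols[node.name] = function() {
                var values = arguments;
                // Assumption: parameters are written into the shared symbol table.
                node.args.forEach(function(argName, i) {
                    self.symbols[argName] = values[i];
                });
                return self.evaluate(node.body);
            };
            return;
        case 'Body':
            // Assumption: the value of the last line is the result.
            var result;
            node.lines.forEach(function(line) { result = self.evaluate(line); });
            return result;
        case 'FunctionCall':
            var args = node.args.values.map(function(value) {
                return self.evaluate(value);
            });
            return self.symbols[node.name].apply(null, args);
        case 'Identifier':
            return self.symbols[node.name];
        }
    };
}

// The AST from earlier in the post, written compactly.
var addDefinition = {
    _name: 'FunctionDefinition',
    name: 'ADD',
    args: ['NUM1', 'NUM2'],
    body: { _name: 'Body', lines: [{
        _name: 'FunctionCall',
        name: 'SUM OF',
        args: { _name: 'ArgList', values: [
            { _name: 'Identifier', name: 'NUM1' },
            { _name: 'Identifier', name: 'NUM2' }
        ] }
    }] }
};

var lolDemo = new MiniLol();
lolDemo.evaluate(addDefinition);
var sum = lolDemo.symbols['ADD'](1, 3);
console.log(sum); // prints 4
```

Note that the function stored under 'ADD' still closes over self, exactly as in the snippet above.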

It's a fairly innocent looking piece of code.

Except for one thing.

The interpreter also has the ability to pause the program and evaluate watch-statements, like what you'd find in Firebug or Chrome's debugger. For various uninteresting reasons[1], the easiest way to do this is to clone the current symbol table and other scope into another interpreter and execute it there.

Something interesting happens here.

In the above code, I've used the this/self idiom to get a reference to the current object into a nested function.

When we clone the symbol table into a different object, the self reference comes across unchanged. That means the second interpreter happily and correctly executes any expression you give it, until you supply one that tries to invoke a user-written function. At that point, the first interpreter kicks into action and continues executing. Imagine how difficult this was to debug: you can trace it all you want, and you will see you're always in the right functions. The key is realising you've suddenly switched to the wrong object, which in this case always has very similar (if not identical) state!
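The effect is easy to reproduce in isolation. This is a minimal sketch, not the real code: Interp, define, cloneSymbolsFrom and the label field are all hypothetical, and the "user function" simply reports which interpreter its body ran on.

```javascript
// Hypothetical minimal interpreter; names are illustrative only.
function Interp(label) {
    var self = this;
    this.label = label;
    this.symbols = {};

    // Defines a "user function" that closes over `self`.
    this.define = function(name) {
        self.symbols[name] = function() {
            return self.label; // reports which interpreter runs the body
        };
    };

    // Shallow-copies another interpreter's symbol table into this one.
    this.cloneSymbolsFrom = function(other) {
        Object.keys(other.symbols).forEach(function(key) {
            self.symbols[key] = other.symbols[key];
        });
    };
}

var main = new Interp('main');
main.define('ADD');

var watcher = new Interp('watcher');
watcher.cloneSymbolsFrom(main);

// Looked up on the watcher, but the closure still points at `main`.
var ranOn = watcher.symbols['ADD']();
console.log(ranOn); // prints "main"
```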

The solution to this is obvious: don't rely on self inside the function we create; instead, require it to be called with symbols[node.name].call(this, ...).
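A sketch of that fix, again with hypothetical names (FixedInterp, define, invoke) rather than the real code. The stored function reads this instead of a captured self, and every invocation goes through .call so the caller decides which interpreter the body runs on.

```javascript
// Hypothetical names throughout; a sketch, not the real interpreter.
function FixedInterp(label) {
    this.label = label;
    this.symbols = {};
}

// The stored function uses whatever `this` the caller supplies.
FixedInterp.prototype.define = function(name) {
    this.symbols[name] = function() {
        return this.label;
    };
};

// Always invoke through .call so the body runs on the calling interpreter.
FixedInterp.prototype.invoke = function(name) {
    return this.symbols[name].call(this);
};

var fixedMain = new FixedInterp('main');
fixedMain.define('ADD');

var fixedWatcher = new FixedInterp('watcher');
fixedWatcher.symbols['ADD'] = fixedMain.symbols['ADD']; // clone the entry

var ranOnFixed = fixedWatcher.invoke('ADD');
console.log(ranOnFixed); // prints "watcher"
```

The catch is that a plain symbols[name]() call would bind this to the symbols object itself, so the .call(this, ...) convention has to be followed at every call site.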

But the bug itself is admirable in its subtlety.
____
1. Mostly to do with keeping track of an awkward asynchronous callback. Call it an 'implementation issue'.

Filed under: javascript, lolcode, loljs, programming
