Thursday, June 05, 2008

So, WebKit JS used an AST walker before?

Color me flabbergasted.

WebKit recently announced a new JavaScript interpreter, named SquirellFish, claiming it is much faster than the previous interpreter in WebKit.

That's good and all, but in the "Why it's fast" section of the linked article, they say:

Like the interpreters for many scripting languages, WebKit’s previous JavaScript interpreter was a simple syntax tree walker.

It was? Oh my... They also say that:
SquirrelFish’s bytecode engine elegantly eliminates almost all of the overhead of a tree-walking interpreter. First, a bytecode stream exactly describes the operations needed to execute a program. Compiling to bytecode implicitly strips away irrelevant grammatical structure. Second, a bytecode dispatch is a single direct memory read, followed by a single indirect branch. Therefore, executing a bytecode instruction is much faster than visiting a syntax tree node. Third, with the syntax tree gone, the interpreter no longer needs to propagate execution state between syntax tree nodes.

Why, yes, indeed!
For the record, Rhino has been doing this for ages - AST is compiled to internal "JS bytecode" format that strips away grammar, and then interprets it. This works like this since, well, around the turn of the millenium. (Actually, Rhino can even kick it another notch and can also optionally compile the JS bytecode to Java bytecode, eliminating the interpreter altogether. Which bytecode then the JVM JIT compiler can further compile to machine code at its own discretion.)

Anyway, I digress. All I wanted to say is that I'm honestly amazed that a supposedly professional implementation of JavaScript (the one shipped in WebKit before SquirellFish came along, and by consequence, shipped in Apple's Safari browser) would use an AST walking interpreter. Yeah, I know, you'll say "most scripts run once on the page, so optimization is overkill", but with AJAX this is no longer true, and apparently, the WebKit team also thinks so.

On the positive side, they're finally moving on to a better solution, and I congratulate them on the undoubtedly hard decision to finally take the leap toward a more modern execution architecture.


Oliver said...

It's important to realise the the old JSC AST interpreter was very heavily optimised. At the point that the SquirrelFish branch merged the AST interpreter was the fastest browser implementation of JS (including pre-release versions of firefox, opera, etc).

The reason for switching to SquirrelFish was quite simply that there was no further room for improvement in the AST interpreter -- a fairly fundamental issue is that loading anything (even literals) requires a virtual call.

Oliver said...

It also seemed kind of erroneous to imply that by using an AST that it was not a "professional" implementation, especially given its performance :p

Attila Szegedi said...

Don't get me wrong - I did not mean to imply that using AST walker --> not professional. I was just genuinely surprised that WebKit - which I do consider to be a professional implementation - would utilize such a naive execution engine. I was just under the impression - without having investigated - that an AST walker would be slow and that no doubt modern day browsers are already using more sophisticated engines. That's all -- I was just very surprised to discover it was not the case.

Oliver said...

Ah, apologies for mistaking your tone (mistaking is the wrong word here.. miscomprehending? hmmm... unsure) then :D

Yes, it does seem kind of peculiar, and yes it had issues (very shallow recursion limit for instance) but it had many advantages, the code was simple and easily hackable (much more so than the main squirrelfish run loop), tools like profilers, debuggers could provide a huge amount of information just from the stack trace (because of all the accursed calls).

Squirrelfish obviously resolves a number of problems, like the recursion limit, and the accursed virtual function calls :D But there's a lot of complexity there now that wasn't there before, adn that is a fairly significant problem, especially if any new developers show up wanting to work on it. On the other hand, i shall repeat this, no more accursed virtual function calls.

Attila Szegedi said...

Well, yeah, visitor patterns over abstract tree nodes leads to a virtual call galore, which is nicely replaced with a threaded interpreter.

JamVM (an altenative Java VM) used a threaded interpreter for a while, then I believed it switched over to a "template interpreter" - same that Sun's Java HotSpot uses as its "interpreter". I'm putting it in quote marks as it's rather a "non-optimizing compiler" really, where each bytecode instruction is replaced with a machine code sequence that implements it, and the resulting code run directly by the CPU. Of course, the code gains a larger memory footprint, but you even eliminate the GOTOs of the threaded interpreter :-) Of course, it's also possible to use a hybrid approach where you'd have pure machine code blocks and only optimize tight loops...

Just throwing ideas around :-)