Nov 17, 2015

Benchmarking JavaScript parsers for Eclipse JSDT

The JavaScript parser for the Eclipse JSDT project is outdated. It lacks support for the latest ECMAScript 2015 (ES6) standard and has quality issues. Moreover, the JSDT parser is derived from JDT’s Java parser, so it has not been adopted by the JavaScript community at large, leaving the JSDT committers as its sole maintainers. Luckily, there are good-quality JavaScript parsers that already support a large number of tools built around them. However, these parsers, like most JavaScript tools, are written in JavaScript and require additional effort to integrate with Eclipse JSDT, which runs on a Java VM. In the last few weeks, I have been experimenting with alternatives that enable such integration.

Parsers

Before I go into the details of integration let me quickly introduce the parsers that I have tried.

Acorn

Acorn is a tiny parser written in JavaScript that supports the latest ES6 standard. It is one of the most widely adopted parsers and is used by several popular JavaScript tools. It parses JavaScript to the ESTree (SpiderMonkey) AST format and is extensible to support additional languages such as JSX and QML.

Esprima

Esprima is also a fast, tiny parser written in JavaScript that supports the latest ES6. Its development recently moved to the jQuery Foundation, and it has been in use in Eclipse Orion for a while. Just like Acorn, it uses the ESTree AST format.

Shift(java)

Shift(java) is the only Java-based parser on my list. It is a relatively new parser. It uses the Shift AST as its model, which differs from the widely adopted ESTree.

Note
Why does the AST model matter?

The AST model is what tools actually operate on. For instance, a JavaScript linter first uses a parser to generate an AST and then operates on that model to find possible problems. As one can imagine, an IDE that uses a widely adopted AST model can utilize the ecosystem of JavaScript tools more efficiently.

Eclipse JSDT already comes with its own AST model, which is used internally and is very hard to replace. Therefore, regardless of the AST model a parser generates, it will be converted to JSDT’s own model before it is used, which renders discussions around AST models moot in JSDT’s context.

Integration

The parsers other than Shift, which already runs on the Java VM, need a mechanism to play nice with the Java VM. I have experimented with 3 mechanisms for running Acorn and Esprima for JSDT so far.

Node.js

Utilizes Node.js to run the parser code. Node.js runs as an external process, receives the content to be parsed, and returns the results. I have chosen to use console I/O to communicate between Node.js and the Java VM. Other techniques exist, such as running an HTTP or socket-based server. To avoid the startup time of Node.js, which affects performance significantly, the Node.js process is kept running.
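The console I/O approach can be sketched roughly as follows on the Java side. This is only an illustration, not the code used for JSDT: the class name, the one-line-per-request framing, the Base64 encoding, and the idea of a Node.js script that answers each request line with one response line are all assumptions made for the example.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.List;

/** Keeps one external parser process alive and talks to it over stdin/stdout. */
class ExternalParserProcess implements AutoCloseable {
    private final Process process;
    private final BufferedWriter toChild;
    private final BufferedReader fromChild;

    /** The command is hypothetical, e.g. "node" plus a small script that wraps the parser. */
    ExternalParserProcess(List<String> command) throws IOException {
        process = new ProcessBuilder(command)
                .redirectError(ProcessBuilder.Redirect.INHERIT)
                .start();
        toChild = new BufferedWriter(
                new OutputStreamWriter(process.getOutputStream(), StandardCharsets.UTF_8));
        fromChild = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8));
    }

    /** Assumed framing: Base64 keeps a multi-line source file on a single line. */
    static String encodeRequest(String source) {
        return Base64.getEncoder().encodeToString(source.getBytes(StandardCharsets.UTF_8));
    }

    /** Assumed framing: the child answers with one Base64 line holding the AST as JSON. */
    static String decodeResponse(String line) {
        return new String(Base64.getDecoder().decode(line), StandardCharsets.UTF_8);
    }

    /** Sends one source file and blocks until the child answers with one line. */
    synchronized String parse(String source) throws IOException {
        toChild.write(encodeRequest(source));
        toChild.newLine();
        toChild.flush();
        String line = fromChild.readLine();
        if (line == null) {
            throw new IOException("parser process exited unexpectedly");
        }
        return decodeResponse(line);
    }

    @Override
    public void close() throws IOException {
        toChild.close();     // closing stdin signals the child that no more requests follow
        process.destroy();
    }
}
```

Because the process is created once and reused for every `parse` call, the startup cost of the external runtime is paid a single time rather than per file.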

J2V8

A JNI-based wrapper that bundles the V8 JavaScript VM. It provides a low-level Java API to execute JavaScript on the bare V8 engine. Although it uses V8, it does not provide the full functionality of Node.js and can only execute selected scripts. Fortunately, both Acorn and Esprima can be run with J2V8.

Nashorn

The JavaScript engine that is nowadays built into Java 8. Provides a simple high level API to run JavaScript.
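As a rough illustration of that high-level API, the sketch below evaluates a parser script once through the standard `javax.script` interface and then calls its `parse` function for each source. The class name and the stand-in parser script are hypothetical; a real setup would evaluate the Acorn or Esprima source instead. It also assumes a JVM that still ships Nashorn, which later JDK releases no longer do.

```java
import javax.script.Invocable;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

class NashornParserSketch {
    // Hypothetical stand-in for a real parser script such as acorn.js or esprima.js.
    static final String STAND_IN_PARSER =
          "var standInParser = { parse: function (src) {"
        + "  return { type: 'Program', sourceLength: src.length }; } };";

    private final ScriptEngine engine;

    NashornParserSketch(String parserScript) throws ScriptException {
        engine = new ScriptEngineManager().getEngineByName("nashorn");
        if (engine == null) {
            throw new IllegalStateException("Nashorn is not available on this JVM");
        }
        engine.eval(parserScript);   // evaluated once, reused for every parse call
    }

    /** True when the running JVM provides a Nashorn script engine. */
    static boolean isAvailable() {
        return new ScriptEngineManager().getEngineByName("nashorn") != null;
    }

    /** Calls parse(source) on the named global object and returns the AST as JSON text. */
    String parseToJson(String globalName, String source)
            throws ScriptException, NoSuchMethodException {
        Invocable invocable = (Invocable) engine;
        Object ast = invocable.invokeMethod(engine.get(globalName), "parse", source);
        Object json = invocable.invokeMethod(engine.eval("JSON"), "stringify", ast);
        return String.valueOf(json);
    }
}
```

With a real parser loaded, the call would look like `new NashornParserSketch(acornSource).parseToJson("acorn", code)`, where `acornSource` is assumed to hold the contents of the parser file.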

Performance Benchmarks

The criteria for choosing a parser may range from the feature set, to the AST model used, to even community size. However, performance is the one criterion that has to be met before any of the others become relevant. So, in order to compare the performance of the different alternatives, I have developed a number of benchmark tests that compare the parsers and the integration mechanisms.

All benchmark tests produce a result with an AST model, either in JSON form or as a Java object model. Tests exclude the startup time of their environments; for instance, the startup time of the Node.js process affects the results significantly but is discarded by the tests. The current test sets use AngularJS 1.2.5 and jQuery Mobile 1.4.2 (JQM) as the JavaScript code to be parsed.
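The idea of discarding startup cost can be sketched with a small hand-rolled timing loop. The post does not say which benchmark harness was used or how the Error column of Table 1 was computed, so everything here is an assumption for illustration: the class name, the warm-up approach, and the use of the sample standard deviation as a spread measure.

```java
import java.util.function.Supplier;

class ParseBenchmarkSketch {
    // Written on every iteration so the JIT cannot discard the task's result.
    static volatile int sink;

    /** Runs the task unmeasured for warm-up, then returns one timing (in ms) per measured run. */
    static double[] measureMillis(Supplier<?> task, int warmups, int iterations) {
        for (int i = 0; i < warmups; i++) {
            sink += String.valueOf(task.get()).length();   // startup and JIT cost land here
        }
        double[] samples = new double[iterations];
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            Object result = task.get();
            samples[i] = (System.nanoTime() - start) / 1_000_000.0;
            sink += String.valueOf(result).length();
        }
        return samples;
    }

    /** Arithmetic mean of the samples. */
    static double mean(double[] samples) {
        double sum = 0;
        for (double s : samples) {
            sum += s;
        }
        return sum / samples.length;
    }

    /** Sample standard deviation (n - 1 denominator); needs at least two samples. */
    static double sampleStdDev(double[] samples) {
        double m = mean(samples);
        double squares = 0;
        for (double s : samples) {
            squares += (s - m) * (s - m);
        }
        return Math.sqrt(squares / (samples.length - 1));
    }
}
```

A parser call would be passed in as the task, for example `measureMillis(() -> parser.parse(source), 20, 100)`, where `parser` and `source` are placeholders for whichever parser and script are being measured.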

Table 1. Average time for each benchmark

Parser (Script)        Runtime    Score         Error
Acorn (AngularJS)      J2V8       118.229 ms    ± 1.453
Acorn (JQM)            J2V8       150.250 ms    ± 4.579
Acorn (AngularJS)      Nashorn    181.617 ms    ± 6.421
Acorn (JQM)            Nashorn    177.265 ms    ± 9.074
Acorn (AngularJS)      NodeJS     59.115 ms     ± 0.698
Acorn (JQM)            NodeJS     34.670 ms     ± 0.250
Esprima (AngularJS)    J2V8       98.399 ms     ± 0.77
Esprima (JQM)          J2V8       114.753 ms    ± 1.007
Esprima (AngularJS)    Nashorn    73.542 ms     ± 0.450
Esprima (JQM)          Nashorn    73.848 ms     ± 0.885
Shift (AngularJS)      JavaVM     16.369 ms     ± 1.019
Shift (JQM)            JavaVM     15.900 ms     ± 0.325

As expected, the Shift parser, which runs directly on the Java VM, is the quickest solution. To be fair, the Shift parser is missing several features, such as source locations, tolerant parsing, and comments, that may affect parsing performance. However, even after these features are added, it may remain the quickest. I feel that the performance of J2V8 could also improve with more creative use of the low-level APIs; however, there is so much memory copying from the Java heap to JNI to the V8 heap and back that I am not sure the improvement would be significant.

The surprise for me is Esprima’s performance with Nashorn. It is unexpected in two ways: it is actually the third quickest option, yet Acorn does not reach the same level of performance on Nashorn.
