PDoc: inline documentation for Prototype


As 2008 turns into 2009, it’s past time to dust off some dormant projects in the Prototype realm. I’ve been playing around with PDoc for the first time since April in an effort to get it ready for the next major Prototype release.

Wait — have I not talked about PDoc yet? How is that possible?

OK, here’s what you need to know:

  • It’s RDoc for JavaScript.
  • It’s the brainchild of Tobie Langel and evidence of his mad genius.
  • It has a Prototype bent. Technically, there’s nothing Prototype‐specific about it, but we designed it so that Prototype’s idioms and conventions would feel at home.
  • It’s implemented very differently from most JavaScript inline‐doc tools.
  • One day soon, it’ll be the way we document Prototype.

What’s different about it?

We started PDoc because we were frustrated by tools like JSDoc, which make a valiant effort but fall short at drawing inferences about modern JavaScript. Javadoc (the standard‐bearer for inline documentation) works so well because Java itself guides the user into One Way of doing things. Statically typed languages are comparatively easy to write inline‐doc tools for.

JavaScript, on the other hand, has a countably‐large assload of ways to define an API, most of which will act the same from the outside even though they look so different from the inside. Prototype employs some strange techniques to define strange APIs: functions that work both as generic methods and instance methods, run‐time definition and re‐definition of functions based on browser capabilities and quirks, and so on. When doc tools try to read our JavaScript, they get angry, and people get hurt.
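
To make that concrete, here’s a minimal sketch of two of those idioms in plain JavaScript. It doesn’t use Prototype itself: the `methodize` helper, `Temperature`, `Reading`, and `trim` are all hypothetical names invented for illustration. The point is only that the functions a caller ends up with are assembled at run time, which is what a source-reading doc tool has trouble following.

```javascript
// Hypothetical helper: wraps a generic function (subject passed as the first
// argument) so it can also be installed as an instance method.
function methodize(fn) {
  return function (...args) {
    return fn(this, ...args);
  };
}

// Generic form: the subject is an explicit argument.
const Temperature = {
  toFahrenheit(reading) {
    return reading.celsius * 9 / 5 + 32;
  }
};

// Instance form: the same function, with `this` as the subject.
function Reading(celsius) {
  this.celsius = celsius;
}
Reading.prototype.toFahrenheit = methodize(Temperature.toFahrenheit);

// Run-time definition: which function body `trim` gets depends on what the
// environment supports, so reading the source alone can't settle it.
const trim = typeof String.prototype.trim === "function"
  ? (str) => str.trim()
  : (str) => str.replace(/^\s+|\s+$/g, "");

console.log(Temperature.toFahrenheit({ celsius: 100 })); // 212
console.log(new Reading(100).toFahrenheit());            // 212
console.log(trim("  hello  "));                          // "hello"
```

From the outside, `Temperature.toFahrenheit(reading)` and `reading.toFahrenheit()` are one API with two faces, yet nothing in the source declares the second one in a form a static tool could recognize.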

Tobie’s solution was to make a doc tool that wouldn’t try to read our JavaScript.

What it does

PDoc is unconcerned with the code itself. It looks only at the comments (which are /** delineated like this **/). Here’s an example:

/** alias: Array.from, section: Language
* $A([iterable]) -> Array
* - iterable (Object): An array-like collection (anything with numeric
*    indices).
*
* Coerces an "array-like" collection into an actual array.
*
* This method is a convenience alias of [[Array.from]], but is the preferred
* way of casting to an `Array`.
**/

You, as a human being, can read this comment, and once you’ve read it you can probably explain just what $A does. PDoc can read it, too: it does some clever parsing to extract a lot of information from this comment. It can tell the name of the method is $A; it knows it’s just another name for Array.from; it knows to put it in the Language section of the generated documentation; it knows the method’s one argument is named iterable and is optional.

The last two paragraphs are a human‐readable description of what the method does. PDoc mostly delegates to Markdown here — except for the [[bracket]] syntax, which it recognizes as a way to link to the documentation for another method.
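
For a feel of how much structure that one comment carries, here’s a rough sketch of the extraction in JavaScript. This is not how PDoc does it (PDoc uses a real grammar, and it’s written in Ruby); `parseDocComment` is a hypothetical function that leans on regular expressions and handles only the simple shapes that appear in the `$A` example.

```javascript
// The example comment from this post, as a string.
const comment = [
  "/** alias: Array.from, section: Language",
  " * $A([iterable]) -> Array",
  " * - iterable (Object): An array-like collection (anything with numeric",
  " *    indices).",
  " *",
  " * Coerces an \"array-like\" collection into an actual array.",
  " *",
  " * This method is a convenience alias of [[Array.from]], but is the preferred",
  " * way of casting to an `Array`.",
  " **/"
].join("\n");

// Hypothetical parser: handles only the shapes seen in this one comment.
function parseDocComment(source) {
  const lines = source
    .replace(/^\/\*\*/, "")      // drop the opening delimiter
    .replace(/\*\*\/\s*$/, "")   // drop the closing delimiter
    .split("\n")
    .map((line) => line.replace(/^\s*\*\s?/, "")); // drop each leading "* "

  // First line: comma-separated "key: value" tags.
  const tags = {};
  for (const pair of lines[0].split(",")) {
    const [key, value] = pair.split(":").map((part) => part.trim());
    if (key) tags[key] = value;
  }

  // Second line: "name(args) -> ReturnType".
  const signature = /^([\w$.#]+)\((.*)\)\s*->\s*(.+)$/.exec(lines[1].trim());
  if (!signature) throw new Error("Unrecognized signature: " + lines[1]);

  // Square brackets around an argument mark it as optional.
  const args = signature[2]
    .split(",")
    .map((raw) => raw.trim())
    .filter(Boolean)
    .map((raw) => ({
      name: raw.replace(/^\[|\]$/g, ""),
      optional: /^\[.*\]$/.test(raw)
    }));

  // Everything after the first blank line is free-form description.
  const firstBlank = lines.findIndex((line, i) => i > 1 && line.trim() === "");
  const description =
    firstBlank === -1 ? "" : lines.slice(firstBlank + 1).join("\n").trim();

  // [[Name]] marks a link to another documented item.
  const links = Array.from(description.matchAll(/\[\[(.+?)\]\]/g), (m) => m[1]);

  return {
    tags,
    name: signature[1],
    returns: signature[3].trim(),
    args,
    description,
    links
  };
}

const doc = parseDocComment(comment);
console.log(doc.name, doc.tags, doc.args, doc.links);
```

The result holds the method name, both tags, one optional argument named `iterable`, the return type, and a single link target, all without ever looking at the function’s source.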

A few other things make PDoc different. First of all, it knows about Prototype’s conventions; for instance, I can define a class, then document its individual methods, and PDoc knows to organize them into instance methods and static methods. PDoc also knows about mixins (e.g., Enumerable); should a class mix in one or more objects, PDoc will include those objects as part of its metadata about that class. Look at the syntax documentation and you’ll find other examples.
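
Here’s a small sketch of that kind of bookkeeping. Treat every detail as an assumption: the `organize` function, the `Widget` class, and the convention of marking instance methods with `#` and static methods with `.` are illustrative only, so check the syntax documentation for the notation PDoc actually uses.

```javascript
// Hypothetical: group the documented names for one class and record the
// mixins it declares. "#" = instance and "." = static is an assumed notation.
function organize(className, documentedNames, mixins) {
  const entry = {
    name: className,
    mixins: [...mixins],
    instanceMethods: [],
    staticMethods: []
  };
  for (const fullName of documentedNames) {
    if (fullName.startsWith(className + "#")) {
      entry.instanceMethods.push(fullName.slice(className.length + 1));
    } else if (fullName.startsWith(className + ".")) {
      entry.staticMethods.push(fullName.slice(className.length + 1));
    }
  }
  return entry;
}

const widget = organize(
  "Widget",
  ["Widget#render", "Widget#destroy", "Widget.create"],
  ["Enumerable"]
);
console.log(widget);
```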

How it works

PDoc is both a parser and a generator. The PDoc parser converts all those special comment blocks into an abstract representation of the code; a PDoc generator transforms that model into readable documentation. For now, there’s only one generator — it produces HTML — but we hope other generators will emerge over time.
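
The split is easy to picture with a toy version. In the sketch below, the `model` array stands in for whatever the parser produces, and `htmlGenerator`, `textGenerator`, and `generate` are hypothetical names; PDoc’s real generator is Ruby and far more involved. What the sketch shows is that a second output format needs only a second generator, with no changes to the parsing side.

```javascript
// Hypothetical stand-in for the parser's output.
const model = [
  {
    name: "$A",
    returns: "Array",
    description: "Coerces an \"array-like\" collection into an actual array."
  }
];

// Escape the characters that matter inside HTML text.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// One generator per output format; each turns a model entry into a string.
const htmlGenerator = {
  render(entry) {
    return (
      "<section><h2>" + escapeHtml(entry.name) + "</h2>" +
      "<p>" + escapeHtml(entry.description) + "</p></section>"
    );
  }
};

const textGenerator = {
  render(entry) {
    return entry.name + " -> " + entry.returns + "\n" + entry.description;
  }
};

// The model never changes; only the generator does.
function generate(entries, generator) {
  return entries.map((entry) => generator.render(entry)).join("\n");
}

console.log(generate(model, htmlGenerator));
console.log(generate(model, textGenerator));
```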

It uses Treetop, the most excellent Ruby library, to define its language grammar, and ERB to render all these Ruby objects as HTML snippets. For more about the nuts and bolts, consult the README file.

Where we need help

PDoc is alpha software. Here’s where it needs help:

  • It needs to be faster. I’m sure there are some simple things we can do to improve performance: nodes’ awareness of where they are in the tree (i.e., their parent node and child nodes) relies heavily upon sluggish Enumerable methods. The good news is that this node tree doesn’t change after the parsing stage, so memoization might be the answer.
  • At the same time, it needs to be less of a memory hog. Holding the whole node tree in memory at once is less than ideal. I’ve experimented with using Ruby’s Marshal class to store the parse tree as a file on disk; I did this so I wouldn’t have to re‐parse every time I wanted to re‐generate the docs, but with some tweaking it could also (perhaps) be used to reduce memory consumption by keeping only part of the tree in memory at any given moment.
  • It needs feedback, bug reports, and feature requests. Could PDoc grok your APIs? If not, what can we add to it to make that possible?
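
The memoization idea from the first item can be sketched in a few lines. PDoc itself is Ruby, so this JavaScript version is an analogy rather than a patch: `DocTree`, `childrenOf`, and the `scans` counter are hypothetical, and the flat node list is an assumed stand-in for the real parse tree. The trick is safe only because the tree is fixed once parsing is done.

```javascript
// Hypothetical node store: a flat, read-only list produced by the parsing stage.
class DocTree {
  constructor(nodes) {
    this.nodes = nodes;
    this.childrenCache = new Map();
    this.scans = 0; // counts full scans of the list, to make the saving visible
  }

  // Without the cache, every call would filter the whole list again.
  childrenOf(parentName) {
    if (!this.childrenCache.has(parentName)) {
      this.scans += 1;
      this.childrenCache.set(
        parentName,
        this.nodes.filter((node) => node.parent === parentName)
      );
    }
    return this.childrenCache.get(parentName);
  }
}

const tree = new DocTree([
  { name: "Language", parent: null },
  { name: "$A", parent: "Language" },
  { name: "$w", parent: "Language" }
]);

tree.childrenOf("Language");
tree.childrenOf("Language");
console.log(tree.scans); // 1: the second lookup came from the cache
```

The cost is one extra map entry per parent that gets looked up, which pulls in the opposite direction from the memory concern in the second item, so the two goals would need to be balanced.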

PDoc is hosted on GitHub, of course. Help us get it from alpha to beta.
