Tuesday, September 02, 2008

Package private access in Open Source code

I recently found myself in the same situation three times: someone wanted to use code I wrote in an Open Source project written in Java, and they couldn't, because the class/method in question had package private ("default") access, rendering it inaccessible outside of its package.

  • First, Charlie Nutter needs access to package-private class BeanMetaobjectProtocol in Dynalang-MOP.
  • Next, John Arkley needs access to package-private class AllHttpScopesHashModel in FreeMarker to help push it into Spring.
  • Finally, Steve Yegge needs access to the package-private constructor of Context class in Rhino.
Now, you know, when the same thing hits me three times in a row in a short timeframe (incidentally, all three coming from people with high geek cred), that gets me thinking: What good is package-private access in an Open Source project anyway?!

It's a very valid question, really. Think about it: people can see the code - it's open. People want to use the code - it's normal. And when they want to, you frustrate them by declaratively preventing them from linking their code to your open source code. They can look, but they can't touch.

That seems wrong.

Of course, you could argue for the validity of package private access. Here are some arguments you could come up with:

Package private helps implementation hiding on a package level! Well, duh, it does. However, if a class/method is useful to another class you wrote that lives in the same package, it might be useful to some poor outsider schmuck too! It must already conform to quite rigorous coding standards, as it will be used by other classes you wrote in the same package, so it'd better maintain state invariants and so on. Does it really make a difference if it's another class of yours, or another developer's class living in another package? I say: it shouldn't. If you think the class is just an auxiliary, and it's just cluttering the JavaDoc, move it to a *.support subpackage instead (Spring does that extensively).

Package private helps you hide bits you don't want to be tied down by backwards compatibility! This is really a corollary of the first one. I used to be big on this one (that's probably why I got asked to loosen up access restrictions - because I was the one who put them there in the first place). Common wisdom is that once people start using your publicly available API, you'd better not break it in the next release. By making lots of stuff public, you increase the surface area that needs to be kept backwards compatible, right? Right, sort of.

Now, how about this instead: make it public anyway, just note in a JavaDoc entry (or even better, in an annotation, say @SubjectToChange) that there's no backwards compatibility guarantee on this method. If it's an annotation, people can even have an automated tool for checking for its use before they upgrade. Hell, you can even have a @BreaksCompatibility annotation! What it boils down to is: don't treat users of your library as children and decide what's good for them. Inform them that the API is volatile, but open it up, don't close it. It's not really closed anyway, as they can see the source. You're just erecting a glass wall in front of them; they can look but they can't touch. 

And they'll come to bug you about it anyway. 

My point of view right now is that it's better to provide a suggestive hint that an API is volatile, in the documentation or in an annotation, and let anyone use it at their own risk rather than build a non-negotiable restriction into the code (as in, can't negotiate it with a compiler; you can still definitely negotiate it with me).
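
To make that concrete, here is a rough sketch of what such an annotation and a pre-upgrade check could look like. Only the name @SubjectToChange comes from the suggestion above; its definition, the TemplateCache class and the VolatileApiScanner helper are all made up for illustration. The types are left non-public only so the sketch fits in a single file; in a real library each would be public and live in its own file.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical marker: "this is accessible, but carries no compatibility guarantee".
// RUNTIME retention lets a tool find it reflectively; @Documented puts it in the JavaDoc.
@Documented
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.METHOD, ElementType.CONSTRUCTOR})
@interface SubjectToChange {
    String value() default "";
}

// Hypothetical library class: every method is accessible, one is flagged as volatile.
class TemplateCache {
    private final Map<String, Object> entries = new HashMap<>();

    public void put(String name, Object template) {
        entries.put(name, template);
    }

    public Object get(String name) {
        return entries.get(name);
    }

    @SubjectToChange("eviction strategy is still being reworked")
    public void evict(String name) {
        entries.remove(name);
    }
}

// Hypothetical helper a user could run before upgrading the library.
class VolatileApiScanner {
    // Returns the names of the methods declared on 'type' that are marked volatile.
    static List<String> volatileMethods(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Method method : type.getDeclaredMethods()) {
            if (method.isAnnotationPresent(SubjectToChange.class)) {
                names.add(method.getName());
            }
        }
        return names;
    }
}
```

Calling VolatileApiScanner.volatileMethods(TemplateCache.class) reports evict and nothing else, so a user knows exactly which calls to double-check. A real tool would more likely scan the user's own bytecode for call sites of annotated members, but the principle is the same.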

I'm also not saying that package private access is completely inadequate for Open Source libraries. I'll admit there might be valid use cases for it, but if there are, they're very few.

Now, before you accuse me of being a "make everything public" proponent, let me say that this reasoning doesn't apply at all to private access, and only partially applies to protected access. Let's see why:

Private access is completely justified. Let me point out one clear distinction between packages and classes: classes and instances of classes can have state. Packages cannot. That's pretty much what makes the difference. Through refactorings, you can end up with methods in a class that violate its state invariants. Other methods in the class can call these methods as steps to transform the object from one valid state to another valid state, but the object might be in an invalid state in the interim. You would never expose such a method publicly. Well, you shouldn't expose it package-privately either; that'd be sloppy programming (see my remark above about package-privately accessible methods having to be just as rigorously coded as public ones). A class (and class instance) is a unit of state encapsulation, a package is not. Hence, classes need private access.
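
A small, made-up example of what I mean (the Interval class and its methods are invented purely for illustration): the invariant is low <= high, and the two private steps can each break it on their own, while the public method that combines them never leaves the object in an invalid state.

```java
// Hypothetical class. Invariant: low <= high whenever a public method returns.
class Interval {
    private int low;
    private int high;

    Interval(int low, int high) {
        if (low > high) {
            throw new IllegalArgumentException("low must not exceed high");
        }
        this.low = low;
        this.high = high;
    }

    // Public: moves the whole interval. Valid state before, valid state after.
    public void shift(int delta) {
        moveLow(delta);   // for a large enough positive delta, low > high at this point
        moveHigh(delta);  // invariant restored
    }

    // Private: each step alone can violate the invariant, so neither may be
    // reachable from outside the class, not even from the same package.
    private void moveLow(int delta) {
        low += delta;
    }

    private void moveHigh(int delta) {
        high += delta;
    }

    public int low() {
        return low;
    }

    public int high() {
        return high;
    }
}
```

Opening up moveLow() to the rest of the package would let any neighboring class leave an Interval broken, which is exactly why private stays private.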

Protected methods remain usable by 3rd party code, as long as it has classes that extend your base class. It might not be an ideal constraint (forcing subclassing in order to access functionality), and you might rather want to design your libraries to favor composition (has-a) over subclassing (is-a). But if you must have classes intended to be subclassed, and there's functionality that's only ever used/extended/implemented by a subclass, make it protected. Especially when you have abstract protected methods that act as a poor Java programmer's mixins (that is, they are intended to be called by code in the base class). Sometimes you'll allow (or downright expect) such abstract methods to violate an invariant in the object's state temporarily, as they'll be called as a step in a more complex state transition implemented by an algorithm in the base class method. In these cases, protected access is justified. In other cases, though, you might still consider making some protected methods public.
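
Here is a minimal sketch of the kind of abstract protected method I have in mind; RecordFilter and NonBlankFilter are invented names, not taken from any of the projects mentioned above. The algorithm lives in a public method of the base class, and the protected hook exists only to be called from there.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base class that is meant to be subclassed.
abstract class RecordFilter {
    // Public entry point: the algorithm is implemented once, in the base class.
    public final List<String> apply(List<String> records) {
        List<String> kept = new ArrayList<>();
        for (String record : records) {
            if (accept(record)) {
                kept.add(record);
            }
        }
        return kept;
    }

    // Only ever called from apply() and only ever implemented by subclasses,
    // so protected access is all it needs.
    protected abstract boolean accept(String record);
}

// Hypothetical third-party subclass, which could live in any package.
class NonBlankFilter extends RecordFilter {
    @Override
    protected boolean accept(String record) {
        return !record.trim().isEmpty();
    }
}
```

Outside code gets at the functionality through apply(); nobody but the base class has any business calling accept() directly.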

In conclusion: I'm currently fairly convinced that what package private access is good for is preventing your users from linking their code to useful bits of your code for purposes you didn't anticipate up front; it's often nothing more than an unnecessary garden wall.