Constantly Changing: 2008

Wednesday, December 03, 2008

IP telephony on iPhone

I spend quite a lot of time calling my colleagues in the US or UK (I am in Hungary), which can result in quite exorbitant phone bills. I tried to fix this lately by looking for IP telephony solutions for the iPhone - I often have conference calls lasting upward of half an hour, and sometimes up to two hours, so I don't want to be confined to my desk or any other physical location. I will often walk up and down in the apartment, subconsciously tracing patterns of the carpets with my feet (it's funny when I consciously realize I'm doing it), staring out of various windows at the sky, sometimes sprawling on a sofa and so forth. (Hey, working from home has its perks!) So, I need mobility within the apartment, hence I still want to use the phone.

Now, Skype would be the obvious choice - for a flat fee of 10.29 EUR/month I can get unlimited calls to both US and UK numbers. The major hurdle is that Skype ships no client for the iPhone. Fring actually does the job; it is a multiprotocol chat/voice client for the iPhone (and some other phones) that supports Skype accounts too. Voice quality is not particularly good, but it's tolerable. Fring however lacks a feature that is essential for me - it does not provide a DTMF keypad. I always dial into the company conference center, and must be able to punch in the number of the conference I wish to join. Fring can't do it. I can still use it to call direct extensions, but the bulk of my calls go through the conference system, so it's a no-go. I asked Fring support, and all they told me was:

Currently, it is not available to dial DTMF sounds on fring, this should be implemented in future versions.

For the record, the Skype desktop client does have a DTMF keypad.

Next choice is TruPhone. They offer a client for the iPhone (and also for a bunch of other makers' devices). TruPhone operates their own call service, much like Skype does, and while there's no flat-fee subscription model, their 0.06 USD/minute plan is certainly quite attractive (heck, even calling numbers within my cell provider's network, T-Mobile Hungary, is more expensive than that!). The voice quality is very good, even better than a cellphone's (one coworker remarked after I first used it to dial in to a conference that I sounded much clearer than I usually do). The client app also has a mute button and a DTMF keypad, the first useful, the second essential for conferencing. It would all be rosy, if only the client weren't extremely buggy. Sometimes, with iPhone OS 2.1, the UI would slow down (you could see it animate scrolls pixel by pixel) and it would then claim there's not enough memory. iPhone OS 2.2 seems to have fixed that, but now it instead often complains that it can't find "Wi-Fi coverage" (even though the iPhone itself clearly sees the network). The aforementioned DTMF keypad has a tendency to not react to touch, or conversely, to get jammed as if I touched the key but didn't lift my finger, sending a prolonged shrieking "beeeeep" of that particular key's tone through the earphones into my skull at full volume. (The iPhone's built-in phone app nicely tones down the DTMF beeps on my end of the line, so I still have audio feedback that I'm typing them, but don't hear them at full volume.) Regardless, the "low memory" and "unable to find wi-fi" problems are the most critical - even turning the phone off and on again doesn't always cure its wi-fi disorientation syndrome (and mind you, I should never be forced to power cycle my phone just to use an app!).

So, it's a mixed bag. I'd use Fring if it had a DTMF keypad, because of the flat-rate unlimited calls provided by Skype. The second best right now is TruPhone, which provides very good voice quality and an otherwise pleasant service with low pay-as-you-go rates, but their iPhone client borders on unusable because of all the bugs. I'll be eagerly awaiting app updates to both.

Tuesday, November 18, 2008

World of Goo

I found a fantastic game. It's named World of Goo. Here's a trailer. And here's the Wikipedia entry. Check those ratings - 9/10, 9.5/10, 10/10, 93%, 96%.

It's hard to summarize what's right about this game. Most everything is.

Let me start with a few obscure details that might appeal specifically to a computer geek like me.

First, this game runs on Windows, on Nintendo Wii, and on Mac OS X. A Linux version is coming soon. And guess what, the Mac OS X version is not a lame "Win32 API linked with Wine" kind of port. The proof is that Mac software built by linking Win32 API code against Wine only runs on Intel CPUs. Not so this game. It runs on Macs with PowerPC CPUs (either a G4 or a G5). It's pretty much a native version on Mac OS X. It runs crisply on my iMac G5. I asked the developers by e-mail about their development methodology, and the developer Ron Carmel was kind enough to reply. (The whole company seems to be just two guys, working from wi-fi equipped coffee houses in San Francisco. They have years of experience working for big name game companies, though.) Anyway, Ron's response was:

at some point, when we realized we're going to go multi-platform (wii/pc at the time) we simply abstracted away anything and everything platform dependent from the code. this meant graphics, sound, input handling, window creation, threading, that sort of thing. the game only deals with a set of abstract classes that it gets from a singleton "environment" class. those classes provide a set of services (like graphics, sound, etc) and we have different implementations of those classes for windows, mac, wii, and soon linux.

I can't say just how much I respect people who take this approach. It's one of the hallmarks of professionalism in my book. More often than not, even much more complex software systems can (and should) be developed to be truly multiplatform. At least on unices, it is fairly common to maintain software as CPU-agnostic (and often OS-specifics-agnostic) source code with small CPU and OS adapters. These guys however didn't only do it across different unices, which are all closely related OSes. They did it across Windows, Mac OS X, and whatever the OS of the Wii is. That's 3 operating systems, and likely 3 different CPUs too (Intel, PPC, and whatever the Wii CPU is, although AFAIK, it's also some PPC variant). Consider how most large software houses can't be bothered to do this - they just target Windows primarily, even though they'd reap lots of internal benefits if they cleaned up their source code to be multiplatform.
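To make the structure Ron describes a bit more concrete, here is a rough sketch of it in JavaScript. It is only an illustration of the idea, not the game's actual code (which I haven't seen); every name in it (Environment, ConsoleGraphics, RecordingGraphics, drawGooBall) is invented for the example.

```javascript
// Hypothetical sketch: game code asks a singleton "environment" for services
// and never touches a platform API directly. All names here are invented.

// Two interchangeable implementations of the same graphics service.
class ConsoleGraphics {
  drawCircle(x, y, r) { console.log("circle at (" + x + ", " + y + ") r=" + r); }
}
class RecordingGraphics {
  constructor() { this.calls = []; }
  drawCircle(x, y, r) { this.calls.push({ x: x, y: y, r: r }); }
}

// The singleton that hands out whichever implementations were installed.
const Environment = {
  services: {},
  install(name, impl) { this.services[name] = impl; },
  get(name) {
    if (!(name in this.services)) throw new Error("no " + name + " service installed");
    return this.services[name];
  }
};

// Game code: written once, depends only on the abstract service.
function drawGooBall(ball) {
  Environment.get("graphics").drawCircle(ball.x, ball.y, ball.radius);
}

// Platform startup code is the only place that picks an implementation.
Environment.install("graphics", new RecordingGraphics());
drawGooBall({ x: 10, y: 20, radius: 5 });
```

Swapping RecordingGraphics for ConsoleGraphics (or, in the real thing, a Windows, Mac, or Wii implementation) requires no change to drawGooBall, which is the whole point.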

Second obscure detail, the game has no DRM. None. These guys were smart enough to realize it'd be a waste of resources to bother with it. Another big tip o'the hat to them.

What this game does have though is: captivating, immersive, innovative gameplay based on building structures out of gooey balls that bind to each other, and leveraging mechanics in a very realistic physics simulation (on various levels, you'll need to use tension, elasticity, gravity, wind, flotation, and so on). It's tremendous fun. Immersion is further helped by the fact that there's no UI aside from the mouse pointer - all interaction with the game world is by directly dragging and dropping the otherwise aimlessly wandering goo balls with the mouse.

There's a lot of humor and cuteness baked in. You can get as emotionally attached to the little goo balls as you could to those lemmings back in the day.

And then there's the beautiful visual art (that at times reminds me of Tim Burton movies, particularly The Corpse Bride), and music that perfectly matches it to create the immersive atmosphere. There are overarching motifs and hints of a backstory that are intriguing on their own.

It's ideal for a casual gamer who needs to blow off steam for an hour after a hard day's work. My son is playing it every evening. My daughter is playing it every evening. I am playing it every evening. You can play it too in minutes - there's a downloadable demo with the full first chapter of the game (the full game has four chapters and an "epilogue").

Wednesday, October 22, 2008

FreeCC, the modern JavaCC

If you ever needed to write a parser, you hopefully didn't do it by hand but rather used a parser generator. The two predominant parser generators in Java are ANTLR and JavaCC. The JavaCC project however has been suffering from paralysis for too many years now. The original developers have been gone for years, and the current set of active committers hasn't made any significant improvements in years either. My FreeMarker colleague Jon Revusky has recently taken an interest in it (FreeMarker has been using JavaCC for the last six years, ever since Jon ditched the hand-written parser of the 1.x versions). Since his improvements to the JavaCC codebase weren't accepted by the mainstream (largely dormant) JavaCC project, he made the only correct move: he forked it.

Enter FreeCC.

FreeCC is backwards compatible on source level with JavaCC - it can generate parsers from existing JavaCC sources, and the parsers behave identically. (Since it's a continuation of the original codebase under a different name, that's not too surprising, but with the work being done under a fresh name, it doesn't hurt to emphasize it.) FreeCC, however, has some modern new amenities, such as:

typesafe collections in its own code, instead of Java 1.1 Vector and Hashtable

instead of hardcoded prints in the source code, the parsers are generated using editable templates (can you say "multiple target languages support?")

other long overdue source code sanitization

generated parsers don't rely on static singletons anymore

These are all nice and dandy; code reworks in particular improve the project's long term maintainability by making it easier to comprehend for newly joining developers (in this regard, FreeCC is much more welcoming than JavaCC for a newbie hacker). These however don't really give you much of an advantage when you're writing your grammars. The following ones do, and they're the real kickers:

code injection feature eliminates (or at least strongly reduces) the need for manual post-editing of files

grammar include feature, which allows smaller grammar files to be reused in larger ones (with JavaCC, you had to copy-paste), and also lets you organize larger grammars into separate smaller files.

These features are also a sign of what kind of features are still to come: features that provide the modern conveniences a programmer in need of a parser generator would wish to have. The two new features above help you create maintainable grammar sources, and they are just the start. If you adopt these features however, there's no going back. Since JavaCC doesn't have them, your grammar files will no longer be compatible with JavaCC, only with FreeCC. But you'll hardly want to go back. FreeCC has taken this particular parser generator codebase much further in its few months of existence than the JavaCC project did in years, and it is gaining traction. Given that FreeCC is a continuation of the current JavaCC codebase (which didn't really progress any further since the fork), it is really risk-free to try it out instead of JavaCC for your next project (or even your current project!). You can also expect that the developer will be more open to your feature requests, as Jon has a good track record of listening to community wishes in FreeMarker.

FreeCC is the JavaCC you'd want to use in the 21st century.

At the FreeMarker project, we've already switched to FreeCC; FreeMarker 2.4.0 and 2.3.15 will both have a parser built using FreeCC. Since Jon also works on FreeMarker, he's truly eating his own dog food. (Actually, there's even more to that. Since FreeCC in turn uses FreeMarker as the templating system for its output, he's eating his own dog food doubly! This seemingly creates a circular dependency between FreeMarker and FreeCC, except that luckily all that was needed to bootstrap the process initially was a FreeMarker JAR with a parser still built using JavaCC; the projects have been self-sufficient since and don't need JavaCC anymore.)

Wednesday, October 08, 2008

HTML encapsulated JSON

This continues the early morning post on having an XSL-like solution for JSON, where your webapp only outputs JSON files and has attached stylesheet(s) that browsers can use to display them as nicely formatted HTML intended for a human audience. All solutions I outlined there needed an active change to existing technologies: a custom HTTP header, or an extension to JSON; in any case they would need explicit support from browsers.

In other words, they would never get widely adopted.

But then, just as I went to sleep, it hit me that there's a solution that operates completely within the currently existing technologies. I'll call this solution "HTML encapsulated JSON". The premise is that you create a simple HTML page that has the JSON payload in its body, and has script elements that pull in the JSON-to-DOM transforming script. Something like this:


<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <script src="/carlist-json-to-html.js" type="text/javascript"></script>
    <link rel="stylesheet" type="text/css" href="carlist.css" />
  </head>
  <body>
    <pre> 
        {   "cars": [
                {   "color": "red",
                    "make": "Ford",
                    "model": "Mustang",
                    "year": 1986,
                    "price": { 
                        "amount": 250,
                        "currency": "USD"
                    }
                },
                {   "color": "yellow",
                    "make": "Mazda",
                    "model": "6",
                    "type": "GT",
                    "year": 2002,
                    "price": { 
                        "amount": 14000,
                        "currency": "USD"
                    }
                }
            ],
            "dealer": {
                "name": "Honest Joe",
                "address": {
                    "street": "123 7th Avenue",
                    "city": "Dustfield"
                },
                "phone": "1-456-789789"
            }
        }
    </pre>
  </body>
</html>

As you can see, the JSON is put into html/body/pre (using "pre" to be fully DTD conformant). A machine client should be able to easily parse it out from there. Granted, it'll need to use both an XML parser and then a JSON parser, but that shouldn't be too big a deal.

But the best part is definitely that you can very easily transform this into nicely formatted HTML, by including a script (the script element in the head above) that builds a DOM from the JSON extracted from the body. Actually, you can just generate fairly naked HTML, and then use a separate CSS stylesheet (as shown above) to format it for presentation! And it's all standards compliant, and works with any modern browser!

Next, I'll try to come up with typical code for a JavaScript code template that would be used for a JSON-to-DOM conversion.
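As a first rough cut, such a script might look something like the sketch below. Treat it as an illustration under assumptions rather than a finished template: the function names (escapeHtml, carListToHtml) are made up, the data layout is the one from the sample page above, and it assumes a browser that provides JSON.parse (older ones would need a JSON library).

```javascript
// Hypothetical sketch of /carlist-json-to-html.js. Function names are
// invented; the data layout is the one from the sample page above.

// Escape text before placing it into markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Turn the parsed payload into fairly naked HTML; the CSS does the rest.
function carListToHtml(data) {
  const rows = data.cars.map(function (car) {
    // "type" is optional in the sample data, so drop missing parts.
    const name = [car.year, car.make, car.model, car.type]
      .filter(function (part) { return part != null; })
      .join(" ");
    return "<li class=\"car\">" + escapeHtml(name) + ", " +
      escapeHtml(car.color) + ": " +
      escapeHtml(car.price.amount + " " + car.price.currency) + "</li>";
  });
  return "<h1>" + escapeHtml(data.dealer.name) + "</h1>" +
    "<ul class=\"cars\">" + rows.join("") + "</ul>";
}

// In a browser, swap the <pre> for the generated markup once the page loads.
if (typeof document !== "undefined") {
  window.onload = function () {
    const pre = document.getElementsByTagName("pre")[0];
    const data = JSON.parse(pre.textContent);
    const container = document.createElement("div");
    container.innerHTML = carListToHtml(data);
    pre.parentNode.replaceChild(container, pre);
  };
}
```

Keeping the conversion in a pure function (carListToHtml) separate from the DOM wiring means the same code can be exercised outside a browser, and a machine client simply ignores the script and reads the pre element.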

CSS for JSON

The back-in-the-day promise of XSL stylesheets was that you could output an XML document from an HTTP request, and have it either processed as-is by a machine recipient, or automatically transformed into pretty HTML for human viewing.

Seeing how I (and probably many others) prefer JSON to XML nowadays for machine-to-machine communication, I'm thinking of an equivalent for JSON. In a similar vein, XSL is almost entirely replaced by CSS today, and I keep thinking that we'd need an XSL or CSS equivalent for JSON.

Actually, we probably don't even need a new language for describing the transform. JavaScript would probably do just fine - read the JSON, build the DOM.

The other problem, assuming we already have the language for transforming JSON to pretty formatted HTML: how would you declare a transform in a JSON document?

In XML, you can use a processing instruction. In HTML, you can use the <link> element. Now, unlike XML's processing instructions, or HTML's <link> element, JSON notoriously lacks any support for any out-of-band information that isn't the actual content. Heck, it doesn't even support comments!

I can see three possible ways around it:

Declare the transform in the transport header, i.e. just slap on an "X-JSON-Stylesheet" or similar header into the HTTP response. The obvious problem is that this ties the solution to the transport.
Add it at a conventional location in the JSON data itself. That is, have the top-level object literal carry a "__metadata__" object, and put this and other similar information into it. For example:
```
{ "__metadata__": { "links": [{"url": "myStyleSheet.cssjs", "type": "stylesheet"}]} ... }
```
The obvious problems are: a) the top-level value in a JSON document ain't necessarily an object (can be an array, or even a primitive value) b) some JSON based formats won't be happy about having extra elements with magic names inside of them.
Extend JSON to add a syntactic equivalent of XML processing instructions. Honestly, I prefer this approach. Just adding single-line comment support, and then having a special prefix for the "processing instruction" comment (e.g. using a question mark, to keep it similar to an XML PI) would do, I think:
```
//?stylesheet myStyleSheet.cssjs
{ real payload goes here ... }
```
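To show that a consumer of the second and third options would be simple to write, here's a small sketch of a reader that understands both. Neither convention is a standard; the "__metadata__" key and the "//?stylesheet" prefix are just the ones proposed above, and the function name readJsonWithStylesheet is made up.

```javascript
// Hypothetical reader for the two in-band conventions sketched above.
// Neither is a standard; the key and prefix names come from this post.
function readJsonWithStylesheet(text) {
  let stylesheet = null;

  // Option 3: a leading "//?stylesheet <url>" processing-instruction comment.
  // It has to be stripped, since a plain JSON parser rejects comments.
  const match = /^\s*\/\/\?stylesheet[ \t]+(\S+)[ \t]*\r?\n/.exec(text);
  if (match) {
    stylesheet = match[1];
    text = text.slice(match[0].length);
  }

  const payload = JSON.parse(text);

  // Option 2: a "__metadata__" member, only possible on a top-level object.
  if (stylesheet === null && payload !== null && typeof payload === "object" &&
      !Array.isArray(payload) && payload.__metadata__) {
    const links = payload.__metadata__.links || [];
    const link = links.find(function (l) { return l.type === "stylesheet"; });
    if (link) stylesheet = link.url;
  }

  return { stylesheet: stylesheet, payload: payload };
}

const fromComment = readJsonWithStylesheet(
  '//?stylesheet myStyleSheet.cssjs\n{"cars": []}');
const fromMetadata = readJsonWithStylesheet(
  '{"__metadata__": {"links": [{"url": "myStyleSheet.cssjs", "type": "stylesheet"}]}, "cars": []}');
const plainArray = readJsonWithStylesheet('[1, 2, 3]');
```

Note how the comment form works for the array payload too (had it carried one), while the metadata form has nowhere to live in it; that's problem a) from the second option in action.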

In any case, it is up to browser vendors to get the ball rolling on this - both the declaration mechanism, and the transform language.

I think this would be a great technology to have, where I could just output JSON from my web applications, and have it both consumed by software, and presented in nicely human readable form in browsers.

Tuesday, September 02, 2008

Package private access in Open Source code

I recently got into the same situation three times: someone wanted to use code I wrote in an Open Source project written in Java, and they couldn't, because the class/method in question had package private ("default") access, rendering it inaccessible outside of its package.

First, Charlie Nutter needs access to package-private class BeanMetaobjectProtocol in Dynalang-MOP.
Next, John Arkley needs access to package-private class AllHttpScopesHashModel in FreeMarker to help push it into Spring.
Finally, Steve Yegge needs access to the package-private constructor of Context class in Rhino.

Now, you know, when the same thing hits me three times in a row in a short timeframe (incidentally, all three coming from people with high geek cred), that gets me thinking: What good is package-private access in an Open Source project anyway?!

It's a very valid question, really. Think about it: people can see the code - it's open. People want to use the code - it's normal. And when they want to, you frustrate them by declaratively preventing them from linking their code to your open source code. They can look, but they can't touch.

That seems wrong.

Of course, you could argue for package private access' validity. Here are some arguments you could come up with:

Package private helps implementation hiding on a package level! Well, duh, it does. However, if a class/method is useful to another class you wrote and that lives in the same package, it might also be useful to some poor outsider schmuck too! It must already conform to quite rigorous coding standards, as it will be used by some other classes, ones you wrote in the same package, so it'd better maintain state invariants and so on. Does it really make a difference if it's another class of yours, or another class of another developer, living in another package? I say: it shouldn't. If you think the class is just an auxiliary, and it's just cluttering the JavaDoc, just move it to a *.support subpackage instead (Spring does that extensively).

Package private helps you hide bits you don't want to be tied down by backwards compatibility! This is really a corollary of the first one. I used to be big on this one (that's probably why I got asked to loosen up access restrictions in the first place - because I used to place them there in the first place). Common wisdom is that once people start using your publicly available API, you'd better not break it on the next release. By making lots of stuff public, you increase the surface area that needs to be kept backwards compatible, right? Right, sort of.

Now, how about this instead: make it public anyway, just note in a JavaDoc entry (or even better, in an annotation, say @SubjectToChange) that there's no backwards compatibility guarantee on this method. If it's an annotation, people can even have an automated tool for checking for its use before they upgrade. Hell, you can even have a @BreaksCompatibility annotation! What it boils down to is: don't treat users of your library as children and decide what's good for them. Inform them that the API is volatile, but open it up, don't close it. It's not really closed anyway, as they can see the source. You're just erecting a glass wall in front of them; they can look but they can't touch.

And they'll come to bug you about it anyway.

My point of view right now is that it's better to provide a suggestive hint in the documentation or in an annotation that an API is volatile, and let anyone use it at their own risk, rather than build a non-negotiable restriction into the code (as in, can't negotiate it with a compiler; you can still definitely negotiate it with me).

I'm also not saying that package private access is completely inadequate for Open Source libraries. I'll admit there might be valid use cases for it, but if there are, they're very few.

Now, before you accuse me of being a "make everything public" proponent, let me say that this reasoning doesn't apply at all to private access, and only partially applies to protected access. Let's see why:

Private access is completely justified. Let me point out one clear distinction between packages and classes: classes and instances of classes can have state. Packages can not. That's pretty much what makes the difference. Through refactorings, you can end up with methods in a class that violate its state invariants. Other methods in the class can call these methods as steps to transform the object from one valid state to another valid state, but it might be invalid in the interim. You would never expose such a method publicly. Well, you shouldn't expose it even package-privately either; that'd be sloppy programming (see my above remark about package-privately accessible methods having to be just as rigorously coded as public ones). A class (and class instance) is a unit of state encapsulation, a package is not. Hence, classes need private access.

Protected methods remain usable by 3rd party code, as long as it has classes that extend your base class. It mightn't be too ideal a constraint (forcing subclassing in order to access functionality), and you might rather want to design your libraries to favor composition (has-a) over subclassing (is-a) architecture. But if you must have classes intended to be subclassed, and there's functionality that's only ever used/extended/implemented by a subclass, make it protected. Especially when you have abstract protected methods that act as poor Java programmer's mixins (are intended to be called by code in the base class). Sometimes you'll allow (or downright expect) such abstract methods to violate an invariant in the object's state temporarily, as it'll be called as a step in a more complex state transition implemented by an algorithm in the base class method. In these cases, protected access is justified. But in other cases though, you might still consider making some protected methods public.

In conclusion: I'm currently fairly convinced that what package private access is good for is preventing your users from linking their code to useful bits of your code for purposes you didn't anticipate up front; it's often nothing more than an unnecessary garden wall.

Thursday, July 24, 2008

Undeclared, Undefined, Null in JavaScript

So, a coworker of mine ran into a situation recently where they were testing for an expression in JavaScript as:

if(obj.prop != null && obj.prop.subprop > 0) ...

This basically guards you against dereferencing "subprop" if "obj.prop" is itself undefined or null. They were running into a situation where the first predicate would pass, and the second would fail. However, "obj.prop" always printed "undefined". Turns out, it actually had the string "undefined" as its value.

Ouch.

Anyway, said coworker pointed me to a blog entry titled Three common mistakes in JavaScript / EcmaScript. Said blog entry says:

if (SomeObject != null) {

Well, in JavaScript, which is a dynamic language, something that has not been assigned to is not null, it's undefined. Undefined is different from null. Why? Don't ask me. Well, anyways, you can use typeof to explicitly check for undefined, or use other more or less clean tricks, but the best way to deal with that is probably to just rely on the type-sloppiness of JavaScript and count on it to evaluate null and undefined as false in a boolean expression, like this:

if (SomeObject) {

It looks uglier, but it's more robust.

I have to disagree. It's not more robust. It will also catch the cases of SomeObject having a numeric value of 0, or being the empty string, because their boolean coercion is also false. To make matters worse, the original example, using SomeObject != null, actually works and is in most cases the most appropriate!

See, in JavaScript, null and undefined are actually equal according to the == and != operators! (ECMA-262 standard, section 11.9.3 tells us so.) In the vast majority of cases, you don't care about the difference at all, so using someExpr != null is good enough.
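A quick side-by-side of the two tests makes the difference visible. This is just a small demonstration; the helper names truthy and notNull are made up, and I'm using console.log where a shell like Rhino's would use print.

```javascript
// Compare the plain truthiness test with the "!= null" test.
function truthy(v) { return v ? true : false; }
function notNull(v) { return v != null; }

const samples = [undefined, null, 0, "", false, "undefined", {}];
const results = samples.map(function (v) {
  return { value: v, truthy: truthy(v), notNull: notNull(v) };
});

// undefined and null fail both tests.
// 0, "" and false fail the truthiness test but pass "!= null".
// The string "undefined" and the empty object pass both.
results.forEach(function (r) {
  console.log(JSON.stringify(r.value), "truthy:", r.truthy, "!= null:", r.notNull);
});
```

So if 0, the empty string, or false are legitimate values for the thing you're guarding, the truthiness test will wrongly treat them as missing, while != null won't.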

If you really-truly must distinguish between undefined and null, you have some options. Curiously, while there is an actual built-in language literal for the null value (namely, null), the sole value of the Null type, there is no built-in language literal for undefined, the sole value of the Undefined type. The identifier "undefined" can be assigned to, so you can't write something as simple as if(x === undefined):

var undefined = "I'm defined now";
var x; // he's really undefined
print(x === undefined); // prints false

Inconvenient, huh? So, how to test for undefined? Well, the common practice found in most books and tutorials on JavaScript seems to be using the built-in typeof operator, but I really don't like it, because the test then takes the form of a string comparison:

var x;
print(typeof(x) == "undefined");

will actually print true. But as I said, I think it's ugly.

My solution instead relies on the fact that undefined is equal to null, but is not strictly equal to null, therefore this expression also works:

var x;
print(x == null && x !== null);

will also print true, and it involves only two simple comparisons.
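One more option worth knowing is the void operator, which always evaluates to the undefined value no matter what the identifier "undefined" currently refers to. The sketch below wraps the shadowing in a function, because engines implementing ECMAScript 5 or later make the global undefined read-only while still allowing a local variable of that name; the function name shadowed is made up.

```javascript
// A local variable named "undefined" can still shadow the real thing,
// so the comparison against the identifier is fooled, but "void 0" is not.
function shadowed() {
  var undefined = "I'm defined now"; // just a local that happens to have this name
  var x;                             // he's really undefined
  return [x === undefined, x === void 0, x == null && x !== null];
}

const outcome = shadowed();
console.log(outcome[0]); // false - compared against the shadowing string
console.log(outcome[1]); // true  - void 0 is always the undefined value
console.log(outcome[2]); // true  - the two-comparison test from above
```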

Which brings us to the question of what is the actual difference in JavaScript between an undeclared variable, a variable with undefined value, and a variable with null value. Let's see:

var x = {}; // empty object
var u; // declared, but undefined
var n =  null; // declared, defined to be null

function isUndefined(x) { return x == null && x !== null; }

print(isUndefined(x.x)); // prints true - access to undefined property on an object yields undefined
print(isUndefined(u)); // prints true - declared, but undefined value
print(isUndefined(n)); // prints false - the value is null, not undefined
print(isUndefined(z)); // runtime error -- z is undeclared

So, it's an error to dereference an undeclared variable, "z" in the above example (it's okay to assign to it, which creates a new global variable). It is not an error to dereference a declared variable with an undefined value, "u" in the above example. Its value is the undefined value. Further, access to any undefined property on an object also results in the undefined value, duh. As for null, well, null is just a value like true, or false, or 4.66920166091; it's the single value of the type Null.
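The one place where typeof does earn its keep is exactly the undeclared case: it is the only operation that can be applied to an undeclared identifier without raising an error. A small sketch of the three situations follows; the helper tryRead is made up, and neverDeclared plays the role of "z" from the example above.

```javascript
// "neverDeclared" is deliberately not declared anywhere in this snippet.
function tryRead(read) {
  try {
    return String(read());
  } catch (e) {
    return e.name; // the kind of error that was thrown
  }
}

var u; // declared, but its value is undefined

const declaredRead = tryRead(function () { return u; });
const undeclaredRead = tryRead(function () { return neverDeclared; });
const undeclaredTypeof = tryRead(function () { return typeof neverDeclared; });

console.log(declaredRead);     // undefined
console.log(undeclaredRead);   // ReferenceError
console.log(undeclaredTypeof); // undefined
```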

Hope this clears up the whole topic of undefined/null values (and undeclared variables) somewhat.

Saturday, June 07, 2008

Objective-J

So, here's some news: people at 280 Slides created a web application that allows you to build presentations in your web browser. It does look very nice; people are comparing it to Apple's Keynote application. All in all, yet another webapp out there; what's the big deal, right?

Well, the people who created 280 Slides were previously Apple employees. 280 Slides wasn't just written in JavaScript. No. These people created something called Objective-J, which is to JavaScript as Objective-C is to C. And then they implemented part of Apple's Cocoa application framework atop of it (named it Cappuccino), and finally implemented the application atop of it.

Now that's quite amazing.

Dion Almaer writes that

Objective-J is the language that takes JavaScript and makes it Objective (as Obj-C did to C). Lots of square brackets. When the browser gets served .j files, it preprocesses them on the fly. This means that you can do things like, use standard JavaScript in places.

Interesting. Objective-J will eventually be open sourced at objective-j.org, and I'll be quite curious to see what they did. I suspect they have a transformer from Objective-J source code to plain JavaScript (presumably itself written in JS), and then the browser's JS runtime converts the source code to JS when it downloads it. But I might be wrong.

Then there's the interesting issue that Objective-C improved C with OO features. But what did Objective-J improve? JavaScript is extremely object-oriented to begin with, so this sounds more as if they wanted to bring the actual Objective-C flavor of OO to JavaScript instead, because that's what they're comfortable doing. They need to drive nails into a different wall now, and they'd still prefer to do it with their old hammer!

Don't get me wrong, I'm not making fun of them. Shaping one's tools in a new environment after ones you knew and loved in a previous environment is a valid activity if you perceive it as the path that allows you to be most productive. To build a new language atop of JS and then build an application framework atop of it, and then build a very usable and visually appealing application on top of it (very cross-browser compatible too) gets you a metric shitload of geek cred in my circles. It might turn out to be a catalyst for getting a lot of similarly nice future webapps out there. It might turn out to be the next big thing for JavaScript in the browser.

I'm eagerly waiting for content to start popping up at objective-j.org, although of course the Objective-J.js can be readily inspected.

Thursday, June 05, 2008

So, WebKit JS used an AST walker before?

Color me flabbergasted.

WebKit recently announced a new JavaScript interpreter, named SquirrelFish, claiming it is much faster than the previous interpreter in WebKit.

That's good and all, but in the "Why it's fast" section of the linked article, they say:

Like the interpreters for many scripting languages, WebKit’s previous JavaScript interpreter was a simple syntax tree walker.

It was? Oh my... They also say that:

SquirrelFish’s bytecode engine elegantly eliminates almost all of the overhead of a tree-walking interpreter. First, a bytecode stream exactly describes the operations needed to execute a program. Compiling to bytecode implicitly strips away irrelevant grammatical structure. Second, a bytecode dispatch is a single direct memory read, followed by a single indirect branch. Therefore, executing a bytecode instruction is much faster than visiting a syntax tree node. Third, with the syntax tree gone, the interpreter no longer needs to propagate execution state between syntax tree nodes.

Why, yes, indeed!

For the record, Rhino has been doing this for ages - the AST is compiled to an internal "JS bytecode" format that strips away grammar, and the interpreter then executes that. It has worked like this since, well, around the turn of the millennium. (Actually, Rhino can even kick it up another notch and optionally compile the JS bytecode to Java bytecode, eliminating the interpreter altogether. The JVM JIT compiler can then further compile that bytecode to machine code at its own discretion.)
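For anyone who hasn't seen the two approaches side by side, here's a toy illustration for a tiny arithmetic language. It reflects neither WebKit's nor Rhino's actual instruction format; the opcodes and function names (walk, compile, run) are invented, and I picked a simple stack machine purely because it's the shortest thing to write down.

```javascript
// Illustration only: a toy expression language, not any real engine's design.

// Syntax tree for (2 + 3) * 4.
const tree = {
  op: "*",
  left: { op: "+", left: { num: 2 }, right: { num: 3 } },
  right: { num: 4 }
};

// 1. Tree walker: recursively visits every node on every evaluation.
function walk(node) {
  if ("num" in node) return node.num;
  const l = walk(node.left), r = walk(node.right);
  return node.op === "+" ? l + r : l * r;
}

// 2. Compile once into a flat instruction array for a stack machine...
const PUSH = 0, ADD = 1, MUL = 2;
function compile(node, code) {
  if ("num" in node) { code.push(PUSH, node.num); return code; }
  compile(node.left, code);
  compile(node.right, code);
  code.push(node.op === "+" ? ADD : MUL);
  return code;
}

// ...then execute it in a plain dispatch loop, with no tree in sight.
function run(code) {
  const stack = [];
  for (let pc = 0; pc < code.length; pc++) {
    switch (code[pc]) {
      case PUSH: stack.push(code[++pc]); break; // operand follows the opcode
      case ADD: stack.push(stack.pop() + stack.pop()); break;
      case MUL: stack.push(stack.pop() * stack.pop()); break;
    }
  }
  return stack.pop();
}

const bytecode = compile(tree, []);
console.log(walk(tree), run(bytecode)); // 20 20
```

The grammatical structure (which node is whose child) is paid for once in compile; run only ever sees a linear stream of operations, which is the saving the WebKit article is describing.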

Anyway, I digress. All I wanted to say is that I'm honestly amazed that a supposedly professional implementation of JavaScript (the one shipped in WebKit before SquirrelFish came along, and by consequence, shipped in Apple's Safari browser) would use an AST walking interpreter. Yeah, I know, you'll say "most scripts run once on the page, so optimization is overkill", but with AJAX this is no longer true, and apparently, the WebKit team also thinks so.

On the positive side, they're finally moving on to a better solution, and I congratulate them on the undoubtedly hard decision to finally take the leap toward a more modern execution architecture.

Thursday, May 29, 2008

"Mixed Language Environments" interview

In case you're interested, the video interview where Kresten Krab Thorup interviewed Charlie Nutter, Erik Meijer, and myself about mixed language environments on virtual machines is now online. This was shot at the JAOO conference last September.

Wednesday, April 16, 2008

Interview about Rhino

Floyd Marinescu interviewed me about Rhino at the JAOO conference in Aarhus, Denmark last September. You can watch the interview here. Yes, I know I said "scratch to itch" :-)

I actually stepped down from the Rhino maintainer role since this interview was shot, although I do remain with the project as a committer.

Wednesday, April 09, 2008

Airport Extreme disappoints

UPDATE: Apparently, firmware upgrade 7.3.2 fixed the disk access problems. I'm now able to use a USB-attached disk to AEBS just fine. Skype also started working fine some time ago (even before the firmware upgrade). The speed issue (not being able to exceed 8 Mbit/sec on 802.11g with all-Apple equipment) unfortunately still remains.

I got a brand new Airport Extreme Base Station (AEBS) this Friday. (Kirk hauled it for me from Las Vegas to Hungary - Apple gear costs roughly twice as much here as it does in the US.)

I already own a 250GB external HDD, so I wanted to attach it to AEBS for either file sharing or, hopefully, wireless Time Machine backups.

(Too bad Apple officially said only three days later that they don't support Time Machine backups with AEBS. Meaning, if it works for you, you're lucky, but if it doesn't, you're on your own.)

Anyway...

My experience is that even when Time Machine is not involved, AEBS can't handle a USB drive at all. Just attaching the drive to the base station will cause it to come online much more slowly (it needs several minutes after being plugged in). Also, whenever I wanted to mount the drive through Finder on any of the Macs in the house and transfer a file to/from it, AEBS would become inaccessible in a minute or two, then spontaneously reboot. With no drive attached, it's stable as a rock.

Of course, you could claim that the drive is at fault. Now, the same drive works perfectly when attached to either my MacBook Pro, or to my iMac G5, both running Mac OS X 10.5.2. It was formatted and partitioned (single partition, GUID scheme, HFS+ journalled filesystem) on the iMac G5. As in, freshly reformatted and repartitioned once again in a desperate attempt to make it work with AEBS. I ran Disk Utility's "Repair Disk" on it too, just in case.

Now, when the drive is attached to either Mac and shared from it, the other Mac can mount it wirelessly and even do Time Machine backups to it, without any problems. So, you know, the drive itself seems in order. But if I connect it to the AEBS, it kills the AEBS. AEBS firmware is also upgraded to latest, 7.3.1. I wanted to reduce the clutter on my desk by moving the HDD and its wires to the AEBS and am fairly frustrated with the inability to do so.

I repeat: the symptoms have nothing to do with Time Machine. With Time Machine off, and trying to access the drive over AEBS as a vanilla networked volume, AEBS dies in a minute or so. I suspect a firmware bug.

And then there's the speed.

My MacBook Pro is a first-gen Core Duo, meaning it can only use 802.11g, and can't be upgraded to use 802.11n. Still, 802.11g is 54 MBit/sec, right? Well, the fastest data transfer rate I could achieve moving data to the AEBS-attached HDD (before it crapped out) was a bit over 8 MBit/sec. Over a distance of a few feet between the devices. Not happy. I can understand that various protocol overheads would chip away at the bandwidth, but I can't understand getting about 15% of the advertised bandwidth. Not even if it were half-duplex.

Okay, what about Ethernet access? I tried connecting over Ethernet to see whether the AEBS crashing problem is inherent to Wi-Fi, or is also present over Ethernet (result: it is present over Ethernet too). The MacBook Pro said the connection is 1000MBit/sec - fair enough, as both it and the AEBS are supposedly Gigabit Ethernet. Both devices use speed and duplex autoselect based on the cable actually plugged in. So, with 1000MBit/sec theoretical maximum speed, what effective throughput could I get? 72MBit/sec, tops. It ain't the HDD that's slow - it's a 7200RPM drive with USB 2.0, and when connected directly to the computer it is pretty much screaming fast.

And then there are application compatibility problems.

My wife Kriszti speaks quite a lot with her aunt over Skype, and with AEBS she experiences much more stalled audio/video over Skype. I actually subscribe to two broadband connections: a 4MBit cable and also a 1MBit ADSL as a backup. I moved the old router to the 1MBit ADSL (which previously wasn't available over wireless, being only a backup). When Kriszti had repeated Skype problems, I told her to switch to using the old wireless router, and lo and behold Skype works okay for her again, so her Mac tends to remain connected to the old router's wireless network (so our machines aren't on the same network, which is also a source of inconveniences). So: cable ISP with old router = Skype okay; cable ISP with AEBS = Skype not okay; ADSL ISP with old router = Skype okay. There's a pattern here, although in order to be able to claim with 100% certainty that it's a problem with AEBS, I'd also need to test Skype with AEBS connected to the ADSL line, which I won't do now as it'd require too much uncabling/hauling/recabling.

So, we have a base station that crashes when you try to use a USB HDD with it, has disappointing speed figures, and apparently also causes problems for Skype traffic. The only really good point is that, compared to the old Taiwanese no-name router it replaced, it has a better range and implements WPA/WPA2 in a manner that Windows laptops can also understand; neither of these really means anything to me, but it certainly makes the nice girl next door (to whom we give free Internet access in exchange for occasionally babysitting our kids) a lot happier.

All in all, a disappointment.

Friday, March 21, 2008

XStream with Hibernate

People have been asking in the comments to my post on XStream about how to properly integrate Hibernate with XStream, as there are a few pieces of the puzzle that don't necessarily fit together perfectly. I answered in the comments, but Blogger annoyingly doesn't allow a lot of formatting in comments, so I decided to write a separate follow-up post; here it goes.

First of all, I use Hibernate 2. I know, it's old, but it does the work and I won't fix it unless it is broken. I expect most of the advice will also apply to Hibernate 3.

I'm aliasing all classes using XStream.alias(). The reason is that on the server side, some data model classes are actually subclassed for additional (although non-Hibernate-related) functionality. You might or might not need this.

This is however essential: We need to ensure that XStream will treat Hibernate lists, sets, and maps as Java lists, sets, and maps. I believe this is what helps avoid the infamous "com.thoughtworks.xstream.converters.ConversionException: Cannot handle CGLIB enhanced proxies with multiple callbacks..." problem. There are three things that need to be done:

Use XStream.addDefaultImplementation() to tell XStream to treat all Hibernate-enhanced collection classes as plain Java collections:

xstream.addDefaultImplementation(
        net.sf.hibernate.collection.List.class, java.util.List.class);
xstream.addDefaultImplementation(
        net.sf.hibernate.collection.Map.class, java.util.Map.class);
xstream.addDefaultImplementation(
        net.sf.hibernate.collection.Set.class, java.util.Set.class);

Finally, in order for XStream to actually handle these collections I needed to define and register some custom converters that are able to handle Hibernate collections as Java collections:

Mapper mapper = xstream.getMapper();
xstream.registerConverter(new HibernateCollectionConverter(mapper));
xstream.registerConverter(new HibernateMapConverter(mapper));

These custom converter classes are rather trivial; here are their definitions. All they really do is extend the XStream built-in collection and map converters, and declare their ability to handle Hibernate lists, sets, and maps:

import net.sf.hibernate.collection.List;
import net.sf.hibernate.collection.Set;
import com.thoughtworks.xstream.converters.collections.CollectionConverter;
import com.thoughtworks.xstream.mapper.Mapper;

class HibernateCollectionConverter extends CollectionConverter {
    HibernateCollectionConverter(Mapper mapper) {
        super(mapper);
    }

    public boolean canConvert(Class type) {
        return super.canConvert(type) || type == List.class || type == Set.class; 
    }
}

and

import net.sf.hibernate.collection.Map;
import com.thoughtworks.xstream.converters.collections.MapConverter;
import com.thoughtworks.xstream.mapper.Mapper;

class HibernateMapConverter extends MapConverter {

    HibernateMapConverter(Mapper mapper) {
        super(mapper);
    }

    public boolean canConvert(Class type) {
        return super.canConvert(type) || type == Map.class; 
    }
}

That's all I did and it eliminated all of my Hibernate+XStream problems - hope it will also help you.

Thursday, March 20, 2008

Attributes and items

So, here I am trying to further my metaobject protocol library. I'm now in the eat-my-own-dog-food phase of sorts, as - after having written a MOP for POJOs - I'm trying to write actual MOPs for some dynamic language implementations, most notably, Jython and JRuby. And pretty soon, I hit an obvious design problem (which is okay, really - this is still an exercise in exploring the right approach). Namely, lots of languages actually have two namespaces when it comes to accessing data belonging to an object. I'll refer to one of them as "attributes" (as that's what they're called in both Ruby and Python) and to the other as "items", which are elements of some container object. All objects have attributes, but only some objects (containers) have items. Some languages don't make a distinction, most notably, JavaScript. In JavaScript, all objects are containers and they only have items (and an item can be a function, in which case it functions as a method on the object). Other languages (Ruby, Python) will distinguish between the two; the containers are arrays/lists and hashes/dictionaries/maps. As a matter of fact, it helps to think of Java as having the distinction too - JavaBeans properties are the attributes, and arrays, Maps, and Lists have items.

I'd like to think that most people's mental model of objects actually distinguishes the two.

Now, to make matters a bit more complicated, in lots of languages the container API is actually just syntactic sugar. Give an object a [] and a []= method, and it's a container in Ruby! Give it __getitem__, __setitem__, and a few others, and it's a container in Python! Honestly, this is okay - as a byproduct of duck typing, one shouldn't expect there to be any sort of explicit declaration, right?

For ordered containers, most languages will also make it possible to manipulate subranges as well.

Bottom line is, I feel this is a big deal to solve in an interoperable manner, as the raison d'être of this library is to allow interoperability between programs written in different languages within a single JVM; I imagine in most cases the programs will pass complex data structures built out of lists and dictionaries to one another, so it feels... essential to get this right. It also feels like something that can rightfully belong in a generic MOP, as most languages do have the concept of ordered sequences and associative arrays. Of course, I might also be wrong here; it is also an essential goal to not end up with a baroque specification that contains everything plus the kitchen sink.

So, here I am, wondering whether this is something that can be made sufficiently unified across the languages to the point that if a Ruby program is given a Python dictionary, and it calls []= on it, it actually ends up being translated into a __setitem__ call. The goal seems worthwhile, and is certainly possible, but I'm not entirely sure how much of an effort it will take. There's only one way to find out though :-)
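Just to illustrate the kind of translation I have in mind, here is a rough Java sketch. Everything in it is hypothetical: ItemContainer is not an interface from my library (or from Jython or JRuby), PythonDict is a plain stand-in for a real Python dictionary, and rubySend is a toy substitute for Ruby method dispatch. The only idea it shows is that both languages' container protocols can be mapped onto one neutral pair of item operations.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these types exist in any real library.
final class ItemProtocolSketch {

    /** Language-neutral view of the "items" namespace of a container. */
    interface ItemContainer {
        Object getItem(Object key);
        void putItem(Object key, Object value);
    }

    /** Stand-in for a Python dict that only knows Python-style method names. */
    static final class PythonDict {
        private final Map<Object, Object> data = new HashMap<>();
        Object __getitem__(Object key) { return data.get(key); }
        void __setitem__(Object key, Object value) { data.put(key, value); }
    }

    /** What a Python-side MOP might do: expose the dict through the neutral view. */
    static ItemContainer wrap(final PythonDict dict) {
        return new ItemContainer() {
            public Object getItem(Object key) { return dict.__getitem__(key); }
            public void putItem(Object key, Object value) { dict.__setitem__(key, value); }
        };
    }

    /** What a Ruby-side MOP might do: map [] and []= onto the neutral view. */
    static Object rubySend(ItemContainer target, String method, Object... args) {
        switch (method) {
            case "[]":
                return target.getItem(args[0]);
            case "[]=":
                target.putItem(args[0], args[1]);
                return args[1];
            default:
                throw new UnsupportedOperationException(method);
        }
    }

    public static void main(String[] args) {
        PythonDict dict = new PythonDict();
        ItemContainer shared = wrap(dict);
        rubySend(shared, "[]=", "answer", 42);            // Ruby-style store
        System.out.println(dict.__getitem__("answer"));   // Python-style read
    }
}
```

Subranges, deletion, and iteration are deliberately left out here; those are exactly the parts where I expect the languages to disagree the most.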

Thursday, March 06, 2008

Running Half-Life 2 natively on Mac OS X

I ran Half-Life 2 natively on Mac OS X yesterday evening, in the same configuration I run it under Windows: 1280x1024 resolution, 8x anisotropic filtering, High Dynamic Range bloom rendering, all graphic settings set to "High". It ran like a charm. I'm impressed. Half-Life 2 was the only reason until today to keep a Windows XP partition on my Mac and use Boot Camp. But the rebooting was inconvenient, even if I only did it about once a week on a Sunday afternoon for a few hours of gameplay.

The magic behind this is called CrossOver Gaming.

Yesterday, I got an e-mail from Codeweavers where they announced CrossOver Gaming Beta. For those unfamiliar, CrossOver is a Wine-based Win32 API compatibility layer for Mac and Linux, allowing Windows application binaries to run natively under these operating systems on machines with x86 CPU architecture. I did test drive CrossOver earlier (that's why I was on their e-mail list), but it left me unexcited, not by its own fault, but really because I had no need for any Windows applications at the time.

I was intrigued by the CrossOver Gaming product though, as it does answer a need I have, namely playing a game I own without needing a reboot (VMware Fusion doesn't run Half-Life 2). Or needing a Windows license. The difference between "regular" CrossOver and CrossOver Gaming is that Gaming will see more frequent releases, will be more bleeding edge, basically less conservative as to what goes into it and frequently updated to accommodate game compatibility problems. You won't necessarily want that from software that you use to run, say, your Windows-only accounting package, but for games, this model makes perfect sense.

Well, it works pretty much as advertised on the box. It knows of a bunch of "supported" applications, one of them being Steam (and all games available through it). It will install a surrogate HTML library (lacking Internet Explorer, right, there's no Microsoft-shipped mshtml.dll in the system) that allows Steam's built-in store browser to work, then download and install MSXML redistributables and Steam itself. Steam then launched, I logged into my account, downloaded Half-Life 2: Episode One, crossed fingers, and launched it.

It runs just as it did under Windows, and I think I couldn't say anything more praiseworthy about this CrossOver product, even in its beta. I'm totally buying this when it comes out.

I'll briefly mention that there is another player in this space, Transgaming, but unlike Codeweavers, they forked off Wine and aren't donating code back upstream, so if you need to choose, supporting Codeweavers seems like the better option from the moral point of view. Also, Transgaming's Cedega product doesn't let users run unaltered Windows games on Mac, only on Linux. They don't offer Cedega for Mac; they have Cider, but that's not a runtime but rather a library that developers need to link against to produce Mac-runnable versions of their games written against the Win32 API. So in reality, CrossOver is the only solution for running a Windows game natively under Mac OS X. Fortunately, it seems to be a good one.

Tuesday, March 04, 2008

While waiting for the fourth season...

I avoid most forms of filmed science fiction, reason being that most attempts underachieve badly compared to written works. The quality of written sci-fi is rarely matched or even approximated by movies and TV series. Notable exceptions are few, and the finest filmed sci-fi for the last few years is without a doubt the reimagined Battlestar Galactica (Firefly would come in close second). The final, fourth season starts in April, and I can hardly wait for it to start. In the meantime, there's a great two part interview with the show's creators about legal system, torture, and morality (part one), as well as economy and politics (part two) in the series.

If that weren't enough, here's a gorgeous Battlestar Galactica reinterpretation of Last Supper on Flickr.

Friday, February 22, 2008

Microsoft Open Source

I was surprised to discover today that Microsoft has a hosting site for Open Source projects (analogous to, say, SourceForge.net or GNU Savannah): witness Codeplex. To my further surprise, there's also an Open Source community site at Microsoft, named Port 25. I haven't got time to investigate either of them more deeply yet, but plan to do so in future. This is intriguing.

(Found them both following Microsoft's Open Source Interoperability Initiative FAQ.)

Mind you, I was aware earlier that Microsoft has software released under OSI-approved licenses; there are more than a few Microsoft-backed projects hosted on SourceForge.net (or did they move to Codeplex since? Hm...). What I was not aware of, and what I believe is a big deal, is that Microsoft is now providing its own hosted infrastructure for Open Source project hosting and community discussion.

Thursday, February 21, 2008

Reap What You Sow

You won't catch me writing about politics too often, but I need to express my view on the declaration of independence in Kosovo. Actually, I'll mostly bother you with some of my family history, but the two are, for better or worse, somewhat intertwined.

As you might or might not know, I grew up on the territory of former Yugoslavia, in a small village in northeast Croatia, bordering on Serbia. My family lived in Croatia, but we had relatives in Serbia as well. My family's roots are in the Serbian Vojvodina province, which belonged to Hungary under the name Vajdaság until it was annexed into the Serbian-Croatian-Slovenian Kingdom (forerunner of Yugoslavia) as part of the breakup of the Austro-Hungarian empire after World War I. (In a way, Serbia then gained a province in the north not unlike how these days it lost one in the south.) As such, Vajdaság has a high (alas, dwindling) Hungarian population, and I come from this ethnicity.

I never felt any drawback growing up a non-Croatian in Croatia. Nobody in Croatia ever so much as made a remark about me being an ethnic minority. Not so in Serbia. Whenever I visited my grandmother in Novi Sad (Serbia) during summer vacation, I experienced strange things. She'd hush me to not speak Hungarian on the street or on the bus. She didn't speak it there either. The name plate on her door had her name spelt in Serbian (Serbian "Jelisaveta" instead of Hungarian "Erzsébet" for "Elizabeth"). It was clear you could get into trouble for being different. My whole experience of Serbia was - as far as I can remember - that people there are highly xenophobic and intolerant of their ethnic minorities.

Then came 1991 and the breakup of Yugoslavia. The Croatian region I lived in was overrun by Serbian paramilitary troops with full backing by Milosevic's Serbian state army. They ruthlessly drove away or slaughtered the non-Serbian population of the territories they occupied. My family fled with one car trunk worth of belongings when these thugs were approaching. We lived next to an improvised Croatian police station, and we later learned we were targeted as "Croatian collaborators" by paramilitaries because we were on cordial terms with the police officers. They broke into our home on a night after they occupied the region. I have no doubts as to our fate had they found us there.

Mind you, at the time police officers with handguns were the only armed force the just-born Croatian Republic could stack against the Serbian-controlled "People's Army of Yugoslavia", the biggest and most heavily armed force in the Balkans in 1991. They had all the chance of a snowflake in hell to defend our homes against the occupiers.

While Serbia was losing significant territorial influence in the breakup of Yugoslavia, it clamped down even harder on its own ethnic minorities, trying to prevent further loss of grip on its remaining territories, with the oppression in both Vajdaság and Kosovo growing year after year under the Milosevic regime. It culminated in 1999, when Serbia attempted to eradicate the Albanian minority (a minority when viewed against the overall population of Serbia, but a 95% majority in Kosovo) using its military. This led to the well-known NATO intervention, when Serbia was bombed by the US and its allies until its warlords lost the backing of the population and were overthrown in a revolution.

But the damage has been done. The Serbian state consistently mistreated and oppressed its ethnic minorities over several decades. After what they experienced under the Serbian regime for decades, ethnic Albanians of Kosovo wouldn't trust'em as far as they can throw'em. The Serbian state is now reaping what it sowed.

It's ironic, but I do actually believe that the recently elected Serbian government might be a modern European democratic government that would treat its minorities as a modern European democracy should. (Provided they don't assassinate their prime minister again for being too European...)

But it's simply too late.

There was a huge demonstration this evening in Belgrade. There were atrocities. Embassies and banks were burned. The prime minister spoke to the crowd, fueling it, and the police didn't stop the hooligans. It's sad how they still lay the blame for the situation on everyone except themselves and their decades of hostile politics. I have no illusions this will change soon. I have no doubts that the long oppressed Albanian people of Kosovo are better off in an independent state. They will finally have the chance to bring prosperity to the long neglected region. The region will finally have a government that feels it belongs to the land. As far as I remember, Kosovo was always extremely poor. Serbians have strong emotional ties to the region because Kosovo is the historical birthplace of the Serbian state and church, but aside from that, Serbia was a very lousy custodian of the region, not bothering to develop it, or help it develop, or even just refrain from actively hindering economic progress in it in recent history.

My father packed his two children and wife into his car on 20th August of 1991, and pressed the pedal to the metal until we crossed the border to escape certain death from Serbian paramilitary thugs. Dad spent the rest of his life in exile. Even after our former Croatian homeland was liberated, the six years of Serbian rule set it back economically, infrastructurally, and most importantly socially for decades - it still didn't recover as most young people, including me, departed the region and didn't go back, decimating the society's renewal potential. There was simply no place to go back to, as that land was no more the same land we left. So Dad didn't return either although I know his heart ached for an alternate reality where all of this didn't happen, the peaceful continuation of days of old, something that is not ours to experience in our lifetimes, taken away from us by force by aggressive neighbors' selfish geopolitical interests. I wish Dad was still alive to witness how those same aggressive neighbors are now in pain too; while it doesn't cure our wounds, he would certainly find some poetic justice in it.

Schadenfreude? Damn well yes, we're entitled to it.

Wednesday, February 20, 2008

Oh, the irony

I'm reading Vernor Vinge's "Rainbows End". At one point, he describes a day in the life of an old man who's been cured of Alzheimer's in the near future. Everyone is using wearable, ubiquitous, always-connected devices to access any data anywhere, and he's given a foldable electronic-paper-like device (rudimentary compared to what young kids are using) to access the web:

He wandered around the house, found some of his old books in cardboard boxes in the basement. Those were the only books in the entire house. This family was effectively illiterate. Sure, Miri bragged that many books were visible any time you wanted to see them, but that was a half truth. The browser paper that Reed had given him could be used to find books online, but reading them on that single piece of foolscap was a tedious desecration.

The irony? Rainbows End is available for free here legally, and I'm tediously desecrating it in my web browser :-)

As a matter of fact, I don't like the medium. It's not the first novel I read on my computer, and probably not the last (I read a few Cory Doctorow novels this way, and bought them all in book form since), but I much prefer holding a deadtree book in my hand for my night reading. Especially when I've spent the entire day in front of the said computer anyway. (But I have been a few times in a situation when I wished for Command+F to quickly go back to something while reading a deadtree...)

OTOH, Tor Books started a free ebook program "Watch the skies" recently (non-DRMed PDFs); John Scalzi's Old Man's War is coming out soon on it. Karl Schroeder's Ventus is also available for free. Neil Gaiman's American Gods will also be e-published for free availability soon. I sense a trend here.

Ignorance is bliss

"Hi. My name's Attila, and I write shitty code."

The latest Really Bad Practice I managed to implement was making some business-level code aware of its underlying infrastructure. In particular, I made it aware of Spring's ApplicationContext and such.

Ignorance is bliss, and this goes for code as well. A protocol handler unaware of transport peculiarities can be reused with any transport. Code that is unaware of memory management will automatically benefit from improvements in garbage collection.

The less your code knows about the context it is used in, the easier it is to reuse it in a different context, but even more importantly, the easier it is for the context to manage it as it is supposed to.

With dependency injection (DI) frameworks like Spring, making components aware of the framework's existence is bad, but it won't necessarily become apparent immediately. But when you want to implement something more involved, say, a graceful shutdown for your service, you'll suddenly no longer be able to have the infrastructure do the work for you. In my particular case, I could no longer rely on the dependency graph maintained by Spring after some of my components directly pulled some other components from the application context.
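Here is a small Java sketch of the difference, with the loud caveat that it does not use Spring at all: Context is a made-up, map-backed stand-in for an application context, and Transport, AwareHandler, and IgnorantHandler are hypothetical names. The first handler reaches into the context by itself, so no container could ever learn about its dependency; the second one declares the dependency in its constructor, where whoever wires it can see it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; Context is NOT Spring's ApplicationContext.
final class DiAwarenessSketch {

    interface Transport { String send(String message); }

    /** Made-up stand-in for a DI container's lookup facility. */
    static final class Context {
        private final Map<String, Object> beans = new HashMap<>();
        void register(String name, Object bean) { beans.put(name, bean); }
        Object lookup(String name) { return beans.get(name); }
    }

    /** Infrastructure-aware: the dependency on "transport" is hidden in the method body. */
    static final class AwareHandler {
        private final Context context;
        AwareHandler(Context context) { this.context = context; }
        String handle(String message) {
            Transport transport = (Transport) context.lookup("transport");
            return transport.send(message);
        }
    }

    /** Blissfully ignorant: the dependency is visible to whoever constructs it. */
    static final class IgnorantHandler {
        private final Transport transport;
        IgnorantHandler(Transport transport) { this.transport = transport; }
        String handle(String message) { return transport.send(message); }
    }

    public static void main(String[] args) {
        Transport transport = message -> "sent:" + message;

        Context context = new Context();
        context.register("transport", transport);

        System.out.println(new AwareHandler(context).handle("hello"));
        System.out.println(new IgnorantHandler(transport).handle("hello"));
    }
}
```

Both handlers behave the same when everything is up. The difference shows at shutdown: an outside party can tell from the constructor that IgnorantHandler must be stopped before its transport, while AwareHandler gives it nothing to go on.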

Of course it was a stupid thing to do. I usually know better.

As an excuse, let me say I only resorted to this in rather ugly situations. There are asynchronous callbacks from external systems, through mechanisms that make binding to the infrastructural context "normally" hard. And there's Java deserialization, the notoriously DI-unfriendly mechanism where you either resort to thread locals or statics (which reminds me of Gilad's new intriguing blog post "Cutting out the static", by the way). (Dependency injection in deserialized objects is something Guice user's guide will also admit being a problem for which the best and still quite unsatisfying solution is static injection.)

So yeah, I have the excuse of committing the faux pas when faced with a tough situation, but still. (Mind you, eradicating all Spring-awareness alone won't solve my problem of graceful service shutdown while it might still be waiting for asynchronous responses from an external system, but would certainly go a long way toward it.)

The lesson is however clear; it is often the path of least resistance to reach from your code down to a supporting layer, but it can easily come back to bite you when the said layer was meant to be invisible. You think you might need to expose only a bit of a plumbing, but as time goes on, you realize that if you continue, you'll end up either uncovering the whole goddamned kitchen sink, or having to reimplement some of it. Then it's time to finally notice what you're doing (better late than never, I guess), backtrack, and refactor; bury the infrastructure back to where it belongs, not visible at all from the higher level. It sure does make some fringe cases harder to implement, but the important thing is that it keeps the easy bits easy.

Monday, February 11, 2008

Tom Lantos died today

Tom Lantos died today. One less great Hungarian and one less great American in this world; I was a serious admirer of him and his work. Even if not always agreeing with all his views, I do believe he made the world a better place through most of the things he did. I remember being amazed by quite a lot of things he did, but I won't rehash them here - Wikipedia is as good a source as any for this. I distinctly remember him from two years ago when I saw in the news that the (then) 78-year-old member of Congress (elected 14 times, no less) was arrested for civil disobedience while protesting in front of the Sudanese embassy in Washington against violence in Darfur. I was proud. I'm sorry though that even his influence and chairing of the House Foreign Affairs Committee was not enough to move the US into ending the Darfur conflict. Maybe if he had been given a bit more time...

Isten nyugasztalja békében, Tamás!

Thursday, February 07, 2008

Laptops at risk at US ports of entry

This keeps resurfacing in the media every few months. This time, it's a Washington Post article about US Customs and Border Protection officers confiscating travelers' laptops (for an indefinite time - some people didn't get theirs back for more than a year, despite being promised they'd get them in 10 to 15 days), or making copies of data on them, and/or forcing the people in possession of them to reveal their logon passwords. Also, people objecting to the procedures are denied entry to the US.

Well, one more reason not to travel to US. At least, not with a laptop. Although, regardless of whether you have a laptop, they'll take all your ten fingers' prints when you enter, and that's also a rather strong cause not to. Over here, they take your fingerprints when you're taken in custody as a crime suspect. So depending on your cultural conditioning, having your fingerprints taken can be quite a humiliating experience. (I had my two thumbprints taken already on my previous US visits, and I detest the practice very much.)

Back to laptops and data.

As you might have seen from a previous weblog entry, I use FileVault on my laptop. Back when I used Windows, I used E4M for a similar purpose (although today I'd probably use TrueCrypt instead). FileVault is a 128 or 256-bit AES encrypted disk image for your home directory on Mac OS X. I even use encryption on my swap files.

I have very good reasons to keep all of this encrypted, reasons of both private and professional nature, that I do not wish to elaborate on further. If I were faced with the choice of handing over that data or being denied entry to US, I'd choose to not enter. Owners of some of the data that I keep on my laptop would certainly agree. (Yes, I keep data that doesn't belong to me but I'm trusted with it. If you work for a company in any significant position, chances are, you keep such data too).

Alternatively, in the near future, I can burn a Blu-ray disc with the contents of my home folder (encrypted), send it to my temporary US address by mail, and travel with the laptop erased (or quickly erasable). Which still doesn't save me from the prospect of having my laptop confiscated at the border just because they can.

I'm lucky, 'cause I can mostly avoid going to the USA if I don't want to. Some people, on the other hand, return home there; they don't have much choice aside from not leaving the country. Ugh.

Tuesday, February 05, 2008

Time Machine + FileVault experiences

I was reluctant to use Leopard's Time Machine "set it and forget it" backup because I also use FileVault (which basically mounts an AES-encrypted disk image in place of your home folder). The web was full of warnings that Time Machine does not work with FileVault, or that it does, but only backs up your home folder when you log out, and you lose the ability to restore individual files through the GUI and need to fiddle with manually mounting the backed-up images if you want to fish something out of them. Seeing, however, how undisciplined I was getting with my manual backup routine, I decided it couldn't be worse than having no backups, and went ahead and gave it a try.

At first, I was surprised to see that, contrary to what was advertised, it did actually back up the encrypted disk image that hosts my home folder. It did it every hour. Every hour, it'd push the 30GB disk image over to the other drive. That filled it up, well, rather quickly.

Digging around, it turned out that the reason for this is that I kept using the Tiger-created FileVault, which uses a single file for the disk image. And Time Machine will happily back it up. 30 GB/hour.

So, the next step was upgrading to the Leopard FileVault format, which uses a new "sparsebundle" disk image format. That is basically a folder with 8-MB files called "stripes" that hold the contents of the disk, plus some other files for tracking what's where. The ugly part is that in order to "upgrade" FileVault, you have to actually turn it off first (so it unpacks your disk image contents onto the main filesystem), then re-enable it. I left it to decrypt overnight (it probably only took an hour, but I left it there and went to sleep), then re-encrypted in the morning (which took 40 minutes for 15GB of content). And then did a secure wipe of the free disk space.

An immediate enormous benefit is that my disk image shrank from 30GB to 15GB. That's right: my old disk image took 30GB even when it was hosting only 15GB of content, and no amount of compacting would've taken it lower. And it wasn't because of filesystem slack - inspecting the image with Disk Utility showed that there was indeed 15GB in there reserved but not used.

Now it's only 15GB, as I would expect it to be, with another 15GB reclaimed on my HDD. Hooray.

Another enormous benefit is that I no longer have Time Machine push 30GB over FireWire every hour. Whenever I log out, though, FileVault will compact the disk image (as it did in Tiger), and then Time Machine will back up only those 8-MB stripes that actually changed, so the process is rather quick.
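To put a number on that, here is a minimal Python sketch of the bookkeeping. It is only a model, not how Time Machine is actually implemented: the function names and the example writes are made up for illustration, and the only figure taken from the text is the 8-MB stripe size.

```python
MB = 1024 * 1024
STRIPE_SIZE = 8 * MB  # stripe size mentioned in the post


def dirty_stripes(writes, stripe_size=STRIPE_SIZE):
    """Return the sorted indices of stripes touched by (offset, length) writes."""
    touched = set()
    for offset, length in writes:
        if length <= 0:
            continue  # an empty write touches nothing
        first = offset // stripe_size
        last = (offset + length - 1) // stripe_size
        touched.update(range(first, last + 1))
    return sorted(touched)


def bytes_to_back_up(writes, stripe_size=STRIPE_SIZE):
    """Upper bound on data copied when only the changed stripes are backed up."""
    return len(dirty_stripes(writes, stripe_size)) * stripe_size


# Hypothetical session: a tiny write at the start of the image
# and a 2 MB write that begins 9 MB into it.
session = [(0, 100), (9 * MB, 2 * MB)]
print(dirty_stripes(session))
print(bytes_to_back_up(session) // MB, "MB to copy instead of the whole image")
```

With a single-file image, the same two writes would mark the entire file as modified, which is exactly the 30 GB/hour behaviour described earlier.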

It is easy to understand why Time Machine doesn't back up the FileVault home directory while it's mounted - it would be too easy to back it up in an inconsistent state as data is shuffled across stripes. Of course, I wish Apple engineers had had more time to think about this, and solved it in a smarter way. I myself could tell them two better ways to handle this:

One: add a shadow file to the disk image while backing up to hold concurrent changes, merge changes into the image file upon backup finish. The underlying BSD foundation of the OS supports this. It would, however, probably create a perceptible temporary freeze of the system while the changes in the shadow file are merged with the disk image.

Two: create a similar encrypted disk image on the backup drive, mount it when the backup reaches the home folder, and just perform the whole Time Machine backup procedure between two disk images. I actually had a homegrown solution that did precisely this using rdiff-backup back on Tiger. In fact, when I first heard of Time Machine, I sort of hoped Apple would base it on rdiff-backup, and use this method to handle FileVault accounts.
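The first idea (the shadow file) is easy to picture with a toy copy-on-write overlay. This is a sketch in Python under loud assumptions: the class and method names are hypothetical, the image has a fixed size, and it tracks single bytes where a real implementation would track blocks. It is not how the OS implements shadow files (hdiutil's attach verb does have a -shadow option in this spirit).

```python
class ShadowedImage:
    """Toy copy-on-write overlay: the base stays frozen while a backup reads it."""

    def __init__(self, base):
        self.base = bytearray(base)
        self.shadow = {}  # offset -> byte value written since the backup started

    def write(self, offset, data):
        # Concurrent changes land in the shadow, never in the base.
        # Assumption: writes stay within the original image size.
        for i, value in enumerate(data):
            self.shadow[offset + i] = value

    def read(self, offset, length):
        # Live reads see the shadow first, then fall back to the base.
        return bytes(
            self.shadow.get(i, self.base[i])
            for i in range(offset, offset + length)
        )

    def snapshot(self):
        # What the backup copies: the image as it was when the backup began.
        return bytes(self.base)

    def merge(self):
        # Once the backup is done, fold the shadow back into the base.
        for offset, value in self.shadow.items():
            self.base[offset] = value
        self.shadow.clear()


image = ShadowedImage(b"hello world")
image.write(0, b"J")          # a change made while the backup is running
backed_up = image.snapshot()  # still the old, consistent contents
image.merge()                 # the change reaches the image afterwards
```

The merge step is where the temporary freeze mentioned above would come from: until it finishes, the base image is not safe to treat as current.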

rdiff-backup has the advantage that it can incrementally back up small changes in large files using the rsync algorithm (Time Machine copies the whole modified file each time), and my method also preserved this incremental backup property in FileVault accounts, on a per-file basis, preserving both filesystem and backup semantics. I guess they lacked one smart guy in the engineering division for Time Machine, who was probably busy helping the iPhone division make their deadline... Oh well.
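For the curious, the core of the rsync idea fits in a short Python sketch. This is a simplified illustration, not rdiff-backup's or rsync's actual code: the function names, the 1 KB block size and the use of SHA-256 as the strong checksum are my assumptions, and only full-size blocks are matched.

```python
import hashlib

MOD = 1 << 16
BLOCK = 1024  # assumed block size for this sketch


def weak(chunk):
    """Cheap checksum (a, b) that can be rolled one byte at a time."""
    a = sum(chunk) % MOD
    b = sum((len(chunk) - i) * x for i, x in enumerate(chunk)) % MOD
    return a, b


def signature(old, size=BLOCK):
    """Map weak checksum -> [(strong digest, block index)] for the old file."""
    sig = {}
    for index in range(len(old) // size):
        chunk = old[index * size:(index + 1) * size]
        entry = (hashlib.sha256(chunk).digest(), index)
        sig.setdefault(weak(chunk), []).append(entry)
    return sig


def delta(new, sig, size=BLOCK):
    """Describe new as ("copy", old block index) and ("data", literal bytes)."""
    ops, literal, i, rolling = [], bytearray(), 0, None
    while i + size <= len(new):
        if rolling is None:
            rolling = weak(new[i:i + size])
        match = None
        if rolling in sig:
            # Confirm the cheap match with the strong checksum.
            strong = hashlib.sha256(new[i:i + size]).digest()
            match = next((idx for s, idx in sig[rolling] if s == strong), None)
        if match is not None:
            if literal:
                ops.append(("data", bytes(literal)))
                literal.clear()
            ops.append(("copy", match))
            i += size
            rolling = None  # recompute for the next window
        else:
            literal.append(new[i])
            if i + size < len(new):
                # Roll the window forward by one byte in O(1).
                a, b = rolling
                a = (a - new[i] + new[i + size]) % MOD
                b = (b - size * new[i] + a) % MOD
                rolling = (a, b)
            i += 1
    literal.extend(new[i:])
    if literal:
        ops.append(("data", bytes(literal)))
    return ops


def patch(old, ops, size=BLOCK):
    """Rebuild the new file from the old file plus the delta."""
    out = bytearray()
    for kind, value in ops:
        out += old[value * size:(value + 1) * size] if kind == "copy" else value
    return bytes(out)


# Made-up 64 KB "file" with eight bytes inserted in the middle.
old = b"".join(hashlib.sha256(str(n).encode()).digest() for n in range(2048))
new = old[:5000] + b"inserted" + old[5000:]
ops = delta(new, signature(old))
literal_bytes = sum(len(v) for kind, v in ops if kind == "data")
print(literal_bytes, "literal bytes for a", len(new), "byte file")
```

Only the literal bytes (plus the small list of block references) need to travel to the backup, which is the property a whole-file copy cannot offer.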

Anyway, now with tolerably fine-grained FileVault backups, I'm happy. Yes, I need to log out in order for my home folder to get backed up, but I was doing this for a while anyway, using CCC or Disk Utility to copy the whole internal disk to external. I used to do backups once a week; now I get automatic backup of the system every hour (which could come in handy if ever, say, a software install goes awry; never happened on Mac with me before though), and automatic backup of my home directory whenever I log out (which is not less frequent than once per week). Of course, the majority of my machine's state change happens in my home folder, so having its backup be more frequent than the system backup would of course be preferred, but such are compromises - I can't afford to run without FileVault.

(You might ask what happened with my homegrown rdiff-backup solution? It fell victim to my switch from a PowerPC to Intel Mac, as it would've required me to recompile a bunch of GNU stuff from source (rdiff-backup and its paraphernalia) which I didn't have time to do at the time of the switch, so it fell into oblivion...)

Monday, January 14, 2008

Kurt Gödel: hacking the U.S. constitution

I was wandering through Wikipedia yesterday, and at one point ended up reading the page on Kurt Gödel. Gödel's achievements in the field of logic are indispensable to modern mathematics, and his incompleteness theorem has very far-reaching implications in disciplines other than mathematics¹.

It is safe to say he was your typical deep thinker, an inward-facing, not-too-closely-in-touch-with-reality type of person. This snippet from the page made me laugh real hard, because it so perfectly illustrates certain aspects of a math nerd. Listen:

Einstein and Morgenstern coached Gödel for his U.S. citizenship exam, concerned that their friend's unpredictable behavior might jeopardize his chances. When the Nazi regime was briefly mentioned, Gödel informed the presiding judge that he had discovered a way in which a dictatorship could be legally installed in the United States, through a logical contradiction in the U.S. Constitution. Neither judge, nor Einstein or Morgenstern allowed Gödel to finish his line of thought and he was awarded citizenship.

(emphasis mine)

Does anyone know whether he was allowed to finish his line of thought at some other time? (Not that I'd be personally interested in executing the idea, mind you.)

¹ Including the fact that the free will of any mind is just an illusion it has because it can't contain a complete model of itself that it could use for correctly predicting its own behaviour in advance. Enjoyable presentations of the incompleteness theorem for the layman include Douglas Hofstadter's "Gödel, Escher, Bach" (if you're 16 or older), or Raymond Smullyan's "The Lady or the Tiger?" (if you're under 16).

Sunday, January 13, 2008

Sim City goes open source

The source code for the original Sim City (the game responsible for countless hours I spent sitting in front of a computer when I was 16) has been released under GPL v3. If you ever played it (you did, right?), you had to admire all the cross-interaction of your planning decisions and certainly wondered about the underlying mechanics. Well, now you can read it first hand!

One of the insightful quotes from the announcement:

The modern challenge for game programming is to deconstruct games like SimCity into reusable components for making other games!

That is a very important point, and it illustrates well the unique aspect of open source software: its opportunities for reuse grow exponentially. (Even if I personally believe that licenses more liberal than the GPL contribute more towards this.)

Tuesday, January 08, 2008

Java 6 on Leopard. Well, almost

So, it looks like there's finally a developer preview of Java 6 for Mac OS X Leopard. Am I happy? Nope. Why? Here's why:

CPU bits | PowerPC         | Intel
32-bit   |                 | Machines I have
64-bit   | Machines I have | Machines running Java 6

It only runs on Macs with a 64-bit Intel CPU. Now, I have a 64-bit PowerPC Mac, and I have a 32-bit Intel Mac, but no 64-bit Intel Mac, so no cake for me. Darn. Hopefully the final release will run on all hardware that Leopard itself can run on.

Boot Camp experiences

So, I used Boot Camp (2.0, the one that comes with Leopard) to install Windows XP on a small (16 GB) partition on my MacBook Pro. The reason being that Half-Life 2 won't run under VMware Fusion...

I'll summarize the experience here, with emphasis on problems I encountered and how to get around them.

First, the good things. In contrast with older Boot Camp versions (1.0 and 1.1 for Tiger), you no longer need to burn a CD with Windows device drivers for Apple hardware. They are now included on the Leopard install DVD, so after Windows is installed, you just need to pop the Leopard DVD into the drive, and let the Windows autorun feature start up the driver installer from it. It will also install Apple Software Update in Windows, which will presumably keep the drivers up to date (there are no updates right now, so I can't verify this. It will offer to install QuickTime and iTunes though...). Mac keyboard extras work perfectly (volume control buttons, brightness control buttons, eject button). All hardware (graphics, sound, wireless) seems to be perfectly supported. The trackpad gestures also work as expected. I only wish they could have made Windows recognize Cmd+Tab instead of Alt+Tab for application switching...

Now for the first obstacle: Windows would not display an image on the external display attached to my MacBook Pro no matter what I did. Others reported the problem on various message boards, with different proposed solutions. After several dead ends, I went exploring on my own. (Note: this solution works for MacBook Pro machines with an ATI chipset.) I ended up downloading the ATI Catalyst Software Suite. It unpacks into the "C:\ATI" folder by default. Within that is the "C:\ATI\CCC" (for "Catalyst Control Center") folder, with a "setup.exe". It will tell you that it needs the .NET 2.0 runtime to run. You can get it here.

So, install the .NET 2.0 runtime, then the Catalyst Control Center (CCC). Don't try to install the full ATI suite, as it will not succeed; only install CCC. Launch CCC once it's installed. There is a tab where you can enable/disable various outputs and specify their order. It turns out the ATI chip in my MacBook Pro has 3 outputs (the built-in display and two externals), of which two are enabled (the built-in and one external), and one external is disabled. The trick is to disable the default enabled external output, and enable the one that was disabled! Voila, the external display lights up! (Regardless of whether you're connecting directly via DVI, or, as I do, through a DVI-to-VGA adapter -- there were reports on the net claiming only DVI displays will work. Fortunately, this is wrong.) Finally, you can use CCC to swap the order of the displays, which is what you'll want to do if, like me, you want the external display to be the primary one (the one carrying the Start button and the tray).

That's all there is to it.

Second obstacle: with a newly created Windows partition, Spotlight had problems after booting back to Leopard. It started indexing the partition, and never finished. It was pegged at "3 minutes remaining" for about a day, and couldn't be used for searching during this time. This was particularly painful for me, as Spotlight in Leopard is so much improved that I use it all the time, especially as a lightning-quick application launcher. The solution was to disable Spotlight for the Windows partition. I guess you could do it using Spotlight preferences, but I did it by typing

sudo mdutil -i off /Volumes/Windows\ HD

from Terminal. (Of course, your Windows partition might be named differently, e.g. "NO NAME" instead of "Windows HD".) This immediately stopped indexing, and Spotlight was usable again. This might be a problem for you if you keep some content you wish to search on your Windows partition, but since I only keep a few games on it, it's not a big deal for me.
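Since the volume name contains a space, it has to reach mdutil as a single argument, whatever the partition is called. If you'd rather not think about escaping, here is a small Python sketch that builds the command line for any volume name. The helper name is made up, it assumes the volume is mounted under /Volumes, and it only prints the command rather than running it.

```python
import shlex


def mdutil_disable_command(volume_name):
    """Build a shell-safe command that turns Spotlight indexing off for a volume."""
    path = "/Volumes/" + volume_name  # assumption: the usual mount point
    return shlex.join(["sudo", "mdutil", "-i", "off", path])


print(mdutil_disable_command("Windows HD"))
print(mdutil_disable_command("NO NAME"))
```

shlex quotes the path instead of backslash-escaping the space; both forms hand the shell the same single argument.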

Other than these two issues, I experienced no problems and am a happy camper (pun intended).