Thursday, December 28, 2006

No H.264 playback on Sony Ericsson W850?

My new gadget, the Sony Ericsson W850, can play back video (in addition to recording video, taking photos, playing music, and, only incidentally, making phone calls). It has a 320x240 screen (same as the gen-5 iPod), and according to its freely available tech specs, it supports "MPEG-4 Simple Visual Profile Level 0, H.263, H.264 Baseline Profile Level 1". My experience, though, is that it can't play back H.264 - at least it didn't play back anything I could produce. I tried encoding a few-minute clip from a DVD using HandBrake, as well as converting another few-minute DivX clip using iSquint, with the same results from both: of every combination of settings I tried in the two programs, the ones using vanilla MPEG-4 produced a video file that plays fine on the phone, while the ones using H.264 produced a video file that gives a black screen on the phone (although the audio plays fine).

I have to stress that I'm not saying the device doesn't play H.264. I'm just saying that several video files converted by the above two programs with the H.264 option selected didn't play. I could've run into some profile capability violation or something similar - I'm not that knowledgeable in video encoding. Some people say I should try experimenting with command-line ffmpeg. Honestly, I won't. It's not that important, really. While I can find time to listen to music on the go, I really can't imagine wanting to watch any video content squinting at the 2" screen. The amount of time I decided to invest in this went only as far as trying the comfortable GUI options, the motivation being "because it was there." If anyone has more success than me, though, please let me know.

Wednesday, December 27, 2006

Another kludge in Java - varargs

Varargs in Java. Seriously, who needed this?

You're probably aware of my position on Java Generics, how I view their implementation to be a horrific kludge, a compromise made in name of backward compatibility where the backward compatibility issues are in fact nonexistent.

Now, here's a new discovery - vararg methods. Someone asked on the Rhino list whether we can support them, and I answered that as long as he passes the vararg argument explicitly as an array, then yes. I decided to investigate the problem further within a codebase I'm truly familiar with - FreeMarker's BeansWrapper. I quite quickly came up with a scheme for supporting non-overloaded vararg methods. However, for overloaded ones, the exercise becomes, shall we say, a bit more involved. As I had to discover, the overloaded method resolution algorithm from the Java Language Specification (JLS) 2nd edition was seriously rewritten into a three-stage algorithm in JLS 3rd edition, of which the "old" (mathematically quite elegant) algorithm from JLS 2 became the second stage. I probably need some additional time to grasp it, but it looks quite horrible now. If you wish, compare:

"Method Invocation Expressions" section in JLS 2 

with

"Method Invocation Expressions" section in JLS 3.

Of course, the chapters regarding type conversions (required, as overloaded method resolution relies on the definition of "method invocation conversions", a subset of conversions that is allowed between actual and declared parameter types when invoking a method) are also substantially rewritten to cope with boxing, unboxing, and generic types.

And the worst part? Varargs are yet another Java the Language feature implemented without a Java the Virtual Machine support. They're purely syntactic sugar, with vararg argument being an array on bytecode level. Several annoying things:

  1. Reflection is unaware of it. You can't just pass extra arguments to the method; you have to pass the varargs arguments in an explicitly allocated array.
  2. No automatic object-to-primitive conversion when using reflection. If the vararg method's type is "int...", you must pass an int[], not an Integer[]. Compare this with the automatic conversion for regular argument types.
  3. Unintuitive behaviour. Quick, should the following example print 1 or 0:
public class Test {
  static void x(Object... i) {
    System.out.println(i.length);
  }
  public static void main(String[] args) {
    x(new Object[0]);
  }
}
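To illustrate points 1 and 2, here's a small self-contained sketch (the class and method names are mine, purely for illustration) of what invoking a vararg method through reflection actually requires:

```java
import java.lang.reflect.Method;

public class VarargsReflection {
    // A hypothetical vararg method, just for illustration.
    static int sum(int... nums) {
        int s = 0;
        for (int n : nums) s += n;
        return s;
    }

    public static void main(String[] args) throws Exception {
        // Reflection sees the vararg parameter as a plain int[] parameter:
        Method m = VarargsReflection.class.getDeclaredMethod("sum", int[].class);
        // m.invoke(null, 1, 2, 3) would throw IllegalArgumentException;
        // the arguments must be packed into an explicitly allocated int[].
        // An Integer[] wouldn't work either - there's no object-to-primitive
        // conversion for the vararg array itself:
        Object result = m.invoke(null, (Object) new int[] {1, 2, 3});
        System.out.println(result); // prints 6
    }
}
```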

Transparent Desktop Trick

Take a photo of what's behind your computer screen, and set it as your desktop wallpaper. With a little care, the results are amazing. See it here.

Labels

So, Blogger got out of beta and finally added a labelling feature, something I'd wished for for quite a long time. I decided to label my posts to sort them into categories. The categories are (in decreasing order of number of posts at the moment):

software is for anything related to software development in general: Java, Open Source, concurrency, programming language theory. Stuff you'd want to read if you're a fellow software developer.
side interests is for serious stuff that I'm interested in as an amateur: quantum mechanics, astronomy, math, society, politics, etc.
mac is for stuff you won't care to read if you aren't interested in Apple computers.
personal is for writings of a personal nature.
fun is for art/entertainment/humour things I came across and want to share with you.
blog is for very low-volume, blog-related announcements (like this post).

I tried to keep the number of categories to a reasonable minimum, trying to partition the content into as few of them as possible, taking into account possible reasons people read the blog (i.e. it's different for those who track what I do in the software industry, and again different for friends and family).

The links above have associated RSS feeds, so you can choose to syndicate in your reader only those categories you care about. This will probably cause less frustration for me in the future as well, as I was sometimes reluctant to post a more personal writing because I didn't want to decrease the signal-to-noise ratio for the (I presume) majority of people who only care about my software-related posts. Heck, I might even post some personal writings in Hungarian now :-)

Friday, December 15, 2006

MacHeist

Via Lifehacker:

The MacHeist Bundle is a bundle of 10* highly popular Mac shareware applications. This week only (which means for the next 2 days and 18 hours now), you can purchase them in a bundle for $49 instead of their combined list price of $308.

Additionally, 25% of your purchase goes to charity.

What are you waiting for?

*To be completely honest, it's 9 apps at the moment. The 10th app is TextMate, and it'll get unlocked (meaning a license code will be sent to everyone who purchased the bundle) if the total raised for charity reaches $100k. It's currently at $67k, so there's a realistic chance of getting TextMate as part of the bundle, too.

Tuesday, December 12, 2006

AAC is better than MP3, Walkman instead of iPod, and what Think Different means to me

Fun fact I learned today: AAC encoding sounds better than MP3 at the same bitrate. Seen here. Re-ripping my CDs right now.

I also bought myself a new cellphone, a Sony Ericsson W850i, and despite some initial concerns, the gadget is just brilliant. It has a very decent built-in digital music player (Sony markets them under its Walkman brand) and comes with a 1GB memory stick (I can plug in a bigger one, up to 4GB, and am actively considering buying one), and I can just throw music files from my iTunes library at it using Mac OS X's built-in Bluetooth File Exchange app, and it'll nicely index them and add them to its own library. It plays both MP3 and AAC, and recognizes Apple's proprietary iTunes metadata. Just sweet. Guess I'll defer buying myself an iPod for now. Kriszti and I are going on a longish trip to Graz for New Year, so we'll see how it holds up under prolonged use, battery-life and overall-experience-wise. The phone's contact list and calendar also sync with the Mac's Address Book and iCal using iSync (after buying a plugin online for 1.43 GBP), a critical feature for me.

As a matter of fact, I initially bought a Samsung Z560, since it's the only phone on the continent at the moment offering HSDPA (think GPRS, only at 1.5MB/s!), but I had to return it to T-Mobile after an hour of use and a short, frantic search on discussions.apple.com, once I realized that it can't sync with a Mac and likely never will. (Why? Because apparently Samsung has several divisions producing cellphones, each of them uses a different sync protocol, and all of them are proprietary...)

And this brings me to the point that I'm slowly learning what the Apple slogan "Think Different" might mean to me. It means "shake off your Windows-user assumption that any peripheral device will work with your computer". I bought a laser printer earlier this year that only works with the Mac after installing some cross-compiled Linux drivers. I bought a cellphone that doesn't sync with the Mac (fortunately I could exchange it for another). I've learned the lesson. I'm shopping for a joystick now, but I'm not even looking at anything that doesn't feature "Mac compatible" on the box. There's simply no such thing as "maybe it'll work". It's either "it is supported and then Just Works(tm)" or "don't even try". In very rare circumstances you can find stuff that isn't supported but works with more or less effort, but that's really a question of dumb luck. Like the Sony Ericsson W850i (with a 3rd-party plugin - although to be perfectly honest, Apple supports all Sony Ericsson phones; the W850i is just too new, and the next release of iSync will probably support it out of the box), or my Xerox laser printer (works if you dare install a cross-compiled Linux CUPS driver). Think Different = Don't Assume It'll Work. Get Used To Doing More Research Than When You Used Windows.

Wednesday, December 06, 2006

Dungeon Master miscellanea

My 16-17-year-old self spent more time playing Chaos Strikes Back on the Atari ST than my 32-year-old self dares to admit.

Turns out there's a whole fan subculture for Dungeon Master and Chaos Strikes Back out there. There's The Dungeon Master & Chaos Strikes Back Encyclopaedia with information about the games, as well as a download section with several reincarnations of them.

Most notable among them is one by a certain Paul R. Stevens, who created an exact replica of the game. Here's how he did it, in his own words, found here:


Finally, after many years, I got my hands on the binary executable for the game. I wrote a disassembler to turn it into human-readable op-codes and proceeded to translate it to C++ using the Microsoft Version 6.0 C++ compiler. Eight hours a day for six months. About 120,000 lines of pseudo-assembly language. Crazy thing to do. But it works.

Just insane. Clearly one of the highest examples of computer nerdiness I've ever come across. Hats off. The biggest achievement is that the resulting code is fairly cross-platform; there are distributions for Windows, Linux, and Mac OS X. And I can certify that the gameplay feel is identical to what you get playing it on an original Atari ST. Well, except it loads much faster. And the pixels look blockier on these damned modern high-res displays than they did on a TV screen back in the day. And the Mac OS X port for some reason doesn't deal well with keymaps, so you lose the ability to strafe from the keyboard, which can make the game tedious at times. The Windows version fares better in this regard, and you can even assign mouse clicks on particular screen coordinates to keys on the keyboard. As anyone who played the originals will attest, this is an invaluable feature, as it is much faster to hit 6,4,4,spacebar than to click on four icons on the screen to cast a fireball :-)

Then there's Dungeon Master Java by Alan Berfield, another cross-platform version of the Dungeon Master engine, written in Java with custom raytraced art, giving the game a bit more polished feel.

And if you look around the net a bit more with a search engine, you can also find an incredible amount of custom-designed dungeons that fans are creating.

Jeez, now if only I again had as much time as I had 16 years ago...

Saturday, December 02, 2006

Poor Man's Message Prioritization

Imagine that you have a distributed enterprise system held together with messaging middleware (JMS, to be specific). Imagine that it's processing a steady load of work in bulk, but every so often there is some time-sensitive operation, and your system needs to expedite a message or two to meet its time constraints.

If you're lucky, your chosen JMS provider supports message prioritization. Unfortunately, it turns out many do not - they simply ignore the priority parameter in MessageProducer.send(). The one I happen to be using also does not support prioritization. Switching to a different provider turned out not to be an option, unfortunately - after spending a nontrivial amount of time (about a month) testing a few others, both open source and commercial offerings, none had both the performance and the high availability of the one we originally used.

What I did to overcome the lack of prioritization was to "invent" (quote marks as quite possibly other people have had the same idea already) what I'm referring to as "poor man's message prioritization": instead of using a queue FOO, I now use two queues, FOO and FOO_H (for "H"igh priority), and split the JMS sessions used to process the FOO queue into two separate sets of dedicated processors for FOO and FOO_H, typically with only a few sessions servicing the "_H" queue, i.e. a 95%/5% split.

What's important to realize is that what's implemented here ain't message prioritization. It's just separation of traffic. Logically, the queues FOO and FOO_H are treated from the system's point of view as a single queue, and I refer to the individual "physical" queues as "lanes". Sometimes I refer to this technique as "laned JMS".

It does, however, create the illusion of prioritization as long as the ratio of prioritized to nonprioritized traffic is lower than the ratio of threads processing prioritized traffic to threads processing nonprioritized traffic. It's a workaround for an annoying missing feature in the JMS provider. I won't pretend I'm happy with this - it works, and the implementation I came up with turned out to be quite clean, but it's still more work for me and more complexity in my code; it should've been taken off my back by the JMS provider's vendor.

As for the technical implementation, it might sound tedious to retrofit this across the whole codebase, but it's actually not that awful:


  • On the message-producing side, it's sufficient to write a wrapper implementation of the JMS Connection interface that is mostly pass-through, except it creates wrapper Sessions, which are themselves mostly pass-through, except they create wrapper MessageProducers, which are again mostly pass-through but implement the send method with a priority parameter so that if the priority is less than 6, the message goes to the base queue; otherwise, it goes to the "_H" queue. This way, client code that sends messages just keeps using the Connection.createSession(), Session.createProducer(), and MessageProducer.send() APIs with priority parameters, blissfully unaware of the laning in the background.

  • On the message-consuming side, it'd be tricky to implement this across the board, but luckily all my JMS-based message consumer services build on a common codebase that I modelled after the Servlet container/Servlet API model - basically, I have a "JmsServiceContainer" (analogous to the Servlet container) that deals with session management and other JMS plumbing, and a "JmsService" interface (analogous to the Servlet interface) whose implementations are more or less factories for session-specific MessageListeners that do the actual work on a single message. So I only had to write the enabling code for laned processing in the "JmsServiceContainer" - basically, assign the same JmsService to both lanes of a queue. This way, the "JmsService" implementations also remain blissfully unaware - they're just creating message listeners, and don't care that the messages for those listeners now come from two different queues.
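On the producing side, the routing decision itself boils down to a tiny rule. Here's a sketch of it as a pure function (the class and method names are hypothetical; in my real wrapper, this choice lives inside the pass-through MessageProducer's send method):

```java
public class LaneRouter {
    // Messages with JMS priority 0-5 go to the base queue,
    // priorities 6-9 go to the "_H" high-priority lane.
    static String laneFor(String queueName, int priority) {
        return priority < 6 ? queueName : queueName + "_H";
    }

    public static void main(String[] args) {
        System.out.println(laneFor("FOO", 4)); // prints FOO   (bulk traffic)
        System.out.println(laneFor("FOO", 7)); // prints FOO_H (expedited traffic)
    }
}
```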

Business trip DTD


<!ELEMENT business-trip (
airport,
(office,drink?,dinner,drink+,shower,sleep,breakfast?)+,
airport
)>

There were efforts to fit a new element, sightseeing, into a revised specification, and to include it in business-trip's content model with an optional occurrence indicator; however, usability research has shown that real-world documents would almost never contain an instance of it, mainly due to external forces that work to keep document size at a minimum.

Wednesday, November 29, 2006

Broken LEGO (myth)

My wife found a broken LEGO brick in our son's room this morning. My kids apparently managed to break a LEGO brick in half, and with it the associated myth (well, at least a belief I held) that the damned things are in fact unbreakable in a domestic environment.

Absolutely coincidentally, several hours later today, Slashdot's cover page is graced with an entry "How They Make LEGO Bricks" that actually links to a BusinessWeek article "The making of a... LEGO".

(Very Useless Fact of the Day: I keep my laptop (when it sits on my desk connected to an external display, keyboard, and mouse) sitting on 4 LEGO Duplo bricks I stole from kids to provide better airflow to it.)

Tuesday, November 14, 2006

Rhino relicensed under MPL/GPL

The Mozilla Foundation decided to relicense the Rhino JavaScript interpreter under MPL/GPL dual licensing instead of NPL/GPL dual licensing. The reason is primarily that the Apache Software Foundation recently published a 3rd-party licensing policy in which the NPL was listed under "Excluded Licenses", therefore prohibiting several Apache projects (Cocoon, Batik) from redistributing Rhino in the future. The MPL, on the other hand, falls into Apache's "Binary Licenses Only" group, which will allow those projects to bundle Rhino binaries with their distributions. Also, Mozilla's lawyers concluded that Mozilla has the legal power to change the licensing from NPL/GPL to MPL/GPL easily, so it looks like the best solution for the time being.

I plan to release the latest stable branch as Rhino 1.6R5 soon. The binaries will be completely identical to 1.6R4, the only difference being that they'll be recompiled from relicensed source, thus the line number tables in classfiles will reflect the difference in length of the boilerplate license code on top of each source file. Apache folks can then include Rhino 1.6R5 instead of Rhino 1.6R4 binaries in their project distributions, or if they're impatient they can compile it themselves from the "Rhino1_6R3_PATCH" branch in the CVS. (Of course, CVS HEAD is now MPL as well.)

1 << 5

Today's the day when my life's duration has doubled for the fifth time relative to the last such event (counting my first birthday as the 0th). (With a bit of luck, I'm in for yet another such event down the road.)

For those less math nerd folks out there: I turned 32 today.

I don't usually care too much about arbitrary milestones in the flow of time, my own birthday not really being an exception, yet last night I couldn't help thinking about it (not being able to fall asleep for about two hours after going to bed - I've turned from a good sleeper into quite an erratic one lately). When trying to assess what has happened since the last birthday, it always seems life only brings gradual changes (the previous year carries a rather big exception; hopefully I'll be able to write about it sometime soon). However, on such a round-number occasion (both %100000 and 0x20), it made me wonder about everything that has happened or changed since I was half this age.

And I have to say, it's a hell of a lot of things. My 16-year-old self was a quite blissfully unaware gymnasium student in Croatia whose biggest problem was how to skip history classes to go hack on the Apple II machines in the school computer lab, and how to write games and fractal generators for the Commodore 64 and later the Atari ST. And girls. Were a problem too, that is.

Shortly thereafter, I endured a war and an exodus that completely uprooted me and my family, and we had to settle in a new country. A bit later I found true love, then attended university, and over the passing years grew into several different roles, including that of a husband (for more than eight years now), of a father (for more than seven years now, since my son Ákos was born; the role redoubled two years later with Zsuzsi), and of a (well, what at least feels like) reasonably respected IT industry professional. I'd really like to elaborate a bit on all of this here and now, but there are two big driving forces against it: (a) not wanting to bore you to death, and (b) the aforementioned three roles unfortunately don't leave me much time at present to write sentimental weblog entries about myself.

All in all, the previous 16 years were probably the most dynamic period of my life, past and future. Regardless, I'm looking forward to the next 32 years; stay tuned for the "1 << 6".

Wednesday, November 08, 2006

Java Notes

While searching a few days ago, Google accidentally turned up Java Notes, created by one Fred Swartz (who apparently teaches at University of Maryland University College). I poked around it a bit, and I must say I was pretty impressed by the quality of the examples I looked at; it indeed looks like nice teaching material for beginning Java programmers, one Mr. Swartz put many hours into assembling. I mean, just look at the exhaustive discussion of the algorithm for finding the maximum element of an array!

An especially nice additional touch is that the author has indeed donated all of this to the public:

Many textbooks show useful code examples, but ironically copyright them so you can't legally use them! All Java code examples in Java Notes and Java Basics are placed in the public domain.

Wednesday, October 11, 2006

Of empty bags, or how to coerce a null

Let's say you implement a language (we'll call it the "higher layer") atop an existing language (we'll call it the "lower layer"). Examples I work on are Rhino (JavaScript atop Java) and FreeMarker (a text-generator template language atop Java). Imagine that you wish to expose the lower-layer objects within the higher layer. What do you do? Well, you implement a wrapper. Your wrapper object will then do its best to disguise the wrapped object as being native to the higher layer (you can use the terms "marshal" or "coerce" instead of "wrap" if you want to appear more serious when talking about it).

Now, this'd be okay if it weren't for the fact that sooner or later there'll inevitably be a situation where there's a semantic ambiguity between the languages, where you can design certain behaviour in more than one way. And inevitably, whichever way it is designed, you'll have users who'd prefer it be designed some other way.

One such issue is how to coerce a lower-layer representation of a null (or nil, or nothing, etc.). Surprisingly, coercing a null the right way is not actually ambiguous at all, yet folks from time to time request it be done differently, or outright declare the current behaviour to be a bug.

The freshest example is here. Basically, this user would like it if, in Rhino, the JavaScript == operator found a null and a wrapper that wraps null equal. That is, if x implements the Wrapper interface and ((Wrapper)x).unwrap() == null, he'd like ScriptRuntime.eq(x, null) == true.

Now, this'd be utterly problematic. It reminds me of one of my first set theory classes, where the teacher stressed that an empty set is not equal to a single-element set containing the empty set (if it were, it would be impossible to construct the set of natural numbers axiomatically starting from set theory, but I digress). An empty bag is not equal to a bag containing an empty bag, and null is not equal to a wrapper disguising null. (However, any two wrappers that both wrap a null could be equal, if the higher-layer language considers null == null to be true for its own null representation, provided it has one at all.)

That said, you'd probably be better off if, when your custom wrapping code is asked to wrap a lower-layer null, it just returned the representation of the higher-layer null (if there is one - in Rhino at least, JS null is represented by Java null, so custom wrappers can do this quite easily) instead of creating a wrapper object for it. That's the most acceptable way to coerce a null.
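A minimal sketch of both points, with hypothetical stand-ins for the Wrapper interface and a wrapper class (these names are mine, not Rhino's actual classes):

```java
public class NullWrapperDemo {
    // Hypothetical stand-in for a lower-layer-object wrapper interface.
    interface Wrapper { Object unwrap(); }

    static class ObjWrapper implements Wrapper {
        private final Object wrapped;
        ObjWrapper(Object wrapped) { this.wrapped = wrapped; }
        public Object unwrap() { return wrapped; }
    }

    // The recommended coercion: short-circuit a lower-layer null straight
    // to the higher layer's null representation instead of wrapping it.
    static Object wrap(Object lowerLayerValue) {
        return lowerLayerValue == null ? null : new ObjWrapper(lowerLayerValue);
    }

    public static void main(String[] args) {
        Object bagInBag = new ObjWrapper(null); // a bag containing an empty bag
        System.out.println(bagInBag == null);   // prints false: the wrapper is not null itself
        System.out.println(wrap(null) == null); // prints true: null coerces straight to null
    }
}
```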

Friday, September 29, 2006

Gradual typing

Via Lambda the Ultimate: Gradual typing as the ultimate unification of static and dynamic typing. The basic idea is that the type system does not force you to specify types, but when type information is present, it is used for static checking. This way, you can reinforce the structure of your program by using static types, but you need not be hindered by them when you need more flexibility. The paper's references are worth reading in themselves, especially the (two-year-old, but just now discovered by me) "Static Typing Where Possible, Dynamic Typing When Needed: The End of the Cold War Between Programming Languages" by Erik Meijer and Peter Drayton. I was already quite delighted by the abilities of type inference in Scala as compared to Java, but the possibilities outlined in these papers are such that I just can't wait for them to get adopted by other mainstream programming languages on managed platforms.

I'm all for contracts in programming - the more intentions you can express in the source code in a form the compiler can understand and enforce, the more errors you catch early. Static typing is just a subset of the possible contracts you can enforce on your code. Also, contracts allow you to write much terser code, where eventual ambiguities arising from omitted declarations can be resolved by applying the contractual expectations in effect. However, the critical word here is "can" - it gives you the most flexibility when you "can use it" to express intents but aren't forced into "must use it", as you are with many of today's statically typed languages. To me, an ideal language and compiler would be one that:


  • Has broad expressive power for programmer intentions in forms of type declarations and contracts, but

  • doesn't force you into using them, however

  • enforces them when they're used, and ultimately

  • can clearly indicate which pieces of code are compiled as dynamically typed so I can periodically scan the code for unwanted type weakness.



Of all this, it'd already be a big improvement if at least plain type inference got into Java in the foreseeable future, just as it got into the newest C#. In the meantime, there's always Scala for more pleasant JVM work. Yes, I know I talk about Scala too much lately.

Wednesday, September 20, 2006

Manifold Destiny

The New Yorker published a rather fascinating (and rather long) piece about a month ago about the people involved in the proof of the Poincaré conjecture (guess we can start getting used to calling it a theorem now). I started reading it hoping to discover more about Grigori Perelman, and the article indeed provides plenty of information about him, but it also provides insight into the politics of the mathematical communities, and much more.

Here's the article authors' rather amusing account of the quite unconventional way they managed to meet with Perelman:


Before we arrived in St. Petersburg, on June 23rd, we had sent several messages to his e-mail address at the Steklov Institute, hoping to arrange a meeting, but he had not replied. We took a taxi to his apartment building and, reluctant to intrude on his privacy, left a book—a collection of John Nash’s papers—in his mailbox, along with a card saying that we would be sitting on a bench in a nearby playground the following afternoon. The next day, after Perelman failed to appear, we left a box of pearl tea and a note describing some of the questions we hoped to discuss with him. We repeated this ritual a third time. Finally, believing that Perelman was out of town, we pressed the buzzer for his apartment, hoping at least to speak with his mother. A woman answered and let us inside. Perelman met us in the dimly lit hallway of the apartment. It turned out that he had not checked his Steklov e-mail address for months, and had not looked in his mailbox all week. He had no idea who we were.

Friday, September 08, 2006

The 9/11 Report: A Graphic Adaptation

I have a copy of a book whose full title is "The 9/11 Commission Report: Final Report of the National Commission on Terrorist Attacks Upon the United States, Authorized Edition" on my bookshelf - I bought it a year ago in a bookstore at JFK airport. It's a hefty 567-page tome and a rather dense read (I'll admit that by now I've managed to read just a small part of it). Anyway, it's definitely not something I'd expect to see as a graphic adaptation, yet that's exactly what you'll find over at Slate magazine. (BTW, for whatever reason it opens at Chapter 13. If you want to go to the first page, just click the orange "9/11" at the top left of the page.) It turns out you can buy it in print, too.

UPDATE: fixed the link to Slate - sorry about the glitch...

JAOO

Just a heads-up that I'll be attending JAOO again this year. If you're around too and feel like meeting me, look for a guy who looks something like this, only with his hair given 4 more months to grow by October :-)

Tuesday, September 05, 2006

Server sent events

Came across a post on the Opera Web Applications Team blog saying Opera will soon support Server-Sent Events. Basically, you embed a special HTML element containing an event-source URL into your document; the browser then opens a persistent HTTP connection to that URL, the server streams events over it, and the browser interprets the events as commands for transforming the document's DOM tree.

This looks like a formalization of an early, rudimentary form of AJAX (not called that then, of course) that I stumbled across about four years ago: a page with two frames, one for the content, and another, 0-pixel-wide one into which the server kept streaming an endless HTML page with JavaScript commands that the browser executed in chunks as it received them, thus manipulating the DOM tree of the page in the other frame.

It sounds like a nice thing, with only a single worrisome bit - it doesn't scale. It'll definitely be nice for intranet applications, but I just can't imagine a web application on the Internet based on this technology serving several thousand clients, because that means thousands of open persistent HTTP connections at once. It may work if you code your server using non-blocking IO instead of the traditional one-thread-per-connection model, but I have the feeling that sooner or later you'll hit a resource limit, i.e. run out of file descriptors as the number of persistent connections goes up. True, this can be a concern with ordinary HTTP as well, but at least there the connections are usually short-lived and not kept open while idle. On the (yet an)other hand, it might still beat the current AJAX technique of periodically polling the server for events - depends on the frequency of the polls, I guess.

Anyway - seems like yet another Web 2.0 feature to keep an eye on.

Friday, September 01, 2006

Life's important questions, 5 year old's edition

My kids like to ask me questions after I've tucked them in for the night - a good technique for delaying sleep a bit longer. They know I love explaining the world to them, so it's a pretty sure thing. Last evening, my 5-year-old daughter suddenly asked, "Dad, how is plastic made?". I had to give it a bit of thought and then proceeded to explain the process to her as best I could, hoping to adapt it to the level of a bright, but nevertheless five-year-old, kid.

Halfway through my exposition, she impatiently interrupts me with "Okay, okay, but how does it end up being blue and shaped as a pony?". Ah, so that's the actual burningly important question :-)

Tuesday, August 29, 2006

Japanese Algorithm Dance

Japanese Algorithm Dance (compiled from various episodes of the Japanese children's TV show "Pitagora Switch").

Even if you think you don't get it, don't stop watching until you have seen the part with ninjas.

(Via Little Gamers)

Thursday, August 17, 2006

Scala might be Java platform's new hope

Last Saturday was rainy, and I spent a big part of the afternoon reading "Scala by Example". I was, and still am, rather blown away.

Like many other fellow professionals working with Java on a daily basis and having it pay our bills, I'm somewhat dissatisfied with the language, especially seeing the innovation going on in the C# language. C# acquired lots of interesting traits in its 2.0 and upcoming 3.0 releases; let me list some of them here without striving for completeness:


  • Anonymous functions, and even

  • lambda expressions

  • type inference, so instead of "String x = new String()" you can write "var x = new String()"

  • related to type inference, it now has anonymous types, which is a very nice feature for e.g. adding strong typing to the results of a SQL projection

  • generator functions, using the "yield" keyword. This allows certain quite useful forms of continuation-passing programming techniques, including coroutines, without it looking much like continuation passing at all, much like in Python.



And I could go on. Now, there's little hope for Java to achieve this; however, there's Scala - a language built on top of the JVM, whose compiler produces Java .class files that interface seamlessly with any other Java code, and which supports a bunch of the above features.

One of the quite mind-blowing aspects of the language is its support for generic types. You can explicitly require nonvariance, covariance, or contravariance for type parameters, and the generic types will act accordingly. E.g., a Stack[String] is by default not a subclass of Stack[AnyRef] by virtue of String being a subclass of AnyRef, but it can be, if Stack[T] is defined to be covariant in the T type parameter. Scala even has a type named "Nothing" that is the bottom element of the subtyping lattice, a subtype of all types. You can declare the empty stack to be of type "Stack[Nothing]" and have it be compatible with any other stack type without annoying compiler warnings. Contrast this with Collections.EMPTY_SET in JDK 1.5.
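For comparison, here's a small sketch (class and method names are mine) of what the lack of a bottom type costs you in Java: the raw Collections.EMPTY_SET constant can't be returned as a parameterized set without an unchecked-conversion warning, so JDK 1.5 added the generic Collections.emptySet() method to paper over this at every call site.

```java
import java.util.Collections;
import java.util.Set;

public class EmptySets {
    // The pre-generics constant: its raw Set type triggers an unchecked
    // conversion warning when assigned to any Set<T>.
    @SuppressWarnings("unchecked")
    static Set<String> raw() {
        return Collections.EMPTY_SET;
    }

    // The JDK 1.5 workaround: a generic method whose type parameter is
    // inferred (or supplied explicitly) at each call site.
    static Set<String> typed() {
        return Collections.emptySet();
    }
}
```

In Scala, a single empty stack of type Stack[Nothing] would be assignable wherever any Stack is expected, with no warnings and no per-call-site inference machinery.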

There's some real innovation going on in the area of libraries in the C# world. Things that come to mind are LINQ and CCR. Both of them heavily leverage the new syntactic aspects of C#. It occurred to me that if one were to "port" these libraries to the JVM, one should probably do it in Scala, and not in Java. In Java, you'd end up with lots of explicit interface declarations and anonymous inner classes that carry quite a bit of syntactic baggage; e.g., compare the verbosity of

filter(x => x * x)

with


filter(new LambdaExpression() {
    public float calculate(float x) {
        return x * x;
    }
});


Rather straightforward, isn't it? Also, I realized that Scala's for-comprehensions basically already implement the basic LINQ. You can write expressions like:

for (val p <- persons; p.age > 20) yield p.name

To obtain a list of names of persons older than 20 years. LINQ basically does the same for in-memory objects. In Scala, a for-comprehension works on any kind of collection that appropriately implements the "map", "flatMap", and "filter" methods. Built-in lists, streams, and arrays all do, which is quite a good start. Unfortunately, this is something that can't be easily extended to relational data sources, at least not until Scala allows the argument to "filter" to take a parsed abstract syntax tree of a lambda expression instead of a function object with the compiled bytecode for the said expression. The C# people had to resort to a trick here with their DLINQ implementation - they now have the C# compiler emit a representation of the AST of a lambda expression if the type of the variable it is assigned to is the special "System.Expressions.Expression" type. That way, DLINQ can analyze the lambda expression and convert it into a SQL query. As I said, Scala doesn't have this feature - yet. Being free of standardization lock-in and of legacy baggage, it could soon gain this feature as well.

In short: if you work with Java, consider whether Scala could fit your next project. You needn't give up any of your Java infrastructure and libraries, as Scala compiles to bytecode, and you gain the expressivity and productivity that a fully featured functional language plus a big pile of accompanying syntactic sugar can give you.

Friday, August 11, 2006

Jailhouse innovation

Via Bruce Schneier's blog:

A collection of 11 prison shivs confiscated over 20 years ago in New Jersey.

Think about these, and the adverse conditions they were made under, the next time you see someone's pocket knife being taken away from them at airport security. We can't keep weapons out of prisons; we can't possibly expect to keep them out of airports.


Not entirely unrelated: Prisoners' Inventions, an exhibition of reproductions, made by an incarcerated artist, of objects created by prisoners from the materials available to them. From papier-mâché dice to a tattoo gun.

Wednesday, August 09, 2006

Good concurrency article

Here's a good article on code concurrency on MSDN, written by Joe Duffy, a concurrency-obsessed Microsoftie whose (mostly concurrency-on-Win32 related) blog, where I found the reference, is otherwise here.

While the article talks about the CLR when it brings up examples, the discussion is actually generic enough to be of interest even if you write code that targets the JVM. It covers many aspects and pitfalls that you need to keep in mind when developing parallel(izable) applications. You even get a few theoretical equations you can use to calculate the optimal number of threads as well as the maximum achievable performance increase through parallelization. (Assuming you can figure out the values of the variables in those equations for your system... ahem...) At the bottom of the page, there's a box named "Recommended Reading" which links to three more articles that look like they're also worth giving a shot.
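I'd guess Amdahl's law is among those equations (I'm citing it from memory, not from the article): if a fraction p of your program can be parallelized and you have n processors, the best speedup you can hope for is 1 / ((1 - p) + p/n). A quick sketch:

```java
public class Amdahl {
    // Amdahl's law: the serial fraction (1 - p) caps the speedup no matter
    // how many processors you throw at the parallel part.
    static double speedup(double p, int n) {
        return 1.0 / ((1.0 - p) + p / n);
    }

    public static void main(String[] args) {
        // Even with 90% of the work parallelizable, 8 cores yield only ~4.7x,
        // and with infinitely many cores the limit is 1 / 0.1 = 10x.
        System.out.println(speedup(0.9, 8));
    }
}
```

The catch, as the article's caveat suggests, is measuring p for your actual workload.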

Also on MSDN, Jeffrey Richter (the guy who wrote "Advanced Windows", a book that taught me Windows programming back in 1995 (together with Petzold's) and was the first technical book I came across that was also full of good jokes) writes about the Concurrency and Coordination Runtime, a CLR library that promises to make writing concurrent code much easier than it is "the manual way" (read: managing your threads and synchronization on your own; y'know, that which used to be the only way). What's interesting is that he also points out how concurrency is especially important in robotics applications, where there is really a great deal of processing going on in parallel - all the data coming from different sensors, multiple motor instructions, etc. I also learned that Microsoft apparently has a product called "Microsoft Robotics Studio" targeted at writing software for robots. Hm...

Sunday, July 30, 2006

Blown up car, cornfield, midnight

Here's what I was doing last midnight.

I'm standing beside my car in the middle of a dirt road cutting through a cornfield. I don't dare go further, as my car has a sporty, quite low suspension and I'm afraid one of the holes in the bumpy dirt track ahead will prove too deep for it. My wife is trying to figure out how I could safely turn the car around. What are we doing in the middle of nowhere at this hour anyway? Why aren't we at least on some regular road, if not safely tucked away in bed?

The problem is, a few minutes earlier we came across the wreck of a blown-up car blocking the normal road, lying on its roof (or what remained of it), surrounded by a fire truck, several police cars, and a slew of firemen and policemen, impossible to drive past. On the advice of one of the bystanding villagers, we tried to get around it on a "back road", which turned out to be the aforementioned dirt track through a cornfield. When it proved impassable for my Mazda, we scrambled back to the road to wait for the firemen to eventually clear the wreck off the road.

My biggest problem though is that when we came upon this roadblock, we were only ten short kilometers away from our beds at my parents' house, after spending the last thirteen hours behind the wheel (only stopping for gas), covering nine hundred kilometers. I'm exhausted beyond belief. I can't actually believe this is happening to me.

(Not too relevant to the story: the car blew up because its fuel tank was leaky. The kid driving it, his driving license only four days old, miraculously survived it with only burns to his legs, as bystanders told me.)

I was royally pissed off after all the other things that happened to us earlier that day leading up to this. You might have noticed it took me thirteen hours to cover nine hundred kilometers. That's a very bad average speed, considering I drove most of it on highways. The short explanation is two words.

Italian highways.

We spent our vacation this year near Rimini in Italy. The people were kind, the sea warm, the food great. Everything was perfect, except for the Italian highways. We ran into congestion caused by accidents both ways. It took us more than an hour each time to get out of it, driving in lockstep. You'd say it's no fault of the highway system itself - accidents happen. That's true. However, there's also one 100% predictable, huge congestion that's coded into the system - the tollbooths near Venice. The idea is that you pay as you leave the section of the highway built and operated by one particular company. They do, however, have a throughput problem, which manifests itself in a nine-kilometer-long queue of cars before it. Yes, nine kilometers. In rows of three. Unfortunately, their business model completely defeats the function of the highway - the function, to me at least, being efficient road transportation. It took us an hour and a half of driving in lockstep from reaching the end of the queue to clearing the tollbooth. Together with the one-hour accident-caused congestion between Rimini and Bologna, this resulted in four and a half hours to cover the first two hundred and seventy kilometers of our trip back home. On a highway. Do the math.

By the time we reached the tollbooth, both my wife and I (and our seven-year-old son, too) were red with fury. The poor clerk at the booth ended up on the receiving end. It wasn't his fault, but he was the closest human representative of the company that operated the highway, the company we by that time fiercely hated for operating a highway where you stand in a queue by design with thousands of other cars for an hour and a half, or more if you're unlucky enough to run into an accident. And then you pay fourteen euros for the privilege. We told the clerk how this isn't a highway, this is an inhumane joke, a horror, not something belonging to western civilization, how even the Balkans are better, and we know because we had been to the Balkans earlier. He had a look telling us he hears this kind of testimony of customer satisfaction regularly.

So, I won't do that highway any service, as it didn't really do me any either: I'm telling anyone reading this that if you can avoid the highways going through the Venice-Mestre tollbooths, avoid them. You can't do any worse on secondary roads - they're free, and you'll probably have a far, and I mean far, far better average speed and fuel economy. You certainly won't fall two and a half hours behind your travel schedule. And this isn't an isolated event - I drove on that highway five years ago, and it was the very same experience back then as well.

Those extra 150 minutes then led to me standing with my car in the middle of the night in a cornfield, exhausted, ten kilometers away from the bed I should've been in at that time.

Update: To be completely honest, we did acquire some more delay after we left Italy, while driving through Slovenia, due to one case of bad signage (in Maribor), blindingly pouring rain, one detour (near Murska Sobota), and a general disagreement between our road atlas and physical reality regarding the existence of certain roads. The exploded car wreck and the cornfield were really just the finishing touches in that day's demonstration of God's sense of humour.

Tuesday, July 18, 2006

A grab-bag of two-week memories

This is really just a quick grab-bag about various things that happened with me in the last two weeks.

Been to Croatia a week ago. Went on workdays - Thursday and Friday (and Saturday). The kids were at my wife's relatives and my wife was working on those days, so my absence had minimal impact on the family :-) It also had minimal impact on work, as I took my laptop with me and stayed with a friend who could provide me with an internet connection, so I worked during the day and visited friends in the evenings. I even called into the work-related conference calls during those days, although calling US numbers from a Hungarian cellphone while roaming in Croatia earned me a call from T-Mobile customer relations the next Monday, asking whether there was a chance my phone was being used without authorization, as they had registered $300 worth of calls on it in a single day. Whoops, here comes my record phone bill. Anyway, it was really great to visit childhood friends and go swimming at sunset in the same lake I swam in every day of every summer of my childhood. This was my very first return to that lake in fifteen years - since I had to leave the region because of the then-war. Yes, I'm being sentimental. A bit.

I desecrated my MacBook Pro by installing Windows XP on a small 8GB partition a few days ago. I guess I just couldn't watch my copy of Far Cry gather dust on the shelf anymore, knowing that I didn't complete the game before I switched to the Mac. I have to report that all is peachy with it. It even takes advantage of the two CPU cores reasonably well (i.e. it runs with over 50% CPU utilization). After a few hours of installing XP, Far Cry, and all the patches for Far Cry, I even got a chance to play it for about an hour :-) Far Cry, BTW, is one hidden gem of a first-person shooter - it brought the same graphical excellence and gameplay experience to the market as Half-Life 2 did, only it hit the market about nine months earlier. It is somewhat underappreciated compared to HL2, though, unfortunately.

Been re-watching Futurama Season 1 lately as work-unwinding. It strikes me as a bit boring and predictable - well, maybe because I already saw it once, but still. I don't have the same feeling when rewatching The Simpsons. I'm pausing it a lot though, to spot the various easter eggs that are visible for only a second or so. Speaking of work-unwinding, I've been trying to cram in at least half an hour of cycling or running in the evenings lately. I noticed that a bit of physical activity after sitting all day in front of a computer really refreshes me for an evening of Uno with the kids :-)

Been listening to "Kite" and "Sometimes You Can't Make It On Your Own" much lately. Not going to explain it - if you're a close friend, you understand anyway.

Oh, and here's your movie recommendation: make sure you watch "Hoodwinked!". It's an indie CG animation feature "loosely based" on Red Riding Hood. Better said, it turns it a bit upside down and is absolutely hilarious. Its being indie shows in the CG models and animation, which are a few years behind the big-budget Hollywood state of the art, but believe me, it doesn't diminish the experience at all - the lovable and zany characters, the twisty story, and the jokes provide for over an hour of fully immersive fun. My wife generally doesn't like animation, but even she said this was a cool one.

On the professional side, a few things are moving. Just asked Norris Boyd today to pack up the current Rhino CVS HEAD and release it as Rhino 1.6R3 - the last release was over nine months ago, so it's about time we gave people a bunch of bugfixes in an officially blessed release. Watch the Rhino download page to see when 1.6R3 pops up for download. Shouldn't be more than a day.

I'm still trying to find myself a bit of time to learn a new programming language. No specific reason, just trying not to narrow my view too much to Java, and to try a language that forces me to adopt/discover new ways of thinking about software architecture. The only problem is, there are too many candidates. Lua, Haskell, Ruby, to name just a few. There's one particularly interesting new language that seems to be getting lots of publicity lately: Scala. A fully OO (every value is an object, no primitive/object type dualism as in Java) and at the same time fully functional language that also natively compiles to either CLR or JVM bytecode, allowing it to be used seamlessly within a .Net or Java system. This is quite an advantage, since it makes it possible to use any Java library with it - something I could maybe readily and easily introduce into daily work if need be. I sometimes find myself in a situation where an otherwise elegant idea takes quite verbose and/or awkward code to express in Java, and think that a language that is more friendly toward designing internal domain-specific languages (Ruby, as Martin Fowler demonstrated nicely during his presentation at JAOO last year), or that even comes standard with a macro preprocessor of some description (yes, I know C macros are evil, but I don't generally use them in evil ways), would really help reduce clutter. Maybe Scala? Don't know yet. I did download its full documentation - something to print out and then read on my vacation in Italy next week. My wife is going to kill me for it, though :-)

Saturday, July 01, 2006

First penguin to climb Mt. Everest

A few days ago, while spending our vacation at a campground near the Hajdúszoboszló Aquapark, sitting in the evening on a bench in front of our trailer home, my son Ákos asked with a fully serious face: "Dad, what was the name of the first penguin to climb Mount Everest?".

I just adore my big seven year old son.

We tried to discuss it briefly, and he suggested that some alpinist could actually tie up a penguin and take it with him to the Top O' The World, but I told him that the animal rights activists would have a word or two to say about that, so it's highly unlikely. On the other hand, we speculated that, as a rule, humans only climb Mt. Everest during summer, and maybe penguins visit it in the winter, when no human can observe them, that sort of weather certainly suiting them better. We envisioned a crowd of penguins in full alpinistic equipment gathering at the foot of the mountain, looking enthusiastically toward the challenges that await them. We rolled with laughter.

He then went on to invent a story about a penguin who left the South Pole as he wished to see the world, climbed Mt. Everest, crossed the Kalahari desert where he found a small town, settled in it, then went on to earn a living first by being a street musician (we recently watched "Cars", and they show One Man Band before it - maybe that's where the idea came from) and later by building a power plant, wiring the houses, and selling electricity to the city. There was also a wish-granting magic stick involved in the story, which the penguin used to wish all the penguin friends he had left behind were with him, but it backfired as they quickly died of dehydration in the desert (when I asked why our hero, also a penguin, didn't die as well, he explained that he travelled in a special aquarium car that kept him wet). Fortunately, for undisclosed reasons, the magic stick only transferred a third of his friends, so he presumably had a few more left back home.

I remember sitting on a chair placed to face the bench, rendered too unwilling to move by the slight fever accompanying my bronchitis (ideal development for a vacation, huh? Right, I thought so too.) and not very willing to talk either due to a sore throat, just listening to him, fascinated, as he tirelessly spun his story further and further for at least an hour. It finally ended when I told him we'd have to head for the showers and then for bed soon, lest we be totally consumed by mosquitoes. By that time the penguin (who remained nameless, or at least unnamed) was providing electricity for the whole world, but in the end he got homesick, went back to the South Pole (yes, I know strictly speaking they don't live at the Pole proper, but that's how he told the tale), and divided the wealth he had accumulated during his electricity-tycoon and street-musician times among his (remaining two-thirds of) friends.

A few days later, on a similar evening, while sitting on a bench and eating cherries, he asked me whether there's a limit to one person's creativity.

Guess what I answered him.

Wednesday, June 14, 2006

Accommodating everyday parallel computing

A bit more than a year ago, Herb Sutter published "The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software" in Dr. Dobb's Journal (which is incidentally the only printed computer magazine I buy). If you don't know this article yet, go and read it; I'll wait right here.

In case you're hasty and decided to skip it in spite of better advice, here's the summary: the trend over the years was that you could afford to keep writing ever less efficient code and have it be compensated for by increases in hardware processing power. That trend is over. The reason: CPUs today aren't made faster by increasing their linear processing speed anymore - CPU manufacturers have been encountering big technical difficulties on that route lately. Rather, CPU processing power is increased by adding multiple cores. The bad news is that your sequential, single-threaded algorithm won't automatically benefit from this, as it did from clock speed bumps in past years.

So, what's to do? Ignoring the problem is one solution, but computers with multicore CPUs are already quite widely available on the market. I'm typing this on an Intel Core Duo system myself. Running a test program with an infinite loop won't max out the CPU utilization on this gear. It'll push it up to 50%. I need to run another instance of the program to push the CPU all the way to 100%. If you ignore the problem, you'll produce software that can only utilize 50% of the CPU resources. People will inevitably find it slow over time and switch to a competitor's software whose author chose not to ignore the problem.

Another possibility is manual parallelization. Identify hotspots in your code. Rewrite them to be multithreaded. Use mutexes, locks, the whole shebang of multithreaded programming. If you have an array of 1 million elements (a big sound sample, maybe, or a picture), chunk it up and feed each chunk to a different thread. Even better than chunking it into two equal parts on a 2-CPU system, code a producer-consumer pattern to chunk it into many small pieces and feed them to threads adaptively. Of course, your code increases in complexity considerably. Of course, your program might incur a runtime overhead for spawning new threads. And then there's the fact that concurrent programming at a low level - using threads and locks explicitly - is hard. It is easy to get it wrong and create race conditions and deadlocks at runtime. So, is this solution ideal? Far from it.
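A minimal sketch of the chunking variant (using java.util.concurrent from JDK 1.5; all names are mine): square every element of an array in place, one contiguous chunk per pooled thread. Even in this easy case - no shared mutable state between chunks, no locks needed - you can see the bookkeeping creeping in.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ChunkedTransform {
    // Manually parallelized in-place transform: split the array into one
    // contiguous chunk per thread and square each element.
    static void squareAll(final float[] data, int nThreads) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        int chunk = (data.length + nThreads - 1) / nThreads; // round up
        for (int t = 0; t < nThreads; t++) {
            final int from = Math.min(t * chunk, data.length);
            final int to = Math.min(from + chunk, data.length);
            pool.execute(new Runnable() {
                public void run() {
                    for (int i = from; i < to; i++) {
                        data[i] = data[i] * data[i];
                    }
                }
            });
        }
        pool.shutdown(); // accept no new tasks; let the chunks finish
        pool.awaitTermination(1, TimeUnit.MINUTES);
    }
}
```

The adaptive producer-consumer variant would replace the fixed chunking with a shared queue of small work items - more code still, which is rather the point.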

An ideal solution would be a programming paradigm that yields readable source code and still allows the compiler or the runtime system to identify parallelizable operations and parallelize them, either statically (a compiler) or dynamically (a runtime JIT compiler, after it decides that the up-front cost of setting up the parallelization is less than the gain from parallelizing an operation).

Just as today we have runtime environments with implicit memory management and languages designed for writing programs that run in such an environment, we could soon have environments that have implicit parallelization.

As a typical example, a transformation applied to all elements of a collection independently is a good candidate - provided your programming language lets you express it in such a way. I believe that functional languages are better prepared for this kind of implicit parallelization. There are already some academic-level efforts underway; witness Parallel Haskell.

One very interesting notion some people are vocal about is that stack-based computational models inherently stem from the sequential approach to programming, and that as we strive to embrace programming approaches that naturally lend themselves to parallelization, we'll gradually embrace computation models that aren't stack-based. Like the just-mentioned functional programming, where you don't really express your program in terms of subroutine calls. Or how Excel 12 will also feature parallel computations. BTW, the previously linked blog entry also contains links to some interesting research going on in this general area of single-machine parallel computing.

Is it time to say hello again to state machines? Maybe, maybe not. One thing I can more readily imagine is that today's widespread architectural model for enterprise systems - asynchronous messaging - will somehow get adapted for single-machine, single-process development that is meant to run on multiple CPUs.

Saturday, June 10, 2006

New Gear

I bought a MacBook Pro. It's a first-gen 15.4" 1.83GHz Intel Core Duo piece - it was the only one that the dealership could ship quickly, so I sacrificed 170 MHz per CPU core rather than having to wait one month for the machine to ship. I ordered a +1GB RAM stick and a bigger HDD, but they managed to screw up the order, so these will arrive next week. Sheesh.

The reason I bought an MBP: I need mobility. I realized that while my iMac G5 is an absolutely satisfying machine for all my work, it falls short if I need to do work but the kids insist that I take them down to the playground. Or if we want to spend the weekend in our hobby garden out of town, and I want to have a computer with me for the evenings. Also, lugging the 11kg 20" iMac around is possible - I proved this to myself by taking it with me from Szeged, Hungary to Reading, UK and back - but by no means a pleasant experience. (There were no problems with airport security, mind you - Heathrow has a policy that you must take your laptop out of the bag and have the bag and the laptop screened separately. I argued that my computer is not a laptop, and they let me keep it in its iLugger, thank God. The screening lady cheerfully said "Look, this guy carries a big computer screen with him!" to a passing colleague, just so I could feel like a complete dork.)

Anyway, I'm typing this on the MBP now, after a rather painless transition - all my files, applications, and settings were transferred automatically from the iMac. I had a strange incident though - when the setup got to configuring wireless, I forgot that my router is locked down and will only accept connections from predefined MAC addresses. After two unsuccessful attempts (I thought I had forgotten the password), the setup froze, and I had to reboot. It was rather painless afterward, but since the second time around I opted not to transfer data from another machine (it would have taken another three hours), I had to create a local account that I later deleted.

Also a hopefully minor problem - I have to reinstall Fink, so it's bootstrapping in the background as I'm writing this; fingers crossed it'll all work. The nice thing is, the Fink binaries transferred from the iMac actually worked after the migration, as Rosetta (Mac OS X's built-in PowerPC CPU emulator) picks them up, but various update processes quickly notice that the CPU architecture suddenly changed to x86 and get mightily confused, so it looks like the best idea is to recompile everything from scratch. It does make me a bit anxious, though. Fingers crossed, as I said.

Thursday, June 08, 2006

War booty

My Dad and I were chatting about motorcycles - the conversation was sparked by a Kawasaki parked near the designated smoking area of a medical diagnostic center where I took him for an MRI scan one evening early this week. (A plate on the wall next to the center's entrance boasts the British American Tobacco logo. I can only guess this means they partially funded it. Trying to relieve a guilty conscience, I suppose, for all those lung cancers.)

Anyway, I digress.

My Dad owned a 1957 BMW R25 motorcycle back in his young days. (Here's a site with photos if you're curious what one looks like.) He and my Mom used to travel a lot with it all around the country when they were a young couple. As it happens, I was eventually born, and the bike was used less and less - as convenient a vehicle as it was for a man and his woman, it was not really usable at all for transporting a family with a kid (and later, two kids). I remember the bike from the years it spent gathering dust in our shed, where I sometimes retreated to play with Dad's tools. Rarely, Dad would take it out and drive it for half an hour around the countryside. Then its battery died, and he didn't get around to replacing it. One day, Mom sat down with Dad and told him that, in her opinion, they should sell it. Dad opposed the idea. He was too attached to the bike emotionally, even if he didn't use it much anymore. In a strange twist of fate, the very next day a man from our village knocked on the door. After Dad invited him in and seated him, he said that he had come to ask whether the bike might be for sale. Dad agreed to sell the bike, although with a heavy heart. The new owner took good care of the BMW. He repainted and repolished it, and replaced all that needed replacing. When I saw it a few months later, I could hardly recognize it - it looked gorgeous, as good as new. Now (about 20 years later) Dad told me that his heart sank every time he saw it in the village. I told him that that's silly - the BMW definitely had a better place with its new owner.

I asked my Dad whether the new owner still has it. He told me that he met the new owner's wife a few years ago and asked her the same. She told him that when we were invaded by Serbian forces back in 1991, they fled with their car and left the bike behind. After the territory was liberated in 1996 and they went back, they were not too surprised not to find it - the Chetniks had taken it. She told him that during the war, these thugs could drive any vehicle they laid their hands on over to their Serbian homeland, and the authorities would happily register it in their name if they claimed it was war booty.

I told my father: "See, now this is something to make your heart sink."

Tuesday, May 30, 2006

More blogosphere echoes on continuations in JVM

The following people also reacted to Gilad Bracha's "no continuations in the JVM" post (in alphabetical order):

Don Box

Tim Bray

Avi Bryant (developer of Seaside, continuations-based webapp framework in Smalltalk)

Miguel de Icaza (with a link to Ian Griffiths' entry about continuations being considered harmful)

David Megginson

All these posts revolve around whether webapps are a justifiable reason for bringing continuations into the JVM. Now, I can actually agree they are not. However, there's precious little discussion out there about other reasons that are in fact justifiable. At least on the JVM, writing handlers for non-blocking socket I/O is one very practical usage. Writing distributed systems with execution location transparency and runtime migratability is another. Yet another is writing cooperative-threading systems where some domain-specific guarantee of scheduling fairness is explicitly encoded into the yielding policy of microthreads implemented by continuations. This would allow implementing e.g. an MMORPG server in Java, similar to how the EVE Online server side is implemented in Stackless Python - according to its developers, they currently manage 26,000 concurrent users on a 150-CPU cluster. It'd be a nice new server software market for the JVM as well.

Wednesday, May 24, 2006

Gilad Bracha: No continuations in JVM

Seems like Gilad Bracha doesn't want to see continuations implemented in JVM. Too bad. His reasoning is that the major use for continuations would be web application flows, and that web applications increasingly tend toward stateless models, and only a minority of functionalities need multipage stateful flows.

Well, let's even allow for the moment that he's right about webflows. However, even if we supposed that, there are still lots and lots of valid use cases for continuations in the JVM. Here are a few examples:


  • Distributed agents, where execution hops from one machine to another, because it's cheaper to bring the processor to the data than the other way round. As a special case, grid computing.

  • Implementing processes with massive parallelism (lots of work units being processed in parallel) but also with some long-blocking points. Like when you have batches of 100,000 work units in flight, but they're frequently blocked on something, e.g. waiting for user input, or better yet on an external process communicating with the user to gather complex input, or just waiting for the next processing window in case you're bound to specific time windows instead of operating 24/7. You just can't have 100,000 physical threads. Instead, you use 500 threads, send the work units that block to a database as serialized continuations, and keep the threads busy with those that aren't blocked. At the moment, such systems can be implemented on the JVM by coding e.g. in Rhino - Rhino is a JavaScript interpreter in Java that supports continuations. This is however quite unfortunate, as at best you end up mixing Java and JavaScript, and the boundaries between those languages in your system are determined by whether a control flow can lead to suspension of execution via a continuation. If it can, then that control flow path - all the "for" and "if" blocks enclosing it - must be coded in JavaScript; if not, it can be written in Java. As you see, this delineation between implementation languages stems from a purely implementation-specific constraint, and is not something that naturally follows from the architecture of your system, resulting in suboptimal architectural design (and frustration in the architect on whom such a limitation is imposed). If Java supported continuations, the full system could be written in Java, with no need to reach out to JavaScript.

  • Protocol handlers for NIO-based servers. Think it's a coincidence we don't have full-fledged HTTP NIO servers in Java? Think again. Even handling the basic HTTP/1.1 handshake - with support for the 100-Continue mechanism and parsing of the headers - is nontrivial if you are forced to code it as a state machine, trust me.

  • Cooperative threads. They're sometimes needed, e.g. for implementing an MMORPG where you need to be able to guarantee fairness in scheduling. Lots of MMORPGs use Stackless Python for this purpose. They could use Java, if only Java had continuations.
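The state-machine pain mentioned in the NIO bullet is easy to demonstrate with a sketch. Below is a hypothetical minimal incremental header-line parser (all names are made up for illustration): because a nonblocking read handler cannot simply block mid-loop - the way blocking or continuation-based code could just suspend - it must carry its position in the parse as explicit state between read events:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: without continuations, a nonblocking protocol handler
// must remember where it left off as explicit state between read events,
// instead of just pausing inside an ordinary read loop.
public class HeaderParser {
    private enum State { IN_LINE, SEEN_CR, DONE }
    private State state = State.IN_LINE;
    private final StringBuilder line = new StringBuilder();
    private final List<String> headers = new ArrayList<String>();

    /** Feed one chunk of data as it happens to arrive from the channel. */
    public void feed(String chunk) {
        for (int i = 0; i < chunk.length() && state != State.DONE; i++) {
            char c = chunk.charAt(i);
            switch (state) {
                case IN_LINE:
                    if (c == '\r') state = State.SEEN_CR;
                    else line.append(c);
                    break;
                case SEEN_CR:
                    if (c == '\n') {
                        if (line.length() == 0) { state = State.DONE; break; }
                        headers.add(line.toString()); // one complete header line
                        line.setLength(0);
                        state = State.IN_LINE;
                    } else {
                        line.append('\r').append(c); // lone CR, keep going
                        state = State.IN_LINE;
                    }
                    break;
                default:
                    break;
            }
        }
    }

    public boolean done() { return state == State.DONE; }
    public List<String> headers() { return headers; }

    public static void main(String[] args) {
        HeaderParser p = new HeaderParser();
        // Chunks may split anywhere, even in the middle of a CRLF:
        p.feed("Host: example.org\r");
        p.feed("\nConnection: close\r\n\r\n");
        System.out.println(p.done() + " " + p.headers());
        // prints: true [Host: example.org, Connection: close]
    }
}
```

And that's just header lines - add 100-Continue, chunked encoding and pipelining, and the explicit state space explodes, which is exactly the point of the bullet above.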



There's one more strong reason why Sun should not eschew the idea of continuations in the JVM: continuations are already happening in the .NET space. Not in Microsoft's official implementation, but in Mono - witness Mono Continuations, bringing full continuation support to C#. I don't doubt that Microsoft will take this idea from Mono and implement it in mainstream .NET. As with its proper generics implementation (and I could also mention LINQ), the .NET platform will gain yet another innovative advantage over Java, making Java look more and more "so 20th century" in comparison. That's something I'd be worried about if I were Sun's chief Java evangelist, Gilad.

Sunday, May 21, 2006

Slow Bob in the Lower Dimensions

Just discovered this: Slow Bob in the Lower Dimensions is a short psychedelic animation by Henry Selick, actually a pilot for an unfortunately never-realized series (apparently targeted to be shown on MTV). At the moment, Henry is directing the movie adaptation of Gaiman's short children's novel, "Coraline", and has co-written its script with Neil Gaiman.

Saturday, May 20, 2006

Ok, so which one?

Problem at hand: Apple's line of Intel CPU equipped laptops is now complete, so you'd want to purchase one. Okay, but which one exactly?

The middle MacBook model and the middle MacBook Pro models look to me to have the best bang-for-buck value.

The only problem with the MacBook is that I'd essentially have to start with throwing out the 2x256MB RAM modules and installing 2x512MB third-party modules, or alternatively ordering it with at least 1GB BTO (whichever is cheaper). I'm running my current iMac G5 with 1.5GB RAM, and I definitely need at least 1GB. So, that's already upping the price. With a MacBook Pro, I could keep the 1x512MB and stick another 1GB in the second slot for 1.5GB (yeah, I know that using identical modules doubles your RAM transfer rate, but believe me, 1.5GB saves you a lot more in transfers to and from a swapfile in return).

Also, I'd probably want to upgrade the HDD to at least 100GB in the MacBook. The only real difference then between the MacBook and the Pro would be in the display size and the graphics chipset. Since I'm not really doing a lot of 3D gaming, the graphics chipset is not much of an issue, and a MacBook with 1GB RAM (80MB of it gone to the graphics chipset) and a HDD expansion would probably suffice for me. The display size is not an issue, as at home I'll be connecting it to an external monitor anyway.

So, all in all, with 2x512MB RAM and a 100GB HDD, a MacBook costs $1600, and the MacBook Pro costs $2200. For $600 extra, you get a bigger screen, a better graphics chipset, and one ExpressCard slot. However, the advertised battery life is 1.5 hours shorter for the MacBook Pro.

Well, it looks to me that the MacBook is winning on my pro/contra sheet, especially after applying the $600 saving to the "pro" side. I also just read the Ars Technica review of the MacBook, and they basically conclude the same. They even disclose that the HDD is very easily replaceable, so maybe it's worth not buying an upgraded HDD from Apple, but rather buying a beefy 120GB 7200RPM drive on my own instead.

Friday, May 19, 2006

Recognizing The Way Of The Continuation

Seems like lots of people come to the recognition that for really scalable long-running (and/or many-at-once-running) scenarios, you indeed need to build your system on continuations. Well, except if you want to build it on explicit state machines, like some less fortunate projects do, that is.

My just-launched Rhino in Spring effort has some similarities with e.g. BPMScript, a project I discovered accidentally today - it's a Business Process Management solution that also uses Rhino with continuations to implement scalable long-running processes in a maintainable way (that is, using a high-level programming language to express algorithms instead of explicitly managing a state machine). A prime example of a state-machine-ish BPM is the one JBoss develops. Last year when I attended the JAOO conference, a guy from JBoss gave a presentation on JBoss's BPM and tried to convince us how their "graph oriented programming" (muhaha) is in fact much better for the purpose than object-oriented programming, as object-oriented programming "doesn't support suspend/resume of running processes". I almost fell out of my chair when I heard it. I tried whacking him with the cluebat of continuations after his presentation, but am not sure to this day that I got through. He apparently thought that people who design business workflows like drawing circles and boxes and connecting them with arrows more than writing proper programs in a proper programming language. Well, while there might be truth in that as well, I'm regardless glad to see more and more projects - like BPMScript - step onto the more enlightened Way Of The Continuation :-)

The explicit dealing with a state machine is why it looks to me that it wouldn't make much sense integrating Rhino in Spring (RiS) with Spring Web Flow (SWF), by the way. Based on my current survey of SWF code, it's geared toward state machine execution, and can't nicely accommodate a totally different execution paradigm. Which is a shame, as there'd be some reusable bits if it weren't engineered with the state machine approach throughout, which assumes that the graph of flow states and transitions can be made readily available up front. It can when you write your program by coding a state machine directly. However, as I pointed out in this comment on TheServerSide, enumerating all states of a JavaScript program is a futile thing, since it's a Turing-complete language and thus full state enumeration would be equivalent to solving the halting problem. This also goes to show that with a versatile modern programming language, you can build much more complex flows in a natural manner than by piecing together states and transitions manually. E.g. you can bundle data-validation loops, authentication subroutines (say, a set of pages that logs in the user or allows him to complete a several-pages-long sign-up process before returning to the task at hand), etc. into functions, and bundle those functions into libraries that are then included from the main flowscripts.

I think it's an incredibly great thing that thanks to Rhino and continuations, more and more Java-based systems can be built without having to make a tradeoff between the comfort of a modern programming language and runtime scalability. We can have both. As usual, the Smalltalk community has known this for decades. Via Rhino, it's finally breaking into JavaScript and Java as well.

Monday, May 15, 2006

Rhino in Spring

So, I've started a new open source project. Not really started it now, as it's been sitting in various states of incompleteness on my machine since last August, always waiting for the next chunk of time I could spend on it. Well, I'm pleased to announce it's ready now. So, what's it about?

The short story is, I integrated Rhino with Spring.

The longer story is, I implemented a custom controller for Spring's web application MVC framework that allows you to express in JavaScript control flows that span several HTTP request-response cycles (commonly referred to as "webflows").

Below is the text of the announcement as I posted it on TheServerSide (no link, as it didn't show up yet):




A new Apache-licensed open-source project, Rhino in Spring aims to integrate the Mozilla Foundation's Rhino JavaScript interpreter for Java with the Spring Framework.

The current release includes a controller component for the Spring Web MVC that allows you to express complex multipage flows in your web applications as server-side JavaScript programs.

You can use all the amenities of a full-blown imperative programming language while designing flows. You can write libraries of reusable code encapsulated into functions (e.g. validators), you can use the familiar for(), if(), while(), switch/case etc. statements to express the control flow, and so on.

Rhino in Spring uses Rhino's support for continuations to achieve high scalability - when a script is suspended between a response and the next request in the flow, its state is stored in a continuation (think of it as a snapshot of its stack), taking up no scarce system resources (such as physical threads), allowing for any number of concurrently active flows.

Even more so, "old" states are preserved (with configurable expiration policies), so the users can go back and forward using the browser's back and forward buttons, or even split the flow in two using the browser's "New Window" menu, and the framework will take care of resuming the server-side script on each request originating from a backed or split response page at the correct point, with correct values of all variables automatically - no need to disable the back button or use custom navigation links on your pages to keep server and browser state in sync.

In addition to in-memory and JDBC server-side storage of states it even provides a facility for embedding an encoded textual representation of the continuation in the generated webpage, thus moving it to the client and completely eliminating any server-side state storage for the ultimate in scalability. Compression, encryption and digital signing can be enabled to protect the client-side stored continuations from tampering. As an added bonus, you also get generic Spring bean factories for Java Cryptography Architecture public and private keys as well as Java Cryptography Extension secret keys, that you can also reuse elsewhere.

Monday, April 10, 2006

Josh Bloch on API design

Came across slides of a Joshua Bloch talk titled "How to Design a Good API and Why it Matters" he gave in 2005. I'll just say it's worth reading through them.

Wednesday, April 05, 2006

April fool's day come late

Sounds like they're April 1 jokes, but they're not:

Apple officially supports dual-booting Windows and Mac OS on Intel-based Macs.

Microsoft offers its Virtual Server software free, with official support for running Red Hat Linux on it.

My head is spinning... (disclaimer: I can perfectly understand the market motivations for both moves. It's just that I didn't consider them very likely...)

Thursday, March 30, 2006

In UK next week

I'll be in Reading, UK between April 3 and 7. If anyone feels like meeting me over a beer in Reading or London area, drop me a note so we may be able to arrange something.

Tuesday, March 28, 2006

Bleep is about Copenhagen interpretation

So, two days ago Kriszti and I watched "What the Bleep Do We Know!?™". Based on reviews, I had some good expectations about it, and while I must say my opinion is still quite fuzzy, the prevailing feeling is that of utter disappointment.

The movie does convey positive messages, promotes the benefits of positive thinking, saying that what you think affects who you are and what your destiny will be etc. In this regard, I can completely agree with it.

Then there's the big but.

First, the movie is shot as half documentary, half fiction, densely interspersed. It aims to gain scientific credibility for its message (I'll talk about the actual message in a bit) by having lots of experts (as well as a few "experts") telling their opinion in the documentary half, supposedly reflecting on the happenings in the fiction part. However, the things these people say often feel out of context, and they create more confusion than they explain. For a movie supposedly wanting to promote a message and back it up with scientific credentials, the editing is done very poorly (assuming the raw interview material was not as bad in the first place). The Wikipedia page for the movie covers much of the controversy, including one of the interviewed scientists objecting that they edited his interview so that it looks like he supports the movie's claims where he really does not, as well as factual errors, displaying scientifically unproven experiments as facts, and lots of jumping to conclusions.

So, what's the movie about? Well, the movie bases its message on the Copenhagen interpretation of quantum mechanics. As you might know (and if you don't, go and read the link), in that interpretation, observing a quantum phenomenon causes the nondeterministic and irreversible collapse of the wavefunction. This interpretation's problem is that it requires an "observer", thereby introducing consciousness into the theory. The movie goes on to argue that this way, our consciousness actively affects the reality that surrounds us by observing it, hence it jumps to the conclusion that we create reality. A big negative point for the movie, in my opinion, is that unless you already studied quantum mechanics, you probably won't understand it. The interviewed persons say "quantum mechanics this" and "quantum mechanics that" all the time, but the explanation of the uncertainty principle is confined to the scene on the basketball court, and while I was watching it I thought that if I didn't know all of this already, I'd probably still be left in the dark after this movie.
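(For reference, the uncertainty principle that basketball-court scene tries to illustrate fits on one line - the uncertainties in position and momentum can't both be made arbitrarily small:

```latex
\Delta x \, \Delta p \;\ge\; \frac{\hbar}{2}
```

which is rather less mystical than the movie makes it sound.)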

Anyway, I must disagree with the movie's conclusion about our consciousness creating the world around us, and us being indistinguishable from God. These views are very old, by the way. You can go back at least to Baruch Spinoza for the philosophical theory of the unity of nature (humans included) and God. You can refer to either George Berkeley or David Hume for the philosophy of subjective idealism. Nothing new here. Supporting these ideas with the Copenhagen interpretation seems to me a bit of a stretch.

Moreover, and this is my basic cause for disagreeing with the movie, is that there is a different quantum mechanics interpretation, the Many-worlds interpretation (MWI) that completely eliminates the need for any sort of observer for collapsing the wavefunction, as in this interpretation, the wavefunction never collapses. I won't go into explaining MWI here, I'll again direct you to the link above.

Rather, I'll tell you what MWI means to me. MWI, if you subscribe to it, does have one very interesting implication. Namely, that all possibilities realize themselves at the same time. On a high level, whenever you are in a decision situation, regardless of how you decide, all outcomes will realize themselves in the probabilistic space. As consciousness is widely regarded (not proven, though) as being a completely classical (in the "classical physics" sense, that is, not quantum-mechanics-level) phenomenon, the linear stream of events you experience as your consciousness is one path through the global wavefunction of the universe. Whenever there is a decision, the path forks, and you experience one of the paths, while multiple "you"s that share your identity up to that point will experience the other paths.

What does it mean in practice? It means that when you're maybe hesitating on something - like talking to that attractive girl sitting alone in the bar, or telling your coworker that he's being obnoxious about something, or daring to learn parachuting - you need to realize that you will. And also that you won't. Both. At the same time. With different probabilities, though. You only get to experience one of these paths, and there's no going back and retracing your steps once you did. You can consciously choose the outcome that'd otherwise have lower probability, leaving the higher-probability but duller options to another you. (Although balancing bravery and foolishness is a good idea generally :-))

Sounds wild, and some will argue that assuming such constant forking violates the principle of Occam's Razor, as it creates a continuum of parallel universes. Proponents will argue that there is no such thing - there is only one universe, represented by a single probabilistic wavefunction, with particles exploring all paths through it, and the consciousness you're experiencing being one particular path of the particles making up your physical self at the moment. There's also no information flow sideways or backwards, which is a fond plot device of fiction works involving time travel and/or parallel universes. Proponents will say that the simpler formal expression of this interpretation actually makes it much more in line with Occam's Razor than the Copenhagen interpretation. (Indeed, MWI operates with fewer assumptions, is expressible with more elegance on the mathematical level, and doesn't need the concept of an observer.)

Also, it doesn't clear you of any personal responsibility, as free will is still completely realizable within this framework - remember, consciousness is a classical-physics phenomenon, and regardless of the low-level mechanics and the fact that what you experience as yourself might be taking one path through the wavefunction while other forking selves are experiencing all the other paths, it still makes you responsible for the acts on your path.

Do I personally subscribe to MWI? Well, you see, it's hard to decide. I do. I don't. Both, at the same time :-) It's just a theory, and many regard it as unfalsifiable, which relegates it to the domain of belief rather than science. Sometimes, when things go bad, I can find comfort in thinking that at the same time, if this theory holds, then things also didn't go bad, and some of my probabilistic parallel selves are having it better at the moment.

Tuesday, March 21, 2006

Check out Restlets

For all the fans of the REST approach out there (who also happen to code in Java): seems like someone is working to create a replacement for the Servlet API that is explicitly designed for writing HTTP systems the REST way: check out Restlets. Haven't had a chance to look into it deeply, but it's definitely on my list of things to inspect more closely. The folks wisely created (similarly to the Servlet spec) a separate spec and a separate reference implementation - this is quite important for widespread adoption, as it should allow things like alternate implementations, e.g. one built on top of the Servlet API (for leveraging the already-tried infrastructure of servlet containers out there).

That said, the reference implementation they ship is a completely standalone server - real men handle their port 80 directly :-). Oh, the RI also uses FreeMarker as the default view technology, as a replacement for JSP :-). Well, I guess that makes me ever so slightly biased.

Wasting time on debugging memory errors again

I'm again losing days of work on debugging an OutOfMemoryError in a production system. The tricky part is that the code implements a very thin wrapper over a database, bulk-processes messages, and is totally stateless. The software stack is the JVM, then the JDBC driver, then Hibernate, then Spring, then my code. There's no memory leak - I could confirm this much with a profiler. Whatever was causing the trouble was allocating temp objects held by references on the stack, so when the OutOfMemoryError unwound the stack, the smoking gun was gone...

Finally, I turned to JDK 6.0. It's in beta at the moment, but it has a very useful feature: a command line switch "-XX:+HeapDumpOnOutOfMemoryError" that'll cause a full heap dump (in HPROF heap dump format) whenever an OutOfMemoryError is thrown. After having the ops guys install JDK 6.0 on the machine, I restarted the software under it with the above-mentioned switch, sat back, and waited for a memory error with a grin. And waited. And waited some more. Finally, I'd waited for more than two hours while the system was running under full load. Nothing.

To my fullest and utter surprise, the memory error doesn't manifest itself when running under JDK 6.0, even after a few hours of fully stressed operation. Damn. Isn't it typical? Maybe we have again hit a JDK-specific memory bug that got fixed in this later JDK? Unfortunately, I really cannot seriously propose to my colleagues that we run our production systems on a beta JDK...

Anyway, "-XX:+HeapDumpOnOutOfMemoryError" sounds like something that should have been part of Java long, long ago. Big enterprise systems run into memory problems. That's a fact. There are few tasks as frustrating as trying to isolate them, as the problem inherently manifests itself nonlocally. To have the JVM dump a heap snapshot at that point is invaluable. Not having this feature has caused me one sleepless night too many by now. I heard YourKit will have (or already has?) the ability to analyze HPROF snapshots, which would be really dandy for excavating the results. Failing that, I can still use the HAT profiler - hopefully they have incorporated the patches I sent them in the past year :-)
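For the record, here's a sketch of how the switch gets passed on the command line - the main class name, heap size, and dump path below are hypothetical placeholders, not from our actual setup:

```shell
# Dump the whole heap (HPROF format) the moment an OutOfMemoryError is thrown.
# -XX:HeapDumpPath chooses where the dump file lands; omit it to get the
# dump in the working directory. All names here are made-up examples.
java -XX:+HeapDumpOnOutOfMemoryError \
     -XX:HeapDumpPath=/var/log/dumps \
     -Xmx512m \
     MyServer
```

The flag is free in the happy case - it costs nothing until an OutOfMemoryError actually happens, which is exactly why it's safe to leave on in production.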

Sunday, February 19, 2006

Dilbert: Land of unrealistic business assumptions

Scott Adams constantly proves how his work on the Dilbert comic strip should be required reading for businessmen and managers, but I think this most recent storyline beats everything I've read so far. The first piece contains a brilliant "strange loop" reasoning from Dogbert (as well as a nod to The Chronicles of Narnia, but that's sort of beside the point), and the following pieces (two so far) are just cruelly on point. Whenever anyone tries to sell you a failsafe business plan, make sure they read this first :-)

Friday, February 17, 2006

Political alignment

For what it's worth...

You are a

Social Liberal
(80% permissive)

and an...

Economic Liberal
(36% permissive)

You are best described as a:

Democrat


You exhibit a very well-developed sense of Right and Wrong and believe in economic fairness. loc: (112, -50)
modscore: (22, 48)
raw: (2646)




Link: The Politics Test on Ok Cupid
Also: The OkCupid Dating Persona Test

Thursday, February 16, 2006

Wore out a Mighty Mouse in 3.5 months

My iMac was shipped with a Mighty Mouse back in November last year. I'm sorry to report that I wore out its little scroll ball in only three and a half months of use - it no longer gives a clicky sound when I'm rolling it down, and accordingly doesn't detect the roll. (Scrolling up, left, and right still functions normally; only the most heavily used scroll-down doesn't.) At a $49 retail price this is no cheap mouse, so I kinda expected it to sustain more wear... Just phoned the Apple dealership, and they promised the service guys will look at it...

Wednesday, February 15, 2006

FreeMarker Blog

There's now a FreeMarker Blog for all those people who want to keep an eye on the events related to the FreeMarker project - it's a "groupblog" to which yours truly as well as other active FreeMarker developers will be contributing.

Tuesday, February 14, 2006

Spurious wakeup of Java threads

Vlad Roubtsov today posted a message on the ADVANCED-JAVA list saying he noticed that the JDK 1.5 documentation for the java.lang.Object wait() method now contains this bit:


"A thread can also wake up without being notified, interrupted, or timing out, a so-called spurious wakeup. While this will rarely occur in practice, applications must guard against it by testing for the condition that should have caused the thread to be awakened, and continuing to wait if the condition is not satisfied. In other words, waits should always occur in loops, like this one:

synchronized (obj) {
    while (<condition does not hold>)
        obj.wait(timeout);
    ... // Perform action appropriate to condition
}


(For more information on this topic, see Section 3.2.3 in Doug Lea's "Concurrent Programming in Java (Second Edition)" (Addison-Wesley, 2000), or Item 50 in Joshua Bloch's "Effective Java Programming Language Guide" (Addison-Wesley, 2001)."


Now, I always used while() instead of if(), because even when I didn't know about this possibility of "spurious wakeups", I was always a bit paranoid about the reliability of any execution environment my code could be run in. Nevertheless, it is now a documented best practice :-)
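To make the pattern concrete, here's a minimal self-contained illustration (the class and its condition are invented for the example): a one-item slot where take() loops on wait() until put() fills in a value, so a spurious wakeup, or a notify meant for some other waiter, merely results in one more harmless pass through the loop:

```java
// A tiny one-item handoff slot. take() waits in a loop until put() sets
// a value; the while re-checks the condition after every wakeup, which is
// exactly the guard against spurious wakeups the JDK docs now prescribe.
public class Slot {
    private Object value;

    public synchronized void put(Object v) {
        value = v;
        notifyAll();
    }

    public synchronized Object take() throws InterruptedException {
        while (value == null) { // never "if" - a spurious wakeup would slip through
            wait();
        }
        Object v = value;
        value = null; // empty the slot for the next put()
        return v;
    }

    public static void main(String[] args) throws Exception {
        final Slot slot = new Slot();
        new Thread(new Runnable() {
            public void run() { slot.put("hello"); }
        }).start();
        System.out.println(slot.take()); // prints "hello"
    }
}
```

With an if() instead of the while(), a spurious wakeup would let take() return null - silently, and only once in a blue moon, which is the worst kind of bug.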

Monday, February 06, 2006

Magyar Crok

Found these on my kitchen table this morning; they presumably belong to my 6-year-old son. You need to be Hungarian to understand the cultural shock. (Click on the image to see the unobscured version on Flickr)

Thursday, February 02, 2006

Progressive Boink

If you were living under a rock and don't know Calvin and Hobbes yet, then "25 Great Calvin & Hobbes Strips" can serve as a great introduction, as it comes with commentaries. If you are a Calvin and Hobbes fan, it is still worth checking out, because some of the commentaries are outstanding.

Needless to say, if you're a serious fan though - so to say a "fan-atic" of Bill Watterson's work like I am - you of course already own a copy of "The Complete Calvin and Hobbes", a 3-volume, 11 kg beauty, and keep it in a central place on your bookshelf :-)