DOM Ranges and Document Fragments

I expect this article will get very few replies, indeed. :-p

A few weeks ago, I had this big block of XBL using a DOM range to show a set of nodes, and a DOM document fragment to hold the same set of nodes while the nodes were “hidden” from the user. It was really clunky, though, and a little hard to manage.

To clean it up a bit, I decided to introduce a new module, RangeFragment.jsm. The idea is simple: when you want to hide the set of nodes, you call rf.toFragment(). When you want to show them, you call rf.toRange().
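
The guts of such an object amount to something like the sketch below. This is not the actual RangeFragment.jsm – the constructor signature and internal property names are my own guesses – but it shows the round trip between range and fragment:

function RangeFragment(aStartNode, aEndNode) {
  this._parent = aStartNode.parentNode;
  this._anchor = aEndNode.nextSibling; // where the nodes go back in (may be null)
  this._startNode = aStartNode;
  this._endNode = aEndNode;
  this._fragment = null;               // holds the nodes while they're "hidden"
}

RangeFragment.prototype = {
  // Hide the nodes: pull them out of the document into a fragment.
  toFragment: function() {
    var range = this._parent.ownerDocument.createRange();
    range.setStartBefore(this._startNode);
    range.setEndAfter(this._endNode);
    this._fragment = range.extractContents();
  },

  // Show the nodes: move them from the fragment back where they came from.
  toRange: function() {
    this._parent.insertBefore(this._fragment, this._anchor);
    this._fragment = null;
  }
};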

I wrote this for both my <markup:repeat/> and <markup:children/> elements, but based on the hg diff, it doesn’t look like I actually removed or simplified very much. The reorg was only about making it easier to implement and test what I’m working on.

Sadly, when I went to look at the to-do list for Verbosio, this was so small a functionality change that it didn’t rate a spot on the list.

Oh, well. Maybe someone else will find a good use for an object that lets you switch from DOM range to document fragment and back.

Wanted: Wizard & Dialog Mochitests

Wizard tests and dialog tests.

Umm…

We don’t have a lot of tests to say that a wizard or dialog will function as desired. Since I’m building a pretty complex wizard, it’d be nice to have some assurance that when I hit “Next”, it will end up on the next page – and that the button changes to “Finish” when it’s supposed to, for example. I’d also like to write some tests specific to that wizard I’m building.

I’m not quite sure how to write such a test. The best idea I have is to have the dialog check for a query string value indicating “I’m a test, do something different”. I need some sort of standard way of running tests that a dialog’s opener specifies.
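
Roughly what I have in mind is sketched below – the chrome URL and the parameter name are made up for illustration:

// In the Mochitest (the opener): tack a flag onto the dialog's chrome URL.
window.openDialog("chrome://verbosio/content/myWizard.xul?mochitest=1",
                  "_blank", "chrome,dialog");

// In the wizard's own script: notice the flag and switch into test mode.
if (/[?&]mochitest=1/.test(window.location.search)) {
  // run whatever sequence the opener asked for, instead of waiting for a user
}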

Asynchronous events and Mochitesting

Recently, I found that I needed my chrome Mochitest to wait for an XBL binding to apply before continuing the test. For the end-user, waiting is hardly an issue – by the time they see the new user interface, all the bindings have applied. Mochitest (and most other automated test harnesses) runs straight through, and brakes for nobody. In some cases you can pull some trickery with nsIThreadManager, convincing Firefox to wait… but that’s hardly reliable.

So I came up with a three-part solution.

  1. Force my XBL bindings to fire DOM events when the constructor executes (see the sketch after this list).
  2. Break up my test functions into (slightly) smaller functions, each of which handles a part of the overall test. Two parts cover the main functional testing, while a third part is specific to the XBL binding.
  3. Implement a generic testing harness for Mochitest which can manage the flow from one function to another.
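
For the first step, the binding’s constructor just needs to dispatch a custom DOM event from the bound element. Something along these lines (the event name is my own choice, not necessarily what Verbosio uses):

// Inside the binding's <constructor> block, where |this| is the bound element:
var evt = document.createEvent("Events");
evt.initEvent("bindingConstructed", true, false); // bubbles, not cancelable
this.dispatchEvent(evt);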

This new test harness, which I call PendingEventsTest, adds a layer of complexity on top of SimpleTest. Each test function is added via PendingEventsTest.addTest(), and you can pause for a DOM event with PendingEventsTest.addEventLock(). The whole test sequence can abort with an error by calling PendingEventsTest.abortFail(). You can advance to the next test function in the sequence with PendingEventsTest.asyncRun() or PendingEventsTest.run().
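
To give a feel for how a test fits together, here is a rough usage sketch. Only the method names come from the harness itself; the argument lists, the event name, and the node are invented for illustration:

PendingEventsTest.addTest(function startBindingTest() {
  // Don't let the next function run until the binding's constructor
  // has fired its DOM event on this (hypothetical) node.
  var target = document.getElementById("templateHost");
  PendingEventsTest.addEventLock(target, "bindingConstructed");
  // ...trigger the code that causes the binding to apply...
  PendingEventsTest.asyncRun(); // advance once the event lock is released
});

PendingEventsTest.addTest(function checkBindingResults() {
  ok(true, "the binding applied and its constructor ran");
  // a final asyncRun() or SimpleTest.finish() would go here
});

PendingEventsTest.run(); // kick off the first test function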

There’s more complexity to it than that, of course. To move data from one test function to another, I also added “payload” support, for storing a generic object. Also, this is written for chrome Mochitests – it will likely fail in content Mochitests. Still, this new test harness helped me finish support in Verbosio for my equivalent of the <xbl:children/> element in markup templates.

Lastly, it (and the logging service I wrote a few weeks ago) is licensed under MPL 1.1/GPL 2.0/LGPL 2.1, so anyone who wants to borrow this for Mozilla code is welcome to. Cheers!

The strangest crash I’ve seen in a while

Last week, I wrote about a new logging infrastructure I built to track down a bug. In the original post, I warned that logging wasn’t a panacea – that it could cause more trouble than it solved. I was more right than I knew, as I added a bit of logging which led inexorably to a crash by JS_ASSERT failure. (For those who don’t know what a JS_ASSERT is, it’s an assertion inside the JavaScript Engine that all is well. Since JavaScript is one of the primary languages of the World Wide Web, these assertions are really important. If JavaScript is misbehaving, it can’t be trusted. Fail one such assertion, and the program aborts, forcing the developer to take notice.)

At first, the crash didn’t make a damned bit of sense. I prefer to write JavaScript, not just because I entered into software engineering (and Mozilla development) through JavaScript, but also because it’s super-stable. In fact, the only new code I was introducing was in JavaScript. It shouldn’t have crashed. Yet, through an unlikely chain of events, it did.

This article details how I was able to analyze the crash itself, how logging set up the conditions for the crash, and how other logging ultimately proved that the crash resulted from a logging call. I’ve filed this as bug 543304. Ultimately, reducing this test case to a usable bug report will not be easy – but it will be worth it. More details in the extended entry. (Warning: the technobabble content ratio is high. If you do not want to see how this developer thinks and debugs Mozilla code, do not read on.)

XPCOM and Logging, part 2

First off, thanks to everyone who took the time to reply to my original blog post. Several people pointed me to moz4js, and I was pleasantly surprised to find out the Mozilla Thunderbird team uses it quite a bit. I hadn’t heard of it before, but once the comments pointed me to it, I found it has the same problem NSPR logging has: it supports only a single language.

Then I realized: if we put moz4js in a JavaScript module, then a component can reference that module and expose the same interfaces to XPCOM. Chrome JS and JS components can load the module, and all other languages can use the XPCOM component. Maybe not simple, but certainly workable.
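
In sketch form, with a made-up module URL and method – I haven’t actually written this, it just shows the shape of the idea:

// logService.jsm – a plain JavaScript module holding the logging logic
// (this is where moz4js, or whatever shared implementation, would live).
var EXPORTED_SYMBOLS = ["LogService"];
var LogService = {
  log: function(aModule, aLevel, aMessage) {
    // ...actual logging implementation...
  }
};

// logService.js – an XPCOM component that just forwards to the module,
// so chrome JS, JS components, Python, C++, and friends can all reach it.
Components.utils.import("resource://myapp/modules/logService.jsm");

function LogComponent() {}
LogComponent.prototype = {
  // classID, QueryInterface, and factory boilerplate omitted for brevity
  log: function(aModule, aLevel, aMessage) {
    LogService.log(aModule, aLevel, aMessage);
  }
};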

No, I’m not signing up to own the toolkit moz4js bug. I’m not wholly convinced, personally, that it’s the right way to do logging. What I do think, though, is that we need a general discussion on logging APIs and their consumers.

In the bug, Flock mentioned their own API – which I also didn’t know about. I mentioned Skyfire has an internal logging system. It makes me wonder how many other companies and projects (Songbird, Komodo, Disruptive Innovations, etc.) have implemented their own logging APIs. This is one subject that reasonably affects the whole Mozilla developer community.

If MozPad were still active, this is one subject I’d put right to them and develop a standard logging API for. Since we don’t have MozPad anymore, I would like to put it to the community at large:

  1. What do you require logging for?
  2. What logging capabilities do you need?
  3. What logging infrastructure (API, code) have you already implemented or used?
  4. What tests exist and pass for the logging infrastructure you use?
  5. How extensively do you use your logging infrastructure?
  6. How easy is it to work with your logging infrastructure?
  7. If you use someone else’s logging infrastructure, what’s missing?
  8. Are you willing to work on getting your logging infrastructure into the Mozilla code base? (Writing code, tests, specifications)
  9. Are you willing to work on developing a common Mozilla standard logging API and infrastructure, and getting that into the Mozilla code base? (Writing code, tests, specifications)
  10. Are you willing to actively “own” and drive an effort towards adding a Mozilla standard logging API to the Mozilla code base?

Please reply in the comments. The quality and quantity of the replies will tell me how seriously I should take this.

(P.S. I found a bug in my logging interfaces already. Not the JavaScript code, but the IDL. Bonus points to the first person to spot it besides me – it’s a little undocumented gotcha.)

XPCOM and Logging

What we have now

Over the last several years, I’ve learned to appreciate good message logging. Mozilla’s NSPR Logging API is a very good example, and it works well – for compiled code.

JavaScript components? They can’t use NSPR. Neither can chrome, PyXPCOM, or any other privileged code. The best alternative I’ve got is a dump() statement. dump() writes to the standard error console, commonly known as “stderr”. In debug builds, though, stderr gets populated with a bunch of other stuff that I really don’t care about at a particular moment. (I’m not entirely clear of blame, either – leftover dump statements from my code pollute that console as much as anything else.)

At Skyfire, I’ve had to implement a couple of logging systems as well, and the generated logs have helped us find and isolate bugs relatively quickly. What you log can help you identify bugs in most cases. Logs aren’t a panacea, though: doing it wrong can really cost you.

Over the past few days, I hit a really weird bug in Verbosio’s templates system (which I’ve been quietly working on for about five months now, in a complete rewrite with tests). One test fails at a critical point, but it’s pretty far removed from the starting point. Amid all the other dump() statements and debug output, I can’t quite figure out where the code started going wrong. To pin it down, I think the best approach is to insert some logging.

Oh, and did I mention that Mochitest includes logging for pass, fail, and todo conditions… but conveniently leaves out an informational-only logging feature? (Having that helps when you’re trying to trace where a log statement came from – and the conditions at the time.)

The five paragraphs above combine to form one idea: Maybe I should implement my own logging service with tests.

Thoughts in the design

  • First of all, keep it really simple. Most logging implementations I’ve seen use a module name, a severity level, and a summary string. I figure that plus an optional details string should cover most messages. (A rough sketch of the resulting calls follows this list.)
  • For where those messages go, files are good. Streams are better. With a stream, you can feed it into a file, or read it into dump(), a XUL page’s multiline textbox, or anywhere else.
  • Someone reading the logs probably wants a way of separating messages – so if I give them a “multipart” stream, where each part is a separate stream… it’s easy to tell each message apart.
  • Make sure the user requesting the logging stream can specify which messages they want to receive – both by module name and severity level, and whether or not they want the details with each message.
  • Make sure to cache all the messages, so that a logging stream created later still gets all the messages that came before it.
  • Finally, since each message ultimately ends up as a simple string – use a simple string stream interface to construct each message as needed.
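
Put together, the caller’s side of the service I’m imagining looks roughly like the sketch below. The method names and severity constants are illustrative, not the actual interface:

// Writing a message: module name, severity, summary, and optional details.
loggingService.log("templates", loggingService.SEVERITY_WARNING,
                   "binding not applied yet",
                   "still waiting on the constructor event");

// Reading messages: a multipart input stream, filtered by module name and
// minimum severity, with or without each message's details. Every part of
// the stream is one message, so messages are easy to tell apart.
var stream = loggingService.getLoggingStream(["templates"],
                                             loggingService.SEVERITY_INFO,
                                             true /* include details */);
// Feed the stream into a file, into dump(), or into a multiline textbox.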

Did I overdo the design? Probably. On the other hand, it really maximizes flexibility. I don’t know how much memory multipart input streams consume, though, so I don’t recommend using this in a production environment. If you use this logging service for debugging, make sure you remove all references to it before checking in!

Did it solve my problem?

No.

Oh, the logging itself worked beautifully. But to get some of the data I wanted to log, I had to call other functions. Some of those functions triggered JavaScript garbage collection, which tried to run some cleanup code, which tried to execute JavaScript code I wrote, which resulted in a 100% reproducible crash. I don’t yet know what I’m going to do about that. I’ll probably first try to reduce the testcase and post it.

Also, logging itself will never solve the problem you’ve got. All it will do is provide data, and hints about where you are in the code execution. You still have to tell it what and when to log, and you have to read those logs to find the bug.

The good news is, though, that a logging infrastructure like this only took one afternoon to write. Sometimes, that’s all I need: a quick, simple component to organize the dirty work into one place.

(The code I wrote I am making available under MPL 1.1 / LGPL 3 / GPL 3 tri-license. If you want it under another license, feel free to e-mail me, and I’ll almost certainly say yes. I don’t currently plan on submitting this in a bug for landing in mozilla-central, though.)

Forking internally – suggestions?

Imagine the following: You’re working on a new feature for your application, dependent on XULRunner 1.9.1. It’s complex, but it ends up being all chrome & JavaScript components. You write tests for it as you’re developing, and you write a specification. Everything’s coming along nicely.

Then you start thinking about moving to XULRunner 1.9.2 beta x. You compile the code, and just to be sure everything’s fine, you run a test you wrote a few weeks ago. The test and the code it is testing haven’t changed at all. The test that passed in 1.9.1.x fails in 1.9.2.x.

Now, what do you do?

This is the situation I faced a few weeks ago. It’s not an idle question. Obviously I want to find out what regressed, but my test case isn’t minimal – it’s approaching maximal. On the one hand, since the bug isn’t in code I’m working on right now (far from it), I could simply ignore it for now and keep working on my feature. On the other hand, my confidence in my code is now badly shaken. Did I misuse an API? Did the API change without my knowledge? Did I make a change in between the 1.9.1 test and the 1.9.2 test? I have to know what broke, so I can work around it for supporting both versions of XULRunner.

(As it turns out, it was a regression in the Mozilla code base itself.)

I started thinking about how to make this a deterministic test: something that I can point to and say “In this set-up, everything works… but in the other set-up, something’s off.” When I think about it, though, it’s a bigger problem than just A/B comparisons. My whole approach to developing software is too informal.

More in the extended entry.

NSID gone too far

I have a more serious article which I’ll try to write tonight… but this I couldn’t resist. I went to weatherbonk.com, because it’s getting rather chilly in Mountain View and San Jose, CA, USA. I spotted something quite disturbing when I typed in “Mountain View” in the Quick Forecast field…

[Screenshot: nsid.PNG]

What’s next? Wookies? Geico advertisements?

“Why are you doing this?”

One of the hardest questions for me to answer is, “Why are you doing this?” It’s also an excellent question – one that serves as a check on what I’m working on. I recently found myself asking this question when trying to use something I’ve never tried to use before: weak references to DOM elements.

What are weak references?

For those of you coming from JavaScript-land, you know that once you hold a DOM element as a property, it will last as long as the object you attached that property to. (Depending on the usage, perhaps longer.) This is because XPConnect generates a “strong” reference to the element, which, through the magic of XPCOM reference counting, requires that the element not be deleted until all strong references are gone – including yours. Weak references are different. They are objects which hold a pointer to your real target, but they don’t require that the real target “stay alive”. In JavaScript pseudo-code, it looks a little like this:

function getWeakReference(obj) {
  var weakPtr = {
    _realPtr: obj,

    QueryReferent: function(aIID)
    {
      if (this._realPtr)
        return this._realPtr.QueryInterface(aIID);
      return null;
    },

    QueryInterface: function(aIID)
    {
      if (aIID.equals(Components.interfaces.nsIWeakReference) ||
          aIID.equals(Components.interfaces.nsISupports))
        return this;
      throw Components.results.NS_ERROR_NO_INTERFACE;
    }
  };
  return weakPtr;
}

There’s one thing this code can’t show you: some time in the future, weakPtr._realPtr could be destroyed, and set to null. That doesn’t happen in normal code (thank heaven!). But with weak references, it can happen at any time.
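
For the consumer’s side, dereferencing a real nsIWeakReference from chrome JavaScript looks something like this – weakRef here stands in for whatever weak reference you’ve been handed:

var node = null;
try {
  // QueryReferent throws if the referent has already been destroyed.
  node = weakRef.QueryReferent(Components.interfaces.nsIDOMNode);
}
catch (e) {
  // the element is already gone
}
if (node) {
  // the element is still alive and safe to use right now
}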

Why are you doing this?

There’s a long list of sub-projects and sub-sub-projects, etc., each of which seems to require that I look at something I hadn’t tried yet. Here’s the current list, and how I started thinking (wrongly) that I needed weak references.

(Continued on the next page…)
