So I got thoroughly roasted on my original idea, which is actually good compared to the deafening silence I’ve seen when I post other ideas. 🙂 Believe me, I prefer discussion of my ideas. In this case, it helped me abort a bad approach, which is exactly why I post them.
In short, regexps are impractical in the context I had in mind. I refused to give up on this until Daniel Brooks (aka db48x) showed me “the regular expression that parses email addresses”.
So scratch regexps.
What I (very roughly) need for XML documents, as I understood db48x’s explanation, is:
- Support, when parsing the source of the XML document, for recording the start and end points in the source text that correspond to the document’s XML tags & attributes.
- An algorithm for converting those boundary points into their equivalent contentDocument-in-the-text-editor boundaries (which would be much easier if I store the boundaries as line number + column number).
- A way to specify the classes for each set of boundary points.
- A way to specify the CSS stylesheet for each class.
I was very concerned about performance, but apparently that’s a non-issue.
The first and third items on the list probably involve some XML parser hacking. Of course, I’ve never hacked our expat before. 🙂 The second item is straight mathematics. The fourth I could probably just apply to the editor’s contentDocument, or perhaps hack nsPlaintextEditor to support nsIEditorStylesheets.
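To show what I mean by "straight mathematics" for the second item, here is a minimal sketch in Python (not Mozilla code). It assumes the parser hands back plain 0-based character offsets into the source text; the helper names `line_starts` and `offset_to_line_col` are made up for illustration.

```python
from bisect import bisect_right


def line_starts(text):
    """Return the character offset at which each line of text begins."""
    starts = [0]
    for i, ch in enumerate(text):
        if ch == "\n":
            starts.append(i + 1)
    return starts


def offset_to_line_col(starts, offset):
    """Convert a 0-based character offset into a 0-based (line, column) pair."""
    # The last line start that is <= offset identifies the line.
    line = bisect_right(starts, offset) - 1
    return line, offset - starts[line]


source = '<a href="x">\n  <b/>\n</a>\n'
starts = line_starts(source)

# Hypothetical boundary point: where the <b/> tag begins in the source.
tag_offset = source.index("<b/>")
position = offset_to_line_col(starts, tag_offset)
```

Building the table of line starts once and binary-searching it keeps each lookup cheap, which matters if every tag and attribute contributes two boundary points.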
For other types of source code (such as C++, JS, etc.), I’d need the same kinds of constraints as for XML, but not the identical ones. Some way to create a common set of XPIDL interfaces would be really cool, but I’m not at that stage yet. I’m still in the brainstorming-and-learning phase. (Working backwards, from end-of-document to start, might mean I don’t have to recalculate offsets to account for new DOM nodes as the iteration continues.)
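The working-backwards idea is easy to try on a plain string. This is only a rough Python sketch, assuming non-overlapping boundary points given as (start, end, class) triples; `wrap_spans` is a hypothetical helper, and it skips the escaping a real editor would need.

```python
def wrap_spans(text, spans):
    """Wrap each (start, end, css_class) range of text in a span element.

    Spans are applied from the end of the text towards the start, so the
    markup inserted for one span never shifts the offsets of the spans
    that are still waiting to be applied.
    """
    for start, end, css_class in sorted(spans, reverse=True):
        text = (
            text[:start]
            + '<span class="' + css_class + '">'
            + text[start:end]
            + "</span>"
            + text[end:]
        )
    return text


source = "var x = 1;"
# Hypothetical boundary points a JS tokenizer might report.
spans = [(0, 3, "keyword"), (8, 9, "number")]
highlighted = wrap_spans(source, spans)
```

Going front to back instead would mean adding the length of every inserted tag to all later offsets, which is exactly the recalculation I’d like to avoid.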
Here’s the transcript of our conversation: #developers @ irc.mozilla.org on syntax highlighting
As always, your feedback in helping me clarify and organize these thoughts is welcome – as long as you’re informative. (I can take rudeness, but not without references to back it up.)