Python DOM HTML functional
Posted Sun, Jan 13, 2008 in:
DOM HTML Progress
Well, I’ve made some progress on the HTML layout engine, but it still isn’t complete enough to run yet.
When I reached the point where I needed to call ViewCSS.getComputedStyle from the DOM, I stopped to implement it, and decided it was a good time to see whether the DOM HTML code I had written would actually run.
It didn’t, of course.
So I spent some time fixing all the little bugs here and there, and set up some test code to pull a page from the internet and parse it into a full HTML DOM tree.
Since I’m using pxdom’s parsing functions, and pxdom only knows how to parse well-formed XML, I also run the HTML through the Python tidy lib first to make sure it is proper XHTML. Without that step I couldn’t even parse the Google home page.
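To show why the tidy step matters, here is a minimal sketch of the "clean up, then parse as XML" idea. It is not my actual code: it uses the standard library's xml.dom.minidom as a stand-in for pxdom, and the names parse_xhtml, is_well_formed and the tidy argument are made up for illustration. The tidy argument is just any function that turns sloppy HTML into XHTML, such as a thin wrapper around the tidy lib.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def is_well_formed(markup):
    """Return True if an XML-only parser accepts the markup."""
    try:
        minidom.parseString(markup)
        return True
    except ExpatError:
        return False


def parse_xhtml(markup, tidy=None):
    """Parse markup with an XML parser, optionally cleaning it first.

    `tidy` is a hypothetical callable (str -> str) that converts
    tag-soup HTML into well-formed XHTML before parsing.
    """
    if tidy is not None:
        markup = tidy(markup)
    # minidom stands in for pxdom here; both expect well-formed XML.
    return minidom.parseString(markup)


# Typical real-world HTML: unclosed <p> and a bare <br>.
sloppy = "<html><body><p>Hello<br></body></html>"
# What a tidy pass would be expected to produce for it.
clean = "<html><body><p>Hello<br/></p></body></html>"
```

Passing sloppy straight to an XML parser fails on the first unclosed tag, while the cleaned version parses into a normal DOM tree that you can walk with getElementsByTagName and friends.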
Here it is if you’d like to check it out. It needs pxdom to work. The parseString function will take a string containing HTML and return an HTMLDocument.
Remember, it is basically pre-alpha code, since I haven’t tested everything yet. I might get around to writing up some unit tests at some point, but until then I can’t guarantee that there are no errors.
SEE also handles memory management for you, and you can fully separate interpreter instances so you don’t have to think about thread safety, which is two fewer things to worry about.