nielsie95
06-24-2010, 12:06 AM
Ganon (http://code.google.com/p/ganon/) is a little something I've been working on every now and then. I decided to actually release this project on Google Code, because I believe it really has potential. It is a pretty fast HTML/XML DOM parser written in PHP, like SimpleXML (http://php.net/manual/en/book.simplexml.php) and Simple HTML Dom (http://simplehtmldom.sourceforge.net/). It provides easy access to the elements through objects and advanced CSS selector queries.
What this has to offer over the other projects is that it is fast, advanced, extensible and it works (whereas Simple HTML Dom doesn't always like to do its job :p). It works because the document is actually parsed, instead of being matched with regular expressions. You can do a lot with regex, but at some point a document just gets too complicated. Ganon is also better when it comes to selectors. It supports jQuery/CSS3-like queries (namespaces, attribute selectors and filters are supported) and it goes even further. Selectors are extended with the possibility to match multiple elements, and a "not" operator is added too. This makes advanced queries like the following possible:
(div, a, p)[! class|="lol", id] --- Select div, a and p tags whose class attribute does not start with "lol" and which have an ID attribute.
(test|* + !a)[href + class] --- Select all tags in the "test" namespace that are not "a" tags and that have both the href and the class attribute.
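To make the meaning of those two queries concrete, here is a rough sketch of them written out as plain predicates. This is not Ganon code and not how Ganon evaluates selectors; it is Python using only built-ins, and the function names are made up for illustration. It also assumes that an element without a class attribute counts as "does not start with lol", which the descriptions above don't spell out.

```python
def matches_first_query(tag, attrs):
    """Mirror of: (div, a, p)[! class|="lol", id], per the description above.

    Hypothetical helper; `attrs` is a plain dict of attribute name -> value.
    Assumption: a missing class attribute counts as not starting with "lol".
    """
    return (
        tag in ("div", "a", "p")
        and not attrs.get("class", "").startswith("lol")
        and "id" in attrs
    )


def matches_second_query(namespace, tag, attrs):
    """Mirror of: (test|* + !a)[href + class], per the description above.

    Hypothetical helper; `namespace` is the element's namespace prefix.
    """
    return (
        namespace == "test"
        and tag != "a"
        and "href" in attrs
        and "class" in attrs
    )
```

So a `<div class="main" id="x">` would pass the first check, while `<div class="lol-box" id="x">` or a `<p>` without an ID would not.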
Of course, simple queries like "div > a" work too :) I hope to write some documentation on it in the coming weeks.
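As a small illustration of the parsing-versus-regex point above, here is a sketch of how a naive regular expression breaks on nested tags while a real parser does not. It is not Ganon and not PHP; it uses Python's standard `re` and `html.parser` modules only, and the sample markup and the names `regex_div_contents`, `DivTextCollector` and `parsed_div_text` are invented for the example.

```python
import re
from html.parser import HTMLParser

# Made-up sample: a <div> nested inside another <div>.
HTML = '<div id="outer"><div id="inner">one</div>two</div>'


def regex_div_contents(html):
    """Naive regex: the non-greedy match ends at the FIRST </div> it sees."""
    return re.findall(r'<div[^>]*>(.*?)</div>', html, flags=re.S)


class DivTextCollector(HTMLParser):
    """Collects the text of each top-level <div> by tracking nesting depth."""

    def __init__(self):
        super().__init__()
        self.depth = 0        # how many <div> elements are currently open
        self.current = []     # text pieces seen inside the open top-level <div>
        self.results = []     # finished text, one entry per top-level <div>

    def handle_starttag(self, tag, attrs):
        if tag == 'div':
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == 'div' and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                # The outermost <div> just closed: store its combined text.
                self.results.append(''.join(self.current))
                self.current = []

    def handle_data(self, data):
        if self.depth > 0:
            self.current.append(data)


def parsed_div_text(html):
    parser = DivTextCollector()
    parser.feed(html)
    parser.close()
    return parser.results


print(regex_div_contents(HTML))  # ['<div id="inner">one']
print(parsed_div_text(HTML))     # ['onetwo']
```

The regex returns a fragment cut off at the inner closing tag and never sees "two", while the parser knows which `</div>` belongs to which `<div>`. You can patch the regex for this one case, but every extra level of nesting needs another patch.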
Nielsie95