
Ganon



nielsie95
06-24-2010, 12:06 AM
Ganon (http://code.google.com/p/ganon/) is a little something I've been working on every now and then. I decided to actually release this project on Google Code, because I believe it really has potential. It is a pretty fast HTML/XML DOM parser written in PHP, like SimpleXML (http://php.net/manual/en/book.simplexml.php) and Simple HTML Dom (http://simplehtmldom.sourceforge.net/). It provides easy access to the elements through objects and advanced CSS selector queries.

What this has to offer over the other projects is that it is fast, advanced, extensible and it works (whereas Simple HTML Dom doesn't always like to do its job :p). It works because the document is actually parsed, instead of using regular expressions. You can do a lot with regex, but at some point a document just gets too complicated. Ganon is also better when it comes to selectors. It supports jQuery/CSS3-like queries (namespaces, attribute selectors and filters are supported) and it goes even further. Selectors are extended with the possibility to match multiple elements, and a "not" operator is added too. This makes advanced queries like the following possible:

(div, a, p)[! class|="lol", id] --- Select div, a and p tags whose class attribute does not start with "lol" and which have an ID attribute.

(test|* + !a)[href + class] --- Select all tags in the "test" namespace that are not "a" and that have both the href and the class attribute.

Of course, simple queries like "div > a" work too :) I hope to write some documentation on it in the coming weeks.
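For readers unfamiliar with the "div > a" child selector, here is a minimal sketch of what it matches. It deliberately does not use Ganon: it uses PHP's bundled DOM extension with the roughly equivalent XPath expression `//div/a`, and the sample markup is made up for illustration.

<?php
// Sketch only: PHP's bundled DOM extension, not Ganon. The markup is invented.
$markup = '<div><a href="/one">one</a><p><a href="/two">two</a></p></div>';

$doc = new DOMDocument();
$doc->loadHTML($markup);

$xpath = new DOMXPath($doc);

// "div > a" in CSS: a elements that are direct children of a div.
// The XPath counterpart is //div/a (a single slash means "child of").
$hrefs = array();
foreach ($xpath->query('//div/a') as $a) {
    $hrefs[] = $a->getAttribute('href');
}

// Only the first link is a direct child of the div; the second sits inside a p.
echo implode("\n", $hrefs), "\n"; // prints "/one"
?>

The point is the same one made above: the document is parsed into a tree first, so "direct child" is a structural question rather than a pattern match on the text.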

Nielsie95

Frement
06-24-2010, 12:23 AM
Looked pretty cool. Almost 3k lines, looks like a lot of work.

So, what use will this be? Like, name one good use of this :)

I mean, I couldn't think of anything I would need this for.

Tim0suprem0
06-24-2010, 12:34 AM
http://zelda.nintendo-europe.com/shared/user/media/contantImages/bosses_ganon_z5.jpg

Sorry I couldn't help it.

nielsie95
06-24-2010, 12:35 AM
I guess most people use this sort of thing for scraping other websites. But you can also use it for XML/RSS feeds or HTML beautifying/formatting.
Here's an example for the temporary stats:


<?php

include 'ganon.php';

$html = file_get_html('http://scriptmanager.freehostia.com/scripts.php?sid=24');
foreach ($html('h2 ~ table td:odd') as $e) {
    echo $e->getPlainText(), "<br>\n";
}

?>


which outputs:



Coh3n
25 weeks, 4 days and 20 hours
No
406
74
15/01/2010
15/06/2010
< 1 minute ago


I'm not sure how you do it for your signatures, but you must admit that this is pretty easy :)

MylesMadness
06-24-2010, 12:40 AM
http://www.thg.ru/game/20030929/images/ganon2.jpg

Sorry I couldn't help it.

I don't get it...

nielsie95
06-24-2010, 12:44 AM
Neither do I :huh:

Frement
06-24-2010, 12:49 AM
Well I use only a simple Between() function, and that is like 10 lines and does the job. :)

I thought that was simple and easy :) I'll have to take a closer look once I need such parsing again.
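The Between() function itself is not shown in the thread. A helper of that kind might look roughly like the sketch below; the name `between`, its signature, and the sample string are all assumptions made for illustration.

<?php
// Hypothetical helper, not the poster's actual code.
// Returns the text between the first $start and the next $end in $haystack,
// or null when either marker is missing.
function between($haystack, $start, $end)
{
    $from = strpos($haystack, $start);
    if ($from === false) {
        return null;
    }
    $from += strlen($start);

    // Search for the end marker only after the start marker.
    $to = strpos($haystack, $end, $from);
    if ($to === false) {
        return null;
    }
    return substr($haystack, $from, $to - $from);
}

$page = '<td>Coh3n</td><td>406</td>';
echo between($page, '<td>', '</td>'), "\n"; // prints "Coh3n"
?>

This works well for one fixed pattern on a page whose markup never changes, but it only ever finds the first match and knows nothing about nesting, which is where a real parser starts to pay off.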

Tim0suprem0
06-24-2010, 01:00 AM
http://en.wikipedia.org/wiki/Ganon

For those of you who were confused.

MylesMadness
06-24-2010, 01:03 AM
http://en.wikipedia.org/wiki/Ganon

For those of you who were confused.

What does that have to do with Tom's Hardware Guide...

The Claw
06-24-2010, 01:59 AM
http://www.thg.ru/game/20030929/images/ganon2.jpg

Sorry I couldn't help it.

Nice. :p

nielsie95
06-24-2010, 09:04 AM
Oh, I see. We can't see the image here; we see an image from Tom's Hardware instead. But the real picture is this one, right?

http://zelda.nintendo-europe.com/shared/user/media/contantImages/bosses_ganon_z5.jpg

Yep, named after that Ganon ^^

Dgby714
06-24-2010, 09:08 AM
I guess the site didn't like hot-linking >.>

Tim0suprem0
06-24-2010, 01:11 PM
You still don't get the Tom's Hardware Guide reference??

Just kidding :p Sorry guys, yeah I guess the image was just in Google's cache? Because it's Tom's Hardware on the site too now (not just what you get when you try to hotlink). I feel quite silly.

EDIT: Oh, it was in MY cache from the site, so I didn't see a problem. And yes, it was some sort of anti-hotlink thingy.


Nielsie, yes that is the ganon I was trying to show, but apparently failed to :p

nielsie95
06-24-2010, 01:28 PM
Featured in the best (http://www.gamerankings.com/browse.html) game ever, of course.. :p

Wizzup?
06-24-2010, 06:23 PM
I wonder if it is as high level as normal Python... :p