Hi guys!
I'm writing a crawler that goes through several pages and collects ALL the links on each page.
You may say "yeah, that's easy... you can download the page's source code via an httpClient and then use HtmlParser, JerichoParser, etc.", but the problem is that those are decent parsers for plain HTML links and VERY SLOW when it comes to links that live in JavaScript and other technologies.
That's the reason for my title. I need to crawl through ALL of the links on a page, but at a decent speed... JUST as a browser does.
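
To make the easy case concrete, here is a minimal sketch of pulling links out of markup that has already been downloaded. It is only an illustration: the class name `LinkSketch` and the method `extractLinks` are made up for this post, and it uses a plain regex from the Java standard library instead of HtmlParser or Jericho so that it stays self-contained. It assumes quoted `href`/`src` attribute values and an absolute base URL.

```java
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LinkSketch {
    // Matches href="..." or src='...' (quoted attribute values only).
    // A regex is NOT a real HTML parser; this is just the simplest baseline.
    private static final Pattern LINK_ATTR = Pattern.compile(
            "(?:href|src)\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)')",
            Pattern.CASE_INSENSITIVE);

    // Returns the absolute form of every link found in the markup,
    // in order of first appearance and without duplicates.
    static Set<String> extractLinks(String html, URI base) {
        Set<String> links = new LinkedHashSet<>();
        Matcher m = LINK_ATTR.matcher(html);
        while (m.find()) {
            String raw = (m.group(1) != null ? m.group(1) : m.group(2)).trim();
            // Skip empty values, in-page anchors and javascript: pseudo-links.
            if (raw.isEmpty() || raw.startsWith("#")
                    || raw.toLowerCase().startsWith("javascript:")) {
                continue;
            }
            try {
                // Resolve relative links against the page URL.
                links.add(base.resolve(raw).toString());
            } catch (IllegalArgumentException e) {
                // Malformed URL in the page: ignore it in this sketch.
            }
        }
        return links;
    }

    public static void main(String[] args) {
        String html = "<a href=\"/about\">About</a>"
                + "<script src='js/app.js'></script>"
                + "<a href=\"#top\">Top</a>";
        URI base = URI.create("http://example.com/index.html");
        System.out.println(extractLinks(html, base));
    }
}
```

Something like this only sees links that are written literally in the markup. Links that a script builds at runtime never show up in the downloaded source, and that is exactly the part I'm struggling with.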

My questions are:
1. Can you recommend a parser for this job? I already tried JerichoParser, and it is pretty slow (on the JS part).
2. What kinds of links and technologies will I encounter (besides HTML and JS)? I need a parser that handles all of them in an efficient (and easy!) way.


Thanks in advance!
-Lucas