This is such a specific task that I would just recommend you write it yourself. The simplest thing you need is an XPath selector that gives you the h1/h2/h3 tags, such as //h1 | //h2 | //h3.
Counting the headings:
- Pick any one of your favorite programming languages (Ruby, Perl, PHP).
- Issue a web request for a page on your website.
- Parse the HTML.
- Invoke the XPath heading selector and count the number of elements that it returns.
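As a rough illustration of the steps above, here is a minimal Python sketch. Note that it swaps the XPath selector for the standard library's html.parser so it runs without extra dependencies; with a library such as lxml you would run the XPath query instead. The names HeadingCounter, count_headings and count_headings_at are made up for this example.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.request import urlopen

# Tags we treat as headings (assumption: h1 through h6).
HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingCounter(HTMLParser):
    """Counts heading start tags while the HTML is parsed."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        if tag in HEADING_TAGS:
            self.counts[tag] += 1


def count_headings(html):
    """Return a Counter such as {'h1': 1, 'h2': 4} for an HTML string."""
    parser = HeadingCounter()
    parser.feed(html)
    parser.close()
    return parser.counts


def count_headings_at(url):
    """Fetch a page over HTTP and count its headings."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return count_headings(html)
```

Calling count_headings_at with the address of one of your pages should give you the per-tag counts for that page.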
Crawling your site:
Repeat steps 2 through 4 for each of your pages (you'll probably need a queue of pages that you want to crawl). If you want to crawl every page on the site, then it will be just a little more complicated:
- Crawl your home page.
- Select all anchor tags.
- Extract the URL from each href and discard any URLs that don't point to your website.
- Perform a URL-seen test: if you have seen it before, then discard, otherwise queue for crawling.
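The crawl loop could look roughly like the following Python sketch. It assumes that "points to your website" means "has the same host as the start URL" and that URLs differing only in their #fragment are the same page; both are simplifications. The fetch argument is a hypothetical function that takes a URL and returns the page's HTML, so you can plug in whatever HTTP client you like.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href value of every anchor tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_site_links(page_url, html):
    """Return absolute http(s) links on the page that share its host."""
    parser = LinkExtractor()
    parser.feed(html)
    site = urlparse(page_url).netloc
    links = []
    for href in parser.hrefs:
        # Resolve relative links and drop any #fragment.
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        parts = urlparse(absolute)
        if parts.scheme in ("http", "https") and parts.netloc == site:
            links.append(absolute)
    return links


def crawl(start_url, fetch):
    """Breadth-first crawl; returns the URLs in the order they were visited."""
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue:
        url = queue.popleft()
        visited.append(url)
        for link in same_site_links(url, fetch(url)):
            if link not in seen:  # the URL-seen test
                seen.add(link)
                queue.append(link)
    return visited
```

A real crawler would also want error handling for failed requests and probably a delay between requests; those are left out to keep the sketch short.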
URL-Seen test:
The URL-seen test is pretty simple: just add all the URLs you've seen so far to a hash map. If you run into a URL that is in your hash map, then you can ignore it. If it's not in the hash map, then add it to the hash map and to the crawl queue. The key for the hash map should be the URL and the value should be some kind of structure that allows you to keep statistics for the headings:
Key = URL
Value = struct{ h1Count, h2Count, h3Count...}
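In Python, for instance, the hash map and its value structure might be sketched like this. The names HeadingStats and enqueue_if_new are invented for the example, and only the h1 to h3 counts are tracked here.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class HeadingStats:
    """Per-page heading statistics (the 'value' of the hash map)."""
    h1_count: int = 0
    h2_count: int = 0
    h3_count: int = 0


def enqueue_if_new(url, seen, queue):
    """URL-seen test: queue the URL only the first time it appears.

    `seen` is a dict mapping URL -> HeadingStats, `queue` is the crawl queue.
    Returns True if the URL was new, False if it had been seen before.
    """
    if url in seen:
        return False
    seen[url] = HeadingStats()  # placeholder, filled in once the page is crawled
    queue.append(url)
    return True


seen = {}        # Key = URL, Value = HeadingStats
queue = deque()  # pages waiting to be crawled

enqueue_if_new("http://example.com/", seen, queue)  # new, gets queued
enqueue_if_new("http://example.com/", seen, queue)  # already seen, ignored
```

Once a page has been parsed, you would update seen[url] with the counts you found for it.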
That should be about it. I know it seems like a lot, but it shouldn't be more than a few hundred lines of code!