With a little scripting, cleaning up documentation and other large sets of HTML files can be easy. But first you need to parse them.
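As a rough illustration of that parsing step, here is a minimal sketch using only Python's standard-library `html.parser` module. The class name `TitleAndLinkParser`, the helpers `parse_html` and `parse_tree`, and the choice to collect page titles and link targets are assumptions made for this example, not something the text prescribes; a real cleanup script would collect whatever it needs to fix.

```python
from html.parser import HTMLParser
from pathlib import Path


class TitleAndLinkParser(HTMLParser):
    """Hypothetical helper: collects the <title> text and every <a href>."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True
        elif tag == "a":
            # attrs arrives as a list of (name, value) pairs
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so accumulate it
        if self.in_title:
            self.title += data


def parse_html(text):
    """Parse one HTML document and return (title, list of hrefs)."""
    parser = TitleAndLinkParser()
    parser.feed(text)
    parser.close()
    return parser.title.strip(), parser.links


def parse_tree(root):
    """Parse every .html file under root; returns {path: (title, links)}."""
    results = {}
    for path in sorted(Path(root).rglob("*.html")):
        text = path.read_text(encoding="utf-8", errors="replace")
        results[path] = parse_html(text)
    return results
```

Calling `parse_tree("docs")` on a hypothetical `docs` directory would return one entry per HTML file, which is a convenient starting point for spotting missing titles or broken links across a large set of pages.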