

-JTidy - JTidy

-JTidy - Browse /JTidy/r938 at SourceForge.net

-Maven Repository: net.sf.jtidy >> jtidy >> r938
-->JTidy is a Java port of HTML Tidy, an HTML syntax checker and pretty printer. Like its non-Java cousin, JTidy can be used to clean up malformed and faulty HTML. In addition, JTidy provides a DOM interface to the document being processed, which lets you use JTidy as a DOM parser for real-world HTML.
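
The DOM-parser use described above can be sketched roughly as follows. This is a minimal illustration, not taken from any of the linked pages: it assumes the net.sf.jtidy:jtidy:r938 artifact is on the classpath, and the class name JTidyDomSketch, the helper extractTitle, and the sample markup are all hypothetical. It reads text through getFirstChild().getNodeValue() rather than DOM Level 3 methods, on the assumption that JTidy's DOM implementation does not fully support the latter.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.tidy.Tidy;

// Hypothetical example class; assumes JTidy (net.sf.jtidy:jtidy:r938) is on the classpath.
public class JTidyDomSketch {

    // Parses possibly malformed HTML with JTidy and returns the text of the
    // first <title> element, or null if there is none.
    static String extractTitle(String html) {
        Tidy tidy = new Tidy();
        tidy.setQuiet(true);             // suppress the summary banner
        tidy.setShowWarnings(false);     // suppress per-line warnings
        tidy.setInputEncoding("UTF-8");  // must match the bytes passed in below

        InputStream in = new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8));
        // Passing null as the output stream skips pretty-printing; only the DOM is wanted.
        Document doc = tidy.parseDOM(in, null);

        NodeList titles = doc.getElementsByTagName("title");
        if (titles.getLength() == 0) {
            return null;
        }
        Node text = titles.item(0).getFirstChild();
        return text == null ? null : text.getNodeValue().trim();
    }

    public static void main(String[] args) {
        // Sample input with unclosed <p> and <b> tags, made up for this sketch.
        String messy = "<html><head><title>Hello</title></head>"
                + "<body><p>unclosed <b>paragraph</body>";
        System.out.println(extractTitle(messy));
    }
}
```

The same Document can be walked with the usual org.w3c.dom calls (getElementsByTagName, getChildNodes, getAttributes), which is what makes JTidy usable for simple scraping tasks like those in the articles listed below.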

-[ヅ] Fetching Hatena Monolith post data with Java + JTidy + ROME (2012-07-23)

-[ヅ] Fetching trending keywords from Yahoo! Realtime Search with Java and JTidy (2015-01-14)

-Java HTML parsers and scraping