tagsoup

TagSoup is a SAX-compliant parser written in Java that, instead of parsing well-formed or valid XML, parses HTML as it is found in the wild: poor, nasty and brutish, though quite often far from short. TagSoup is designed for people who have to process this stuff using some semblance of a rational application design. By providing a SAX interface, it allows standard XML tools to be applied to even the worst HTML. TagSoup also includes a command-line processor that reads HTML files and can generate either clean HTML or well-formed XML that is a close approximation to XHTML.
Homepage: http://home.ccil.org/~cowan/XML/tagsoup/
License: Apache License 2.0
Author: John Cowan
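Since the description says TagSoup exposes a SAX interface, handler code can be written against the standard `org.xml.sax.XMLReader` type and stay independent of the parser behind it. The sketch below is illustrative only: the class name `ElementLister` and the method `elementNames` are hypothetical, and it uses the JDK's built-in reader (which requires well-formed input) so that it compiles without TagSoup on the classpath.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: not part of TagSoup, shown only to illustrate SAX usage.
public class ElementLister {

    /** Collects the qualified name of every start tag the given SAX reader reports. */
    static List<String> elementNames(XMLReader reader, String markup) throws Exception {
        List<String> names = new ArrayList<>();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                names.add(qName);
            }
        });
        reader.parse(new InputSource(new StringReader(markup)));
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Assumption: the JDK's default reader stands in here; it only accepts
        // well-formed XML, unlike a reader designed for messy HTML.
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        System.out.println(elementNames(reader, "<html><body><p>Hi</p></body></html>"));
    }
}
```

With the TagSoup jar available, the same method could be given `new org.ccil.cowan.tagsoup.Parser()`, which implements `XMLReader`, in place of the JDK reader; the handler code would not need to change.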
Files
tagsoup-1.2.1.jar
tagsoup-1.2.1.pom
tagsoup-1.2.1-sources.jar
Apache Maven
<dependency>
  <groupId>org.ccil.cowan.tagsoup</groupId>
  <artifactId>tagsoup</artifactId>
  <version>1.2.1</version>
</dependency>
Gradle Groovy
implementation 'org.ccil.cowan.tagsoup:tagsoup:1.2.1'
Gradle Kotlin
implementation("org.ccil.cowan.tagsoup:tagsoup:1.2.1")
Scala SBT
libraryDependencies += "org.ccil.cowan.tagsoup" % "tagsoup" % "1.2.1"
Groovy Grape
@Grapes(
  @Grab(group='org.ccil.cowan.tagsoup', module='tagsoup', version='1.2.1')
)
Apache Ivy
<dependency org="org.ccil.cowan.tagsoup" name="tagsoup" rev="1.2.1" />
Leiningen
[org.ccil.cowan.tagsoup/tagsoup "1.2.1"]
Apache Buildr
'org.ccil.cowan.tagsoup:tagsoup:jar:1.2.1'
Dependencies
The project has no third-party dependencies.