

Information extraction is the process of identifying specified classes of entities, relations, and events in natural language text, creating structured data from unstructured input. JET, the Java Extraction Toolkit, developed at New York University over the past fifteen years, provides a rich set of tools for research and education in information extraction from English text. These include standard language processing tools such as a tokenizer, sentence segmenter, part-of-speech tagger, name tagger, regular-expression pattern matcher, and dependency parser. Also provided are relation and event extractors based on the specifications of the U.S. Government's ACE (Automatic Content Extraction) program. The toolkit is distributed under the Apache 2.0 license.
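To make "structured data from unstructured input" concrete, here is a minimal, self-contained sketch of pattern-based entity extraction in plain Java. It does not use JET's API. The class name `ExtractionSketch`, the `Mention` record, and both regular expressions are hypothetical and chosen only for illustration. A real name tagger relies on trained models rather than two hand-written patterns.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExtractionSketch {

    /** One structured record: entity type, matched text, and character offsets. */
    public record Mention(String type, String text, int start, int end) {}

    // Toy patterns (assumptions for this sketch, not JET's rules):
    // a title followed by one or more capitalized words, and a four-digit year.
    private static final Pattern PERSON =
            Pattern.compile("\\b(?:Mr|Ms|Dr)\\. [A-Z][a-z]+(?: [A-Z][a-z]+)*");
    private static final Pattern YEAR = Pattern.compile("\\b\\d{4}\\b");

    /** Returns all mentions found in the text, ordered by start offset. */
    public static List<Mention> extract(String text) {
        List<Mention> mentions = new ArrayList<>();
        collect(PERSON, "PERSON", text, mentions);
        collect(YEAR, "YEAR", text, mentions);
        mentions.sort(Comparator.comparingInt(Mention::start));
        return mentions;
    }

    // Scans the text with one pattern and appends a record per match.
    private static void collect(Pattern pattern, String type, String text, List<Mention> out) {
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            out.add(new Mention(type, m.group(), m.start(), m.end()));
        }
    }

    public static void main(String[] args) {
        String text = "Dr. Ada Lovelace met Mr. Babbage in 1833.";
        for (Mention mention : extract(text)) {
            System.out.println(mention);
        }
    }
}
```

Each `Mention` keeps character offsets into the original text, so later stages such as relation or event extraction can refer back to the source span rather than to a detached string.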
Apache License, Version 2.0
New York University
Ralph Grishman, Yifan He, Angus Grieve-Smith
Files
jet-1.9.0.jar
jet-1.9.0.pom
jet-1.9.0-sources.jar
Apache Maven
<dependency>
    <groupId>edu.nyu</groupId>
    <artifactId>jet</artifactId>
    <version>1.9.0</version>
</dependency>
Gradle Groovy
implementation 'edu.nyu:jet:1.9.0'
Gradle Kotlin
implementation("edu.nyu:jet:1.9.0")
Scala SBT
libraryDependencies += "edu.nyu" % "jet" % "1.9.0"
Groovy Grape
@Grab(group='edu.nyu', module='jet', version='1.9.0')
Apache Ivy
<dependency org="edu.nyu" name="jet" rev="1.9.0" />
Leiningen
[edu.nyu/jet "1.9.0"]
Apache Buildr
'edu.nyu:jet:jar:1.9.0'