org.dbpedia.extraction.mappings

AbstractExtractor

class AbstractExtractor extends WikiPageExtractor

Extracts wiki text, such as abstracts or sections, as HTML. NOTE: This class is used not only for abstract extraction but also for extracting the wiki text of the whole page. The NifAbstract Extractor extends this class. All configuration now lives in //extraction-framework/core/src/main/resources/mediawikiconfig.json; change the 'publicParams' entries to tweak endpoint and time parameters.

From now on we use MobileFrontend for MW <2.21 and TextExtracts for MW > 2.22. The patched MW instance is no longer needed, except for minor customizations in LocalSettings.php. TextExtracts now uses the article entry and extracts the abstract. The rationale for the new extension is that we no longer need to load all articles into MySQL, just the templates. At the moment, setting up the patched MW takes longer than loading all articles into MySQL :) so even this way it is better and cleaner ;) We leave the old code commented out since we might re-use it soon.

Annotations
@deprecated @ExtractorAnnotation( name = "abstract extractor" )
Deprecated

(Since version 2016-10) replaced by NifExtractor.scala, which extracts the whole page content, including the abstract

Linear Supertypes
WikiPageExtractor, Extractor[WikiPage], Serializable, AnyRef, Any

Instance Constructors

  1. new AbstractExtractor(context: AnyRef { ... /* 3 definitions in type refinement */ })

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. val apiParametersFormat: String

    Attributes
    protected
  5. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  6. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  7. val datasets: Set[Dataset]

    Datasets generated by this extractor. Used for serialization. If a mapping implementation does not return all datasets it produces, serialization may fail.

    Definition Classes
    AbstractExtractor → Extractor
  8. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  9. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  10. def extract(pageNode: WikiPage, subjectUri: String): Seq[Quad]

    subjectUri

    The subject URI of the generated triples

    returns

    A graph holding the extracted data

    Definition Classes
    AbstractExtractor → Extractor
  11. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  12. def finalizeExtractor(): Unit

    when extractor needs some finalization

    Definition Classes
    Extractor
  13. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  14. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  15. def initializeExtractor(): Unit

    when extractor has a pre-phase

    Definition Classes
    Extractor
  16. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  17. val language: String

    protected params ...

    Attributes
    protected
  18. val logger: Logger

    Attributes
    protected
  19. lazy val longProperty: OntologyProperty

    Attributes
    protected
  20. lazy val longQuad: (String, String, String) ⇒ Quad

    Attributes
    protected
  21. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  22. final def notify(): Unit

    Definition Classes
    AnyRef
  23. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  24. def short(text: String, max: Int = 500): String

    Returns the leading sentences of the given text that together have fewer than max characters (500 by default). A sentence ends with a dot followed by whitespace. TODO: probably doesn't work for most non-European languages. TODO: analyse ActiveAbstractExtractor; this seems to work quite well there because it takes the first two or three sentences.

    text
    max

    max length

    returns

    result string

  25. lazy val shortProperty: OntologyProperty

    Attributes
    protected
  26. lazy val shortQuad: (String, String, String) ⇒ Quad

    Attributes
    protected
  27. var state: ExtractorState.Value

    Definition Classes
    Extractor
  28. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  29. def toString(): String

    Definition Classes
    AnyRef → Any
  30. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  31. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  32. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
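
The sentence-based truncation described for short(text, max) above can be illustrated with a minimal sketch. This is not the framework's implementation: the object name ShortAbstractSketch is hypothetical, and the boundary handling (strictly fewer than max characters, and returning an over-long first sentence whole instead of cutting it) is an assumption based only on the description "a sentence ends with a dot followed by whitespace".

```scala
// Hypothetical stand-alone sketch; names and edge-case behaviour are assumptions.
object ShortAbstractSketch {

  def short(text: String, max: Int = 500): String =
    if (text.length < max) {
      // Text that already fits is returned unchanged.
      text
    } else {
      // Split after every "dot + whitespace"; the lookbehind keeps the
      // delimiter attached to the sentence it terminates.
      val sentences = text.split("""(?<=\.\s)""").toList

      // Combined length of the text after each sentence has been appended.
      val runningLengths = sentences.scanLeft(0)(_ + _.length).tail

      // Keep leading sentences while the combined length stays below max.
      val kept = sentences
        .zip(runningLengths)
        .takeWhile { case (_, total) => total < max }
        .map(_._1)

      // Assumption: if even the first sentence is too long, return it whole
      // rather than cutting in the middle of a sentence.
      if (kept.isEmpty) sentences.headOption.getOrElse("").trim
      else kept.mkString.trim
    }
}
```

For example, ShortAbstractSketch.short("One. Two. Three.", 11) keeps the first two sentences, because adding the third would push the combined length past the limit. As the TODO in the member description notes, splitting on a dot followed by whitespace is unlikely to suit languages that mark sentence ends differently.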
