package org.dbpedia.extraction.wikiparser

Type Members

  1. final class AnnotationKey[T] extends AnyRef

    A marker class whose purpose is to add compile-time type safety to node annotations. Replaces annotation key names. Usage is similar to annotation key names, but annotation key names are distinguished by equality, while AnnotationKey objects are distinguished by identity. Thus, this class does not override hashCode and equals.

    Note: we could add a name to objects of this class, which would probably be nice for debugging, but could lead to confusion: one might think that the name is the actual key, but it isn't, and cannot be, because we cannot associate the name with the type of the annotation. At run-time, the type is gone because of erasure. Two annotation key objects with the same name would still be completely unrelated keys.

    TODO: as a debugging aid, call Thread.currentThread.getStackTrace in the constructor and save the class name, method name and line number of the place where the annotation key was created. This is probably the second element in the stack trace array, but that's not guaranteed.
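
    The identity-based lookup described above can be sketched as follows. This is a minimal, self-contained illustration and not the DBpedia implementation: AnnotationKey here is a stand-in with the same shape as the class documented above, while AnnotationStore, set and get are hypothetical names chosen for this example.

    ```scala
    import scala.collection.mutable

    // Stand-in for the marker class: no fields and no equals/hashCode override,
    // so two keys are the same key only if they are the same object.
    final class AnnotationKey[T]

    // Hypothetical holder for annotations; the names are assumptions of this sketch.
    final class AnnotationStore {
      private val values = mutable.HashMap.empty[AnnotationKey[_], Any]

      def set[T](key: AnnotationKey[T], value: T): Unit =
        values.update(key, value)

      // The cast is safe because set() only stores a T under an AnnotationKey[T].
      def get[T](key: AnnotationKey[T]): Option[T] =
        values.get(key).map(_.asInstanceOf[T])
    }

    object AnnotationKeyDemo {
      val depth = new AnnotationKey[Int]
      val alsoInt = new AnnotationKey[Int] // same type parameter, unrelated key

      def main(args: Array[String]): Unit = {
        val store = new AnnotationStore
        store.set(depth, 3)
        println(store.get(depth))   // the value stored under `depth`
        println(store.get(alsoInt)) // nothing was stored under this key
        // store.set(depth, "three") // would not compile: a String is not an Int
      }
    }
    ```

    The type parameter exists only at compile time, which is why the store above keeps values as Any and relies on the key's static type when reading them back.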

  2. case class ExternalLinkNode(destination: IRI, children: List[Node], line: Int, destinationNodes: List[Node] = List[Node]()) extends LinkNode with Product with Serializable

    Represents an external Link. The children of this node represent the label of the link. If the source does not define a label explicitly, a TextNode containing the link destination will be the only child.

    destination

    The destination IRI of this link

    children

    The nodes of the label of this link

    line

    The source line number of this link

  3. case class InterWikiLinkNode(destination: WikiTitle, children: List[Node], line: Int, destinationNodes: List[Node] = List[Node]()) extends WikiLinkNode with Product with Serializable

    Represents an InterWiki Link. The children of this node represent the label of the link. If the source does not define a label explicitly, a TextNode containing the link destination will be the only child.

    destination

    The destination WikiTitle of this link

    children

    The nodes of the label of this link

    line

    The source line number of this link

  4. case class InternalLinkNode(destination: WikiTitle, children: List[Node], line: Int, destinationNodes: List[Node] = List[Node]()) extends WikiLinkNode with Product with Serializable

    Represents an internal Link. The children of this node represent the label of the link. If the source does not define a label explicitly, a TextNode containing the link destination will be the only child.

    destination

    The destination WikiTitle of this link

    children

    The nodes of the label of this link

    line

    The source line number of this link

  5. class JsonNode extends Node

  6. sealed abstract class LinkNode extends Node

    Represents a Link. This class is abstract and is extended by ExternalLinkNode and WikiLinkNode, the latter being the base class of InternalLinkNode and InterWikiLinkNode. The children of this node represent the label of the link. If the source does not define a label explicitly, a TextNode containing the link destination will be the only child.
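
    A rough sketch of how the sealed link hierarchy can be consumed with pattern matching. The classes below are simplified stand-ins, not the real ones: destinations are plain Strings here (the documented classes use IRI and WikiTitle), the destinationNodes parameter is left out, and labelText and kind are hypothetical helpers.

    ```scala
    // Simplified stand-ins for the node classes documented on this page.
    sealed abstract class Node
    final case class TextNode(text: String, line: Int) extends Node

    sealed abstract class LinkNode extends Node { def children: List[Node] }
    sealed abstract class WikiLinkNode extends LinkNode
    final case class ExternalLinkNode(destination: String, children: List[Node], line: Int) extends LinkNode
    final case class InternalLinkNode(destination: String, children: List[Node], line: Int) extends WikiLinkNode
    final case class InterWikiLinkNode(destination: String, children: List[Node], line: Int) extends WikiLinkNode

    object LinkSketch {
      // Hypothetical helper: concatenates the TextNode children that form the label.
      def labelText(link: LinkNode): String =
        link.children.collect { case TextNode(text, _) => text }.mkString

      // Because the hierarchy is sealed, the compiler can check this match for exhaustiveness.
      def kind(link: LinkNode): String = link match {
        case _: ExternalLinkNode  => "external"
        case _: InternalLinkNode  => "internal"
        case _: InterWikiLinkNode => "interwiki"
      }

      def main(args: Array[String]): Unit = {
        // A link without an explicit label: the destination text is the only child.
        val bare = ExternalLinkNode("http://example.org", List(TextNode("http://example.org", 1)), 1)
        println(kind(bare) + ": " + labelText(bare))
      }
    }
    ```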

  7. class Namespace extends Serializable

    Namespace codes.

    FIXME: This object should not exist. We must load the namespaces for a MediaWiki instance from its api.php or from a file. These values must then be injected into all objects that need them.

    FIXME: separate Wikipedia and DBpedia namespaces. We cannot even be sure that there are no name clashes. "Mapping ko" may mean "Template talk" in some language...

  8. abstract class Node extends AnyRef

    Base class of all nodes in the abstract syntax tree.

    This class is NOT thread-safe.

  9. class PageNode extends Node

    Represents a page.

  10. case class ParserFunctionNode(title: String, children: List[Node], line: Int) extends Node with Product with Serializable

    Represents a parser function.

    title

    The title of the page where this parser function is defined

    children

    The properties of this parser function

    line

    The source line number of this parser function

  11. case class PropertyNode(key: String, children: List[Node], line: Int) extends Node with Product with Serializable

    Represents a template property.

    key

    The key by which this property is identified in the template.

    children

    The contents of the value of this property

    line

    The source line number of this property

  12. case class SectionNode(name: String, level: Int, children: List[Node], line: Int) extends Node with Product with Serializable

    Represents a section.

    name

    The name of this section

    level

    The level of this section. This corresponds to the number of '=' characters on each side of the section heading in the WikiText source

    children

    The nodes of the section name

    line

    The source line number of this section
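
    To illustrate the relation between level and the '=' markers, here is a small sketch that derives a name and a level from a heading line. It is not how the DBpedia parser is implemented: parseHeading is a hypothetical helper, and it only accepts headings with the same number of '=' on both sides.

    ```scala
    object SectionLevelSketch {
      // Hypothetical helper, not part of the wikiparser package: splits a heading
      // line such as "== History ==" into (name, level), where level is the
      // number of leading '=' characters. Unbalanced headings are rejected.
      def parseHeading(line: String): Option[(String, Int)] = {
        val trimmed = line.trim
        val level = trimmed.takeWhile(_ == '=').length
        val closing = trimmed.reverse.takeWhile(_ == '=').length
        if (level == 0 || closing != level || trimmed.length <= 2 * level) None
        else Some((trimmed.substring(level, trimmed.length - level).trim, level))
      }

      def main(args: Array[String]): Unit = {
        println(parseHeading("== History =="))      // a level 2 heading
        println(parseHeading("=== Early life ===")) // a level 3 heading
        println(parseHeading("plain text"))         // not a heading
      }
    }
    ```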

  13. case class TableCellNode(children: List[Node], line: Int, rowSpan: Int, colSpan: Int) extends Node with Product with Serializable

    Represents a table cell

    children

    The contents of this cell

    line

    The (first) line where this table cell is located in the source

  14. case class TableNode(caption: Option[String], children: List[TableRowNode], line: Int) extends Node with Product with Serializable

    Represents a table.

    The rows are represented as child nodes. Each row itself contains a child node for each of its cells.

    caption

    The caption of this table

    children

    The rows of this table

    line

    The (first) line where this table is located in the source

  15. case class TableRowNode(children: List[TableCellNode], line: Int) extends Node with Product with Serializable

    Represents a table row

    children

    The cells of this table row.

    line

    The (first) line where this table row is located in the source
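
    The table, row and cell nesting described in the three entries above can be sketched like this. The case classes are simplified stand-ins that copy the documented constructor signatures; width and cellText are hypothetical helpers, and reading colSpan as the number of grid columns a cell occupies is an assumption based on the parameter name.

    ```scala
    // Simplified stand-ins mirroring the documented constructor signatures.
    sealed abstract class Node
    final case class TextNode(text: String, line: Int) extends Node
    final case class TableCellNode(children: List[Node], line: Int, rowSpan: Int, colSpan: Int) extends Node
    final case class TableRowNode(children: List[TableCellNode], line: Int) extends Node
    final case class TableNode(caption: Option[String], children: List[TableRowNode], line: Int) extends Node

    object TableSketch {
      // Hypothetical helper: number of grid columns a row occupies, counting colSpan.
      def width(row: TableRowNode): Int = row.children.map(_.colSpan).sum

      // Hypothetical helper: plain text of a cell, ignoring non-text children.
      def cellText(cell: TableCellNode): String =
        cell.children.collect { case TextNode(text, _) => text }.mkString

      def main(args: Array[String]): Unit = {
        val header = TableRowNode(List(TableCellNode(List(TextNode("Name", 2)), 2, 1, 2)), 2)
        val body = TableRowNode(List(
          TableCellNode(List(TextNode("Ada", 4)), 4, 1, 1),
          TableCellNode(List(TextNode("Lovelace", 5)), 5, 1, 1)), 4)
        val table = TableNode(Some("People"), List(header, body), 1)

        // Walk rows, then cells, as the parent/child structure suggests.
        table.children.foreach { row =>
          println(row.children.map(cellText).mkString(" | "))
        }
      }
    }
    ```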

  16. case class TemplateNode(title: WikiTitle, children: List[PropertyNode], line: Int, titleParsed: List[Node] = List()) extends Node with Product with Serializable

    Represents a template.

    title

    The title of the page where this template is defined

    children

    The properties of this template

    line

    The source line number of this template
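
    A short sketch of how a template and its properties fit together. The classes are simplified stand-ins (the title is a plain String instead of a WikiTitle, and titleParsed is omitted), and findProperty and valueText are hypothetical helpers introduced for this example only.

    ```scala
    // Simplified stand-ins for the documented node classes.
    sealed abstract class Node
    final case class TextNode(text: String, line: Int) extends Node
    final case class PropertyNode(key: String, children: List[Node], line: Int) extends Node
    final case class TemplateNode(title: String, children: List[PropertyNode], line: Int) extends Node

    object TemplateSketch {
      // Hypothetical helper: looks up a property of the template by its key.
      def findProperty(template: TemplateNode, key: String): Option[PropertyNode] =
        template.children.find(_.key == key)

      // Hypothetical helper: the trimmed text content of a property value.
      def valueText(property: PropertyNode): String =
        property.children.collect { case TextNode(text, _) => text }.mkString.trim

      def main(args: Array[String]): Unit = {
        val infobox = TemplateNode("Infobox person", List(
          PropertyNode("name", List(TextNode(" Ada Lovelace ", 2)), 2),
          PropertyNode("born", List(TextNode(" 1815 ", 3)), 3)), 1)

        println(findProperty(infobox, "born").map(valueText))
        println(findProperty(infobox, "died").map(valueText))
      }
    }
    ```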

  17. case class TemplateParameterNode(parameter: String, children: List[Node], line: Int) extends Node with Product with Serializable

    Represents a template parameter.

    parameter

    The key by which this property is identified in the template.

    children

    The contents of the value of this property

    line

    The source line number of this property

  18. case class TextNode(text: String, line: Int, lang: Language = null) extends Node with Product with Serializable

    Represents plain text.

    text

    The text

    line

    The source line number where this text begins

  19. sealed abstract class WikiLinkNode extends LinkNode

  20. class WikiPage extends Serializable

    Represents a wiki page

    TODO: use redirect id to check redirect extractor. Or get rid of redirect extractor.

  21. trait WikiParser extends (WikiPage) ⇒ Option[PageNode]

    Parses WikiText source and builds an Abstract Syntax Tree. Create new instances of this trait by using the companion object.
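
    Since the trait extends a function type, a parser can be applied like any other function. The sketch below shows that shape with simplified stand-ins: WikiPage and PageNode are reduced to two fields each, and the factory method simple() is a hypothetical name, because the real companion object's methods are not listed on this page.

    ```scala
    // Simplified stand-ins; the real WikiPage and PageNode carry more data.
    final case class WikiPage(title: String, source: String)
    final case class PageNode(title: String, lines: List[String])

    // Same shape as the documented trait: a function from a page to an optional tree.
    trait WikiParser extends (WikiPage => Option[PageNode])

    object WikiParser {
      // Hypothetical factory. This toy parser yields None for an empty page and
      // otherwise splits the source into lines instead of building a real tree.
      def simple(): WikiParser = new WikiParser {
        def apply(page: WikiPage): Option[PageNode] =
          if (page.source.trim.isEmpty) None
          else Some(PageNode(page.title, page.source.split("\n").toList))
      }
    }

    object WikiParserDemo {
      def main(args: Array[String]): Unit = {
        val parser = WikiParser.simple()
        val pages = List(WikiPage("A", "first\nsecond"), WikiPage("B", ""))
        // Because a parser is a Function1, it can be passed directly to map.
        val trees = pages.map(parser).flatten
        println(trees.map(_.title))
      }
    }
    ```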

  22. class WikiParserException extends Exception

    Thrown whenever a parsing error is encountered.

  23. class WikiTitle extends Serializable

    Represents a page title or a link to a page.

    FIXME: a link is different from a title and should be represented by a different class.

Value Members

  1. object Namespace extends Serializable

  2. object Node

  3. object NodeUtil

    Utility functions for working with nodes.

  4. object TemplateNode extends Serializable

  5. object WikiPage extends Serializable

  6. object WikiParser

    Creates new WikiParser instances.

  7. object WikiTitle extends Serializable

  8. package impl
