org.dbpedia.extraction.config

Config

class Config extends Properties with Serializable

Documentation of config values

TODO universal.properties is loaded first and then overwritten by the job-specific config. We are working on removing universal.properties by setting (and documenting) sensible default values here that CAN be overwritten in the job-specific config.
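
The layering described above (universal values first, job-specific values on top) can be sketched with the defaults mechanism of java.util.Properties. This is only an illustration of the idea, not how Config itself loads its files; the object name, the helper `parse`, the key `parallel-processes` and all values are assumptions made for the example.

```scala
import java.io.StringReader
import java.util.Properties

object LayeredConfigSketch {
  /** Parses properties text; lookups for missing keys fall back to `defaults`, if given. */
  def parse(text: String, defaults: Option[Properties] = None): Properties = {
    val props = defaults.map(d => new Properties(d)).getOrElse(new Properties())
    props.load(new StringReader(text))
    props
  }

  // Hypothetical stand-in for universal.properties
  val universal: Properties = parse("base-dir=./wikidumps\nparallel-processes=4\n")

  // Hypothetical job-specific config overriding a single value
  val job: Properties = parse("parallel-processes=8\n", Some(universal))
}
```

With this setup, `job.getProperty("parallel-processes")` yields the job-specific value, while `job.getProperty("base-dir")` falls through to the universal one. Note that only `getProperty` consults the defaults; the inherited Hashtable `get` does not.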

Guideline:

  * Use Java/Scaladoc always
  * Parameters (lazy val) MUST be documented in the following manner:
    1. provide info about how the parameter works
    2. describe all checks done, e.g. fail on null or >300
    3. state the default value
  * Parameters MUST follow this pattern:
    1. if the config param is called "base-dir" then the param MUST be "baseDir"
    2. i.e. replace "-" with camelCase, since "-" cannot be used in a plain Scala identifier
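
The naming rule ("base-dir" becomes "baseDir") can be expressed as a small helper. This is a minimal sketch; `ParamNames` and `toCamelCase` are hypothetical names that are not part of Config or ConfigUtils.

```scala
object ParamNames {
  /** Converts a hyphenated config key such as "base-dir" to camelCase ("baseDir"). */
  def toCamelCase(key: String): String = {
    // Split on hyphens and drop empty fragments (e.g. from a leading "-")
    val parts = key.split("-").filter(_.nonEmpty)
    if (parts.isEmpty) ""
    else parts.head + parts.tail.map(_.capitalize).mkString
  }
}
```

A helper like this could be used in a test that checks every documented config key against the name of its lazy val.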

TODO @Fabian please:

  * go through universal properties and other configs and move all comments here
  * after removing, place a comment in the property file referring to
  * set default values according to universal.properties
  * try to FOLLOW THE GUIDELINES above, add TODO if unclear
  * if possible, move all def functions to ConfigUtils
  * check the classes using the params for validation checks and move them here

Linear Supertypes
Properties, Hashtable[AnyRef, AnyRef], Serializable, Cloneable, Map[AnyRef, AnyRef], Dictionary[AnyRef, AnyRef], AnyRef, Any

Instance Constructors

  1. new Config(configPath: String)

Type Members

  1. class LineReader extends AnyRef

    Attributes
    private[java.util]
    Definition Classes
    Properties

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. lazy val abstractParameters: AbstractParameters

  5. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  6. def clear(): Unit

    Definition Classes
    Hashtable → Map
  7. def clone(): AnyRef

    Definition Classes
    Hashtable → AnyRef
  8. def compute(arg0: AnyRef, arg1: BiFunction[_ >: AnyRef, _ >: AnyRef, _ <: AnyRef]): AnyRef

    Definition Classes
    Hashtable → Map
  9. def computeIfAbsent(arg0: AnyRef, arg1: Function[_ >: AnyRef, _ <: AnyRef]): AnyRef

    Definition Classes
    Hashtable → Map
  10. def computeIfPresent(arg0: AnyRef, arg1: BiFunction[_ >: AnyRef, _ >: AnyRef, _ <: AnyRef]): AnyRef

    Definition Classes
    Hashtable → Map
  11. val configPath: String

  12. def contains(arg0: Any): Boolean

    Definition Classes
    Hashtable
  13. def containsKey(arg0: Any): Boolean

    Definition Classes
    Hashtable → Map
  14. def containsValue(arg0: Any): Boolean

    Definition Classes
    Hashtable → Map
  15. lazy val copyrightCheck: Boolean

  16. lazy val datasetnameExtension: Option[String]

    Instead of a defined output dataset name, one can specify a name extension that is appended to the end of the input dataset name (e.g. '-transitive' -> instance-types-transitive)
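
A minimal sketch of how an output name might be derived from these options. The precedence (an explicit output dataset name wins over the extension) is an assumption, and `OutputNames.resolve` is a hypothetical helper, not part of Config.

```scala
object OutputNames {
  /** Picks the explicit output name if present, else the input name plus the extension. */
  def resolve(input: String,
              outputDataset: Option[String],
              extension: Option[String]): Option[String] =
    outputDataset.orElse(extension.map(ext => input + ext))
}
```

For the example from the description, `OutputNames.resolve("instance-types", None, Some("-transitive"))` gives `Some("instance-types-transitive")`.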

  17. lazy val dbPediaVersion: String

    The version string of the DBpedia version being extracted

  18. lazy val disambiguations: String

  19. lazy val dumpDir: File

    base-dir gives either an absolute or a relative path to where all data is stored. Normally wiki dumps are downloaded here and extracted data is saved next to them; the created folder structure is {{lang}}wiki/$date

    DEV NOTE:
    1. this must stay lazy, as it might not be used or creatable in the SPARK extraction
    2. Download.scala in core does the creation

    DEFAULT ./wikidumps

    TODO rename dumpDir to baseDir
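
The folder structure {{lang}}wiki/$date below the base directory can be illustrated as follows. This is a sketch only: `DumpLayout.dateDir` is a hypothetical helper, and the date string used in the usage note is an assumed format.

```scala
import java.io.File

object DumpLayout {
  /** Builds <baseDir>/<wikiCode>wiki/<date> without touching the file system. */
  def dateDir(baseDir: File, wikiCode: String, date: String): File =
    new File(new File(baseDir, wikiCode + "wiki"), date)
}
```

For example, `DumpLayout.dateDir(new File("./wikidumps"), "en", "20240101")` points at the `enwiki/20240101` folder under the default base directory.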

  20. def elements(): Enumeration[AnyRef]

    Definition Classes
    Hashtable → Dictionary
  21. def entrySet(): Set[Entry[AnyRef, AnyRef]]

    Definition Classes
    Hashtable → Map
  22. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  23. def equals(arg0: Any): Boolean

    Definition Classes
    Hashtable → Map → AnyRef → Any
  24. lazy val extractorClasses: Map[Language, Seq[Class[_ <: Extractor[_]]]]

    the extractor classes to be used when extracting the XML dumps

  25. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  26. def forEach(arg0: BiConsumer[_ >: AnyRef, _ >: AnyRef]): Unit

    Definition Classes
    Hashtable → Map
  27. lazy val formats: Map[String, Formatter]

  28. def get(arg0: Any): AnyRef

    Definition Classes
    Hashtable → Map → Dictionary
  29. def getArbitraryStringProperty(key: String): Option[String]

  30. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  31. def getDefaultExtractionRecorder[T](lang: Language, interval: Int = 100000, preamble: String = null, writer: Writer = null, datasets: ListBuffer[Dataset] = ListBuffer[Dataset](), monitor: ExtractionMonitor = null): ExtractionRecorder[T]

  32. def getOrDefault(arg0: Any, arg1: AnyRef): AnyRef

    Definition Classes
    Hashtable → Map
  33. def getProperty(arg0: String, arg1: String): String

    Definition Classes
    Properties
  34. def getProperty(arg0: String): String

    Definition Classes
    Properties
  35. def hashCode(): Int

    Definition Classes
    Hashtable → Map → AnyRef → Any
  36. lazy val inputDatasets: Seq[String]

    A sequence of input dataset names (e.g. 'instance-types' or 'mappingbased-literals'), separated by ','
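
Splitting such a comma-separated value could look like the sketch below. Trimming whitespace and dropping empty entries are assumptions about the desired behaviour; `DatasetList.parse` is a hypothetical helper.

```scala
object DatasetList {
  /** Splits a comma-separated property value into trimmed, non-empty dataset names. */
  def parse(raw: String): Seq[String] =
    raw.split(",").toSeq.map(_.trim).filter(_.nonEmpty)
}
```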

  37. lazy val inputSuffix: Option[String]

    The suffix of the files representing the input dataset (usually a combination of the RDF serialization extension and the compression used, e.g. .ttl.bz2 when using Turtle triples compressed with bzip2)

  38. def isDownloadComplete(lang: Language): Boolean

    Checks whether the download has to be complete and, if so, looks for the download-complete file

    lang

    - the language for which to check

    returns
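
The two-step check can be sketched as below. This is not the actual implementation: returning true when completeness is not required, the directory argument, and the marker file name passed as a default parameter are all assumptions made for illustration.

```scala
import java.io.File

object DownloadCheck {
  /** True if completeness is not required, or if the marker file exists in `dateDir`. */
  def isDownloadComplete(dateDir: File,
                         requireComplete: Boolean,
                         markerName: String = "download-complete"): Boolean =
    !requireComplete || new File(dateDir, markerName).exists()
}
```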

  39. def isEmpty(): Boolean

    Definition Classes
    Hashtable → Map → Dictionary
  40. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  41. def keySet(): Set[AnyRef]

    Definition Classes
    Hashtable → Map
  42. def keys(): Enumeration[AnyRef]

    Definition Classes
    Hashtable → Dictionary
  43. lazy val languages: Array[Language]

    An array of languages, specified by an exact enumeration of language wiki codes (e.g. en,de,fr...), by article count ranges ('10000-20000' or '10000-' -> all wiki languages having that many articles...), by '@mappings' or '@chapters' when only mapping/chapter languages are of concern, by '@downloaded' if all downloaded languages are to be processed (those containing the download.complete file), or by '@abstracts' to process only languages which provide human-readable abstracts (thus not 'wikidata' and the like...)
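
The accepted forms (wiki codes, article count ranges, and '@' groups) can be distinguished with a small parser. The sketch below only classifies the tokens and does not resolve them to Language objects; the type names and the assumption that forms may be mixed in one comma-separated value are hypothetical.

```scala
object LanguageSpecSketch {
  sealed trait Selector
  final case class WikiCode(code: String) extends Selector
  final case class ArticleRange(min: Int, max: Option[Int]) extends Selector
  final case class Group(name: String) extends Selector

  // Matches '10000-20000' as well as the open-ended '10000-'
  private val Range = """(\d+)-(\d*)""".r

  /** Classifies each comma-separated token of a languages value. */
  def parse(raw: String): Seq[Selector] =
    raw.split(",").toSeq.map(_.trim).filter(_.nonEmpty).map {
      case Range(min, max) =>
        ArticleRange(min.toInt, if (max.isEmpty) None else Some(max.toInt))
      case group if group.startsWith("@") => Group(group.drop(1))
      case code => WikiCode(code)
    }
}
```

For instance, `LanguageSpecSketch.parse("10000-")` yields a single `ArticleRange(10000, None)`, and `parse("@mappings")` yields `Group("mappings")`.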

  44. def list(arg0: PrintWriter): Unit

    Definition Classes
    Properties
  45. def list(arg0: PrintStream): Unit

    Definition Classes
    Properties
  46. def load(arg0: InputStream): Unit

    Definition Classes
    Properties
    Annotations
    @throws( classOf[java.io.IOException] )
  47. def load(arg0: Reader): Unit

    Definition Classes
    Properties
    Annotations
    @throws( classOf[java.io.IOException] )
  48. def loadFromXML(arg0: InputStream): Unit

    Definition Classes
    Properties
    Annotations
    @throws( ... ) @throws( classOf[java.io.IOException] )
  49. lazy val logDir: Option[File]

    The directory where all log files will be stored

  50. lazy val mappingsDir: File

    Local mappings files, downloaded for speed and reproducibility. Note: This is lazy to defer initialization until actually called (e.g. this class is not used directly in the distributed extraction framework: DistConfig.ExtractionConfig extends Config and overrides this val to null because it is not needed)

  51. lazy val mediawikiConnection: MediaWikiConnection

  52. def merge(arg0: AnyRef, arg1: AnyRef, arg2: BiFunction[_ >: AnyRef, _ >: AnyRef, _ <: AnyRef]): AnyRef

    Definition Classes
    Hashtable → Map
  53. lazy val namespaces: Set[Namespace]

    The namespaces to be loaded, defined by the languages in use (see languages)

  54. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  55. lazy val nifParameters: NifParameters

  56. final def notify(): Unit

    Definition Classes
    AnyRef
  57. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  58. lazy val ontologyFile: File

    Local ontology file, downloaded for speed and reproducibility. Note: This is lazy to defer initialization until actually called (e.g. this class is not used directly in the distributed extraction framework: DistConfig.ExtractionConfig extends Config and overrides this val to null because it is not needed)

  59. lazy val outputDataset: Option[String]

    A dataset name for the output file generated (e.g. 'instance-types' or 'mappingbased-literals')

  60. lazy val outputSuffix: Option[String]

    same as for inputSuffix (for the output dataset)

  61. lazy val parallelProcesses: Int

    Number of parallel processes allowed. Depends on the number of cores, type of disk and IO speed

  62. lazy val policies: Map[String, Array[Policy]]

  63. def propertyNames(): Enumeration[_]

    Definition Classes
    Properties
  64. def put(arg0: AnyRef, arg1: AnyRef): AnyRef

    Definition Classes
    Hashtable → Map → Dictionary
  65. def putAll(arg0: Map[_ <: AnyRef, _ <: AnyRef]): Unit

    Definition Classes
    Hashtable → Map
  66. def putIfAbsent(arg0: AnyRef, arg1: AnyRef): AnyRef

    Definition Classes
    Hashtable → Map
  67. def rehash(): Unit

    Attributes
    protected[java.util]
    Definition Classes
    Hashtable
  68. def remove(arg0: Any, arg1: Any): Boolean

    Definition Classes
    Hashtable → Map
  69. def remove(arg0: Any): AnyRef

    Definition Classes
    Hashtable → Map → Dictionary
  70. def replace(arg0: AnyRef, arg1: AnyRef): AnyRef

    Definition Classes
    Hashtable → Map
  71. def replace(arg0: AnyRef, arg1: AnyRef, arg2: AnyRef): Boolean

    Definition Classes
    Hashtable → Map
  72. def replaceAll(arg0: BiFunction[_ >: AnyRef, _ >: AnyRef, _ <: AnyRef]): Unit

    Definition Classes
    Hashtable → Map
  73. lazy val requireComplete: Boolean

    before processing a given language, check if the download.complete file is present

  74. lazy val retryFailedPages: Boolean

    TODO experimental, ignore for now

  75. val runJobsInParallel: Boolean

    Normally extraction jobs are run sequentially (one language after the other), but for some jobs it makes sense to run them in parallel. This should only be used if a single extraction job does not take up the available computing power.
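
The difference between the two modes can be sketched with scala.concurrent futures. This is an illustration under assumptions: `JobRunnerSketch.runAll` is hypothetical, jobs are modelled as plain functions, and the real extraction framework may schedule its jobs differently.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration

object JobRunnerSketch {
  /** Runs all jobs either one after the other or concurrently, keeping result order. */
  def runAll[A](jobs: Seq[() => A], inParallel: Boolean)
               (implicit ec: ExecutionContext): Seq[A] =
    if (inParallel) {
      // Start every job as a Future, then block until all have finished
      Await.result(Future.sequence(jobs.map(job => Future(job()))), Duration.Inf)
    } else {
      jobs.map(job => job())
    }
}
```

In both modes the results come back in the order of the submitted jobs; only the execution overlaps in the parallel case.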

  76. def setProperty(arg0: String, arg1: String): AnyRef

    Definition Classes
    Properties
  77. def size(): Int

    Definition Classes
    Hashtable → Map → Dictionary
  78. lazy val slackCredentials: Try[SlackCredentials]

    If set, extraction summaries are forwarded via the API of Slack, displaying messages on a dedicated channel. The following values are configured:

    * the URL of the Slack webhook to be used
    * the username under which all messages are posted (has to be registered for this webhook?)
    * the threshold of extracted pages over which a summary of the current extraction is posted
    * the threshold of exceptions over which an exception report is posted
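
Since the member is typed as a Try, a missing or malformed setting can surface as a Failure instead of an exception at load time. The sketch below shows that pattern only; the case class `Credentials`, the helper names and all four property keys are hypothetical and do not reflect the real SlackCredentials class or its config keys.

```scala
import java.util.Properties
import scala.util.Try

object SlackSketch {
  final case class Credentials(webhook: String,
                               username: String,
                               summaryThreshold: Int,
                               exceptionThreshold: Int)

  // Fails with an exception if the key is absent, so the enclosing Try becomes a Failure
  private def required(props: Properties, key: String): String =
    Option(props.getProperty(key))
      .getOrElse(throw new NoSuchElementException(s"missing property: $key"))

  /** Reads the four (hypothetical) Slack settings; any missing or non-numeric value gives a Failure. */
  def fromProperties(props: Properties): Try[Credentials] = Try {
    Credentials(
      required(props, "slack-webhook"),
      required(props, "slack-username"),
      required(props, "slack-summary-threshold").toInt,
      required(props, "slack-exception-threshold").toInt
    )
  }
}
```

Callers can then decide with `isSuccess` or pattern matching whether to post to Slack at all.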

  79. lazy val source: Seq[String]

    get all universal properties, check if there is an override in the provided config file

  80. lazy val sparkLocalDir: String

  81. lazy val sparkMaster: String

  82. def store(arg0: OutputStream, arg1: String): Unit

    Definition Classes
    Properties
    Annotations
    @throws( classOf[java.io.IOException] )
  83. def store(arg0: Writer, arg1: String): Unit

    Definition Classes
    Properties
    Annotations
    @throws( classOf[java.io.IOException] )
  84. def storeToXML(arg0: OutputStream, arg1: String, arg2: String): Unit

    Definition Classes
    Properties
    Annotations
    @throws( classOf[java.io.IOException] )
  85. def storeToXML(arg0: OutputStream, arg1: String): Unit

    Definition Classes
    Properties
    Annotations
    @throws( classOf[java.io.IOException] )
  86. def stringPropertyNames(): Set[String]

    Definition Classes
    Properties
  87. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  88. def throwMissingPropertyException(property: String, required: Boolean): Unit

  89. def toString(): String

    Definition Classes
    Hashtable → AnyRef → Any
  90. def values(): Collection[AnyRef]

    Definition Classes
    Hashtable → Map
  91. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  92. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  93. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  94. lazy val wikiName: String

  95. lazy val wikidataMappingsFile: File

Deprecated Value Members

  1. def save(arg0: OutputStream, arg1: String): Unit

    Definition Classes
    Properties
    Annotations
    @Deprecated @deprecated
    Deprecated

    (Since version ) see corresponding Javadoc for more information.
