All Configuration Options
Below is a table containing all of the configurable options within the Graph Writer. These can be set when configuring your stack, or on an already running Writer.
To change a value at runtime, call the endpoint /updateConfig?configEntry=<entry>&configValue=<value>, where entry is the config item as listed below and value is the new value you wish to set. Any configuration changed while the Writer is running can also be backed up and restored.
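For example, a runtime update can be issued as a plain HTTP request. The sketch below is illustrative only: it assumes the Writer's API is reachable at http://localhost:8080 and that the endpoint accepts a GET request (the text above only gives the URL form, so verify the method, host and port against your deployment).

```python
import requests

# Base URL of the running Graph Writer. This is an assumption for the
# example -- adjust the host and port to match your deployment.
WRITER_URL = "http://localhost:8080"

def update_config(entry: str, value: str) -> None:
    """Set a single config entry on a running Writer via /updateConfig."""
    response = requests.get(
        f"{WRITER_URL}/updateConfig",
        params={"configEntry": entry, "configValue": value},
    )
    response.raise_for_status()  # fail loudly if the Writer rejects the update

# Example: rename the Writer at runtime.
update_config("friendlyName", "My-Graph-Writer")
```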
Transformer Writer Configuration
| Environment Variable | Entry | Default Value | Description |
| --- | --- | --- | --- |
| FRIENDLY_NAME | `friendlyName` | `Graph-Writer` | The name you wish to give your Writer. |
| TRANSFORMER_LICENSE | `transformerLicense` | | The license key required to run the Writer. Only required when running a non-AWS-Marketplace version of the Writer. |
| GRAPH_DATABASE_ENDPOINT | `graphDatabaseEndpoint` | | The endpoint of the graph database you wish to upload your data to. |
| GRAPH_DATABASE_TYPE | `graphDatabaseType` | `sparql` | The Knowledge Graph type. Some Semantic Graphs support the default `sparql` type (e.g. AllegroGraph); however, certain graphs require a specific type declaration: `graphdb`, `stardog`, `blazegraph`, `neptune-sparql` or `rdfox`. If you are using a Property Graph, you can set a specific Graph provider (`neo4j`, `neptune-cypher` or `neptune-gremlin`) or a traversal language (`cypher` or `gremlin`). |
| GRAPH_DATABASE_REASONING | `graphDatabaseReasoning` | `false` | Whether you want reasoning enabled or disabled. This only applies to Semantic Graphs. |
| GRAPH_DATABASE_USERNAME | `graphDatabaseUsername` | | The username for your Graph Database. Leave blank if your Knowledge Graph does not require authentication. |
| GRAPH_DATABASE_PASSWORD | `graphDatabasePassword` | | The password for your Graph Database. Leave blank if your Graph Database does not require authentication. |
| CONFIG_BACKUP | `configBackup` | `file:///var/local/config-backup/` | The URL of the directory the config is backed up to when calling the upload config endpoint. |
| DELETE_SOURCE | `deleteSourceFile` | `false` | Whether to delete the source data file after it has been written to the Graph Database. |
| TRANSFORMER_RUN_STANDALONE | `runStandalone` | `true` | The Graph Writer is designed to run as part of a larger end-to-end system, with the Transformer providing the Writer with RDF or CSV files to write to a Graph Database; Kafka is used to communicate between the services. This integration is turned off by default. To run the Graph Writer with connected services, set this property to `false`. |
| INGESTION_MODE | `ingestionMode` | `insert` | How to process the ingested data. `insert`: the new data is ingested in full and does not replace existing data; the new dataset adds values to already existing subject-predicates. `update`: the new data is used to update the existing data; the new dataset replaces the values of existing subject-predicates. |

Please note that `ingestionMode` only applies to Semantic Graphs; in Property Graphs the Writer defaults to an upsert pattern, updating the properties of a given node or edge. The two Semantic Graph modes are illustrated below.
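The following sketch models subject-predicate pairs as dictionary keys to show how `insert` accumulates objects while `update` replaces them. It is a simplified model of the behaviour described above, not the Writer's actual implementation.

```python
from copy import deepcopy

# Existing graph data, keyed by (subject, predicate); values are sets of objects.
existing = {("ex:Book1", "ex:author"): {"ex:Alice"}}
incoming = {("ex:Book1", "ex:author"): {"ex:Bob"}}

def ingest(graph, new_data, mode):
    graph = deepcopy(graph)  # work on a copy so each call is independent
    for sp, objects in new_data.items():
        if mode == "insert":
            # insert: new values are added alongside the existing ones
            graph.setdefault(sp, set()).update(objects)
        else:
            # update: the values for this subject-predicate are replaced
            graph[sp] = set(objects)
    return graph

print(ingest(existing, incoming, "insert"))
# {('ex:Book1', 'ex:author'): {'ex:Alice', 'ex:Bob'}}
print(ingest(existing, incoming, "update"))
# {('ex:Book1', 'ex:author'): {'ex:Bob'}}
```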
Kafka Configuration
| Environment Variable | Entry | Default Value | Description |
| --- | --- | --- | --- |
| KAFKA_BROKERS | `kafkaBrokers` | `localhost:9092` | Tells the Writer where to find your Kafka Cluster. Set with the structure `<kafka-ip>:<kafka-port>`. The recommended port is 9092. |
| KAFKA_TOPIC_NAME_SOURCE | `topicNameSource` | `success_queue` | The topic the Consumer reads messages from, containing the URLs of the source data files to ingest. |
| KAFKA_TOPIC_NAME_DLQ | `topicNameDLQ` | `dead_letter_queue` | The topic used to push messages containing reasons for failure within the Writer. These messages are represented as JSON. |
| KAFKA_GROUP_ID_CONFIG | `groupIdConfig` | `consumerGroup1` | The identifier of the group this consumer belongs to. |
| KAFKA_AUTO_OFFSET_RESET_CONFIG | `autoOffsetResetConfig` | `earliest` | What to do when there is no initial offset in Kafka or an offset is out of range. `earliest`: automatically reset the offset to the earliest offset; `latest`: automatically reset the offset to the latest offset. |
| KAFKA_MAX_POLL_RECORDS | `maxPollRecords` | `100` | The maximum number of records returned in a single call to poll. |
| KAFKA_TIMEOUT | `timeout` | `1000` | The Kafka consumer polling timeout. |
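When running with connected services (`runStandalone` set to `false`), upstream components publish source file URLs to the source topic. The sketch below pushes a message onto the default topic using the kafka-python package; the broker address and topic name match the defaults above, while the plain-URL payload is an assumption about the message format, so check what your pipeline actually produces.

```python
from kafka import KafkaProducer

# Broker and topic match the defaults in the table above. The plain-URL
# payload is an assumption -- confirm the message format your pipeline uses.
producer = KafkaProducer(bootstrap_servers="localhost:9092")
producer.send("success_queue", value=b"file:///var/local/output/data.ttl")
producer.flush()  # block until the message has actually been delivered
producer.close()
```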
Provenance Configuration
| Environment Variable | Entry | Default Value | Description |
| --- | --- | --- | --- |
| RECORD_PROVO | `recordProvo` | `false` | The Graph Writer does not currently generate its own provenance metadata, so this option is set to `false`. |
Logging Configuration
| Environment Variable | Default Value | Description |
| --- | --- | --- |
| LOG_LEVEL_TRANSFORMER | `INFO` | Log level for the Writer/Transformer loggers. Change to `DEBUG` for more in-depth logs, or to `WARN` or `ERROR` to quiet the logging. |
Additional Logging Configuration
| Environment Variable | Default Value | Description |
| --- | --- | --- |
| LOGGING_LEVEL | `WARN` | Global log level. |
| LOGGING_APPENDERS_CONSOLE_TIMEZONE | `UTC` | Timezone for console logging. |
| LOGGING_APPENDERS_TXT_FILE_THRESHOLD | `ALL` | Threshold for text logging. |
| Log Format (not overridable) | `%-6level [%d{HH:mm:ss.SSS}] [%t] %logger{5} - %X{code} %msg %n` | Pattern for logging messages. |
| Current Log Filename (not overridable) | `/var/log/graphbuild/text/current/application_${applicationName}_${timeStamp}.txt.log` | Pattern for the log file name. |
| LOGGING_APPENDERS_TXT_FILE_ARCHIVE | `true` | Archive text log files. |
| Archived Log Filename Pattern (not overridable) | `/var/log/graphbuild/text/archive/application_${applicationName}_${timeStamp}_to_%d{yyyy-MM-dd}.txt.log` | Log file rollover frequency depends on the date pattern in this property; for example, `%d{yyyy-MM-ww}` declares weekly rollover. |
| LOGGING_APPENDERS_TXT_FILE_ARCHIVED_TXT_FILE_COUNT | `7` | Maximum number of archived text files. |
| LOGGING_APPENDERS_TXT_FILE_TIMEZONE | `UTC` | Timezone for text file logging. |
| LOGGING_APPENDERS_JSON_FILE_THRESHOLD | `ALL` | Threshold for JSON logging. |
| Log Format (not overridable) | `%-6level [%d{HH:mm:ss.SSS}] [%t] %logger{5} - %X{code} %msg %n` | Pattern for logging messages. |
| Current Log Filename (not overridable) | `/var/log/graphbuild/json/current/application_${applicationName}_${timeStamp}.json.log` | Pattern for the log file name. |
| LOGGING_APPENDERS_JSON_FILE_ARCHIVE | `true` | Archive JSON log files. |
| Archived Log Filename Pattern (not overridable) | `/var/log/graphbuild/json/archive/application_${applicationName}_${timeStamp}_to_%d{yyyy-MM-dd}.json.log` | Log file rollover frequency depends on the date pattern in this property; for example, `%d{yyyy-MM-ww}` declares weekly rollover. |
| LOGGING_APPENDERS_JSON_FILE_ARCHIVED_FILE_COUNT | `7` | Maximum number of archived JSON files. |
| LOGGING_APPENDERS_JSON_FILE_TIMEZONE | `UTC` | Timezone for JSON file logging. |
| LOGGING_APPENDERS_JSON_FILE_LAYOUT_TYPE | `json` | The layout type for the JSON logger. |