Running the STH server

To run the STH server, please execute the following command from the directory where the STH component was installed:

./bin/sth

The STH component provides 2 mechanisms to adapt its configuration to the concrete needs of the user:

  • Environment variables, which can be set by assigning values to them directly or through the sth_default.conf file if a packaged version of the STH component is used.
  • The config.js file located at the root of the STH component code, a JSON formatted file including the configuration properties.

It is important to note that environment variables, if set, take precedence over the properties defined in the config.js file.
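
The precedence rule can be pictured with the following minimal sketch. It is not the actual STH start-up code: the resolveSetting helper and the "host" and "port" key names are hypothetical, and the object named fileConfig only stands in for whatever config.js exports.

```javascript
// Sketch only: the real STH start-up code may resolve settings differently.
// `fileConfig` stands in for the object exported by config.js; its key
// names ("host", "port") are hypothetical.
function resolveSetting(env, fileConfig, envName, fileKey, fallback) {
  // An environment variable, if set, wins over the file.
  if (env[envName] !== undefined) {
    return env[envName];
  }
  // Otherwise use the value from the configuration file, if present.
  if (fileConfig[fileKey] !== undefined) {
    return fileConfig[fileKey];
  }
  // Finally fall back to the documented default.
  return fallback;
}

const fileConfig = { host: 'localhost', port: '8666' };
const env = { STH_PORT: '7777' };

console.log(resolveSetting(env, fileConfig, 'STH_PORT', 'port', '8666'));      // 7777
console.log(resolveSetting(env, fileConfig, 'STH_HOST', 'host', 'localhost')); // localhost
```

In a real process, process.env would be passed as the first argument.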

The environment variables accepted by the script (for which there exist counterpart entries in the already mentioned config.js file) are the following:

  • STH_HOST: The host where the STH server will be started. Optional. Default value: "localhost".
  • STH_PORT: The port where the STH server will be listening. Optional. Default value: "8666".
  • FILTER_OUT_EMPTY: A flag indicating if the empty results should be removed from the response. Optional. Default value: "true".
  • TEMPORAL_DIR: A relative path from the STH home directory to a directory where the temporary files generated by the STH component are stored. These files are generated before returning them when the filetype is included in any data retrieval request. Default value: "temp".
  • DEFAULT_SERVICE: The service to be used if not sent in the Orion Context Broker notifications. Optional. Default value: "testservice".
  • DEFAULT_SERVICE_PATH: The service path to be used if not sent in the Orion Context Broker notifications. Optional. Default value: "/testservicepath".
  • DATA_MODEL: The STH component supports 3 alternative data models when storing the raw and aggregated data into the database: 1) one collection per attribute, 2) one collection per entity and 3) one collection per service path. The possible values are: "collection-per-attribute", "collection-per-entity" and "collection-per-service-path" respectively. Default value: "collection-per-entity".
  • DB_USERNAME: The username to use for the database connection. Optional. Default value: "".
  • DB_PASSWORD: The password to use for the database connection. Optional. Default value: "".
  • DB_URI: The URI to use for the database connection. This does not include the 'mongodb://' protocol part (see a couple of examples below). Optional. Default value: "localhost:27017".
  • REPLICA_SET: The name of the replica set to connect to, if any. Default value: "".
  • DB_PREFIX: The prefix to be added to the service for the creation of the databases. More information below. Optional. Default value: "sth_".
  • COLLECTION_PREFIX: The prefix to be added to the collections in the databases. More information below. Optional. Default value: "sth_".
  • POOL_SIZE: The default MongoDB pool size of database connections. Optional. Default value: "5".
  • WRITE_CONCERN: The write concern policy to apply when writing data to the MongoDB database. Default value: "1".
  • SHOULD_STORE: Flag indicating if the raw and/or aggregated data should be persisted. Valid values are: "only-raw", "only-aggregated" and "both". Default value: "both".
  • SHOULD_HASH: Flag indicating if the raw and/or aggregated data collection names should include a hash portion. This is mostly due to MongoDB's limitation regarding the number of bytes a namespace may have (currently limited to 120 bytes). In case of hashing, information about the final collection name and its correspondence to each concrete service path, entity and (if applicable) attribute is stored in a collection named COLLECTION_PREFIX + "collection_names". Default value: "false".
  • TRUNCATION_EXPIRE_AFTER_SECONDS: Data from the raw and aggregated data collections will be removed if older than the value specified in seconds. In case of raw data the reference time is the one stored in the recvTime property whereas in the case of the aggregated data the reference of time is the one stored in the _id.origin property. Set the value to 0 not to apply this time-based truncation policy. Default value: "0".
  • TRUNCATION_SIZE: The oldest raw data (according to insertion time) will be removed if the size of the raw data collection gets bigger than the value specified in bytes. Set the value to 0 not to apply this truncation policy. Take into consideration that the "size" configuration parameter is mandatory in case size collection truncation is desired, as required by MongoDB. Notice that this configuration parameter does not affect the aggregated data collections since MongoDB does not currently support updating documents in capped collections which increase the size of the documents. Notice also that in the case of the raw data, the size-based truncation policy takes precedence over the TTL one. More concretely, if a size limitation is set, the time expiration set with TRUNCATION_EXPIRE_AFTER_SECONDS is ignored for the raw data collections since currently MongoDB does not support TTL in capped collections. Default value: "0".
  • TRUNCATION_MAX: The oldest raw data (according to insertion time) will be removed if the number of documents in the raw data collections goes beyond the specified value. Set the value to 0 not to apply this truncation policy. Notice that this configuration parameter does not affect the aggregated data collections since MongoDB does not currently support updating documents in capped collections which increase the size of the documents. Default value: "0".
  • IGNORE_BLANK_SPACES: A flag indicating if attribute values consisting only of one or more blank spaces should be ignored and not processed either as raw data or for the aggregated computations. Default value: "true".
  • LOGOPS_LEVEL: The log level to use. Possible values are: "DEBUG", "INFO", "WARN", "ERROR" and "FATAL". Since the STH component uses the logops package for logging, for further information check out the logops npm package information online. Default value: "INFO".
  • LOGOPS_FORMAT: The log format to use. Possible values are: "json" (writes logs as JSON), "dev" (for development, used when the NODE_ENV variable is set to 'development'). Since the STH component uses the logops package for logging, for further information please check out the logops npm package information online. Default value: "json".
  • PROOF_OF_LIFE_INTERVAL: The time in seconds between proof of life logging messages informing that the server is up and running normally. Default value: "60".
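
The interplay between the three TRUNCATION_* variables can be summarized with the sketch below. It is only an illustration of the rules stated in the list, not the code the STH component runs: the truncationPlan helper and the shape of its return value are hypothetical. The option names capped, size, max and expireAfterSeconds are the ones MongoDB uses for capped collections and TTL indexes.

```javascript
// Hypothetical helper: map the truncation settings to the MongoDB options
// that would apply to one collection. `kind` is 'raw' or 'aggregated'.
function truncationPlan(kind, settings) {
  const size = Number(settings.TRUNCATION_SIZE || 0);
  const max = Number(settings.TRUNCATION_MAX || 0);
  const ttl = Number(settings.TRUNCATION_EXPIRE_AFTER_SECONDS || 0);

  // Size-based truncation only affects raw data and takes precedence
  // over the TTL policy, since capped collections do not support TTL.
  if (kind === 'raw' && size > 0) {
    const options = { capped: true, size: size };
    if (max > 0) {
      options.max = max; // document limit, only meaningful together with size
    }
    return { type: 'capped', options: options };
  }

  // Time-based truncation: the reference field differs per collection kind.
  if (ttl > 0) {
    const field = kind === 'raw' ? 'recvTime' : '_id.origin';
    return { type: 'ttl', field: field, options: { expireAfterSeconds: ttl } };
  }

  // Assumption of this sketch: TRUNCATION_MAX without a size is ignored,
  // because MongoDB requires a size for capped collections.
  return { type: 'none' };
}

const settings = { TRUNCATION_SIZE: '1000000', TRUNCATION_EXPIRE_AFTER_SECONDS: '3600' };
console.log(truncationPlan('raw', settings).type);        // capped
console.log(truncationPlan('aggregated', settings).type); // ttl
```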

For example, to start the STH server listening on port 7777, connecting to a MongoDB instance listening on mymongo.com:27777 and without filtering out the empty results, use:

STH_PORT=7777 DB_URI=mymongo.com:27777 FILTER_OUT_EMPTY=false ./bin/sth

On the other hand, to connect to a MongoDB replica set composed of 3 machines with IP addresses 1.1.1.1, 1.1.1.2 and 1.1.1.3 listening on ports 27771, 27772 and 27773, respectively, use:

DB_URI=1.1.1.1:27771,1.1.1.2:27772,1.1.1.3:27773 ./bin/sth
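
To make clear why DB_URI carries no protocol part, the sketch below shows one way the connection-related variables could be combined into a standard MongoDB connection string. This is an assumption about how the pieces fit together, not the component's actual code: the buildMongoUri helper and the replica set name "rs0" are hypothetical. The "mongodb://" scheme and the "replicaSet" option belong to the standard MongoDB connection string format.

```javascript
// Hypothetical helper: assemble a MongoDB connection string from the
// DB_* settings and a database name.
function buildMongoUri(settings, databaseName) {
  const user = settings.DB_USERNAME || '';
  const password = settings.DB_PASSWORD || '';
  const hosts = settings.DB_URI || 'localhost:27017';

  // Credentials are only included when a username is configured.
  const credentials = user
    ? encodeURIComponent(user) + ':' + encodeURIComponent(password) + '@'
    : '';

  // The replica set name, if any, travels as a query parameter.
  const query = settings.REPLICA_SET
    ? '?replicaSet=' + encodeURIComponent(settings.REPLICA_SET)
    : '';

  return 'mongodb://' + credentials + hosts + '/' + databaseName + query;
}

console.log(buildMongoUri(
  { DB_URI: '1.1.1.1:27771,1.1.1.2:27772,1.1.1.3:27773', REPLICA_SET: 'rs0' },
  'sth_testservice'
));
// mongodb://1.1.1.1:27771,1.1.1.2:27772,1.1.1.3:27773/sth_testservice?replicaSet=rs0
```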

The STH component creates a new database for each service. The name of these databases will be the concatenation of the DB_PREFIX environment variable and the service name, using an underscore (_) as the separator.

As already mentioned, all these configuration parameters can also be adjusted using the config.js file whose contents are self-explanatory.

It is important to note that there is a limitation of 120 bytes for the namespaces (concatenation of the database name and collection names) in MongoDB. Related to this, the STH generates the collection names using 2 possible mechanisms:

  1. Plain text: In case the SHOULD_HASH configuration parameter is set to false (the default option), the collection names are generated as a concatenation of the COLLECTION_PREFIX configuration parameter, the service path, the entity id, the entity type and the suffix ".aggr" for the collections storing the aggregated data. The length of the collection name plus the DB_PREFIX plus the database name (or service) should not be more than 120 bytes using UTF-8 format or MongoDB will complain and will not create the collection, and consequently no data would be stored by the STH. A warning message is logged in case this happens.
  2. Hash based: In case the SHOULD_HASH option is set to something distinct from false, the collection names are generated as a concatenation of the COLLECTION_PREFIX configuration parameter, a generated hash and the suffix ".aggr" for the collections storing the aggregated data. To avoid collisions in the generation of these hashes, they are forced to be 20 bytes long at least. Once again, the length of the collection name plus the DB_PREFIX plus the database name (or service) should not be more than 120 bytes using UTF-8 or MongoDB will complain and will not create the collection, and consequently no data would be stored by the STH. The STH component adapts the hash size to the available characters. The hash function used is SHA-512. A warning message is logged in case there are not enough available characters for the hash.
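
The two mechanisms above can be sketched as follows for the entity-based data model. This is a rough illustration under stated assumptions, not the STH implementation: the function names are hypothetical, the "_" separator between the name parts is assumed, the hash is assumed to be hex-encoded, and the namespace is taken to be the database name, a dot and the collection name.

```javascript
const crypto = require('crypto');

const MAX_NAMESPACE_BYTES = 120; // limit quoted in the text above
const MIN_HASH_CHARS = 20;       // minimum hash length quoted in the text above

// Size in bytes (UTF-8) of the "<database>.<collection>" namespace.
function namespaceBytes(databaseName, collectionName) {
  return Buffer.byteLength(databaseName + '.' + collectionName, 'utf8');
}

// Plain text mechanism. Assumption: the parts are joined with "_".
function plainCollectionName(prefix, servicePath, entityId, entityType, aggregated) {
  const base = prefix + [servicePath, entityId, entityType].join('_');
  return aggregated ? base + '.aggr' : base;
}

// Hash-based mechanism: the SHA-512 digest is cut to the characters
// still available in the namespace. Returns null when fewer than
// MIN_HASH_CHARS characters are left.
function hashedCollectionName(databaseName, prefix, servicePath, entityId, entityType, aggregated) {
  const suffix = aggregated ? '.aggr' : '';
  const used = Buffer.byteLength(databaseName + '.' + prefix + suffix, 'utf8');
  const available = MAX_NAMESPACE_BYTES - used;
  if (available < MIN_HASH_CHARS) {
    return null;
  }
  const digest = crypto
    .createHash('sha512')
    .update([servicePath, entityId, entityType].join('_'))
    .digest('hex'); // 128 hex characters, one byte each in UTF-8
  return prefix + digest.slice(0, available) + suffix;
}

const plain = plainCollectionName('sth_', '/testservicepath', 'room1', 'Room', true);
console.log(plain, namespaceBytes('sth_testservice', plain) <= MAX_NAMESPACE_BYTES);
```

A check such as namespaceBytes(...) <= MAX_NAMESPACE_BYTES could be run before choosing long service paths or entity identifiers, to find out in advance whether SHOULD_HASH needs to be enabled.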

When hashes are used as part of the collection names, and as a way to let the user or developer easily recover this information, a collection named by concatenating the COLLECTION_PREFIX configuration parameter and the text "collection_names" is created and populated with the mapping between each collection name and the related combination of service, service path, entity and attribute.