Plugin Results Format
Initial stage
Currently only Python 3.7 code is supported, so this should be the first thing installed on your local workstation. Then download the requirements.txt file so you have the correct libraries and versions available, and install them using `pip install -r requirements.txt`.
A plugin receives an input manifest as its first command-line argument, and the output file path as its second argument. A report plugin run would look like this:
A basic Python plugin would look like this:
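Since the original example is not reproduced here, below is a minimal sketch of such a plugin, invoked e.g. as `python plugin.py manifest.json output.json`. The JS/JSX/helper snippets, the `Header` component, and the exact `status` fields are illustrative assumptions, not the platform's confirmed schema:

```python
import json
import sys


def run_plugin(manifest_path, output_path):
    # Read the JSON manifest passed as the first command-line argument
    with open(manifest_path) as f:
        manifest = json.load(f)

    # Do the actual analysis here; we just compute a toy value
    conversion_rate = 0.42

    output = {
        # "data" can contain anything; it is exposed as-is to the
        # js, jsx and helper contexts
        "data": {"conversionRate": conversion_rate,
                 "stage": manifest.get("stage")},
        # Evaluated first; its return value becomes `data` in the jsx context
        "js": "return { pct: results.data.conversionRate * 100 };",
        # JSX rendering the visualization (component name is hypothetical)
        "jsx": "<Header>{this.helper.formatRate()}</Header>",
        # Helper functions, re-usable from js and jsx
        "helper": "formatRate() { return (this.results.data.conversionRate"
                  " * 100) + '%'; }",
        # A valid status object must always be present
        "status": {"type": "success"},
    }

    # Always write the results JSON to the path given as the second argument
    with open(output_path, "w") as f:
        json.dump(output, f)


if __name__ == "__main__":
    run_plugin(sys.argv[1], sys.argv[2])
```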
It's important to note that the output JSON should always be written to `argv[2]`; in production this location will be writable. In general, any files created during your plugin run should be written to the current working directory, not to absolute or other paths, because those will not exist or will not be writable when your plugin runs on the platform in production.
First we read the JSON manifest file. Then we construct a basic JSON plugin results object, shown in the `output` variable in the example above. The `data` key can contain anything you like and will be available as-is in the context of the `js`, `jsx` and `helper` keys.
A plugin has different JavaScript contexts to use code within, each defined by its respective key as a string in the plugin JSON results object. When a plugin runs stand-alone, it first evaluates the `js` context and then `jsx` to display the visualizations, each of which can call any optionally defined helpers:
- `js` — here we have access to a global `results` variable: an object containing the plugin results. This `results` object contains two properties: `data`, which is whatever `data` was returned from Python in the output JSON, and `helpers`, a helper class that can optionally be provided in the output JSON. The object returned in `js` will be available in the `jsx` context as the `data` variable. Normally the `js` property will not be used; it's best to use `helper` instead, because that allows for better re-use of any helper code. If used, it's important that the JS code ends with actual code, not comments. See the third item below.
- `jsx` — here we render any JSX components/tags to visualize our plugin. For a full overview of JSX components and API functions, see the API library. Here we have access to the same global `results` variable. In addition, we also have access to a variable called `data`, which contains whatever was returned from the `js` code. In most cases `js` should not be used, and there is thus no returned `data` available. Instead we use the `helper` class, as this allows for easier code re-use, especially when this plugin is available to third parties.
- `helper` — this is the main glue of the plugin, and defines functions that can be re-used in the `jsx` or `js` code parts. Within the helper code, you can access `this` to call another helper function. All helpers are also pre-initialized with `this.results`, which gives you access to the plugin `results` object. The helper functions can also be documented in a JSON format, making them available to third parties, where they can be re-used in the Insight API with fully ready-to-use code examples. See the Plugin Helpers documentation for more details.
Finally, a plugin output should always contain a valid `status` object. In case of no errors, it can simply look like this:
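For example, a minimal success status might look like this (the original example is not reproduced here, so the exact field names are an assumption):

```json
{
  "status": { "type": "success" }
}
```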
In case of an error, its most basic format would be:
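A sketch of a minimal error status, again assuming the field names:

```json
{
  "status": { "type": "error" }
}
```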
You can also provide a friendly title for end-users in the `title` key, together with a more detailed error message for the end-user in `explanation`. Any backtrace can be put into `backtrace`. All fields are optional, so you can supply whatever you have available. A full error would look like this:
While the plugin code above shows how to use the full `js`, `jsx` and `helper` formats, normally you would only use the `jsx` and `helper` parts. The code would then be structured more like this:
Helper functions should not really be used to do intensive work; that is done in the plugin itself instead. The helpers should be a glue layer that renders components, visualizes `data` by mapping this custom object to the format for the different visualization components, and exposes convenience functions for users of the plugin, which could be yourself or third parties in case you publish the template.
Within `js`, `jsx` and `helper` there is access to a range of visualization components, utility methods and libraries to make developing plugins as easy as possible. See the API library for more details.
Additional stages
Specifying datasets
If you need to run additional stages, the plugin results JSON of the initial stage is the place to indicate so.
You can use a `process` object on the root object to indicate what additional stages are needed, and what datasets each of them needs. If you don't provide a `process` object, the plugin finishes after the initial stage is executed.
In the example below we want to process one more stage named `trainMore`, using a dataset reflecting the latest state of users, accessible via its `latestData` key when the plugin receives the JSON manifest in the next `trainMore` run:
We can specify other moment types for datasets. Below we show all possible combinations; the stage will have access to all four of these datasets. A dataset of type `latest` means the current state of users. Type `since` takes a `seconds` field stating the number of seconds since user creation. `pctOfConvertedToMeasure` means that we first calculate, for all users, the time in seconds it took them to convert (since their creation), then take the 5th percentile of that (1.0 - 0.95). We then use the resulting 5th-percentile value as the number of seconds since user creation for the dataset, to measure only actions done up to that number of seconds. Conversion is by default `"where": "y_value='true'"`, but `where` can be omitted in most cases. If you have a special query you want to run on the dataset, in terms of which users to filter for, you can do that here. The `where` is run on the table and columns as described in Dataset and features. Finally, be aware that `pctOfConvertedToMeasure` and its `"where": "y_value='true'"` filter are always run against whatever the initial dataset was (0 seconds, latest or 95%).
Specifying stages
After the initial stage, we can run a few more stages. Each additional key under `process` means one additional stage, and for each stage we can have a number of (different) `dataSets`. The additional stages run in parallel, and do not have access to data from other additional stages; each can only access data from the initial stage. More on that in the next section on Plugin Storage.
To know which stage a current plugin run is in, parse the JSON manifest and read the `stage` key. It returns a string, set to `initial` for the initial stage, and for any additional stages whatever has been set as the stage key on the `process` object. This can be used to run a different analysis, with different datasets, depending on the stage. For a prediction plugin, the initial stage can be used to figure out what additional datasets are needed; then an actual `training` stage trains the model. Finally, a `server` stage can be used to run an HTTP server that responds to realtime prediction requests or sets a prediction score for all users in the dataset; more on that in the Deployment section and Batch updating section.
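Branching on the `stage` key could be sketched like this; the stage names follow the prediction-plugin example above, and the returned objects (including the dataset specification fields) are assumptions:

```python
def dispatch(manifest):
    # "stage" is "initial" for the first run; for additional stages it is
    # whatever key was set under the "process" object of the initial results.
    stage = manifest["stage"]
    if stage == "initial":
        # Request an additional "training" stage with the dataset it needs
        return {
            "status": {"type": "success"},
            "process": {
                "training": {"dataSets": {"latestData": {"type": "latest"}}}
            },
        }
    if stage == "training":
        # Train the model using the files listed under manifest["dataUrls"]
        return {"status": {"type": "success"}}
    raise ValueError("unknown stage: " + stage)
```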
Below is an example that runs three additional stages, where each one has access to its own dataset plus the `latest` one.
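A sketch of such a `process` object; re-using the `latestData` key in every stage means the same dataset is shared, while each stage also gets a dataset of its own (field names and values are assumptions):

```json
{
  "process": {
    "stage1": {
      "dataSets": {
        "latestData": { "type": "latest" },
        "60secData": { "type": "since", "seconds": 60 }
      }
    },
    "stage2": {
      "dataSets": {
        "latestData": { "type": "latest" },
        "1dayData": { "type": "since", "seconds": 86400 }
      }
    },
    "stage3": {
      "dataSets": {
        "latestData": { "type": "latest" },
        "pct95Data": { "type": "pctOfConvertedToMeasure", "pct": 0.95 }
      }
    }
  }
}
```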
Note that each `process` key needs to be unique. Within `dataSets`, if a key is re-used in another stage, it means that same dataset is re-used in that other stage. If you need a different dataset specification, always make sure the key under `dataSets` is unique for the whole JSON object.
Each additional-stage JSON manifest that the plugin receives as input has the specified data URLs accessible under `dataUrls`, keyed by the dataset names given under `dataSets` in the `process` object. In addition, the `initial` dataset is always available too.
Preparing stages and datasets during development
Normally your plugin will be run on the platform, and the preparation of datasets for additional stages is done automatically. During local development of your plugin, you need to trigger the dataset preparation yourself and fetch the new JSON manifest for the next stages.
Send the plugin JSON output file from the plugin as the JSON body, as shown below, where we first run the plugin's initial stage. Note again that these two steps only need to be run manually when developing your plugin locally; once your plugin is imported in a template onto the platform, these steps are executed automatically:
This returns `{"status": "preparing"}` or `{"status": "ready"}`. This step may take anywhere from a few seconds to a few minutes, depending on the size of your dataset.
Note that the last part of the URL indicates what stage we are processing for, in this case `initial`, because we are processing results for the initial plugin run stage. Later we can `POST` results for other stages, as specified under the `process` object; this is mostly used for deployable and batching plugins. More on this later in their respective sections.
The next step will be to simulate a plugin run of any additional stage, like `stage1` for example. To get your JSON manifest for `stage1`, make a `GET` request to `https://www.stormly.com/api/developer/get_manifest/stage1`. You will get a status message in case the dataset is still being prepared. For example with curl:
Below is shown how an additional-stage JSON manifest could look: the same as for the initial stage, with the difference that we now have `"stage": "stage1"`, and `dataUrls` containing `initial` plus any additional datasets requested for `stage1` (`60secData` and `latestData` in this case):
Once you're ready to test the plugin run for that stage, execute it with the manifest for the new additional stage:
Then `POST` your results for that stage. This is only necessary if your plugin supports being deployed or can do batch updates of existing user data. More on this in the deployable and batching plugin sections.
Success of stages
The initial stage always has to succeed, as indicated by a `success` status code returned, for the plugin run to be considered successful. By default, all additional stages also need a successful run for the whole plugin run to be successful. If one of the additional stages has an error, for example because the requested dataset doesn't have enough conversion samples, the whole plugin run is unsuccessful.
For any additional stage, we can indicate that it doesn't have to be successful by simply setting the field `successRequired` to `false` on that stage's object under `process`. Below is an example where only `stage1` needs to succeed, while `stage2` and `stage3` can have an error and the plugin run is still considered successful.
When we set all additional stages to `"successRequired": false`, we only need the initial stage to succeed for a successful plugin run.
When there are multiple stages and all of them, including the initial one, have an error, only the error for the initial stage is shown to the end-user. For warnings, the explanation and backtrace are collected from all stages where `successRequired` is not `false`, and then concatenated using newlines; note that the title of the warning is the first one available, taken first from the initial stage and then from the additional stages.
Accessing results in JS, JSX and helpers
As described at the beginning of this page, within the `js` and `jsx` code the results from the plugin run can be accessed under the variable `results`, and in the `helper` via `this.results`. But when there are multiple stages, the results object has one extra layer, where the first key indicates the stage key as defined in `process`. The initial stage is always accessible on the results via `initial`. Even if all but the initial stage fail, we still have this format with `initial` and keys for the additional stages.
The results object for the `process` example above will roughly look like this:
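A sketch of that shape; the values are placeholders, and each stage key holds whatever that stage's run returned:

```json
{
  "initial": { "something": "..." },
  "stage1": { "something": "..." },
  "stage2": { "something": "..." },
  "stage3": { "something": "..." }
}
```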
So to access the initial stage results in the JSX code, use `results.initial.something`, while in the `helper` code we use `this.results.initial.something`. For `stage3` we use `results.stage3.something` in JS/JSX, while in the `helper` we use `this.results.stage3.something`.
Limitations
- The number of additional stages is limited to a maximum of 25.
- The number of unique datasets that can be requested among all `dataSets` is limited to a maximum of 25. Take note of this mostly when using hyper-parameters to experiment with a large number of datasets.
- The `js`, `jsx` and `helper` keys are only taken from the initial plugin run, never from any additional stages.
User Segments
The Insight API has a `<Segment ... />` component that allows any end-user to quickly save a segment of users, such as Country is US AND Number of photos uploaded > 10 AND Came back in 2nd week.
While `<Segment ... />` is strictly an Insight API component, it's used so commonly, and depends more on Intelligence Plugins than other components do, that it will be described here too.
A Segment can be specified by a simple nested array format. Each element within the array should be an array containing three elements. The first element is the name of the feature, as found in Dataset and features. The second element contains an operator such as `=` `!=` `==` `>` `<` `>=` `<=`. The third element contains the value to compare against, where `"[n/a]"` is used to indicate missing values.
A few examples:
`["feature_e9dea1034", ">", 1.0]`
`["feature_e9dea1034", "=", "[n/a]"]`
`["feature_f9e8jf", "=", "US"]`
There is also a shortcut to negate the condition, by adding `"NOT"` as the first element, like this:
`["NOT", "feature_f9e8jf", "=", "US"]`
These parts can be joined together with conjunctions and parentheses to make more complex segments. The conjunctions and parentheses can be `(` `)` `AND` `OR`.
A full example could look like this:
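A sketch of a full segment, combining conditions, conjunctions and parentheses (the feature names and values are illustrative):

```json
[
  ["feature_e9dea1034", ">", 1.0],
  "AND",
  "(",
  ["feature_f9e8jf", "=", "US"],
  "OR",
  ["NOT", "feature_f9e8jf", "=", "CA"],
  ")"
]
```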
Parentheses can be nested as many levels as you like, but the array must stay flat, so no nested arrays:
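For example, nesting is expressed with `"("` and `")"` elements in the one flat array, never by nesting arrays inside arrays (feature names are illustrative):

```json
[
  "(",
  ["feature_e9dea1034", ">", 1.0],
  "AND",
  "(",
  ["feature_f9e8jf", "=", "US"],
  "OR",
  ["feature_f9e8jf", "=", "[n/a]"],
  ")",
  ")"
]
```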
Then inside your JS/JSX or plugin helper code, supply the `filter` like this via a utility function called `createUserFilter`:
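A sketch of how this could look; the exact `createUserFilter` signature and the `Segment` prop name are assumptions:

```jsx
// Hypothetical usage; signature and prop name may differ.
const filter = createUserFilter([
  ["feature_f9e8jf", "=", "US"],
  "AND",
  ["feature_e9dea1034", ">", 1.0]
]);

<Segment filter={filter} />
```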