Plugin Storage
Uploading files
When a plugin has multiple stages, is deployable, or performs a batch update of existing data, we often need to retrieve files or models saved in a previous stage.
On the initial plugin run, the manifest contains two keys related to uploading and downloading. Before we can upload a file, we first need to get the upload url. To find it, take the value under stage in the manifest and use it as the key into getUploadUrls; the resulting get-upload-url is the one we will call with a GET.
Finally, append the path of the filename you want to store the upload under to the get-upload-url, so that you can download the file in any later stage using the same filename. In our example we will use model1.pkl.
Here is the relevant part of the plugin JSON manifest for the initial stage:
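As a minimal sketch in Python, the lookup described above might be written as follows. Only the key names stage and getUploadUrls come from the text. The manifest fragment, the placeholder URL, and the helper name get_upload_url_endpoint are assumptions made for illustration, and the sketch assumes the get-upload-url is a plain path prefix to which the filename can be appended.

```python
import json

# Hypothetical manifest fragment. Only the key names "stage" and
# "getUploadUrls" come from the text; the URL is a placeholder.
MANIFEST_JSON = """
{
  "stage": "initial",
  "getUploadUrls": {
    "initial": "https://platform.example/get-upload-url/initial"
  }
}
"""


def get_upload_url_endpoint(manifest: dict, filename: str) -> str:
    """Build the url to GET in order to obtain an upload url for `filename`."""
    stage = manifest["stage"]                # e.g. "initial"
    base = manifest["getUploadUrls"][stage]  # same key, under getUploadUrls
    # Append the file path, avoiding a doubled slash.
    return base.rstrip("/") + "/" + filename.lstrip("/")


manifest = json.loads(MANIFEST_JSON)
endpoint = get_upload_url_endpoint(manifest, "model1.pkl")
print(endpoint)
```

Because the stage name is read from the manifest rather than hard-coded, the same helper would keep working when the plugin runs in a later stage.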
To get the upload url for /model1.pkl with curl, we execute:
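The same GET can be sketched in Python instead of curl. The text does not specify the response format, so this sketch assumes the endpoint answers with the presigned upload url as plain text in the response body; adjust the parsing if the platform returns JSON. The function name fetch_upload_url and both example URLs are hypothetical, and the demonstration uses a stub instead of the network.

```python
import io
import urllib.request


def fetch_upload_url(endpoint: str, opener=urllib.request.urlopen) -> str:
    """GET `endpoint` and return the response body as the upload url.

    Assumption: the body is the presigned url as plain text.
    """
    with opener(endpoint) as response:
        return response.read().decode("utf-8").strip()


# Offline demonstration with a stub opener, so no network is needed.
def fake_opener(url):
    return io.BytesIO(b"https://storage.example/presigned-upload?sig=abc\n")


upload_url = fetch_upload_url(
    "https://platform.example/get-upload-url/initial/model1.pkl",
    opener=fake_opener,
)
print(upload_url)
```

In real use, omit the opener argument so that urllib.request.urlopen performs the request.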
Next we actually upload the file using the upload url from the previous step:
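A possible Python version of the upload step is sketched below. It uses a PUT request, as stated under Limitations; the function names and the example url are hypothetical. For simplicity the sketch reads the whole file into memory, which would not be suitable for files near the size limit.

```python
import urllib.request


def build_upload_request(upload_url: str, payload: bytes) -> urllib.request.Request:
    """Create a PUT request carrying `payload` for the presigned upload url."""
    # Note: urllib adds a default Content-Type header to requests with a body;
    # set one explicitly if the platform expects a specific type.
    return urllib.request.Request(upload_url, data=payload, method="PUT")


def upload_file(upload_url: str, local_path: str) -> int:
    """Upload the file at `local_path` and return the HTTP status code."""
    with open(local_path, "rb") as handle:
        request = build_upload_request(upload_url, handle.read())
    with urllib.request.urlopen(request) as response:
        return response.status


# Building the request does not touch the network.
request = build_upload_request(
    "https://storage.example/presigned-upload?sig=abc", b"model bytes"
)
print(request.get_method(), request.full_url)
```

Calling upload_file(upload_url, "model1.pkl") would then send the file from the current directory.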
Note that you can upload only into the current stage. If the current plugin run manifest contains "stage": "initial", you can only upload using the getUploadUrls entry for initial. Once you are in stage1, you can no longer use the getUploadUrls entry for initial. Downloading works for any stage, from any stage.
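One way to make this rule explicit in plugin code is a small guard that refuses to look up an upload url for any stage other than the current one. This is only a sketch: the manifest contents and the helper name upload_base_for are hypothetical.

```python
def upload_base_for(manifest: dict, target_stage: str) -> str:
    """Return the getUploadUrls entry for `target_stage`, but only if it is
    the stage the plugin is currently running in."""
    current = manifest["stage"]
    if target_stage != current:
        raise ValueError(
            f"cannot upload into {target_stage!r} while running stage {current!r}"
        )
    return manifest["getUploadUrls"][current]


# Hypothetical manifest for a run in stage1; the URL is a placeholder.
manifest = {
    "stage": "stage1",
    "getUploadUrls": {"stage1": "https://platform.example/get-upload-url/stage1"},
}

print(upload_base_for(manifest, "stage1"))
```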
In plugin development mode, these presigned upload urls are only valid for 8 hours, so make sure that your code GETs the upload url right before you do the upload. In production these presigned urls are valid for a few days.
Downloading files
To download a file from another stage, look up the url in the downloadUrls object using the key of the stage you want to download from, and append the file path to the url.
For example, in stage1 we may get the JSON manifest below and want to download model1.pkl, which was uploaded in the initial stage:
To download model1.pkl, we issue a GET, using curl as an example:
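The download can also be sketched in Python. Only the key name downloadUrls and the stage names come from the text; the manifest fragment, the placeholder URL, and the function names are assumptions made for illustration. The file is written under its bare filename so that it lands in the current directory.

```python
import os
import shutil
import urllib.request


def download_url_for(manifest: dict, source_stage: str, filename: str) -> str:
    """Build the download url for `filename` as uploaded in `source_stage`."""
    base = manifest["downloadUrls"][source_stage]
    return base.rstrip("/") + "/" + filename.lstrip("/")


def download_to_current_dir(url: str, filename: str) -> str:
    """GET `url` and write the body to `filename` in the current directory."""
    local_name = os.path.basename(filename)  # strips any directory part
    with urllib.request.urlopen(url) as response, open(local_name, "wb") as out:
        shutil.copyfileobj(response, out)
    return local_name


# Hypothetical manifest for a run in stage1; the URL is a placeholder.
manifest = {
    "stage": "stage1",
    "downloadUrls": {"initial": "https://platform.example/download/initial"},
}

url = download_url_for(manifest, "initial", "model1.pkl")
print(url)
```

Calling download_to_current_dir(url, "model1.pkl") would then perform the actual GET.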
It's important to note that all files should be written or downloaded to the current path. Do not use absolute paths or paths outside the current directory, because they may not exist or may not be writeable when your plugin runs on the platform in production.
All downloadUrls from all stages are also available in the plugin deployment stage, if the plugin supports deployment. The deployment stage no longer has access to any upload urls, so uploading there is not possible, but it is also rarely needed.
Another thing to note is that the platform takes care of hyper-parameter variations and their storage, so each upload and download storage path can be fixed: when we request model1.pkl in the deployment or batching plugin stage, it will be the one that was trained with the optimal hyper-parameters.
Limitations
- Always download files to the current path. Never use absolute paths, as they will not be writeable when running in production.
- Uploading is done with a PUT request.
- Upload urls are only valid for 5 days. After that, they don't accept uploads anymore.
- 5GB per file is the maximum.
- If you use curl, make sure to place the upload or download urls between quotes.
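Some of these limitations can be checked in code before a transfer is attempted. The sketch below is one possible pre-flight check, not a platform API: the helper name check_file is hypothetical, and "5GB" is read conservatively as 5 * 1000**3 bytes, which is an assumption.

```python
import os

# Assumption: "5GB" read as decimal gigabytes, the more conservative reading.
MAX_FILE_BYTES = 5 * 1000**3


def check_file(path: str, size_bytes: int) -> list:
    """Return a list of problems; an empty list means no limitation is violated."""
    problems = []
    if os.path.isabs(path):
        problems.append("path is absolute; use a path relative to the current directory")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file exceeds the 5GB per-file maximum")
    return problems


print(check_file("model1.pkl", 1024))
print(check_file("/tmp/model1.pkl", MAX_FILE_BYTES + 1))
```

For a file that already exists, os.path.getsize(path) can supply size_bytes.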