- 11 Feb 2023
Resultsets
A resultset is a collection of curated data objects that the user has selected using the explore and refine capabilities of Data Explorer. Using an adectl command, the user can upload the resultset objects to an S3 bucket, an Azure blob store, or a local file system.
Adding Objects to the Resultset
To add objects to the resultset, use the '+' button on each thumbnail, or the bulk-add '+' button at the bottom right. If no resultset is currently open (for example, when adding objects for the first time), a form prompts for the name of the resultset.
Opening an Existing Resultset
In the resultset tab, click the Select a Resultset drop-down arrow, as shown in the image below, and choose a resultset.
Edit Resultset
Once a resultset is open and in context, you can add objects to it using the '+' buttons on the selection and similarity search tabs. You can delete existing objects from the resultset using the 'X' button on the thumbnails in the 'RESULTSET' tab. The number on the 'RESULTSET' tab (2 in the diagram below) is the count of objects in draft edits that have not yet been saved. Click the Save button to save these draft edits to the resultset.
Publish Resultset
A resultset can be published for access outside the job context. The published resultsets are available on the 'Resultsets' page, accessible from the left panel, as shown below.
The resultset listing page, as shown below, provides a list of all published resultsets. This page provides sorting, filtering, and search capabilities to reach the specific resultset of interest.
Resultset Upload
The resultset upload operation materializes the objects in a resultset onto a target S3 bucket, Azure blob store, or local file system, for further analysis or for use in downstream pipelines (e.g., a training pipeline).
Configure the Target Location
adectl resultset config
This command prompts for the type of destination store (S3, Azure, GCP, etc.) and for the credentials and configuration information it requires. As an example, the fields captured for an S3 destination are as follows:
Select store type [s3 | azure | GCP | file | hdfs] : s3
Enter S3 Bucket Name: bucket
Enter S3 Access Key: xxxxx
Enter S3 Secret Key: yyyyy
Enter S3 Endpoint [default: https://s3.amazonaws.com]:
Configured S3 Store Successfully
For an Azure blob store destination, the storage account key fields are required.
Upload resultset objects
You can upload a resultset using either its name or its ID.
adectl resultset upload -n <resultsetname> -t <target-location>
OR
adectl resultset upload -r <resultsetid> -t <target-location>
- resultsetname - Resultset name is available under the 'RESULTSET' tab on the UI.
- resultsetid - Resultset ID is available under the 'RESULTSET' tab on the UI.
- target-location - Location relative to the destination configured with the adectl resultset config command. For example, if the S3 bucket has been configured as s3://bucket and -t is specified as /rsupload, the objects are uploaded to s3://bucket/rsupload.
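The target-location rule can be sketched as a small shell helper. This is an illustration only: resolve_target is a hypothetical function, not part of adectl, and it assumes the configured location and the -t value are joined with a single slash, as the s3://bucket/rsupload example suggests.

```shell
# Hypothetical helper (not an adectl feature): predicts where uploaded
# objects should land, given the configured location and the -t value.
resolve_target() {
  rt_base=${1%/}   # drop one trailing slash from the configured location
  rt_rel=${2#/}    # drop one leading slash from the -t value
  printf '%s/%s\n' "$rt_base" "$rt_rel"
}

resolve_target "s3://bucket" "/rsupload"   # prints s3://bucket/rsupload
```

A helper like this can be handy for logging the expected destination before starting an upload; the actual path handling is done by adectl.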
The upload command starts an asynchronous operation. To check the status of the operation, run the following command.
adectl show
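The upload and status steps above can be combined in a small wrapper, sketched here under stated assumptions. upload_resultset is a hypothetical function name; only the adectl resultset upload and adectl show invocations come from the commands above. The sketch also assumes that adectl exits with a nonzero status when the upload request is rejected, which is not confirmed here.

```shell
# Hypothetical wrapper: upload_resultset name|id <value> <target-location>
upload_resultset() {
  case "$1" in
    name) ur_flag=-n ;;   # select the resultset by name
    id)   ur_flag=-r ;;   # select the resultset by ID
    *)
      echo "usage: upload_resultset name|id <value> <target-location>" >&2
      return 2
      ;;
  esac

  # Start the asynchronous upload (assumption: nonzero exit means it was rejected).
  adectl resultset upload "$ur_flag" "$2" -t "$3" || return 1

  # The upload runs in the background, so report its status separately.
  adectl show
}

# Example call (the resultset name is a placeholder):
# upload_resultset name my-resultset /rsupload
```

Because the upload is asynchronous, a single adectl show call may report an operation that is still in progress; rerun it until the status indicates completion.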
Dump Resultset Objects
The dump command writes the file names of all objects in the selected resultset to an output directory.
adectl resultset dump -n <resultsetname> -o <output-dir>
OR
adectl resultset dump -r <resultsetid> -o <output-dir>
- resultsetname - Resultset name is available under the 'RESULTSET' tab on the UI.
- resultsetid - Resultset ID is available under the 'RESULTSET' tab on the UI.
- output-dir - Directory to which the resultset file names are written in JSON format.
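A dump by name can be sketched as follows. dump_resultset is a hypothetical wrapper, and the names and JSON schema of the files that adectl writes are not documented here, so the sketch only lists what appears in the output directory. Creating the directory beforehand is a precaution, not a documented requirement.

```shell
# Hypothetical wrapper: dump_resultset <resultsetname> <output-dir>
dump_resultset() {
  dr_out=$2

  # Make sure the output directory exists before dumping into it.
  mkdir -p "$dr_out" || return 1

  # Write the resultset's object file names (JSON) into the directory.
  adectl resultset dump -n "$1" -o "$dr_out" || return 1

  # Show which files the dump produced; their layout is not assumed here.
  ls -1 "$dr_out"
}

# Example call (name and path are placeholders):
# dump_resultset my-resultset ./rs-dump
```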