Overview
  • 12 Sep 2024

A dataset is an entity that holds a subset of the data in a container, along with all associated catalog tables and catalog views (virtual tables). The Dataset page is available under 'Data -> Repo' in the left navigation panel.

The Dataset page has the following information and controls:

  1. Search box: To search for a dataset based on its name.

  2. Dataset Type: Image or Video.

  3. Public dataset: When a public dataset is imported as described in Import public dataset, a 'Public' badge is shown on the dataset card.

  4. Filtering: Filter the page by dataset type and status. The status can be one of the following:

    1. Activated: A dataset on which new data can be registered and existing data can be accessed and visualized.

    2. Deactivated: A dataset on which no new data can be registered. Existing data can be accessed and visualized.

  5. Catalog: Opens the catalog page for the dataset, where all catalog-related operations can be performed, such as querying the catalog, importing an external catalog, and creating views.

  6. Visualize/Create Visualization: This button creates a default visualization job, or opens it if one already exists. The default visualization job is a quick way to explore the dataset's contents.

  7. A 3-dot 'Actions' button for the following additional actions:

    1. Visualization: Options to create a default visualization job, or to refresh it after new data has been ingested into the dataset.

    2. Pipeline: Attach or detach pipelines, or execute all pipelines attached to the dataset.

    3. Catalog: Catalog-related operations like creating a new table or viewing all the catalog import jobs.

    4. Settings: Other operations on the dataset, such as managing permissions or deactivating the dataset.
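
The search, filter, and status behavior listed above can be pictured with a small Python model. This is only an illustrative sketch, not the product's API: the names `Dataset`, `DatasetType`, `Status`, and `filter_datasets` are hypothetical, and matching the search text as a case-insensitive substring of the dataset name is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class DatasetType(Enum):
    IMAGE = "Image"
    VIDEO = "Video"


class Status(Enum):
    ACTIVATED = "Activated"
    DEACTIVATED = "Deactivated"


@dataclass
class Dataset:
    # Hypothetical stand-in for one dataset card on the page.
    name: str
    dataset_type: DatasetType
    status: Status = Status.ACTIVATED

    def can_register_data(self) -> bool:
        # New data can be registered only on an activated dataset.
        return self.status is Status.ACTIVATED

    def can_access_data(self) -> bool:
        # Existing data can be accessed and visualized in both states.
        return True


def filter_datasets(datasets, name_query="", dataset_type=None, status=None):
    """Mimic the search box plus the type and status filters.

    Assumption: the search matches a case-insensitive substring of the name.
    A filter left as None is not applied.
    """
    query = name_query.lower()
    return [
        d for d in datasets
        if query in d.name.lower()
        and (dataset_type is None or d.dataset_type is dataset_type)
        and (status is None or d.status is status)
    ]
```

For example, given one activated Image dataset and one deactivated Video dataset, filtering on `status=Status.DEACTIVATED` returns only the Video dataset, which still reports `can_access_data()` as true but `can_register_data()` as false.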

