Release 1.18
  • 04 Oct 2024

Release 1.18.185 (Bugfix release) - Oct 4, 2024

  • Fix for failure to download the zip file with datagen output files.

Release 1.18.183 - Sep 25, 2024

Datagen (Synthetic data generation) enhancements

Datagen feature details

  • Draw multiple bounding boxes to mark multiple instances of a concept object.

  • Draw multiple segments to mark multiple generation locations on the generation canvas.

Visualize 3D point cloud files

Release 1.18.170 - Aug 30, 2024

Datagen (Synthetic data generation) enhancements

  • Separation of generation canvas and background images sections

  • Support importing input images from the catalog and the image repository.

  • A filter to select a category of image from the image repository

Import external features/embeddings

  • For image datasets, import external features (embeddings). If you have image features (embeddings) generated by a featurizer outside of Data Explorer, you can import them to the catalog and create visualization jobs that use them.

    • Prepare a CSV file as per instructions shown below.

    • Use File Type=’Features’ in the ‘Import catalog’ flow.

Autolabel enhancements

  • Segment detection in autolabel object detection jobs.

    • Use the ‘Show segments’ toggle in the labeling job submit screen to prepare segment information.

    • Toggle between bounding box and segments display in the full-resolution view.

  • Annotation state based filters

Other enhancements

  • File path prefix based grouping in the dataset details page. This gives a breakdown of dataset files by file path prefix and can be used to view the class distribution if the dataset has directories organized by class.

  • Support to dump catalog attributes and cluster name in resultset dump output.

  • Sort clusters by size
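
The idea behind the file path prefix grouping above can be pictured with a short sketch. This is only an illustration of the concept, not Data Explorer's implementation; the function name `count_by_prefix`, the `depth` parameter, and the sample file paths are all hypothetical.

```python
from collections import Counter
from pathlib import PurePosixPath


def count_by_prefix(paths, depth=1):
    """Count files under each prefix made of the first `depth` directories."""
    counts = Counter()
    for p in paths:
        parts = PurePosixPath(p).parts[:-1]  # drop the file name
        prefix = "/".join(parts[:depth]) if parts else "(root)"
        counts[prefix] += 1
    return counts


# Hypothetical dataset layout: split directory, then one directory per class.
files = [
    "train/cat/001.jpg",
    "train/cat/002.jpg",
    "train/dog/001.jpg",
    "val/cat/003.jpg",
]

by_split = count_by_prefix(files, depth=1)  # files per top-level directory
by_class = count_by_prefix(files, depth=2)  # files per split/class directory
```

When directories are organized by class, the deeper grouping doubles as a class distribution.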

Release 1.18.147 (Maintenance release) - Aug 16, 2024

  • Fix an issue with 16-bit grayscale image uploads as inputs in datagen and labeling specs.

Release 1.18.140 (Maintenance release) - Aug 09, 2024

  • Increase the maximum size of an image that can be uploaded as input in Datagen projects and labeling specs from 8MB to 50MB.

Release 1.18.131 (Maintenance release) - Aug 01, 2024

Dataset dashboard page enhancements

  • A catalog summary section that shows a breakdown of how often each value occurs in a categorical column. The eligible columns are presented in the dropdowns for selection. Currently supported columns for this display are:

    • A column of type ClassLabelGT or ClassLabelPred in a view.

    • A class column in a table that was imported using a supported standard input format, such as COCO and all variants of YOLO formats.

  • The sample images section presents the outlier cluster first, followed by the other clusters in decreasing order of cluster size.

  • An expand button is provided to view the Sample images in expanded mode.
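
As a rough illustration of the two displays described above, the sketch below counts the values of a categorical column and orders clusters with the outlier cluster first. It assumes nothing about Data Explorer's internals; the cluster names, the `"outlier"` label, and the sample values are hypothetical.

```python
from collections import Counter


def value_breakdown(values):
    """Return (value, count) pairs, most frequent first."""
    return Counter(values).most_common()


def order_clusters(cluster_sizes, outlier_name="outlier"):
    """Outlier cluster first, then the remaining clusters by decreasing size."""
    rest = sorted(
        (name for name in cluster_sizes if name != outlier_name),
        key=lambda name: cluster_sizes[name],
        reverse=True,
    )
    head = [outlier_name] if outlier_name in cluster_sizes else []
    return head + rest


# Hypothetical class labels and cluster sizes.
labels = ["person", "car", "person", "dog", "person", "car"]
sizes = {"cluster_a": 120, "outlier": 7, "cluster_b": 340, "cluster_c": 45}

breakdown = value_breakdown(labels)
ordered = order_clusters(sizes)
```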

Datagen (Synthetic data generation) enhancements

On the left navigation bar, the ‘Utilities→Get Datagen bucket credentials’ option provides temporary S3 access credentials for copying out datagen output files. The access credentials consist of an AWS access key, a secret key, and a session token that expires in 12 hours. Configure these credentials in the AWS CLI or an AWS SDK (such as Boto3 for Python) and download the datagen output files listed under the ‘Download file list’ option in a datagen session.
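
One way to use the temporary credentials is sketched below. The environment variable names and the `boto3.Session` keyword arguments are the standard AWS ones; the helper names and the bucket, key, and destination arguments are hypothetical placeholders, since the actual values come from the credentials dialog and the ‘Download file list’ output.

```python
from datetime import datetime, timedelta, timezone

SESSION_LIFETIME = timedelta(hours=12)  # expiry stated in the release note


def aws_env(access_key, secret_key, session_token):
    """Environment variables read by the AWS CLI and SDKs for temporary credentials."""
    return {
        "AWS_ACCESS_KEY_ID": access_key,
        "AWS_SECRET_ACCESS_KEY": secret_key,
        "AWS_SESSION_TOKEN": session_token,
    }


def is_expired(issued_at, now=None):
    """True once 12 hours have passed since the credentials were obtained."""
    now = now or datetime.now(timezone.utc)
    return now - issued_at >= SESSION_LIFETIME


def download(bucket, key, dest, access_key, secret_key, session_token):
    """Download one object with Boto3 (requires `pip install boto3`)."""
    import boto3  # imported lazily so the helpers above work without it

    session = boto3.Session(
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
        aws_session_token=session_token,
    )
    session.client("s3").download_file(bucket, key, dest)
```

All three values must be supplied together; a request that omits the session token is rejected because the access key is only valid as part of the temporary session. Once `is_expired` returns true, fetch a fresh set from the ‘Get Datagen bucket credentials’ option.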

Other enhancements

  • Resultset dump provides cluster name as an additional attribute to dump.

  • Show the ‘catalog tags’ option against a resultset image

Bug fixes

  • Fix for low-quality text search results produced in certain cases. The fix applies only to newly created visualization jobs.

  • Fix for failure when a new dataset is created for a datagen project.

Release 1.18.114 - July 24, 2024

Datagen (Synthetic data generation) support (Beta)

In use cases like visual quality inspection, a challenge is to collect enough defect/anomaly samples to train a model for production-grade accuracy. Synthetic data generation can help augment your dataset with synthesized images for the rarely occurring categories. The guide below provides a brief overview of Datagen capabilities available in this release.

Enhanced dataset details page

  • The dataset cards on the dataset listing page are now more concise and show only the basic information.

  • The ‘View’ button on the dataset card opens the dataset details (dashboard) page. The dashboard page shows:

    • Configuration parameters for the datasets under the ‘View More’ button.

    • A few sample images organized under content-based clusters.

    • Links to different types of jobs.

  • Recent searches and jobs for quick access.

Visual catalog page

The catalog page now presents the results of a pre-executed default query. Additionally, the catalog page has been redesigned to focus on a visual presentation of the dataset and catalog contents. The tabular mode is available as a toggle option.


The video below provides an overview of the controls on the visual catalog page.

Autolabel enhancements

  • Global image samples: Provide a small list (50+) of images with ground truth information to enhance the autolabel accuracy. The global image samples are provided in the label specification. The guide below provides an overview of the steps to add global image samples.

  • Fine-grained object detection: If the image is high resolution or the objects to autolabel are small, running object detection on cropped images is expected to give better results. In the job submission screen, turn on the ‘Label on image crops’ toggle to enable fine-grained detection.
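
The general technique behind detection on image crops can be sketched as follows: split the image into overlapping windows, run the detector on each window, and shift the resulting boxes back into full-image coordinates. This is a generic illustration, not the implementation used by the ‘Label on image crops’ option; the function names, crop size, and overlap are hypothetical.

```python
def crop_windows(width, height, crop=1024, overlap=128):
    """Return (left, top, right, bottom) windows that together cover the image."""
    if overlap >= crop:
        raise ValueError("overlap must be smaller than crop")
    step = crop - overlap

    def starts(size):
        if size <= crop:
            return [0]
        offsets = list(range(0, size - crop, step))
        offsets.append(size - crop)  # last window sits flush with the image edge
        return offsets

    return [
        (x, y, min(x + crop, width), min(y + crop, height))
        for y in starts(height)
        for x in starts(width)
    ]


def to_full_image(box, window):
    """Shift a crop-local (x1, y1, x2, y2) box into full-image coordinates."""
    left, top = window[0], window[1]
    x1, y1, x2, y2 = box
    return (x1 + left, y1 + top, x2 + left, y2 + top)


# A hypothetical 2000x1000 image yields three overlapping windows in one row.
windows = crop_windows(2000, 1000, crop=1024, overlap=128)
shifted = to_full_image((10, 20, 50, 60), windows[1])
```

The overlap keeps an object that straddles a window boundary fully visible in at least one crop, at the cost of possible duplicate detections that need to be merged afterwards.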

Other enhancements

  • Bounding box navigation enhancements in full-resolution image

    • Hover over the confidence legend (top left) or the ground truth legend (bottom left) to view details about that bounding box.

    • Hover over the class name badge on the catalog card to highlight the bounding box corresponding to that card. In the image below, hover the mouse over the highlighted ‘person’ badge to bring the corresponding bounding box into focus.

  • Update default view: A default view associated with the dataset significantly improves the catalog navigation features. Older datasets may not have a default view. For these, the catalog page provides the following option to force-create a default view by joining all the pipeline primary tables.

