Quick Import catalog
- Updated on 12 Sep 2024
Data Explorer lets you enrich your data with metadata, referred to as the catalog. The catalog can be imported into a dataset in the following ways:
- Connect to an SQL-based catalog store such as MySQL, SQL Server, PostgreSQL, or Amazon Athena without copying the catalog into Data Explorer. See External-catalog for details.
- Import a catalog by uploading files in one of the following formats:
  - CSV file
  - COCO formatted JSON file
  - YOLO Darknet, YOLOv8-Ultralytics, or YOLOv8-PyTorch formatted files
Importing a COCO formatted annotation file
The video below provides a walkthrough of importing a COCO formatted file.
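For readers unfamiliar with the format, the sketch below builds a minimal COCO-style annotation document in Python and checks that its internal references are consistent. The file name, ids, and box values are made up for illustration, and the fields Data Explorer actually reads are not stated here, so treat this as an illustration of the general COCO layout rather than an import specification.

```python
import json

# Minimal COCO-style annotation document: one image, one category, one box.
# File names, ids, and values are invented for illustration only.
coco = {
    "images": [
        {"id": 1, "file_name": "a.jpg", "width": 640, "height": 480},
    ],
    "categories": [
        {"id": 1, "name": "car"},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 1,
            # COCO boxes are [x_min, y_min, width, height] in pixels.
            "bbox": [100.0, 120.0, 200.0, 150.0],
            "area": 200.0 * 150.0,
            "iscrowd": 0,
        },
    ],
}


def check_references(doc):
    """Return True if every annotation points at a known image and category."""
    image_ids = {img["id"] for img in doc["images"]}
    category_ids = {cat["id"] for cat in doc["categories"]}
    return all(
        ann["image_id"] in image_ids and ann["category_id"] in category_ids
        for ann in doc["annotations"]
    )


# Serialized text that could be saved as a .json file before uploading.
coco_text = json.dumps(coco, indent=2)
```

Running a check like `check_references` before uploading is one way to catch dangling image or category ids early.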
Importing a CSV annotation file
The section below describes the steps to import a catalog into a freshly created catalog table with data types automatically inferred.
For advanced use cases, you can also create a table with specific data types and import the catalog as a follow-on operation, or incrementally import new catalog records into an existing table. Refer to Import catalog for details on these advanced operations.
- Navigate to 'Data -> Repo -> Datasets' on the left navigation panel and click the Catalog button on the dataset card.
- Click the hamburger button as shown below and select 'Import catalog'.
- Enter catalog name and select the type of the file being imported.
- Select the file to import. For a CSV file, the first line is expected to be a header, and the catalog columns will be created with the same name as in the header. You can also select more than one file.
- Preview the contents and Upload.
- Once the upload is complete, click 'Import'.
- The status of the import is presented in the screen below. Click on the 'Refresh' button to update the status.
- Once the status transitions to 'SUCCESS', close the popup. The imported table is listed under the 'Dataset tables' shown below.
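As a rough illustration of the header-row convention described in the steps above, the following Python sketch produces CSV text whose first line names the catalog columns. The column names (`image_name`, `label`, `confidence`) and the helper `build_catalog_csv` are hypothetical. Only the rule that the first line is a header comes from this page.

```python
import csv
import io

# Hypothetical catalog records; only the header-row convention is from the docs.
rows = [
    {"image_name": "a.jpg", "label": "car", "confidence": 0.91},
    {"image_name": "b.jpg", "label": "bus", "confidence": 0.78},
]


def build_catalog_csv(records):
    """Serialize records to CSV text whose first line is the header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(records[0].keys()),
        lineterminator="\n",
    )
    writer.writeheader()  # first line: column names used for the catalog table
    writer.writerows(records)
    return buffer.getvalue()


csv_text = build_catalog_csv(rows)
header = csv_text.splitlines()[0].split(",")
```

The resulting text can be written to a `.csv` file and selected in the import dialog. Because data types are inferred automatically in this flow, keeping each column's values consistent (all numbers or all strings) is probably a sensible precaution.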
Importing catalog in YOLO formats
The overall import flow is the same as for the CSV and COCO JSON files described above, except that you select the appropriate YOLO variant as the 'File type' in the drop-down below.
The YOLO format input consists of a metadata file and one or more label folders containing one annotation file per image. After you select the required YOLO variant from 'File type', the UI provides the following controls to upload the metadata file and label folders.
YOLO-Darknet
- Metadata file - File with any extension with 1 line per class name.
- Label folders - Each label folder must have a .txt file per image, with the name of the .txt file being the same as the image name (e.g. a.txt for a.jpg image).
Metadata file
--------------
car
bus
truck

Label Folders
--------------
Each label folder must have a .txt file per image with name of .txt file same as image name (e.g. a.txt for a.jpg image).
Each line of text file represents one annotation in the format
<class_id> <x_centre> <y_centre> <box_width> <box_height>
class_id: Index to the class in metadata file. First class has index 0.
All co-ordinates and sizes are between 0-1 normalized to the image height and width.
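To make the annotation line format concrete, here is a minimal Python sketch that converts one label line into a pixel-space box. The names `PixelBox` and `parse_yolo_line` are hypothetical helpers, not part of Data Explorer. The sketch assumes the usual YOLO convention that x values and widths are scaled by the image width, and y values and heights by the image height.

```python
from typing import NamedTuple


class PixelBox(NamedTuple):
    class_name: str
    x_min: float
    y_min: float
    width: float
    height: float


def parse_yolo_line(line, class_names, image_width, image_height):
    """Convert '<class_id> <x_centre> <y_centre> <box_width> <box_height>' to pixels."""
    class_id, x_c, y_c, w, h = line.split()
    x_c, y_c, w, h = (float(v) for v in (x_c, y_c, w, h))
    box_w = w * image_width    # normalized width  -> pixels
    box_h = h * image_height   # normalized height -> pixels
    return PixelBox(
        class_name=class_names[int(class_id)],  # class_id is a 0-based index
        x_min=x_c * image_width - box_w / 2,    # centre -> top-left corner
        y_min=y_c * image_height - box_h / 2,
        width=box_w,
        height=box_h,
    )


# Class names in the order they appear in the metadata file.
class_names = ["car", "bus", "truck"]
box = parse_yolo_line("1 0.5 0.5 0.25 0.5", class_names, image_width=640, image_height=480)
```

For a 640x480 image, the line `1 0.5 0.5 0.25 0.5` describes a `bus` box of 160x240 pixels whose top-left corner is at (240, 120).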
YOLOv8-Ultralytics
- Metadata file - A .yaml file with names as the mandatory key that provides class id to class name mapping.
- Label folders - Each label folder must have a .txt file per image, with the name of the .txt file being the same as the image file name (e.g., a.txt for a.jpg image file).
Metadata file
--------------
names:
  0: car
  1: bus
  2: truck

Label Folders
--------------
Each label folder must have a .txt file per image with name of .txt file same as image name (e.g. a.txt for a.jpg image).
Each line of text file represents one annotation in the format
<class_id> <x_centre> <y_centre> <box_width> <box_height>
class_id: Index to the class in metadata file. First class has index 0.
All co-ordinates and sizes are between 0-1 normalized to the image height and width.
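If you already have a Darknet-style class list (one class name per line), a small script can produce the `names` mapping shown above. The sketch below is one possible approach using plain string formatting. The function name `names_mapping_yaml` is hypothetical, and class names containing special YAML characters (such as `:` or `#`) would need quoting that this sketch does not handle.

```python
def names_mapping_yaml(class_names):
    """Render an id-to-name 'names' mapping as YAML text."""
    lines = ["names:"]
    for class_id, name in enumerate(class_names):  # ids start at 0
        lines.append(f"  {class_id}: {name}")
    return "\n".join(lines) + "\n"


# Darknet-style metadata: one class name per line.
darknet_text = "car\nbus\ntruck\n"
class_names = [line.strip() for line in darknet_text.splitlines() if line.strip()]

yaml_text = names_mapping_yaml(class_names)
```

Here `yaml_text` holds the mapping `0: car`, `1: bus`, `2: truck` under the `names` key and can be saved with a `.yaml` extension.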
YOLOv8-PyTorch
- Metadata file - A .yaml file with names as a mandatory key that lists the class names.
- Label folders - Each label folder must have a .txt file per image, with the name of the .txt file being the same as the image file name (e.g., a.txt for a.jpg image file).
Metadata file
--------------
names: ['car', 'bus', 'truck']

Label Folders
--------------
Each label folder must have a .txt file per image with name of .txt file same as image name (e.g. a.txt for a.jpg image).
Each line of text file represents one annotation in the format
<class_id> <x_centre> <y_centre> <box_width> <box_height>
class_id: Index to the class in metadata file. First class has index 0.
All co-ordinates and sizes are between 0-1 normalized to the image height and width.
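Since all three YOLO variants share the same label folder rules, it can help to check a folder before uploading. The following Python sketch is a hypothetical pre-upload check and not a Data Explorer feature. It assumes images and labels live in two separate directories and that images use one of the extensions in `IMAGE_SUFFIXES`. Both are assumptions made for illustration.

```python
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed set of image extensions


def find_label_problems(image_dir, label_dir, num_classes):
    """Return a list of human-readable problems found in a YOLO label folder."""
    problems = []
    for image_path in sorted(Path(image_dir).iterdir()):
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        # Rule: a.jpg must be paired with a.txt.
        label_path = Path(label_dir) / (image_path.stem + ".txt")
        if not label_path.exists():
            problems.append(f"{image_path.name}: missing {label_path.name}")
            continue
        for line_no, line in enumerate(label_path.read_text().splitlines(), start=1):
            fields = line.split()
            if not fields:
                continue  # ignore blank lines
            if len(fields) != 5:
                problems.append(f"{label_path.name}:{line_no}: expected 5 fields")
                continue
            try:
                class_id = int(fields[0])
                values = [float(v) for v in fields[1:]]
            except ValueError:
                problems.append(f"{label_path.name}:{line_no}: non-numeric field")
                continue
            if not 0 <= class_id < num_classes:
                problems.append(f"{label_path.name}:{line_no}: class_id {class_id} out of range")
            if any(not 0.0 <= v <= 1.0 for v in values):
                problems.append(f"{label_path.name}:{line_no}: value outside 0-1")
    return problems


# Small demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    images = Path(root) / "images"
    labels = Path(root) / "labels"
    images.mkdir()
    labels.mkdir()
    (images / "a.jpg").touch()
    (images / "b.jpg").touch()  # deliberately has no label file
    (labels / "a.txt").write_text("0 0.5 0.5 0.2 0.2\n3 0.5 0.5 1.2 0.2\n")
    problems = find_label_problems(images, labels, num_classes=3)
```

In the demonstration, the second line of `a.txt` is flagged twice (class id 3 with only three classes, and a width of 1.2), and `b.jpg` is flagged for its missing `b.txt`.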