Pelagic Pixels

• 🌊˖°𓇼⋆🐋🐚 𓈒🫧˚

A Synthetic Data Generation Pipeline for Marine Object Detection

Project Repository


🎯 Purpose

Pelagic Pixels is a synthetic data generation pipeline that addresses the scarcity of real-world training data for marine object detection. It combines 3D model animations with authentic underwater footage, so that AI models can be trained to identify rare and protected marine species, such as sea turtles, rays, and sharks, in an ethical and scalable manner.


🏆 Milestones

Development is organized into three milestones, each focused on a critical part of the pipeline.


M1: Foundation Setup

- Objective: Establish the core components of the synthetic data generation pipeline.
- Achievements:
  - Acquired high-quality 3D models of marine species.
  - Developed animation modules to simulate natural behaviors.
  - Integrated basic underwater footage into the pipeline.
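
To illustrate how a rendered animation frame can be placed onto underwater footage, here is a minimal sketch of standard alpha ("over") compositing. It is not the project's actual code: the function names `blend_pixel` and `composite` are hypothetical, frames are represented as plain nested lists to keep the example dependency-free, and a real pipeline would more likely operate on image arrays.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]
RGBA = Tuple[int, int, int, int]


def blend_pixel(fg: RGBA, bg: RGB) -> RGB:
    """Standard 'over' blend: out = alpha * fg + (1 - alpha) * bg."""
    alpha = fg[3] / 255.0
    return tuple(round(alpha * f + (1.0 - alpha) * b) for f, b in zip(fg[:3], bg))


def composite(frame: List[List[RGB]], sprite: List[List[RGBA]],
              top: int, left: int) -> List[List[RGB]]:
    """Paste an RGBA sprite onto a copy of an RGB frame, clipping at the edges."""
    out = [row[:] for row in frame]  # copy rows so the input frame is untouched
    height, width = len(frame), len(frame[0])
    for dy, sprite_row in enumerate(sprite):
        for dx, fg in enumerate(sprite_row):
            y, x = top + dy, left + dx
            if 0 <= y < height and 0 <= x < width:  # skip pixels outside the frame
                out[y][x] = blend_pixel(fg, out[y][x])
    return out


# Tiny usage example: a 2x2 "water" frame and a 1x1 opaque sprite.
water = [[(0, 0, 100), (0, 0, 100)], [(0, 0, 100), (0, 0, 100)]]
turtle = [[(200, 200, 0, 255)]]
result = composite(water, turtle, top=0, left=0)
```

Because the sprite's position and extent are known at paste time, the same step can also record a bounding box for the pasted object, which is what makes synthetic data cheap to label.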


M2: Dataset Expansion

- Objective: Enhance dataset diversity and realism.
- Achievements:
  - Implemented randomized augmentations for model placement, size, lighting, and color.
  - Expanded the dataset to include varied environmental conditions.
  - Refactored code for improved reusability and scalability.
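
The randomized augmentations listed above could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's implementation: the `AugmentationParams` class, the `sample_params` and `apply_lighting` helpers, and every numeric range are hypothetical placeholders chosen for the example.

```python
import random
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class AugmentationParams:
    """One random draw of per-sample settings (all ranges are hypothetical)."""
    x_center: float                    # normalized horizontal placement, 0..1
    y_center: float                    # normalized vertical placement, 0..1
    scale: float                       # relative size of the rendered animal
    brightness: float                  # global lighting multiplier
    tint: Tuple[float, float, float]   # per-channel (R, G, B) color multipliers


def sample_params(rng: random.Random) -> AugmentationParams:
    """Draw placement, size, lighting, and color from uniform ranges."""
    return AugmentationParams(
        x_center=rng.uniform(0.1, 0.9),
        y_center=rng.uniform(0.1, 0.9),
        scale=rng.uniform(0.5, 1.5),
        brightness=rng.uniform(0.6, 1.2),
        tint=(rng.uniform(0.7, 1.0), rng.uniform(0.8, 1.0), rng.uniform(0.9, 1.0)),
    )


def apply_lighting(pixel: Tuple[int, int, int],
                   params: AugmentationParams) -> Tuple[int, int, int]:
    """Scale one RGB pixel by brightness and tint, clamping to 0..255."""
    return tuple(
        max(0, min(255, round(channel * params.brightness * t)))
        for channel, t in zip(pixel, params.tint)
    )


# A seeded generator makes each generated sample reproducible.
rng = random.Random(42)
params = sample_params(rng)
```

Passing an explicit `random.Random` instance rather than using the module-level functions keeps each batch reproducible from its seed, which helps when a problematic sample needs to be regenerated and inspected.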


M3: Testing & Documentation

- Objective: Ensure pipeline reliability and transparency.
- Achievements:
  - Added comprehensive documentation for pipeline usage.
  - Integrated testing with YOLOv5 to validate data integrity.
  - Identified and addressed synthetic-to-real domain gaps.
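
One simple form of data-integrity check is to verify the annotation files before training. YOLOv5 expects one label line per object in the form `class x_center y_center width height`, with coordinates normalized to the image size. The sketch below writes and validates such lines; the helper names `to_yolo_line` and `validate_label_line` and the class list are assumptions made for illustration, not part of YOLOv5 or of this project's code.

```python
from typing import List, Tuple

# Hypothetical class list; the index order is an assumption for this sketch.
CLASS_NAMES = ["sea_turtle", "ray", "shark"]


def to_yolo_line(class_id: int, box: Tuple[float, float, float, float],
                 img_w: int, img_h: int) -> str:
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line."""
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def validate_label_line(line: str, num_classes: int) -> List[str]:
    """Return a list of problems found in one label line (empty if it looks valid)."""
    fields = line.split()
    if len(fields) != 5:
        return [f"expected 5 fields, found {len(fields)}"]
    try:
        class_id = int(fields[0])
        x_c, y_c, w, h = (float(v) for v in fields[1:])
    except ValueError:
        return ["non-numeric field"]
    problems = []
    if not 0 <= class_id < num_classes:
        problems.append(f"class id {class_id} out of range")
    if not all(0.0 <= v <= 1.0 for v in (x_c, y_c, w, h)):
        problems.append("coordinates must be normalized to [0, 1]")
    if w <= 0 or h <= 0:
        problems.append("box has zero or negative size")
    return problems


# Usage: a 200x200 px box in a 640x480 frame, labeled as class 0.
line = to_yolo_line(0, (100, 50, 300, 250), img_w=640, img_h=480)
issues = validate_label_line(line, num_classes=len(CLASS_NAMES))
```

Running a check like this over every generated label file catches out-of-range classes and malformed boxes early, before they silently degrade training.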


🌟 Features

- Ethical Data Collection: Uses synthetic data to avoid intrusive real-world data collection of protected species.
- Advanced Augmentation: Varies lighting, orientation, size, and color to create a robust and diverse dataset.
- Scalability: Generates large-scale datasets efficiently, so models can be trained extensively without disturbing protected species.


🤝 Contribution

As a key contributor to Pelagic Pixels, I focused on developing the augmentation module and integrating it seamlessly with the underwater footage. This involved:
- Augmentation Development: Implemented randomized adjustments for lighting, color, and size to enhance data realism.
- Integration: Merged species-specific animations with diverse underwater environments.
- Automation: Developed an automated process for generating large amounts of synthetic data.
This role required both technical expertise and a commitment to ethical data practices, so that the pipeline remains both effective and responsible.


📈 Results

Initial testing with YOLOv5 indicated that synthetic data is effective for recognizing specific marine species. However, closing the synthetic-to-real domain gap remains a priority for improving model generalization to real-world scenarios.

© 2024 Ursula Nichols | Pelagic Pixels Project