Download

Download TPDNet

Introduction

We construct a 3D object detection dataset for wax gourds and propose a network, TPDNet, that captures the 3D information of fruits and vegetables in the field from a single RGB image. Because a single RGB image lacks spatial depth information, we build a depth estimation and enhancement module that introduces depth information into the model with the help of auxiliary depth labels and improves the depth representation by weighting features across the spatial and channel dimensions. Because depth features and image features are heterogeneous, we also design the phenotype aggregation and phenotype intensify modules, which capture the correspondence between image and depth features and promote the effective fusion of the two. Experimental results show that our method significantly outperforms the compared methods, demonstrating its effectiveness.
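
To make the idea of weighting features across the spatial and channel dimensions concrete, here is a minimal NumPy sketch. It is an illustration only, not the module used in TPDNet: the function name `reweight_depth_features` is hypothetical, and the gates here have no learned parameters, whereas a real network would learn them.

```python
import numpy as np


def sigmoid(x):
    """Logistic function; maps any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def reweight_depth_features(feat):
    """Re-weight a (C, H, W) feature map along channel and spatial axes.

    Assumption: simple parameter-free gating for illustration; the actual
    TPDNet module is learned and may be structured differently.
    """
    # Channel weights: one scalar per channel from global average pooling.
    channel_w = sigmoid(feat.mean(axis=(1, 2), keepdims=True))  # (C, 1, 1)
    feat = feat * channel_w
    # Spatial weights: one scalar per pixel from the mean over channels.
    spatial_w = sigmoid(feat.mean(axis=0, keepdims=True))  # (1, H, W)
    return feat * spatial_w


rng = np.random.default_rng(0)
depth_feat = rng.standard_normal((8, 4, 6))  # toy depth feature map
out = reweight_depth_features(depth_feat)
print(out.shape)  # same shape as the input
```

Both gates lie strictly between 0 and 1, so the output keeps the input's shape and never has a larger magnitude than the input at any position.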

Usage

The trained weights for TPDNet are linked in the list below:

Model Structure Parameters
TPDNet 484 KB
Pre-trained Model 208 MB

Datasets

Wax Gourd 3D Object Detection Dataset

Dependencies

  • Linux with Python >= 3.7
  • CUDA >= 11.2
  • PyTorch
  • Torchvision
  • fire
  • matplotlib
  • numpy
  • scipy
  • pillow
  • pandas
  • scikit-image
  • opencv-python
  • numba
  • easydict
  • tensorflow
  • cython
  • tqdm
  • pyquaternion

Note: Different versions of PyTorch have different memory usage.
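
Before training, it can help to confirm that the packages listed above are importable. The following is a minimal sketch using only the standard library. The helper name `find_missing` is hypothetical, and the import names are assumptions based on each package's usual module name (for example, pillow is imported as `PIL` and opencv-python as `cv2`).

```python
import importlib.util
import sys

# Import names for the packages in the dependency list. Several differ from
# the pip package name: pillow -> PIL, scikit-image -> skimage,
# opencv-python -> cv2, cython -> Cython.
REQUIRED_MODULES = [
    "torch", "torchvision", "fire", "matplotlib", "numpy", "scipy", "PIL",
    "pandas", "skimage", "cv2", "numba", "easydict", "tensorflow", "Cython",
    "tqdm", "pyquaternion",
]


def find_missing(modules):
    """Return the module names that cannot be located in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    if sys.version_info < (3, 7):
        print("Python >= 3.7 is required")
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules were found")
```

This only checks that each module can be located; it does not verify versions or that CUDA is usable.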

Experiments

3D Bounding Box Visualization

Load CSNet

Training:
./launcher/train.sh config/config.py

Testing:
./launcher/eval.sh config/config.py
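
To run both steps from a single Python script, a thin wrapper around the two launcher scripts could look like the sketch below. It assumes each launcher takes the config path as its only argument, as in the commands above. `build_command` and `run_stage` are hypothetical helpers and are not part of the repository.

```python
import subprocess

# Launcher scripts as given in the commands above.
LAUNCHERS = {
    "train": "./launcher/train.sh",
    "eval": "./launcher/eval.sh",
}


def build_command(stage, config="config/config.py"):
    """Return the argument list for the given stage ('train' or 'eval')."""
    if stage not in LAUNCHERS:
        raise ValueError(f"unknown stage: {stage!r}")
    return [LAUNCHERS[stage], config]


def run_stage(stage, config="config/config.py"):
    """Run one stage from the repository root; raises if the script fails."""
    subprocess.run(build_command(stage, config), check=True)


if __name__ == "__main__":
    run_stage("train")
    run_stage("eval")
```

With `check=True`, evaluation does not start if training exits with a non-zero status.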
                

GitHub

https://github.com/GZU-SAMLab/TPDNet