TPDNet Download Page

Download TPDNet

Introduction

We construct a 3D object detection dataset for wax gourds and propose a network, TPDNet, that recovers the 3D information of fruits and vegetables in the field from a single RGB image. Because a single RGB image lacks spatial depth information, we build a depth estimation and enhancement module. It introduces depth information into the model with the help of auxiliary depth labels, and it improves the depth representation by weighting features across the spatial and channel dimensions. Because depth features and image features are heterogeneous, we also design phenotype aggregation and phenotype intensify modules, which capture the correspondence between image and depth features and promote effective fusion of the two. Experimental results show that our method significantly outperforms the compared methods, demonstrating its effectiveness.
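To make the idea of weighting depth features across the channel and spatial dimensions concrete, here is a minimal NumPy sketch. It is not the TPDNet module: the function name `reweight_depth_features` is hypothetical, and the parameter-free gates (a sigmoid over simple averages) are an assumption standing in for whatever learned layers the network actually uses.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function, mapping values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def reweight_depth_features(feat):
    """Scale a (C, H, W) feature map by channel weights, then spatial weights.

    Hypothetical, parameter-free stand-in for a learned attention block.
    """
    # Channel weights: one value per channel from its global average.
    channel_w = sigmoid(feat.mean(axis=(1, 2)))      # shape (C,)
    feat = feat * channel_w[:, None, None]

    # Spatial weights: one value per pixel from the average over channels.
    spatial_w = sigmoid(feat.mean(axis=0))           # shape (H, W)
    return feat * spatial_w[None, :, :]


# Toy depth feature map: 8 channels on a 4 x 6 grid.
rng = np.random.default_rng(0)
depth_feat = rng.standard_normal((8, 4, 6))
out = reweight_depth_features(depth_feat)
```

The output has the same shape as the input, and because every weight lies in (0, 1), each value is only ever scaled down in magnitude. A trained module would learn which channels and locations to keep rather than deriving the weights from plain averages.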

Usage

The trained weights for TPDNet are linked in the list below:

Model Structure	Parameters
TPDNet	484 KB
Pre-trained Model	208 MB

Datasets

Wax Gourd 3D Object Detection Dataset

Dependencies

Linux with Python >= 3.7
CUDA >= 11.2
PyTorch
Torchvision
fire
matplotlib
numpy
scipy
pillow
pandas
scikit-image
opencv-python
numba
easydict
tensorflow
cython
tqdm
pyquaternion

NOTICE: different versions of PyTorch have different memory usage.
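
Before training, it can help to confirm that the Python packages listed above are importable. The following is a small sketch using only the standard library. The helper name `missing_packages` is hypothetical, and the mapping reflects the usual import names of these packages (for example, pillow is imported as `PIL` and opencv-python as `cv2`). It does not check versions.

```python
import importlib.util

# Package name as listed above -> module name used in "import ...".
IMPORT_NAMES = {
    "PyTorch": "torch",
    "Torchvision": "torchvision",
    "fire": "fire",
    "matplotlib": "matplotlib",
    "numpy": "numpy",
    "scipy": "scipy",
    "pillow": "PIL",
    "pandas": "pandas",
    "scikit-image": "skimage",
    "opencv-python": "cv2",
    "numba": "numba",
    "easydict": "easydict",
    "tensorflow": "tensorflow",
    "cython": "Cython",
    "tqdm": "tqdm",
    "pyquaternion": "pyquaternion",
}


def missing_packages(import_names=IMPORT_NAMES):
    """Return the sorted names of packages whose module cannot be found."""
    return sorted(
        package
        for package, module in import_names.items()
        if importlib.util.find_spec(module) is None
    )


print("Missing packages:", missing_packages())
```

An empty list means every dependency was found in the current environment.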

Experiments

3D Bounding Box Visualization

Load TPDNet

Training:
./launcher/train.sh config/config.py

Testing:
./launcher/eval.sh config/config.py
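
The two commands above can also be chained from Python so that evaluation only starts if training succeeds. This is a sketch under assumptions: the helper names `build_command` and `run_pipeline` are hypothetical, and it assumes the scripts take only the config path as an argument, exactly as shown above, and are run from the repository root.

```python
import subprocess

# Launcher scripts exactly as given in the commands above.
SCRIPTS = {
    "train": "./launcher/train.sh",
    "eval": "./launcher/eval.sh",
}


def build_command(stage, config="config/config.py"):
    """Return the argument list for one stage ("train" or "eval")."""
    if stage not in SCRIPTS:
        raise ValueError(f"unknown stage: {stage!r}")
    return [SCRIPTS[stage], config]


def run_pipeline(config="config/config.py", stages=("train", "eval")):
    """Run the stages in order; a non-zero exit code stops the pipeline."""
    for stage in stages:
        # check=True raises CalledProcessError if the script fails.
        subprocess.run(build_command(stage, config), check=True)
```

Calling `run_pipeline()` from the repository root runs training and then testing with the default config.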

GitHub

https://github.com/GZU-SAMLab/TPDNet