Download TPDNet
Introduction
We construct a 3D object detection dataset for wax gourds and propose a network, TPDNet, that recovers the 3D information of fruits and vegetables in the field from a single RGB image. Because a single RGB image lacks spatial depth information, we build a depth estimation and enhancement module that introduces depth information into the model with the help of auxiliary depth labels and strengthens the depth representation by weighting features across the spatial and channel dimensions. Because depth features and image features are heterogeneous, we also design a phenotype aggregation module and a phenotype intensify module that capture the correspondence between image and depth features and promote their effective fusion. In our experiments, TPDNet significantly outperforms the compared methods, which supports the effectiveness of the proposed design.
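To make the idea of weighting features across the channel and spatial dimensions more concrete, here is a minimal NumPy sketch. It is not the TPDNet implementation: the function name `channel_spatial_reweight` is hypothetical, and the weights come from plain average pooling followed by a sigmoid rather than from learned layers, which is an assumption made only to keep the example self-contained.

```python
import numpy as np


def sigmoid(x):
    """Squash values into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_spatial_reweight(feat):
    """Reweight a (C, H, W) feature map along channels, then along pixels.

    Hypothetical, parameter-free stand-in for a learned attention block.
    """
    # Channel weights: one scalar per channel from global average pooling.
    channel_w = sigmoid(feat.mean(axis=(1, 2)))      # shape (C,)
    feat = feat * channel_w[:, None, None]

    # Spatial weights: one scalar per pixel from the mean over channels.
    spatial_w = sigmoid(feat.mean(axis=0))           # shape (H, W)
    return feat * spatial_w[None, :, :]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    depth_feat = rng.normal(size=(4, 8, 8))          # toy depth feature map
    out = channel_spatial_reweight(depth_feat)
    print(out.shape)
```

In a trained network the two weight maps would typically be produced by small learned layers; the sketch only shows how the two kinds of weights broadcast over a feature map.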
Usage
The corresponding trained weights for TPDNet are linked in the table below:
| Model Structure | Parameters |
|---|---|
| TPDNet | 484 KB |
| Pre-trained Model | 208 MB |
Datasets
Wax Gourd 3D Object Detection Dataset
Dependencies
- Linux with Python >= 3.7
- CUDA >= 11.2
- PyTorch
- Torchvision
- fire
- matplotlib
- numpy
- scipy
- pillow
- pandas
- scikit-image
- opencv-python
- numba
- easydict
- tensorflow
- cython
- tqdm
- pyquaternion
- NOTICE: Different versions of the PyTorch package have different memory usage.
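Before training, it can help to confirm that the packages listed above are importable. The following is a small sketch of such a check, not part of the repository: the helper names `find_missing` and `python_ok` are hypothetical, and the mapping uses the usual import names for each package (for example `PIL` for pillow, `cv2` for opencv-python, `skimage` for scikit-image). It does not verify the CUDA version.

```python
import importlib.util
import sys

# Import names for the packages listed above; several differ from the pip name.
REQUIRED_MODULES = {
    "PyTorch": "torch",
    "Torchvision": "torchvision",
    "fire": "fire",
    "matplotlib": "matplotlib",
    "numpy": "numpy",
    "scipy": "scipy",
    "pillow": "PIL",
    "pandas": "pandas",
    "scikit-image": "skimage",
    "opencv-python": "cv2",
    "numba": "numba",
    "easydict": "easydict",
    "tensorflow": "tensorflow",
    "cython": "Cython",
    "tqdm": "tqdm",
    "pyquaternion": "pyquaternion",
}


def find_missing(modules):
    """Return the package names whose import module cannot be located.

    find_spec looks up a top-level module without importing it.
    """
    return [pkg for pkg, mod in modules.items()
            if importlib.util.find_spec(mod) is None]


def python_ok(minimum=(3, 7)):
    """Check that the running interpreter meets the minimum version."""
    return sys.version_info >= minimum


if __name__ == "__main__":
    print("Python >= 3.7:", python_ok())
    missing = find_missing(REQUIRED_MODULES)
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

Any package reported as missing can then be installed with the package manager used for the rest of the environment.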
Experiments
3D Bounding Box Visualization
Load CSNet
Training:
./launcher/train.sh config/config.py
Testing:
./launcher/eval.sh config/config.py