Possum Tracker
Deep Dive

The Details

End-to-end computer vision pipeline for real-time possum detection
Computer Vision
Deep Learning
CNN
Transfer Learning
PyTorch
OpenCV
Real-Time Detection
Possum detection system visualization

Problem & Motivation

Possums regularly visit the backyard at night, naturally triggering the curiosity and hunting instincts of our dog, Beau. To prevent potential attacks and injuries to both wildlife and pets, we decided to design a smart mechanism for the dog door that automatically closes when a possum is detected in the backyard, keeping the dog safely inside.

Detecting possums in this environment is challenging: naive motion detectors produce many false positives caused by insects, wind-driven vegetation, mice, and infrared camera noise. The main challenge is therefore to reliably identify possums in low-light conditions while minimizing false alarms.


Key Challenges

Night Footage Noise

Limited visibility and IR artifacts make possum detection harder in low-light conditions.

False Motion Triggers

Insects, shadows, rain, vegetation, mice, and wind generate many non-possum ROIs.

Imbalanced Data

Possum appearances are rare compared with non-possum motion events, which skews the training data and makes CNN training harder.

Frame Similarity

Consecutive frames from the same night are nearly identical, leading to redundant ROIs.

Real-Time Processing

A very large number of ROIs must be classified by the CNN on a continuous camera feed.

Static Possums

Possums may remain still for long periods, making motion-based detection unreliable.

ROI Quality Variability

Crops can be partial, occluded, or poorly illuminated, complicating CNN classification.

Manual Labeling

Creating representative non-possum and possum datasets is time-consuming and labor-intensive.


Data Collection & Preparation

Data was collected directly from night camera recordings. Motion detection was used to automatically generate Regions of Interest (ROIs), which were then manually reviewed, sorted, and labeled.

Session-Based Splitting

ROIs from the same night session were kept together in either train or test sets to prevent temporal data leakage.
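A minimal sketch of a session-level split, assuming each ROI is stored as a `(path, session_id)` pair. The function name, the `test_fraction` default, and the path layout in the usage note are hypothetical.

```python
import random
from collections import defaultdict


def split_by_session(samples, test_fraction=0.2, seed=0):
    """Split (path, session_id) pairs so that no session spans both sets."""
    by_session = defaultdict(list)
    for path, session_id in samples:
        by_session[session_id].append(path)

    # Shuffle whole sessions, not individual ROIs, so near-identical
    # frames from one night never land on both sides of the split.
    sessions = sorted(by_session)
    random.Random(seed).shuffle(sessions)

    n_test = max(1, round(len(sessions) * test_fraction))
    test_sessions = set(sessions[:n_test])

    train = [p for s in sessions if s not in test_sessions for p in by_session[s]]
    test = [p for s in sessions if s in test_sessions for p in by_session[s]]
    return train, test, test_sessions
```

For example, `split_by_session([("night1/roi0.png", "night1"), ("night2/roi0.png", "night2")])` places each night entirely in one set. Because the split is made over sessions, the resulting ROI counts only approximate `test_fraction`.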

Padding-Based Resizing

ROIs were resized to 224x224 using padding to preserve object proportions rather than distorting features.

Blurry Image Inclusion

Motion-blurred possum images were kept in training to reflect realistic night conditions.

Example ROIs

Good Possum Images: 4 example ROIs
Blurry / Motion-Affected Possum Images: 4 example ROIs
Non-Possum Motion (Mice, Insects): 2 example ROIs

Model Architecture

A convolutional neural network (CNN) was trained using transfer learning to distinguish possum vs non-possum ROIs. The pretrained ResNet18 backbone has all layers frozen except the final classification head, which is trained on the custom possum dataset.

The pipeline combines classical motion detection with CNN classification. Training runs on batches of small crops for efficiency, while inference classifies each ROI individually in real time.

Key Details

Pretrained ResNet18 backbone with frozen layers

Custom binary classification head (possum / not possum)

Input: motion-based ROIs padded to 224x224

Batch-based training, per-ROI real-time inference

Temporal sliding window: 3/5 frames to confirm detection


Model Training & Performance

The model was trained for 5 epochs using transfer learning with ResNet18. The best test accuracy across epochs was 99.61%, while the confusion matrix and metrics summary report 98.79% accuracy on the test set. Both results indicate strong generalization with minimal overfitting.

Training Progress
[Plots: Training & Test Loss; Training & Test Accuracy]

Confusion Matrix

                     Predicted Not Possum    Predicted Possum
Actual Not Possum    4,404 (49.9%)           77 (0.9%)
Actual Possum        30 (0.3%)               4,312 (48.9%)

Metrics Summary

Accuracy: 98.79%
Precision: 98.25%
Recall: 99.31%
F1 Score: 98.77%

Key Insights

Best Test Accuracy: 99.61%
Final Test Loss: 0.0146
False Positives: 77
False Negatives: 30
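The summary metrics follow directly from the four confusion matrix counts. The short check below recomputes them using the standard definitions; the function name `binary_metrics` is illustrative.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard binary classification metrics from confusion matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)   # share of possum predictions that were correct
    recall = tp / (tp + fn)      # share of actual possums that were found
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Counts taken from the confusion matrix in this section.
metrics = binary_metrics(tp=4312, fp=77, fn=30, tn=4404)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
# accuracy: 98.79%
# precision: 98.25%
# recall: 99.31%
# f1: 98.77%
```

For a dog door, the 30 false negatives are the costlier error, since each one is a possum ROI classified as harmless.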

Detection Logic

Temporal Consistency Rule

A possum is considered detected only if it appears in at least 3 out of the last 5 processed frames. This sliding window approach:

Reduces single-frame misclassifications

Stabilizes predictions in noisy night conditions

Keeps a detection stable when a possum moves slowly or pauses briefly
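The 3-of-5 rule can be expressed as a small sliding window over per-frame results. This is a minimal sketch, assuming each processed frame is reduced to a single boolean (at least one ROI classified as possum); the class name `TemporalConfirm` is hypothetical.

```python
from collections import deque


class TemporalConfirm:
    """Confirm a detection when enough recent frames were positive."""

    def __init__(self, window: int = 5, required: int = 3):
        self.required = required
        # deque with maxlen drops the oldest frame automatically.
        self.history = deque(maxlen=window)

    def update(self, frame_has_possum: bool) -> bool:
        """Record one processed frame and return the confirmed state."""
        self.history.append(bool(frame_has_possum))
        return sum(self.history) >= self.required


confirm = TemporalConfirm()
states = [confirm.update(hit) for hit in [True, False, True, False, True, False]]
print(states)  # [False, False, False, False, True, False]
```

A single false positive never triggers the door on its own, and one missed frame does not clear a detection that already has three positives in the window.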


Real-Time Detection Snapshots

Real-time examples captured during live camera inference. The system reports a possum detection only after satisfying the temporal consistency rule.

Detection 1: terminal log, full frame, ROI passed to the CNN
Detection 2: terminal log, full frame, ROI passed to the CNN

Limitations

01

Performance degrades in heavy rain, strong wind, or extreme infrared camera noise.

02

Completely static possums may be missed, as detection relies on movement.

03

Possums appear rarely, limiting the diversity of training examples.

04

Low-resolution cameras, poor angles, or distance reduce ROI quality.

05

The model may misclassify unseen objects as possums due to incomplete non-possum coverage.


Future Work

01

Possum Analytics

Log detections with timestamps, ROI crops, and bounding boxes to enable graphs and analysis of possum activity over time.

02

Smart Home Integration

Connect detection to devices such as automatic feeding boxes and dog door lock mechanisms.

03

Model Experimentation

Train a standard CNN from scratch without transfer learning to compare performance with the current ResNet18-based model.

04

Analytics Dashboard

Build a web dashboard to track possum visits, visualize patterns, and display historical detection data.