This repository was archived by the owner on Jun 5, 2024. It is now read-only.

Commit dd61e76

Tested COCO model. Start linting code.
1 parent 8af1178 commit dd61e76

22 files changed: +5182 −4874 lines

README.md (+47 −7)
@@ -1,6 +1,6 @@
 # CMU Object Detection & Tracking for Surveillance Video Activity Detection
 
-This repository contains the code and models for object detection and tracking from the CMU [DIVA](https://www.iarpa.gov/index.php/research-programs/diva) system. Our system (INF & MUDSML) achieves the **best performance** on the ActEv [leaderboard](https://actev.nist.gov/prizechallenge#tab_leaderboard) ([Cached](https://www.cs.cmu.edu/~junweil/resources/actev-prizechallenge-06-2019.png)).
+This repository contains the code and models for object detection and tracking from the CMU [DIVA](https://www.iarpa.gov/index.php/research-programs/diva) system. Our system (INF & MUDSML) achieves the **best performance** on the ActEv [leaderboard](https://actev.nist.gov/prizechallenge#tab_leaderboard) ([Cached](https://www.cs.cmu.edu/~junweil/resources/actev-prizechallenge-06-2019.png)).
 
 If you find this code useful in your research then please cite
@@ -17,7 +17,7 @@ If you find this code useful in your research then please cite
 
 
 ## Introduction
-We utilize state-of-the-art object detection and tracking algorithms on surveillance videos. Our best object detection model uses Faster RCNN with a Resnet-101 backbone, dilated CNN, and FPN. The tracking algorithm (Deep SORT) uses ROI features from the object detection model.
+We utilize state-of-the-art object detection and tracking algorithms on surveillance videos. Our best object detection model uses Faster RCNN with a Resnet-101 backbone, dilated CNN, and FPN. The tracking algorithm (Deep SORT) uses ROI features from the object detection model. The ActEV-trained models are good for small-object detection in outdoor scenes; for indoor cameras, the COCO-trained models are better.
 
 
 <div align="center">
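As an aside on the tracking step just described: Deep SORT associates detections across frames using appearance (ROI) features together with box overlap and motion gating. A minimal, hypothetical sketch of just the geometric overlap part (plain IoU, my own helper, not code from this repo) could look like:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    # Coordinates of the intersection rectangle.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

Deep SORT additionally gates matches with a Kalman-filter motion model and a cosine distance over appearance embeddings; this sketch covers only the overlap term.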
@@ -76,6 +76,14 @@ $ python ../../vis_json.py Person.lst ../../v1-val_testvideos_frames/ Person_jso
 $ python ../../vis_json.py Vehicle.lst ../../v1-val_testvideos_frames/ Vehicle_json/ Vehicle_vis
 $ ffmpeg -framerate 30 -i Vehicle_vis/VIRAT_S_000205_05_001092_001124/VIRAT_S_000205_05_001092_001124_F_%08d.jpg Vehicle_vis_video.mp4
 $ ffmpeg -framerate 30 -i Person_vis/VIRAT_S_000205_05_001092_001124/VIRAT_S_000205_05_001092_001124_F_%08d.jpg Person_vis_video.mp4
+
+# or you could put the "Person"/"Vehicle" visualizations into the same video
+$ ls $PWD/v1-val_testvideos/* > v1-val_testvideos.abs.lst
+$ python get_frames_resize.py v1-val_testvideos.abs.lst v1-val_testvideos_frames/ --use_2level
+$ python tracks_to_json.py test_track_out/ v1-val_testvideos.abs.lst test_track_out_json
+$ python vis_json.py v1-val_testvideos.abs.lst v1-val_testvideos_frames/ test_track_out_json/ test_track_out_vis
+# then use ffmpeg to make the videos
+
 ```
 Now you have the tracking visualization videos for both the "Person" and "Vehicle" classes.
 
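The ffmpeg commands above rely on frames being laid out as `<video_name>/<video_name>_F_<8-digit index>.jpg`. A small hypothetical helper (mine, not a function from this repo) that reproduces that naming, e.g. for locating a single extracted frame:

```python
import os

def frame_path(frames_dir, video_name, frame_idx):
    # Mirrors the VIRAT_..._F_%08d.jpg layout consumed by the ffmpeg commands above:
    # one directory per video, zero-padded 8-digit frame index in the filename.
    return os.path.join(frames_dir, video_name, "%s_F_%08d.jpg" % (video_name, frame_idx))
```

For example, `frame_path("Person_vis", "VIRAT_S_000205_05_001092_001124", 1)` yields a path ending in `VIRAT_S_000205_05_001092_001124_F_00000001.jpg`.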

@@ -245,6 +253,38 @@ These are the models you can use for inferencing. The original ActEv annotations
 </table>
 
 
+<table>
+<tr>
+<td colspan="6">
+<a href="https://aladdin-eax.inf.cs.cmu.edu/shares/diva_obj_detect_models/models/obj_coco_tfv1.14.pb">Object COCO</a>
+: COCO-trained Resnet-101 FPN model. Better for indoor scenes.</td>
+</tr>
+<tr>
+<td>Eval on v1-val</td>
+<td>Person</td>
+<td>Bike</td>
+<td>Push_Pulled_Object</td>
+<td>Vehicle</td>
+<td>Mean</td>
+</tr>
+<tr>
+<td>AP</td>
+<td>0.378</td>
+<td>0.398</td>
+<td>N/A</td>
+<td>0.947</td>
+<td>N/A</td>
+</tr>
+<tr>
+<td>AR</td>
+<td>0.585</td>
+<td>0.572</td>
+<td>N/A</td>
+<td>0.965</td>
+<td>N/A</td>
+</tr>
+</table>
+
 Activity Box Experiments:
 <table>
 <tr>
@@ -262,7 +302,7 @@ Activity Box Experiments:
 <td>activity_carrying</td>
 </tr>
 <tr>
-<td>AP</td>
+<td>AP</td>
 <td>0.232</td>
 <td>0.38</td>
 <td>0.468</td>
@@ -288,8 +328,8 @@ Activity Box Experiments:
 <td>Vehicle-Turning</td>
 <td>activity_carrying</td>
 </tr>
-<tr>
-<td>AP</td>
+<tr>
+<td>AP</td>
 <td>0.378</td>
 <td>0.582</td>
 <td>0.435</td>
@@ -298,8 +338,8 @@ Activity Box Experiments:
 <td>0.403</td>
 <td>0.425</td>
 </tr>
-<tr>
-<td>AR</td>
+<tr>
+<td>AR</td>
 <td>0.780</td>
 <td>0.973</td>
 <td>0.942</td>
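In the COCO-model table added above, the Mean column is N/A because Push_Pulled_Object has no COCO category. If one instead wanted a mean over only the evaluated classes (my own convention for illustration, not how the README computes its Mean column), a sketch:

```python
def mean_over_available(scores):
    # Average the numeric entries, skipping "N/A" placeholders.
    vals = [s for s in scores if s != "N/A"]
    return sum(vals) / len(vals) if vals else None

# AP row of the COCO-model table: Person, Bike, Push_Pulled_Object, Vehicle
print(mean_over_available([0.378, 0.398, "N/A", 0.947]))  # about 0.574
```

Applied to the AR row (0.585, 0.572, N/A, 0.965) this would give about 0.707.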

application_util/__init__.pyc (-120 Bytes; binary file not shown)
application_util/image_viewer.pyc (-12.1 KB; binary file not shown)
application_util/preprocessing.pyc (-2.07 KB; binary file not shown)
application_util/visualization.pyc (-5.98 KB; binary file not shown)

0 commit comments