End-to-End People Detection in Crowded Scenes

2016, pp. 2325–2333

Russell J. Stewart, Mykhaylo Andriluka, Andrew Y. Ng

Abstract

Current people detectors operate either by scanning an image in a sliding-window fashion or by classifying a discrete set of proposals. We propose a model that is based on decoding an image into a set of people detections. Our system takes an image as input and directly outputs a set of distinct detection hypotheses. Because we generate predictions jointly, common post-processing steps such as non-maximum suppression are unnecessary. We use a recurrent LSTM layer for sequence generation and train our model end-to-end with a new loss function that operates on sets of detections. We demonstrate the effectiveness of our approach on the challenging task of detecting people in crowded scenes.
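
To make the idea of a loss that operates on sets of detections more concrete, the following is a minimal sketch, not the loss defined in the paper. It assumes that each prediction is a box (x, y, w, h) with a confidence, that predictions are matched one-to-one to ground-truth boxes by minimizing total L1 distance, and that matched predictions are pushed toward confidence 1 and unmatched ones toward 0. The names `set_loss`, `l1`, and `alpha` are hypothetical, and the brute-force search over assignments is only practical for small sets.

```python
import itertools
import math


def l1(a, b):
    """L1 distance between two boxes given as (x, y, w, h) tuples."""
    return sum(abs(x - y) for x, y in zip(a, b))


def set_loss(pred_boxes, pred_conf, gt_boxes, alpha=1.0):
    """Illustrative set-based detection loss (an assumption, not the paper's).

    pred_boxes: list of predicted (x, y, w, h) boxes
    pred_conf:  list of confidences in (0, 1), one per prediction
    gt_boxes:   list of ground-truth (x, y, w, h) boxes
    Returns (loss, assignment), where assignment[g] is the index of the
    prediction matched to ground-truth box g.
    """
    n_pred, n_gt = len(pred_boxes), len(gt_boxes)
    if n_gt > n_pred:
        raise ValueError("need at least as many predictions as ground-truth boxes")

    # Brute-force one-to-one matching: try every way of assigning a
    # distinct prediction to each ground-truth box and keep the cheapest.
    best_cost, best_assign = math.inf, ()
    for assign in itertools.permutations(range(n_pred), n_gt):
        cost = sum(l1(pred_boxes[p], gt_boxes[g]) for g, p in enumerate(assign))
        if cost < best_cost:
            best_cost, best_assign = cost, assign

    # Binary cross-entropy on confidences: matched predictions have
    # target 1, all remaining predictions have target 0.
    matched = set(best_assign)
    eps = 1e-12  # avoids log(0)
    conf_loss = 0.0
    for i, c in enumerate(pred_conf):
        target = 1.0 if i in matched else 0.0
        conf_loss -= target * math.log(c + eps) + (1.0 - target) * math.log(1.0 - c + eps)

    return alpha * best_cost + conf_loss, best_assign
```

Because the loss is computed after matching, it does not depend on the order in which the predictions are emitted, and a second prediction that duplicates an already matched person is penalized through its confidence term. This is one way a jointly trained model can avoid a separate suppression step.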
