4Paradigm Twice Breaks PASCAL VOC Object Challenge World Record


In the recent Competition 4 (general object detection) track of the top international PASCAL VOC 2012 Challenge, 4Paradigm entered both multi-model and single-model solutions, breaking the task's test record twice in two days and taking the top two places in the overall rankings. 4Paradigm achieved the best results in 12 of the 20 object-category detection tasks.

4Paradigm’s top-ranked solution introduced multi-level deep transfer learning and a multi-model ensemble to improve recognition accuracy and robustness. Its second-ranked single-model solution, an adaptive candidate-box extraction method, is efficient, fast, and better suited to real-world deployment.

4Paradigm Takes the Top Two Places with Two Different Solutions

The PASCAL VOC Challenge is known for its high-quality data, complex scenes, diverse targets, and high detection difficulty. It draws AI companies, universities, and research institutes from around the world into fierce competition over algorithm effectiveness. The PASCAL VOC data set now covers 20 categories, including people, animals, vehicles, and indoor objects. Among the PASCAL VOC events, the 2012 Challenge has the largest amount of data and the widest coverage of real, complex scenes, making it a yardstick for measuring technical strength.

In image detection, because object sizes vary enormously, competitors often run multi-scale tests (typically 4–6 scales): the image is enlarged to detect small objects and shrunk to detect large ones. Although multi-scale testing is very effective at improving accuracy, it ties up large amounts of computing resources and delays feedback, which seriously limits its practical application.
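The multi-scale procedure described above can be sketched roughly as follows. This is a minimal illustration, not 4Paradigm's actual pipeline: `detect` is a placeholder for any single-scale detector, and a real pipeline would resize the image pixels and merge overlapping boxes with non-maximum suppression rather than a simple union.

```python
from typing import Callable, List, Tuple

# A box is (x1, y1, x2, y2) in pixel coordinates.
Box = Tuple[float, float, float, float]

def multi_scale_detect(
    image_size: Tuple[int, int],
    detect: Callable[[Tuple[int, int]], List[Box]],
    scales: Tuple[float, ...] = (0.5, 1.0, 2.0),
) -> List[Box]:
    """Run a single-scale detector at several image scales and map all
    detections back to the original image's coordinate frame."""
    w, h = image_size
    merged: List[Box] = []
    for s in scales:
        # Resize the image (here only its size; a real pipeline resizes pixels).
        resized = (round(w * s), round(h * s))
        for (x1, y1, x2, y2) in detect(resized):
            # Map boxes found at this scale back to original resolution.
            merged.append((x1 / s, y1 / s, x2 / s, y2 / s))
    return merged
```

Each scale multiplies the detector's cost, which is the resource drain the article describes: three scales mean roughly three full inference passes per image.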

Following the simple, easy-to-use design principles of 4Paradigm’s AutoCV, the team designed an ‘Adaptive Candidate-Box Extraction Method’ that handles the large size gaps between objects in an image. With only a single-scale image input, it matches or exceeds the accuracy of multi-scale testing, saving resources while guaranteeing real-time object detection.