RetinaNet

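Overview

One-stage detectors have long suffered from a chronic problem known as class imbalance.

Class imbalance
- The image is divided into a grid, and a bounding box is predicted for every cell
- Positive samples (object areas) are far fewer than negative samples (background areas)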
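To get a feel for the scale of the imbalance described above, here is a minimal sketch that counts positive and negative anchors on a single grid. All numbers (grid size, anchors per cell, object count, matched anchors per object) are hypothetical values chosen for illustration, not figures from the paper.

```python
# Hypothetical values, chosen only to illustrate the imbalance.
grid_size = 50             # assumed: 50 x 50 cells on one feature map
anchors_per_cell = 9       # assumed: anchor boxes evaluated per cell
num_objects = 5            # assumed: objects present in the image
positives_per_object = 10  # assumed: anchors matched to each object

total_anchors = grid_size * grid_size * anchors_per_cell
positives = num_objects * positives_per_object
negatives = total_anchors - positives

# Number of background anchors for every object anchor.
ratio = negatives / positives

print(total_anchors)         # 22500
print(positives, negatives)  # 50 22450
print(ratio)                 # 449.0
```

Under these assumptions, background anchors outnumber object anchors by several hundred to one, so an unweighted loss summed over all anchors is dominated by the background term.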


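RetinaNet is a one-stage detector. To address class imbalance, it introduces Focal Loss, which improves learning on hard examples that the model predicts with low probability. It also uses a ResNet-based FPN (Feature Pyramid Network) as its backbone, and with this combination it surpasses the accuracy of the two-stage detector Faster R-CNN.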

Key Concepts

Why It Is a One-Stage Detector

Looking at the class+box subnets, the network that predicts classes and the network that predicts boxes are separate.

This structure might look like a two-stage detector, but on closer inspection the class subnet does not use the output of the box subnet at all. It predicts a class for each anchor independently.

RetinaNet is therefore a one-stage detector.
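The independence of the two subnets can be sketched in plain Python. This is a toy illustration, not the real architecture: the subnets are stand-in random linear maps (the actual subnets are small convolutional stacks), and `make_linear`, `detect`, `NUM_CLASSES`, `NUM_ANCHORS`, and `FEATURE_DIM` are hypothetical names and sizes introduced here. What matters is that both heads read the same feature vector and neither reads the other's output.

```python
import random

NUM_CLASSES = 3   # assumed number of classes (K)
NUM_ANCHORS = 2   # assumed anchors per location (A)
FEATURE_DIM = 4   # assumed feature size at one location


def make_linear(in_dim, out_dim, seed):
    """Return a toy linear map with fixed random weights (stand-in for a subnet)."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(out_dim)]

    def apply(feature):
        return [sum(w * x for w, x in zip(row, feature)) for row in weights]

    return apply


class_subnet = make_linear(FEATURE_DIM, NUM_ANCHORS * NUM_CLASSES, seed=0)
box_subnet = make_linear(FEATURE_DIM, NUM_ANCHORS * 4, seed=1)


def detect(feature):
    # Both heads take only the shared feature as input, in a single pass.
    class_scores = class_subnet(feature)  # K scores per anchor
    box_offsets = box_subnet(feature)     # 4 box offsets per anchor
    return class_scores, box_offsets


feature = [0.5, -1.0, 0.25, 2.0]
scores, boxes = detect(feature)
print(len(scores), len(boxes))  # 6 8
```

A two-stage detector would instead feed region proposals from a first stage into a second-stage classifier; here there is no such dependency between the heads.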


Focal Loss


Focal loss is a loss function designed to address the extreme class imbalance that one-stage object detectors face.

$L_{\text{focal}}(p_t) = -\alpha_t (1 - p_t)^{\gamma} \log(p_t)$

The authors report that alpha = 0.25 and gamma = 2 gave the best results in their experiments.
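The formula can be tried out with a minimal sketch for a single binary prediction. It assumes the paper's convention that p_t equals p for a positive label and 1 - p otherwise, with alpha_t defined the same way from alpha. The function names `focal_loss` and `cross_entropy` are illustrative, not from any library.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Focal loss for one binary prediction.

    p: predicted probability of the positive class, strictly between 0 and 1.
    y: ground-truth label, 1 (object) or 0 (background).
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def cross_entropy(p, y):
    """Plain cross-entropy, shown for comparison."""
    p_t = p if y == 1 else 1.0 - p
    return -math.log(p_t)


# An easy negative: background predicted with p = 0.1, so p_t = 0.9.
easy_negative = focal_loss(0.1, 0)
# A hard positive: object predicted with p = 0.1, so p_t = 0.1.
hard_positive = focal_loss(0.1, 1)

print(easy_negative < hard_positive)  # True
```

The factor (1 - p_t)^gamma shrinks the loss of well-classified examples, so the many easy background anchors contribute little and the few hard examples drive training. With gamma = 0 the modulating factor is 1 and the expression reduces to alpha-weighted cross-entropy.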

References

Focal Loss for Dense Object Detection. https://arxiv.org/abs/1708.02002

Week 10

https://deep-learning-study.tistory.com/504

https://csm-kr.tistory.com/5

https://talktato.tistory.com/13
