
Reference

특징 추출을 이용한 이미지 매칭 기술의 최근 동향 분석 (Analysis of Recent Trends in Image Matching Techniques Using Feature Extraction)

BRIEF: Binary Robust Independent Elementary Features

ORB: An efficient alternative to SIFT or SURF

LIFT: Learned Invariant Feature Transform

TILDE: A Temporally Invariant Learned Detector

SuperPoint: Self-Supervised Interest Point Detection and Description

SuperGlue: Learning Feature Matching with Graph Neural Networks

LoFTR: Detector-free local feature matching with transformers

Rule-based keypoint matching

After keypoints are detected, a descriptor is generated to store or represent each detected keypoint as a vector,

and keypoint matching is then performed using these descriptors.

SIFT-based image matching methods (SIFT, SURF) compute the descriptors, then calculate the Euclidean distance between them and match each keypoint to the one whose descriptor is closest.
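The matching rule described above can be sketched as a brute-force nearest-neighbour search. This is a minimal illustration under assumptions: the function names are hypothetical, the descriptors are toy 4-dimensional vectors rather than real SIFT output, and no outlier filtering is applied.

```python
import math

def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_nearest(desc_a, desc_b):
    """For each descriptor in desc_a, return (index_a, index_b, distance)
    of its nearest neighbour in desc_b, found by brute force."""
    matches = []
    for i, da in enumerate(desc_a):
        j, dist = min(
            ((j, euclidean(da, db)) for j, db in enumerate(desc_b)),
            key=lambda pair: pair[1],
        )
        matches.append((i, j, dist))
    return matches

# Toy 4-D descriptors (assumption: real SIFT descriptors are 128-D).
desc_a = [[0.0, 1.0, 0.0, 0.0], [1.0, 0.0, 0.0, 1.0]]
desc_b = [[1.0, 0.0, 0.0, 0.9], [0.1, 1.0, 0.0, 0.0], [0.5, 0.5, 0.5, 0.5]]
matches = match_nearest(desc_a, desc_b)
```

The cost grows with the number of descriptor pairs times the descriptor length, which is why the descriptor dimension matters for speed.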

The SIFT descriptor divides the region around a keypoint into 16 sub-regions and aggregates the gradients in each one (16 sub-regions × 8 orientation bins), so the extracted descriptor is a 128-dimensional vector, which makes it computationally expensive.
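To make the 16 × 8 = 128 layout concrete, here is a simplified sketch that builds a SIFT-like descriptor from a 16×16 patch of gradients. It is an assumption-laden toy: the function name is hypothetical, and the Gaussian weighting, interpolation between bins, rotation to the dominant orientation, and clipping used by real SIFT are all left out.

```python
import math

def sift_like_descriptor(gx, gy):
    """gx, gy: 16x16 lists of x/y image gradients around a keypoint.
    Returns a 128-D list: 4x4 cells, each with an 8-bin orientation histogram."""
    assert len(gx) == 16 and len(gy) == 16
    desc = [0.0] * (4 * 4 * 8)
    for r in range(16):
        for c in range(16):
            mag = math.hypot(gx[r][c], gy[r][c])
            ang = math.atan2(gy[r][c], gx[r][c]) % (2 * math.pi)
            bin_idx = int(ang / (2 * math.pi) * 8) % 8   # one of 8 orientations
            cell = (r // 4) * 4 + (c // 4)               # one of 16 sub-regions
            desc[cell * 8 + bin_idx] += mag              # magnitude-weighted vote
    norm = math.sqrt(sum(v * v for v in desc)) or 1.0
    return [v / norm for v in desc]                      # L2-normalise

# Toy patch: constant gradient pointing in +x.
gx = [[1.0] * 16 for _ in range(16)]
gy = [[0.0] * 16 for _ in range(16)]
desc = sift_like_descriptor(gx, gy)
```

With a constant gradient, only one orientation bin per sub-region receives votes, so 16 of the 128 entries are non-zero.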

To address this, the Binary Robust Independent Elementary Features (BRIEF) algorithm was devised.

BRIEF describes an image patch with a binary feature vector, greatly reducing the number of dimensions that must be computed, and so runs faster than SIFT.
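A rough sketch of the idea behind a binary descriptor: each bit is the result of comparing the intensities at two pixel positions, and two descriptors are compared with the Hamming distance. The names, the 8×8 patch, and the uniformly random sampling of pixel pairs are assumptions for illustration; the BRIEF paper also smooths the patch first, which is skipped here.

```python
import random

def make_test_pairs(n_bits, patch_size, seed=0):
    """Sample n_bits random pixel pairs inside a patch_size x patch_size patch."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_bits):
        p = (rng.randrange(patch_size), rng.randrange(patch_size))
        q = (rng.randrange(patch_size), rng.randrange(patch_size))
        pairs.append((p, q))
    return pairs

def brief_descriptor(patch, pairs):
    """Pack one bit per pair: 1 if the intensity at p is lower than at q."""
    bits = 0
    for i, ((r1, c1), (r2, c2)) in enumerate(pairs):
        if patch[r1][c1] < patch[r2][c2]:
            bits |= 1 << i
    return bits

def hamming(a, b):
    """Number of differing bits between two packed descriptors."""
    return bin(a ^ b).count("1")

PAIRS = make_test_pairs(n_bits=256, patch_size=8)
patch = [[r * 8 + c for c in range(8)] for r in range(8)]
brighter = [[v + 10 for v in row] for row in patch]   # uniform brightness shift
inverted = [[63 - v for v in row] for row in patch]   # contrast inversion

d_patch = brief_descriptor(patch, PAIRS)
d_bright = brief_descriptor(brighter, PAIRS)
d_inv = brief_descriptor(inverted, PAIRS)
```

A uniform brightness shift leaves every comparison unchanged, so `hamming(d_patch, d_bright)` is 0. The distance itself is an XOR followed by a bit count, which is much cheaper than a 128-dimensional Euclidean distance.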

Oriented FAST and Rotated BRIEF (ORB) uses the FAST algorithm for keypoint extraction and BRIEF descriptors for keypoint matching, and showed better running time than SIFT.
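The post does not go into how the "Oriented" part works. As background from the ORB paper, a keypoint's orientation is taken from the intensity centroid of the patch around it, and the BRIEF sampling pattern is rotated by that angle. The sketch below is a simplification under assumptions: the function name is hypothetical and it uses a square patch, whereas ORB uses a circular one.

```python
import math

def intensity_centroid_angle(patch):
    """patch: square list-of-lists of intensities with odd side length.
    Returns the angle (radians) from the patch centre to the intensity centroid,
    with x pointing right and y pointing down."""
    n = len(patch)
    half = n // 2
    m10 = m01 = 0.0
    for r in range(n):
        for c in range(n):
            x, y = c - half, r - half
            m10 += x * patch[r][c]   # first-order moment in x
            m01 += y * patch[r][c]   # first-order moment in y
    return math.atan2(m01, m10)

# Toy patches: bright on the right side, and bright on the bottom side.
bright_right = [[1 if c > 2 else 0 for c in range(5)] for r in range(5)]
bright_bottom = [[1 if r > 2 else 0 for c in range(5)] for r in range(5)]
angle_right = intensity_centroid_angle(bright_right)
angle_bottom = intensity_centroid_angle(bright_bottom)
```

Here `angle_right` is 0 and `angle_bottom` is π/2, since the centroid lies to the right of and below the centre respectively.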

BRIEF is simple and fast to compute, but its ability to express the orientation and strength of a keypoint is more limited than SIFT's.

In addition, this kind of rule-based keypoint matching ultimately goes no further than comparing the descriptors of keypoints.

As a result, it cannot capture the overall image context that the descriptors themselves fail to express.

This is what led to deep-learning-based keypoint extraction, introduced so that the overall context of the image can be captured.

As deep learning techniques advanced, a number of trainable feature extraction models appeared:

  • LIFT: Learned Invariant Feature Transform
  • TILDE: A Temporally Invariant Learned Detector

Among them, Learned Invariant Feature Transform (LIFT) is trainable with a Convolutional Neural Network (CNN) and proposed keypoint extraction and descriptor generation similar to SIFT.

SuperPoint is a feature extraction model that appeared after LIFT. (Keep keypoint extraction models and keypoint matching models separate in your mind.)

It takes an image as input and, like LIFT, performs keypoint detection and descriptor generation at the same time with a CNN.

Unlike LIFT, however, SuperPoint places no restriction on image size; it is first trained on labeled data and then performs self-supervised training using unlabeled image pairs.

Earlier deep-learning feature extraction models needed training data with supervised ground-truth values, and SuperPoint overcame this by applying self-supervised training.

The figure below gives an overview of SuperPoint's self-supervised training.

image

For unlabeled data, SuperPoint assigns labels automatically by applying a base feature extraction model trained on images of basic shapes together with the Homographic Adaptation procedure.

The CNN is then trained on the generated labels.
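The labeling step can be sketched as: warp the image, run the base detector, warp the response back, and average over all warps. The toy below makes strong assumptions to stay dependency-free: the warps are limited to horizontal and vertical flips (simple special cases of a homography) instead of randomly sampled homographies, and `toy_detector` is a hypothetical stand-in for the base detector, not SuperPoint's network.

```python
def toy_detector(img):
    """Hypothetical stand-in for the base detector: responds only where the
    intensity changes towards both the right and the bottom neighbour."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h - 1):
        for c in range(w - 1):
            dx = abs(img[r][c + 1] - img[r][c])
            dy = abs(img[r + 1][c] - img[r][c])
            out[r][c] = min(dx, dy)
    return out

def identity(m):
    return [row[:] for row in m]

def hflip(m):
    return [row[::-1] for row in m]

def vflip(m):
    return m[::-1]

# (warp, inverse warp) pairs; each flip is its own inverse.
WARPS = [
    (identity, identity),
    (hflip, hflip),
    (vflip, vflip),
    (lambda m: hflip(vflip(m)), lambda m: vflip(hflip(m))),
]

def homographic_adaptation(img, detector, warps=WARPS):
    """Average the detector response over warped copies of the image,
    after mapping each response back into the original frame."""
    h, w = len(img), len(img[0])
    acc = [[0.0] * w for _ in range(h)]
    for warp, unwarp in warps:
        resp = unwarp(detector(warp(img)))
        for r in range(h):
            for c in range(w):
                acc[r][c] += resp[r][c] / len(warps)
    return acc

def pseudo_labels(heatmap, thresh):
    """Pixels whose aggregated response reaches the threshold."""
    return [(r, c) for r, row in enumerate(heatmap)
            for c, v in enumerate(row) if v >= thresh]

# Toy image: a bright 2x2 square on a dark 6x6 background.
img = [[1 if 2 <= r <= 3 and 2 <= c <= 3 else 0 for c in range(6)] for r in range(6)]
single_pass = pseudo_labels(toy_detector(img), 1.0)
aggregated = pseudo_labels(homographic_adaptation(img, toy_detector), 0.25)
```

On this toy image a single detector pass fires only at one corner of the square, while the aggregated response marks all four, which is the effect the procedure relies on to produce more complete labels.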

Keypoint matching with deep learning

Deep learning is being studied not only for extracting keypoints but also for matching them.

Earlier work could relate two images only through the descriptors of their keypoints, and deep learning was brought in to overcome this limitation.

In particular, research is under way on matching that considers the relationships between keypoints, not just each keypoint's descriptor.

SuperGlue, a follow-up to SuperPoint, uses Graph Neural Networks (GNN) to match keypoints while considering not only the relationships between keypoints across the two images but also the relationships between keypoints within the same image.

SuperGlue takes already-extracted keypoints and descriptors as input. With the keypoints as nodes, it builds two graphs, a self graph whose edges connect keypoints in the same image and a cross graph whose edges connect keypoints across the image pair, and applies the GNN to them.

This lets SuperGlue match images while reflecting both the context of a keypoint within its own image and its context relative to the keypoints of the other image.
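The alternation between self edges and cross edges can be sketched with plain dot-product attention. This is only an illustration under assumptions: there are no learned projections, no multiple heads, and no MLP update as in the actual network, and the function names and toy 2-D features are hypothetical.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attend(queries, sources):
    """Each query vector receives a softmax-weighted sum of the source vectors
    and adds it to itself (residual update)."""
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, s)) / math.sqrt(d) for s in sources]
        weights = softmax(scores)
        msg = [sum(w * s[k] for w, s in zip(weights, sources)) for k in range(d)]
        out.append([qi + mi for qi, mi in zip(q, msg)])
    return out

def self_cross_layers(feat_a, feat_b, n_layers=2):
    """Alternate message passing along self edges and cross edges."""
    for _ in range(n_layers):
        # Self edges: keypoints attend to keypoints of their own image.
        feat_a, feat_b = attend(feat_a, feat_a), attend(feat_b, feat_b)
        # Cross edges: keypoints attend to keypoints of the other image.
        feat_a, feat_b = attend(feat_a, feat_b), attend(feat_b, feat_a)
    return feat_a, feat_b

# Toy features: 3 keypoints in image A, 2 in image B, each a 2-D vector.
feat_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
feat_b = [[0.9, 0.1], [0.1, 0.9]]
out_a, out_b = self_cross_layers(feat_a, feat_b)
```

After these layers each keypoint's vector mixes in information from its own image and from the other image, so the final matching no longer depends on the raw descriptor alone.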

The figure below shows how SuperGlue works.

image

Because SuperGlue takes extracted keypoints and descriptors as input, it matches by considering the relationships between those keypoints, but it cannot take into account regions where no keypoint was detected or the overall context of the image.

In addition, when keypoints at different locations have the same descriptor, the positional information that would tell them apart is not taken into account.

Detector-Free Local Feature Matching with Transformers (LoFTR), proposed to solve this, does not match extracted keypoints the way existing methods such as SuperGlue or SIFT do.

Instead, it works on feature maps extracted from the images by a CNN and judges the relationships between them,

and then determines the keypoints through a coarse-to-fine module.

The figure below shows how LoFTR works.

image

LoFTR passes the extracted feature maps through self-attention layers, which capture relationships between features of the same image, and cross-attention layers, which capture relationships between features of the two images in the pair, and then performs image matching directly, without separately computed keypoint descriptors.

It also proposes a Coarse-to-Fine Module that estimates the precise keypoint positions from the matched feature maps.
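The coarse-to-fine idea can be sketched in two small steps. The details here come from the LoFTR paper rather than from this post, and are simplified: coarse matches are picked with a dual-softmax over a similarity matrix plus a mutual-nearest-neighbour check, and the fine position is the expectation of coordinates under a softmax of local scores. Function names, the threshold, and the toy scores are assumptions.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def dual_softmax(scores):
    """scores[i][j]: similarity between coarse cell i of image A and cell j of
    image B. Returns conf[i][j] = softmax over row i times softmax over column j."""
    rows = [softmax(r) for r in scores]
    cols = [softmax(list(c)) for c in zip(*scores)]   # cols[j][i]
    return [[rows[i][j] * cols[j][i] for j in range(len(scores[0]))]
            for i in range(len(scores))]

def mutual_nearest(conf, thresh):
    """Keep (i, j) only if each is the other's best match and conf exceeds thresh."""
    matches = []
    for i, row in enumerate(conf):
        j = max(range(len(row)), key=row.__getitem__)
        i_back = max(range(len(conf)), key=lambda k: conf[k][j])
        if i_back == i and row[j] > thresh:
            matches.append((i, j))
    return matches

def refine_by_expectation(local_scores, coords):
    """Fine step: sub-pixel offset as the expected coordinate under a softmax
    of the scores inside a small window around the coarse match."""
    w = softmax(local_scores)
    x = sum(wi * cx for wi, (cx, cy) in zip(w, coords))
    y = sum(wi * cy for wi, (cx, cy) in zip(w, coords))
    return x, y

# Toy similarity matrix: cells 0 and 1 match clearly, cell 2 is ambiguous.
scores = [[5.0, 0.0, 0.0], [0.0, 5.0, 0.0], [1.0, 1.0, 1.0]]
conf = dual_softmax(scores)
coarse_matches = mutual_nearest(conf, thresh=0.5)

# Toy fine window: three positions along x, peak in the middle.
offset = refine_by_expectation([0.0, 2.0, 0.0], [(-1, 0), (0, 0), (1, 0)])
```

Because every coarse cell of the feature map is a candidate, this kind of matching can return correspondences in regions where a detector would not have fired, and the ambiguous cell is dropped by the confidence threshold.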

Like SuperGlue, LoFTR uses self-attention and cross-attention layers to consider relationships within an image and relationships with the image being matched.

However, it improves performance by using a transformer rather than a GNN, and by working on feature maps instead of extracted keypoints it can take the positional information of features into account.

Because of LoFTR's strong performance, a variety of later image matching methods have been built on top of it.

Image matching results

image

์‹คํ—˜๊ฒฐ๊ณผ์—์„œ SIFT์™€ ORB๋Š” ์ด๋ฏธ์ง€์˜ ์ถ”์ถœ๋œ ํŠน์ง•์ ์˜ ์œ„์น˜๋Š” ์ฝ”๋„ˆ๋‚˜ ์—ฃ์ง€๊ฐ€ ์žˆ๋Š” ๋ถ€๋ถ„์œผ๋กœ ํ•œ์ •๋˜์–ด ์žˆ์œผ๋ฉฐ

ํŠนํžˆ ORB์˜ ๊ฒฝ์šฐ SIFT์— ๋น„ํ•ด ์ƒ๋Œ€์ ์œผ๋กœ ํ๋ฆฟํ•œ ์˜์—ญ์˜ ํŠน์ง•์ ์„ ์ถ”์ถœํ•˜๋Š” ๋Šฅ๋ ฅ์ด ๋ถ€์กฑํ•˜๋‹ค๋Š” ๊ฒƒ์„ ํ™•์ธํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

๋˜ํ•œ ๋”ฅ๋Ÿฌ๋‹์„ ์‚ฌ์šฉํ•œ ๊ธฐ๋ฒ•์ด ๊ทธ๋ ‡์ง€ ์•Š์€ ๊ธฐ๋ฒ•๋ณด๋‹ค ๋งŽ์€ ํŠน์ง•์ ์„ ๋งค์นญ ์‹œํ‚ค๋Š” ๊ฒƒ์„ ์‹คํ—˜ ๊ฒฐ๊ณผ์—์„œ ํ™•์ธํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

ํŠนํžˆ, LoFTR์˜ ๊ฒฝ์šฐ ์ฝ”๋„ˆ๋‚˜ ์—ฃ์ง€๊ฐ€ ์•„๋‹Œ ํ‰ํƒ„ํ•œ ๋ถ€๋ถ„์—์„œ๋„ ํŠน์ง•์ ์„ ๋งค์นญ ์‹œํ‚ฌ ์ˆ˜ ์žˆ๋Š” ์„ฑ๋Šฅ์„ ๋ณด์—ฌ์ฃผ๊ณ  ์žˆ์œผ๋ฉฐ ํŠน์ง•์  ๊ฐ„ ๊ณต๊ฐ„์ ์ธ ์ •๋ณด๋ฅผ ๋” ์ž˜ ์œ ์ง€ํ•˜๋Š” ๊ฒƒ์„ ํ™•์ธํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

Conclusion

The paper splits feature-extraction-based image matching into two stages, keypoint extraction and keypoint matching, explains each procedure, and reviews the well-known image matching methods.

It then evaluates each method experimentally and discusses the qualitative image matching results.

In homography estimation from extracted features, SuperPoint achieved the best results, and among the keypoint matching methods LoFTR showed the best matching performance.

Deep-learning-based methods also matched more keypoints accurately than rule-based methods,

and LoFTR, by preserving the spatial information between keypoints, performed well even in regions that are hard to match.

Image matching is a complex and demanding task that has to consider the relationships between keypoints within the two images and the context of each keypoint in the image as a whole.

With advances in deep learning, new methods keep appearing that address the shortcomings of earlier ones and take diverse image characteristics into account.

However, the existing methods presented in the paper can run into constraints depending on the goal of each project and the characteristics of the images being matched.

Further research is therefore needed on new image matching methods that analyze and overcome the shortcomings of existing methods with these goals and characteristics in mind.
