Conditional DETR V2: Efficient Detection Transformer with Box Queries

In this paper, we study the Detection Transformer (DETR), an end-to-end object detection approach built on a transformer encoder-decoder architecture that needs no hand-crafted postprocessing such as NMS. Conditional DETR, an improved DETR with fast training convergence, introduced box queries (originally called spatial queries) for the internal decoder layers. Inspired by it, we reformulate the object query as a box query: a composition of the embeddings of the reference point and the transformation of the box with respect to that reference point. This reformulation reveals the connection between the object query in DETR and the anchor box widely studied in Faster R-CNN. Furthermore, we learn the box queries from the image content, which further improves the detection quality of Conditional DETR while retaining fast training convergence. In addition, we adopt the idea of axial self-attention to reduce the memory cost and accelerate the encoder. The resulting detector, called Conditional DETR V2, achieves better results than Conditional DETR, uses less memory, and runs more efficiently. For example, with the DC5-ResNet-50 backbone, our approach achieves 44.8 AP at 16.4 FPS on the COCO val set; compared to Conditional DETR, it runs 1.6× faster, saves 74% of the overall memory cost, and improves the AP score by 1.0.
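
To give a rough feel for why axial self-attention can reduce encoder memory, the sketch below counts attention weights only, for a generic axial scheme in which each token attends along its row and then along its column. It is not the paper's implementation. The function names and the feature-map size are hypothetical, and the count ignores everything else that contributes to the 74% overall saving reported in the abstract.

```python
def full_attention_entries(height: int, width: int) -> int:
    """Attention weights when every token attends to every token."""
    tokens = height * width
    return tokens * tokens


def axial_attention_entries(height: int, width: int) -> int:
    """Attention weights when each token attends along its row, then its column."""
    tokens = height * width
    row_pass = tokens * width      # each token attends to the W tokens in its row
    column_pass = tokens * height  # each token attends to the H tokens in its column
    return row_pass + column_pass


if __name__ == "__main__":
    # Hypothetical feature-map size, chosen for illustration (not from the paper).
    h, w = 50, 75
    full = full_attention_entries(h, w)
    axial = axial_attention_entries(h, w)
    print(f"full: {full}, axial: {axial}, ratio: {full / axial:.1f}")
```

Under these assumptions the full attention map grows as (HW)^2 while the axial one grows as HW(H + W), so the gap widens as the feature map gets larger.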

Preprint · 2022 · arXiv · Open access
