Bibliography

[BRPM16]

Vassileios Balntas, Edgar Riba, Daniel Ponsa, and Krystian Mikolajczyk. Learning local feature descriptors with triplets and shallow convolutional neural networks. In British Machine Vision Conference (BMVC). 2016.

[BLRPM19]

Axel Barroso-Laguna, Edgar Riba, Daniel Ponsa, and Krystian Mikolajczyk. Key.Net: Keypoint Detection by Handcrafted and Learned CNN Filters. In ICCV. 2019.

[BA87]

Peter J Burt and Edward H Adelson. The laplacian pyramid as a compact image code. In Readings in computer vision, pages 671–679. Elsevier, 1987.

[CLO+20]

Luca Cavalli, Viktor Larsson, Martin Ralf Oswald, Torsten Sattler, and Marc Pollefeys. Adalam: revisiting handcrafted outlier detection. CoRR, 2020. URL: https://arxiv.org/abs/2006.04250, arXiv:2006.04250.

[CZM+18]

Ekin D Cubuk, Barret Zoph, Dandelion Mane, Vijay Vasudevan, and Quoc V Le. Autoaugment: learning augmentation policies from data. arXiv preprint arXiv:1805.09501, 2018.

[CZSL20]

Ekin D Cubuk, Barret Zoph, Jonathon Shlens, and Quoc V Le. Randaugment: practical automated data augmentation with a reduced search space. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops, 702–703. 2020.

[DBK+21]

Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. An image is worth 16x16 words: transformers for image recognition at scale. ICLR, 2021.

[EBWF24]

Johan Edstedt, Georg Bökman, Mårten Wadenbäck, and Michael Felsberg. DeDoDe: Detect, Don't Describe — Describe, Don't Detect for Local Feature Matching. In 2024 International Conference on 3D Vision (3DV). 2024.

[FYP+21]

Yuantao Feng, Shiqi Yu, Hanyang Peng, Yan-ran Li, and Jianguo Zhang. Detect faces efficiently: a survey and evaluations. IEEE Transactions on Biometrics, Behavior, and Identity Science, 2021.

[HS15]

Kaiming He and Jian Sun. Fast guided filter. 2015. arXiv:1505.00996.

[HST10]

Kaiming He, Jian Sun, and Xiaoou Tang. Guided image filtering. In Proceedings of the 11th European Conference on Computer Vision: Part I, 1–14. 2010.

[JJC01]

J. J. Guerrero and C. Sagues. From lines to homographies between uncalibrated images. In IX Spanish Symposium on Pattern Recognition and Image Analysis. 2001.

[KS19]

Davood Karimi and Septimiu E Salcudean. Reducing the hausdorff distance in medical image segmentation with convolutional neural networks. IEEE Transactions on medical imaging, 39(2):499–513, 2019.

[LYFC21]

Shiqi Lin, Tao Yu, Ruoyu Feng, and Zhibo Chen. Patch autoaugment. 2021. arXiv:2103.11099.

[LGG+18]

Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Dollár. Focal loss for dense object detection. arXiv ePrint 1708.02002, 2018.

[LSP23]

Philipp Lindenberger, Paul-Edouard Sarlin, and Marc Pollefeys. Lightglue: local feature matching at light speed. arXiv ePrint 2306.13643, 2023.

[MarquezNLopezABB16]

Pablo Márquez-Neila, Javier López-Alberca, José M. Buenaposada, and Luis Baumela. Speeding-up homography estimation in mobile devices. J. Real-Time Image Process., 11(1):141–154, January 2016. URL: https://doi.org/10.1007/s11554-012-0314-1, doi:10.1007/s11554-012-0314-1.

[MMRM17]

Anastasiya Mishchuk, Dmytro Mishkin, Filip Radenovic, and Jiri Matas. Working hard to know your neighbor's margins: local descriptor learning loss. In Proceedings of NeurIPS. 2017.

[MRM18]

D. Mishkin, F. Radenovic, and J. Matas. Repeatability is Not Enough: Learning Affine Regions via Discriminability. In ECCV. 2018.

[MMP15]

Dmytro Mishkin, Jiri Matas, and Michal Perdoch. Mods: fast and robust method for two-view matching. Computer Vision and Image Understanding, 141:81–93, 2015.

[MTB+19]

Arun Mukundan, Giorgos Tolias, Andrei Bursuc, Hervé Jégou, and Ondřej Chum. Understanding and improving kernel local descriptors. International Journal of Computer Vision, 2019.

[MullerH21]

Samuel G Müller and Frank Hutter. Trivialaugment: tuning-free yet state-of-the-art data augmentation. In Proceedings of the IEEE/CVF international conference on computer vision, 774–782. 2021.

[NCR+22]

Anguelos Nicolaou, Vincent Christlein, Edgar Riba, Jian Shi, Georg Vogeler, and Mathias Seuret. Tormentor: deterministic dynamic-path, data augmentations with fractals. 2022. URL: https://arxiv.org/abs/2204.03776, doi:10.48550/ARXIV.2204.03776.

[PautratLinL+21]

Rémi Pautrat, Juan-Ting Lin, Viktor Larsson, Martin R. Oswald, and Marc Pollefeys. Sold2: self-supervised occlusion-aware line description and detection. In Computer Vision and Pattern Recognition (CVPR). 2021.

[PDP20]

Duc Duy Pham, Gurbandurdy Dovletov, and Josef Pauli. A differentiable convolutional distance transform layer for improved image segmentation. Pattern Recognition, 12544:432–444, 2020.

[Pul20]

Milan Pultar. Improving the hardnet descriptor. arXiv ePrint 2007.09699, 2020.

[RDPC24]

Christoph Reich, Biplob Debnath, Deep Patel, and Srimat Chakradhar. Differentiable jpeg: the devil is in the details. In IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). 2024.

[ROF+21]

Denys Rozumnyi, Martin R. Oswald, Vittorio Ferrari, Jiri Matas, and Marc Pollefeys. Defmo: deblurring and shape recovery of fast moving objects. In CVPR. 2021.

[SEG17]

Seyed Sadegh Mohseni Salehi, Deniz Erdogmus, and Ali Gholipour. Tversky loss function for image segmentation using 3d fully convolutional deep networks. arXiv ePrint 1706.05721, 2017.

[SSSF+23]

Jan Sellner, Silvia Seidlitz, Alexander Studier-Fischer, Alessandro Motta, Berkin Özdemir, Beat Peter Müller-Stich, Felix Nickel, and Lena Maier-Hein. Semantic segmentation of surgical hyperspectral images under geometric domain shifts. 2023. arXiv:2303.10972.

[SZZ+24]

Jian Shi, Pengyi Zhang, Ni Zhang, Hakim Ghazzai, and Peter Wonka. Dissolving is amplifying: towards fine-grained anomaly detection. 2024.

[SS17]

Richard Shin and Dawn Song. Jpeg-resistant adversarial images. In NIPS Workshop on Machine Learning and Computer Security, volume 1, 8. 2017.

[SSP03]

P. Simard, David Steinkraus, and John C. Platt. Best practices for convolutional neural networks applied to visual document analysis. In Seventh International Conference on Document Analysis and Recognition, Proceedings, pages 958–963. 2003.

[SK20]

Saurabh Singh and Shankar Krishnan. Filter response normalization layer: eliminating batch dependence in the training of deep neural networks. In CVPR. 2020.

[SSW+21]

Jiaming Sun, Zehong Shen, Yuang Wang, Hujun Bao, and Xiaowei Zhou. LoFTR: detector-free local feature matching with transformers. In CVPR. 2021.

[TBLN+20]

Yurun Tian, Axel Barroso Laguna, Tony Ng, Vassileios Balntas, and Krystian Mikolajczyk. Hynet: learning local descriptor with hybrid similarity measure and triplet loss. In NeurIPS. 2020.

[TFT20]

Michał Tyszkiewicz, Pascal Fua, and Eduard Trulls. Disk: learning local features with policy gradient. Advances in Neural Information Processing Systems, 33:14254–14265, 2020.

[YHO+19]

Sangdoo Yun, Dongyoon Han, Seong Joon Oh, Sanghyuk Chun, Junsuk Choe, and Youngjoon Yoo. Cutmix: regularization strategy to train strong classifiers with localizable features. In International Conference on Computer Vision (ICCV). 2019.

[ZnYNDLP18]

Hongyi Zhang, Moustapha Cisse, Yann N. Dauphin, and David Lopez-Paz. Mixup: beyond empirical risk minimization. International Conference on Learning Representations, 2018. URL: https://openreview.net/forum?id=r1Ddp1-Rb.

[Zha19]

Richard Zhang. Making convolutional networks shift-invariant again. In ICML. 2019.

[ZWC+23]

Xiaoming Zhao, Xingming Wu, Weihai Chen, Peter C. Y. Chen, Qingsong Xu, and Zhengguo Li. Aliked: a lighter keypoint and descriptor extraction network via deformable transformation. IEEE Transactions on Instrumentation and Measurement, 72:1–16, 2023. doi:10.1109/TIM.2023.3271000.

[ZBTvdW22]

Simone Zini, Marco Buzzelli, Bartłomiej Twardowski, and Joost van de Weijer. Planckian jitter: enhancing the color quality of self-supervised visual representations. arXiv preprint arXiv:2202.07993, 2022.

[Baumberg00]

A. Baumberg. Reliable feature matching across widely separated views. In CVPR. 2000.