Please use this identifier to cite or link to this resource: https://elib.psu.by/handle/123456789/30612
Title: Object Detection in Video Surveillance Based on Multiscale Frame Representation and Block Processing by a Convolutional Neural Network
Authors: Bohush, R.
Ma, G.
Yang, W.
Ablameyko, S.
Publication date: 2022
Publisher: Springer
Citation: Bohush, R., Ma, G., Weichen, Y. et al. Object Detection in Video Surveillance Based on Multiscale Frame Representation and Block Processing by a Convolutional Neural Network. Pattern Recognit. Image Anal. 32, 1–10 (2022). https://doi.org/10.1134/S1054661822010035
Abstract: A method for detecting objects in high-resolution images is proposed that is based on representing an image as a set of its copies of decreasing scale, splitting it into blocks with overlap at each level of the image pyramid except for the top one, detecting objects in the blocks, and analyzing objects at the boundaries of adjacent blocks to merge them. The number of pyramid layers is determined by the size of the image and the input layer of the convolutional neural network (CNN). At all levels except for the top one, a block splitting is performed, and the use of overlap allows one to improve the correct classification of objects, which are divided into fragments and located in adjacent blocks. The decision to merge such fragments is made based on the analysis of the metric of intersection over union and membership in the same class. The proposed approach is evaluated for 4K and 8K images. To carry out experiments, a database is prepared with objects of two classes, person and vehicle, marked in such images. Networks of the You Only Look Once (YOLO) family of the third and fourth versions are used as CNNs. A quantitative assessment of the detection efficiency of objects is performed using the mAP metric for various combinations of parameters such as the degree of threshold confidence of the CNN and the percentage of intersection of blocks in the hierarchical representation of images. The results of the investigations are presented.
URI (Uniform Resource Identifier): https://elib.psu.by/handle/123456789/30612
DOI: 10.1134/S1054661822010035
Appears in collections: Publications in Scopus and Web of Science
Machine Learning. Image and Video Processing. Intelligent Systems. Information Security
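
The abstract names three steps that a short sketch may help make concrete: choosing the number of pyramid levels from the frame size and the CNN input size, splitting a level into overlapping blocks, and merging fragments from adjacent blocks by intersection over union (IoU) and class. The code below is only an illustration under stated assumptions, not the authors' implementation. The abstract does not give exact rules, so the halving rule in `pyramid_levels`, the flush-with-border last block in `block_origins`, the IoU threshold of 0.3, the bounding-union merge, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in frame coordinates


@dataclass
class Detection:
    box: Box
    label: str    # e.g. "person" or "vehicle"
    score: float  # CNN confidence


def pyramid_levels(width: int, height: int, net_size: int) -> int:
    """Assumed rule: halve the frame until it fits the CNN input layer."""
    levels = 1
    while max(width, height) > net_size:
        width, height = (width + 1) // 2, (height + 1) // 2
        levels += 1
    return levels


def block_origins(length: int, block: int, overlap: float) -> List[int]:
    """Start offsets along one axis for blocks with a fractional overlap."""
    if length <= block:
        return [0]
    stride = max(1, int(block * (1.0 - overlap)))
    origins = list(range(0, length - block, stride))
    origins.append(length - block)  # assumed: last block sits flush with the border
    return origins


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def merge_detections(dets: List[Detection], iou_threshold: float = 0.3) -> List[Detection]:
    """Merge same-class detections whose IoU reaches the (assumed) threshold."""
    merged: List[Detection] = []
    for det in sorted(dets, key=lambda d: d.score, reverse=True):
        for i, kept in enumerate(merged):
            if kept.label == det.label and iou(kept.box, det.box) >= iou_threshold:
                kb, db = kept.box, det.box
                # Assumed merge rule: bounding union, keeping the higher score.
                union_box = (min(kb[0], db[0]), min(kb[1], db[1]),
                             max(kb[2], db[2]), max(kb[3], db[3]))
                merged[i] = Detection(union_box, kept.label, kept.score)
                break
        else:
            merged.append(det)
    return merged
```

In use, the detections returned by the CNN for each block would first be shifted by that block's origin and rescaled to full-frame coordinates, then passed together to `merge_detections`. Requiring the same class label keeps a person standing in front of a vehicle from being fused into a single box.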

Files in this resource:
File: 1-10.pdf
Size: 184.2 kB
Format: Adobe PDF

