Authors: Швай, Надія; Іванюк-Скульський, Богдан
Date issued: 2022
Date available: 2024-04-10
URI: https://ekmair.ukma.edu.ua/handle/123456789/28826
Title: Semantic image segmentation using the Transformer architecture (master's thesis)
Abstract: In this work we present a model that efficiently balances local representations, obtained by convolution blocks, with global representations, obtained by transformer blocks. The proposed model outperforms the previously standard decoder architecture DeepLabV3 by at least 1% in Jaccard index while using fewer parameters; in the best case the improvement is 7%. As part of future work we plan to experiment with (1) pretraining on the MS COCO dataset and (2) hyperparameter search.
Language: en
Keywords: AlexNet; Transformer Encoder blocks; Jaccard index; DeepLabV3
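
The abstract describes a hybrid of convolution blocks (local features) and transformer encoder blocks (global context), evaluated with the Jaccard index. Below is a minimal PyTorch sketch of that general idea, not the thesis's actual architecture: the channel width, head count, and the residual combination of the two branches are illustrative assumptions, and jaccard_index is a hypothetical helper included only to show the metric's definition |A ∩ B| / |A ∪ B|.

# A minimal sketch of a conv + transformer block, assuming PyTorch.
# NOT the thesis's architecture; all sizes below are illustrative.
import torch
import torch.nn as nn


class HybridBlock(nn.Module):
    """Convolution for local features, a transformer encoder for global context."""

    def __init__(self, channels: int = 64, heads: int = 8):
        super().__init__()
        # Local representation: a standard convolution block.
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        # Global representation: one transformer encoder layer over the
        # flattened spatial positions, treated as a sequence of tokens.
        self.encoder = nn.TransformerEncoderLayer(
            d_model=channels, nhead=heads, batch_first=True
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        local = self.conv(x)                       # (B, C, H, W)
        b, c, h, w = local.shape
        tokens = local.flatten(2).transpose(1, 2)  # (B, H*W, C)
        global_ = self.encoder(tokens)             # (B, H*W, C)
        global_ = global_.transpose(1, 2).reshape(b, c, h, w)
        return local + global_                     # combine local and global


def jaccard_index(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Jaccard index (IoU) of two boolean masks: |A & B| / |A | B|."""
    inter = (pred & target).sum().float()
    union = (pred | target).sum().float()
    return inter / union.clamp(min=1)

For example, HybridBlock(64)(torch.randn(1, 64, 32, 32)) returns a tensor of the same shape, so blocks of this kind can be stacked inside an encoder-decoder segmentation network and compared against DeepLabV3 on per-class Jaccard scores.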