
Artificial Intelligence in Computer Vision

Vladimir Arlazarov
FRC CSC RAS, Smart Engines (https://smartengines.com/)
vva@smartengines.com

The talk will cover the following topics:

  • Convolutional neural networks: layer types, activation functions, training [1-7].
  • An overview of modern convolutional neural network models (LeNet-5, AlexNet, VGG-16, Inception-v1, ResNet, R-CNN, YOLO, U-Net, CapsNet, GAN, VAE) and their applications (object classification and localization, image segmentation).
  • Current trends in the development of neural network methods, in particular the extensive and intensive directions of development.
  • Security problems and vulnerabilities of neural networks: training data poisoning [8][11], embedding of backdoor triggers [9][10], evasion attacks, adversarial examples [12-15].
  • Approaches to improving the computational efficiency of neural networks: network pruning [17], knowledge distillation [18-20], network quantization [21-22], low-bit arithmetic, morphological neural networks.
  • Hardware and software directions for implementing convolutional neural networks.
  • Open problems of modern neural networks.
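The building blocks named in the first topic (convolution, activation, pooling) can be illustrated with a minimal NumPy sketch. This is a toy single-channel forward pass for illustration only, not the implementation discussed in the talk:

```python
import numpy as np

def conv2d(x, kernel):
    """Valid 2-D cross-correlation of a single-channel image with one kernel."""
    kh, kw = kernel.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """ReLU activation: elementwise max(x, 0)."""
    return np.maximum(x, 0.0)

def max_pool2x2(x):
    """2x2 max pooling with stride 2 (trailing odd row/column dropped)."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    return x[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# Toy example: 4x4 image, 2x2 horizontal-gradient kernel
img = np.arange(16, dtype=float).reshape(4, 4)
k = np.array([[-1.0, 1.0], [-1.0, 1.0]])
feat = max_pool2x2(relu(conv2d(img, k)))  # shape (1, 1)
```

Real frameworks fuse these operations and handle multiple channels, strides, and padding, but the composition conv → activation → pooling is exactly the layer pattern of the networks listed above.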
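The adversarial-example attacks surveyed in [12-15] can be sketched with the Fast Gradient Sign Method of [12] applied to a hand-written logistic model; the weights and input below are made up for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy logistic "classifier": predicts class 1 when w.x + b > 0
w = np.array([1.0, -2.0, 3.0])
b = 0.0

def fgsm(x, y, eps):
    """Fast Gradient Sign Method: perturb x by eps along the sign of the
    gradient of the cross-entropy loss -log p(y|x), which here is (p - y) * w."""
    p = sigmoid(w @ x + b)
    grad_x = (p - y) * w
    return x + eps * np.sign(grad_x)

x = np.array([0.5, -0.5, 0.5])      # score = 3.0, confidently class 1
x_adv = fgsm(x, y=1.0, eps=0.8)     # small L-inf perturbation flips the score
```

The perturbation is bounded by eps in the L-infinity norm, yet it is enough to flip the prediction; the papers above show the same effect on deep image classifiers with perturbations invisible to humans.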
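The quantization topic [21-22] amounts to mapping float weights onto a small integer grid. A minimal sketch of symmetric post-training int8 quantization (the scale convention is one common choice, not the specific scheme of the cited papers):

```python
import numpy as np

def quantize_int8(w):
    """Symmetric quantization: map floats in [-max|w|, max|w|] onto [-127, 127]."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximation of the original floats."""
    return q.astype(np.float32) * scale

w = np.array([-0.5, 0.0, 0.25, 1.0], dtype=np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
err = np.abs(w - w_hat).max()   # bounded by roughly half a quantization step
```

Inference then runs the expensive multiply-accumulates in int8, dequantizing only where float activations are needed; the survey [22] pushes the same idea down to binary weights.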

Talk slides

Talk video.

References:

  1. E. Shelhamer, J. Long and T. Darrell, "Fully Convolutional Networks for Semantic Segmentation," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 4, pp. 640-651, 1 April 2017, doi: 10.1109/TPAMI.2016.2572683.
  2. Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. 2017. ImageNet classification with deep convolutional neural networks. Commun. ACM 60, 6 (June 2017), 84–90. DOI:https://doi.org/10.1145/3065386
  3. Sergey Ioffe and Christian Szegedy. 2015. Batch normalization: accelerating deep network training by reducing internal covariate shift. In Proceedings of the 32nd International Conference on International Conference on Machine Learning - Volume 37 (ICML'15). JMLR.org, 448–456.
  4. K. He, X. Zhang, S. Ren, and J. Sun, “Delving deep into rectifiers: Surpassing human-level performance on imagenet classification,” in Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), ICCV ’15, pp. 1026–1034, IEEE Computer Society, USA (2015). DOI: 10.1109/ICCV.2015.123.
  5. D. Clevert, T. Unterthiner, and S. Hochreiter, “Fast and accurate deep network learning by exponential linear units (ELUs),” in 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2-4, 2016, Conference Track Proceedings, (2016).
  6. Ramachandran, P., Zoph, B., & Le, Q.V. (2018). Searching for Activation Functions. ArXiv, abs/1710.05941.
  7. Alexander V. Gayer, Alexander V. Sheshkus, Dmitri P. Nikolaev, Vladimir V. Arlazarov, "Improvement of U-Net architecture for image binarization with activation functions replacement," Proc. SPIE 11605, Thirteenth International Conference on Machine Vision, 116050Y (4 January 2021); https://doi.org/10.1117/12.2587027
  8. Jagielski, Matthew & Oprea, Alina & Biggio, Battista & Liu, Chang & Nita-Rotaru, Cristina & Li, Bo. (2018). Manipulating Machine Learning: Poisoning Attacks and Countermeasures for Regression Learning. 19-35. 10.1109/SP.2018.00057.
  9. T. Gu, K. Liu, B. Dolan-Gavitt and S. Garg, "BadNets: Evaluating Backdooring Attacks on Deep Neural Networks," in IEEE Access, vol. 7, pp. 47230-47244, 2019, doi: 10.1109/ACCESS.2019.2909068.
  10. T. Gu, K. Liu, B. Dolan-Gavitt and S. Garg, "BadNets: Evaluating Backdooring Attacks on Deep Neural Networks," in IEEE Access, vol. 7, pp. 47230-47244, 2019, doi: 10.1109/ACCESS.2019.2909068.
  11. Tahmasebian, Farnaz & Xiong, Li & Sotoodeh, Mani & Sunderam, Vaidy. (2020). Crowdsourcing Under Data Poisoning Attacks: A Comparative Study. 10.1007/978-3-030-49669-2_18.
  12. I.J. Goodfellow, J. Shlens, and C. Szegedy, "Explaining and harnessing adversarial examples", 2015, ICLR.
  13. I.J. Goodfellow, J. Shlens, and C. Szegedy, "Explaining and harnessing adversarial examples", 2015, ICLR.
  14. S.-M. Moosavi-Dezfooli, A. Fawzi, O. Fawzi, and P. Frossard, "Universal adversarial perturbations", 2017, CVPR.
  15. X. Xu, J. Chen, J. Xiao, L. Gao, F. Shen, and H.T. Shen, "What machines see is not what they get: fooling scene text recognition models with adversarial text images", 2020, CVPR.
  16. C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus, "Intriguing properties of neural networks", 2014.
  17. Xiaohan Ding, Guiguang Ding, Xiangxin Zhou, Yuchen Guo, Jungong Han, Ji Liu: Global Sparse Momentum SGD for Pruning Very Deep Neural Networks. NeurIPS 2019: 6379-6391
  18. Hinton, G., Vinyals, O. & Dean, J. (2015). Distilling the Knowledge in a Neural Network. arXiv preprint arXiv:1503.02531.
  19. Asami, Taichi; Masumura, Ryo; Yamaguchi, Yoshikazu; Masataki, Hirokazu; Aono, Yushi (2017). Domain adaptation of DNN acoustic models using knowledge distillation. IEEE International Conference on Acoustics, Speech and Signal Processing. pp. 5185–5189.
  20. Cui, Jia; Kingsbury, Brian; Ramabhadran, Bhuvana; Saon, George; Sercu, Tom; Audhkhasi, Kartik; Sethy, Abhinav; Nussbaum-Thom, Markus; Rosenberg, Andrew (2017). Knowledge distillation across ensembles of multilingual models for low-resource languages. IEEE International Conference on Acoustics, Speech and Signal Processing. pp. 4825–4829.
  21. X. Chen, X. Hu, H. Zhou and N. Xu, "FxpNet: Training a deep convolutional neural network in fixed-point representation," 2017 International Joint Conference on Neural Networks (IJCNN), Anchorage, AK, 2017, pp. 2494-2501, doi: 10.1109/IJCNN.2017.7966159.
  22. Qin, H., Gong, R., Liu, X., Bai, X., Song, J., & Sebe, N. (2020). Binary Neural Networks: A Survey. ArXiv, abs/2004.03333.
Supported by Synthesis Group