MPG.eBooks - Staff View: Multimodal scene understanding

Multimodal scene understanding: algorithms, applications and deep learning

Bibliographic Details
Other Authors:	Yang, Michael Ying (Editor), Rosenhahn, Bodo (Editor), Murino, Vittorio (Editor)
Format:	eBook
Language:	English
Published:	London: Academic Press, 2019
Subjects:	Ingénierie; Computer algorithms / http://id.loc.gov/authorities/subjects/sh91000149; Engineering / fast; Algorithms / http://id.loc.gov/authorities/subjects/sh85003487; Artificial intelligence / http://id.loc.gov/authorities/subjects/sh85008180; Intelligence artificielle; Computer vision / fast; Computational intelligence / http://id.loc.gov/authorities/subjects/sh94004659; Engineering / aat; Computer vision / http://id.loc.gov/authorities/subjects/sh85029549; Computer algorithms / fast; Algorithms / fast; Artificial intelligence / fast; Intelligence informatique; Algorithmes; Algorithms / aat; Artificial intelligence / aat; Computational intelligence / fast; Vision par ordinateur; Engineering / http://id.loc.gov/authorities/subjects/sh85043176
Online Access:	https://learning.oreilly.com/library/view/~/978012...
Collection:	O'Reilly - Collection details see MPG.ReNa


LEADER	05978nmm a2200649 u 4500
001	EB001936143
003	EBX01000000000000001099045
005	00000000000000.0
007	cr\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|
008	210123 \|\|\| eng
020			\|a 9780128173596
020			\|a 0128173599
020			\|a 9780128173589
050		4	\|a Q342
100	1		\|a Yang, Michael Ying \|e editor
245	0	0	\|a Multimodal scene understanding \|b algorithms, applications and deep learning \|c edited by Michael Ying Yang, Bodo Rosenhahn, Vittorio Murino
260			\|a London \|b Academic Press \|c 2019
300			\|a ix, 412 pages \|b illustrations (some color), maps
505	0		\|a Includes bibliographical references and index
505	0		\|a 4 Learning Convolutional Neural Networks for Object Detection with Very Little Training Data; 4.1 Introduction; 4.2 Fundamentals; 4.2.1 Types of Learning; 4.2.2 Convolutional Neural Networks; 4.2.2.1 Artificial neuron; 4.2.2.2 Artificial neural network; 4.2.2.3 Training; 4.2.2.4 Convolutional neural networks; 4.2.3 Random Forests; 4.2.3.1 Decision tree; 4.2.3.2 Random forest; 4.3 Related Work; 4.4 Traffic Sign Detection; 4.4.1 Feature Learning; 4.4.2 Random Forest Classification; 4.4.3 RF to NN Mapping; 4.4.4 Fully Convolutional Network; 4.4.5 Bounding Box Prediction; 4.5 Localization
505	0		\|a Front Cover; Multimodal Scene Understanding; Copyright; Contents; List of Contributors; 1 Introduction to Multimodal Scene Understanding; 1.1 Introduction; 1.2 Organization of the Book; References; 2 Deep Learning for Multimodal Data Fusion; 2.1 Introduction; 2.2 Related Work; 2.3 Basics of Multimodal Deep Learning: VAEs and GANs; 2.3.1 Auto-Encoder; 2.3.2 Variational Auto-Encoder (VAE); 2.3.3 Generative Adversarial Network (GAN); 2.3.4 VAE-GAN; 2.3.5 Adversarial Auto-Encoder (AAE); 2.3.6 Adversarial Variational Bayes (AVB); 2.3.7 ALI and BiGAN
505	0		\|a 2.4 Multimodal Image-to-Image Translation Networks; 2.4.1 Pix2pix and Pix2pixHD; 2.4.2 CycleGAN, DiscoGAN, and DualGAN; 2.4.3 CoGAN; 2.4.4 UNIT; 2.4.5 Triangle GAN; 2.5 Multimodal Encoder-Decoder Networks; 2.5.1 Model Architecture; 2.5.2 Multitask Training; 2.5.3 Implementation Details; 2.6 Experiments; 2.6.1 Results on NYUDv2 Dataset; 2.6.2 Results on Cityscape Dataset; 2.6.3 Auxiliary Tasks; 2.7 Conclusion; References; 3 Multimodal Semantic Segmentation: Fusion of RGB and Depth Data in Convolutional Neural Networks; 3.1 Introduction; 3.2 Overview; 3.2.1 Image Classification and the VGG Network
505	0		\|a 3.2.2 Architectures for Pixel-level Labeling; 3.2.3 Architectures for RGB and Depth Fusion; 3.2.4 Datasets and Benchmarks; 3.3 Methods; 3.3.1 Datasets and Data Splitting; 3.3.2 Preprocessing of the Stanford Dataset; 3.3.3 Preprocessing of the ISPRS Dataset; 3.3.4 One-channel Normal Label Representation; 3.3.5 Color Spaces for RGB and Depth Fusion; 3.3.6 Hyper-parameters and Training; 3.4 Results and Discussion; 3.4.1 Results and Discussion on the Stanford Dataset; 3.4.2 Results and Discussion on the ISPRS Dataset; 3.5 Conclusion; References
505	0		\|a 4.6 Clustering; 4.7 Dataset; 4.7.1 Data Capturing; 4.7.2 Filtering; 4.8 Experiments; 4.8.1 Training and Test Data; 4.8.2 Classification; 4.8.3 Object Detection; 4.8.4 Computation Time; 4.8.5 Precision of Localizations; 4.9 Conclusion; Acknowledgment; References; 5 Multimodal Fusion Architectures for Pedestrian Detection; 5.1 Introduction; 5.2 Related Work; 5.2.1 Visible Pedestrian Detection; 5.2.2 Infrared Pedestrian Detection; 5.2.3 Multimodal Pedestrian Detection; 5.3 Proposed Method; 5.3.1 Multimodal Feature Learning/Fusion; 5.3.2 Multimodal Pedestrian Detection; 5.3.2.1 Baseline DNN model
653			\|a Ingénierie
653			\|a Computer algorithms / http://id.loc.gov/authorities/subjects/sh91000149
653			\|a Engineering / fast
653			\|a Algorithms / http://id.loc.gov/authorities/subjects/sh85003487
653			\|a Artificial intelligence / http://id.loc.gov/authorities/subjects/sh85008180
653			\|a Intelligence artificielle
653			\|a Computer vision / fast
653			\|a Computational intelligence / http://id.loc.gov/authorities/subjects/sh94004659
653			\|a engineering / aat
653			\|a Computer vision / http://id.loc.gov/authorities/subjects/sh85029549
653			\|a Computer algorithms / fast
653			\|a Algorithms / fast
653			\|a Artificial intelligence / fast
653			\|a Intelligence informatique
653			\|a Algorithmes
653			\|a algorithms / aat
653			\|a artificial intelligence / aat
653			\|a Computational intelligence / fast
653			\|a Vision par ordinateur
653			\|a Engineering / http://id.loc.gov/authorities/subjects/sh85043176
700	1		\|a Rosenhahn, Bodo \|e editor
700	1		\|a Murino, Vittorio \|e editor
041	0	7	\|a eng \|2 ISO 639-2
989			\|b OREILLY \|a O'Reilly
015			\|a GBB9C9474
776			\|z 0128173599
776			\|z 9780128173596
776			\|z 0128173580
776			\|z 9780128173589
856	4	0	\|u https://learning.oreilly.com/library/view/~/9780128173596/?ar \|x Verlag \|3 Volltext
082	0		\|a 006.3
082	0		\|a 620
520			\|a Multimodal Scene Understanding: Algorithms, Applications and Deep Learning presents recent advances in multi-modal computing, with a focus on computer vision and photogrammetry. It provides the latest algorithms and applications that involve combining multiple sources of information and describes the role and approaches of multi-sensory data and multi-modal deep learning. The book is ideal for researchers from the fields of computer vision, remote sensing, robotics, and photogrammetry, thus helping foster interdisciplinary interaction and collaboration between these realms. Researchers collecting and analyzing multi-sensory data collections (for example, the KITTI benchmark, stereo+laser) from different platforms, such as autonomous vehicles, surveillance cameras, UAVs, planes and satellites, will find this book to be very useful.
