Augmented and Virtual Reality
                  From Daily Life to Assisted Robotics

Seminar Media Technology
SS 2021
Chair of Media Technology
Prof. Dr.-Ing. Eckehard Steinbach
Technical University of Munich

Seminar Topics
1. AR/VR-based Human-Robot Interfaces
2. VR and AR Interfaces for Robot Learning from Demonstration
3. Intuitive Teleoperation using Augmented Reality
4. Improving Teleoperated Driving Using an Augmented Reality Representation
5. Deep Learning for 6D pose estimation and its applications in AR
6. Deformable Object Tracking for AR
7. Marker-based Augmented Reality
8. Visual Place Recognition
9. Activity Recognition for Augmented Reality and Virtual Reality
10. Augmented and Virtual Haptics
11. Generation of Realistic Virtual Views
12. Video coding optimization of virtual reality 360-degree Video


                               AR/VR-based Human-Robot Interfaces
                      Supervision: Furkan Kaynar (furkan.kaynar@tum.de)

Augmented, virtual, and mixed reality applications attract interest in many fields, including robotics. The design of a human-robot interface is of great importance, as it determines the capacity and limitations of the interaction between the human and the robot: the user interface itself defines what kind of input the human can provide. Due to their intuitive nature, AR/VR-based interfaces may improve human demonstrations for robotic tasks; the demonstrations can be provided via virtual tools rendered in a real or simulated scene. AR/VR-based interfaces may also improve the human's understanding of the planned robotic tasks. In this topic, we will investigate the methods and applications of AR/VR-based interfaces in robotics, including but not limited to semi-autonomous teleoperation and robot programming.

                      References:
                      [1] Gadre, Samir Yitzhak, et al. "End-user robot programming using mixed reality." 2019 International conference on robotics and
                      automation (ICRA). IEEE, 2019.
                      [2] Rosen, Eric, et al. "Communicating and controlling robot arm motion intent through mixed-reality head-mounted displays." The
                      International Journal of Robotics Research 38.12-13 (2019): 1513-1526.
                      [3] Walker, Michael E., Hooman Hedayati, and Daniel Szafir. "Robot teleoperation with augmented reality virtual surrogates." 2019
                      14th ACM/IEEE International Conference on Human-Robot Interaction (HRI). IEEE, 2019.
                      [4] Makhataeva, Zhanat, and Huseyin Atakan Varol. "Augmented reality for robotics: a review." Robotics 9.2 (2020): 21.
                      [5] Kent, David, Carl Saldanha, and Sonia Chernova. "Leveraging depth data in remote robot teleoperation interfaces for general
                      object manipulation." The International Journal of Robotics Research 39.1 (2020): 39-53.


                                       VR and AR Interfaces
                               for Robot Learning from Demonstration
                      Supervision: Basak Guelecyuez (basak.guelecyuez@tum.de)

In the context of robotics and automation, learning from demonstration (LfD) refers to the framework of endowing robots with autonomy in a variety of tasks by making use of demonstrations provided by a human teacher. A classical approach for interacting with the robot during demonstrations is kinesthetic teaching, where the human teacher physically guides the robot. In this topic, we would like to investigate novel approaches that use VR and AR interfaces for human-robot interaction in LfD. For example, VR interfaces can be used to teleoperate a robot while providing demonstrations, and AR interfaces can further be employed to assess the learned task and to identify failure cases of the robot's autonomy.

                      References:
                      [1] M. Diehl, A. Plopski, H. Kato and K. Ramirez-Amaro, "Augmented Reality interface to verify Robot Learning," 2020 29th IEEE
                      International Conference on Robot and Human Interactive Communication (RO-MAN), Naples, Italy, 2020, pp. 378-383.
                      [2] H. Liu, Y. Zhang, W. Si, X. Xie, Y. Zhu and S. Zhu, "Interactive Robot Knowledge Patching Using Augmented Reality," 2018 IEEE
                      International Conference on Robotics and Automation (ICRA), Brisbane, QLD, Australia, 2018, pp. 1947-1954.
                      [3] Luebbers M. B., Brooks C., Kim M. J., Szafir D., Hayes B. Augmented reality interface for constrained learning from
                      demonstration. In: Proceedings of the 2nd International Workshop on Virtual, Augmented and Mixed Reality for HRI (VAM-HRI);
                      2019.
                      [4] D. Bambuŝek, Z. Materna, M. Kapinus, V. Beran and P. Smrž, "Combining Interactive Spatial Augmented Reality with Head-
                      Mounted Display for End-User Collaborative Robot Programming," 2019 28th IEEE International Conference on Robot and Human
                      Interactive Communication (RO-MAN), New Delhi, India, 2019, pp. 1-8.
                      [5] T. Zhang et al., "Deep Imitation Learning for Complex Manipulation Tasks from Virtual Reality Teleoperation," 2018 IEEE
                      International Conference on Robotics and Automation (ICRA), Brisbane, QLD, Australia, 2018, pp. 5628-5635.


                                                   Intuitive Teleoperation
                                                  using Augmented Reality
                      Supervision: Edwin Babaians (edwin.babaians@tum.de)

Teleoperation remains a dominant control paradigm for human interaction with robotic systems. However, teleoperation can be quite challenging, especially for novice users. Even experienced users may face difficulties or inefficiencies when operating a robot with unfamiliar and/or complex dynamics, such as industrial manipulators or aerial robots, because teleoperation forces users to focus on low-level aspects of robot control rather than on higher-level goals regarding task completion, data analysis, and problem solving. We will explore how advances in augmented reality (AR) may enable the design of novel teleoperation interfaces that increase operation effectiveness, support the user in conducting concurrent work, and decrease stress. In addition, AR could help shorten the learning curve, so that operators become proficient with the teleoperation setup and can thus perform better after only a short familiarization with the system.
                      References:
                      [1] Birkenkampf, Peter, Daniel Leidner, and Christoph Borst. "A knowledge-driven shared autonomy human-robot interface for
                      tablet computers." In 2014 IEEE-RAS International Conference on Humanoid Robots, pp. 152-159. IEEE, 2014.
                      [2] Brizzi, Filippo, Lorenzo Peppoloni, Alessandro Graziano, Erika Di Stefano, Carlo Alberto Avizzano, and Emanuele Ruffaldi.
                      "Effects of augmented reality on the performance of teleoperated industrial assembly tasks in a robotic embodiment." IEEE
                      Transactions on Human-Machine Systems 48, no. 2 (2017): 197-206.
                      [3] Livatino, Salvatore, Dario C. Guastella, Giovanni Muscato, Vincenzo Rinaldi, Luciano Cantelli, Carmelo D. Melita, Alessandro
                      Caniglia, Riccardo Mazza, and Gianluca Padula. "Intuitive Robot Teleoperation through Multi-Sensor Informed Mixed Reality Visual
                      Aids." IEEE Access 9 (2021): 25795-25808.
                      [4] Walker, Michael E., Hooman Hedayati, and Daniel Szafir. "Robot teleoperation with augmented reality virtual surrogates." In 2019
                      14th ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 202-210. IEEE, 2019.
                      [5] Hernández, Juan David, Shlok Sobti, Anthony Sciola, Mark Moll, and Lydia E. Kavraki. "Increasing robot autonomy via motion
                      planning and an augmented reality interface." IEEE Robotics and Automation Letters 5, no. 2 (2020): 1017-1023.


                            Improving Teleoperated Driving
                      Using an Augmented Reality Representation
                      Supervision: Markus Hofbauer (markus.hofbauer@tum.de)

Teleoperated driving is a possible fallback to resolve failures of autonomous vehicles. One of the main problems is representing the sensor data transmitted from the autonomous vehicle to the human operator in a way that allows the operator to understand the current traffic situation [1]. While regular displays are often the first choice, the immersive telepresence of head-mounted displays allows for further mixed reality representations that improve the operator's situation awareness [2], [3].

The task of this project is to analyze different augmented/virtual reality approaches for improving the sensor representation in teleoperated driving.

                      References:
[1] J.-M. Georg and F. Diermeyer, "An Adaptable and Immersive Real Time Interface for Resolving System Limitations of Automated Vehicles with Teleoperation," 2019 IEEE International Conference on Systems, Man and Cybernetics (SMC), Bari, Italy, Oct. 2019, pp. 2659-2664, doi: 10.1109/SMC.2019.8914306.
[2] A. Hosseini and M. Lienkamp, "Enhancing telepresence during the teleoperation of road vehicles using HMD-based mixed reality," 2016 IEEE Intelligent Vehicles Symposium (IV), Gothenburg, Sweden, June 2016, pp. 1366-1373, doi: 10.1109/IVS.2016.7535568.
[3] P. Gomes, C. Olaverri-Monreal, and M. Ferreira, "Making Vehicles Transparent Through V2V Video Streaming," IEEE Transactions on Intelligent Transportation Systems, vol. 13, no. 2, pp. 930-938, June 2012, doi: 10.1109/TITS.2012.2188289.


                                 Deep Learning for 6D pose estimation
                                      and its applications in AR
                      Supervision: Diego Prado (diego.prado@tum.de)

Estimating the 6D pose of an object from camera images can be very valuable for robotic assembly and maintenance applications. Together with augmented reality, step-by-step guidance can be provided to a user assembling an object that consists of multiple components.
In recent years, neural networks have been applied to this problem with great success and have proven to deliver fast and robust results. The goal of this seminar topic is to study state-of-the-art deep learning techniques for 6D pose estimation and their possible applications in AR.
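Many learning-based pipelines first predict 2D keypoints of the object and then recover the 6D pose geometrically by solving a Perspective-n-Point (PnP) problem. The following is a minimal, hedged sketch of that last step using OpenCV; the keypoint locations, 3D model points, and camera intrinsics are made-up placeholders and do not come from any of the referenced papers.

```python
# Hypothetical sketch: recover a 6D object pose from 2D keypoints that a
# network is assumed to have predicted, using OpenCV's PnP solver.
# All numeric values below are placeholders, not real calibration data.
import numpy as np
import cv2

# 3D coordinates of object keypoints in the object frame (placeholder model points).
object_points = np.array([
    [-0.05, -0.05, 0.0],
    [ 0.05, -0.05, 0.0],
    [ 0.05,  0.05, 0.0],
    [-0.05,  0.05, 0.0],
    [ 0.00,  0.00, 0.1],
], dtype=np.float32)

# 2D pixel locations of the same keypoints, e.g. the output of a keypoint network.
image_points = np.array([
    [320.0, 240.0],
    [400.0, 238.0],
    [402.0, 318.0],
    [322.0, 320.0],
    [361.0, 200.0],
], dtype=np.float32)

# Pinhole camera intrinsics (placeholder focal length / principal point).
K = np.array([[600.0, 0.0, 320.0],
              [0.0, 600.0, 240.0],
              [0.0, 0.0, 1.0]], dtype=np.float32)
dist_coeffs = np.zeros(5, dtype=np.float32)  # assume no lens distortion

# Solve the Perspective-n-Point problem: rotation (Rodrigues vector) and translation.
ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, dist_coeffs)
R, _ = cv2.Rodrigues(rvec)  # 3x3 rotation matrix of the estimated 6D pose
print("R =\n", R, "\nt =", tvec.ravel())
```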

                      References:
                      [1] Su, Y., Rambach, J., Minaskan, N., Lesur, P., Pagani, A., & Stricker, D. “Deep multi-state object pose estimation for augmented
                      reality assembly.” ISMAR-Adjunct 2019
                      [2] He, Z.; Feng, W.; Zhao, X.; Lv, Y. “6D Pose Estimation of Objects: Recent Technologies and Challenges.” Appl. Sci. 2021
                      [3] Su, Y., Rambach, J., Pagani, A., & Stricker, D. ”SynPo-Net—Accurate and Fast CNN-Based 6DoF Object Pose Estimation Using
                      Synthetic Training” Sensors, 2021


                                      Deformable Object Tracking for AR
                      Supervision: Michael Adam (michael.adam@tum.de)

Before augmenting reality, one needs to understand the environment that should be changed. Understanding static scenes and their objects has become easier with the introduction of deep neural networks. However, when interacting with the scene, objects may change size or even appearance; this is the case with deformable objects. Tracking deformable objects can be tricky, and several techniques have been developed in order to augment them over time.

During the seminar you should first familiarize yourself with object tracking in the scope of AR, for instance [1, 2], and then survey different techniques for tracking deformable objects. Physics-based simulations [3, 4] as well as methods with specialized setups [5] exist.

                      References:
                      [1] Park, Youngmin, Vincent Lepetit, and Woontack Woo. "Multiple 3d object tracking for augmented reality." 2008 7th IEEE/ACM International
                      Symposium on Mixed and Augmented Reality. IEEE, 2008.
                      [2] Tsoli, Aggeliki, and Antonis A. Argyros. "Joint 3d tracking of a deformable object in interaction with a hand." Proceedings of the European
                      Conference on Computer Vision (ECCV). 2018.
                      [3] Haouchine, Nazim, et al. "Physics-based augmented reality for 3d deformable object." Eurographics Workshop on Virtual Reality Interaction
                      and Physical Simulation. 2012.
                      [4] Paulus, Christoph J., et al. "Augmented reality during cutting and tearing of deformable objects." 2015 IEEE International Symposium on Mixed
                      and Augmented Reality. IEEE, 2015.
                      [5] Fujimoto, Yuichiro, et al. "Geometrically-correct projection-based texture mapping onto a deformable object." IEEE transactions on
                      visualization and computer graphics 20.4 (2014): 540-549.


                                          Marker-based Augmented Reality
                      Supervision: Martin Oelsch (martin.Oelsch@tum.de)

Augmented Reality (AR) employs computer vision, image processing, and computer graphics techniques to merge digital content into the real world. It enables real-time interaction between the user, real objects, and virtual objects. AR can, for example, be used to embed 3D graphics into a video in such a way that the virtual elements appear to be part of the real environment [1].

                      One of the challenges of AR is to align virtual data with the environment. A marker-based approach solves the problem
                      using visual markers, e.g. 2D bar-codes, detectable with computer vision methods. Once the markers are detected, the geometry of
                      the currently viewed part of the environment can be estimated by computing the pose of the marker.

The student is expected to give a good overview of marker-based augmented reality and to explain the methodology of marker detection and pose estimation mathematically, in the context of augmented reality applications.
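To make the detection and pose computation steps concrete, here is a minimal sketch based on OpenCV's ArUco module (available in opencv-contrib); the camera intrinsics, marker side length, and input frame are placeholder assumptions, and the exact aruco function names vary slightly between OpenCV versions.

```python
# Hypothetical sketch of marker-based AR with OpenCV: detect a square ArUco
# marker (a 2D barcode) and estimate its pose from the four detected corners.
# Uses the classic cv2.aruco API (opencv-contrib); intrinsics and marker size
# are placeholder values.
import numpy as np
import cv2

image = cv2.imread("frame.png")                      # assumed input camera frame
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Detect marker corners in image coordinates.
aruco_dict = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
corners, ids, _ = cv2.aruco.detectMarkers(gray, aruco_dict)

# Known 3D corner coordinates of the planar marker in its own frame.
s = 0.05                                             # marker side length in meters (assumed)
marker_3d = np.array([[-s/2,  s/2, 0], [ s/2,  s/2, 0],
                      [ s/2, -s/2, 0], [-s/2, -s/2, 0]], dtype=np.float32)

K = np.array([[800.0, 0.0, 320.0],                   # placeholder camera intrinsics
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
dist = np.zeros(5)

if ids is not None:
    for marker_corners in corners:
        # Pose of the marker relative to the camera: this transform is what
        # anchors virtual 3D content on top of the marker.
        ok, rvec, tvec = cv2.solvePnP(marker_3d, marker_corners.reshape(4, 2), K, dist)
```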

                      References:
                      [1] Siltanen, Sanni. (2012). Theory and applications of marker based augmented reality.
                      [2] Sadeghi-Niaraki, A.; Choi, S.-M. A Survey of Marker-Less Tracking and Registration Techniques for Health & Environmental Applications to
                      Augmented Reality and Ubiquitous Geospatial Information Systems. Sensors 2020


                                                  Visual Place Recognition
                      Supervision: Sebastian Eger (sebastian.eger@tum.de)

A major task for realistic AR/VR applications is to retrieve an accurate pose (position and orientation) estimate for the device (e.g., a smartphone). Most commonly, sensors like cameras and Inertial Measurement Units (IMUs) are used to run Visual-Inertial Odometry (VIO) [1]. Based on the pose estimate of each frame, a dense 3D map of the environment can be built [2]; this procedure is called Simultaneous Localization and Mapping (SLAM).

The AR/VR application can then render and place realistic augmented objects into the scene. However, sometimes a large-scale but sparse map of the environment already exists. If we want to display position-based information, we first need a pose estimate in this global map. Since the global map does not contain very detailed (local) information, the local SLAM can then be initialized at the globally estimated pose; recognizing the currently viewed place in the global map is where visual place recognition comes in.
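At its core, visual place recognition is usually framed as image retrieval: the query image is summarized by a compact global descriptor and matched against the descriptors of the mapped places. The toy sketch below illustrates this idea with a deliberately simple colour-histogram descriptor standing in for learned descriptors such as NetVLAD; all inputs are placeholders.

```python
# Toy sketch of visual place recognition as image retrieval: each map image is
# summarized by a global descriptor and the query is matched to its nearest
# neighbour. The histogram descriptor is a crude placeholder for learned
# descriptors (e.g., NetVLAD-style embeddings).
import numpy as np

def global_descriptor(image: np.ndarray, bins: int = 16) -> np.ndarray:
    """Very crude global descriptor: L2-normalized per-channel intensity histogram."""
    hists = [np.histogram(image[..., c], bins=bins, range=(0, 256))[0]
             for c in range(image.shape[-1])]
    d = np.concatenate(hists).astype(np.float32)
    return d / (np.linalg.norm(d) + 1e-8)

def build_database(map_images):
    """Stack descriptors of all mapped places into one matrix (one row per place)."""
    return np.stack([global_descriptor(img) for img in map_images])

def recognize_place(query_image, database):
    """Return index and similarity of the most similar mapped place."""
    q = global_descriptor(query_image)
    sims = database @ q                     # cosine similarity (descriptors are unit norm)
    best = int(np.argmax(sims))
    return best, float(sims[best])

# Placeholder usage with random "images"; real inputs would be camera frames.
rng = np.random.default_rng(0)
map_imgs = [rng.integers(0, 256, (120, 160, 3), dtype=np.uint8) for _ in range(5)]
db = build_database(map_imgs)
idx, score = recognize_place(map_imgs[2], db)   # should retrieve place 2
```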

                      In this seminar, the student shall research state-of-the-art visual place recognition methods and algorithms. If desired,
                      we can provide indoor image sequences to test out and evaluate different methods. A good starting point for the
                      literature research is: https://paperswithcode.com/task/visual-place-recognition

                      References:
                      [1] Tong Qin, Peiliang Li, and Shaojie Shen, ‘VINS-Mono: A Robust and Versatile Monocular Visual-Inertial State Estimator’, IEEE
                      Transactions on Robotics 34, no. 4 (August 2018): 1004–20
                      [2] Richard A. Newcombe et al., ‘KinectFusion: Real-Time Dense Surface Mapping and Tracking’, in 2011 10th IEEE International
                      Symposium on Mixed and Augmented Reality, 2011, 127–36.


                                               Activity Recognition for
                                          Augmented Reality and Virtual Reality
                      Supervision: Marsil Zakour (marsil.zakour@tum.de) Yuankai Wu (yuankai.wu@tum.de)

AR/VR-based methods are very useful for collecting datasets of human daily-living activities and for predicting human activity. Capturing real daily-living data is critical for learning human activities that can later be transferred to robots. Recent improvements in virtual reality (VR) head-mounted displays provide a viable way of collecting data on human activity without the difficulties often encountered when capturing performance in a physical environment [1].

Furthermore, [2] uses a conventional AR-based method to make predictions about human activity. Another work is driven by the idea of moving from traditional augmented reality (AR) systems, which are typically limited to visualization and tracking components, to augmented reality cognitive systems, which possess or gradually build knowledge about the user's situation and intent [3].

The student should explore state-of-the-art approaches for predicting human activity with augmented or virtual reality by studying the techniques mentioned above, and describe their methodology.

                      References:
                      [1] T. Bates, K. Ramirez-Amaro, T. Inamura and G. Cheng, "On-line simultaneous learning and recognition of everyday activities from virtual
                      reality performances," 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, Canada, 2017, pp. 3510-
                      3515, doi: 10.1109/IROS.2017.8206193.
                      [2] Schröder, Matthias, and Helge Ritter. "Deep learning for action recognition in augmented reality assistance systems." ACM SIGGRAPH 2017
                      Posters. 2017. 1-2.
                      [3] Stricker, Didier, and Gabriele Bleser. "From interactive to adaptive augmented reality." 2012 International Symposium on Ubiquitous Virtual
                      Reality. IEEE, 2012.


                                           Augmented and Virtual Haptics
                      Supervision: Andreas Noll (andreas.noll@tum.de)

For the simulation and augmentation of reality, not only audio-visual perception should be addressed; the tactile sense, for instance, also needs to be stimulated. However, as of today, hardly any consumer products for this purpose are available for VR and AR scenarios.

Hence, in this seminar you should survey devices that are able to produce haptic feedback in the scope of AR/VR. Furthermore, you should explain the different modalities that need to be addressed (force feedback, material properties, etc.), discuss the problems that need to be faced, and present how these devices can be used in an augmented setup or in a virtual environment.

                      References:
                      [1] Shi, Yuxiang, et al. "Self-powered electro-tactile system for virtual tactile experiences." Science Advances 7.6 (2021): eabe2943.
                      [2] Ichikari, Ryosuke, Tenshi Yanagimachi, and Takeshi Kurata. "Augmented reality tactile map with hand gesture recognition." International
                      Conference on Computers Helping People with Special Needs. Springer, Cham, 2016.
                      [3] Kaul, Oliver Beren, and Michael Rohs. "Haptichead: A spherical vibrotactile grid around the head for 3d guidance in virtual and augmented
                      reality." Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems. 2017.


                                    Generation of Realistic Virtual Views
                      Supervision: Martin Piccolrovazzi (martin.piccolrovazzi@tum.de)

Generating realistic virtual views of a scene based on 2D or 3D data is a challenging topic with many possible applications in VR/AR. In recent years, neural rendering has emerged as an active research area, combining machine learning techniques with computer graphics for virtual view generation. In this topic, we review the latest approaches in neural rendering, focusing on different applications and modalities.
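The central mechanism of NeRF-style methods [1] is differentiable volume rendering: a network predicts a density and colour for sample points along each camera ray, and these are composited into a pixel colour. The sketch below shows this compositing step for a single ray, using random placeholder densities and colours instead of the output of a trained network.

```python
# Minimal sketch of NeRF-style volume rendering along one camera ray:
#   C = sum_i T_i * (1 - exp(-sigma_i * delta_i)) * c_i,
#   T_i = exp(-sum_{j<i} sigma_j * delta_j).
# Densities/colours below are random stand-ins for the output of a trained MLP.
import numpy as np

def render_ray(sigmas: np.ndarray, colors: np.ndarray, deltas: np.ndarray) -> np.ndarray:
    """Composite per-sample densities and RGB colours into one pixel colour."""
    alphas = 1.0 - np.exp(-sigmas * deltas)                          # opacity of each segment
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))   # transmittance T_i
    weights = trans * alphas                                         # contribution of each sample
    return (weights[:, None] * colors).sum(axis=0)                   # expected colour along the ray

rng = np.random.default_rng(0)
n_samples = 64
sigmas = rng.uniform(0.0, 2.0, n_samples)       # volume density at each sample point
colors = rng.uniform(0.0, 1.0, (n_samples, 3))  # RGB predicted at each sample point
deltas = np.full(n_samples, 1.0 / n_samples)    # distances between adjacent samples
pixel_rgb = render_ray(sigmas, colors, deltas)
```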

                      References:
                      [1] Mildenhall et al. NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis https://arxiv.org/pdf/2003.08934.pdf
                      [2] Gafni et al. Dynamic Neural Radiance Fields for Monocular 4D Facial Avatar Reconstruction https://arxiv.org/pdf/2012.03065.pdf
                      [3] Wang et al. NeRF−−: Neural Radiance Fields Without Known Camera Parameters https://arxiv.org/pdf/2102.07064.pdf
                      [4] Tewari et al. State of the Art on Neural Rendering https://arxiv.org/pdf/2004.03805.pdf


                                               Video coding optimization
                                          of virtual reality 360-degree Video
                      Supervision: Kai Cui (kai.cui@tum.de)

Virtual reality (VR) creates an immersive experience of the real world in a virtual environment. Due to the technological advancements of recent years, VR technology is growing very fast. Since VR visualizes a real-world experience, the image or video content that is used must capture the characteristics of the full 3D world. 360-degree videos exhibit such characteristics and are therefore used in VR applications. However, this content is not directly suitable for conventional video coding standards, which are designed for 2D video. Therefore, a key focus of 360-degree video compression is finding a proper projection that transforms a 360-degree (spherical) frame into a rectangular planar image that can be compressed efficiently. In this seminar topic, we will investigate the tools and algorithms designed to optimize 360-degree video compression performance in state-of-the-art video coding standards (e.g., HEVC, AV1, VVC) and compare the advantages and disadvantages of the different approaches.
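As a concrete example of such a projection, the sketch below shows the equirectangular mapping, the most widely used way of unwrapping a spherical frame onto a rectangular image before feeding it to a 2D codec; the frame resolution and the sample direction are placeholder assumptions, and practical systems also consider alternatives such as cubemap projections.

```python
# Minimal sketch: equirectangular projection (ERP), the most common mapping of
# a spherical 360-degree frame onto a rectangular planar image prior to coding
# with a conventional 2D codec. Frame size and sample direction are placeholders.
import numpy as np

def sphere_to_equirect(direction: np.ndarray, width: int, height: int):
    """Map a unit viewing direction (x, y, z) to ERP pixel coordinates (u, v)."""
    x, y, z = direction / np.linalg.norm(direction)
    lon = np.arctan2(x, z)          # longitude in [-pi, pi]
    lat = np.arcsin(y)              # latitude in [-pi/2, pi/2]
    u = (lon / (2.0 * np.pi) + 0.5) * width
    v = (0.5 - lat / np.pi) * height
    return u, v

# Example: the direction straight ahead lands in the centre of the ERP frame.
u, v = sphere_to_equirect(np.array([0.0, 0.0, 1.0]), width=3840, height=1920)
```
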
                      References:
                      [1] Zhou, Yimin, Ling Tian, Ce Zhu, Xin Jin, and Yu Sun. "Video coding optimization for virtual reality 360-degree source." IEEE
                      Journal of Selected Topics in Signal Processing 14, no. 1 (2020): 118-129.
                      [2] Xu, Mai, Chen Li, Shanyi Zhang, and Patrick Le Callet. "State-of-the-art in 360 video/image processing: Perception, assessment
                      and compression." IEEE Journal of Selected Topics in Signal Processing 14, no. 1 (2020): 5-26.
                      [3] Wien, Mathias, Jill M. Boyce, Thomas Stockhammer, and Wen-Hsiao Peng. "Standardization status of immersive video coding."
                      IEEE Journal on Emerging and Selected Topics in Circuits and Systems 9, no. 1 (2019): 5-17.
                      [4] Adhuran, Jayasingam, Gosala Kulupana, Chathura Galkandage, and Anil Fernando. "Multiple Quantization Parameter
                      Optimization in Versatile Video Coding for 360° Videos." IEEE Transactions on Consumer Electronics 66, no. 3 (2020): 213-222.
                      [5] Lin, Jian-Liang, Ya-Hsuan Lee, Cheng-Hsuan Shih, Sheng-Yen Lin, Hung-Chih Lin, Shen-Kai Chang, Peng Wang, Lin Liu, and
                      Chi-Cheng Ju. "Efficient projection and coding tools for 360 video." IEEE Journal on Emerging and Selected Topics in Circuits and
                      Systems 9, no. 1 (2019): 84-97.
