Augmented and Virtual Reality - From Daily Life to Assisted Robotics Seminar Media Technology - TUM
Augmented and Virtual Reality: From Daily Life to Assisted Robotics
Seminar Media Technology, SS 2021
Chair of Media Technology, Technical University of Munich
Seminar Topics

1. AR/VR-based Human-Robot Interfaces
2. VR and AR Interfaces for Robot Learning from Demonstration
3. Intuitive Teleoperation using Augmented Reality
4. Improving Teleoperated Driving Using an Augmented Reality Representation
5. Deep Learning for 6D Pose Estimation and its Applications in AR
6. Deformable Object Tracking for AR
7. Marker-based Augmented Reality
8. Visual Place Recognition
9. Activity Recognition for Augmented Reality and Virtual Reality
10. Augmented and Virtual Haptics
11. Generation of Realistic Virtual Views
12. Video Coding Optimization of Virtual Reality 360-degree Video
AR/VR-based Human-Robot Interfaces
Supervision: Furkan Kaynar (furkan.kaynar@tum.de)

Augmented, virtual and mixed reality applications attract interest in many fields, including robotics. The design of a human-robot interface is of great importance, as it determines the capacity and limitations of the interaction between the human and the robot, and the type of human input is defined by the user interface itself. Due to their intuitive nature, AR/VR-based interfaces may improve human demonstrations for robotic tasks; the demonstrations can be provided via virtual tools rendered in a real or simulated scene. AR/VR-based interfaces may also improve the human's understanding of the planned robotic tasks. In this topic, we will investigate the methods and applications of AR/VR-based interfaces for robotic applications, including but not limited to semi-autonomous teleoperation and robot programming.

References:
[1] Gadre, Samir Yitzhak, et al. "End-user robot programming using mixed reality." 2019 International Conference on Robotics and Automation (ICRA). IEEE, 2019.
[2] Rosen, Eric, et al. "Communicating and controlling robot arm motion intent through mixed-reality head-mounted displays." The International Journal of Robotics Research 38.12-13 (2019): 1513-1526.
[3] Walker, Michael E., Hooman Hedayati, and Daniel Szafir. "Robot teleoperation with augmented reality virtual surrogates." 2019 14th ACM/IEEE International Conference on Human-Robot Interaction (HRI). IEEE, 2019.
[4] Makhataeva, Zhanat, and Huseyin Atakan Varol. "Augmented reality for robotics: a review." Robotics 9.2 (2020): 21.
[5] Kent, David, Carl Saldanha, and Sonia Chernova. "Leveraging depth data in remote robot teleoperation interfaces for general object manipulation." The International Journal of Robotics Research 39.1 (2020): 39-53.
VR and AR Interfaces for Robot Learning from Demonstration
Supervision: Basak Guelecyuez (basak.guelecyuez@tum.de)

In the context of robotics and automation, learning from demonstration (LfD) refers to the framework of endowing robots with autonomy in a variety of tasks by making use of demonstrations provided by a human teacher. A classical approach for interacting with the robot during demonstrations is kinesthetic teaching, where the human teacher physically guides the robot. In this topic we would like to investigate novel approaches with VR and AR interfaces for human-robot interaction in LfD. For example, VR interfaces are used to teleoperate a robot when providing demonstrations, and AR interfaces are further employed to assess the learned task and determine failure cases of the robot autonomy (a minimal LfD sketch follows the references below).

References:
[1] M. Diehl, A. Plopski, H. Kato and K. Ramirez-Amaro, "Augmented Reality interface to verify Robot Learning," 2020 29th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), Naples, Italy, 2020, pp. 378-383.
[2] H. Liu, Y. Zhang, W. Si, X. Xie, Y. Zhu and S. Zhu, "Interactive Robot Knowledge Patching Using Augmented Reality," 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, QLD, Australia, 2018, pp. 1947-1954.
[3] Luebbers, M. B., Brooks, C., Kim, M. J., Szafir, D., and Hayes, B. "Augmented reality interface for constrained learning from demonstration." Proceedings of the 2nd International Workshop on Virtual, Augmented and Mixed Reality for HRI (VAM-HRI), 2019.
[4] D. Bambušek, Z. Materna, M. Kapinus, V. Beran and P. Smrž, "Combining Interactive Spatial Augmented Reality with Head-Mounted Display for End-User Collaborative Robot Programming," 2019 28th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), New Delhi, India, 2019, pp. 1-8.
[5] T. Zhang et al., "Deep Imitation Learning for Complex Manipulation Tasks from Virtual Reality Teleoperation," 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, QLD, Australia, 2018, pp. 5628-5635.
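To make the LfD idea concrete, the following is a toy baseline, not the method of any of the cited papers: several end-effector trajectories, recorded for instance with a VR controller during teleoperated demonstrations, are resampled to a common length and averaged into a nominal trajectory, with the per-timestep spread indicating where the demonstrations disagree. All values are placeholders.

# Toy LfD baseline: average several demonstrated trajectories (illustrative only).
import numpy as np

def resample(traj, n=100):
    """Linearly resample a (T, 3) trajectory to n timesteps."""
    t_old = np.linspace(0.0, 1.0, len(traj))
    t_new = np.linspace(0.0, 1.0, n)
    return np.stack([np.interp(t_new, t_old, traj[:, d]) for d in range(traj.shape[1])], axis=1)

def learn_from_demonstrations(demos, n=100):
    """demos: list of (T_i, 3) arrays of demonstrated end-effector positions."""
    aligned = np.stack([resample(d, n) for d in demos])  # (num_demos, n, 3)
    mean_traj = aligned.mean(axis=0)                     # nominal trajectory for reproduction
    std_traj = aligned.std(axis=0)                       # demonstration variability per timestep
    return mean_traj, std_traj

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demos = []
    for _ in range(3):                                   # three noisy demonstrations of one reach
        T = int(rng.integers(80, 120))
        demo = np.linspace([0.0, 0.0, 0.0], [0.4, 0.2, 0.3], T)
        demo += 0.005 * rng.standard_normal(demo.shape)
        demos.append(demo)
    mean_traj, std_traj = learn_from_demonstrations(demos)
    print(mean_traj.shape, float(std_traj.max()))

Approaches in the references, such as Gaussian-mixture encodings or deep imitation learning [5], go far beyond this averaging step, but the sketch shows the basic input/output structure of LfD from teleoperated demonstrations.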
Intuitive Teleoperation using Augmented Reality
Supervision: Edwin Babaians (edwin.babaians@tum.de)

Teleoperation remains a dominant control paradigm for human interaction with robotic systems. However, teleoperation can be quite challenging, especially for novice users. Even experienced users may face difficulties or inefficiencies when operating a robot with unfamiliar and/or complex dynamics, such as industrial manipulators or aerial robots, because teleoperation forces users to focus on low-level aspects of robot control rather than on higher-level goals regarding task completion, data analysis and problem solving. We will explore how advances in augmented reality (AR) may enable the design of novel teleoperation interfaces that increase operation effectiveness, support the user in conducting concurrent work, and decrease stress. In addition, AR could help shorten the learning curve, so that operators become proficient with the teleoperation setup and can perform well after only a short familiarization with the system.

References:
[1] Birkenkampf, Peter, Daniel Leidner, and Christoph Borst. "A knowledge-driven shared autonomy human-robot interface for tablet computers." 2014 IEEE-RAS International Conference on Humanoid Robots, pp. 152-159. IEEE, 2014.
[2] Brizzi, Filippo, Lorenzo Peppoloni, Alessandro Graziano, Erika Di Stefano, Carlo Alberto Avizzano, and Emanuele Ruffaldi. "Effects of augmented reality on the performance of teleoperated industrial assembly tasks in a robotic embodiment." IEEE Transactions on Human-Machine Systems 48, no. 2 (2017): 197-206.
[3] Livatino, Salvatore, Dario C. Guastella, Giovanni Muscato, Vincenzo Rinaldi, Luciano Cantelli, Carmelo D. Melita, Alessandro Caniglia, Riccardo Mazza, and Gianluca Padula. "Intuitive Robot Teleoperation through Multi-Sensor Informed Mixed Reality Visual Aids." IEEE Access 9 (2021): 25795-25808.
[4] Walker, Michael E., Hooman Hedayati, and Daniel Szafir. "Robot teleoperation with augmented reality virtual surrogates." 2019 14th ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 202-210. IEEE, 2019.
[5] Hernández, Juan David, Shlok Sobti, Anthony Sciola, Mark Moll, and Lydia E. Kavraki. "Increasing robot autonomy via motion planning and an augmented reality interface." IEEE Robotics and Automation Letters 5, no. 2 (2020): 1017-1023.
Improving Teleoperated Driving Using an Augmented Reality Representation
Supervision: Markus Hofbauer (markus.hofbauer@tum.de)

Teleoperated driving is a possible fallback to resolve failures of autonomous vehicles. One of the main problems is representing the sensor data transmitted from the autonomous vehicle to the human operator in a way that allows the operator to understand the current traffic situation [1]. While regular displays are often the first choice, the immersive telepresence of head-mounted displays allows for further mixed reality representations that improve the operator's situation awareness [2], [3]. The task of this project is to analyze different augmented/virtual reality approaches for improving the sensor representation in teleoperated driving.

References:
[1] J.-M. Georg and F. Diermeyer, "An Adaptable and Immersive Real Time Interface for Resolving System Limitations of Automated Vehicles with Teleoperation," 2019 IEEE International Conference on Systems, Man and Cybernetics (SMC), Bari, Italy, Oct. 2019, pp. 2659-2664, doi: 10.1109/SMC.2019.8914306.
[2] A. Hosseini and M. Lienkamp, "Enhancing telepresence during the teleoperation of road vehicles using HMD-based mixed reality," 2016 IEEE Intelligent Vehicles Symposium (IV), Gothenburg, Sweden, June 2016, pp. 1366-1373, doi: 10.1109/IVS.2016.7535568.
[3] P. Gomes, C. Olaverri-Monreal, and M. Ferreira, "Making Vehicles Transparent Through V2V Video Streaming," IEEE Transactions on Intelligent Transportation Systems, vol. 13, no. 2, pp. 930-938, June 2012, doi: 10.1109/TITS.2012.2188289.
Deep Learning for 6D Pose Estimation and its Applications in AR
Supervision: Diego Prado (diego.prado@tum.de)

Estimating the 6D pose of an object from camera images can be very valuable for robotic assembly and maintenance applications. Together with augmented reality, step-by-step guidance can be provided to the user when assembling an object consisting of multiple components. In recent years, neural networks have been applied with great success and have proven to provide fast and robust results. The goal of this seminar topic is to study state-of-the-art deep learning techniques for 6D pose estimation and their possible applications in AR (a sketch of how an estimated pose is used for an AR overlay follows the references below).

References:
[1] Su, Y., Rambach, J., Minaskan, N., Lesur, P., Pagani, A., and Stricker, D. "Deep multi-state object pose estimation for augmented reality assembly." ISMAR-Adjunct, 2019.
[2] He, Z., Feng, W., Zhao, X., and Lv, Y. "6D Pose Estimation of Objects: Recent Technologies and Challenges." Applied Sciences, 2021.
[3] Su, Y., Rambach, J., Pagani, A., and Stricker, D. "SynPo-Net—Accurate and Fast CNN-Based 6DoF Object Pose Estimation Using Synthetic Training." Sensors, 2021.
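As a minimal illustration of the AR side of this topic (not taken from the cited papers): once a network has produced a 6D pose, i.e. a rotation R and translation t of the object in the camera frame, rendering guidance reduces to projecting 3D model points into the image with the pinhole model. The intrinsics, pose and cube model below are made-up placeholder values.

# Sketch: use an estimated 6D pose to place an AR overlay (placeholder values).
import numpy as np

def project_points(points_3d, K, R, t):
    """Project (N, 3) object-frame points into (N, 2) pixel coordinates."""
    cam = points_3d @ R.T + t          # object frame -> camera frame
    pix = cam @ K.T                    # apply camera intrinsics
    return pix[:, :2] / pix[:, 2:3]    # perspective division

if __name__ == "__main__":
    K = np.array([[600.0, 0.0, 320.0],
                  [0.0, 600.0, 240.0],
                  [0.0, 0.0, 1.0]])     # placeholder intrinsics
    R = np.eye(3)                       # placeholder pose: object axis-aligned with camera
    t = np.array([0.0, 0.0, 0.5])       # 0.5 m in front of the camera
    # corners of a 4 cm cube model, centered at the object origin
    cube = 0.02 * np.array([[x, y, z] for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)], float)
    print(project_points(cube, K, R, t))  # pixel positions at which to draw the overlay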
Deformable Object Tracking for AR
Supervision: Michael Adam (michael.adam@tum.de)

Before augmenting reality, one needs to understand the environment that should be changed. Understanding static scenes and their objects has become easier with the introduction of deep neural networks. However, when interacting with the scene, objects may change size or even appearance; this is the case with deformable objects. Tracking deformable objects can be tricky, and several techniques have been developed in order to augment them over time. During the seminar you should first make yourself familiar with object tracking in the scope of AR, for instance [1, 2], and then find different techniques for tracking deformable objects. Physics-based simulations [3, 4] as well as methods with specialized setups [5] exist (a toy physics-based deformation model is sketched after the references below).

References:
[1] Park, Youngmin, Vincent Lepetit, and Woontack Woo. "Multiple 3D object tracking for augmented reality." 2008 7th IEEE/ACM International Symposium on Mixed and Augmented Reality. IEEE, 2008.
[2] Tsoli, Aggeliki, and Antonis A. Argyros. "Joint 3D tracking of a deformable object in interaction with a hand." Proceedings of the European Conference on Computer Vision (ECCV). 2018.
[3] Haouchine, Nazim, et al. "Physics-based augmented reality for 3D deformable object." Eurographics Workshop on Virtual Reality Interaction and Physical Simulation. 2012.
[4] Paulus, Christoph J., et al. "Augmented reality during cutting and tearing of deformable objects." 2015 IEEE International Symposium on Mixed and Augmented Reality. IEEE, 2015.
[5] Fujimoto, Yuichiro, et al. "Geometrically-correct projection-based texture mapping onto a deformable object." IEEE Transactions on Visualization and Computer Graphics 20.4 (2014): 540-549.
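The following toy snippet illustrates the physics-based idea behind [3, 4], without claiming to reproduce them: the deformable object is represented as particles connected by springs, and the simulated shape is what the AR content would be warped with. Real systems couple such a model to image measurements; all constants here are arbitrary.

# Toy mass-spring model of a deformable object (illustrative only).
import numpy as np

def mass_spring_step(pos, vel, springs, rest_len, k=50.0, damping=0.98, dt=0.01):
    """One explicit-Euler step for particles (N, 2) linked by springs (M pairs)."""
    forces = np.zeros_like(pos)
    for (i, j), L0 in zip(springs, rest_len):
        d = pos[j] - pos[i]
        L = np.linalg.norm(d) + 1e-9
        f = k * (L - L0) * d / L            # Hooke's law along the spring
        forces[i] += f
        forces[j] -= f
    vel = damping * (vel + dt * forces)     # unit mass per particle
    return pos + dt * vel, vel

if __name__ == "__main__":
    # three particles in a line; the middle one is displaced and relaxes back
    pos = np.array([[0.0, 0.0], [0.5, 0.2], [1.0, 0.0]])
    vel = np.zeros_like(pos)
    springs = [(0, 1), (1, 2)]
    rest_len = [0.5, 0.5]
    for _ in range(200):
        pos, vel = mass_spring_step(pos, vel, springs, rest_len)
    print(pos.round(3))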
Marker-based Augmented Reality
Supervision: Martin Oelsch (martin.Oelsch@tum.de)

Augmented reality (AR) employs computer vision, image processing and computer graphics techniques to merge digital content into the real world. It enables real-time interaction between the user, real objects and virtual objects. AR can, for example, be used to embed 3D graphics into a video in such a way as if the virtual elements were part of the real environment [1]. One of the challenges of AR is to align virtual data with the environment. A marker-based approach solves this problem using visual markers, e.g. 2D barcodes, that are detectable with computer vision methods. Once the markers are detected, the geometry of the currently viewed part of the environment can be estimated by computing the pose of the marker (a minimal detection-and-pose sketch follows the references below). The student is required to give a good overview of marker-based augmented reality and to explain the methodology of marker detection and pose estimation mathematically in the context of augmented reality applications.

References:
[1] Siltanen, Sanni. "Theory and applications of marker based augmented reality." 2012.
[2] Sadeghi-Niaraki, A., and Choi, S.-M. "A Survey of Marker-Less Tracking and Registration Techniques for Health & Environmental Applications to Augmented Reality and Ubiquitous Geospatial Information Systems." Sensors, 2020.
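A minimal sketch of the pipeline described above, assuming opencv-contrib-python with the pre-4.7 ArUco API (newer OpenCV versions expose cv2.aruco.ArucoDetector instead). The camera intrinsics and marker size are placeholder values, and "frame.png" is a hypothetical input image.

# Sketch: ArUco marker detection followed by PnP pose estimation (assumed setup).
import cv2
import numpy as np

K = np.array([[600.0, 0.0, 320.0],
              [0.0, 600.0, 240.0],
              [0.0, 0.0, 1.0]])          # placeholder camera intrinsics
dist = np.zeros(5)                        # assume no lens distortion
MARKER_SIZE = 0.05                        # marker side length in meters (assumed)

# 3D corner positions of the marker in its own coordinate frame
half = MARKER_SIZE / 2.0
obj_pts = np.array([[-half,  half, 0], [ half,  half, 0],
                    [ half, -half, 0], [-half, -half, 0]], dtype=np.float32)

img = cv2.imread("frame.png")             # hypothetical camera frame
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
corners, ids, _ = cv2.aruco.detectMarkers(gray, dictionary)

for marker_corners, marker_id in zip(corners or [], ids if ids is not None else []):
    # PnP: recover rotation and translation of the marker w.r.t. the camera
    ok, rvec, tvec = cv2.solvePnP(obj_pts,
                                  marker_corners.reshape(4, 2).astype(np.float32),
                                  K, dist, flags=cv2.SOLVEPNP_IPPE_SQUARE)
    if ok:
        print(f"marker {int(marker_id)}: t = {tvec.ravel()}")  # pose for placing virtual content

The recovered rotation and translation define the transformation between the marker frame and the camera frame, which is exactly the alignment of virtual data with the environment that the description refers to.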
Visual Place Recognition
Supervision: Sebastian Eger (sebastian.eger@tum.de)

A major task for realistic AR/VR applications is to retrieve an accurate pose (position and orientation) estimate for the device (e.g. a smartphone). Most commonly, sensors like cameras and inertial measurement units (IMUs) are used to run visual-inertial odometry (VIO) [1]. Based on the pose estimate of each frame, a dense 3D map of the environment can be built [2]; this procedure is called simultaneous localization and mapping (SLAM). The AR/VR application can then render and place realistic augmented objects into the scene. However, sometimes a large-scale but sparse map of the environment already exists. If we want to display position-based information, we first need a pose estimate in this global map; since the global map does not contain very detailed (local) information, we can then initialize the local SLAM system at the global pose estimate. In this seminar, the student shall research state-of-the-art visual place recognition methods and algorithms (a retrieval-style sketch follows the references below). If desired, we can provide indoor image sequences to test and evaluate different methods. A good starting point for the literature research is: https://paperswithcode.com/task/visual-place-recognition

References:
[1] Tong Qin, Peiliang Li, and Shaojie Shen. "VINS-Mono: A Robust and Versatile Monocular Visual-Inertial State Estimator." IEEE Transactions on Robotics 34, no. 4 (August 2018): 1004-1020.
[2] Richard A. Newcombe et al. "KinectFusion: Real-Time Dense Surface Mapping and Tracking." 2011 10th IEEE International Symposium on Mixed and Augmented Reality, 2011, 127-136.
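Visual place recognition is commonly posed as image retrieval: every database image in the global map is summarized by a global descriptor (for example from NetVLAD or another learned model), and a query image is matched to its nearest neighbors. The sketch below uses random placeholder vectors in place of such learned embeddings.

# Sketch: place recognition as global-descriptor retrieval (placeholder descriptors).
import numpy as np

def l2_normalize(x, axis=-1):
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + 1e-12)

def retrieve(query_desc, db_descs, top_k=3):
    """Return indices and scores of the top_k database images most similar to the query."""
    sims = l2_normalize(db_descs) @ l2_normalize(query_desc)   # cosine similarity
    order = np.argsort(-sims)[:top_k]
    return order, sims[order]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    db = rng.standard_normal((1000, 256))              # 1000 mapped places, 256-D descriptors
    query = db[42] + 0.1 * rng.standard_normal(256)    # revisit of place 42, slightly changed view
    idx, scores = retrieve(query, db)
    print(idx, scores.round(3))                        # place 42 should rank first

The retrieved place then provides the coarse global pose estimate at which the local SLAM or VIO system can be initialized, as described above.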
Activity Recognition for Augmented Reality and Virtual Reality
Supervision: Marsil Zakour (marsil.zakour@tum.de), Yuankai Wu (yuankai.wu@tum.de)

AR/VR-based methods are very useful for collecting datasets of human daily living and for predicting human activity. Capturing real human daily-living data is critical for learning models of human activity that can later be transferred to robots. Recent improvements in virtual reality (VR) head-mounted displays provide a viable method for collecting data on human activity without the difficulties often encountered when capturing performance in a physical environment [1]. Furthermore, [2] uses a conventional AR-based method to make predictions about human activity. Another line of work is driven by the idea of moving from traditional augmented reality (AR) systems, which are typically limited to visualization and tracking components, to augmented reality cognitive systems, which possess or gradually build knowledge about the user's situation and intent [3]. You need to explore state-of-the-art approaches to predicting human activity using augmented or virtual reality by understanding the techniques mentioned above, and describe their methodology.

References:
[1] T. Bates, K. Ramirez-Amaro, T. Inamura and G. Cheng, "On-line simultaneous learning and recognition of everyday activities from virtual reality performances," 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, Canada, 2017, pp. 3510-3515, doi: 10.1109/IROS.2017.8206193.
[2] Schröder, Matthias, and Helge Ritter. "Deep learning for action recognition in augmented reality assistance systems." ACM SIGGRAPH 2017 Posters. 2017. 1-2.
[3] Stricker, Didier, and Gabriele Bleser. "From interactive to adaptive augmented reality." 2012 International Symposium on Ubiquitous Virtual Reality. IEEE, 2012.
Augmented and Virtual Haptics
Supervision: Andreas Noll (andreas.noll@tum.de)

For the simulation and augmentation of reality, not only audio-visual perception should be addressed; the tactile sense, for instance, also needs to be stimulated. However, as of today hardly any consumer products are available for haptics in VR and AR scenarios. Hence, in the seminar you should search for devices that are able to produce haptic feedback in the scope of AR/VR. Further, you should explain the different modalities that need to be addressed (force feedback, material properties, etc.), discuss which problems need to be solved, and present how the devices can be used in an augmented setup or in a virtual environment (a minimal force-feedback rendering sketch follows the references below).

References:
[1] Shi, Yuxiang, et al. "Self-powered electro-tactile system for virtual tactile experiences." Science Advances 7.6 (2021): eabe2943.
[2] Ichikari, Ryosuke, Tenshi Yanagimachi, and Takeshi Kurata. "Augmented reality tactile map with hand gesture recognition." International Conference on Computers Helping People with Special Needs. Springer, Cham, 2016.
[3] Kaul, Oliver Beren, and Michael Rohs. "HapticHead: A spherical vibrotactile grid around the head for 3D guidance in virtual and augmented reality." Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems. 2017.
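The following is a minimal sketch, not tied to any specific device, of penalty-based force feedback, one of the simplest ways a kinesthetic haptic device renders contact with a virtual surface: when the device's proxy point penetrates the surface, a spring-like force proportional to the penetration depth pushes it back out. Stiffness and geometry values are arbitrary placeholders.

# Sketch: penalty-based force feedback against a virtual plane (illustrative only).
import numpy as np

def contact_force(proxy_pos, plane_point, plane_normal, stiffness=500.0):
    """Force (in N) to command on the haptic device for contact with a virtual plane."""
    n = plane_normal / np.linalg.norm(plane_normal)
    penetration = np.dot(plane_point - proxy_pos, n)    # > 0 when the proxy is below the surface
    if penetration <= 0.0:
        return np.zeros(3)                               # free space: render no force
    return stiffness * penetration * n                   # push the proxy out along the normal

if __name__ == "__main__":
    plane_point = np.array([0.0, 0.0, 0.0])              # virtual table top at z = 0
    plane_normal = np.array([0.0, 0.0, 1.0])
    for z in (0.01, 0.0, -0.005):                        # proxy approaching and entering the table
        f = contact_force(np.array([0.0, 0.0, z]), plane_point, plane_normal)
        print(f"z = {z:+.3f} m -> force {f} N")

Vibrotactile devices such as the grid in [3] use entirely different actuation, but the same question arises of mapping virtual events to a signal the device can render.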
Generation of Realistic Virtual Views
Supervision: Martin Piccolrovazzi (martin.piccolrovazzi@tum.de)

Generating realistic virtual views of a scene based on 2D or 3D data is a challenging topic with many possible applications in VR/AR. In recent years, neural rendering has emerged as an active research area, combining machine learning techniques with computer graphics for virtual view generation. In this topic, we review the latest approaches in neural rendering, focusing on different applications or modalities (the volume-rendering step at the core of [1] is sketched after the references below).

References:
[1] Mildenhall et al. "NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis." https://arxiv.org/pdf/2003.08934.pdf
[2] Gafni et al. "Dynamic Neural Radiance Fields for Monocular 4D Facial Avatar Reconstruction." https://arxiv.org/pdf/2012.03065.pdf
[3] Wang et al. "NeRF−−: Neural Radiance Fields Without Known Camera Parameters." https://arxiv.org/pdf/2102.07064.pdf
[4] Tewari et al. "State of the Art on Neural Rendering." https://arxiv.org/pdf/2004.03805.pdf
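As a pointer to what "combining machine learning with computer graphics" means in practice, here is a sketch of the volume rendering used by NeRF [1]: a network predicts a density sigma and a color c at sample points along each camera ray, and the pixel color is the alpha-composite of those samples. In this standalone snippet the network outputs are replaced by random placeholder values.

# Sketch: NeRF-style volume rendering along one camera ray (placeholder network outputs).
import numpy as np

def composite_ray(sigmas, colors, deltas):
    """sigmas: (S,) densities, colors: (S, 3), deltas: (S,) distances between samples."""
    alphas = 1.0 - np.exp(-sigmas * deltas)                          # opacity of each segment
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas]))[:-1]   # transmittance T_i up to sample i
    weights = trans * alphas                                         # contribution of each sample
    rgb = (weights[:, None] * colors).sum(axis=0)                    # expected color along the ray
    return rgb, weights

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    S = 64                                        # samples per ray
    sigmas = rng.uniform(0.0, 2.0, S)             # stand-in for the MLP's density output
    colors = rng.uniform(0.0, 1.0, (S, 3))        # stand-in for the MLP's color output
    deltas = np.full(S, 4.0 / S)                  # uniform sampling over a 4 m ray
    rgb, weights = composite_ray(sigmas, colors, deltas)
    print(rgb.round(3), weights.sum().round(3))   # weights sum to at most 1

Because this compositing step is differentiable, the scene representation can be optimized directly from posed photographs, which is the key idea the seminar topic builds on.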
Video Coding Optimization of Virtual Reality 360-degree Video
Supervision: Kai Cui (kai.cui@tum.de)

Virtual reality (VR) creates an immersive experience of the real world in a virtual environment, and thanks to the technological advancements of recent years VR technology is growing very fast. Since VR visualizes a real-world experience, the image or video content that is used must represent the characteristics of the whole 3D world. 360-degree videos exhibit such characteristics and are hence used in VR applications. However, this content is not directly suitable for conventional video coding standards, which are designed for 2D video formats. Therefore, the focus of 360-degree video compression is to find a proper projection that transforms a 360-degree frame into a rectangular planar image that achieves a high compression ratio (a sketch of the common equirectangular mapping follows the references below). In this seminar topic, we will investigate the tools and algorithms designed to optimize 360-degree (spherical) video compression performance in state-of-the-art video coding standards (e.g., HEVC, AV1, VVC), and compare the advantages and disadvantages of the different approaches.

References:
[1] Zhou, Yimin, Ling Tian, Ce Zhu, Xin Jin, and Yu Sun. "Video coding optimization for virtual reality 360-degree source." IEEE Journal of Selected Topics in Signal Processing 14, no. 1 (2020): 118-129.
[2] Xu, Mai, Chen Li, Shanyi Zhang, and Patrick Le Callet. "State-of-the-art in 360 video/image processing: Perception, assessment and compression." IEEE Journal of Selected Topics in Signal Processing 14, no. 1 (2020): 5-26.
[3] Wien, Mathias, Jill M. Boyce, Thomas Stockhammer, and Wen-Hsiao Peng. "Standardization status of immersive video coding." IEEE Journal on Emerging and Selected Topics in Circuits and Systems 9, no. 1 (2019): 5-17.
[4] Adhuran, Jayasingam, Gosala Kulupana, Chathura Galkandage, and Anil Fernando. "Multiple Quantization Parameter Optimization in Versatile Video Coding for 360° Videos." IEEE Transactions on Consumer Electronics 66, no. 3 (2020): 213-222.
[5] Lin, Jian-Liang, Ya-Hsuan Lee, Cheng-Hsuan Shih, Sheng-Yen Lin, Hung-Chih Lin, Shen-Kai Chang, Peng Wang, Lin Liu, and Chi-Cheng Ju. "Efficient projection and coding tools for 360 video." IEEE Journal on Emerging and Selected Topics in Circuits and Systems 9, no. 1 (2019): 84-97.
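To make the projection step concrete, here is a sketch of the equirectangular projection (ERP), the most common way to unwrap a spherical 360-degree frame into the rectangular image a 2D codec expects: longitude maps linearly to the horizontal axis and latitude to the vertical axis. Conventions (axis orientation, pixel centers) vary between implementations; this follows one common choice and uses a typical placeholder resolution.

# Sketch: map a viewing direction on the sphere to equirectangular pixel coordinates.
import numpy as np

def direction_to_erp_pixel(d, width, height):
    """Map a 3D viewing direction to (u, v) pixel coordinates in an ERP frame."""
    x, y, z = d / np.linalg.norm(d)
    lon = np.arctan2(x, z)             # longitude in (-pi, pi], 0 at the +z (forward) axis
    lat = np.arcsin(y)                 # latitude in [-pi/2, pi/2], +pi/2 at the north pole
    u = (lon / (2.0 * np.pi) + 0.5) * width
    v = (0.5 - lat / np.pi) * height
    return u, v

if __name__ == "__main__":
    W, H = 3840, 1920                  # typical 2:1 ERP resolution for 360-degree video
    for d in ([0, 0, 1], [1, 0, 0], [0, 1, 0]):   # front, right, straight up
        print(d, "->", direction_to_erp_pixel(np.array(d, float), W, H))

The strong oversampling of ERP near the poles is exactly what motivates the alternative projections and spherically weighted optimization tools studied in [1], [4] and [5].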