Pose Tracking with Wearable and Ambient Devices

Established: November 10, 2016

Summary

Analyzing human motion with high autonomy requires advanced capabilities in sensing, communication, energy management and AI. Wearable systems help us go beyond external cameras, enabling motion analysis in the wild. However, such systems are still only semi-autonomous: they require careful sensor calibration and precise positioning on the body over the course of motion. Moreover, they are burdened with bulky batteries and issues of time synchronization, sensor noise and drift. These restrictions hinder the use of wearable motion analysis in applications that rely on long-term tracking, such as everyday gait analysis, performance measurement in the wild and full-body VR controllers. In this project, we aim to solve the specific problem of achieving autonomy and non-intrusiveness in wearable systems for motion analysis. To do so, we extensively leverage efficient techniques in machine learning and systems design.

Project Overview

To overcome the invasive and cumbersome nature of today’s wearable systems for motion analysis, we are conducting research that eliminates rigid mounting and operational restrictions. Our approach requires advances in electronic circuits for sensor design, machine learning that can adapt to varying noise conditions and movement patterns, and communication technologies that are robust to data losses arising from occlusion and probabilistic signal fading. Our ultimate vision is to analyze human motion with high spatio-temporal accuracy over extended periods of time, using novel ultra-lightweight, non-invasive wearable sensor networks that seamlessly conform to body movements (see figure). We believe that such a system has enormous potential. In the near term, it will serve as a free-style control device that enables 3rd-person views in VR, without causing discomfort to users who would otherwise need to attach dozens of sensors carefully to their body. In the long term, it will allow us to seamlessly track sports performance and analyze human gait by maintaining round-the-clock connectivity to the cloud via a smartphone and other opportunistic networks.

Split image: human poses on the left with tracked poses on the right

Project Details

To achieve the end goal of a highly autonomous wearable system for motion analysis, we have tackled several sub-problems. We have explored the design of flexible batteries and electronic circuits that can harvest energy from ambient sources of radiation, and we have built prototypes using in-house chemical processes to demonstrate the benefits of our approach. We hope these will form the core hardware of our interconnected wearable system. On the software front, we have investigated machine-learning algorithms for motion analysis. Specifically, we have looked at representation-learning methods to eliminate signal artifacts, state-estimation models to conceal packet losses, regression-based DNNs to track human pose within a kinematic chain, and convolutional-recurrent neural-network architectures to accurately detect coarse gestures. We believe that these and other algorithmic building blocks are vital to achieving a fully autonomous wearable system. We have evaluated our algorithms on functional prototypes that we developed internally.

Algorithms

Before tackling our target problem of pose tracking, we looked at detecting gestures as well as recognizing activities. During the course of this research, we have discovered and addressed several challenges.

Detecting gestures: We have applied ML techniques to detect hand gestures. Although we expect our final system to be wearable, we conducted these experiments with a rig designed to operate standalone (see figure). We used MVDR beam-forming to create intensity and depth images from linearly-modulated ultrasound signals. We then employed a CNN-LSTM network to classify gestures into one of five categories used to control an AR device. Overall, we achieved accuracies in the range of 65-97%, depending on how many and which gestures we tried to identify. You can find more details of this work in our ICASSP 17 paper.
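To make the gesture pipeline concrete, here is a minimal sketch (in PyTorch) of a CNN-LSTM classifier operating on a sequence of beamformed intensity/depth frames. The frame resolution, layer widths and five-class output are illustrative assumptions, not the exact architecture from our ICASSP 17 paper.

```python
import torch
import torch.nn as nn

class GestureCNNLSTM(nn.Module):
    """CNN feature extractor per frame, LSTM over time, 5-way gesture classifier.
    Frame size (2 x 32 x 32: intensity + depth) and layer widths are illustrative."""
    def __init__(self, num_classes=5, hidden=128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(2, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.lstm = nn.LSTM(input_size=32 * 8 * 8, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, x):                      # x: (batch, time, 2, 32, 32)
        b, t = x.shape[:2]
        feats = self.cnn(x.flatten(0, 1))      # per-frame features: (b*t, 32, 8, 8)
        feats = feats.flatten(1).view(b, t, -1)
        out, _ = self.lstm(feats)              # temporal modeling: (b, t, hidden)
        return self.head(out[:, -1])           # class logits from the last time step

logits = GestureCNNLSTM()(torch.randn(4, 20, 2, 32, 32))  # 4 clips, 20 frames each
```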

Predicted gestures illustration

Illustration of time-domain layers

Recognizing activities: We found that high signal quality is important for achieving high recognition accuracy. Although this is true in general, it becomes critical when we try to accomplish precise tasks like object tracking (or limb tracking in our case) in 3D space. We have made new attempts at denoising inertial measurement unit (IMU) signals with representation learning, and what we found was quite interesting. As we migrate from sensors that are tightly mounted to those that are loosely integrated into garments, signal artifacts can be significant (SNRs as low as -12 dB). Traditional signal-processing techniques are insufficient to mitigate these artifacts. Architectures like deconvolutional sequence-to-sequence auto-encoders (DSTSAE), however, allow us to model the inherent data-generation process in IMUs and other wearable sensors, helping us eliminate high-complexity artifacts. Through experiments on the OPPORTUNITY activity-recognition dataset, we found that DSTSAE-based de-noising can improve the F1-score for recognizing a small set of activities by 77.1% (as a result of improving SNR from -12 dB to +18.2 dB). Although our method worked well with a small dataset, more investigation is needed to ensure that our approach scales to a larger set of activities and noise types. Take a look at our DATE 12 and BSN 17 papers for more results on signal denoising. It remains to be seen whether denoising with this method helps limb-tracking algorithms, as opposed to detection.
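As a rough illustration of this idea (not the exact DSTSAE from our papers), the sketch below shows a convolutional sequence-to-sequence auto-encoder in PyTorch that maps a window of noisy 3-axis accelerometer samples to a denoised reconstruction. The window length, channel widths and optimizer settings are assumptions made for the example.

```python
import torch
import torch.nn as nn

class DenoisingSeqAE(nn.Module):
    """Conv1d encoder + deconvolutional (ConvTranspose1d) decoder for IMU windows.
    Input/output shape: (batch, 3 axes, 128 samples). Sizes are illustrative only."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(3, 16, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=5, stride=2, padding=2), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(32, 16, kernel_size=5, stride=2, padding=2, output_padding=1), nn.ReLU(),
            nn.ConvTranspose1d(16, 3, kernel_size=5, stride=2, padding=2, output_padding=1),
        )

    def forward(self, noisy):
        return self.decoder(self.encoder(noisy))

# Training step: reconstruct the clean (rigidly mounted) signal from the noisy (garment) one.
model, loss_fn = DenoisingSeqAE(), nn.MSELoss()
optim = torch.optim.Adam(model.parameters(), lr=1e-3)
noisy, clean = torch.randn(8, 3, 128), torch.randn(8, 3, 128)  # stand-ins for real windows
loss = loss_fn(model(noisy), clean)
loss.backward()
optim.step()
```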

Tracking pose: We have also developed an initial set of algorithms to robustly track pose when wearable sensors are integrated into garments. This was quite a challenging feat. Initially, we attempted to employ traditional kinematic algorithms after fusing sensor data from a dense network of IMUs. However, the tracking performance of such algorithms suffered heavily due to several factors. We therefore made our first attempt at using machine learning for this problem. We discovered that with a very simple DNN (including an informational context window), we were able to lower tracking errors by up to 69% (see figure below). See our ICRA 18 paper for more details.

Illustration showing DNN reducing tracking errors
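The sketch below illustrates the general idea of a regression DNN with a temporal context window: the network sees the current IMU frame plus a few past frames and regresses joint angles. The window length, feature dimensions and layer widths are assumptions for illustration, not the exact network from our ICRA 18 paper.

```python
import torch
import torch.nn as nn

CONTEXT = 5         # number of IMU frames in the context window (assumed)
IMU_FEATS = 9 * 10  # 10 IMUs x 9 channels (accel, gyro, mag) per frame (assumed)
JOINTS = 20         # number of regressed joint angles (assumed)

# Simple fully connected regressor over a flattened context window of IMU frames.
pose_net = nn.Sequential(
    nn.Linear(CONTEXT * IMU_FEATS, 256), nn.ReLU(),
    nn.Linear(256, 128), nn.ReLU(),
    nn.Linear(128, JOINTS),
)

def predict_pose(imu_history: torch.Tensor) -> torch.Tensor:
    """imu_history: (batch, CONTEXT, IMU_FEATS) -> joint angles (batch, JOINTS)."""
    return pose_net(imu_history.flatten(1))

angles = predict_pose(torch.randn(2, CONTEXT, IMU_FEATS))
```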

However, there was a catch. Although the network worked very well when tracking poses that were in the training database, it suffered heavily when presented with novel poses that were absent from the training set, which clearly showed the limited generalizability of this model. We are therefore exploring new network architectures that can better capture human-motion patterns, including unsupervised- and reinforcement-learning techniques. Another issue we noticed is the impact of packet losses in a wearable sensing system: long sequences of packet losses led to poorer tracking results overall, and we are developing techniques to tackle this issue. Besides sensor noise, motion artifacts and packet losses, there are several other sources of signal error that impact the performance of our system, and we are tackling them one at a time. Eventually, we intend to develop an array of techniques that can be employed to realize our vision of a fully autonomous wearable system for motion analysis. You can read about some of our work in this direction in our BSN 18 paper. Look out for more upcoming papers on this work.
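As one concrete (and deliberately simplified) example of the state-estimation approach to concealing packet losses mentioned earlier, the sketch below runs a constant-velocity Kalman filter over a single sensor channel and simply skips the measurement update whenever a packet is missing. The process and measurement noise values are arbitrary assumptions, not tuned parameters from our system.

```python
import numpy as np

def conceal_losses(samples, dt=0.01, q=1e-3, r=1e-2):
    """Fill gaps (None entries) in one sensor channel with a constant-velocity
    Kalman filter. q and r are assumed process/measurement noise variances."""
    F = np.array([[1.0, dt], [0.0, 1.0]])   # state transition for [value, rate]
    H = np.array([[1.0, 0.0]])              # we observe only the value
    Q, R = q * np.eye(2), np.array([[r]])
    x, P = np.zeros((2, 1)), np.eye(2)
    out = []
    for z in samples:
        x = F @ x                            # time update (prediction)
        P = F @ P @ F.T + Q
        if z is not None:                    # measurement update only if the packet arrived
            y = np.array([[z]]) - H @ x
            S = H @ P @ H.T + R
            K = P @ H.T @ np.linalg.inv(S)
            x = x + K @ y
            P = (np.eye(2) - K @ H) @ P
        out.append(float(x[0, 0]))
    return out

smoothed = conceal_losses([0.0, 0.1, None, None, 0.42, 0.5])
```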

Systems and hardware

As with our algorithmic work, we have made preliminary advances in system design. Our first technology component is flexible circuits: we have built flexible batteries and energy-harvesting circuits that would be useful for our wearable system. Separately, we have also developed BLE-connected and WiFi-connected wearable sensor networks with commodity hardware that allowed us to conduct our algorithmic research.

photo of circuits

Flexible batteries and circuits: We have developed flexible Zn-MnO2/Prussian Blue batteries in-house that can be charged via an ambient-energy harvesting circuit (see figure). Our first prototype in this effort, called RadioTroph, was a self-sustaining flex tag: it could harvest energy from ambient radiation and sunlight, store it in a tiny battery and deliver power to other electronic circuitry as needed. Although our prototype worked reasonably well, we face several challenges pertaining to charge retention in the battery and achieving high-quality resonance with the harvesting antennas. We intend to investigate these issues in future iterations.

System v 1.0 (BLE-based): Our very first hardware system was in fact based on rigid sensor mounts. We then extended it to a mobile version that employed commodity hardware for sensing, processing and communication. To do supervised machine learning, we needed simultaneous measurements from moving and non-moving sensors on the body, so we collected data from the rigid and mobile systems at the same time in subject trials (see figure). We used straps on a compression shirt and pants to collect IMU data. The bottom and front sides of one strap are shown in the lower part of the figure. The strap comprised 4 IMU sensors (for calibration, alignment and redundancy): 3 LSM9DS0 IMUs and 1 MPU-9150 IMU. They were connected to a TCA9548A I2C switch, which was in turn connected to an ATmega32u4 processor board clocked at 16 MHz. The micro-controller also recorded analog signals from 2 surface EMG sensors (used to verify body contact). The recorded signals were sent over a UART interface to an nRF8001 BLE radio. All of these sensors were sewn onto the Velcro strap with conductive-fabric thread. The architectural block diagram of the sensor platforms is also shown in the figure with grayed-out components. We used the data collected from this system to develop algorithms that remove motion artifacts. You can read more details about this in our BSN 17 paper.
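For reference, the snippet below sketches how a host can poll multiple IMUs behind a TCA9548A-style I2C multiplexer: select one channel by writing a single bit-mask byte to the mux, then talk to the IMU on that channel. It is written in Python with the smbus2 library purely for illustration; on our strap this logic ran in the ATmega32u4 firmware, and the IMU address and register constant shown here are placeholders, not the exact LSM9DS0 configuration we used.

```python
from smbus2 import SMBus

MUX_ADDR = 0x70    # default TCA9548A address
IMU_ADDR = 0x1D    # example IMU address behind each mux channel (assumed)
ACCEL_REG = 0x28   # placeholder register for the first accelerometer byte (assumed)

def read_all_imus(bus: SMBus, channels=(0, 1, 2, 3)):
    """Select each mux channel in turn and read 6 raw accelerometer bytes."""
    frames = {}
    for ch in channels:
        bus.write_byte(MUX_ADDR, 1 << ch)                       # enable only channel `ch`
        frames[ch] = bus.read_i2c_block_data(IMU_ADDR, ACCEL_REG, 6)
    bus.write_byte(MUX_ADDR, 0x00)                              # disable all channels
    return frames

with SMBus(1) as bus:   # e.g. /dev/i2c-1 on a Linux host
    print(read_all_imus(bus))
```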

Figure showing rigid and mobile systems in subject trials

System v 1.1 (WiFi-based): We have continued to improve our system beyond simple components on flex PCBs and initial versions of the hardware. We have built a sophisticated signal-processing engine that aggregates data samples over an adaptive wireless network, then cleans, interpolates, re-samples and efficiently stores them for processing.
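A toy version of the re-sampling step in that engine is sketched below: samples arrive with irregular timestamps (and occasional gaps) from the wireless network and are linearly interpolated onto a uniform time grid before storage. The target rate and the choice of linear interpolation are assumptions for illustration only.

```python
import numpy as np

def resample_uniform(timestamps, values, rate_hz=200.0):
    """Interpolate irregularly timed samples of one channel onto a uniform grid.
    timestamps: seconds, monotonically increasing; values: same length."""
    t = np.asarray(timestamps, dtype=float)
    v = np.asarray(values, dtype=float)
    grid = np.arange(t[0], t[-1], 1.0 / rate_hz)   # uniform time base
    return grid, np.interp(grid, t, v)             # linear interpolation across gaps

# Example: jittery ~100 Hz arrivals with a dropped burst, regularized to 200 Hz.
t = np.array([0.000, 0.011, 0.019, 0.031, 0.082, 0.090, 0.101])
x = np.array([0.0,   0.2,   0.1,  -0.1,   0.4,   0.5,   0.45])
grid, x_uniform = resample_uniform(t, x)
```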

Figure of refined rigid and mobile systems in subject trials

The second iteration of our system (see figure) was a little more refined. It comprised a dense interconnected sensor network of 38 IMUs, with at least two IMUs associated with each body segment. Infrared (IR) sensors were placed between the arms and torso and between the two legs, and each hand and foot also had one IR sensor. These sensors complemented the IMUs by measuring the distance between body parts from time-of-flight proximity readings. Ultrasound sensors that could operate over extended periods of time were also integrated. Although we have not utilized the ultrasound data so far, we believe it will be useful when we tackle issues like sensor drift and position tracking in the future. The sensors were synchronized and connected over a high-bandwidth 802.11ac WiFi network. Within the network, multiple CPUs sampled data from the sensors at rates of up to 760 Hz and streamed them to a base station at speeds of up to 27 Mbps (approx. 1600-byte UDP broadcast payloads at 90% 802.11 PHY rate). At the base station, we processed this data to track body joints in free space. Simultaneously, we recorded depth video from 2 calibrated Kinect sensors and fused them to track pose. We have archived the segmented and synchronized data from the wearable and Kinect sensors, along with the corresponding RGB video. To encourage future research, we have released this unique and sensor-rich data set to the public (MIMC 17). You can find more information about the system in our BIOROB 18 paper.
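For illustration, the sketch below shows the kind of UDP broadcast streaming described above: a sender packs a batch of timestamped IMU samples into a roughly 1600-byte datagram and broadcasts it toward the base station. The packet layout, port number and sample format are assumptions made for this example, not the actual wire protocol of our v1.1 system.

```python
import socket
import struct
import time

PORT = 9000                     # assumed base-station port
SAMPLE_FMT = "<Id6f"            # sensor id, timestamp, 3-axis accel + 3-axis gyro (assumed layout)
SAMPLES_PER_PACKET = 1600 // struct.calcsize(SAMPLE_FMT)   # keep datagrams near 1600 bytes

def send_batch(sock, samples):
    """samples: iterable of (sensor_id, timestamp, ax, ay, az, gx, gy, gz)."""
    payload = b"".join(struct.pack(SAMPLE_FMT, s[0], s[1], *s[2:]) for s in samples)
    sock.sendto(payload, ("255.255.255.255", PORT))

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)

batch = [(i, time.time(), 0.0, 0.0, 9.8, 0.0, 0.0, 0.0) for i in range(SAMPLES_PER_PACKET)]
send_batch(sock, batch)
```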

Going Forward

For our vision to become a reality, several technology pieces have to come together. We have tackled some basic algorithmic and system-design issues so far. However, much remains to be done in the integration of devices with the body, improving robustness against moisture and heat, compensating for signal occlusions, maintaining persistent and scalable connectivity, distributed machine learning, data compression, in-network processing, sustained time synchronization, etc. As we conduct research in this project, we recognize that there are smaller technical wins to be gained along the way; for instance, we hope to transfer some of our algorithmic knowledge to hand-tracking and gesture-recognition problems in AR/VR devices over the short term. Once we have reasonably solidified our wearable technology, we intend to build more ambitious demos for novel end-to-end scenarios like VR games with a 3rd-person view and articulated pose tracking, personal activity trackers and posture-recognition systems.