A Light Touch

Introducing Popcorn, the sensor behind TR01’s sense of touch.

Astute readers of our first blog post, where we introduced the TR01 hand, will have noticed the strange device that our fingertips are mounted on. Many will probably also have guessed its role: yes, it’s a touch sensor.

Many papers have been written about how the human sense of touch works and how important it is to our own dexterity. Just as many debate whether touch sensing is equally important to dexterous robots, and, if so, how to achieve it. Without claiming to have settled these debates, here are our own answers: yes, we believe that touch sensing will be massively beneficial to achieving skilled manipulation, and we need the sensors to be sensitive, rich in information, robust, practical to integrate and use, and to not break the bank.

The Popcorn sensor — named for what a good sensor should be: cheap and small.

Introducing Popcorn, Our First Touch Sensor

The Popcorn is the first touch sensor we have integrated into our robot hands. It is designed to sit at the base of the fingertip and provide information about the net contact forces that the fingertip experiences. In that sense, we refer to it not as a traditional “skin” but rather as a touch sensor. It has the following key characteristics:

  • Highly sensitive: in realistic settings, subject to noise, wear-and-tear, hysteresis, and sensor-to-sensor variability, the Popcorn can still reliably detect forces as low as 0.1 N.
  • Rich in information: the Popcorn provides a set of 14 raw signals that can map onto the full combination of force and torque experienced by the fingertip.
  • Robust: we have observed the Popcorn to be highly robust to overload forces, which are easily experienced while learning to manipulate. The Popcorn also separates sensing electronics from the part of the finger that makes contact with the world, allowing the latter to be built from any durable materials we choose, and easily replaced.
  • Easy to integrate in hands: the Popcorn is compact and easy to use, providing a direct digital readout with no external amplification electronics. It also places no restrictions on the shape of the fingertip, which can thus be adapted to the task as needed.
  • Low latency, high frequency: we read the Popcorn at 250 Hz, without the latency typically associated with camera-based tactile sensors.
  • Easy to manufacture and low-cost: the Popcorn only uses commonly available and low-cost components, and can be manufactured without any complex equipment, based on externally sourced PCBs and flexures.

Basically, while transduction methods are interesting in and of themselves, we are manipulation people: we care about all the unsung aspects that make a touch sensor actually practical for integration into a hand and usable at scale for manipulation. The Popcorn checks all those boxes for us. In many manipulation labs, tactile sensors are precious, delicate things that are a pain to build into your hands and that you think twice before using lest they break. In contrast, we believe that touch sensors should be as easily available to builders of robot hands as, well, popcorn.

How it Works: Light as a Touch Transducer

The first fundamental choice we made was to use light sensors inside a robot finger to sense displacement of a soft element due to contact, and then equate that displacement to a touch signal. We do this because light is a phenomenal transduction mechanism: it provides exceptional signal-to-noise ratio with much simpler manufacturing and cheaper components than alternative methods.

The next key design choice, leading directly to the Popcorn, was that the first embodiment of this technology we would integrate in our hands would not be a “skin” covering the entire fingertip, but rather a sensor sitting at the base of the fingertip, giving us the overall contact forces applied by it.

This approach has obvious advantages. Fully sensorized fingertips tend to be bulky as they need to accommodate the transduction mechanism. Furthermore, once a fully sensorized fingertip is designed, its shape is hard to change. In contrast, the Popcorn can accept a fingertip of any size and shape, including slender ones for manipulating small parts. A damaged fingertip costs pennies to replace, since it contains no electronics.

There are clear parallels here to a different class of sensor that has been quietly doing the job for decades in industrial robotics: the good old six-axis force/torque sensor. In fact, there is previous work in robotic manipulation using F/T sensors embedded at the base of the fingertip. However, unlike industrial F/T sensors, which work extremely well but are designed for factory arms — large, expensive, with bulky amplification electronics — ours is meant for robot hands: compact, low-cost, robust to overload, with direct digital readout and no external amplification.

The optical signals from our sensor are not strictly forces, but a representation of the induced displacement. A learned model, trained on labeled data collected with a reference force/torque instrument, can map these raw readings to a full 6-axis force and torque reading, or to contact location and normal force — representations that are physically grounded, easy to reason about, and natively compatible with any modern physics simulator. (That last point matters: when training learned manipulation policies, the consistency between simulated and real sensor data can have a large effect on how well sim-to-real transfer works.) However, we still expect to train end-to-end visuomotor policies on the raw sensor data, not a learned intermediate representation.
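As a concrete (and heavily simplified) sketch of such a calibration step, the snippet below fits a linear ridge-regression map from the 14 raw signals to a 6-axis wrench on synthetic stand-in data. The linear model, the synthetic dataset, and all names here are our illustrative assumptions, not the actual learned model; a real pipeline would train on data labeled by a reference force/torque instrument.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a calibration dataset: N samples of the 14 raw
# signals X, labeled with reference wrenches Y = [Fx, Fy, Fz, Tx, Ty, Tz].
# In practice Y would come from a reference force/torque instrument.
N = 2000
true_W = rng.normal(size=(14, 6))                # hidden "ground truth" map
X = rng.normal(size=(N, 14))                     # raw optical signals
Y = X @ true_W + 0.01 * rng.normal(size=(N, 6))  # labeled wrenches + noise

# Ridge regression: W = (X^T X + lambda I)^-1 X^T Y
lam = 1e-3
W = np.linalg.solve(X.T @ X + lam * np.eye(14), X.T @ Y)

def signals_to_wrench(raw):
    """Map one 14-vector of raw optical signals to a 6-axis wrench."""
    return np.asarray(raw) @ W

# Mean absolute error should sit near the label noise floor.
err = np.abs(X @ W - Y).mean()
```

A learned nonlinear model (e.g. a small MLP) would follow the same recipe: raw signals in, reference wrench out.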

This does not mean we have given up on the concept of tactile skin with full fingertip coverage. In the future, we might still experiment with fully sensorized fingertips, which do provide their own advantages over F/T sensors sitting at the fingertip base, such as information on contact area or contact pressure distribution. That said, the Popcorn gives us a lot of what we need to work with, for a fraction of the cost and integration effort.

What’s Next: Autonomous Visuotactile Manipulation

Force sensing alone does not make a robot hand dexterous, but we believe it is an important piece of the puzzle. Modern visuotactile policies, which combine visual perception with contact force information from each finger, are beginning to unlock manipulation capabilities that were utterly out of reach just a few years ago. This is one of those cases where getting the hardware and sensors right quietly determines how far the software can go.

We are collecting teleoperation data of complete manipulation tasks with touch sensing included. Stay tuned for upcoming posts on how this data will help close the loop on autonomous visuotactile manipulation.

Further Reading

Background and History

The Popcorn is based on an optical transduction approach for tactile sensing first developed at Columbia University, where members of our team built a robotic fingertip capable of sub-millimeter contact localization over complex, multicurved surfaces, and then showed, for the first time, that it enables in-hand object manipulation without any cameras, based exclusively on touch and proprioception. Our academic prototype of a dexterous hand integrating this technology was named one of Time Magazine’s Best Inventions of 2023. It then led directly to the very first Popcorn, which applied this transduction method for net contact force detection at the base of the fingertip. While the current version of the sensor is improved in many ways, a lot of the original DNA remains.

Technical Details

The Popcorn is an optical sensor consisting of two rigid plates — a top and a bottom — each carrying a custom PCB populated with infrared LEDs, separated by a small gap and connected by carefully designed flexures. Some LEDs act as emitters; others, on the opposite plate, act as receivers. At rest, each emitter illuminates its facing receivers with a known baseline intensity. When a contact force is applied to the finger (nominally connected to the top plate), the two plates shift relative to one another, changing the light intensity each receiver sees. The sensor outputs 14 raw optical signals, streaming continuously at 250 Hz.
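A minimal sketch of what polling such a sensor could look like, assuming a hypothetical `read_raw_signals()` driver call (the actual readout interface is not described here, so this is purely illustrative): tare against a resting baseline, then sample the 14 channels at a fixed 250 Hz.

```python
import time
import numpy as np

def read_raw_signals():
    """Hypothetical driver call: returns the 14 raw optical channel
    intensities as a vector. Stubbed out with zeros here."""
    return np.zeros(14)

def capture(duration_s=1.0, rate_hz=250.0):
    """Tare against a resting baseline, then poll at a fixed rate."""
    baseline = read_raw_signals()          # intensities with no contact
    period = 1.0 / rate_hz
    samples = []
    t_next = time.monotonic()
    while len(samples) < duration_s * rate_hz:
        samples.append(read_raw_signals() - baseline)  # deviation from rest
        t_next += period
        # Sleep until the next sample slot to hold the rate steady.
        time.sleep(max(0.0, t_next - time.monotonic()))
    return np.array(samples)
```

The returned array has one row per sample and one column per optical channel, ready to feed a calibration model.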

A key design choice — and a somewhat surprising one — is using LEDs as both emitters and receivers. LEDs used as photodetectors exhibit significantly higher signal-to-noise ratio than conventional photodiodes in this application, and their narrower viewing angle turns out to be a feature rather than a limitation: it makes the signals more sensitive to small lateral displacements, exactly what we want. In ideal conditions, this allows the sensor to detect relative plate displacements on the order of 10 microns.

We chose LEDs that work in the IR spectrum in order to minimize interference from ambient light. (LEDs are exceptional photodetectors only in the same band that they emit in.) Modern indoor lighting has virtually no IR component, and, indeed, we have observed very little interference. However, daylight can still have IR components, and small amounts of the light emitted by the sensor itself can also bounce back off the environment and show up as unwanted signal. To counter both phenomena, we wrap the sensor in a flexible outer shell made of TPU plastic, which we have observed to have excellent IR absorption. While the Popcorn still works well without it, this outer shell virtually eliminates drift on longer timescales.

The mechanical interface between the two plates is critical. We need this interface to exhibit just the right level of compliance: too soft and it will introduce parasitic effects in high-speed manipulation, too stiff and it will decrease sensitivity. We also want as little hysteresis as possible, at both short time scales (seconds, touch-to-touch) and long time scales (hours or days). In fact, hysteresis at various time scales is often the secret weakness of touch sensors, and we put a lot of effort into getting a handle on it. The result is a mechanical interface based on precision titanium flexures, which significantly simplifies fabrication, reduces unit-to-unit variability, and produces more consistent sensor behavior.

As the sensor sits at the base of the fingertip structure rather than within the contact surface itself, it does not constrain fingertip design. The material, geometry, and stiffness of the fingertip can be chosen freely for different use cases — a compliant, high-friction surface for delicate grasps, or a stiffer, more geometrically precise tip for tasks that require it. For now, we went with a multi-material fingertip, with a 3D-printed TPU “skeleton” covered in a molded silicone rubber “flesh”. Adding a second layer of skin-level tactile sensing — for contact localization or pressure mapping — remains an option that this architecture leaves open.

Limitations

The Popcorn checks many of our boxes in terms of both performance and practicality for manipulation, and we are counting on it for our first examples of dexterity. However, areas of improvement remain. The “skin” approach has its own advantages, in terms of being able to identify complex contact areas or pressure distributions, which the Popcorn cannot do. The Popcorn is certainly compact enough for integration into human-size fingers, but it still has some size, and smaller is always better in this context. Finally, while our flexures do a good job of increasing repeatability, some amount of load/unload hysteresis and baseline shifts (especially after applying large forces) can still be observed; we have found that an auto-taring routine is effective at re-zeroing the sensor after contact.
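One possible shape for such an auto-taring routine (an assumption on our part; the post does not specify the actual scheme) is to watch for a quiescent window in the raw signals and adopt its mean as the new zero:

```python
import numpy as np

class AutoTare:
    """Sketch of an auto-taring scheme: whenever all 14 raw channels stay
    quiescent for a full window, adopt their mean as the new zero,
    absorbing baseline shifts that can remain after large loads."""

    def __init__(self, window=50, quiet_threshold=0.02):
        self.window = window                    # samples that must be quiet
        self.quiet_threshold = quiet_threshold  # max per-channel std dev
        self.baseline = np.zeros(14)
        self.buffer = []

    def update(self, raw):
        """Feed one 14-channel sample; return the tared (re-zeroed) signal."""
        raw = np.asarray(raw, dtype=float)
        self.buffer.append(raw)
        if len(self.buffer) > self.window:
            self.buffer.pop(0)
        if len(self.buffer) == self.window:
            buf = np.stack(self.buffer)
            # Quiescent: every channel's spread stays below the threshold,
            # which we take to mean "no contact" and safe to re-zero.
            if np.all(buf.std(axis=0) < self.quiet_threshold):
                self.baseline = buf.mean(axis=0)
        return raw - self.baseline
```

The window length and threshold trade off how quickly a post-overload shift is absorbed against the risk of taring away a genuine sustained contact.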

We Don’t Copy Human Hands — We Capture What Makes Them Work

Introducing the kinematic design and teleoperation stack for TR01 — Our first robot hand

For truly dexterous manipulation, such as threading a nut, peeling a film, or inserting a connector, precision makes a big difference. The human hand can perform subtle, minute adjustments effortlessly, but getting a robot to do the same is one of the hardest open problems in robotics. We built the TR01 hand model to capture some of these capabilities.

TR01 is our first robot hand prototype, and, to the best of our knowledge, the first robot hand capable of exactly tracking human fingertip positions relative to each other across a meaningful manipulation workspace. This is not a narrow claim — it enables highly dexterous teleoperation, and gives us new ways to teach a robot how to be dexterous.


The Problem with “Just Copy the Human Hand”

Teaching a robot to manipulate is, at its core, a data problem, thanks to recent developments in robot learning. You need numerous examples of the robot doing the desired task. One of the most natural and popular ways to generate that data is teleoperation, and, in particular, using the human hand as an input device: the human operator moves their hand, the robot hand follows, and we record observations and actions. The collected data can then feed learning algorithms.

That brings us to a key design question: how do you build a robot hand that a human can control intuitively and precisely to demonstrate dexterous tasks?

The most naive approach to building a dexterous robot hand is to try to replicate human anatomy as closely as possible: same joint axes, same finger proportions, same range of motion. If the robot hand is a perfect copy, controlling it should feel natural, and the robot should be able to follow any movement of the operator's fingertips.

The problem is that such a perfect replica is physically extremely difficult to achieve. Motors have finite sizes, electronics need to fit somewhere, and structural parts cannot shrink past a point without losing strength. Making things worse, the anatomy of the human hand is complex, incompletely understood, and not based on engineering abstractions such as perfect revolute joints. No real-world robot hand can be a perfect replica of human anatomy, and even small approximations mean that the robot is incapable of perfectly tracking the human. When the mapping between your hand and the robot hand is imperfect, what should have been a clean pinch becomes a near-miss, and, for fine manipulation, such small errors are often the difference between success and failure.

This led us to ask a more basic question: what are we actually trying to replicate?

The Essence of Dexterity Comes from Fingertips

While the human hand has many important features and capabilities, for TR01 we focused on fingertip positioning ability as a key precursor of dexterity. Since we knew that it would be near-impossible to build a robot hand that can exactly replicate human fingertip movements, we instead set a different goal: designing a two-fingered robotic hand that can achieve any relative position of its fingertips, within a workspace large enough for useful tasks. In other words, the two fingertips should be able to achieve ANY position and orientation relative to each other — the complete six-dimensional space of position and rotation, within a given workspace. If successful, this would implicitly mean that the TR01 would be able to track any relative movement of the human operator's fingertips.

Interestingly, to achieve that, the TR01 does not have to exactly match human kinematics, and does not need to look like a human hand on the inside. It just needs its fingertips to be able to go exactly where yours go in teleoperation, with high precision.

This kinematic design means that the TR01 also has some super-human skills, in terms of being able to achieve finger-to-finger positions that are impossible for humans. When the hand is used purely in teleoperation (or imitation learning), such super-human abilities are not used, but we can imagine future policies that optimize performance on-robot to take advantage of them.

What This Enables

Precise fingertip tracking can turn dexterous teleoperation from an approximation into a true skill transfer system. We have found that, with this system, operators can demonstrate the fine, nuanced manipulations that are the hardest to teach, and collect high-quality demonstration data, which is the raw material for upcoming learning-based dexterous autonomy.

Since this is a robot hand designed from the ground up for skilled fingertip manipulation, we also put some work into the design and manufacturing of the fingertips themselves. We use a multi-material multi-process approach: the fingertips have a “skeleton” 3D-printed in Thermoplastic Polyurethane which provides some measure of compliance, coated in a “flesh” made from molded silicone rubber which is much softer and gives us large contact areas. We have found empirically that this approach gives us the right combination of compliance and durability in our early tests, though there is likely additional performance to be gained from further design iterations.

TR01 is Tangent Robotics' first hand, not our last. But its DNA — highly dexterous skills via exact fingertip tracking — is likely to carry forward into the hands we build next. We will be designing more complex and capable hands (potentially with more fingers) in the near future, but TR01 already gives us a new benchmark for dexterity.

Coming Up

This post focused on TR01's conceptual approach, kinematic design and teleoperation. Stay tuned for more on tactile sensing and autonomous visuotactile manipulation skills!

Technical Details

Concept. Consider the task space of your index fingertip relative to the thumb fingertip: the six-dimensional SE(3) space of relative position and orientation. Despite the complexity of the human hand's anatomy, human fingers trace out a specific subspace, or manifold, of this space when performing fine manipulation tasks. A robot hand with different kinematics will naturally inhabit a different manifold of relative fingertip poses. The only way to bridge that gap reliably is to design the robot hand to mechanically span the full SE(3) space of one fingertip pose relative to the other, which implicitly means it will also be able to follow the human manifold at runtime.

Kinematic design. Each TR01 finger has four independently actuated degrees of freedom (DoF), in a roll-flex-yaw-flex configuration. The kinematic chain between the two fingertips thus spans eight DoFs, exceeding the theoretical minimum of six required for full SE(3) coverage. We found that the redundant DoFs improve conditioning near joint limits and also provide a two-dimensional manipulability nullspace useful for in-hand manipulation. One of these null-space dimensions (moving the grasped object front-and-back) is more human-like, while the other (moving it side-to-side) is less so, but still potentially useful.
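The dimension counting above can be checked numerically: a generic full-rank 6×8 task Jacobian leaves a two-dimensional nullspace, and joint velocities inside it produce no task-space motion. The Jacobian below is random, purely for illustration, and is not the TR01 kinematics.

```python
import numpy as np

rng = np.random.default_rng(0)
J = rng.normal(size=(6, 8))      # generic task Jacobian (illustrative only)

# Nullspace dimension = number of joints minus the Jacobian's rank.
_, s, Vt = np.linalg.svd(J)
rank = int(np.sum(s > 1e-10))
null_dim = J.shape[1] - rank     # generically 8 - 6 = 2
null_basis = Vt[rank:]           # rows spanning the nullspace

# Joint velocities inside the nullspace leave the relative fingertip
# pose unchanged (zero task-space velocity).
dq = null_basis[0]
task_vel = J @ dq
```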

Teleoperation stack. Fingertip poses are measured by magnetic trackers (sub-mm accuracy, high-frequency readings up to 900 Hz). At each timestep, we solve fingertip-relative Inverse Kinematics (IK), computing the 8-DoF joint configuration that achieves the target relative fingertip pose. We use IK in a slightly unconventional way: one fingertip is considered the base (reference) frame, and the other fingertip is the target frame. The IK solution thus positions the fingertips exactly w.r.t. each other, and also tells us where the palm needs to go to achieve these fingertip poses. Palm pose is tracked and executed via a Cartesian controller running at 500 Hz on the robot arm; finger DoFs are commanded to the servos directly. The full teleoperation pipeline runs at 50 Hz, with plenty of headroom to push higher. We have also added user-friendly interfaces for data collection, such as foot pedals to control reset, start, and stop. We have found that this stack enables highly dexterous tasks and fine manipulation, giving us a wealth of data for kick-starting upcoming autonomous policies.
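To make the fingertip-relative IK idea concrete, here is a toy damped-least-squares solver on a planar 3-DoF chain, treating one fingertip as the base frame and solving for the other fingertip's relative position. The link lengths, damping value, and planar simplification are all our own assumptions; the real solver works on the full 8-DoF chain in SE(3).

```python
import numpy as np

LINKS = np.array([0.04, 0.03, 0.02])   # hypothetical link lengths (m)

def fk(q):
    """Fingertip position relative to the 'base' fingertip frame."""
    angles = np.cumsum(q)
    return np.array([np.sum(LINKS * np.cos(angles)),
                     np.sum(LINKS * np.sin(angles))])

def ik_step(q, target, damping=1e-3, eps=1e-6):
    """One damped-least-squares update toward the target relative position."""
    err = target - fk(q)
    # Numerical Jacobian of fk with respect to the joint angles.
    J = np.stack([(fk(q + eps * np.eye(len(q))[i]) - fk(q)) / eps
                  for i in range(len(q))], axis=1)
    # Damped least squares: dq = J^T (J J^T + lambda I)^-1 err
    dq = J.T @ np.linalg.solve(J @ J.T + damping * np.eye(len(err)), err)
    return q + dq

def solve_ik(target, q0=None, iters=200):
    """Iterate IK steps from a mildly bent starting configuration."""
    q = np.full(3, 0.1) if q0 is None else q0
    for _ in range(iters):
        q = ik_step(q, target)
    return q
```

The same structure carries over to the 8-DoF case: the redundant degrees of freedom simply enlarge the Jacobian, and the damped pseudoinverse picks one of the many joint solutions for each relative fingertip pose.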