How to Observe, Measure and Think About Shoppers

This post lays out a perceptual framework for measuring “From Opportunity to Final Purchase”.  For shorthand we can refer to this as “5D Reality” – a detailed measurement scheme for the shopper, the store, the products – and a lot more.

The Shoppers

We ourselves are going to observe the shoppers, but our observing the shopper is the second person view of shopping.  The shopper's own observing is the first person view, and that is the view we are interested in.

After all, shopper science is about the shopper.


We can learn a lot about vision by considering how cameras and machines “see.”

Amazon Flow is a smartphone app that “looks” at products on the shelf and recognizes tens of millions of products!  In the photo above you can see tiny dots of light, primarily on the brand name of the product.  These tiny dots flash across the scene with great rapidity, essentially seeing what the product is.  Watched in practice, Amazon Flow looks like nothing so much as the dots (or cross-hairs) from an eye-tracking study.  There is at least an apparent similarity between how image recognition software works and how the eye works.

An image is simply a large array of light waves of varying frequency and intensity, striking the photoreceptors of either an eye or a camera.  From birth on, our eyes and brains learn very quickly how to process images and store, not the images per se, but something more like mathematical descriptions of those dots you see on Amazon Flow.  All image recognition software must do the same: it creates arrays of data that do not actually retain the images, but that have the property of evoking the original image – so realistically that, in memory, it is as if the person were seeing the image anew.  This is our goal with the mentioned “5D Reality” database.
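As a minimal sketch of the idea above – storing a compact mathematical description rather than the image itself – here is a toy descriptor in Python. The function names (`grid_descriptor`, `match`) and the grid-of-gradients scheme are illustrative assumptions, not how Amazon Flow actually works; real systems use far richer feature vectors, but the principle is the same: two views of the same product yield nearby descriptors.

```python
import numpy as np

def grid_descriptor(image, grid=4):
    """Reduce a grayscale image to a compact numeric descriptor:
    mean brightness and gradient energy per grid cell. A toy
    stand-in for the feature arrays recognition software stores
    instead of raw pixels."""
    h, w = image.shape
    gy, gx = np.gradient(image.astype(float))
    energy = gx**2 + gy**2
    desc = []
    for i in range(grid):
        for j in range(grid):
            cell = (slice(i * h // grid, (i + 1) * h // grid),
                    slice(j * w // grid, (j + 1) * w // grid))
            desc.append(image[cell].mean())
            desc.append(energy[cell].mean())
    return np.array(desc)

def match(desc_a, desc_b):
    """Smaller distance = more likely the same product."""
    return np.linalg.norm(desc_a - desc_b)
```

A 4×4 grid with two values per cell gives just 32 numbers per image – tiny compared with the pixels, yet enough to tell products apart.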



Frames and pixels are the foundation of observation, digitizing our perceived analog world.  But the analog is perception only, since our eyes themselves are digital – essentially photosensitive cellular arrays (rods and cones) which serve the same function as the pixel arrays of digital cameras.  Frames are the link to the fourth dimension – time – as each frame represents a point in time, and the distance from fixation to fixation, the passage of a unit of time.
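The frame-as-time idea above can be made concrete with a few lines of Python. The 30 frames-per-second rate and the `frame_time` helper are illustrative assumptions; the point is simply that a frame index converts directly into elapsed time.

```python
import numpy as np

FPS = 30.0  # assumed camera frame rate

# A grayscale "video": frames x height x width.
# The frame axis is the time axis.
video = np.zeros((90, 48, 64), dtype=np.uint8)  # 3 seconds of footage

def frame_time(frame_index, fps=FPS):
    """Each frame is a point in time: index / rate = seconds elapsed."""
    return frame_index / fps

# Elapsed time between two fixations landing on frames 12 and 57:
dt = frame_time(57) - frame_time(12)  # 1.5 seconds
```

So the distance from fixation to fixation, counted in frames, is exactly a measured passage of time.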

Big, BIG! Data

Light waves are the most efficient means in the universe for conveying massive information.  We have become so familiar with the process, from birth on, as to possibly miss the significance of the information processing going on.  Light waves can span from the furthest reaches of the universe to microscopic detail, with breathtaking efficiency.



Capturing all the information flooding the world, carried by light waves, has been poorly mastered to date.  But that is changing, thanks to companies like Amazon, with their Flow app, and my colleagues at Hyperlayer – with not just “Amazonian” capabilities to recognize bricks-and-mortar shopper behavior, but data and analytics far transcending the earlier PathTracker® and vision path methods (RFID for foot paths; point-of-focus tracking for eye paths).  We are now ready to address a far more versatile and complete data source – the light we are bathed in!

The “Vision Scroll”

One way of beginning an integral understanding of all we see is first to convert the great moving parade passing before our eyes into a flat, two-dimensional surface.  Think of it as any flat 2D photo, only in this case continuing as long and as far as your own daily viewing of the world around you.  Consider the “vision scroll” a continuous, two-dimensional, accurate representation of all the light rays striking the eye of a single observer over any given period of time – whether a short or long shopping trip, or any other period or environment of interest.  The 2D scroll actually wends its way through the 3D world, capturing the third dimension as it moves through it.

Below you can see a series of consecutive frames overlaid to align frame on frame.  This illustrates the necessity of deforming images in order to get a perfect fit, image to image, and also to accurately capture the third dimension.  Using identical computer-recognized features across two or more frames is the key to this process.  Each frame also includes a time stamp, the fourth dimension.
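The frame-on-frame fit described above can be sketched numerically. Given a handful of identical features recognized in two consecutive frames, a least-squares fit recovers the deformation mapping one frame onto the other. This toy version (hypothetical helpers `fit_affine` / `apply_affine`) uses a simple affine transform; real vision-scroll stitching would use richer models (homographies, depth), but the matched-features principle is the same.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares affine transform mapping matched feature points
    src -> dst across two consecutive frames (each an (N, 2) array
    of pixel coordinates). This recovered deformation is what lets
    one frame align onto the next."""
    n = len(src)
    A = np.hstack([src, np.ones((n, 1))])       # rows: [x, y, 1]
    params, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return params                                # (3, 2) matrix

def apply_affine(params, pts):
    """Warp points with a fitted transform."""
    return np.hstack([pts, np.ones((len(pts), 1))]) @ params
```

Three or more shared features are enough to pin down the transform; every extra match makes the fit more robust.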


The point here is that vision provides the raw material of the most massive data set, probably, in the universe.  We are now moving into an era of ubiquitous production of these raw records, unlocking untold wealth in knowledge and insight – beginning with shoppers.

Annotation of the “Vision Scroll” – MEASUREMENTS and “5D Reality”

The key to unlocking data from the vision scroll is “5D Reality.”  What this means is that every single pixel on the vision scroll has x, y and z coordinates.  The fourth dimension is time, t, a very versatile metric, indeed.  See: Sorensen, Inside the Mind of the Shopper: The Science of Retailing (Kindle 1292-1305).

The fifth “dimension” is not really a dimension, but rather a vector describing the orientation of either a person or a display.  The value of this is knowing exactly the orientation of any identified object on the vision scroll, and also the orientation (and location) of the first person observer who is creating the vision scroll.  The “5D Reality” generalized here can be integrated directly into computations of in-store advertising exposures.  See: Purchase selection behavior analysis system and method utilizing a visibility measure.  Once “5D Reality” has been measured for all objects in the environment, over all time of consideration, all remaining shopper behavioral metrics can be derived from this exhaustive classified raw data set.
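A minimal sketch of a 5D record and one derived metric – an advertising-exposure test – might look like the following. The `Record5D` structure and the `exposed` helper, along with the 60° field of view, are my own illustrative assumptions, not the patented visibility measure itself: a display counts as an exposure only if it lies within the shopper's viewing cone and its face is turned back toward the shopper.

```python
import math
from dataclasses import dataclass

@dataclass
class Record5D:
    """One '5D Reality' sample: position (x, y, z), time t, and an
    orientation unit vector. Works for shoppers and displays alike."""
    x: float; y: float; z: float
    t: float
    ox: float; oy: float; oz: float   # orientation (unit vector)

def exposed(shopper: Record5D, display: Record5D, fov_deg=60.0):
    """Crude exposure test: display within the shopper's field of
    view AND facing the shopper (orientations roughly opposed)."""
    dx = display.x - shopper.x
    dy = display.y - shopper.y
    dz = display.z - shopper.z
    dist = math.sqrt(dx*dx + dy*dy + dz*dz)
    if dist == 0:
        return False
    dx, dy, dz = dx / dist, dy / dist, dz / dist
    # Is the display inside the shopper's viewing cone?
    gaze = dx * shopper.ox + dy * shopper.oy + dz * shopper.oz
    in_view = gaze >= math.cos(math.radians(fov_deg / 2))
    # Is the display's face turned toward the shopper?
    facing = (dx * display.ox + dy * display.oy + dz * display.oz) < 0
    return in_view and facing
```

With every shopper and display carrying such a record at every time t, exposure counts fall out as simple queries over the data set.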

Observing/Measuring Fixed Data

Observing and measuring is not just about the shopper, but also about the shopper’s environment: all the stuff in the store, and the store itself, that changes very little, or not very often.



What this means is that every xyz on these maps can and should be known before presuming to study the interaction of shoppers with them.  To flesh out the shopper study paradigm, we need to give further consideration to the three persons in our observational framework.

The “Persons” in Shopper Observation

We have already mentioned that the shopper is the “first person” of shopping.  And it is fortuitous that since 1879, scientists have been studying the movement of the eyes while the “first person” is engaging in some activity – originally, reading; and now, not uncommonly, shopping.  The scientist observing the shopper is, of course, the second person.


It is the first person’s vision that we seek to measure, in relation to store infrastructure – including products – in order to understand the shopping process.  And by using a head-mounted camera on the first person, something like the ASL MobileEye device, we can see exactly what the shopper sees.  A camera captures what is seen, and a corneal reflection pinpoints the exact point of focus.  In the case of the MobileEye, the second person is essentially looking through “the same eyes” as the first person, so even though two people are involved, the camera recording is only of the first person.

The type of “second person” observation suitable for massive data collection is the surveillance camera.  The richness of the data is limited only by the resolution of the camera(s), and our ability to parse the digital pixel array of a fixed surveillance camera.  This is orders of magnitude simpler than parsing the first person video, for the simple reason that the second person video is mostly static, with only the shoppers, staff, shoplifters, etc., moving, plus those products actually removed from the shelf – which should correlate nearly perfectly with the store’s own transaction log.
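Because the second person video is mostly static, the parsing reduces to spotting departures from a fixed background. Here is a minimal sketch of that idea; the helper names and the threshold of 25 gray levels are illustrative assumptions, and production systems would use far more robust background models.

```python
import numpy as np

def perturbed_regions(frame, background, threshold=25):
    """Flag pixels where the current surveillance frame departs from
    the static background - shoppers moving, and products leaving the
    shelf. Both inputs are grayscale uint8 arrays of the same shape."""
    diff = np.abs(frame.astype(int) - background.astype(int))
    return diff > threshold        # boolean mask of perturbation pixels

def motion_fraction(frame, background, threshold=25):
    """Share of the image that changed - near zero for an empty aisle."""
    return perturbed_regions(frame, background, threshold).mean()
```

Frames where `motion_fraction` spikes are the ones worth parsing; the unchanged remainder is already harmonized with the fixed store map.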

The bottom line is that creating a harmonious 5D Reality map of the store, its fixtures and merchandise is a matter of pulling together existing data into this more coherent and universal data structure.  That leaves the 5D Reality map of the shopper, continuous throughout the shopping trip – including items purchased – requiring only the recognition of perturbations in the surveillance videos, the constant parts already harmonized with the constant store infrastructure.


Here's to GREAT "Shopping" for YOU!!!

Your friend, Herb Sorensen

Herb Sorensen, PhD, Shopper Scientist LLC

2013 Charles Coolidge Parlin Award  |  American Marketing Association    
2007 EXPLOR Award; with Wharton group |  American Marketing Association
2004 Top 50 Innovators  |  Fast Company Magazine
Adjunct Senior Research Fellow  |  Ehrenberg-Bass Institute for Marketing Science, University of South Australia
Scientific Advisor  |  TNS Global Retail and Shopper Practice
BrainTrust Member |  RetailWire


© 2014 Herb Sorensen  |  Terms of Use | This product may be covered by one or more of the following U.S. patents: 7,006,982; 7,606,728; 7,652,687; 7,933,797; 7,944,358; 8,041,590; 8,140,378; 8,666,790; and others pending.