All Data Are Not Equal
Obtaining good training data is an Achilles’ heel for many ML scientists. Where does one get this type of data? Surprisingly, many sources provide access to thousands of free datasets. Recently, Google launched a search tool to make it easier to find publicly available databases for ML applications. It is important to note, however, that many of these databases are very esoteric (for example, “Leading Anti-aging Facial Brands in the U.S. Sales 2018”). Nonetheless, datasets are becoming more accessible.
However, many databases that are relevant for ML applications have limitations such as the following:
- They might not have precisely what ML researchers are seeking—for example, videos of elderly people crossing a street.
- They might not be tagged appropriately or usefully with the metadata that is necessary for ML use.
- Other ML researchers might have used them over and over again.
- They might not represent a rich, robust sample—for example, a database might not be representative of the population.
- They might lack enough examples.
- They might not be very clean—for example, they could have lots of missing values.
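Some of the limitations above, such as missing values and too few examples, can be checked mechanically before a dataset goes anywhere near an ML system. The following is a minimal sketch in Python, not a definitive procedure. The function name `audit_dataset`, the field names, and the `min_per_label` threshold are all hypothetical and chosen only for illustration. Judging whether a sample is representative of a population still takes human review.

```python
from collections import Counter


def audit_dataset(rows, label_key, min_per_label=2):
    """Summarize missing values and label balance for a list of dict records.

    Hypothetical helper for illustration; the threshold is an assumption.
    """
    if not rows:
        raise ValueError("cannot audit an empty dataset")

    total = len(rows)
    fields = sorted({key for row in rows for key in row})

    # Fraction of records in which each field is absent, None, or empty.
    missing = {
        field: sum(1 for row in rows if row.get(field) in (None, "")) / total
        for field in fields
    }

    # Number of examples per label; records lacking the label key
    # are counted under None.
    label_counts = Counter(row.get(label_key) for row in rows)

    # Labels with fewer examples than the (assumed) minimum.
    sparse_labels = sorted(
        label
        for label, count in label_counts.items()
        if label is not None and count < min_per_label
    )

    return {
        "total": total,
        "missing": missing,
        "label_counts": dict(label_counts),
        "sparse_labels": sparse_labels,
    }


# Made-up records describing video clips of people at a street crossing.
rows = [
    {"clip_id": "a1", "age_group": "65+", "label": "crossing"},
    {"clip_id": "a2", "age_group": None, "label": "crossing"},
    {"clip_id": "a3", "age_group": "18-64", "label": "waiting"},
    {"clip_id": "a4", "age_group": "", "label": "crossing"},
]

report = audit_dataset(rows, "label", min_per_label=2)
print(report["missing"]["age_group"])  # 0.5: half the clips lack an age group
print(report["sparse_labels"])         # ['waiting']: only one example
```

A report like this does not fix a dataset, but it makes the gaps visible before anyone trains on it.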
As many researchers say, all data are not equal. The assumptions and context attached to a dataset often get overlooked. If scientists do not pay sufficient attention to a dataset’s hygiene before plugging it into an ML system, the AI might never learn or, worse, could learn incorrectly, as we described earlier. When the quality of the data is suspect, it is difficult to know whether what an algorithm has learned is real or accurate. This is a huge risk.
Knowing what we now know about machine learning and the risks and limitations of datasets, how can we mitigate these risks? The answer involves User Experience.
User Experience and Machine Learning
While not all datasets relate to human behavior, the vast majority of them do. Therefore, understanding the behaviors that the data capture is essential. Over the last decade, we have been engaged by several companies to collect the precise examples and attribute tags that are necessary to train or validate AI algorithms. (In some cases, there were thousands of samples.) Here are some examples of the samples we’ve worked with:
- video samples of people doing indoor and outdoor activities
- voice and text samples of doctors and nurses making clinical requests
- video samples capturing the presence or absence of people in a room
- video and audio samples of people approaching a front door
- thumb-print samples from specific populations
Note that none of this data was available publicly. We had to acquire each of the datasets we needed through custom research, with specific intentions and research objectives.
At first glance, the sheer volume of data that ML applications require seems to call for non-UX techniques. For many scientists and researchers, the simple answer to this challenge is to use quantitative methods of acquiring data. But the clients who commissioned our projects understood a key shortcoming of these methods: low data integrity. They recognized that the underlying data had to be precise, especially when it is necessary to consider the nuances of captured human experiences. We needed to collect the behaviors in context and observe them directly, not simply ask for a number on a five-point scale, as is often the case in quantitative data collection.
Capturing behavior is the prerogative of User Experience and requires research rigor and formal protocols. What we learned is that User Experience is uniquely positioned to collect and code these data elements through its research methodologies and its expertise in understanding and codifying human behavior.
User Experience Measures Behavior
To measure behavior, follow this process:
- Identify the objective. To construct the conditions necessary for capturing user experiences, the first task is to understand what the ML researchers really need. What is the objective? What constitutes a good sample case? How much variability across cases is acceptable? What are the core cases and what are the edge cases? For example, if we wanted to get 10,000 pictures of people smiling, is there an objective definition of a smile? Does a wry smile count? With teeth or without? What age ranges of subjects? Gender? Ethnicity? Facial hair or clean shaven? Different hairstyles? And so on. ML researchers need to clearly define both what counts as a valid case and what does not, and all parties need to agree on those definitions.
- Collect data. Next, plan for data collection. One of the strengths of UX researchers is the ability to construct and execute large-scale research programs that involve humans. Collecting masses of behavioral data face to face, efficiently and effectively, is beyond the experience and expertise of many ML researchers. In contrast, much of the practice of user research is about setting the conditions necessary to get unbiased data. Being able to recruit participants, obtain facilities, get informed consent, instruct participants, and collect, store, and transmit data is essential. UX researchers can also collect the necessary metadata and attach it to each example. They are practiced in the art of sorting, collecting, and categorizing data, as evidenced by a skillset that includes qualitative coding and the many tools that support this type of analysis.
- Do further tagging. After initial data collection, it may be necessary to organize and run a crowdsourcing program, for example on Amazon’s Mechanical Turk, to further augment the data you’ve collected so far. For instance, if we were to collect voice samples of how a person might order a decaf, skim, extra-hot, triple-shot latte in a noisy coffee shop, there could be several properties of interest for each sample. In such cases, we might employ multiple researchers, or coders, to review and transcribe each sample and judge it for clarity and completeness. These coders would then have to resolve any differences between their judgments to keep the coding clean.
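When two coders judge the same samples, one common way to quantify how well they agree is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below is an illustration under stated assumptions, not a description of the process used on our projects. The function names, sample IDs, and "clear"/"unclear" labels are hypothetical. It also lists the samples on which the coders differ, which are the ones they would need to resolve.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Chance-corrected agreement between two coders' labels."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of samples")

    n = len(coder_a)

    # Observed agreement: share of samples given the same label by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n

    # Expected agreement if each coder assigned labels independently,
    # at the rates they actually used each label.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        # Both coders used one identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


def disagreements(sample_ids, coder_a, coder_b):
    """Return (sample_id, label_a, label_b) for samples the coders must resolve."""
    return [
        (sample_id, a, b)
        for sample_id, a, b in zip(sample_ids, coder_a, coder_b)
        if a != b
    ]


# Made-up clarity judgments for six voice samples.
sample_ids = ["s1", "s2", "s3", "s4", "s5", "s6"]
coder_a = ["clear", "clear", "unclear", "clear", "unclear", "clear"]
coder_b = ["clear", "unclear", "unclear", "clear", "unclear", "clear"]

kappa = cohens_kappa(coder_a, coder_b)
to_resolve = disagreements(sample_ids, coder_a, coder_b)
print(round(kappa, 3))  # 0.667
print(to_resolve)       # [('s2', 'clear', 'unclear')]
```

What counts as an acceptable kappa depends on the project, so any cutoff should be agreed with the ML researchers when the objective is defined.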
These are just a few of the reasons why UX researchers are uniquely positioned to bridge the gap between ML scientists and the collection, interpretation, and use of human-behavior datasets in AI algorithms. Applying User Experience in this domain can help protect us against the limitations of available databases and avoid the use of inconclusive, useless, or incorrect datasets whose flaws might not be obvious, whether the issues lie with the data or with the algorithm. In short, UX researchers can help ML scientists collect clean datasets for training and testing AI algorithms.