As more businesses bring sustainability into the core of their operations, they find themselves needing data for reporting and decisions, or trying to make sense of the data they have. With child labor data, for example, many organizations don’t realize that the information they are getting can be deeply flawed. An early COSA study on child labor shows why asking the right questions can make the difference between allowing child labor to persist and getting sound, actionable data.

COSA was evaluating the relative impacts of four major sustainability certifications in Tanzania in 2008. Our research team was asking questions to understand the prevalence and impacts of child labor in a range of communities. At the time, many researchers would simply ask “Do children work here?” Simplistic or naive questions can produce naive data or, in this case, present what economists call a “moral hazard”, so we looked for other ways to understand the issue. We had discovered in other countries that there was a clear pattern of correlated data that could be used instead. Among the correlates we used was education, because where child labor is present, children are usually falling behind in school and not getting an adequate education.

So, COSA staff asked the standard education question, used by governments and aid agencies, about whether children were registered in school. The data that came back, from thousands of surveys, looked normal, and unsurprising for remote rural areas. But it was misleading.



In fact, our quality process discovered that the data were not reflecting reality. Most of these children were indeed registered at school, because parents do register their children. The problem was that the children were not attending school often enough to get an education. This may have been for reasons related to poverty, such as the cost of required school books, or to institutional factors, such as low pay for teachers who then don’t always show up. We had missed the distinction between being registered and being educated, as had most of the official statistics.

The following year, we made a simple change to the survey. We asked “What grade level has the child completed?” and, in a separate part of the survey, we asked the age of each child in the family so that we could calculate the grade level expected at that age. When we analyzed the results to see how many children had reached the expected grade level, the figures plummeted.
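The comparison described above can be sketched in a few lines of Python. This is only an illustration, not COSA's actual analysis: the school entry age, the top grade, the tolerated lag, and the sample records are all assumed values invented for the example, and the real thresholds would depend on the country's school system.

```python
from dataclasses import dataclass

# Assumed values for illustration only; not taken from the COSA survey.
SCHOOL_ENTRY_AGE = 7   # age at which a child is assumed to start grade 1
MAX_GRADE = 12         # highest grade in the assumed school system
LAG_TOLERANCE = 1      # grades behind that we tolerate before flagging


@dataclass
class ChildRecord:
    age: int               # from the household roster part of the survey
    grade_completed: int   # from the education part of the survey


def expected_grade_completed(age: int) -> int:
    """Grades a child would have completed at this age if enrolled on time."""
    return max(0, min(MAX_GRADE, age - SCHOOL_ENTRY_AGE))


def grade_gap(child: ChildRecord) -> int:
    """Positive values mean the child is behind the expected grade."""
    return expected_grade_completed(child.age) - child.grade_completed


def share_on_track(children: list[ChildRecord]) -> float:
    """Share of school-age children within the tolerated lag."""
    school_age = [c for c in children if c.age >= SCHOOL_ENTRY_AGE]
    if not school_age:
        return 0.0
    on_track = sum(1 for c in school_age if grade_gap(c) <= LAG_TOLERANCE)
    return on_track / len(school_age)


# Made-up records, purely to show how the functions are called.
sample = [
    ChildRecord(age=10, grade_completed=3),
    ChildRecord(age=12, grade_completed=2),
    ChildRecord(age=9, grade_completed=1),
    ChildRecord(age=14, grade_completed=3),
]
print(share_on_track(sample))
```

Keeping age and grade as separate questions, and deriving the gap afterwards, means respondents are never asked directly whether a child is behind, which is the point of the redesigned survey.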

The graphic below illustrates the “before” view – asking the simple “Are your children in school?”, and the “after” view – when survey questions were adjusted and new questions asked.



“It was sad,” says COSA President Daniele Giovannucci, “knowing people in those communities, to think that these are the kids that in the future would negotiate a contract for their communities’ agricultural produce or would need to read the label on a pesticide container to know how much or how little to apply.” And they will not be able to do that if they are not educated.

The problem proliferates but is not readily visible with generic data. Yet we can now understand this as one of the tangible consequences of child labor.

We believe that if it is worth spending money on data, then it is worth making sure the data are accurate. We therefore encourage researchers and data consultants to draw on the lessons of our experience with this and other data points to improve their own data.

Our journey of discovery can be distilled down to three simple steps that now make our data far more reliable than ever.


  1. Appropriate Technology – Our data collection technology includes quality control mechanisms. When we went back to validate the findings with the local communities, we were surprised. They looked puzzled. The data did not feel right to them. When we dug into the problem we discovered that we had been asking the wrong question.

Getting necessary data, and making sense of it, used to be much more difficult. But now, by simply engaging some of the available technologies, it is relatively easy to get reliable data in a timely way for the entire supply chain. Even far-flung and complex chains can now be managed at very reasonable costs.

Data logic models and cascading surveys allow companies to reach deep into their supplier network to understand it in ways that were impossible just 3-5 years ago. We can also embed data validation into the data gathering systems to improve results and nearly eliminate poor data.
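As a rough illustration of what embedding validation in data gathering can look like, the sketch below checks a single child's answers at entry time, including a cross-field check between age and grade. The field names, age limits, and rules are hypothetical choices made for this example and do not describe COSA's systems.

```python
# Hypothetical plausibility limits, chosen for illustration only.
MIN_CHILD_AGE = 0
MAX_CHILD_AGE = 17
EARLIEST_ENTRY_AGE = 5  # youngest age at which a child could plausibly start grade 1


def validate_child_response(response: dict) -> list[str]:
    """Return a list of problems found in one child's survey response."""
    problems = []
    age = response.get("age")
    grade = response.get("grade_completed")

    if not isinstance(age, int) or not MIN_CHILD_AGE <= age <= MAX_CHILD_AGE:
        problems.append("age missing or outside the expected range")
    if not isinstance(grade, int) or grade < 0:
        problems.append("grade_completed missing or negative")

    # Cross-field check: only possible when both answers are usable.
    if not problems and grade > max(0, age - EARLIEST_ENTRY_AGE):
        problems.append("grade_completed is implausibly high for the stated age")
    return problems
```

A survey tool could run a check like this as each answer is entered and ask the enumerator to confirm or correct flagged values on the spot, rather than leaving the problem to be found during analysis.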


  2. Right Metrics – ‘Measuring what matters’ is the key to both low cost and accuracy. In the second year of the Tanzania study we asked “What grade level has the child completed?” and, in a separate part of the survey, the age of each child in the family. The analytical software then compared those two pieces of data, and we could see whether each child was at the appropriate grade level for their age in that country. We treated the gap between the expected and the completed grade as a signal that the risk of child labor may be high or, at the very least, that there was a problem with education that may be related to poverty, gender, or other factors.
  3. Agile Data Auditing – Just getting data is like buying a used car on the basis of a conversation and with no warranty. Validation, or auditing, is a valuable part of data science that helps ensure the investment made is reliable. Fundamentally, it is important to understand whether the data have real value for reporting and decision-making. We do this at three levels: internal validation, e-verification, and risk-based field audits. This combination enables good data quality at very little cost and, in many cases, can minimize the heavy burden of in-person field visits.

We feed these data and other key factors into an algorithm that predicts the propensity for child labor at very low cost. This enables widespread application even in low-resource areas or low-budget programs. It can change our understanding of critical human rights issues that are easily lost or clouded in the complexities of today’s world.

The bottom line: “Well, it was humbling for the COSA research teams to see the distinction in the before and after data. There are so many little things like this that we have learned over the years in doing surveys that cumulatively now make our data dramatically better,” says Daniele. Smart surveys make you smarter. When you multiply such improvements across a number of indicators or KPIs, the quality of the information you have, and your ability to understand what’s really going on, can be markedly better.

For any program or investment, measuring in meaningful ways lets you take data-based actions that make you more likely to succeed.

