On the fringes of the Indian city of Kolkata, in the dusty, crowded neighbourhood of Metiabruz, 460 young women are working at the vanguard of artificial intelligence. The women, mostly from the local Muslim community, are helping to train computer vision algorithms used in autonomous vehicles and augmented reality systems, for the likes of Amazon, Microsoft, eBay and TripAdvisor. The all-female centre is one of eight Indian offices operated by iMerit, an India- and US-based data annotation company, whose 2,200 local employees label the oceans of data generated by industries as diverse as manufacturing, medical imaging, autonomous driving, retail, insurance and agriculture.

The operation is part of a growing data-labelling industry that employs hundreds of thousands of workers in lower-income countries including Kenya, India and the Philippines. Companies such as Figure Eight and Mighty AI, and more traditional IT companies such as Accenture and Wipro, are forming part of a so-called “AI supply chain” that creates algorithms able to interpret material including driving footage, search results and photos for the largest US and European multinationals, including Facebook, Volkswagen and Google.

“The largest technology companies don’t want to be in the business of training data, they want to own customer relationships [and] are using partners and procurement wisely,” said Leila Janah, founder and chief executive of Samasource, a San Francisco-based data labelling vendor with offices in Kenya, Uganda, and the US.

As the volume of data that requires labelling has expanded exponentially, large companies have increasingly turned to third parties able to supply workers who specialise in specific types of data such as driving or medical information — and who are also paid and treated in an ethical manner.

Samasource, whose employees label data for Walmart, Google, Microsoft, Glassdoor, Continental and General Motors, among others, is headquartered in Nairobi and employs more than 2,800 people. “We have a labour model that employs people as full-time workers with benefits, paid at a living wage.”

Samasource worked on a project for Bayer that required annotating vascular cross-sections of plants to detect diseased cells for crop protection, to train an aerial field image algorithm. Ms Janah explained: “We want to focus on complex edge cases that machines can’t easily grasp, where you need humans to provide nuance and judgment. That’s where we add value.”

As the AI training market starts to explode, western groups using AI are looking to work with more ethical outsourcing companies that follow a social impact model. “For the first time, people are questioning the [labelling] companies that don’t guarantee a living wage to workers in the AI supply chain.

“As a company, if you’re getting your data trained by those labourers, you owe it to them to treat them fairly,” Ms Janah said.

Ms Basu said: “In the long run, these young rural, tribal workers will cause a real change in the economic empowerment of their communities.”

This article has been amended to reflect updated information from Samasource on its employee numbers
