Gangadhar Patil | Mar 6, 2018 | 6 min read
Bengaluru, Karnataka: It is widely believed that private hospitals in India perform unnecessary cesarean operations to mint money, but until recently the claim had not been substantiated. That changed when data journalists at How India Lives (HIL), a startup that aims to make public data easily available, found a database maintained by India's central health ministry.
The database tracked all pregnant women and newborns for 36 months so as to reduce infant and maternal mortality rates. Founded by five journalists three years ago, HIL began analysing the data. What they found confirmed the widely held notion: the number of cesarean operations at private hospitals was almost three times that at government-run health institutes. This number was also three times higher than the norm set by the World Health Organization (WHO). Read the story here.
Similarly, HIL analyzed education data to show a strong correlation between a functional toilet in schools and the dropout rates of girls.
They have published more than 150 such data stories in the last three years in partnership with Mint, India’s second largest financial daily. In addition, they have done more than 60 data visualizations, involving large datasets.
Namita Bhandare, who worked with the HIL team on education stories for Mint, said the data and visualization provided by HIL "enriched" the stories to a great extent. "We were understanding the data on how girls had caught up with boys in both primary and secondary education and how, in some states, they had even caught up with boys in high school."
India has lots of public data but very few people use it, said John Samuel Raja, co-founder of HIL. He has worked with India’s major financial dailies for more than 15 years, including The Economic Times, India's largest business newspaper. “That's because data is in silos, they are not linked and are in a format that is hard to understand.”
Journalists in India face two major hurdles: the availability of data in a format that can be interpreted, and its visualization.
The experience of reporters working on data-driven stories has been discouraging, mainly because of the data's poor accessibility and structure, said Saikat Datta, South Asia Editor of Asia Times Online and a well-known investigative journalist in the country. "Public data is quite hard to come by in India. Even if it is accessible, it is structured in such a manner that it almost becomes impossible to use it effectively. The time and effort needed to structure and analyze the data leads to very poor returns, in terms of readership and insights," he said.
Often, government departments upload scanned copies of the data in JPEG (an image format) instead of making the spreadsheet available online. As a result, the numbers are tedious to use until someone manually enters them into an Excel sheet.
This is the gap the HIL team is trying to fill. They are sourcing this data under the country’s Right to Information Act (which entitles citizens to obtain information, including data, from all levels of the government) and stacking it in a searchable format. "Our product will allow users to add multiple filters to fetch required data," said Avinash Singh, co-founder of HIL. Singh has worked for more than 15 years in mainstream media.
The HIL team is good at visualizing a story, something not many newsrooms are doing in India, said Nasr Ul Had, India program lead at the International Center for Journalists (ICFJ). "They are writing codes to present the data, which is something not many are doing," said Nasr, who collaborated with the founders of HIL on a couple of data workshops.
While India has made progress in making public data available, with portals like data.gov.in and a data-sharing policy, the certainty and quality of data remain areas of concern, said John.
Indian authorities used to put out detailed export and import data on a daily basis, but the practice was stopped without any notice. Apparently, the companies involved objected to the sharing of these figures, arguing that it revealed competitive information. The wider public was not notified of this objection, nor were their views sought.
Similarly, Census data, which used to be priced, is now free. But other data sources, such as the Survey of India (which has a monopoly over maps in India) and the Indian Meteorological Department, still charge for their data.
Although the venture has many hallmarks of a public good, John believes in the ‘for-profit’ model. The company has been profitable since its inception and now has 11 people working in the organization. They have earned money not only by doing data stories for Mint but also through consulting work for companies, think tanks and not-for-profits. So far, they have worked with 28 clients.
John said, “Data consulting helped us in two ways. First, it helped us understand how people consume data. Two, we ploughed back surplus into product development.” This they have done without raising funds.
The idea for HIL was incubated at the Tow-Knight Center for Entrepreneurial Journalism, where they got a grant of $16,000 to kickstart their venture. It has been their only external source of funding so far.
While little has changed in the company's operations or in its focus on making public data available to journalists, the team believes that the nature of its revenue is likely to change in the coming years with the launch of their paid product. Moving on from doing just data-based stories, HIL started building a product that will make data searchable, comparable and presented in a visual form that is easy to understand.
“Now, with the launch of the paid product, the focus will be to continuously improve the product. We will be enablers for journalists to use public data for storytelling,” said John. He hopes that the product's revenue will outstrip their consulting revenue.
“As new technology solution comes, journalists use them. Our solution can be used not only by journalists but also by organizations for decision making," said Singh.
At present, the product has data on more than 18,000 variables covering all possible geographical locations (nearly 715,000) in India. Each location at the most granular level, be it a village or a ward (an administrative unit that is represented by an elected member), will have at least 550 data variables.
The product allows data sets to be added faster, said Singh. “We are not under an illusion that we alone can cover the entire canvas of public data. Shortly, we will allow people to add their own data.”
Besides HIL, organizations such as IndiaStat, Social Cops and Gramener are working in the data and visualization space. But HIL claims that it is the only one working at the intersection of three circles--journalism, technology and public data--in India.