MBC Screened-In Blog – 4 by Rudra Narayan


India is a curious case: it keeps pace with constant change in space and technology, yet chooses to lag behind when it comes to using statistics. The lack of capability shows when we fall short of targets in every sector, whether in making realistic GDP estimates, providing rations or managing migration. These tasks might look daunting in today’s world of pandemic and social distancing, but all the answers lie in the DATA.

In the last decade India has emerged as a data mine for MNCs and social media companies, yet the government chooses to remain unaware of how to use that data fully.

The Indian statistical system was started by Lord Mayo, Viceroy of British India, for the purposes of trade and exploitation. It was transformed after independence, but the data collection organisations of India, i.e., the NSSO and CSO, which developed under the guidance of PC Mahalanobis, VKRV Rao and Gadgil, failed to keep up with the times and the changing character of the country. Especially after 1991, when Indian markets escaped the stranglehold of government and society became more dynamic and versatile, the statistical organisations began to show gaps. All this has been exposed in recent times: states are declaring imaginary numbers of migrants and no one knows exactly how many migrants there are. Moreover, newspapers are quoting the number of available hospital beds based on a 2013 survey.

Recently, at the G20 summit, the Indian government refused to sign the data sharing agreement, but at home it is failing to make use of even the data collected by its own agencies. Under its open data obligations, the Government of India maintains a public data repository, but a single glance shows that most of the data is not updated frequently and most ministries don’t even publish their data. This hinders not only the government but also third parties, especially citizens’ participation in decision making.

Data serves a dual purpose: making policy and assessing its outcomes. So when India conducts the Socio Economic and Caste Census 2011 (SECC), it learns about people’s ration cards, nutrition levels, household composition and poverty. Comparing two surveys helps us understand the change. Similarly, the same data is used to provide jobs under MNREGA and ration cards to the poorest of the poor.


Despite such usefulness of data, every now and then there are reports of fudged data from the government. Former CEA Arvind Subramanian wrote a paper arguing that Indian GDP data is misleading and overstated. In another case, the government decided to drop the household survey, which reports on the consumption of goods, deeming it flawed. The PLFS report of 2017-18 was delayed and the Annual Survey of Industries was dropped. Above all, many economists expressed reservations about using the MCA database for GDP calculation, as there was a mismatch between what companies declared individually and what they declared to the government. Even recent GDP figures were revised by MoSPI multiple times, and there were glaring differences between the estimates and the actual figures, which is acceptable up to a point but not always.

The recent pandemic has made everyone ask how many migrants there are. The Chief Labour Commissioner puts the figure at 26 lakh; the Solicitor General, in his report to the Supreme Court, puts it at 97 lakh. UP, Bihar and Bengal reported that 21, 10 and 3 lakh migrants respectively have returned.

Economic Survey 17 had a chapter titled “India on the Move and Churning: New Evidence”. It looked at the volume of internal migration using a cohort-based migration metric and railway data. Migration data have historically been derived from the Census, which happens every 10 years. The Census is a data mine that gives information not only on migration but also on demography, distribution, composition, poverty levels, etc. One can only wonder how policies are made a decade later without fresh data. The Economic Survey puts inter-state migration between 2011 and 2016 at 90 lakh, while the Census of India 2011 puts it at 33 million. The ambiguity wasn’t addressed and no further data was published in the next edition of the Economic Survey.

When a country has population growth of 1.6% per year and decadal growth of 17.64%, i.e., a new Australia every year, conducting a census every 10 years and depending on it for all data makes no sense. The same goes for the SECC, which is used for the PDS: many studies have shown that millions don’t have ration cards and are not enrolled as BPL even though they should be. This is not only a mistake but a glaring injustice towards the poor.
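To make the growth figures above concrete, here is a rough Python sketch of how far a census count drifts from reality between two rounds. The decadal growth rate is the one quoted above; the base population of about 1.21 billion is an assumption added for illustration (it is in line with the Census 2011 headcount), and the function names are made up for this example.

```python
# Figure quoted in the article: 17.64% growth over ten years.
DECADAL_GROWTH = 0.1764

# Assumption (not from the article): a base population of roughly
# 1.21 billion, in line with the Census 2011 headcount.
BASE_POPULATION = 1_210_000_000


def annualised_rate(decadal_growth: float, years: int = 10) -> float:
    """Compound annual growth rate implied by total growth over `years` years."""
    return (1 + decadal_growth) ** (1 / years) - 1


def drift(base: int, annual_rate: float, years_since_census: int) -> int:
    """People added since the census if growth stays at `annual_rate`."""
    return round(base * ((1 + annual_rate) ** years_since_census - 1))


rate = annualised_rate(DECADAL_GROWTH)
print(f"implied annual growth: {rate:.2%}")

for years in (1, 5, 9):
    added = drift(BASE_POPULATION, rate, years)
    print(f"{years} year(s) after the census: about {added / 1e7:.1f} crore people not in the count")
```

Under these assumptions the implied annual rate comes out close to the 1.6% quoted above, and the gap between the census count and the actual population grows every year until the next round.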

Nothing can be more unfortunate than a country without a complete employment survey. PLFS and EPFO data are not the answer, as the former has a smaller sample size than the Employment-Unemployment Survey and the latter captures only the organised sector.

Lack of data affects every aspect of public policy making. Effective use of data such as NFHS-4 has exposed health-related issues in the country, although this survey also requires an update. Other examples are AADHAR and GSTN, which have improved public policy and taxation, though glaring gaps remain, especially due to the evolving nature of technology.

The Indian statistical system still depends on the guidelines of PC Mahalanobis: we use lengthy questionnaires and door-to-door surveys. While this makes sense when the sample is small and static in the long term, as India was before 1991, the methodology demands change with modern times.


Data is the key to all our problems; how it is put to use depends on the user or the AI machine. Every now and then the government announces programmes on AI. The basic component of AI, which decides what a machine will do, is data. Without clean, well-organised data, we cannot work.

The Indian statistical system needs an overhaul. The government has constituted a committee under Pranab Sen to look at the broad questions, from discrepancies to the data calendar, which need to change as shown above.

Data should be treated as being as important as the treasury. From collection to utilisation of data, the government should be accountable to Parliament, just as it is in the case of money bills and grants. Forming a data commission to look after state-specific data demands would be an excellent step.

Data collection needs to be decentralised. Collection and analysis of data at the local government level must be done at regular, short intervals depending on the need. The present method of collection from the top tier is not feasible; it is more error-prone.

Availability of local data provides a leak-proof net. It can be used in formulating welfare schemes, pension schemes, banking and employment generation, and in mitigating a pandemic scenario.

In simple terms, efficiency lies in the 3Ts:

Technology, Trust, Transparency

Technology:- In the era of Cloud Computing, Data Analysis and Big Data, there is rarely a need for surveys to be conducted at decade-long intervals. We can have real-time data in hand with efficient use of data technology. For example, the Census can be updated frequently with every institutional birth and death.
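As a rough illustration of the Census example above, the sketch below keeps a running headcount per district and adjusts it each time a birth or death is registered. Everything in it is hypothetical: the record layout, the class names, the district names and the numbers are invented for this example and do not describe any existing government system.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class VitalEvent:
    """One registered birth or death (hypothetical record layout)."""
    district: str
    kind: str  # "birth" or "death"
    registered_on: date


@dataclass
class RunningCount:
    """Headcount per district, kept current between census rounds."""
    counts: dict[str, int] = field(default_factory=dict)

    def apply(self, event: VitalEvent) -> None:
        """Adjust the district's count by +1 for a birth, -1 for a death."""
        if event.kind == "birth":
            delta = 1
        elif event.kind == "death":
            delta = -1
        else:
            raise ValueError(f"unknown event kind: {event.kind!r}")
        self.counts[event.district] = self.counts.get(event.district, 0) + delta


# Illustrative baseline and events only; these are not real figures.
register = RunningCount(counts={"District A": 1000, "District B": 500})
events = [
    VitalEvent("District A", "birth", date(2020, 6, 1)),
    VitalEvent("District A", "birth", date(2020, 6, 2)),
    VitalEvent("District A", "death", date(2020, 6, 3)),
    VitalEvent("District B", "death", date(2020, 6, 3)),
]
for event in events:
    register.apply(event)

print(register.counts)
```

A register like this only tracks births and deaths. It would still miss migration, which is exactly the gap discussed earlier, so it would complement rather than replace periodic surveys.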

Trust:- Trust among researchers and statisticians, and between governments, is important. A failure of trust can harm a nation. Stakeholders should bear moral responsibility for how they use data and for any misuse, for example using data for political motives.

Transparency:- Any data sought should be justified by the outcome it is meant to serve, and the person providing it should know the implications of sharing it. Judicial reach should also be maintained while protecting personal privacy.
