The role of applied statisticians in a digitalised world

Monday, 2 December 2019

 



 

  • Keynote: Institute of Applied Statistics Nov 2019

     

Real time statistics at the speed of light

Sri Lanka is passing through a phase in which applied statisticians are very much in demand. Statistics were recently used or, more properly, abused by political contenders in the way a drunkard uses a lamppost. It is said that a drunkard uses a lamppost not for illumination but for support.

In a digitalised world where people can access information in real time at the speed of light, the scope for abusing statistics is greater than ever before. Fake statistics can be created and distributed by interested parties for personal gain. In a world in which people have not developed their critical and logical abilities, misinformation (presenting correct information wrongly) and disinformation (creating wrong information deliberately) can easily be fed to them.

Fake statistics are created, or existing statistics are twisted, to support such misinformation and disinformation campaigns. By the time applied statisticians enter the scene and expose the falsehood of such claims, the damage has already been done irreversibly. It is therefore useful to revisit the role of applied statisticians in a digitalised world.

 

Application of advancements in information and communication technology

One profession which has benefitted most from the advancements in information and communication technology is that of statisticians. When we studied statistics some 50 years ago, our challenge was to calculate complex statistical numbers manually. When I joined the Central Bank in the early 1970s, the Bank had an IBM 4300 mainframe which occupied almost one wing of the Bank building. Data had to be fed to the computer through cards punched by punch-card operators who had been employed for this specific job.

If I wanted to get a regression done, I had to forward the numbers to the Data Processing Department, where punch-card operators would punch the data onto cards. Then a special program had to be written by programmers before the data were fed to the system through the punched cards. I could not expect the results in less than two weeks.

But today, there are many software packages that can be installed on a desktop, a laptop or even a smartphone. Once the data are entered, a single click brings the regression results, together with diagnostic assessments, on to the screen. Hence, our challenge today is not the calculation of numbers. Our challenge is to interpret the calculated numbers and draw inferences from them. As a result, we need more brainpower to interpret numbers than to calculate them.
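To give a sense of how little calculation is now left to the user, here is a minimal sketch in Python of the arithmetic that once took two weeks: a simple least-squares regression with two basic diagnostics. The function name simple_regression and the data are hypothetical and purely illustrative; real statistical packages report far more diagnostics than this.

```python
from math import sqrt


def simple_regression(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Diagnostics: share of variation explained and precision of the slope.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return {
        "slope": slope,
        "intercept": intercept,
        "r_squared": 1 - ss_res / ss_tot,
        "slope_std_error": sqrt(ss_res / (n - 2) / sxx),
    }


# Made-up illustrative data, not taken from any real series.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
result = simple_regression(x, y)
print(result)
```

The numbers appear instantly. Deciding whether a slope of about two means anything for the question at hand is the part that still needs a statistician.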

That is quite a challenge for applied statisticians. When the public at large is not in a position to do this interpretation by themselves, they have to be helped by others who are capable of doing so. If that help is not given, they will inevitably become victims of crafty people who abuse statistics for their own personal gain. The list is long, but prominent among such crafty people are politicians, marketers and religious cult preachers. Thus, in this digitalised world, applied statisticians have an important role to play, not only in providing correct statistics but also in giving them their correct meaning.

 

Making statistics free from emotions

Consciously or unconsciously, we use statistics in everything we do in our day-to-day life. When we say that it rained yesterday, we simply pronounce a fact we have observed as having happened. When we say that it is raining now, we again mention something that we have observed as happening. When we say that it will rain tomorrow, we are predicting a future event. Our life is full of such statements: facts that have been experienced by us in the past, facts that are happening now and facts that we feel would happen in the future. 

What has happened and what is happening now are our personal experiences. What will happen is our learned judgment based on those experiences. If we keep these experiences and learned judgments to ourselves, they do not form the subject matter of statistics. They become statistics only when we share them with others.

Hence, statistics is basically observing, analysing, learning and sharing facts about the real world. The facts shared need not necessarily be in quantitative or measurable form. They can be simple verbal expressions from which others can form opinions on what is happening in the real world.

Statistics do not have a heart or a religion. When we express our personal experiences, we step into dangerous territory. Our personal experiences are guided by our emotional and subjective feelings. When we share them with others, we are inviting them to accept our emotional and subjective feelings as if they too had experienced the same. This is where we run into problems. Unless others have the same emotional and subjective feelings, there is no reason for them to accept our experiences as their own.

Hence, statistics that are to be shared by everyone should necessarily be based on objective considerations. In other words, statistics should not have a heart. Their religion should be pure objectivity. They should convey facts as observed by an individual free from personal biases or prejudices. Only such an impersonal statistical framework is capable of serving people who intend to use it for making judgments about the real world.

 

There is a demand for statistics

An economy has to play a specific role towards its members. It has to produce and supply the goods and services they demand, with due regard to timeliness, quantity and quality. When an economy produces these goods and services in larger and larger volumes year after year, new wealth is created, raising the well-being of its members. This continuous creation of wealth, which raises overall welfare levels, brings about what is called ‘economic development’.

That has become the prime objective of all societies today. Wealth is created in any society by people who make choices between consumption and production, decide on appropriate production methods and take risks in what they do. A vital input which they use in this process is ‘information’. Statistics is nothing but another name for information presented in a more sophisticated and analytical form. Hence, any society desiring to attain the highest level of economic development cannot disregard the importance of statistics.

 


 

Demand for statistics creates a market

If statistics are an input, then, as with any other input, there should be a demand for them. If statistics help people to create wealth, those people should be prepared to pay a price to acquire them. When there is a price for statistics, there should be a supply of statistics as well. It follows that there is a market for statistics, like the market for all other inputs. This means that people who have information that can be traded in the market will have to package and sell it.

The packaging should be done in such a way that the users would be able to consume statistics as an instant product without having to process them further in-house. This is the biggest challenge which statisticians face today: how to sell their product to would-be users as a readily consumable product and help them to create wealth. 

 

Market price for statistics

There are market-based producers of statistics in developed countries. Market agents are ready to pay a price to acquire such statistics. The producers conduct frequent market surveys, analyse the results, supply them online at a price and so help market agents to create wealth. Unfortunately, in Sri Lanka, we do not have such market-based producers of statistics. The collection and analysis of vital data that are useful to market agents are done by a few governmental organisations. Like any other product supplied by the government, such data are supplied free of charge as a public good.

Even when governmental agencies could sell statistics at a price, they do not venture to do so, because they are guided by such principles as “doing utmost benefit” to people as a social service. The country, in turn, has come to expect free goods from these governmental organisations. But this creates what is known in economics as “the principal-agent problem”.

 

The principal-agent problem

The principal-agent problem is typical of any government service. It says that the agent, which may be a government bureau, a department or even a university, does not have an incentive to produce its best output. The principal, who is the user of the service, is scattered and not in a position to influence the agent to improve quality. The result is that the agent's output remains of low quality, becomes unreliable and fails to satisfy the users. This is why governments in many countries have tried to make agents responsive to the public's requirements through such devices as “people's charters”. The same fate has befallen the governmental organisations that produce and supply statistics to the public free of charge. Many have witnessed the increasing “misuse of statistics” by those who produce and supply statistics to the market.

 

Unreliability of statistics

Why do statistics become unreliable? They become unreliable through misuse. There are many pitfalls into which statisticians fall when they produce statistics. The misuse of statistics occurs when a statistical argument asserts a falsehood. It may be accidental or purposeful. When it is purposeful, it is always perpetrated to gain an undue benefit for the perpetrator. The danger with the wrongful use of statistics is the creation of a statistical fallacy that would be costly to those who make use of the deliberately doctored statistics. Such statistics damage the quest for knowledge in the sense that, once they are rooted in the minds of the people, it takes years to correct the falsehood they have created in society. The types of misuse of statistics are as follows:

First, the statistician may discard the unfavourable data and make use only of what is favourable to prove his point.

Second, in field surveys, the statistician may ask loaded questions in order to elicit an answer of his choice. 

Third, the statistician may over-generalise facts and draw wrong conclusions.

Fourth, the samples used may be biased. 

Fifth, the causality outlined may be fallacious. 

Sixth, the data may have been manipulated in order to show a result favourable to some interested party. 

Seventh, data may be dredged or mined in order to find a correlation that is not really there.
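The seventh type of misuse is easy to demonstrate. The sketch below, in Python, generates one random series and then searches 200 other random series for the one most strongly correlated with it. All series are independent noise by construction, so any "relationship" found is spurious. The function names pearson_r and dredge, the sample sizes and the seed are hypothetical choices made for illustration only.

```python
import random
from math import sqrt


def pearson_r(a, b):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / sqrt(var_a * var_b)


def dredge(n_obs=20, n_candidates=200, seed=2019):
    """Search unrelated random series for the strongest correlation."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    target = [rng.gauss(0, 1) for _ in range(n_obs)]
    best_index, best_r = -1, 0.0
    for i in range(n_candidates):
        # Each candidate is pure noise, independent of the target.
        candidate = [rng.gauss(0, 1) for _ in range(n_obs)]
        r = pearson_r(target, candidate)
        if abs(r) > abs(best_r):
            best_index, best_r = i, r
    return best_index, best_r


best_index, best_r = dredge()
print(f"strongest 'relationship': candidate {best_index}, r = {best_r:.2f}")
```

With only 20 observations and 200 candidates to choose from, the winning correlation will usually look respectable. Reporting it without disclosing the 199 failed attempts is what turns mining into misuse.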

 

Misuse of statistics

All these instances of misuse make statistics less reliable and more suspect. Governmental statistical agencies throughout the globe are criticised by the public on this ground. Once an organisation loses credibility in the compilation of statistics, it is very hard to regain the trust and confidence of users. The statistical bureaus run by the former Soviet Union and its satellite states have been subject to this criticism. The way to avoid it is to adopt global best practices in the compilation and dissemination of statistics. That requires countries to adopt a code of ethics and practices for the dissemination of information. Many member countries of the International Monetary Fund have adopted such a code by subscribing to the principles outlined in the General Data Dissemination System and, at a more stringent level, the Special Data Dissemination Standard. The Central Bank of Sri Lanka is a signatory to this latter code.

 

Governments are losing monopoly over statistics

Historically, governments have held monopoly power over the production of large-scale data sets such as GDP, inflation, employment, unemployment and poverty. This was because only governments had the financial resources needed to produce such data. However, with advanced ICT and internet facilities at incredible speeds, private firms have ventured into this job. Two private entities in the USA now produce alternative national statistics. One is Shadow Government Statistics, which takes pride in presenting ‘analyses behind and beyond government economic reporting’. This agency produces shadow inflation, employment and GDP numbers for the USA. The other is State Street Associates, which operates the PriceStats data dissemination service. It produces up-to-date inflation numbers for 22 countries, which are demanded by global investors and lenders as a counter-check on the inflation numbers produced by official statistical agencies. The list is expanding and, within a few years, it will cover almost the entire world.

For the USA, ShadowStats uses the same methodology as the US Bureau of Economic Analysis, or BEA, to estimate its national level statistics, but has found that BEA has consistently overestimated GDP and employment and underestimated inflation. PriceStats uses the prices of products purchased by consumers in retail supermarkets, collected online simultaneously, to prepare instant inflation numbers for the countries it covers at present. Both these series are available to interested market participants on payment of a fee. Accordingly, a merit good produced by the government has been converted to a private good in the market. Hence, the US government's monopoly over national level statistics has been broken. It has forced the US BEA to be careful when publishing economic statistics.
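How prices collected online might be turned into an instant inflation number can be sketched as follows. The sketch uses a Jevons index, that is, the geometric mean of price relatives for items priced in both periods. This is a textbook formula chosen for illustration and is an assumption on my part; it is not a description of the method actually used by PriceStats. The function name jevons_index and the prices are hypothetical.

```python
from math import exp, log


def jevons_index(base_prices, current_prices):
    """Geometric mean of price relatives, with the base period set to 100.

    Only items priced in both periods are compared.
    """
    common = sorted(set(base_prices) & set(current_prices))
    if not common:
        raise ValueError("no items are priced in both periods")
    # Averaging the logs of the price relatives gives the geometric mean.
    mean_log_relative = sum(
        log(current_prices[item] / base_prices[item]) for item in common
    ) / len(common)
    return 100 * exp(mean_log_relative)


# Made-up prices for illustration; not real market data.
base = {"rice": 100.0, "dhal": 200.0, "tea": 50.0}
current = {"rice": 110.0, "dhal": 220.0, "sugar": 90.0}

index = jevons_index(base, current)
inflation_pct = index - 100
print(f"index = {index:.1f}, inflation = {inflation_pct:.1f}%")
```

Tea and sugar drop out because each is priced in only one period. How such gaps are handled is one of the many judgments that separate a quick calculation from a credible index.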

 

Challenges of the digitalised world

In summary, the digitalised world has made our life easier. But at the same time, it has posed several threats to our understanding of the world. False statistics can be created to twist our view of it. The advanced digital apparatuses now available make the production of such false statistics easier and less costly, on one side, and allow swift universal distribution, on the other. This amounts to a decentralisation of the production of statistics, which in itself may be a salutary development.

However, if the job is handled by those with ulterior motives, it is the genuine applied statisticians who will ultimately suffer. Hence, the challenge faced by applied statisticians today is how to bind themselves by a code that upholds both the presentation of correct statistics and the drawing of correct inferences from them.



(The writer, a former Deputy Governor of the Central Bank of Sri Lanka, can be reached at [email protected].)
