Wednesday, 29 March 2017

Numbers Don't Speak: Misleading Statistics Come To Science




Humans are bad with numbers, mainly because millennia of evolution molded them to think quickly and intuitively, not abstractly. Though this way of thinking, which relies on heuristics, was enough to get them out of the savannas and settled into small subsistence communities, it did not make them the ‘masters of the planet.’ What did was their ability to bend the laws of nature and pull the strings of the earth, which involved observation, exact measurement, and scientific thinking. This recipe allowed them to tell truth from sham, a hunch from a ‘hit,’ and luck from meaningful occurrences.

There are a variety of tools that help humans do this. One of them is statistics. Though not old, statistical tools have profound applications in almost every realm of our lives. Google uses them. Netflix and Amazon depend on them. Most importantly, our main source of knowledge, published research in journals, would not exist without them. However, among the many scientists and researchers who use statistics, only a few truly know how to. This realization came with the paper “Why Most Published Research Findings Are False,” written by John Ioannidis of Stanford.

Statistical illiteracy is not just an epidemic among ordinary people, who get misled by the poor statistical reporting of mainstream media and are unaware of sampling, errors, and the fallacy of averages. As the paper reveals, the scientific community is not immune to it either.

A lot can go wrong while doing statistics. Some of the well-known mistakes are choosing a faulty sample, ignoring extreme values, manipulating graphs and visuals, proposing causation where there is only correlation (for example, implying that pornography is a ‘cause’ of sexual violence), and finding nonsensical correlations, like correlating the number of Hollywood movies released per year with the number of bird deaths in that year. Other problems are inherent and can never be eliminated, such as the randomness of the real world or the complexity of the system. Still others result from scientific conventions, popular practices developed over time. Those are rampant in the fields of experimental psychology, neuroscience, medicine, health, and nutrition.

Probably the most articulate book on the topic, Statistics Done Wrong, highlights many of the statistical sins committed by academicians and researchers in different areas of study. A handful of them are given below.

Everything wrong with ‘statistical significance.'

This phrase has become a de facto standard of statistics, whether in textbooks or in prestigious journals like Nature and Science. A result is said to be statistically significant when its p-value is lower than a certain cutoff, usually p < 0.05. However, this cutoff is somewhat arbitrary, a mere convention tracing its lineage to R. A. Fisher, the godfather of significance tests.

But that is not the only problem with p-values. In fact, it is their frequent and widespread misinterpretation that is troubling. Most people think a p-value of 5% means there is a 5% chance that a result is a fluke or, put another way, a 5% chance that the result is down to luck, which is downright false. The p-value is computed assuming there is no real effect, so it cannot by itself say how likely a ‘significant’ result is to be a fluke. That probability depends on how many of the tested hypotheses are true to begin with, and it can be much higher: a good 38% under plausible assumptions. Besides, there is not much difference between a p-value of 6% and one of 4%. But, as David Colquhoun puts it in his paper, “one of them gets your paper published, and the other does not”.
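
A rough sketch of where a figure like that can come from. The inputs are assumptions chosen for illustration, not numbers stated in this post: 100 hypotheses tested, 10% of them actually true, 80% statistical power, and the usual 5% cutoff. The function name is made up for this example.

```python
def share_of_flukes(n_hypotheses, share_true, power, alpha):
    """Fraction of 'significant' results that are false positives."""
    n_true = n_hypotheses * share_true
    n_null = n_hypotheses - n_true
    true_positives = n_true * power     # real effects that reach p < alpha
    false_positives = n_null * alpha    # non-effects that reach p < alpha by luck
    return false_positives / (true_positives + false_positives)

# Assumed, illustrative inputs: 100 hypotheses, 10% true, 80% power, 5% cutoff.
fluke_share = share_of_flukes(100, 0.10, 0.80, 0.05)
print(f"{fluke_share:.0%} of significant results are flukes")
```

With these inputs the share comes out at 36%. Rounding the 4.5 expected false positives up to 5 whole studies gives 5 out of 13, or about 38%. Either way, it is nowhere near 5%.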

P-values are handy when used correctly. They are meant as a gauge that helps researchers make informal judgments about whether the results make sense. They are not intended to be strict, engraved rules for hypothesis testing.

Another head-scratching quality of p-values is that they are simply probabilities, with no practical meaning of their own. They don’t tell you how well your medicine works, or how big the difference between two bacterial cultures is, and therefore trying to lower your p-value as much as possible and presenting it as proof that you have found something is futile. A p-value only tells you how surprising your data would be if there were no real effect. It is not meant to tell you whether you have made a discovery. And if you want to judge your data, the confidence interval is a much better option, because it gives a range of plausible values for the effect and so shows the uncertainty in your data. Sadly, however, only 10% of research papers in experimental psychology use confidence intervals, and in journals like Nature, 89% of papers report p-values without any confidence interval.
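
To make the confidence interval concrete, here is a minimal sketch using only Python's standard library. The recovery times are made-up numbers, the function name is hypothetical, and the interval uses the normal approximation (for a sample this small, a t-based interval would be somewhat wider).

```python
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(samples, confidence=0.95):
    """Normal-approximation confidence interval for the mean of `samples`."""
    n = len(samples)
    centre = mean(samples)
    standard_error = stdev(samples) / n ** 0.5
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return centre - z * standard_error, centre + z * standard_error

# Made-up recovery times (days) for a hypothetical treatment group.
recovery_days = [6.1, 7.4, 5.8, 6.9, 7.0, 6.3, 8.1, 5.5, 6.7, 7.2]
low, high = mean_confidence_interval(recovery_days)
print(f"mean = {mean(recovery_days):.2f} days, 95% CI = ({low:.2f}, {high:.2f})")
```

Unlike a bare p-value, the interval is in the units of the measurement, so a reader can see both the size of the estimate and how uncertain it is.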

It is precisely for this reason that editors and scientists have raised the alarm about the ‘replicability crisis’ in many disciplines, especially the nascent ones, such as neuroscience and behavioral studies, which lack rigorous standards to test against. P-values should not decide what has been ‘discovered’; replication studies should.

Little Extremes

Little extremes were aptly explained by Kahneman in Thinking, Fast and Slow through an example that goes like this. In the US, it was found that the counties with the lowest rates of kidney cancer tended to be Midwestern, Southern, and Western rural counties. Yet the counties with the highest rates of kidney cancer also tended to be Midwestern, Southern, and Western rural counties. How is that so? Rural counties have smaller populations than urban areas, and small samples fluctuate more, so extreme values are to be expected in the countryside, whereas in urban settings they average out.

The way of correcting this is to account for the population by using weighted averages instead of simple averages.
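
The effect is easy to reproduce in a small simulation. The sketch below assumes every county has exactly the same underlying rate; the populations, the rate, and the number of counties are invented for illustration and are not taken from Kahneman's data.

```python
import random

random.seed(2017)  # fixed seed so the sketch is repeatable

TRUE_RATE = 0.01   # assumed: identical underlying rate in every county

def observed_rate(population):
    """Simulate one county: each resident is a case with probability TRUE_RATE."""
    cases = sum(random.random() < TRUE_RATE for _ in range(population))
    return cases / population

# Hypothetical populations: 30 rural counties of 200, 30 urban counties of 20,000.
counties = [("rural", 200)] * 30 + [("urban", 20_000)] * 30
results = [(kind, pop, observed_rate(pop)) for kind, pop in counties]

ranked = sorted(results, key=lambda r: r[2])
print("lowest rate :", ranked[0][0], f"{ranked[0][2]:.4f}")
print("highest rate:", ranked[-1][0], f"{ranked[-1][2]:.4f}")

# Population-weighted average: total cases divided by total people.
total_people = sum(pop for _, pop, _ in results)
weighted = sum(rate * pop for _, pop, rate in results) / total_people
simple = sum(rate for _, _, rate in results) / len(results)
print(f"simple average = {simple:.4f}, weighted average = {weighted:.4f}")
```

Even though nothing distinguishes the counties except size, the small ones end up at both ends of the ranking. The weighted average leans on the large counties, where the observed rate is a far steadier estimate.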

Pseudo-replicability

This error, too, is a common disaster. It involves collecting thousands of data points to give the illusion that you have replicated your results when in fact you have pseudo-replicated them. Pseudo-replication is when you take a small sample but measure your parameter again and again on that same sample. For example, say I want to know the average IQ of my class. I choose two people from the class and measure their IQs five times a day for ten days. I have now collected 100 data points, but they are nearly useless, because they describe only two people. I should have taken 100 different individuals, perhaps measured each one’s IQ five times, and then averaged those five readings into a single data point. I would still have 100 data points, but they would be much more robust.
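
A small simulation makes the difference visible. It is only a sketch: the class mean of 100, the spread of 15 between people, and the measurement noise of 3 are assumed values, and the function names are invented for this example. Both designs collect 100 readings.

```python
import random
from statistics import mean

random.seed(38)

CLASS_MEAN, CLASS_SD, NOISE_SD = 100, 15, 3  # assumed, illustrative values

def estimate(n_people, n_repeats):
    """Estimate the class mean from n_people, each measured n_repeats times."""
    person_means = []
    for _ in range(n_people):
        true_iq = random.gauss(CLASS_MEAN, CLASS_SD)
        readings = [true_iq + random.gauss(0, NOISE_SD) for _ in range(n_repeats)]
        person_means.append(mean(readings))  # collapse repeats to one point per person
    return mean(person_means)

def typical_error(n_people, n_repeats, trials=300):
    """Root-mean-square error of the estimate over many simulated studies."""
    squared = [(estimate(n_people, n_repeats) - CLASS_MEAN) ** 2 for _ in range(trials)]
    return mean(squared) ** 0.5

pseudo = typical_error(n_people=2, n_repeats=50)    # 100 readings, 2 people
proper = typical_error(n_people=100, n_repeats=1)   # 100 readings, 100 people
print(f"2 people x 50 readings : typical error {pseudo:.1f} IQ points")
print(f"100 people x 1 reading : typical error {proper:.1f} IQ points")
```

Repeating a measurement only shrinks the measurement noise. It does nothing about the much larger variation between people, which is why the two-person design stays far off target no matter how many readings it piles up.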

Statistics can be quite baffling. Its counter-intuitive nature sways even the best of people. I remember my father once cursing the weather forecast on his phone: “Look, it says here that there is only a 20% chance of rain today, but see, it’s raining so much.” Well, the problem did not lie with the forecast; it lay with his naïve understanding of probability. A 20% chance of rain means that, over a large number of days with the same forecast, it rains on about one in five of them. Think of it this way: you flip a coin and get heads. You flip it again and get heads again. And so you conclude that heads is more likely than tails, even though the odds are 50/50. Probability does not tell you the outcome of your next toss; it only tells you what to expect if you repeat the experiment an enormous number of times. You won’t get an exact 50/50 ratio of heads and tails with just one toss, or ten tosses, or even a hundred tosses. That is why we simulate millions of tosses using random numbers. Another quirky use of statistics is the modern American rhetoric that “most people are above average”, which is laughable if ‘average’ means the midpoint of the group: by definition, no more than half of the people can be above it.
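
Such a simulation takes only a few lines. This sketch uses Python's built-in pseudo-random generator as the coin; the seed and the function name are arbitrary choices made for this example.

```python
import random

random.seed(40)

def heads_fraction(n_tosses):
    """Fraction of heads in n_tosses simulated fair-coin flips."""
    heads = sum(random.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

for n in (1, 10, 100, 10_000, 1_000_000):
    print(f"{n:>9,} tosses: {heads_fraction(n):.4f} heads")
```

The first few rows can land anywhere, while the last row sits very close to 0.5. A single toss can only ever show 0% or 100% heads, which is exactly why one rainy day says so little about a 20% forecast.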

But statistics is not only counter-intuitive, it also remains beyond the reach of those who don’t study it. A researcher cannot always run hundreds of simulations (even though they should) or apply sophisticated techniques. However, statistics is not the only reason science faces the perils of non-replicability. There are many other reasons as well, and it would be unfair not to bring them up. Since the replicability crisis mostly occurs in the behavioral sciences, can it be that the nature of those fields brings with it unavoidable complexity and randomness, something you can never eliminate? If so, it may not be that alarming that studies in those disciplines fail to withstand re-tests. Well, there is a difference between science failing to replicate, which is good, and science not attempting to replicate, which is horrendous. Replication is at the heart of scientific inquiry. It cannot be detached from the scientific organism.

Therefore, when journals treat replication studies as “lowlier” than new and original endeavors, or when replication stalls a researcher’s career growth or hampers her chances of being funded, it is a violent blow to the integrity of science. In fact, nascent fields should be more careful about replicability, because they have yet to establish strong credibility. They should be the ones getting it wrong most of the time, by putting their studies to the test often. Being wrong is okay in science; not trying to prove yourself wrong is disastrous. Hence, it should concern every one of us when universities or academia put unnecessary pressure on researchers to ‘get published.’ Instead, they should be propagating academic attitudes and, most importantly, scientific attitudes, which hold replication dearest.

References and Further Reading

Reinhart, Alex, Statistics Done Wrong, 2015.
LA Times, "Why Failure To Replicate Findings Can Actually Be A Good Thing", 2016
The Economist, "Trouble At The Lab", 2013

Thursday, 23 March 2017

Robotonomics: Why Economics Needs To Take Robotics Seriously


In 1932, Bertrand Russell penned a bombastic essay called In Praise of Idleness, which made a harrowing observation about “the leisure gap.” His rather heathen genius recognized the way work was organized in modern societies, with some people sitting idle all day and others working ceaselessly like laboring donkeys. Russell was, of course, referring to our long-held work ethic, which says that donkiness is virtuous.

Parallel to this radical thought extended the arm of technology, with machines geared to replace much of the “big dirty work” on factory floors and assembly lines in the latter half of the 20th century. Despite the whines of economists and policy-makers, the coming of technology did not only alter the work landscape but also led to greater productivity, consistency, and safety standards in the economy, while the skills of displaced workers were, though slowly, transferred to other areas such as retail and services. But as we tread further up and take a bird’s-eye view, the picture looks different this time. In Rise of the Robots: Technology and the Threat of a Jobless Future, Martin Ford argues that the rocketing pace of information technology will absorb more jobs than it creates or transfers, leaving a giant crater of mass unemployment in the economy. Some of these jobs are what can be called “bad jobs”, not because they are inherently bad but because the work is often repetitive, meaningless, sometimes even dangerous, and yields the least satisfaction among workers. One example is fruit-picking, which Agrobot, a Spanish company, is trying to robotify. Its machine picks only the ripe fruit and can reach heights and angles that even a tall Sapiens cannot. Another is driving, which Google’s self-driving cars, with their camera sensors, GPS, and horse-like computing power, are taking over, with the promise of preventing a sobering 5 million road accidents per year and reducing congestion, fuel consumption, and the carbon footprint. Box-picking is being mastered by Industrial Perception’s robot, which has the visual perception, spatial awareness, and dexterity to move boxes in warehouses without back injuries or fatigue. And there are bot waiters at Japan’s Kura Sushi restaurants, helping to make sushi and serving customers via conveyor belts.

The demise of bad jobs

It is no secret that these jobs, and many others such as cleaning, lawn-mowing, burger-flipping, packaging, quality-checking, and mining, are being driven to extinction. Fearing their loss, or trying hard to keep them for the mere sake of employment, simply shows a lack of visionary direction and good policy-making. It is like searching for a quick fix while sweeping the real problem under the rug. In other words, it is like saying, “Oh, let’s keep low-paying, low-satisfaction jobs so that we distract ourselves from the bigger, louder, and more pressing issues in the framework of our economy. Let’s keep these jobs so that low-skilled workers have something to hold onto, and let’s ignore the gaping cracks in the education and social systems which churn them out in the first place.” It is no wonder that Trump’s campaign, which played on the rhetoric of the “savior of jobs”, did so well.

The Good Jobs

The discussion should ideally begin when we consider what happens to “the good jobs” as technology encroaches on them. Jerry Kaplan makes exactly this point in Humans Need Not Apply. The 20th-century notion that “computers can only do what they are programmed to do” should be burned and buried deep in the bowels of the earth (or thrown in a bot bin), because this hour’s feats in machine learning and AI are enabling soulless robots to teach themselves anything from cooking, teaching, and caring to accounting, medicine, law, and even coding. If what you do can be learned by reading textbooks, attending lectures, and passing tests, chances are that a learning algorithm can learn it too.

Machine learning, which tells you what to buy and read on Amazon, what to listen to and watch on Netflix and YouTube, whom to befriend on Facebook, and what to mark as spam in your email, is also being used in medical diagnosis, legal research, stock markets, and program writing. The ‘soul of the machine’ exists, and it goes by the name of Big Data. There cannot be any artificial intelligence without data, just as there cannot be good statistical inference, pattern recognition, or correlation without it. AI is fated to succeed only because we are all creating data at furious speeds, such that, by some estimates, most of the world’s data was created in just the last two years. And so, every time you write a post, snap a photo, run a search, or just browse the web, you are helping machines get smarter.

When IBM’s Watson sits in front of petabytes of published medical journals to learn diagnosis, or reviews past legal cases to predict a court’s decisions, it should seem intuitive that it can do a much better job than any of us human folks. Some might find this preposterous. It seems far-fetched to allow robots to, in a way, ‘play God’ by letting them make medical and legal decisions. The truth, however, is that they can play God, and they do a pretty good job of it. They can diagnose diseases with far fewer errors and more accurate ‘hits,’ in a field where misdiagnosis kills thousands of people every year, and they can predict legal decisions, which are otherwise vulnerable to cognitive and social biases. One of the popular AI memes of this half of the century is that robots will make decisions ‘on their own,’ which emerges directly from our anthropomorphic view of objects. The ethicist and technologist Nick Bostrom has talked about the AI fallacy, which makes us think that robots are human-like, or that their thinking mechanism mimics the human brain the way nature evolved it. “Airplanes don’t flap their wings” is an eloquent summary, put famously by Frederick Jelinek.

The Creative Snowflakes

Looking past the big dirty jobs and the predictable white-collar ones, many harbor the idea that creative and empathetic jobs will always remain the unclaimed territory of humans, even though bots are doing their best to push those boundaries as well. Much of music is already deluged by mechanized ghosts, and some of them have started creeping into art, writing, and entertainment. As far as empathy goes, there are bot receptionists, nurses, babysitters, and teachers, in jobs traditionally thought to remain at the discretion of humans. It is true that the ‘care’ these robots exhibit is fake, or one-way, such that only the humans, not the robots, feel the pleasing effect of the interaction. However, as the future unfolds, the day may not be far off when a bot writes and produces a blog article like this one (and does a better job of it!).

All of it is surely overwhelming, no matter how much one reads up on it, gives talks about it, or writes papers on it. In all pinching reality, a good many questions bubble up, which we as a forward-looking human society need to consider and attempt to answer. The biggest of them is “What next?” Once most work has been, however unintentionally, handed over to robots and a large chunk of the population is unemployed, an educated hunch corroborated strongly by most technologists and academicians, what will we all do in that era?

Thomas Paine had an answer as far back as the 1790s: Universal Basic Income, an idea hugged by both liberals and conservatives, business people and employees, blue-collars and white-collars, individuals in the East and the West, mostly because it is, to a good degree, huggable. It is workable; it has already been experimented with in countries such as the Netherlands, Finland, and India, and it is producing results. For example, the MINCOME project run by the Canadian government from 1974 to 1979 gave a basic income to residents of a small town of about 10,000 people. The experiment not only helped most recipients get above the poverty line, but also improved graduation rates, health, and birth control.

However, many skeptics of this too-good-sounding deal bring up the welfare state and unemployment benefits already present in developed countries, which, they argue, have discouraged many employable recipients from seeking work. And if UBI, which frankly is another form of the welfare state, were to be implemented, one had better expect masses of individuals who feel meh about work. Apart from that, critics also fault the idea for its naïve approach of seeing work as just a means of earning a livelihood, when in fact there are many psychological and social benefits to ‘work’ itself which cannot be measured by the apparatus of Econ101. Just because something cannot be measured does not mean it should stay out of our policy decisions, and therefore some might judge an out-of-work society to be worse off, however collectively wealthy, post-scarce, and equal it is. In "Why UBI is a terrible idea", the author aptly points out some of the slippery areas in our discussions, including the assault on America’s most heartfelt cultural values of work and life.

But here is the problem. The author was answering the question of what would happen if we implemented UBI right now, at this very moment, which in all honesty would surely be a terrible idea, considering the loopholes in our educational, economic, governance, and social capacities. It would not be an overstatement to say that we are not ready for it yet. Not just yet. Good policy needs not only a good idea but also supportive and compatible institutions for it to work, and we are a long way from doing the necessary duct-taping and gluing.

The author also laments the loss of the current values of work and hard work, which is fair considering that the robotic future is a radical shift. It might well turn out that work itself gets fundamentally redefined in ways it never was before. Work would, in principle, become optional: a choice pursued solely for fulfillment, for joy, or to afford better holidays. But what will work really be like? Many theorists think volunteering, championing causes, art, and, most importantly, entertainment are what most of us will be doing. Nevertheless, our rising education levels might shed more light on more productive uses of our time. Is it, however, possible that we all become couch rats, stashed away indoors playing video games till our last breath? Well, Japan is facing that problem with its younger generation, who mostly remain unbothered by bills and family obligations, sending its population growth into a corkscrew.

Is Idleness that bad?

Studies show that people who remain unemployed suffer a psychological and social loss comparable to that of losing a loved one, mostly because work is tied so closely to their social status and integrity. However, idleness is different from unemployment. Idleness, in Russell’s definition, is mostly enjoyed by feudal lords and elites, who, frankly, have both social status and integrity. And if we put history under a microscope, it turns out that many of our groundbreaking achievements and discoveries did come from people with plenty of time on their hands, or those who were not hampered by financial and social constraints. And so, will a UBI spur a never-before-seen era of creativity, risk-taking, and entrepreneurship? Well, we don’t know yet, even though it seems quite logical.

Between Luddites and Singularitarians

On a scale of Luddite to Singularitarian, where should we all lie? Luddite is another name for the "computer-phobic", a person who fears technology and its implications, the kind of fear that makes movies like Terminator. Singularitarians, on the other hand, are optimists who see human-robot cooperation as the inevitable outcome. We don't know yet which side is winning, or rather which is more robustly grounded in evidence. However, it is crucial to make sure that we are not building castles in the air but making educated and practical policy decisions. And thus there are some things which every nation ought to do, like reforming the education system so that it produces a fluid and agile workforce as opposed to the ‘specialized skills’ chambers of Econ101, or redesigning tax policy to narrow gaps in income. Some economists propose taxing robot workers, which would discourage businesses from investing in technology to begin with. It might sound like a good idea, but looked at closely, it too is a quick fix. All in all, the focus today must lie on the grassroots institutions which inevitably shape the fate of human civilization.


Further Reading

Thompson, Derek, "A World Without Work", The Atlantic, 2015
https://www.theatlantic.com/magazine/archive/2015/07/world-without-work/395294/
Brynjolfsson, Erik and McAfee, Andrew, The Second Machine Age: Work, Progress, and Prosperity in a Time of Brilliant Technologies
Barrat, James, Our Final Invention: Artificial Intelligence and the End of the Human Era
Minsky, Marvin, The Emotion Machine