What is common among Smokers, Statisticians, and Data Engineers

Source: https://unsplash.com/s/photos/no-smoking

Superficially there is no commonality present among smokers, statisticians and Data Engineers. The only thing common is that they love their tools.

For smokers, a cigar is a tool to get relief from stress or pressure. The same thing applies to statisticians and data engineers. An unresolved problem creates stress and anxiety, so statisticians and data engineers use statistical methods and tools.

Above this superficial analogy, there lies a deep similarity among smokers, statisticians, and data engineers.

A chain smoker knows about the statutory warning that smoking is injurious to health. Still, they embrace smoking because smokers value near future over unforeseen long-term side-effects. Hence they are pragmatists.

Like smokers, a professional statistician knows the circumstances when their solution is going to fail. So they warn about the threat to the validity of their solution. They are thus figuring out situations when the answer will not work. Hence they are realistic.

Like smokers, statisticians are happy because they have provided a solution to a pressing problem. One paragraph about ‘Threat to validity.’ in a 10 pages long research paper is like one sentence warning stating that ‘Smoking is injurious to health’.

In rural India, you will find a kind of cigar called ‘Bidi’ (बिडी). Like cigars, bidis are tobacco wrappings but without filters and any statutory warnings. Bidi smokers are ignorant about the health problems, tobacco source, and they are careless about the bidi brand they smoke.

Data Engineers are like ‘bidi’ smokers. They are less fussy about the source of data. They like to use data from wherever it is available. They rarely prefer to provide a statutory warning about misuse of their solution, ethical issues, or consequences of their solution.

An ignorant bidi smoker smokes a bidi even without anxiety or pressure. Because S/he thinks bidi smoking is a cultural practice. (Yes, bidi smoking is a gender-neutral habit)

Like a bidi smoker, data engineers like to use machine learning methods because it is a recent trend or an industrial practice. A Data Engineer can invent a problem when there is no. A data engineer can invent a problem only to suit his/her machine learning skills.

Hence, a data engineer may get an insight into his/her profession when s/he hears the song ‘bidi jalai le’ from the 2006 movie OMKARA.

Na ghilaf, na lihaf, (I don’t have a sheet nor a quilt)
Na ghilaf, na lihaf, (I don’t have a sheet nor a quilt)

Thandi hawa bhi khilaf sasuri, (Even the damn cold air is against me)
Na ghilaf, na lihaf (I don’t have a sheet nor a quilt)

Thandi hawa bhi khilaf sasuri (Even the damn cold air is against me)
Itni sardi hai kisi ka lihaf laile (It’s so cold that take a quilt of someone)

Ja padosi ke chulhe se aag lai le (Go and take some heat from the neighbour’s fire)
Ja padosi ke chulhe se aag lai le (Go and take some heat from the neighbour’s fire)

Beedi jalai le jigar se piya (Light your cigarette from my bosom)
Jigar maa badi aag hai (There is a lot of fire in my bosom)

(source: https://www.filmyquotes.com/songs/881)

Enjoy data engineering but no smoking

.

Arvind is a Professor of Computer Engineering in Dr. Babasaheb Ambedkar Technological University Lonere India.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store