Why the title?
The Hitchhiker’s Guide to the Galaxy is frequently almost nonsensical, and that is what makes it appealing. With linear thought rendered impossible, the reader jumps through hyperspace on the Infinite Improbability Drive to understand the Universe as seen by Douglas Adams, and emerges a little wiser, if not more confused.
An example that rings a bell in academia is the mice, who ask an all-knowing computer, Deep Thought, for the answer to life, the universe and everything, only to be told that it is 42. Having blown taxpayer dollars, they then find that the answer fails to make sense because they never framed a proper question. So they commission another experiment: build a computer to find the Ultimate Question, the answer to which is 42, because obviously "What do you get when you multiply 7 and 6?" lacks a ring to it. It sort of reminds you of labs that churn out data first and worry later about the analysis and the questions and hypotheses that lead to the story. The books were written between 1979 and 1992, yet they ring quite a bell in 2015.
This article introduces nothing new that you wouldn't have already realized if you are in the middle of your Ph.D. However, just like Mark II had a purpose, so does this piece, so read at your own risk.
Long Story:
Once upon a time, a faculty member told me that the format of a Ph.D was something that had survived all the way from the Renaissance to the modern day. According to him, students start as apprentices, fulfilling the whims and fancies of their masters/supervisors. The grind, or tough love, or whatever he thought of it, was in fact necessary. Another faculty member happened to tell me the same thing three years later, rephrased as "these kids need to go through the grind because that is the only way the proper work-culture can be established".
Work-culture here implies a hierarchical structure in which the kids are constantly in fear or awe of the faculty and are always corrected by them, and rarely does it happen the other way round. Questions are expected from the students, but only rhetorically. The student has the moral responsibility to present his work as the next big thing in the world, no matter how visibly mediocre it may be. A student cannot criticize his own work or methodology, or the existing frameworks of scientific discovery. The unspoken pressure on the student is to show his work in a positive light and defend it to the death, no matter how he feels about it in his own head.
Nonsense. This is killing them inside. Their self-confidence sinks to an all-time low, and they suffer from feelings of inadequacy and uselessness which can be debilitating. Being under constant pressure to show their work in a good light takes a toll on their scientific temper: they develop confirmation bias, become religious about their theories and hypotheses, and grow irrationally argumentative about them. Doesn’t that defeat the whole point of doing a Ph.D in the first place?
What is missing here? Proponents of the current system argue that only through such a system do students eventually become good scientists, "eventually" being the key word here. This system also ensures that the hierarchy does not get destroyed by the chaos of arguments between faculty and students. Why there are, or should be, arguments in the first place is a deeper question that needs to be addressed.
It is undeniable that everything comes with an expiration date, and that applies to knowledge as well. Faculty who do not revise their basics, who can’t tell that a flat line along the x-axis signifies no correlation, or who tend to run a battery of statistical tests until the data gives the answer that they are looking for are the dangerous people here. The picture of Science they give to their students is almost formulaic and unimaginative.
The protocol is simple enough. Take two conditions, one control, one treated. Apply statistical tests (t-tests, multivariate analyses) until some difference is found between the two, and then do an enrichment analysis: find which entities (genes, proteins or metabolites) are differentially expressed and the pathways corresponding to them. In this list of pathways, find the ones that are enriched using a frequency difference test like the chi-square or Fisher’s exact test. You can increase the samples and conditions, make it high-throughput and then, looking at the gene ontology lists, try to guess at what is happening within the system.
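For anyone who has only ever clicked a button for the enrichment step, here is a minimal sketch of what a one-sided Fisher's exact test does for a single pathway, written with the standard library only. The function name and the gene counts are made up for illustration, and a real analysis would also correct for testing many pathways at once.

```python
from math import comb


def enrichment_p_value(total_genes, pathway_genes, selected_genes, overlap):
    """One-sided Fisher's exact test (upper tail of the hypergeometric).

    Returns the probability of seeing at least `overlap` pathway genes
    among `selected_genes` genes drawn at random from `total_genes`,
    of which `pathway_genes` belong to the pathway.
    """
    max_overlap = min(pathway_genes, selected_genes)
    all_draws = comb(total_genes, selected_genes)
    tail = sum(
        comb(pathway_genes, k)
        * comb(total_genes - pathway_genes, selected_genes - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / all_draws


# Hypothetical numbers: 2000 measured genes, 40 in the pathway,
# 100 called differentially expressed, 8 of those in the pathway.
p = enrichment_p_value(2000, 40, 100, 8)
print(f"enrichment P-value: {p:.4g}")
```

Note that the test only asks whether the overlap is larger than chance would give. It says nothing about whether the list of differentially expressed genes was obtained honestly in the first place.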
Such faculty are too comfortable in their positions, and while that is not necessarily a bad thing for them, it is a bad thing for their students. I’m not going to claim that it is their moral responsibility to take care of the intellectual growth of their Ph.D students, but to passively damage their learning is entirely undesirable.
However, I believe that if students are given all the relevant information from the beginning, without being overloaded with minutiae, and are assumed to be fairly systematic and sensible individuals, it isn’t impossible to believe that they can execute the tasks considered the mainstay of scientific research. Frankly, in the biological sciences, unlike some of the other sciences, these tasks are not arcane or inaccessible to a person with school-level knowledge of calculus and algebra.
To presume a priori that they are dumb and need to be spoon-fed is a disservice to them. You should probe and test them and see what they can do; if they can’t do it, then there is no point force-feeding them. At this stage in life, if they can’t be interested in something that they chose to do, it is probably healthier for them to find an alternative career that they would love and enjoy. Encourage such students to leave, because pushing them towards something that they clearly don’t like is a waste of time and energy for you and them. However, if you insist that they need to be molded into replicas of you, then you should go see a shrink about this control-freak, micro-managing condition that you seem to be developing.
What do I think we should do about it?
Encourage chaos in discussions, and encourage students to question everything from the experimental design to the statistical tests and their violated assumptions. Experimental designs should be criticized by proposing alternative designs and comparing them to see which experiment will answer the question posed while minimizing the confounding factors. When the merits and demerits of each method are discussed, an emergent thought process occurs, and students absorb a lot more when they are forced to think something through to the end rather than receive it passively as an instruction.
Statistical tests have been the same for quite some time now and their limitations are known: for instance, the t-statistic is inflated when variance is low, and a correlation should be accompanied by a P-value. What is important here is that students realize that the statistical test is not just a means to an end. You don’t just acquire data using experiments and then sit down to apply statistical tests one by one until you get the answer you want. Rather (in another school of thought), the choice of the statistical test more or less dictates the kind of experiment that should be performed and specifies the number of replicates, among other things.
For instance, say you wish to find an association between two variables, one of which you can vary (like the amount of a compound added) and one of which you can observe (optical density). In this case correlation is the way to test for an association, and regression will give you a model that allows you to predict the response of the system (output) given a certain quantified input of the drug.
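To make the compound and optical density case concrete, here is a minimal sketch of Pearson correlation and a least-squares line using only the standard library. The function name and the dose and optical density values are invented for illustration; they are not measurements from any experiment.

```python
from math import sqrt


def pearson_and_line(x, y):
    """Return (r, slope, intercept) for paired observations x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    r = sxy / sqrt(sxx * syy)        # strength of the linear association
    slope = sxy / sxx                # change in output per unit of input
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


# Hypothetical data: compound dose (the variable you control) and
# optical density (the variable you observe).
dose = [0.0, 1.0, 2.0, 3.0, 4.0]
optical_density = [0.90, 0.74, 0.61, 0.44, 0.31]

r, slope, intercept = pearson_and_line(dose, optical_density)
predicted = intercept + slope * 2.5  # model prediction at an untested dose
print(f"r = {r:.3f}, predicted OD at dose 2.5 = {predicted:.3f}")
```

The correlation coefficient answers "is there an association?", while the fitted line answers "what output should I expect for this input?". The design consequence is that you need several distinct dose levels, not just treated versus untreated.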
However, if you wanted to find out whether the difference in a particular parameter that you are observing between two groups is significant, then you go for something like the t-test. The fact that the t-test involves a variance term means that you should have replicate observations in the groups whose difference you are testing.
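The dependence on replicates shows up as soon as you write the statistic down. The sketch below uses Welch's form of the t-test (my choice for illustration, since it does not assume equal variances) and only the standard library. The function name and the numbers are hypothetical, and turning t into a P-value still needs a t-distribution table or a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return (t, degrees_of_freedom) for Welch's two-sample t-test."""
    if len(group_a) < 2 or len(group_b) < 2:
        # A single observation has no sample variance, so no test.
        raise ValueError("each group needs at least two replicates")
    se_a = variance(group_a) / len(group_a)
    se_b = variance(group_b) / len(group_b)
    if se_a + se_b == 0:
        raise ValueError("both groups have zero variance")
    t = (mean(group_a) - mean(group_b)) / sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(group_a) - 1) + se_b ** 2 / (len(group_b) - 1)
    )
    return t, df


# Hypothetical replicate measurements for control and treated groups.
control = [1.02, 0.95, 1.10, 0.99]
treated = [1.31, 1.25, 1.40, 1.28]
t, df = welch_t(control, treated)
print(f"t = {t:.2f} on about {df:.1f} degrees of freedom")
```

The variance sits in the denominator, which is also why suspiciously small variances inflate the statistic.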
The above is a small example of the fact that the choice of your statistical test dictates the experiment that must be performed. To a small extent, this thought process prevents you from just gathering data without systematic planning and then trying to find patterns, patterns that could exist in random data as well. The P-values are an additional checkpoint, but if you are trying to game the P-values as well, then we are talking about serious ethical issues here.
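How easily do patterns turn up in random data? Here is a small simulation sketch, standard library only. Every "study" measures 20 variables in two groups drawn from the same distribution, so any significant difference is a false alarm. For simplicity it uses a z-test with a known standard deviation of 1 rather than a t-test, and all names and settings are my own assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def family_wise_error(alpha, n_tests):
    """Chance of at least one false positive in n independent tests."""
    return 1 - (1 - alpha) ** n_tests


def null_p_value(rng, n_replicates):
    """Two-sided z-test P-value for two groups with no real difference."""
    a = [rng.gauss(0.0, 1.0) for _ in range(n_replicates)]
    b = [rng.gauss(0.0, 1.0) for _ in range(n_replicates)]
    z = (mean(a) - mean(b)) / sqrt(2.0 / n_replicates)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def fraction_with_false_hit(n_studies, n_variables, n_replicates,
                            alpha=0.05, seed=0):
    """Fraction of null studies reporting at least one 'significant' variable."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        if any(null_p_value(rng, n_replicates) < alpha
               for _ in range(n_variables)):
            hits += 1
    return hits / n_studies


print(f"theory:     {family_wise_error(0.05, 20):.3f}")
print(f"simulation: {fraction_with_false_hit(500, 20, 5):.3f}")
```

With 20 unplanned comparisons at the 0.05 level, the theoretical chance of at least one spurious "finding" is 1 - 0.95^20, roughly 64%, and the simulation should land near that figure.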
The standard classical hypothesis testing format is designed to give you, the scientist, some leeway for making mistakes during the process of scientific discovery, a process which is itself a little labile and prone to error. But trying to subvert the procedure just to find anything, any association or differential expression, is not a healthy career move.
Apart from all of this, learn to ask questions and formulate experiments to answer them. Questions do not come out of thin air. In fact they do, but the ones that come out of thin air aren’t the cleverest of questions and most of the time have been worked to death by someone, somewhere in the world. To begin addressing questions of importance, one must first know everything that is known. Only once the limits of knowledge are known can you start asking questions that nobody else has asked before, and the journey to the answers to those questions is the learning process that makes scientific research worthwhile.
However, it is also possible that reading too much can confuse you. When you read too much, there is an overload of information and no time to chew on it and digest it. So space out your reading and integrate it with your work. Don’t read at a stretch and then work at a stretch, because you could get stuck in a rut that way; use one to freshen up your perspective on the other. Write down the interesting things that you read in your lab notebook, whether they be clever methods or new ways of statistical analysis.
Sometimes, knowing too much can also be paralyzing and render you unable to work under the sheer weight of all that knowledge inside your head. In that case, stop reading and start working. The hope here is that there is something truly unique about your perspective on the solution to your Ph.D problem, something the rest of the world doesn’t share, and that is the fresh perspective that your work needs. It’s unique because it is this conscious thing in you that has absorbed all that you ever read about the things that you liked: your hobbies, your interests, the games you play and the puzzles you’ve solved, your abilities at any of the physical sciences, music, craft or engineering. They all contribute to your unique perspective, which should advance the understanding of your Ph.D problem, if not help you solve it. Above all, have fun and appreciate the good things in life.