
Ceci n'est pas un Power Law

October 12, 2020 by Fiona Helen Panther

Every interview with a scientist seems to lead with the question ‘what sparked your interest in science?’. Usually the answers people give describe some single inciting incident: a sighting of Halley’s comet, the first item in an expansive collection of items from the natural world, or a chance encounter with a famous scientist. I wonder how organic these stories really are - I am sure I have told them myself. A chance viewing of comet Hale-Bopp in my back garden is a lot more romantic than the reality.

It was only last year that I recognised the inciting incident behind what really motivates my work. For the three and a half years I worked on my PhD, and for a substantial portion of the first year of my postdoc career, I genuinely believed I was purely motivated by the question of ‘where does all the antimatter in the Milky Way come from?’. I pushed any doubts that this was a worthy question, or even the right question to ask, out of my mind as I wrote (and was knocked back from) several fellowship applications. It wasn’t until my partner encouraged me to apply for a job seemingly unrelated to my PhD that I realised I had been prioritising the wrong question.

Now I spend my days studying the ways we search for gravitational waves. Humans have only survived this long, and advanced science as far as we have, thanks to our ability to recognise patterns. It’s too reductive to say simply that I am interested in how we use pattern recognition to understand gravitational waves, whether that means detecting the tiny fluctuations in the fabric of space-time or working out whether the sources of those waves are dotted isotropically across the sky. What I really care about asking is ‘what are the limits of our ability to recognise patterns? When does our desperation to find patterns lead us astray?’

My inciting incident for wanting to ask this question wasn’t the observation of any particular celestial object, but a book. I discovered the author William Gibson when I was 16 years old, in a small ‘15 minutes with’ interview column in the back of Cosmos magazine. I ordered Neuromancer from Matamata PaperPlus, where I worked at the time, and read it. Once finished with that, I set out for the library and borrowed their careworn copy of ‘Pattern Recognition’. To take a line from Wikipedia summarising one of the central themes of the novel, the book ‘involves the examination of the human desire to detect patterns or meaning and the risks of finding patterns in meaningless data.’

As scientists, we try to find patterns as a means of filling in gaps in data we do not have access to (and I include extrapolation in this), or as a means of deriving physical meaning from an abstract observation. Prior to 2020, I might have had to struggle on for several paragraphs to explain one of the most popular patterns we seem to find again and again in data (meaningless or otherwise), but after months of the news forcing a crash course in data analytics on everyone, the concept of a power law is now depressingly familiar.

Meaningless patterns in meaningful data

The obsession with the power law in many areas of science is a stunning example of the risk of finding a pattern that fails to convey any physical meaning. Of course, there are things in nature that really are distributed according to a power law, and in those cases the variables that parameterise the power law - particularly the power law index (usually denoted ‘alpha’, this is the gradient of the power law when plotted on log axes, or equivalently how fast the quantity on the vertical axis changes with respect to the quantity on the horizontal axis) - have genuine physical meaning that can be translated directly into physical properties.
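To make the ‘gradient on log axes’ point concrete, here is a minimal sketch (my own toy illustration with invented numbers, not taken from any real dataset): it draws data from y = C x^(-alpha) with a little scatter and recovers alpha as the slope of a straight-line fit in log-log space.

import numpy as np

rng = np.random.default_rng(42)

# Simulate data from y = C * x**(-alpha) with multiplicative scatter
alpha_true, C = 2.3, 10.0
x = np.logspace(0, 3, 50)
y = C * x**(-alpha_true) * rng.lognormal(sigma=0.1, size=x.size)

# In log space the model is a straight line: log y = log C - alpha * log x,
# so the power law index is just (minus) the slope of a linear fit.
slope, intercept = np.polyfit(np.log10(x), np.log10(y), deg=1)
print(f"recovered alpha = {-slope:.2f}, true alpha = {alpha_true}")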



[Figure: powerlaw.png]

The danger comes, however, from empirical fitting of data. Given any arbitrary set of data, it is always possible to define a functional form that represents that data well according to some metric (say, least-squares regression, which minimises the distance between the data points and the model).

This doesn’t mean that, given a set of data (say, the luminosity of a gamma-ray burst source at various photon energies), we should choose an arbitrary model that passes through all the available data points. If I were to fit a many-degree polynomial, it is easy to see that while the model may succeed in minimising the mismatch between data and model, the model is ridiculous: even an untrained eye would look at it and laugh. More importantly, the constants that parameterise the model (e.g. for a quadratic of the form y = ax^2 + bx + c, these would be a, b and c) have no physical meaning that can be drawn on.
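As a toy illustration of that point (invented numbers, nothing to do with real gamma-ray data), the sketch below fits both a straight line and a ninth-degree polynomial to ten noisy points. The polynomial drives the least-squares mismatch essentially to zero by threading through every point, but its coefficients mean nothing and it behaves absurdly just outside the data.

import numpy as np

rng = np.random.default_rng(0)

# Ten noisy measurements scattered around a simple underlying trend
x = np.linspace(1.0, 10.0, 10)
y = 2.0 * x + 1.0 + rng.normal(scale=1.0, size=x.size)

for degree in (1, 9):
    coeffs = np.polyfit(x, y, deg=degree)
    residuals = y - np.polyval(coeffs, x)
    print(f"degree {degree}: sum of squared residuals = {np.sum(residuals**2):.3g}")

# The degree-9 fit passes (almost) exactly through every point, so its
# residuals are ~0 - yet evaluate it slightly beyond the data and it
# swings off wildly, and its ten coefficients carry no physical meaning.
print(np.polyval(np.polyfit(x, y, deg=9), 12.0))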


[Figure: overfitting.png]


The issue becomes more subtle, and easier to ignore, when we choose a simple model that may, in part, be physically motivated (but is still unphysical). A very popular empirical model for the luminosity of a gamma-ray burst is the Band function(*), a broken power law. The physical motivation is not only that the functional form seems to represent observed data relatively well without the obvious overfitting of the previous example, but that the underlying mechanism that drives the emission of gamma-rays is synchrotron radiation emitted by electrons encountering shocks in the ejecta of the dying star system. The combined synchrotron emission of a population of electrons can be described by, yes, a power law.
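For concreteness, here is a sketch of the Band function in the parameterisation I usually see quoted (after Band et al. 1993): a low-energy power law with index alpha rolling over exponentially into a high-energy power law with index beta, with the two pieces joined so the function is continuous at the break. The example parameter values are typical illustrative numbers, not a fit to any particular burst.

import numpy as np

def band(E, A, alpha, beta, E0):
    """Band function photon spectrum N(E), with E and E0 in keV.

    A low-energy power law (index alpha) with an exponential cutoff,
    joined smoothly at E = (alpha - beta) * E0 to a high-energy
    power law (index beta).
    """
    E = np.asarray(E, dtype=float)
    Ebreak = (alpha - beta) * E0
    low = A * (E / 100.0) ** alpha * np.exp(-E / E0)
    high = (A * ((alpha - beta) * E0 / 100.0) ** (alpha - beta)
            * np.exp(beta - alpha) * (E / 100.0) ** beta)
    return np.where(E < Ebreak, low, high)

# Typical illustrative values: alpha ~ -1, beta ~ -2.3, E0 ~ 300 keV
E = np.logspace(0, 4, 200)   # 1 keV to 10 MeV
N = band(E, A=0.01, alpha=-1.0, beta=-2.3, E0=300.0)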

The stumble with the Band function comes when we try to go from the fitted parameters - the numbers that warp the model to best fit the data we supply - to the physical conditions within the source of the observed gamma-rays. Is the Band function useless? No. Is it limited in the physical interpretation it allows? Yes. Based on the value of a single fitted parameter, many works analysing gamma-ray burst data have either accepted or rejected the hypothesis that the fitted spectrum is produced by synchrotron radiation.

What the Band function fails to capture is a full physical picture of what goes on to produce the data we observe, and if our function does not capture an aspect of physical reality, we cannot infer that information. An empirical model can only return information that was baked into the model to begin with, and nothing more. Is this bad? Not necessarily - one can use the Band function to study gamma-ray bursts, but with the caveat that accurate information about electron cooling cannot be determined in the detail that is sometimes required, as this mechanism is not captured in the model. 

One of the big critiques I hear from those who dislike my criticism of their dependence on empirical models is that ‘reality is too complex to capture in our model’. My response is that, in that case, you cannot make a detailed physical inference from a model that does not directly probe that complexity. If your empirical model assumes no cooling takes place, you cannot infer detailed information about the cooling regime from it beyond the fact that either a) you are correct and no cooling takes place, or b) your model does not fit the data, so cooling must take place. Any details of that cooling are lost.

Meaningful patterns

There is a happy medium in all of this: we can develop models that explain our data well, and report their interpretations in a responsible way. This is far from a new idea - while some astronomers would like you to believe that Bayesian statistics, information theory and various methods of inference are new, and that they are the discoverers, reporting a finding given a specific model, with degrees of belief described by probability, has a long history. Statisticians have been doing this (and have, in some cases, been ignored) for years, and we would probably do well to consult them more often.
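As a minimal sketch of what that kind of comparison can look like (my own toy example, using the Bayesian information criterion as a rough stand-in for the model evidence; the data are invented), the snippet below expresses how strongly noisy power-law data favour a straight line in log-log space over the overfitted ninth-degree polynomial from earlier - a graded degree of belief rather than a yes/no verdict.

import numpy as np

rng = np.random.default_rng(1)

# Invented data drawn from a power law with multiplicative scatter
x = np.logspace(0, 2, 30)
y = 5.0 * x ** (-1.7) * rng.lognormal(sigma=0.1, size=x.size)
logx, logy = np.log10(x), np.log10(y)

def bic(k, n, rss):
    # Rough Gaussian-residual BIC: k fitted parameters, n data points,
    # rss the residual sum of squares (additive constants dropped)
    return n * np.log(rss / n) + k * np.log(n)

for name, degree in (("power law (straight line in log-log space)", 1),
                     ("ninth-degree polynomial", 9)):
    coeffs = np.polyfit(logx, logy, deg=degree)
    rss = np.sum((logy - np.polyval(coeffs, logx)) ** 2)
    print(f"{name}: BIC = {bic(degree + 1, logx.size, rss):.1f}")

# Lower BIC is preferred; exp(-BIC/2) is (roughly) proportional to a
# model probability, so the difference reads as a degree of belief.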

Reporting a discovery with caveats does not make for a flashy press release - it’s actually rather disappointing to watch some journals persistently reject papers that go to great lengths to develop physically motivated models, explain their limitations, and make a careful interpretation of the data in light of these models. This kind of work lends itself less toward the Nature ‘go big or go home’ method of reporting discoveries, and more toward journals like Physical Review D, and is reflected in the choice of the likes of the LIGO Scientific Collaboration to report discoveries in the Physical Review journal series. But reporting a discovery in a way that makes clear a) what model and assumptions were used, b) what the limitations of that model are, and c) what our degree of belief in the outcome is, is the only way we can mitigate the risks caused by our desire to find patterns, regardless of whether they are there or not.

(*) Named after a person, not because it splits the emission into bands, as I initially thought. Then again, I also spent three years of my undergraduate education believing the Heaviside function got its name from being ‘heavier’ to one side of x=0 than the other, rather than from the scientist it is named after!
