The limited role of preregistration in observational studies

Transparency and open science are crucial at all stages of research. This includes prospective registration of studies, code and data sharing, and high-quality reporting. Open science practices are important for avoiding questionable research practices such as ‘p-hacking’, HARKing (hypothesising after results are known), and, at the most basic level, selective reporting, i.e. picking the ‘best’ results to include in a paper. With recent high-profile examples of research fraud and data manipulation, potentially resulting in millions of dollars wasted, the need for transparency in research is becoming even more important. Preregistration is one important method used to reduce questionable research practices and is typically strongly recommended by the open science community. There are even pushes for preregistration to be mandated, currently focused in the domain of clinical trials (as it should be). The focus of this post, however, is the role of study preregistration in observational research using real-world data, an area rapidly growing in popularity and increasingly used as evidence by decision-makers to inform policy.

Why do we preregister studies?

At its core, study preregistration is simply the act of specifying your plan for your research before you start it and sharing that plan on a register (e.g. ClinicalTrials.gov, OSF). Preregistration’s benefits have been clearly demonstrated for research with prospective data collection, such as clinical trials, so I will only briefly touch on these.

In clinical trials, the International Committee of Medical Journal Editors (ICMJE) has recommended that all trials be preregistered. This has slowly been adopted in journal policies, but is far from universal. The reason for this recommendation is that preregistration allows people (ideally during peer review) to check the submitted manuscript against its registration for deviations, making any selective reporting, p-hacking or HARKing more visible and, hopefully, less prevalent. Selective reporting is probably the most common of these and occurs when results that are not sexy or positive are omitted from the publication. This can make the conclusions of studies misleading and results in a lot of research never being published, contributing to estimates that up to 85% of research is wasted. Preregistration allows this to be noticed and questioned, hopefully reducing the occurrence of selective reporting. Essentially, preregistration limits researchers’ ability to change their methodology after finding out their results, reducing the ability of the results to guide the methods.

A common concern is that preregistration locks authors into a plan, when the research process should be able to change as more information is gained. This isn’t the case: nothing mandates that the protocol be rigid and completed exactly as intended. Rather, preregistration provides a baseline against which changes can be seen, requiring authors to justify any deviations and improving how trustworthy their research is.

Clearly, preregistration is important for clinical trials. The ICMJE, however, made an important distinction and exempted observational research from its preregistration recommendation. A typical ‘open science bro’ might say everyone should preregister everything because it will always reduce questionable research practices, but that may not be the best solution.

As discussed above, we preregister studies to reduce selective reporting, p-hacking, and HARKing. However, preregistration only reduces these practices if data collection occurs after registration and this ordering can be empirically verified. That is becoming more common in clinical trials but is not the case for observational studies.

Preregistration in observational studies

In clinical trials, preregistration is (relatively) easy to enforce because data collection is prospective. But what about when the data already exist? There are two examples which I think illustrate why preregistration does little for observational research. Let’s say the authors of a paper are a group of researcher-clinicians working at a hospital who think, “We should have a look at the effect of drug X.” They get a waiver of consent from their ethical review committee and collate all the data. They have a look through it, run a few analyses and come up with an interesting result.

How would preregistration benefit us here?

Perhaps preregistering their observational study might make them think more about their study design, but they also don’t yet know what data they actually have. They have a broad idea, but feel it isn’t solid enough to write up a study plan listing everything required for their study. Basically, they don’t want to shoot themselves in the foot by specifying that they need a certain confounder measured before knowing whether it is indeed measured in their data. You could say this is no excuse: if you can’t do science transparently, or don’t have the data to conduct the desired study, don’t do it at all. That is definitely true; having less, but higher quality, research would benefit everyone, but that is not what happens in reality. Our authors decide to go ahead with the study and hold off ‘preregistering’ until they have figured out what is measured in the data. They then register the study, knowing that nobody can verify that they didn’t have access to the data before preregistering. In this scenario, preregistration is useless.

Another example, which is becoming increasingly common, is the use of ‘big data’ in research. To access these data, which may include electronic health records, claims or other linked data, researchers typically require approval from the data custodians. Currently, for the large custodians of these sources of big data (the many Nordic registries, the US Veterans Affairs health database, or the Medicare claims database), the release process involves submitting a research proposal which is assessed and approved before access to the data is given. I am not aware of any custodian that requires this proposal to be publicly registered. If this were required, preregistration would have the same beneficial effect for observational research using ‘real-world data’ as it has for clinical trials, because it would be verifiable that registration occurred before exposure to the data. However, even in these most structured cases, where data custodians control access, that verification does not exist. So if an observational study is preregistered, we simply have to trust that the researchers registered it before having access to the data, and we’re left in the same position as before: preregistration relies on trusting that researchers are telling the truth, an assumption which is becoming harder to believe.

So, the goal of preregistration is to reduce selective reporting and HARKing, where results are chosen after the fact based on how much they support the researcher’s agenda or will help them get published. In clinical trials, preregistration works as an excellent mechanism to reduce HARKing, but it is far less effective for observational studies, as the timing of data access is neither recorded nor public, so it is impossible to verify the ‘pre’ in preregistration. You could still argue that most researchers act honestly, and therefore on average preregistration will make authors more thoughtful in designing their study and will reduce HARKing. But what if there were negative effects of preregistration?

Every decision requires weighing the benefits and harms of that decision. Something not often discussed is the potential ‘harms’ of preregistering studies. These harms could include dichotomising evidence into preregistered or not, resulting in studies that are not preregistered being dismissed as inferior, or at higher risk of bias, than their preregistered counterparts. This is only a valid distinction if preregistering does in fact reduce the risk of bias, which for observational studies is unclear. Preregistering a study may be associated with higher study quality, as researchers who are more likely to preregister may also be more likely to conduct a rigorous study, but preregistration itself may not cause them to conduct a better study. Preregistration then becomes an erroneous measure of quality, which could lead to ignoring high-quality evidence that is not preregistered and overstating the quality of low-quality evidence that claims to be preregistered.

In summary, preregistration has many benefits in clinical trials because it is easier to verify that registration occurred before data collection. Currently, no mechanisms exist in the infrastructure of observational data to verify this, diminishing those benefits. Further, as preregistration may not improve the quality of observational studies, it becomes useless as an indicator of quality, and potentially harmful if used that way in evidence synthesis or decision-making.

Is there anything that could be used instead of preregistration to reduce selective reporting or HARKing?

Yes, well, at least partially. It appears that specifying a target trial, i.e. the hypothetical randomised trial that would be conducted to answer a causal question, and then emulating that trial as closely as possible with observational data improves causal inference in observational studies. This is emerging as an important methodology in observational research and is being used to provide evidence to inform decision-making. Like preregistration, emulating a target trial cannot itself prevent selective reporting or HARKing, but it makes them more visible, which may in turn reduce authors’ willingness to undertake these practices. In observational studies, HARKing or p-hacking could be done by developing a specific combination of eligibility criteria or outcome definitions, with the final combination chosen only after testing many alternatives and looking at the results. By clearly specifying the target trial and its emulation, these unusually specific criteria can be noticed in peer review or by others, and the quality of the study can be more easily appraised. Alternatively, as in clinical trials, data custodians could begin to require public registration of the target trial protocol before giving access to the data, although even this has limitations: many studies can come from one dataset, and it may not be feasible to require authors to register protocols for all potential studies from that dataset. Another opportunity is to share data more freely so that others can reproduce and verify results; despite the barriers to this dissolving over time, it remains a rare practice when using ‘big data’.

Conclusion

Ultimately, increasing transparency in research is clearly important, but preregistration of all study types, without regard to its actual implications, should not be recommended under the current research infrastructure. Despite the benefits of preregistration in clinical trials, these are unlikely to extend to observational studies, as it is currently impossible to verify whether a study was indeed ‘pre’ registered. However, explicitly specifying a target trial, and reporting its protocol and that of its emulation, may reduce selective reporting by making it more easily visible.

Isometric exercise MAY reduce blood pressure and appears safe!

High blood pressure is the leading risk factor for death across the globe, affecting 1.13 billion people and resulting in over 10 million deaths in 2019. Clearly, high blood pressure is a huge problem, and how to treat it is a much-researched question. There are many drugs which are effective at reducing blood pressure, but these have still not reduced the overall burden of the condition, potentially due to unwanted side effects. Exercise (aerobic and strength-based) has also been shown to reduce blood pressure, but it is often tiresome and time-consuming. Isometric exercise is a type of resistance training which involves exercises such as wall sits or hand grips. Isometric exercise, however, is quick (a session usually takes just twelve minutes!) and relatively easy to do!

My team and I recently conducted a systematic review and meta-analysis looking at the effects of isometric exercise on blood pressure (https://www.nature.com/articles/s41440-021-00720-3). Systematic reviews are the highest level of evidence and are usually used to inform clinical practice and health policy.

We found that isometric exercise appears to be safe in all populations studied, including those with hypertension and older adults. This is an important finding, as it was previously unknown, and many clinical guidelines have not recommended isometric exercise due to safety concerns.

Importantly, we also found that isometric exercise may significantly reduce blood pressure. It may reduce systolic blood pressure (the top number on a blood pressure reading) by 8mmHg and diastolic blood pressure (the bottom number) by 4mmHg, both clinically important amounts, similar to the effects of many blood pressure drugs!

It is important to note that I say isometric exercise “may” reduce blood pressure because the quality of the studies included in our review was very poor, which limits how much we can trust their results. However, it appears unlikely that isometric exercise is dangerous, as was once believed, and it could be another great tool for people living with high blood pressure who don’t like exercising, struggle to make time for it, or have physical limitations which make exercising difficult.

Have a read of our study and feel free to contact me if you cannot access the paper!

Reference:

Hansford, H.J., Parmenter, B.J., McLeod, K.A. et al. The effectiveness and safety of isometric resistance training for adults with high blood pressure: a systematic review and meta-analysis. Hypertens Res (2021). https://doi.org/10.1038/s41440-021-00720-3

When is a healthcare intervention actually ‘worth it’ to a patient? – The smallest worthwhile effect

When seeking to identify whether a healthcare intervention was successful, we typically look for statistical significance in a change, to show that the effect wasn’t down to chance. This is very good and widely used, but statistical significance alone (i.e. p < 0.05) is not enough to say an intervention was worthwhile. In a study with a lot of participants, it is possible for a very small change to be statistically significant, and that change may have no importance to a clinician or a patient. Let’s use an example: if researchers study a new blood pressure drug and find that it reduces blood pressure by 1mmHg with statistical significance, that effect means nothing clinically, so it likely isn’t an intervention you would recommend. However, if another drug reduces blood pressure by 10mmHg and is statistically significant, most people would also consider that change clinically important. That 10mmHg change may be the difference between someone being hypertensive (>140 / 90mmHg) and moving into the high-normal category (130-139 / 85-89mmHg), which carries an important reduction in the risk of cardiovascular events. But how much of an effect means something is clinically important, or, what is the minimal clinically important difference (MCID)?

The MCID was defined in 1989 by Jaeschke et al. as “the smallest difference in score in the domain of interest which patients perceive as beneficial and which would mandate, in the absence of troublesome side effects and excessive cost, a change in the patient’s management” (1). In my mind, the key part of this statement is that it is the smallest change which patients perceive as being beneficial, because, at the end of the day, if we as clinicians are giving interventions which are not likely to make the patient feel better, what is the point of administering them?

The MCID has been defined for many different measures, from the six-minute walk test (6MWT) to ratings of pain, but the methods used to determine the MCID raise some questions about whether they are truly patient-centred. Let’s take pain for example. Pain can be measured on many different scales, but let’s use the 11-point numerical rating scale of pain (NRS-P), which runs from no pain to the worst pain imaginable. The MCID is calculated by putting patients through an intervention and asking them to rate their pain on the NRS-P at the beginning and end of the treatment, to determine the change that occurred. At the end of the treatment they are also asked how they feel overall on a global rating scale, i.e. do they feel the same, slightly worse, much worse, slightly better, or much better. The responses on the two scales are then compared, and the change in score which most closely corresponds with feeling ‘slightly better’ or ‘slightly worse’ is taken to be the MCID.
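To make that anchor-based procedure concrete, here is a minimal sketch in Python. The data are made up, and the specific rule (taking the mean change score among patients who rated themselves ‘slightly better’) is just one common way of anchoring the MCID, not necessarily the exact method used in any particular study.

```python
# A minimal sketch of an anchor-based MCID estimate.
# The data and the rule (mean change among patients who rated themselves
# "slightly better") are illustrative assumptions only.

# Pain scores on the 0-10 NRS-P before and after treatment, plus each
# patient's global rating at the end of treatment (the "anchor").
patients = [
    {"baseline": 7, "final": 4, "global_rating": "much better"},
    {"baseline": 6, "final": 4, "global_rating": "slightly better"},
    {"baseline": 8, "final": 7, "global_rating": "slightly better"},
    {"baseline": 5, "final": 5, "global_rating": "same"},
    {"baseline": 7, "final": 6, "global_rating": "slightly better"},
    {"baseline": 6, "final": 6, "global_rating": "same"},
]

# Change score: positive numbers mean pain went down.
def change(p):
    return p["baseline"] - p["final"]

# Mean change among the patients who felt only "slightly better" — i.e. the
# smallest change that patients themselves perceive as an improvement.
slightly_better = [change(p) for p in patients if p["global_rating"] == "slightly better"]
mcid = sum(slightly_better) / len(slightly_better)

print(f"Estimated MCID on the 0-10 pain scale: {mcid:.1f} points")
```

Note that nothing in this calculation asks the patient whether a change of that size was worth the treatment they went through, which is exactly the gap discussed next.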

This sounds pretty good on the surface, but it is the researchers or clinicians who decide that patients only have to feel ‘slightly better’ in order to have experienced a clinically important change. What if patients want to feel much better, and only ‘slightly better’ wasn’t actually worth the treatment they went through? These are the major limitations of the MCID: it factors in neither the patient’s view on what amount of change is important, nor the costs, risks and inconveniences of the treatment which produces the effect.

In 2009, Ferreira et al. (2) coined the term ‘smallest worthwhile effect’, which is intervention-specific and factors in the costs, risks and inconveniences of the intervention. To demonstrate the importance of having intervention-specific measures, imagine two patients who undergo different treatments for their pain: one has major surgery and the other attends a series of educational sessions with a clinician. If the MCID were a 2-point reduction in pain on the 11-point NRS-P, and both patients achieved a reduction of 2.5 points, would both patients be equally happy? Would they both consider that they saw a clinically important change? Probably not, because the surgery carries much more severe costs, risks and inconveniences.

When calculating the smallest worthwhile effect, the intervention is explained to patients, who are then asked what effect, over and above the effect of no treatment, would make the intervention worthwhile to them, considering its costs, risks and inconveniences. The clinician then asks, “What if that effect was 0.5 points less? Would it still be worthwhile?”, and this is repeated until the patient no longer considers the treatment worthwhile; the smallest effect they still accepted is the smallest worthwhile effect for that treatment. Another aspect of the smallest worthwhile effect is that the hypothetical effect patients are considering is in addition to the natural history of the condition. For low back pain, most people see a roughly 30% reduction in pain over the first few weeks of a flare-up, so the effect of any intervention must be over and above this natural recovery or regression to the mean.
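Here is a minimal sketch of that benefit-harm trade-off for a single hypothetical patient. The starting value, the 0.5-point step and the patient’s answers are all illustrative assumptions, not values from any real elicitation.

```python
# A minimal sketch of the benefit-harm trade-off procedure described above,
# for one hypothetical patient. Starting value, step size and the patient's
# answers are made up for illustration.

def elicit_smallest_worthwhile_effect(initial_effect, is_worthwhile, step=0.5):
    """Step the hypothetical effect down by `step` points until the patient
    no longer considers the treatment worthwhile; the last value they still
    accepted is taken as their smallest worthwhile effect."""
    effect = initial_effect
    while effect - step > 0 and is_worthwhile(effect - step):
        effect -= step
    return effect

# Stand-in for the patient's answers: suppose this patient considers the
# treatment (with all its costs, risks and inconveniences) worthwhile only
# if it reduces pain by at least 2 points beyond natural recovery.
patient_judgement = lambda effect: effect >= 2.0

swe = elicit_smallest_worthwhile_effect(initial_effect=5.0,
                                        is_worthwhile=patient_judgement)
print(f"Smallest worthwhile effect for this patient: {swe} points on the 0-10 NRS")
```

In a real study the “judgement” is of course the patient answering out loud, not a function, but the stepping-down logic is the same.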

The current research (3 & 4) on the smallest worthwhile effect for pain has looked at several physiotherapy interventions and non-steroidal anti-inflammatory drugs (NSAIDs) in low back pain, but there are many treatments beyond these. For my Honours year, I am therefore conducting a study looking to identify the smallest worthwhile effects of different interventions for low back pain.

So why is this important?

The value in knowing the smallest worthwhile effect of an intervention is that it tells clinicians, on average, what effect patients consider worthwhile from different treatments. From there, they can identify whether those treatments are actually able to produce that effect. For example, a patient might believe that, considering the side effects of the medication, they would need a 3-point reduction in pain (on an 11-point NRS) for a drug to be worthwhile compared with no treatment. If the clinician knows that the best evidence shows the drug in question typically only reduces pain intensity by 1 point, they may recommend other treatments with a more favourable cost-benefit profile, or a smallest worthwhile effect which aligns more closely with the treatment’s efficacy. Ultimately, it is crucial that we in research ask patients what they think of the interventions we are applying to them, and get their input into whether a treatment actually ‘works’ and is worthwhile from their perspective.

References:

(1). Jaeschke R, Singer J, Guyatt GH. Measurement of health status. Ascertaining the minimal clinically important difference. Control Clin Trials 1989;10:407-15.

(2). Ferreira ML, Ferreira PH, Herbert RD, Latimer J. People with low back pain typically need to feel ‘much better’ to consider intervention worthwhile: an observational study. Aust J Physiother 2009;55:123-7.

(3). Ferreira ML, Herbert RD, Ferreira PH, et al. The smallest worthwhile effect of nonsteroidal anti-inflammatory drugs and physiotherapy for chronic low back pain: a benefit-harm trade-off study. J Clin Epidemiol 2013;66:1397-404.

(4). Christiansen DH, de Vos Andersen NB, Poulsen PH, Ostelo RW. The smallest worthwhile effect of primary care physiotherapy did not differ across musculoskeletal pain sites. J Clin Epidemiol 2018;101:44-52.

P.S. This post was mostly to help me solidify my topic in my own head and make sure I understand it. If you’re interested in it, hit me up!

Learning and Asking for Help

At University there are many ways of learning new content, but in this post, I’m going to be talking about learning things in a more self-directed way, more relevant to courses like honours or research-based classes, where things aren’t necessarily explained explicitly.

Largely, there are two ways of learning things: you can ask a tutor or supervisor, or you can try to figure it out for yourself. I am currently doing a research project in collaboration with several staff members in the final year of my degree. In this project, we are doing a systematic review, and I have been tasked with writing up the protocol, something I’ve never been taught how to do. Initially, I was very nervous as I was completely lost on how to start, and I realised I had two options: ask my supervisors how to do everything whenever I hit a barrier, or learn how to write a protocol from first principles and figure it all out on my own. I chose the latter.

The reason I chose to learn from first principles is that I plan to go into research after I finish this degree, so I decided it was crucial that I understand how to do research myself. This makes a lot of sense, but at the time the task seemed extremely burdensome. I picked up the Cochrane Handbook for systematic reviews, a 750-page text explaining how to do a systematic review, and was initially very overwhelmed by the complexity and magnitude of what I had to learn. Rather than sit down and read it before bed (when I normally read), I decided that in the time I would normally spend aimlessly scrolling through social media on my phone, I would read the handbook instead. Whilst this doesn’t sound like much fun, it was extremely effective: I’ve now written my protocol from scratch, have completed a significant chunk of the handbook, and, crucially, I understand how to conduct a systematic review. I could have asked a supervisor to hold my hand throughout the process, but I wouldn’t have gained anywhere near as much knowledge from it, which is fundamentally the goal of the project.

Despite reading the handbook, I continued to hit barriers where I felt the need to ask for help, a recent example being the statistics surrounding meta-analysis. Statistics have never been a strong point of mine, so reading example protocols and seeing a whole bunch of symbols such as I², Q, tau and more, I was again very overwhelmed. I resisted the urge to raise the white flag and ask for help, and instead researched what these statistics mean, why they are important and how they are used, so I could write up how we would use them in our review. It took me two days to unravel the secrets of these statistics, but I eventually got there, and when I asked a PhD student whether I had understood them correctly, thankfully I had.
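For anyone staring at the same wall of symbols, here is a minimal sketch of how those heterogeneity statistics relate to each other (Cochran’s Q, I² and tau², using the DerSimonian-Laird estimator). The effect sizes and variances below are made-up numbers, purely to show how the quantities are computed.

```python
# A minimal sketch of the standard heterogeneity statistics in meta-analysis:
# Cochran's Q, I², and tau² (DerSimonian-Laird). All input numbers are invented.

effects   = [0.10, 0.80, 0.30, 0.60]   # study effect estimates (e.g. mean differences)
variances = [0.04, 0.09, 0.05, 0.02]   # their within-study variances

# Fixed-effect (inverse-variance) weights and the pooled estimate
weights = [1 / v for v in variances]
pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)

# Cochran's Q: weighted squared deviations of each study from the pooled effect
Q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
df = len(effects) - 1

# I²: the proportion of total variation attributed to between-study heterogeneity
I2 = max(0.0, (Q - df) / Q) * 100 if Q > 0 else 0.0

# tau²: the between-study variance (DerSimonian-Laird estimator)
C = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
tau2 = max(0.0, (Q - df) / C)

print(f"Pooled effect: {pooled:.2f}, Q = {Q:.2f} (df = {df}), "
      f"I² = {I2:.0f}%, tau² = {tau2:.3f}")
```

Seeing Q, I² and tau² built from the same weights and deviations is what finally made them click for me: they are different summaries of the same disagreement between studies.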

Learning these tasks from first principles was very useful for me, not just because I gained a deeper understanding of the processes I was learning, but because it reinforced that I can learn anything I put my mind to, even if I’m clueless beforehand. This extra confidence is really reassuring and empowering, because I now feel like nothing can stop me and I’ll always be able to figure something out if I need to.

Ultimately, no matter what you are studying, you will encounter scenarios where you are at a loss as to how to do something. I’m by no means recommending you don’t ask questions; throughout the process of learning all of this I sought feedback and asked plenty of them. But when it comes to significant processes, don’t ask for the answer or solution to be spoon-fed to you, because you gain nothing from that. If you try to learn from first principles, you will gain a much deeper knowledge of the task at hand, and this will always result in higher marks. So next time you hit a barrier and don’t know what to do, look it up and try to figure it out for yourself; I guarantee you’ll be able to, and you’ll gain much more in the process.