Introduction to surgical evidence

Evidence-based medicine is a relatively recent innovation. Only over the past two centuries have scientific methods become the accepted means of establishing the most effective treatments and tests. Nowhere in medicine has this transformation been more striking than in the surgical disciplines, where numerous innovations have paved the way for the treatments of today.

The generation of high-quality evidence in surgery can be particularly difficult, for four main reasons. First, surgery is a complex intervention. There are many variables to consider when designing research studies, including postoperative care, variation in surgical technique and the natural learning curves surgeons must climb when adopting new approaches or operative techniques.

Second, surgeons themselves are divided in their attitudes toward research. A survey of Australian surgeons found that they believed their own clinical practice was superior to clinical guidelines, that evidence-based surgery adversely affected decision-making for patients, and that not using evidence in clinical decision-making did not adversely affect patient care. Such attitudes present a major barrier to the uptake of evidence in surgery and must be addressed if outcomes and patient care are to be the best they can be.

Third, many operations have historical origins: an operation was performed for a given indication and led to resolution of the disease process. There may be limited evidence for such procedures, as it would be unethical to deny patients an effective and established treatment in the context of a research study.

Finally, a lack of funding and interest has led to research studies becoming the exception rather than the norm. Currently, it is unusual for a patient undergoing surgery to be enrolled in a research study. When these factors are considered together, it is unsurprising that in 1998 surgical research was compared to a “comic opera” by the editor of The Lancet.

Over the past 20 years, considerable improvements have been made. Several large initiatives launched by surgical organisations and collaborations have introduced new surgery-specific research frameworks (e.g. the IDEAL framework, https://www.ideal-collaboration.net). This is proving successful, with surgical research increasingly being published in the world's leading medical journals.

Medical research and challenges

Changing the world with evidence

For surgical research to change practice and improve patient care, it must address a new question or an area of genuine clinical uncertainty. The uncertainty around which treatment is best is termed ‘clinical equipoise’. A key assumption for the ethical conduct of any interventional research is that equipoise exists; that is, there must be genuine uncertainty around which treatment is best for a given patient group. If clinical equipoise does not exist and it is definitively known that one treatment is better, it is unethical to knowingly expose patients to the inferior treatment.

The research studies highlighted below can be read while considering the following questions:

  • Who is the patient population or target condition?

  • What was the intervention (for interventional research, i.e. clinical trials) or exposure (for observational research)?

  • What was the comparison (or control group)?

  • How was the effect of the intervention measured, and was it measured accurately?

  • Are there any sources of bias or confounding present?

  • What was the result?

  • Is this study relevant to my clinical practice?

Throughout this chapter we will discuss each point in further detail.

MRC-CLASICC

This study looked at whether laparoscopic surgery for colorectal cancer was equivalent to open surgery. The study began in 1996 when there was genuine uncertainty as to whether laparoscopic colorectal surgery was safe and effective.

The study demonstrated that for colonic resection for cancer, laparoscopic surgery was as safe as open surgery. The authors went on to publish data on both the short- and long-term outcomes. In conjunction with several other trials, this study paved the way for the implementation of routine laparoscopic colorectal surgery.

Formulating a clinical question

STITCH trial

The small bites suture technique versus large bites for closure of abdominal midline incisions (STITCH) trial was a double-blind, multicentre, randomised controlled trial published in 2015. It compared two different spacings of suture for fascial closure following midline laparotomy—one bite per 1 cm versus one bite per 0.5 cm.

This study demonstrated that with smaller spacing between sutures when closing fascia, there was nearly a 50% reduction in the incidence of incisional hernia in the first year following operation. This trial is an excellent example of how even small technical modifications have the potential to improve patient outcomes if properly evaluated.

ProtecT

This was a study of surgery versus radiotherapy versus surveillance for localised prostate cancer. The use of prostate-specific antigen (PSA) testing for the detection of prostate cancer is controversial. When PSA is elevated and cancer is detected, it is uncertain whether there is benefit to treating tumours detected in this way.

The 10-year results of the ProtecT study found that survival following treatment of PSA-detected localised prostate cancer did not differ between treatment groups. Surgery and radiotherapy were associated with lower rates of disease progression, but carried their own risks. These results pose a dilemma for men with localised prostate cancer: a choice between surveillance and a radical intervention with urinary, sexual and bowel function complications, when both offer similar long-term survival.

The first step in the design of a clinical research study is to formulate a study hypothesis or question. Understanding the constituent parts of a clinical hypothesis is key to evaluating the relevance and quality of surgical research studies.

A simple, structured approach can be used to formulate clinical questions. This approach considers several important aspects of a clinical research study:

  • ‘P’- Population (those patients with the target condition)

  • ‘I’- Intervention or exposure (the intervention or exposure being studied)

  • ‘C’- Comparison (the control group that the intervention is being compared to)

  • ‘O’- Outcomes (what was the outcome of interest and how was it measured)

  • ‘S’- Study design (how the study was conducted)

This is known as the ‘PICOS’ approach. We can consider many clinical questions in this way ( Box 1.1 ).

Box 1.1
Comparison of prospective and retrospective studies

Prospective cohort studies

Advantages:

  • Real-time data collection

  • Enables investigators to seek missing data while the patient is present during data collection

  • Allows interventions and changes to the patient pathway to be made (in non-observational studies)

Disadvantages:

  • Longer time frame

  • More expensive

  • Requires more staff time

Retrospective cohort studies

Advantages:

  • Faster to conduct

  • Less expensive

Disadvantages:

  • Difficult to find missing data

  • Cases may be missed more frequently

  • May require case notes to be found

  • Investigators assessing independent/explanatory variables may already be aware of the outcome

As an example, we can study the use of antibiotics for adults with acute appendicitis. In this situation, antibiotics are used to reduce the complications of appendicitis and the need for surgical intervention. Using a PICOS approach (a small code sketch of this structure follows the list):

  • P: The population is adults with non-perforated acute appendicitis;

  • I: The intervention is use of antibiotics;

  • C: The comparison group is that of usual clinical care (appendicectomy);

  • O: The primary outcome is complication rate, defined using a validated grading system;

  • S: The study design is a randomised controlled trial.
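For readers who find structure easier to see in code, the PICOS elements can be captured as a simple record. The sketch below is illustrative only; the class and field names are our own invention, not part of any standard evidence-based-medicine library.

```python
# Illustrative only: the PICOS elements as a simple Python record.
from dataclasses import dataclass

@dataclass
class ClinicalQuestion:
    population: str    # P: patients with the target condition
    intervention: str  # I: intervention or exposure being studied
    comparison: str    # C: control group for comparison
    outcome: str       # O: outcome of interest and how it is measured
    study_design: str  # S: how the study is conducted

appendicitis = ClinicalQuestion(
    population="Adults with non-perforated acute appendicitis",
    intervention="Antibiotic therapy",
    comparison="Usual care (appendicectomy)",
    outcome="Complication rate, using a validated grading system",
    study_design="Randomised controlled trial",
)
```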

Population or target condition

A study population is a group of participants selected from a general population based on specific characteristics. Having a clearly defined study population is key to ensuring research is clinically relevant and can answer a specific question. If a target condition is used as the basis for selecting a population, validated diagnostic criteria should be applied to identify such a population.

The basis for the inclusion of patients in a study should be systematic, to avoid bias. The best way of ensuring a sample is both representative and free from bias is to approach every eligible patient in a consecutive manner. This is often referred to as a consecutive sample.

Specific study populations may have special requirements that must be considered in the study design. An example may be an older age group, where visual or hearing impairment may present difficulties with particular data collection methods, such as telephone interviews.

Intervention or exposure

The intervention is the main variable changed within the treatment group. In observational research, patients and clinicians decide which treatment will be received. As no direct experimental intervention occurs, the variable is termed the ‘exposure’.

When considering an intervention for surgical research, particular care must be given to standardisation between patients and clinicians. Delivering interventions in a standardised manner is essential to ensure patients are comparable. Variation in how a treatment is delivered should be considered during the design process. For example, in a study of a new surgical technique or approach, training in the delivery of the new intervention requires both time and some means of ensuring all patients receive a similar treatment. Standardising the surgical intervention may include an assessment within the research study to determine whether an acceptable level of competence has been achieved. A good study protocol will help to address this (study protocols are discussed further in the Bias section).

The acceptability of the intervention must also be given due thought. If an intervention is not acceptable to patients, it will be very difficult to convince patients (and research ethics committees) to participate. One means of ensuring interventions are acceptable is to involve patient representatives when designing research studies. These patient representatives are part of the study design team and provide feedback to investigators on the best means of ensuring studies are conducted in a feasible and acceptable manner.

Comparison

The comparison describes the control group against which the intervention or exposure of interest is being evaluated. This group should be sufficiently comparable to the intervention group to ensure valid conclusions may be drawn about the true effects of any exposure or intervention.

In controlled trials, the comparison group often uses a placebo to reduce bias and maintain blinding. In many studies, the comparison group receives the current gold standard of clinical care, to create a group where the new treatment can be compared directly with the current best or usual therapy.

If the control group differs from the intervention group, bias may be introduced leading to invalid conclusions. This is of particular concern in observational research, such as case-control studies, where the comparison group is selected by investigators.

Outcome

A study outcome is the variable by which the effect of the intervention or exposure of interest is measured. For an outcome to be useful, several properties of the chosen measure should be considered:

  • Incidence: The outcome of interest should be sufficiently common. The more common the outcome of interest is, the smaller the sample size required to demonstrate a difference between treatment groups (a worked sample-size sketch follows this list).

  • Directness: The outcome of interest should directly measure what the intervention is ultimately intended to achieve. For example, intraoperative warming devices are designed to keep the patient warm, thereby reducing postoperative complications. Therefore, any study investigating the efficacy of intraoperative warming devices should use postoperative complications as a primary outcome rather than looking at differences in temperature (termed a ‘surrogate’ outcome).

  • Definition: All outcomes should be measured using clearly defined criteria. Preferably these criteria would be used widely across the field, so all studies measure the same outcome in the same way. One example is the Response Evaluation Criteria In Solid Tumours (RECIST) for the assessment of tumour progression.

  • Relevance: Outcomes should be relevant to the research question and to patient care. What may be relevant to clinicians may not also be relevant to patients. To this end, patient representatives should be consulted when selecting study outcomes. Often quality of life and functional outcomes are of primary concern to patients.

  • Timing: Outcomes must be measured at appropriate time intervals, which are relevant to the timeframe where the outcome of interest would be expected to occur. For example, measuring an outcome at 72 hours following a procedure may be relevant for postoperative bleeding, but not for surgical site infection.

  • Reliability: The chosen outcome measure should be able to reliably detect an event and should be standardised for all patients. This is particularly important for multicentre research, or large-scale trials where multiple observers are judging outcomes. A standardised means of assessing the outcome ensures all patients are assessed in the same way; otherwise the results of the study may be inaccurate.
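To make the point about incidence concrete, the sketch below applies the standard normal-approximation sample-size formula for comparing two proportions. The event rates are invented purely for illustration; a real trial would use a formal statistical package and pre-specified assumptions.

```python
# Illustrative only: approximate sample size per arm for comparing two
# proportions (normal approximation); the event rates are invented.
from math import ceil
from statistics import NormalDist

def n_per_arm(p1: float, p2: float,
              alpha: float = 0.05, power: float = 0.80) -> int:
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided significance level
    z_beta = z.inv_cdf(power)            # desired statistical power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

# A common outcome needs far fewer patients than a rare one:
print(n_per_arm(0.30, 0.20))  # ~291 per arm
print(n_per_arm(0.03, 0.02))  # ~3823 per arm
```

Note how the rarer outcome demands roughly thirteen times as many patients, even though the relative reduction is similar.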

Patient-reported outcome measures (PROMs) are outcomes reported by patients themselves, in contrast with outcomes reported by an investigator or clinician. Examples of PROMs include validated general quality-of-life questionnaires (such as the EQ-5D or other HR-QoL instruments). Alternatively, PROMs may focus on specific areas of interest such as sexual function and continence, which are of particular importance in pelvic surgery.

Cost-effectiveness is sometimes used as an outcome measure. This is often used by clinical governance bodies to decide whether a treatment represents value for money and should be recommended for use.

Occasionally, different outcomes may be combined to form a single measure. This is known as a composite measure. The use of composite outcome measures can enable researchers to measure two or more outcomes simultaneously and gives better statistical efficiency for the number of patients in a study. An example may be unplanned critical care admission or death, where both represent a major complication, but they are pooled to increase the number of events, which reduces the sample size required for demonstrating the effectiveness of a treatment. While this is beneficial, it can result in outcome measures that prove challenging to interpret.

Study designs

There are many study designs available to answer clinical questions. Study design methodology is constantly evolving; however, all types of study can be broadly divided and sub-divided into the following categories:

  • Primary research (research at the individual patient level)

    • Randomised trial

    • Prospective cohort study

    • Retrospective cohort study

    • Cross-sectional study

    • Case-control study

    • Case series

    • Case report

  • Secondary research (research that considers multiple sources of primary research)

    • Systematic review

    • Systematic review with meta-analysis

Figure 1.1 below provides a useful means of identifying the type of study where a report may not be explicit as to its design:

Figure 1.1, Defining study design flowchart. 10 (© NICE [2012] Methods for the development of NICE public health guidance [3rd edition]. Available from https://www.nice.org.uk/process/pmg4/resources/methods-for-the-development-of-nice-public-health-guidance-third-edition-pdf-2007967445701 . All rights reserved.)

Another way of classifying evidence is according to the Oxford Centre for Evidence-Based Medicine (CEBM), which divides evidence according to the risk of bias. These are commonly referred to as 'levels of evidence' (Table 1.1).

Table 1.1
Oxford CEBM levels of evidence

Level of evidence | Interventional study | Diagnostic accuracy
1a | Systematic review of randomised controlled trials (RCTs) | Systematic review of level 1 diagnostic studies
1b | Individual RCT with narrow confidence interval | Validated cohort study with good reference standards
1c | All-or-none evidence: all patients previously died without the intervention, but some now survive with it | Studies with a diagnostic sensitivity or specificity so high that the result can rule a diagnosis in or out with very high accuracy
2a | Systematic review of cohort studies | Systematic review of level 2 or higher diagnostic studies
2b | Individual cohort study or low-quality RCT | Exploratory cohort study with good reference standards, or only validated on split-sample databases
2c | Ecological studies |
3a | Systematic review of case-control studies | Systematic review of 3b or higher studies
3b | Individual case-control study | Non-consecutive study, or study without consistently applied reference standards
4 | Case series and poor-quality cohort and case-control studies | Case series and poor-quality cohort and case-control studies
5 | Expert opinion without explicit critical appraisal | Expert opinion without explicit critical appraisal

One well-known way of classifying these studies is using the ‘pyramid of evidence’, which provides a simplified means to consider the advantages of one study design over another. This pyramid considers the inherent properties of each study design and classifies them according to how likely the study is to provide a reliable answer to the study question, as close to the ‘true value’ as possible.

The pyramid of evidence is accompanied by several caveats ( Fig. 1.2 ). First, not all questions can be answered by a randomised controlled trial, either due to lack of equipoise (where it would be unethical to conduct a trial) or for logistical reasons (where a trial would be too expensive or simply unfeasible). Second, a poorly conducted study may give an inaccurate or unreliable answer compared with a well-conducted study of a different design. For example, a small, poorly conducted randomised trial may not provide a better answer to a question than a large, well-conducted prospective cohort study. This is where the Oxford CEBM classification and Grading of Recommendations Assessment, Development and Evaluation (GRADE; discussed later) assessments are useful to help determine whether there can be certainty in the body of evidence for a given research question.

Figure 1.2, The classical pyramid of evidence.

Systematic reviews and meta-analysis

At the top of the pyramid are systematic reviews and meta-analyses, which take multiple sources of evidence for a given clinical question. These sources may take the form of randomised trials, cohort studies or even case-control studies ( Fig. 1.3 ). Systematic searching methods are used to attempt to find every possible study which may contain the answer to the clinical question of interest. These methods often include searching multiple databases, searching the references of key articles in the field and contacting experts to ask if they have suggestions for potential studies to include.

Figure 1.3, (a) Parallel group design. (b) Cross-over group design. (c) Cohort study schematic. (d) Case control schematic.

Once all potentially eligible studies have been screened, each is critically appraised individually (see Critical Appraisal). The findings of this process of systematically identifying every possible study and scrutinising its methods are then synthesised into the final review, which should present an overarching and balanced view of the evidence for a given clinical question.

Meta-analysis describes the use of statistical methods to combine the numerical results of similar studies in order to derive an estimate of how well the intervention of interest works (i.e. the treatment effect). This is often combined with a systematic review, with the results of studies combined after being screened for inclusion. Combining studies in this manner is referred to as 'pooling'.
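As a minimal sketch of pooling, the snippet below combines three invented study results (log odds ratios with standard errors) using fixed-effect, inverse-variance weighting. Real meta-analyses are performed with dedicated software and often use random-effects models.

```python
# Illustrative only: fixed-effect, inverse-variance pooling of three
# invented study results (log odds ratios and standard errors).
effects = [0.35, 0.52, 0.21]           # per-study log odds ratios
ses = [0.20, 0.25, 0.15]               # per-study standard errors

weights = [1 / se ** 2 for se in ses]  # precision (inverse-variance) weights
pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
pooled_se = (1 / sum(weights)) ** 0.5

print(f"Pooled log OR: {pooled:.2f} (SE {pooled_se:.2f})")  # ~0.31 (SE 0.11)
```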

An important part of meta-analysis is an assessment of the similarity between different studies of the same clinical question. Variation between studies is termed heterogeneity and takes two forms: clinical and statistical. Clinical heterogeneity concerns how clinically similar the population in one study is to that in another; it is unlikely to make sense to combine the results of two trials examining the same treatment in two distinct populations, for instance adults and children. Statistical heterogeneity refers to differences in the actual results of included studies and whether these are likely to be due to chance (sampling error) or to true differences in outcome. It is summarised by the I-squared statistic, which is 0% when all included trials provide a similar result and 100% when studies show completely contradictory results, meaning the combined result likely lacks any real meaning. A rule of thumb for interpreting the I-squared statistic is as follows (a short worked sketch follows the list):

  • 0–30%: Low level of statistical heterogeneity

  • 30–50%: Moderate level of statistical heterogeneity

  • 50–100%: Substantial level of statistical heterogeneity
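Continuing with the invented numbers from the pooling sketch above, the following self-contained snippet computes Cochran's Q and the I-squared statistic from the pooled estimate.

```python
# Illustrative only: Cochran's Q and I-squared for three invented studies.
effects = [0.35, 0.52, 0.21]             # log odds ratios
ses = [0.20, 0.25, 0.15]                 # standard errors
weights = [1 / se ** 2 for se in ses]    # inverse-variance weights
pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)

q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))  # Cochran's Q
df = len(effects) - 1
i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

# Here Q is about 1.2 on 2 degrees of freedom, so I-squared is 0%: the
# variation between these studies is compatible with chance alone.
print(f"Q = {q:.2f}, I-squared = {i_squared:.0f}%")
```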

Randomised controlled trials

Randomised controlled trials are primary research studies that test the effect of an intervention using experimental methods. After participants are recruited, they are allocated to treatment groups at random, where the choice of treatment is not determined by the patient, clinician or any other person. Removing choice of treatment reduces the risk of selection bias, which is often the largest source of bias in medical research. Selection bias simply refers to how doctors and patients select treatments based upon certain patient characteristics.

The controlled part of the name refers to two integral components of high-quality trials. The first is that there is a comparable control group with which to compare the intervention of interest. The second is that the trial is conducted in strict accordance with a pre-defined study protocol.

Study protocols are important and are essentially a detailed instruction manual outlining how a piece of research will be conducted. The main advantages of a study protocol are that it ensures investigators stick to a pre-planned analysis of their data and in the case of multicentre research, promotes standardisation across centres. Study protocols are not unique to randomised controlled trials. It is highly recommended that all clinical research (including systematic reviews and cohort studies) should be conducted according to a pre-defined protocol. Many journals now stipulate this as a requirement for even considering a study for publication.

Randomised controlled trials can be grouped into two categories:

Explanatory trials

An explanatory trial is designed to explain precisely how an intervention may work, in other words, it is designed to elicit mechanisms. To do this, an explanatory trial will find a highly homogenous group of patients to test a hypothesis on. The use of a placebo is common as investigators will attempt to control for as many factors as possible.

Pragmatic trials

A pragmatic trial is designed to encompass as much variation in clinical practice as possible. The population of these studies is often highly heterogeneous to make the trial generalisable to the true population the intervention will be used upon in clinical practice. Often, instead of placebos, pragmatic trials compare interventions to the current standard of care. Pragmatic trials are usually large and conducted across multiple centres. These trials are intended to determine whether an intervention works in 'real-life' healthcare systems rather than in a highly controlled environment.

Monitoring clinical trials

Research studies should ideally capture measures of both effectiveness and safety. A new treatment that is very effective but unsafe may do more harm than good. The drug thalidomide is an example of why safety monitoring is crucial. Thalidomide was found to be effective in relieving morning sickness in pregnancy and became very popular. However, poor testing and disregard of emerging reports of teratogenicity led to thousands of children being born with severe limb malformations.

Well-conducted clinical trials may identify adverse effects prior to wider uptake of a treatment. However, if an adverse effect is rare, it may only be identified after large numbers of patients have received the treatment.

A data monitoring committee (DMC) is a team of experts who are independent of the clinical trial investigation team. The DMC looks periodically at data from the clinical trial as it progresses. If a treatment within a trial looks as if it could be causing harm, the DMC should stop the trial to avoid further harm to participants who might otherwise be enrolled. Similarly, if a treatment is shown to be very effective, more so than first thought, the trial can be stopped, as it would be unethical to give future participants a less effective control treatment.

LEOPARD-2

This randomised trial compared laparoscopic versus open surgery for removal of pancreatic cancer (pancreatoduodenectomy). In other types of surgery, laparoscopic surgery has been shown to reduce complications and shorten the time taken to recover from an operation.

Monitoring of the LEOPARD-2 trial whilst it was ongoing found that 15% of patients who received laparoscopic surgery died within 90 days of surgery compared with none in the open surgery group, prompting discontinuation of the trial.

Randomisation

Randomisation allocates patients to treatment groups without the clinician, the patient or any other person choosing the treatment.

If done correctly, randomisation ensures treatment groups are balanced for characteristics and confounding factors that may influence outcomes, whether observed (data are collected on them) or unobserved (no data are collected, or the factor is not yet discovered). This enables fair comparisons to be drawn between treatments. Randomisation may be performed in two main ways:

Simple randomisation

Simple randomisation methods involve allocating participants to treatment groups using techniques such as coin flipping or rolling a die. These methods do not use any pre-planned methods of allocation. The advantages of these techniques are that they are easy to use and do not require any specialist equipment or planning. The disadvantage is that these methods cannot account for more complex allocation requirements, such as multiple groups or ensuring other factors are accounted for (see ‘minimisation’ below). With a large enough sample size, simple randomisation methods should result in approximately equally sized groups. However, for small trials, simple randomisation methods can lead to unequally sized treatment groups.
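A minimal sketch of simple randomisation in Python follows; it is illustrative only, as real trials use validated randomisation systems.

```python
# Simple randomisation: an independent 'coin flip' per participant.
# With small samples, the two groups are often noticeably unequal.
import random

def simple_randomise(n_participants: int) -> list[str]:
    return [random.choice(["treatment", "control"])
            for _ in range(n_participants)]

allocations = simple_randomise(20)
print(allocations.count("treatment"), "treatment,",
      allocations.count("control"), "control")  # frequently not 10 vs 10
```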

Block randomisation

Block randomisation describes the process of randomising patients in small 'blocks', within which equal numbers are allocated to each treatment group; blocks may also be combined with stratification by a specific characteristic (e.g. sex or age). Using blocks keeps group sizes balanced as recruitment proceeds. This is of particular advantage in smaller trials, where unbalanced or unequal groups cause greater problems than in trials with large sample sizes.
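A minimal sketch of permuted-block randomisation with a block size of four; again, this is illustrative only.

```python
# Permuted-block randomisation: within each block of four consecutive
# participants, exactly two are allocated to each arm, so group sizes
# never differ by more than two at any point during recruitment.
import random

def block_randomise(n_blocks: int, block_size: int = 4) -> list[str]:
    allocations = []
    for _ in range(n_blocks):
        block = (["treatment"] * (block_size // 2)
                 + ["control"] * (block_size // 2))
        random.shuffle(block)        # random order within each block
        allocations.extend(block)
    return allocations

print(block_randomise(3))  # 12 allocations, always 6 per arm
```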

A good example of where selection bias may arise is the use of non-steroidal anti-inflammatory drugs (NSAIDs) after surgery. In observational research, NSAID use has been found to be associated with fewer surgical complications. However, in these studies, where clinicians or patients chose whether to take NSAIDs, the patients given NSAIDs tended to be fitter and healthier. The observation that patients taking NSAIDs are less likely to suffer complications is therefore confounded. In a randomised trial, both treatment groups would have similar baseline characteristics due to the randomisation process, which would reduce or eliminate this confounding.

Randomisation may be stratified, which is a technique that first divides patients into discrete risk groups, or strata, and then randomises a patient within the given stratum to a treatment. This is a straightforward means of ensuring treatment groups are balanced for factors that may result in an increased likelihood of a given outcome occurring. For example, patients with diabetes are more likely to suffer wound infections following surgery. A trial looking at methods to prevent wound infections could use stratified randomisation to first divide patients into ‘diabetic’ and ‘non-diabetic’ strata and then randomise these patients to the different treatment arms. This would therefore result in both treatment groups containing a similar number of diabetic patients, thus enabling a fairer comparison.
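The diabetes example might look like the following sketch, which keeps a separate permuted-block schedule for each stratum; the stratum names are taken from the example above and the code is illustrative only.

```python
# Stratified randomisation: a separate block schedule per stratum, so
# both arms end up with similar numbers of diabetic patients.
import random

def make_block(block_size: int = 4) -> list[str]:
    block = (["treatment"] * (block_size // 2)
             + ["control"] * (block_size // 2))
    random.shuffle(block)
    return block

schedules: dict[str, list[str]] = {"diabetic": [], "non-diabetic": []}

def allocate(stratum: str) -> str:
    if not schedules[stratum]:       # replenish this stratum's schedule
        schedules[stratum] = make_block()
    return schedules[stratum].pop()

for patient in ["diabetic", "non-diabetic", "diabetic", "diabetic"]:
    print(patient, "->", allocate(patient))
```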

An extension of stratified randomisation is minimisation. Using computer programmes, randomisation procedures can be designed that take several different patient characteristics into consideration simultaneously, allocating each new patient to the arm that minimises overall imbalance. This ensures treatment groups are better balanced.
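A much-simplified sketch of minimisation, in the spirit of the Pocock and Simon method, is shown below. It is deterministic for clarity; practical implementations add a random element so that allocations are not fully predictable.

```python
# Minimisation: each new patient joins whichever arm would leave the
# prognostic factors most evenly balanced overall.
from collections import defaultdict

counts = {arm: defaultdict(int) for arm in ("treatment", "control")}

def minimise(patient: dict[str, str]) -> str:
    # Imbalance score per arm: how many existing patients in that arm
    # share each of this patient's factor levels.
    scores = {arm: sum(counts[arm][(factor, level)]
                       for factor, level in patient.items())
              for arm in counts}
    arm = min(scores, key=scores.get)    # ties go to the first arm listed
    for factor, level in patient.items():
        counts[arm][(factor, level)] += 1
    return arm

print(minimise({"diabetes": "yes", "sex": "female"}))  # tie -> 'treatment'
print(minimise({"diabetes": "yes", "sex": "male"}))    # balances -> 'control'
```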

Blinding

Blinding is the act of disguising which treatment a participant in a research study has received. Blinding in surgical studies often proves to be difficult, and requires some flair or creativity to identify a suitable strategy for implementing blinding successfully.

There are four main parties which can be blinded to an intervention or an outcome:

  • Patients

  • Those administering the treatment, i.e. the clinical team

  • Those who are measuring the effect of the treatment, i.e. observers

  • Those performing the data analysis, i.e. the statistician

In the past, where one of these groups was blinded the study was termed ‘single-blind’, where two groups were blind ‘double-blind’ etc. Nowadays it is best practice to specify explicitly who was blinded and what exactly was done rather than use terminology that may be unclear without qualification.

In studies where drug administration is involved, participants and clinicians can be blinded by using placebos (i.e. 'dummy pills'). Where intravenous drugs are used, normal saline can be given in place of the drug, and coloured intravenous fluids may be masked with black tubing. If the groups receive different surgical procedures altogether, it can be difficult or impossible to blind patients, and it is impossible to blind the surgeon to the procedure. There are, however, means to address this.

Observers are often used in surgical studies where blinding patients or clinical teams is impossible. These observers measure outcomes but are independent from the clinical team and are blinded to which intervention the patient received.

In all these cases, blinding enables the person who is assessing the outcome to judge it fairly. If the person assessing the study outcome did know which treatment the patient received, they might have preconceptions and judge the outcome in a biased way to favour one treatment over the other.

Finally, it is important to ask patients at the end of the study which treatment group they believe they were allocated to. By asking every patient this question, it enables the adequacy of the blinding to be tested.

Open-label studies

An open-label study is one where both the researchers and the patients are aware of which treatment is being received. Such studies can be randomised or non-randomised and may even omit a control group.

FIDELITY

In Finland, Sihvonen et al. compared arthroscopic surgery versus no surgery for degenerative meniscal tear. The intervention group received arthroscopic surgery and the control group received sham surgery. In sham surgery, the patient is exposed to an environment in which they believe they are undergoing surgery, but no operation is actually performed. In this trial, participants were taken into the operating theatre, draped and shown a video of arthroscopic surgery. Dressings were then placed to make it difficult to identify whether an operation had been performed.

Sham surgery is ethically contentious, particularly with regards to procedures that must be conducted under general anaesthesia due to the associated risks of harm.

Cluster randomisation

Some studies may randomise at a group rather than patient level, e.g. by hospital. This can be useful when investigating healthcare processes, public health interventions or complex treatments. Cluster randomised trials ( Fig. 1.4 ) have specific considerations that must be addressed when designing the study and are often more complex to conduct.

Figure 1.4, Cluster randomisation.

Some studies use methods that rely upon factors that appear to be random to allocate patients to treatment groups. These may include birthday, day of the week or even whether the patient's hospital number is odd or even. Although these factors may appear random, they are not. This is called 'pseudo-randomisation' or 'quasi-randomisation', and studies using this approach should not be classed as randomised trials.

Allocation concealment

Allocation concealment is often confused with blinding. It describes the principle that those determining which treatment a patient receives should not be able to influence the selection process, either consciously or subconsciously. The classical way of addressing this is to use sealed, opaque envelopes to disguise which treatment allocation is contained within. Nowadays, this is more commonly done using a computer.

Phases of clinical trials

To take a new drug or procedure from concept to clinical practice requires several steps. There are five phases of research trials ( Fig. 1.5 ):

Figure 1.5, Phases and stages of clinical trials.

Phase 0 trials

These trials are ‘first-in-man’ studies. Before phase 0 or phase I trials, the compound or procedure has usually only been used in preclinical research (typically in animals or cells). Here, very small quantities of promising new compounds are given to healthy humans in a highly controlled environment. These studies aim to approximate the pharmacokinetic and pharmacodynamic properties of a drug (e.g. half-life, distribution, excretion, absorption, metabolism and potential toxicities).

Phase I trials

Phase I trials aim to formally assess the safety of a treatment. Here a range of doses within the postulated therapeutic range are tested (this is called ‘dose ranging’) in a larger sample and the side effects of these varying doses assessed. Usually, this sample is of healthy volunteers, but can be those with the target condition. From these studies, the dose with the most acceptable side-effect profile is selected.

Phase II trials

In phase II trials, a new treatment is administered to patients with the target condition to determine safety and efficacy. As these trials are designed to elucidate whether a new treatment works or not, they are usually tightly controlled and may compare the new treatment to a placebo.

Phase III trials

This is the final stage of testing prior to a new treatment being released on the open market. Phase III trials are the largest trials, where a new treatment is given to patients with the target condition. Large sample sizes are used in order to further identify side effects and to accurately estimate the therapeutic effect of the new intervention. Some phase III trials are pragmatic rather than explanatory, and seek to confirm that the new treatment will work in the real world.

Phase IV: Post-marketing studies

Once the new treatment has passed national regulatory approvals, it can be sold to healthcare providers. After approval, all drugs undergo monitoring for rare side effects and other harms that may have been missed in the initial studies. This process is called post-marketing surveillance. One well-known approach to post-marketing surveillance in the UK is the ‘Yellow-Card Scheme’, where clinicians can report adverse events that are suspected to have been due to a specific drug.

IDEAL framework

In surgery, the IDEAL framework (Innovation, Development, Exploration, Assessment, Long-term monitoring) describes how innovations in surgery should be approached to enable direct translation to surgical care ( Table 1.2 ). It describes five stages of development which should be used to guide a new idea from conception to widespread clinical implementation.

Table 1.2
The stages of the IDEAL framework

Stage | Description of stage | Type of study
Innovation | Proof of principle, typically done in an animal or a select patient | Case report
Development | Refinement of idea, through practice or laboratory development | Case series
Exploration | Beginning to compare refined idea to current practice to establish feasibility in clinical practice | Randomised clinical trial/prospective study
Assessment | Compares idea to current practice through well-powered study aimed to show clinical efficacy and safety | Randomised clinical trial
Long-term monitoring | Monitors the idea in practice for long-term outcomes, safety and rare events | Database

Cohort studies

In a cohort study, a defined population with a specific condition, or a population undergoing a specific type of surgery, is identified and followed over a defined period. Cohort studies are typically observational research: instead of patients being allocated to treatment groups, patients and doctors are free to choose. Because no direct experimental intervention occurs, the treatment under study is referred to as an 'exposure' rather than an 'intervention' ( Fig. 1.3c ).

As cohort studies simply observe what happens in clinical practice, they may be conducted looking forwards in time (prospective) or backwards in time (retrospective). The advantages and disadvantages of each approach are listed in Box 1.1 .
