40 Writing research

So far, you have learnt about the process of research: asking a RQ, designing a study, collecting data, describing and summarising the data, and analysing the data (confidence intervals; hypothesis tests). In this chapter, you will learn to:

  • write research effectively and clearly.
  • appropriately structure your research writing.

40.1 Introduction

All students in scientific, engineering and health professions need to read the research of others; that's how they stay up-to-date with the discipline. Some students will also need to write about their own research or the research of others. Understanding the language of research is important in either case.

The purpose of writing research is to effectively and clearly communicate. Formal guidelines for writing and reporting research exist for experimental studies (CONSORT) and observational studies (STROBE), though we will not delve into these specifically.

The style and expectations vary widely between disciplines, and between journals (even in the same discipline), so in this chapter we make general comments about writing, rather than give specific requirements.

40.2 General tips

As noted above, the purpose of writing research is to effectively and clearly communicate. In the scientific disciplines, writing carefully and precisely is important: using the correct words appropriately is crucial. Think carefully about every word you use to ensure it conveys the correct and intended meaning. With this in mind, scientific writing:

  • should avoid ambiguity.
  • should use terminology consistently.
  • should use simple, clear but technically-correct language.
  • should present the facts in an unbiased manner.
  • should be clear, concise and complete.
  • should use facts to make statements.
  • should be complete enough that other professionals can repeat the study.
  • should avoid unnecessary repetition.

 

  • should not contain unnecessary words and phrases.
  • should not be haphazard, jumbled or illogical.
  • should not promote personal opinions.
  • should not reach conclusions not supported by the given evidence.
  • should not overstate what has been learnt from the study.

As William Howard Taft (27th president of the United States) is claimed to have said: Don't write so that you can be understood; write so that you can't be misunderstood.

Example 40.1 (Write what you mean) A student project at my university was titled:

Driving behaviours: Are dark-coloured car owners more likely to park undercover?

What they actually studied was:

Driving behaviours: Are drivers of dark-coloured cars more likely to park undercover?

Don't just be understood; avoid being misunderstood!

A series of experimental studies (Oppenheimer 2006) concluded that students often believe that using fancy words makes them appear smarter. However, one conclusion of the research was that using 'fancy' language does not achieve this: 'needless complexity leads to negative evaluations...' (Oppenheimer (2006), p. 151; emphasis added). Always use the best and most appropriate word; only use fancy words if necessary. One recommendation from the study is to (p. 153)

... write clearly and simply if you can, and you'll be more likely to be thought of as intelligent.

40.3 Article structure

Many scientific papers are structured to have these (or similar) sections, though it varies by discipline and journal:

  • Title and authors.
  • Abstract: A summary of the whole paper, without details.
  • Introduction: Why was the study done, and what did the authors hope to achieve?
  • Methods: How was the study done?
  • Results: What was found?
  • Discussion (or Summary, or Conclusions): What does it mean?

Sometimes the Conclusion and Discussion are separate sections. Sometimes the acronym AIMRaD or IMRaD is used to remember these sections. These components capture the six-step research process in this book (Fig. 40.1).

FIGURE 40.1: The connection between the paper and the steps we have studied. The Abstract briefly covers all aspects of the study, and the Discussion may combine elements from all areas also.

40.3.1 Titles

Titles are important: poor titles can discourage a reader from reading an article. A title should clearly describe the main purpose of the article. Titles sometimes pose questions ('Do warning lights and sirens reduce ambulance response times?'; Brown et al. (2000)) or provide answers ('No harm from five year ingestion of oats in coeliac disease'; Janatuinen et al. (2002)).

Example 40.2 (Article title) A good example of a title is:

Beauty sleep: experimental study on the perceived health and attractiveness of sleep deprived people

--- Axelsson et al. (2010)

A poor example of an article title is:

The nucleotide sequence of a 3.2 kb segment of mitochondrial maxicircle DNA from Crithidia fasciculata containing the gene for cytochrome oxidase subunit III, the N-terminal part of the apocytochrome b gene and a possible frameshift gene; further evidence for the use of unusual initiator triplets in trypanosome mitochondria

--- Sloof et al. (1987)

40.3.2 Authors

Ensure that everyone who has made an intellectual contribution is listed as an author. This is ethical practice (Sect. 4), and includes (based on Brand et al. (2015)) those who helped with:

  • conceptualisation.
  • methodology.
  • software.
  • validation.
  • formal analysis.
  • investigation.
  • resourcing.
  • data curation.
  • image creation or photography.
  • writing, including writing drafts, reviewing and editing.
  • visualization.
  • supervision.
  • project administration.
  • funding acquisition.

40.3.3 Abstract

The Abstract is a short section at the start of an article summarising the whole paper, including the results; it is not an introduction! The Abstract is often the most important part of any article, as it is the only part that many people will read. Some (but not all) journals require a structured abstract, where the Abstract contains specific headings; these are usually much easier for a reader to follow.

The Standards for Reporting Diagnostic Accuracy (STARD) statement (Cohen et al. 2017) lists essential items for Abstracts (slightly adapted):

  • Background and Objectives: List the study objectives (the RQ).
  • Methods: Describe:
    • The process of data collection;
    • The type of study;
    • The inclusion and exclusion criteria for individuals;
    • The settings in which the data were collected;
    • The sampling method (e.g., random or convenience sample);
    • The tools or methods used to collect the data.
  • Results: Provide:
    • The number of individuals in all groups included in the analysis;
    • Measures of the precision of the estimates (e.g., confidence intervals);
    • Results of analysis (e.g., hypothesis tests).
  • Discussion: Provide:
    • A general interpretation of the results;
    • Implications for practice, including the intended use of the index test;
    • Limitations of the study.

These loosely align with the six steps of research used in this book.

Example 40.3 (Structured abstract) A research study examined long-term mortality after amputation (Singh and Prasad 2016). The structured Abstract, edited for brevity, is given below (p. 545):

Background: Mortality after amputation is known to be extremely high and is associated with a number of patient features. We wished to calculate this mortality after first-time lower-limb amputation and investigate whether any population or treatment factors are associated with worse mortality.

Objective: To follow up individuals after lower limb amputation and ascertain the mortality rate as well as population or treatment features associated with mortality.

Study design: A prospective cohort study [i.e., a study with forward direction]

Methods: Prospective lower-limb amputations over 1 year (\(N = 105\)) at a Regional Rehabilitation Centre were followed up for \(3\) years.

Results: After \(3\) years, \(35\) individuals in the cohort had died, representing a mortality of \(33\)%. On initial univariate analysis, those who died were more likely to have diabetes mellitus (\(\chi^2 = 7.16\), \(\text{df} = 1\), \(p = 0.007\)) [...] Diabetes (odds ratio\({} = 3.04\), confidence interval \({} = 1.25\)--\(7.40\), \(p = 0.014\))...

Conclusion: Mortality after amputation is extremely high and is increased in individuals with diabetes...

40.3.4 Introduction

The purpose of the Introduction is to:

  • gain the interest of readers, and encourage them to read more of the article.
  • establish the context and background.
  • define the language and definitions used in the study.
  • introduce the theoretical groundwork of the subject.
  • state the purpose of the paper: Why it was written, and what the authors hope to learn.
  • show how the research fills a gap in existing knowledge.

The introduction often includes a literature review too, though sometimes a literature review is a separate section.

40.3.5 Methods

The Methods section (sometimes called Materials and Methods) explains how the data were obtained; for example:

  • how the sample was identified and located.
  • how the data were collected (the data collection protocol).
  • how the data were analysed, including the software (and version number) used, and the statistical methods used.
  • what specialized equipment was used (don't list pencils, rulers, paper, etc.!).

40.3.6 Results

The Results summarise the conclusions from the analysis, especially regarding the initial RQ. The Results section:

  • shows all the relevant findings from the research.
  • presents a summary of the data: the number of observations, the number of missing values, and a verbal description of all variables.
  • presents tabular, numerical and/or graphical summaries of the data and relationships of importance.
  • gives a brief verbal interpretation of these summaries.
  • gives the results from any hypothesis tests and confidence intervals.
  • identifies trends, consistencies, anomalies, etc.
  • does not interpret or explain the results (that is the purpose of the Discussion).

Unless the dataset is small, the data itself is usually not given (though may appear in an Appendix or online).

Cutting-and-pasting software output into reports is rarely acceptable, except for graphs.

40.3.7 Discussion

Sometimes, articles have separate Conclusion and Discussion sections; sometimes they are combined. No new information should be presented in this section. The Discussion section:

  • summarises the results.
  • gives a short evaluation of the results.
  • answers the stated RQ.
  • discusses limitations (Sect. 9), strengths, weaknesses, problems, challenges.
  • tries to anticipate and respond to potential questions about the research.

Readers should be able to reach the same conclusions from the evidence presented.

40.3.8 Other sections

Most research articles have other sections too.

The References (or Bibliography) section gives the full citations of any work referenced, in the required format (such as APA, Harvard, etc.). This is ethical practice. Most journals have strict guidelines for how references must be listed and formatted.

An Acknowledgements section thanks and acknowledges other contributions, such as people who legitimately contributed to the manuscript, and research funding bodies. Avoid saying "The authors would like to thank..." and instead directly thank them: "We thank...".

Often an Appendix is included, which contains important material that would otherwise break the flow of the article's narrative. The Appendix may include large tables, images and discussions of technical details. Sometimes, this material is placed online.

40.4 Further advice

40.4.1 Constructing tables, graphs and images

Good figures and tables take time and care to prepare. Their purpose should always be to display the data in the simplest, clearest possible way, and to highlight the important information of interest. In general, tables and graphs:

  • should be discussed (not just presented) in the text.
  • should be clear and uncluttered.
  • should include units of measurement (such as kg) where appropriate.
  • should be able to be understood without reference to the paper, as far as possible.
  • should use easy-to-read fonts and colours: for example, ensure the font size is sufficiently large when placed in the article.
  • should avoid using different colours, line types or fonts unless these have a purpose (i.e., to differentiate between groups in the study), and that purpose is explained.
  • should not include chart junk (such as artificial third dimensions for graphs (Sect. 18.2), and unnecessary lines in tables).

Figures and images typically have captions below, while tables typically have captions above. The source of all images (e.g., the photographer) should be acknowledged, when appropriate.

40.4.2 Presenting numbers

  • When presenting numbers, ensure that all figures are rounded appropriately. Software may report more decimal places than necessary, for instance.
  • If appropriate, ensure units of measurement are given.
  • Be consistent and careful with decimal numbers. Some journals require numbers to be written with a leading zero, and some do not. For example, some require writing \(P = 0.024\) and some \(P = .024\).

Output from software may have to be sensibly rounded before being included in a report (including in tables and graphs).
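The advice above about leading zeros and rounding can be sketched in a few lines of Python. This is only an illustration: the function name `format_p`, its parameters, and the choice of three decimal places are assumptions made for this sketch, not requirements of any journal. Very small values are reported as an inequality, since software output such as 0.000 does not mean the P-value is exactly zero.

```python
def format_p(p, leading_zero=True, places=3):
    """Format a P-value for a report (hypothetical helper).

    Values smaller than 10**-places are reported as an inequality
    rather than as a string of zeros.
    """
    if not 0 <= p <= 1:
        raise ValueError("a P-value must lie between 0 and 1")
    smallest = 10 ** -places  # smallest value shown exactly, e.g. 0.001
    if p < smallest:
        text = f"P < {smallest:.{places}f}"
    else:
        text = f"P = {p:.{places}f}"
    if not leading_zero:
        # Some journals omit the leading zero: 0.024 becomes .024
        text = text.replace("0.", ".", 1)
    return text


print(format_p(0.024))                      # style with a leading zero
print(format_p(0.024, leading_zero=False))  # style without a leading zero
print(format_p(0.0000312))                  # very small value from software
```

Whichever style the target journal requires, applying one helper like this throughout a report keeps the presentation consistent.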

40.4.3 Lexically ambiguous words

Avoiding the possibility of readers misinterpreting your writing is important, so write carefully and precisely. One potential source of confusion is words with a different meaning in research compared to everyday use (lexical ambiguity; Richardson, Dunn, and Hutchins (2013); P. K. Dunn et al. (2016)). If you are unsure of the definitions used in this book, use the Glossary (Appendix D). Some specific words where care is needed include:

  • Average: In research, 'average' refers to any way of measuring the typical value (Sect. 12.5) including the mean and the median, but also other measures too. Use the specific word 'mean' or 'median' if that is what you intend!
  • Confidence: In research, 'confidence' is usually used in the phrase 'confidence interval' (Sect. 26.3).
  • Comparison: In research, distinguishing between a 'within-individuals' and 'between individuals' comparison is important (Sect. 2.5).
  • Control: In research, a 'control' refers to a specific situation, and is helpful for maximising internal validity (Def. 2.8).
  • Correlation: In research, correlation describes the relationship between two quantitative variables (Sect. 17.4.1).
  • Estimate: In research, 'estimating' usually means to calculate a sample estimate for an unknown population parameter. In general use, 'estimate' often means to take a guess.
  • Experiment: In research, an experiment is a specific type of research study (Sect. 3.5). Use the word 'study' to talk about experimental and observational studies more generally.
  • Graph: In research, a 'graph' is used to summarise data (Sect. 18.2).
  • Independent: This word has many uses in statistics and research, in science, and in general use. The word 'independent' in this book refers to events that do not impact each other in a probabilistic sense (Sect. 19.7).
  • Intervention: In research, an 'intervention' (Sect. 2.5) is when the researchers manipulate the comparison.
  • Normal: In research, 'normal' often refers to the 'normal distribution' (Chap. 22.3). If this is not the meaning you intend to convey, consider using the word 'usual'.
  • Odds: In research, 'odds' has a specific meaning (Sect. 13.4.3) and is different from probability. In general use, 'probability' and 'odds' are often used interchangeably.
  • Population: In research, the 'population' refers to a larger group of interest (Sect. 2.2.1). In general use, 'population' usually refers to groups of people.
  • Random: In research, 'random' has a specific meaning. In general usage it often means 'haphazard' or 'without planning'.
  • Regression: In research, 'regression' refers to the mathematical relationship between two quantitative variables (Sect. 39).
  • Sample: In research, we say (for example) that we 'have one sample of \(30\) fungi' (Sect. 5.1); in some disciplines, this could be described as 'taking \(30\) samples'.
  • Significant: This is perhaps the most misused word in scientific writing. In research, 'significance' is usually understood to refer to 'statistical significance' (Sect. 33.6). If this is not the meaning you intend to convey, consider using the word 'substantial'.
  • Variable: In research, a 'variable' is something that can vary from individual to individual (Def. 2.10).
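To see why 'odds' and 'probability' should not be used interchangeably, consider this minimal Python sketch. It assumes the usual definition of odds: the probability that an event happens, divided by the probability that it does not. The function names are illustrative only.

```python
def odds_from_probability(p):
    """Return the odds of an event that has probability p (0 <= p < 1)."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)


def probability_from_odds(odds):
    """Return the probability of an event that has the given odds (odds >= 0)."""
    if odds < 0:
        raise ValueError("odds cannot be negative")
    return odds / (1 + odds)


print(odds_from_probability(0.75))  # odds for a probability of 0.75
print(probability_from_odds(3))     # probability for odds of 3
```

A probability of 0.75 corresponds to odds of 3, so writing one word when the other is meant changes the number being reported.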

Some symbols may also have different meanings in research than in some other scientific disciplines; again, care is needed when using these symbols:

  • \(\beta\): In this book, \(\beta\) refers to the regression parameters (Sect. 39.3).
  • \(\rho\): In this book, \(\rho\) refers to the population correlation coefficient (Sect. 17.4.1).
  • \(\pm\): In this book, the symbol \(\pm\) is used for confidence intervals to describe a range of values in which the population parameter probably lies (Sect. 26).

40.5 Style

Different disciplines and journals have their own styles; read articles from your discipline or target journal to see how to write in the required style. Some general style recommendations include:

  • Use short sentences.
  • Write using complete sentences.
  • Use inclusive language (e.g., 'fire-fighter' rather than 'fireman').
  • Omit words, phrases, sentences that add nothing to the paper.
  • Check for commonly confused words: there/their; your/you're; affect/effect; chose/choose; etc.
  • Ensure capitalisation is correct.
  • Use apostrophes (not apostrophe's!) correctly.
  • Crucially, be unambiguous: Say what you mean, and mean what you say.
  • Ensure correct content, grammar, spelling, punctuation, format.
  • Use, but do not rely upon, the spell checker.

Example 40.4 (Short sentences) The first sentence should be accessible and engaging. Here is a very poor first sentence:

Until recently, atypical hemolytic uremic syndrome (aHUS), conventionally defined in the pediatric literature as a syndrome of the triad of renal failure, microangiopathic hemolytic anemia, and thrombocytopenia without a prodrome of hemorrhagic diarrhea, has received little attention in adult practice because the patients are commonly given the diagnosis of thrombotic thrombocytopenic purpura (TTP) or TTP/HUS and treated as TTP with plasma exchange, augmented in refractory cases with rituximab and sometimes even splenectomy.

--- Tsai (2014), p. 187
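Overly long sentences like this one can be flagged mechanically before a draft is submitted. The Python sketch below is one rough way to do so. The function name and the 30-word limit are assumptions made for illustration, not a rule from any journal, and the sentence splitting is naive: abbreviations such as 'et al.' will be treated as the end of a sentence.

```python
import re


def long_sentences(text, max_words=30):
    """Return (word_count, sentence) pairs for sentences longer than max_words."""
    # Naive split: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    flagged = []
    for sentence in sentences:
        n_words = len(sentence.split())
        if n_words > max_words:
            flagged.append((n_words, sentence))
    return flagged


draft = (
    "We measured reading speed. "
    "Each student read the same passage twice, once in each font, in an "
    "order decided by a coin toss, and the time taken was recorded by an "
    "observer who did not know which font was on the page."
)
# A low limit is used here only so that the short demonstration flags something.
for n_words, sentence in long_sentences(draft, max_words=20):
    print(n_words, "words:", sentence)
```

A flagged sentence is not necessarily wrong, but it is a good candidate for splitting into two or more shorter sentences.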

Example 40.5 (Writing carefully) This sentence appeared in a published article (Salimirad and Srimathi (2016), p. 14; emphasis added):

600 teachers, from both Government and Private Schools, have been drowned by random sampling.

This sentence is poor: no-one has ever been drowned by random sampling. Possibly, the authors mean that teachers were 'overwhelmed by participation in many research studies'. However, later the article states: "Using random sampling a total number of \(600\) teachers were selected..." (p. 17), so the initial wording is wrong, and I suspect the sample probably wasn't random either!

40.6 Chapter summary

  • 'Don’t write so that you can be understood; write so that you can't be misunderstood' (attributed to William Howard Taft).
  • Write what you mean; mean what you write (attributed to many).

40.7 Quick review questions

  1. What is the correct word to complete this sentence? 'The subjects were told to eat [______] snacks at about 8am.'
  2. What is the correct word to complete this sentence? 'Seedlings were transplanted [______] pots containing one of three different soils.'
  3. What is the correct word to complete this sentence? 'Each kangaroo was observed for signs that [______] tracking device caused discomfort.'
  4. What is the biggest problem with this sentence? 'We took \(50\) samples of students; the mean age was 26.2 years.'
  5. What is the biggest problem with this text? 'Subjects are not blinded. Because the subjects would clearly know they were in a study.'
  6. What is the biggest problem with this text? 'The sample of pedestrians were all taken on a Thursday.'

40.8 Exercises

Selected answers are available in App. E.

Exercise 40.1 An article (Oyerinde, Bamisaye, and Essien 2019) reports (p. 1):

The regression correlation coefficients of \(0.999996066\) and \(0.999653453\) were obtained for the temperatures and speeds respectively.

What is the problem with this statement?

Exercise 40.2 Consider the NHANES data again. In preparing a paper about this study, suppose Fig. 40.2 and Table 40.1 were produced. Critique these.

TABLE 40.1: A table of results
\(206.6\) \(46\)
\(214.64\) \(48.7945\)
\(8.03\)
\(1.25\) \(14.8\)

FIGURE 40.2: A boxplot

Exercise 40.3 In a student project at the university where I work, the students recorded the reading speed for students reading a portion of text, and compared the reading speed for people reading two different fonts. Their RQ was:

Which font allows [...] students to read a pangram the fastest, between a default and what is considered to be a 'easy to read' font.

In their Abstract, the conclusion was given as:

The Georgia font was the fastest to be read and is therefore the faster of the two.

  1. Explain why this is a poorly-worded RQ. Rewrite the RQ.
  2. Explain what is wrong with the conclusion. Rewrite the statement.

Exercise 40.4 In a student project at the university where I work, the students compared the heights that students could jump vertically, starting from a crouch (squat) or standing (counter movement jump; CMJ) position. Every student in the study performed both jumps. Critique their numerical summary (Table 40.2).

TABLE 40.2: The information showing how much higher the (standing) jump height is compared to the squat jump
\(n\) | Mean | Standard deviation | Standard error | Confidence interval \(95\)% | \(t\) value | \(P\) value
\(50\) | \(7.48\) | \(4.674\) | \(0.661\) | \(6.152\) to \(8.808\) | \(11.316\) | \(0.000\)

Exercise 40.5 In a student project, the aim was 'to determine if the proportion of males and females that use disposable cups on [the university] Campus is the same'. The two variables observed on each person in the study were whether or not the person used a disposable cup, and the sex of the person. In reporting the results in their Abstract, the students state:

Based on the sample results, the \(95\)% confidence interval for the population mean number of disposable cups used by males and females is between \(0.690\) and \(1.625\). Meaning that the population mean is likely to fall between those two intervals.

Critique this statement.

Exercise 40.6 In a student project, the aim was 'to determine if the average hang time is different between two types of paper plane designs'. The two variables in the study were design type (Basic Dart; Hunting Flight), and the hang time of the flight of the plane (in seconds). In reporting the results in their Abstract, the students state:

Very strong evidence proving a difference (\(P = .000\)) between the Basic Dart mean hang time (\(881.84\pm 140.73\) ms) and the Hunting Flight mean hang time (\(1504.19\pm 699.86\) ms). \(95\)% CI for the means of The Basic Dart (\(829.29\) -- \(934.39\)) and the Hunting Flight (\(1242.86\) -- \(1765.52\)).

Critique this statement.

Exercise 40.7 An article (Baur, Christophi, and Kales 2012) includes this in the Abstract:

Cardiovascular disease (CVD) accounts for 45% of on-duty fatalities among firefighters, occurring primarily in firefighters with excess CVD risk factors in patterns resembling the metabolic syndrome (MetSyn). Additionally, firefighters have a high prevalence of obesity and sedentary behavior suggesting that MetSyn is also common. Therefore we assessed the prevalence of MetSyn in firefighters and its association with cardiorespiratory fitness (CRF) in a cross-sectional study of 957 male career firefighters.

--- Baur, Christophi, and Kales (2012), p. 2331

  1. Critique Table 40.3.
  2. Critique Fig. 40.3. What would be a better graph to use?

TABLE 40.3: The OR and \(95\)% CI of MetSyn as a function of increasing METS and age (continuous) model 1: unadjusted, model 2 adjusted for age or cardiorespiratory fitness (CRF) (METS)
        | OR (\(95\)% CI) | \(p\)-value
Model 1 | |
CRF     | \(0.691\) (\(0.634\)--\(0.752\)) | \(<0.0001\)
Age     | \(1.037\) (\(1.020\)--\(1.055\)) | \(<0.0001\)
Model 2 | |
CRF     | \(0.693\) (\(0.630\)--\(0.762\)) | \(<0.0001\)
Age     | \(1.002\) (\(0.982\)--\(1.021\)) | \(0.8713\)

FIGURE 40.3: A graph like that in the Baur et al. paper

Exercise 40.8 A study (Baughman, Sparkman, and Lower 2007) gave this information (p. 208):

The aim of our study was to determine the range of 6MWD [6-minute walk distance] in an unselected group of sarcoidosis patients. We performed a prospective study [i.e., forward direction] of sarcoidosis patients followed up in one tertiary sarcoidosis clinic.

Critique the graph in Fig. 40.4 which appears in the paper.

FIGURE 40.4: A graph like that from Baughman (2007)