
Beyond health system contact: measuring and validating quality of childbirth care indicators in primary level facilities of northern Ethiopia



Measurement of quality of health care has been largely overlooked and continues to be a major health system bottleneck in monitoring performance and quality to evaluate progress against defined targets for better decision making. Hence, metrics of maternity care are needed to advance from health service contact alone to content of care. We assessed the accuracy of indicators that describe the quality of basic care for childbirth functions both at the individual level as well as at the population level in Northern Ethiopia.


A validation study was conducted by comparing women’s self-reported coverage of maternal and newborn health interventions during intra-partum and immediate postpartum care received in primary level care facilities of Northern Ethiopia against a gold standard of direct observation by a trained third party (n = 478). For each indicator, we calculated sensitivity and specificity, assessed individual-level reporting accuracy using the area under the receiver operating characteristic curve (AUC), and estimated population-level accuracy using the inflation factor (IF).


A total of 455 (97.5%) women completed the survey describing health interventions. Thirty-two (43.2%) of the 93 basic quality childbirth care indicators that were assessed could be accurately measured at both the facility and population level (AUC > 0.60 and 0.75 < IF < 1.25). Valid indicators included whether women and their companion were greeted respectfully, whether an HIV test was offered, and whether severe bleeding (hemorrhage) was experienced by the woman. An additional 21 (28.4%) indicators were accurate at the facility or individual level but were under- or overestimated at the population level. Thirteen other indicators were accurate at the population level only. Eight (8.6%) indicators did not meet either of the validity criteria.


Women were able to accurately report on several indicators of quality for basic childbirth care. The few indicators that required technical understanding tended to have more “don’t know” responses from the women. Therefore, valid indicators should be included as a potential measurement of quality for the childbirth care process to ensure that essential interventions are delivered.


Plain text summary

What is already known on this topic? As facility deliveries increase and the global community pays greater attention to service quality, observation as a clinical quality assessment tool may be a valuable measure. However, the existing observation-based measures are lengthy, introduce the possibility for measurement error and are difficult to administer.

What does this study add? Our finding revealed that women were able to accurately report on several (n = 32) of the 93 basic quality child birth care indicators across phases of labor and delivery. A few of them were: whether women and their companion were greeted respectfully, whether an HIV test was offered, and whether severe bleeding (hemorrhage) was experienced by the woman. An additional 21 indicators met the facility-level accuracy, 13 met the population level accuracy and 8 indicators did not meet either of the criteria.

Indicators that met both validity criteria (AUC and IF) could be appropriate for the measurement of quality for care at the facility and population level.

Indicators that do not meet criteria for the AUC but do meet the IF criterion may be suitable to measure intervention coverage at the population level as false positive and negative reporting balance each other at the aggregate level.

Moreover, indicators that meet the AUC criterion but not the IF criterion may be useful for facility-level measurement or individual-level classification but may be over-reported at the population level. Indicators that met neither criterion are not necessarily invalid for all measurement purposes, but should be used in accordance with the rationale for their use.

What are the implications for practice and further research?

These accurate indicators are very important for other researchers as a potential measurement of quality of routine child birth care signal functions in the Ethiopian setting. Despite this, there have been few efforts to develop standardized metrics of quality for both mothers and newborns throughout the continuum of maternity.

Therefore, a further nationwide survey should be conducted to facilitate movement toward a collection of fewer but better metrics of quality (content) of care indicators across the phases of labor and delivery services.


The time encompassing labor, delivery and the first 24 h after birth is the highest risk period for maternal and neonatal health. Adhering to the standards of maternity care services is essential to ensure quality services are delivered [1, 2]. In low and middle-income country settings, where the vast majority of maternal and newborn deaths occur, data on the coverage estimates of routine facility childbirth interventions often rely on the contact of skilled birth attendant for monitoring purposes [3, 4]. But the presence of a skilled birth attendant does not guarantee the actual content of care [5, 6]. Research findings indicate that measuring interventions that a woman actually receives is more informative than measuring contact with care providers, provided the women can accurately report this information [4].

Measurement of the quality of processes for basic care of childbirth interventions is complex and requires attention to empirical validation of the indicators to strengthen the quality of measurements [7]. Studies have documented the poor quality and limited sensitivity of obstetric facility records and databases for assessing the performance of care processes in both low and high-resource settings [8].

Health monitoring systems that can provide accurate data on population coverage are absent, and demographic and health survey programs collect inadequate information on the content of care received during facility childbirth. Additionally, several researchers have highlighted discrepancies between contact with care providers and receiving quality care [9, 10]. This research notes that a number of composite measures or checklists have been developed through expert opinion, but few have been validated, and suggests that empirical validation is important in strengthening quality measures [11]. While direct observation of clinical care is considered the gold standard for measuring the quality of care, the existing observation-based measures of the childbirth process of care are lengthy, at times including hundreds of indicators [12]. This complexity introduces the possibility of measurement error, makes the measures difficult and costly to administer, and limits their feasibility for routine use in most resource-poor settings [13, 14]. In addition, we were unable to identify a published study on the validity of quality indicators of basic childbirth care interventions in Ethiopian settings.

Therefore, it is important to identify alternate indicators that describe the actual basic content of child birth care that can be reported accurately to be included in facility or routine data collection programs and population based-surveys. The aim of this study is to determine which aspects of basic child birth process of care indicators are able to differentiate between the two sets of measures (women’s self-reports compared against third-party observations), to explore whether women can appropriately report on these indicators and provide suggestions for modifications to data collection procedures that could advance the measurement of maternity care. This investigation also provides an opportunity to apply the results of this study for tracking progress and to enhance the monitoring of effective coverage of essential and basic interventions for both mothers and newborns.


Study setting and population

A facility-based cross-sectional validation study was conducted among primary health care facilities of the South Eastern zone of Tigray, Northern Ethiopia. The zone has four rural districts, namely Degua Tembien, Enderta, Saharti Samre and Hintalo Wajrat. In the districts, there are a total of 4 primary hospitals and 27 health centers [15]. At the time of the study, nearly a quarter of the population (23.4%) was of reproductive age (15–49 years). According to the 2016 Ethiopian Demographic Health Survey, 26% of mothers nationally and 57% in the Tigray region delivered in a health facility [16]. Mothers of reproductive age who received labor and delivery care in primary health care facilities were the source population.

Indicator selection

To identify indicators to be validated, reviews and scans of published and grey literature focused on indicators of the content of care received during facility child birth; this review was conducted between April and June 2018. Indicators were identified by a key term search of maternal health, safe motherhood, quality of care, indicator, valid, skilled birth attendant, obstetric, and intra-partum care. After collecting a list of 112 indicators, a group of reproductive health experts identified 93 key dimensions or set of indicators for validity testing [Additional file 1].

The validation indicators were selected based on the frequency of use and/or potential to assess the essential elements of mothers and newborns. The final indicators were placed into one of four categories: (1) respectful maternity care; (2) content of care; (3) non-indicated obstetrics care; and (4) maternal-neonatal outcome sections.

Sample size

Buderer’s formula was used to calculate the sample size; it is intended for diagnostic accuracy studies and gives the sample size needed to reach a required absolute precision for sensitivity and specificity [17, 18]. The calculation assumed that 35.7% of mothers received essential care practices at childbirth, based on a prior study in India [19], a type 1 error of α = 0.05, a sensitivity of 80%, a specificity of 60%, a precision of 6% and a 5% non-response rate. Under these conditions, the target sample size was 478 laboring women to be observed.
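The sample size reasoning above can be sketched as follows. The text does not write out the formula, so this assumes a commonly cited form of Buderer's calculation, in which the number of subjects needed for sensitivity (or specificity) at a given absolute precision is divided by the expected prevalence (or its complement). The function name and the hard-coded z value of 1.96 are illustrative choices, not taken from the study. How rounding and the 5% non-response allowance were applied is not described in the text, so both are left out here.

```python
# Sketch of Buderer-style sample sizes for a diagnostic accuracy study.
# ASSUMPTION: the formula below is a commonly cited form, not quoted from the paper.

Z_95 = 1.96  # two-sided standard normal quantile for alpha = 0.05


def buderer_sample_sizes(sens, spec, prevalence, precision, z=Z_95):
    """Return (n needed for sensitivity, n needed for specificity), unrounded."""
    n_sens = z ** 2 * sens * (1 - sens) / (precision ** 2 * prevalence)
    n_spec = z ** 2 * spec * (1 - spec) / (precision ** 2 * (1 - prevalence))
    return n_sens, n_spec


# Inputs as reported in the text: Se = 80%, Sp = 60%, prevalence = 35.7%, precision = 6%.
n_sens, n_spec = buderer_sample_sizes(sens=0.80, spec=0.60, prevalence=0.357, precision=0.06)
print(f"sensitivity arm: {n_sens:.1f}, specificity arm: {n_spec:.1f}")
```

With these inputs the sensitivity arm is the larger of the two and comes to roughly 478, which is in line with the stated target.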

Sampling technique and participant recruitment

The South-Eastern zone of Tigray region was selected purposively. All health centers with their respective catchment primary hospitals were included. The total sample of delivering women was distributed across the health facilities in proportion to the average number of deliveries per facility per month, and all skilled birth attendants who consented to participate in the study were enrolled. Finally, a consecutive sampling technique was used in which every laboring mother meeting the inclusion criterion (normal first stage of labor) was selected until the required sample size was achieved. Each skilled birth attendant was observed 3–5 times.

Data collection procedure

Data collection was conducted between July 15, 2018 and October 5, 2018. Twelve pairs of midwives, health officers and nurses worked as data collection teams, with one observer and one interviewer for each facility. Data collectors had previous research experience and were trained for four days. The teams worked in two shifts (day and night). Providers were observed by a third party consisting of trained data collectors using a structured checklist. The indicator matrix or structured checklist [Additional file 2] was developed from the Ethiopian basic emergency obstetrics guidelines [20] and published literature [4, 5, 9]. The interview questionnaires were translated into the local language, Tigrigna, and underwent minor modifications to improve local understanding and clarity for participants. For a few of the technical questions, special emphasis was given to wording them in local expressions that mothers could easily understand. The method of observation was nonintrusive: the health care providers (HCPs) did what they normally do without being interrupted or disturbed by the observer. Observations were used as the reference standard as they reflected all facets of care, including all interactions between the women and the providers. Data were then collected from the delivered mother through an exit interview, using an interviewer-administered questionnaire, at the time of facility discharge. Interviewers and observers were not the same individuals and were external to the study facilities to reduce possible social desirability bias.

Statistical analysis

For each participant, the client exit interview and the observation record were matched using a unique identification code. Responses to the basic intervention questions were coded one for “Yes”, zero for “No”, and all other responses were coded as “don’t know (DK)”. Unmatched cases, missing and “DK” responses by the woman, and indicators with fewer than five counts per cell were excluded from further validation analysis because they did not fulfill the assumptions of the validity analysis. We assessed two aspects of indicator validity (accuracy of the women’s reports against the observer). The first is accuracy at the individual or facility level, calculated as the area under the receiver operating characteristic curve (AUC); the curve plots sensitivity (i.e., the true positive rate) against 1 − specificity (i.e., the false positive rate).

AUC scores range 0 to 1, with an AUC of 0.5 representing a random guess and AUC of 1 representing perfect diagnostic accuracy. For the purposes of this study we used an AUC of 0.6 or greater as a priori benchmark of validity testing [9].
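A minimal sketch of the individual-level step described above, using hypothetical matched pairs rather than study data. It assumes that for a single yes/no report the ROC curve has one operating point, so that the trapezoidal area reduces to (sensitivity + specificity) / 2; whether the authors' Stata routine computed the AUC in exactly this way is not stated. The function names and the example counts are illustrative.

```python
# Sketch: sensitivity, specificity and a single-point AUC from matched pairs.
# ASSUMPTIONS: hypothetical data; AUC taken as (Se + Sp) / 2 for a binary report.

def cross_tabulate(pairs):
    """Count TP, FP, FN, TN from (observed, reported) pairs coded 1/0.

    Pairs where either value is None (DK or missing) are excluded,
    mirroring the exclusion rule described in the text.
    """
    tp = fp = fn = tn = 0
    for observed, reported in pairs:
        if observed is None or reported is None:
            continue
        if observed == 1 and reported == 1:
            tp += 1
        elif observed == 0 and reported == 1:
            fp += 1
        elif observed == 1 and reported == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def individual_level_accuracy(tp, fp, fn, tn, min_cell=5):
    """Return sensitivity, specificity and single-point AUC for one indicator."""
    if min(tp, fp, fn, tn) < min_cell:
        raise ValueError("fewer than five counts in a cell; indicator not analysed")
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {"sensitivity": sens, "specificity": spec, "auc": (sens + spec) / 2}


# Hypothetical indicator: 40 TP, 10 FN, 20 FP, 80 TN, plus 3 DK responses.
pairs = [(1, 1)] * 40 + [(1, 0)] * 10 + [(0, 1)] * 20 + [(0, 0)] * 80 + [(1, None)] * 3
result = individual_level_accuracy(*cross_tabulate(pairs))
print(result)
```

In this made-up example the indicator would pass the study's benchmark, because its AUC is above 0.6.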

The second measure of validity is population-level accuracy, that is, the prevalence of the indicator that would be obtained from a population-based survey. It is calculated from the sensitivity and specificity of each indicator and its true prevalence (i.e., observer report) using the following equation: Population based prevalence = true prevalence x (sensitivity + specificity − 1) + (1 − specificity). The inflation factor (IF), or the ratio of the survey-based prevalence to the true prevalence, is estimated to assess the degree to which each indicator would be over- or underestimated at the population level [14]. The a priori validation criterion for the IF was set at 0.75 < IF < 1.25. To summarize indicator validity based on reports from women giving birth, we considered whether an indicator met both the individual-level (AUC) and population-level (IF) criteria (0.60 < AUC and 0.75 < IF < 1.25) [12, 21, 22]. All analyses were performed using Stata Version 14 software [23].
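The population-level calculation and the two-criterion summary can be sketched as below. The prevalence equation is the one given in the text; the function names, the category labels and the example values are illustrative and not from the study. The AUC threshold is treated as "0.6 or greater", following the benchmark stated earlier in this section, and the IF bounds are strict.

```python
# Sketch: survey-based prevalence, inflation factor and validity category.
# ASSUMPTIONS: function names, labels and example inputs are hypothetical.

def survey_prevalence(true_prev, sens, spec):
    """Prevalence a survey would estimate, given the true (observed) prevalence."""
    return true_prev * (sens + spec - 1) + (1 - spec)


def inflation_factor(true_prev, sens, spec):
    """Ratio of the survey-based prevalence to the true prevalence."""
    return survey_prevalence(true_prev, sens, spec) / true_prev


def validity_category(auc, inf_factor, auc_min=0.60, if_low=0.75, if_high=1.25):
    """Classify an indicator by which of the two criteria it meets."""
    meets_auc = auc >= auc_min
    meets_if = if_low < inf_factor < if_high
    if meets_auc and meets_if:
        return "both"
    if meets_auc:
        return "individual-level only"
    if meets_if:
        return "population-level only"
    return "neither"


# Hypothetical indicator: observed prevalence 50%, sensitivity 80%, specificity 80%.
example_if = inflation_factor(true_prev=0.50, sens=0.80, spec=0.80)
print(round(example_if, 2), validity_category(auc=0.80, inf_factor=example_if))
```

An IF above 1 means the survey would overestimate coverage, and an IF below 1 means it would underestimate it.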


Sample descriptive characteristics

Overall, 478 women admitted for labor and delivery consented to participate. Of those who consented, a total of 467 women were enrolled in the study. Among those enrolled, 2.5% (n = 12) were lost to follow-up or discontinued their participation. Finally, a total of 455 observer reports and client exit interviews were accurately matched and analyzed.

Socio-demographic characteristics of participants

The mean age of women was 28 years (SD = 6.38), and ranged between 17 and 45 years. Over a third of women (41%) reported no formal education. Around 95.4% (n = 434) of women received antenatal care provided by skilled health personnel for reasons related to pregnancy at least once during their current pregnancy [Table 1].

Table 1 Percent distribution of women by background characteristics in Northern Ethiopia, N = 455

Validation results for recognized indicators of quality childbirth care signal functions

Based on a woman’s report about her experience of care during childbirth, of the recognized quality care indicators (n = 93) for validity analysis, 14 had a greater than 5% DK response by women and 5 indicators did not fulfill adequate cell size (i.e., at least 5 counts per cell). The latter five indicators included the use of enema, pubic shaving, slapping the newborn, something other than breast milk given to the baby in the first hour of birth and recording the birth weight of the newborn.

Finally, 32 indicators met both validation criteria, 21 met the individual-level criterion only, 13 met the population-level criterion only and 8 did not meet either criterion. Among the indicators with more than 5% “DK” responses, the highest percentage was for the Apgar score indicator (43.5%) and the lowest was for the indicator that the provider palpated the woman’s uterus 15 min following delivery of the placenta (5.93%). All of these indicators were in the content of care category [Table 2].

Table 2 Indicators with greater than 5% “Don’t Know” of women’s response in Northern Ethiopia, (N = 455)

The subsequent findings report validated quality of care during child birth indicators in accordance to: (1) Respectful maternity care, (2) Content of care, (3) Non-indicated care practice and (4) Maternal and newborn outcomes.

Respectful maternity care indicators

Of the 18 indicators that reflected features of respectful maternity care (see Additional file 1 for the list), seven met both acceptable validity criteria. These were: women and their companions were greeted respectfully (AUC = 0.61, 95% CI: 0.56–0.66, IF: 1.24), the provider actively listened (AUC = 0.66, 95% CI: 0.61–0.70, IF: 1.17), the woman was allowed to have a companion of her choice during labor (AUC = 0.61, 95% CI: 0.52–0.67, IF: 1.24), the woman was allowed to ambulate during labor (AUC = 0.66, 95% CI: 0.61–0.70, IF: 1.10), the woman was encouraged to drink or eat during labor (AUC = 0.63, 95% CI: 0.58–0.69, IF: 1.18), privacy was provided during clinical care (AUC = 0.64, 95% CI: 0.58–0.68, IF: 1.21) and the provider spent enough time with the mother (AUC = 0.62, 95% CI: 0.57–0.66, IF: 1.09).

Three respectful maternity care indicators were accurate at the individual or facility level only. These were (from highest AUC to lowest): the provider introduced him or herself to the woman (AUC = 0.68, 95% CI: 0.63–0.72), the provider encouraged the woman to assume different positions during labor (AUC = 0.63, 95% CI: 0.58–0.68) and at least once, the provider explained what will happen during labor (AUC = 0.61, 95% CI: 0.56–0.65). The prevalence reported by the women and by the observers was incongruent for these indicators. For example, 54.29% of the women surveyed reported that the provider introduced himself or herself by name and role, while the observers recorded this in only 23.96% of cases, and specificity was low (Sp = 54.34, 95% CI: 48.92–59.67).

Four respectful maternity care indicators showed population-level accuracy: the provider responded professionally (AUC = 0.58 ± 0.05, IF: 1.08), the provider did not physically abuse the patient (AUC = 0.61 ± 0.06, IF: 1.08), the provider did not abandon the patient without care (AUC = 0.52 ± 0.04, IF: 0.75), and the provider maintained good communication/collaboration (AUC = 0.59 ± 0.05, IF: 1.03). These indicators had a high false positive rate, with high sensitivity (range 85.86–97.26%) and low specificity (range 18.82–40.96%). Four respectful maternity care indicators did not meet either of the validity criteria [Table 3]: women provided oral consent before examination (AUC = 0.58 ± 0.05, IF: 1.33), women were allowed to have a companion during delivery (AUC = 0.50 ± 0.04, IF: 1.26), providers did not verbally abuse their patient (AUC = 0.51 ± 0.04, IF: 1.29) and providers treated clients equally without discrimination (AUC = 0.57 ± 0.05, IF: 1.31).

Table 3 Validation Results on Respectful Maternity Care Quality Indicators in Northern Ethiopia (N = 455)

Content of care indicators

Of the 39 routine content of childbirth care signal function indicators, eighteen met both the individual- and population-level acceptability criteria. For example, 40% of the women reported receiving the HIV test, which closely approximated the true prevalence of 44%. The validity analysis showed that women were able to accurately report whether or not they received an HIV test (SN: 94%, SP: 81%). Furthermore, the indicator of breastfeeding initiated within the first hour of birth did meet both validity criteria. This indicator had high sensitivity (88%) and low specificity (17%), suggesting that while most women who initiated breastfeeding in the first hour correctly reported doing so, more than four out of five women who did not breastfeed in the first hour falsely reported doing so (83%).

Ten indicators met the facility level of accuracy but over reported at the population level. These were: taking a urine sample for a protein test (AUC = 0.71, 95% CI: 0.67–0.75), recording blood pressure at the first post-delivery exam (AUC = 0.66, 95% CI: 0.61–0.70), taking a woman’s temperature at the first post-delivery exam (AUC = 0.74, 95% CI: 0.70–0.78), asking a women if she needed pain relief medication during labor (AUC = 0.75, 95% CI: 0.71–0.79), whether the health care provider discussed self-care and other healthy behaviors (AUC = 0.67, 95% CI: 0.63–0.72), counseling of women on consuming a balanced diet (AUC = 0.64, 95% CI: 0.60–0.69), whether the provider discussed delaying the baby’s bath until 24 h post-birth (AUC = 0.66, 95% CI:0.61–0.70), PNC appointment counseling (AUC = 0.69, 95% CI: 0.65–0.73), a composite indicator of 9 elements of immediate postpartum care counseling (AUC = 0.61, 95% CI: 0.54–0.65) and four basic items of immediate postpartum care counseling (AUC = 0.69, 95% CI: 0.65–0.73).

Nine content of care indicators met the acceptability criterion at the population level only. These were: the provider asked about a woman’s obstetric history (AUC = 0.57, 95% CI: 0.53–0.62, IF: 1.24), a woman’s HIV status was checked (AUC = 0.59, 95% CI: 0.54–0.63, IF: 1.18), an abdominal examination was performed (AUC = 0.55, 95% CI: 0.50–0.59, IF: 1.04), the health care provider wore sterile gloves (AUC = 0.53, 95% CI: 0.48–0.58, IF: 1.14), the newborn was dried and wrapped with a towel (AUC = 0.52, 95% CI: 0.47–0.57, IF: 1.17), a safe and clean environment was provided (AUC = 0.56, 95% CI: 0.51–0.60, IF: 1.06), the baby was weighed (AUC = 0.59, 95% CI: 0.55–0.64, IF: 0.96), the provider discussed perineum care (AUC = 0.58, 95% CI: 0.54–0.63, IF: 1.24) and a composite indicator of 5 essential elements of newborn care (AUC = 0.56, 95% CI: 0.52–0.61, IF: 0.83). Most of these indicators had low specificity, indicating a high false positive rate. For example, only 6.0% of the women correctly reported that the health care provider did not wear sterile gloves during vaginal examination. In contrast, the composite indicator of 5 essential elements of newborn care had low sensitivity (38%) and high specificity (76%), indicating a high false negative rate: 62% of women who received all five elements of newborn care did not report receiving them.

Two content of care indicators did not meet either of the validity criteria: the scale was calibrated and the baby was weighed (AUC = 0.48 ± 0.05, IF: 0.72), and the woman’s vulva was cleansed (AUC = 0.59 ± 0.04, IF: 1.99). For the latter, the estimated population-level prevalence was nearly double the prevalence observed in the facilities where data were collected for this study [Table 4].

Table 4 Validation Results on the Content of Child Birth Care Quality indicators in Northern Ethiopia (N = 455)

Non-indicated childbirth care practice indicators

Of the 8 non-indicated care practices during normal birth, 2 indicators (stretching of the perineum during second stage of labor (AUC = 0.60 ± 0.05, IF: 1.16) and episiotomy performed without indication (AUC = 0.60 ± 0.05, IF: 1.09)) met both validity criteria.

Five indicators met the individual-level accuracy criterion: application of fundal pressure (AUC = 0.76, 95% CI: 0.72–0.80), artificial rupture of membranes (AUC = 0.67, 95% CI: 0.62–0.71), restriction of foods and fluids (AUC = 0.64, 95% CI: 0.61–0.69), digital vaginal examination at intervals of less than four hours (AUC = 0.67, 95% CI: 0.62–0.71) and routine intravenous fluid infusion during labor (AUC = 0.74, 95% CI: 0.69–0.78). One non-indicated care practice indicator, holding the newborn upside down, did not meet either validation criterion (AUC = 0.55 ± 0.05, IF: 3.02) [Table 5].

Table 5 Validation Results for Non indicated Childbirth Care Practice Quality indicators in Northern Ethiopia (N = 455)

Maternal and newborn complications

Maternal complications

Participants were questioned about whether they experienced any of the following conditions either during or immediately following delivery: (1) bleeding, (2) preeclampsia/eclampsia (3) laceration (4) another type of complication (asked to specify), or (5) no complications.

Indicators of women’s report of experiencing any type of complication, hemorrhage, laceration and avoiding delays in received care met both validity criteria. Nearly 19% of women reported experiencing some type of complication, which exceeded the observed prevalence (15%). Self-reports of experiencing any complication had a sensitivity of 45%, indicating that fewer than half of the women who had experienced a complication reported it. The indicator also had high specificity (85%), reflecting a low rate of false positive reports by women. The indicator of avoiding delays in receiving care had a high specificity (91.82, 95% CI: 88.64–94.33) but low sensitivity (32.81, 95% CI: 21.59–45.69). In addition, the complications most commonly reported by mothers were excessive hemorrhage (9.67%), followed by laceration (3.96%).

Three indicators met the individual-level accuracy criterion: preeclampsia/eclampsia, neonatal complication and newborn death within the facility. The indicator of preeclampsia/eclampsia around the time of birth had low sensitivity (38%) and high specificity (95%) and was accurately classified at the individual level (AUC = 0.62, 95% CI: 0.53–0.70). This shows a high false negative rate and an overestimation at the population level (IF = 1.5).

Newborn outcomes

Mothers were asked whether their newborn babies were faced with any of the following complications during birth: (1) birth asphyxia, (2) still birth (3) infection (4) newborn death within the health facility (5) any other type of complication, or (6) no complications.

Only the birth asphyxia indicator of the newborn complication met both validity criteria (AUC = 0.76, 95% CI: 0.68–0.84, IF: 1.19). Women’s reports on birth asphyxia had a sensitivity of 64%, indicating that over one-third of women who had asphyxiated newborns did not report it. However, the indicator had high specificity (95%), reflecting low false positive reports.

The indicator of any neonatal complication met the study validity criteria at the individual level only (AUC = 0.71, 95% CI: 0.68–0.84, IF: 1.39); the IF suggests that the indicator would be overestimated by a factor of 1.39 at the population level. Likewise, the indicator of newborn death in the facility met individual-level accuracy only (AUC = 0.60, 95% CI: 0.45–0.73, IF: 0.47). The IF indicates that the survey-based estimate would be only 0.47 times the true prevalence, that is, the indicator would be underestimated at the population level. The indicator had low sensitivity and high specificity, indicating that not all women whose newborns died at the facility correctly reported it. This might be because mothers were unable to differentiate newborn death from stillbirth.

Only the still birth indicator did not meet either of the validation criteria (AUC = 0.57 ± 0.07, IF = 1.52). Regarding the perinatal death (still birth and neonatal death) indicator, mothers could not differentially report whether the death was a still birth or early newborn death.

Lastly, this study revealed that 17% of women reported that their newborns suffered at least one type of complication, exceeding the observed prevalence (12%) [Table 6].

Table 6 Validation Results on Maternal and Newborn Outcome Quality Indicators in Northern Ethiopia (N = 455)


This study tested the validity of key indicators that measure the quality of care received during the intra-partum and immediate postpartum period. Such indicators, whether currently in use or to be incorporated into household surveys, are needed to move beyond measures of nominal facility utilization for delivery to measures of effective coverage of delivery care. Measures of effective coverage weight utilization estimates by the quality of the services used [24].

Several (n = 32) of the quality indicators across phases of labor and delivery met both validity criteria (AUC and IF). Furthermore, an additional 21 indicators met the individual-level criterion, 13 met the population-level criterion and 8 indicators did not meet either criterion. Indicators that did not meet both criteria are not necessarily invalid for all measurement purposes, but should be used in accordance with the rationale for their use [9, 25]. Indicators that met both validity criteria could be appropriate for the measurement of quality of care at the facility and population level.

Indicators that do not meet criteria for the AUC but do meet the IF criterion may be suitable to measure intervention coverage at the population-level as false positive and negative reporting balance each other at the aggregate level.

Indicators that meet the AUC criterion but not the IF criterion may be useful for facility-level measurement or individual-level classification but may be over-reported at the population level. For example, in our study the composite indicator of immediate postpartum care counseling and the indicator of taking a urine sample for a protein test were accurately measured at the individual level but were not correctly reported by women at the population level, because some of these indicators require an understanding of technical terms that are difficult for mothers to distinguish.

Our study shows varied results for specific indicators when compared with other published findings. For example, for the indicator of women being allowed to have a companion of their choice in labor, most women accurately reported the presence of a companion (i.e., high sensitivity). Our result shows a low specificity for this indicator, which may reflect “facility reporting bias” among women. This finding is not consistent with studies done in Mozambique and Kenya [14, 21], which reported high specificity for this indicator. The discrepancy could be attributable to the cultural context of mothers, literacy level and awareness of the importance of having a companion of choice during labor. In addition, mothers may have under-reported negative experiences at a facility out of concern that providers would withhold proper care at subsequent visits in retaliation for their comments. We also found low sensitivity and high specificity for reported excessive bleeding (hemorrhage); this finding corresponds with levels found among women delivering in Mexico, Indonesia, Benin and the Philippines [9, 26,27,28]. However, these results differ from women’s reporting in Ghana [29]. Taken together, the findings suggest that women’s understanding and recall of the presence of a companion of choice and of obstetric complications experienced may vary by clinical and cultural context and setting. The study results indicate that low-prevalence indicators were challenging to measure accurately. Given that the calculation of the IF depends upon the indicator’s observed prevalence, even a small number of false positive responses can result in overestimation as measured by the IF. A few indicators met the IF criterion only; individual-level misclassification does not inherently signify that measurement at the population level will be inaccurate [21, 30, 31].
Evidence shows that knowledge of whether an indicator is likely to be overestimated can also have significant programmatic implications. For example, where the complication of preeclampsia/eclampsia is over– reported, identifying causes of maternal death at population level may not be as great as expected. When possible, we recommend that users also triangulate self–reported data on quality of care with facility and other data sources. Other core indicators like calibrating the scale and weighing the baby did not meet any of the validity criteria. This suggests that women may not be able to report accurately when and where the measuring scale was placed and calibrated.
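
To illustrate why low-prevalence indicators are prone to overestimation, the following minimal Python sketch computes the IF from sensitivity, specificity and true prevalence. It assumes the relationship commonly used in coverage validation work, in which the survey-based prevalence equals P × (SE + SP − 1) + (1 − SP) and the IF is that value divided by P. The function names and the example values (sensitivity 0.80, specificity 0.95) are illustrative assumptions and are not taken from this study's data.

```python
def survey_prevalence(true_prev, sensitivity, specificity):
    """Prevalence a survey would estimate, given the true prevalence and
    the accuracy of women's self-report (assumed formula, see lead-in)."""
    return true_prev * (sensitivity + specificity - 1.0) + (1.0 - specificity)


def inflation_factor(true_prev, sensitivity, specificity):
    """Ratio of survey-estimated prevalence to true (observed) prevalence."""
    if not 0.0 < true_prev <= 1.0:
        raise ValueError("true_prev must be in the interval (0, 1]")
    return survey_prevalence(true_prev, sensitivity, specificity) / true_prev


if __name__ == "__main__":
    # Hypothetical reporting accuracy, held fixed while prevalence varies.
    sensitivity, specificity = 0.80, 0.95
    for prev in (0.02, 0.20, 0.60):
        value = inflation_factor(prev, sensitivity, specificity)
        print(f"prevalence={prev:.2f}  IF={value:.2f}")
```

Under these assumed values, the IF is about 3.25 at 2% prevalence, 1.00 at 20% and 0.83 at 60%, so the same reporting accuracy satisfies the 0.75 < IF < 1.25 criterion only at the two higher prevalences.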

This study has some limitations. These validation results do not include women indicated for cesarean delivery or mothers who were referred to a higher-level facility with surgical capacity. Furthermore, the tool was administered at discharge (soon after labor) and considers population-level indicators; it may therefore have some deficiencies for household surveys that take place long after delivery.


Women were able to accurately report on several quality of care indicators across the phases of childbirth and immediately after birth. A few technical indicators tended to have more “don’t know” responses.

Although high sensitivity and specificity are preferred for all indicators, knowing the estimated survey-based prevalence is helpful, particularly for indicators of very low prevalence, which are likely to be overestimated without near-perfect specificity. Likewise, in some cases the errors arising from low sensitivity and low specificity cancel out at the population level and may generate acceptable estimates for coverage monitoring purposes, even if the indicators are not appropriate for analysis at the individual or facility level. Therefore, the valid indicators should be included as potential measures of the quality of labor and delivery care processes to ensure that the essential content of care interventions is delivered.
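
As a hedged illustration of how individual-level errors can cancel out at the population level, the sketch below summarizes a 2×2 table of women's reports against direct observation. It assumes that, for a single yes/no indicator, the AUC reduces to the mean of sensitivity and specificity, and it applies the thresholds stated in this study (AUC > 0.60 and 0.75 < IF < 1.25). The function name, argument names and counts are hypothetical.

```python
def validity_summary(tp, fp, fn, tn, auc_min=0.60, if_low=0.75, if_high=1.25):
    """Summarize one binary indicator from a 2x2 validation table.

    tp/fn: observed 'yes' cases reported as yes/no by the woman.
    fp/tn: observed 'no' cases reported as yes/no by the woman.
    """
    observed_yes = tp + fn
    observed_no = fp + tn
    if observed_yes == 0 or observed_no == 0:
        raise ValueError("need at least one observed 'yes' and one observed 'no'")

    sensitivity = tp / observed_yes
    specificity = tn / observed_no
    # Assumption: AUC of a single binary report is the mean of the two.
    auc = (sensitivity + specificity) / 2.0
    # Reported 'yes' divided by observed 'yes' in the validation sample.
    inflation = (tp + fp) / observed_yes

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "auc": auc,
        "inflation_factor": inflation,
        "meets_auc": auc > auc_min,
        "meets_if": if_low < inflation < if_high,
    }


if __name__ == "__main__":
    # Hypothetical counts: false positives equal false negatives.
    result = validity_summary(tp=55, fp=45, fn=45, tn=55)
    for key, value in result.items():
        print(f"{key}: {value}")
```

With these counts the false positives equal the false negatives, so reported and observed prevalence coincide (IF = 1.0) even though the AUC of 0.55 falls below the individual-level threshold.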

Additional work on fewer, better quality-of-care metrics for mothers and newborns, through national survey efforts and tracking of the basic lifesaving interventions received at the time of birth, is warranted to design a range of context-based quality improvement strategies. Lastly, the valid indicators can be tracked and reported through the health management information system or District Health Information System 2 (DHIS2) to support decision making at all levels.

Availability of data and materials

The raw datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.



  1. Kruk ME, Leslie HH, Verguet S, Mbaruku GM, Adanu RMK, Langer A. Quality of basic maternal care functions in health facilities of five African countries: an analysis of national health system surveys. Lancet Glob Health. 2016;4:e845–55.

  2. Amouzou A, Aguirre LC, Khan SM, Sitrin D, Vaz L. Measuring postnatal care contacts for mothers and newborns: An analysis of data from the MICS and DHS surveys. J Glob Health. 2017;7(2):020502.

  3. Marchant T, Tilley-Gyado RD, Tessema T, Singh K, Gautham M, Umar N, et al. Adding content to contacts: measurement of high-quality contact for maternal and newborn health in Ethiopia, Nigeria, and Uttar Pradesh, India. PLoS One. 2015;10(5):e0126840.

  4. Bartlett LA, Tripathi V, Stanton C, Strobino D, Bartlett L. Development and validation of an index to measure the quality of facility-based labor and delivery care processes in sub-Saharan Africa. PLoS One. 2015;8(5):e60761.

  5. Bryce J, Arnold F, Newby H, Requejo J, et al. Measuring Coverage in MNCH: New Findings, New Strategies, and Recommendations for Action. PLoS Med. 2013;10(5):e1001423.

  6. Mccarthy KJ, Ann K, Warren CE. Women’s recall of maternal and newborn interventions received in the postnatal period: a validity study in Kenya and Swaziland. J Glob Health. 2018;8(1):1–15.

  7. Spector JM, Agrawal P, Kodkany B, Lipsitz S, Lashoher A, Dziekan G, et al. Improving quality of care for maternal and newborn health: prospective pilot study of the WHO safe childbirth checklist program. PLoS One. 2012;7:e35151. PMID: 22615733.

  8. Flood M, Small R. Researching labour and birth events using health information records: methodological challenges. Midwifery. 2009;25:701–710. PMID: 18321619.

  9. Blanc AK, Diaz C, Mccarthy KJ, Berdichevsky K. Measuring progress in maternal and newborn health care in Mexico: validating indicators of health system contact and quality of care. BMC Pregnancy Childbirth. 2016;16(255):1–11.

  10. Graham WJ, Bell JS, Bullough CH. Can skilled attendance at delivery reduce maternal mortality in developing countries? Safe motherhood strategies: a review of the evidence; 2001.

  11. Gouws E, Bryce J, Pariyo G, Armstrong Schellenberg J, Amaral J, Habicht JP. Measuring the quality of child health care at first level facilities. Soc Sci Med. 2005;61:613–25. 15899320.

  12. Bazant E, Rakotovao JP, Rasolofomanana JR, Tripathi V, Gomez P, Favero R, Moffson S. Quality of care to prevent and treat postpartum hemorrhage and pre-eclampsia/eclampsia: an observational assessment in Madagascar's hospitals. Medecine et sante tropicales. 2013;23(2):168–175. PMID: 23694783.

  13. Souza JP, Cecatti JG, Pacagnella RC, Giavarotti TM, Parpinelli MA, Camargo RS, Sousa MH. Development and validation of a questionnaire to identify severe maternal morbidity in epidemiological surveys. Reprod Health. 2010;7(1):16.

  14. Stanton CK, Rawlins B, Drake M, Anjos M, et al. Measuring coverage in MNCH: testing the validity of women’s self-report of key maternal and newborn health interventions during the peripartum period in Mozambique. PLoS One. 2015;8(5):e60694.

  15. Tigray Regional State. Bureau of Finance and Economic Development, Annual Report 2017. Mekelle; 2017. p. 1–49.

  16. Agency Central Statistics, Addis Ababa Ethiopia, Program TD, ICF, Rockville, Maryland U. Federal Democratic Republic of Ethiopia Demographic and Health Survey. Ethiopia: FMoH; 2016. p. 27–57.

  17. Hajian-tilaki K. Sample size estimation in diagnostic test studies of biomedical informatics. J Biomed Inform. 2014;48:193–204.

  18. Zaidi M, Hospital LN, Waseem H, Fahim M, Ansari A, Hospital LN. Sample size estimation of diagnostic test studies in health sciences. J Biomed informatics, Proc 14th Int Conf Stat Sci. 2016;29:14–6.

  19. Sharma G, Powell-jackson T, Haldar K, Bradley J, Filippi V. Quality of routine essential care during childbirth: clinical observations of uncomplicated births in Uttar Pradesh, India. Wider Work Pap 2017 / 143; India, environments. 2017;13:14.

  20. Federal Democratic Republic of Ethiopia Ministry of Health. Health Sector Transformation Plan 2015/16–2019/20 (2008–2012 EFY); 2015. Accessed 23 Dec 2018.

  21. Blanc AK, Warren C, Mccarthy KJ, Kimani J, Ndwiga C, Ramarao S. Assessing the validity of indicators of the quality of maternal and newborn health care in Kenya. BMC Pregnancy Childbirth. 2016;6(1):1–13.

  22. Campbell H, Biloglav Z, Rudan I. Reducing Bias from test misclassification in burden of disease studies: use of test to actual positive ratio – new test parameter. Public Health. 2008;49:402–14.

  23. StataCorp. Stata Statistical software. Stata: Release 13. 2013.

  24. Larson E, Vail D, Mbaruku GM, Mbatia R, Kruk ME. Beyond utilization: measuring effective coverage of obstetric care along the quality cascade. Int J Qual Health Care. 2017;29(1):104–10.

  25. Dey A, Shakya HB, Chandurkar D, Kumar S, Das AK, Anthony J, et al. Discordance in self-report and observation data on mistreatment of women by providers during childbirth in Uttar Pradesh, India. Reprod Health. 2017;14(1):149.

  26. Sloan NL, Amoaful E, Arthur P, Winikoff B, Adjei S. Validity of women’s self-reported obstetric complications in rural Ghana. J Health Popul Nutr. 2001;19(2):45–51.

  27. Filippi V, Ronsmans C, Gandaho T, Graham W, Alihonou E, Santos P. Women's reports of severe (near-miss) obstetric complications in Benin. Stud Fam Plan. 2000;31(4):309–24.

  28. Stewart MK, Festin M. Validation study of women’s reporting and recall of major obstetric complications treated at the Philippine general Hospital. Int J Gynecol Obstet. 1995;48(Suppl):53–66.

  29. Tuncalp O, Stanton C, Castro A, Adanu R, Heymann M, Adu-Bonsaffoh K, Lattof SR, Langer A. W407 Validating women’s self-report of emergency cesareans delivery Ghana and the Dominican Republic. Int J Gynecol Obstet. 2012;119:S837.

  30. Liu L, Li M, Yang L, Ju L, Tan B, Walker N, et al. Measuring coverage in MNCH: a validation study linking population survey derived coverage to maternal, newborn, and child health care records in rural China. PLoS One. 2013;8(5):e60762.

  31. Carvajal-Aguirre L, Vaz LM, Singh K, Sitrin D, Moran AC, Khan SM, Amouzou A. Measuring coverage of essential maternal and newborn care interventions: an unfinished agenda. J Global Health. 2017;7(2):1–5.

  32. Ministry of Science and Technology, Federal Democratic Republic of Ethiopia (FDRE), Ethiopian national research ethics guideline; 2014. pp. 23–97.


Our heartfelt thanks go to Mekelle University for its full support in conducting this research. We would like to extend our gratitude to the International Institute for Primary Health Care – Ethiopia (IIfPHC-E) for financial and technical support. Our appreciation and thanks also go to the study health facility managers, research assistants and study participants for their genuine support and participation. Last but not least, we thank the Pre-Publication Support Service (PREPSS) for supporting the development of this manuscript by providing pre-publication peer review and copy editing.


This research received funding and technical support from Mekelle University and was partly supported by the International Institute for Primary Health Care - Ethiopia (sub-award grant). The funding organizations had no role in the design of the study, analysis, or interpretation of data.

Author information

Authors and Affiliations



HGW designed the study, participated in data collection and analysis, and drafted the manuscript; AAM, HG and ABK provided scientific advice on the design of the study and the interpretation of results, and critically reviewed the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Haftom Gebrehiwot Weldearegay.

Ethics declarations

Ethics approval and consent to participate

This study was approved by Mekelle University Institutional Review Board (ERC 1191/2017) and written informed consent was obtained from all participants prior to participation. For women under the age of 18 years, consent was obtained from their legal guardian in accordance with local ethical guidelines [32].

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Additional file 1: Appendix 1. Full list of indicators on quality of care during the intra-partum and immediate postpartum period in Northern Ethiopia

Additional file 2: Appendix 2. Checklist or questionnaire

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Weldearegay, H.G., Medhanyie, A.A., Godefay, H. et al. Beyond health system contact: measuring and validating quality of childbirth care indicators in primary level facilities of northern Ethiopia. Reprod Health 17, 73 (2020).
