Control charts to establish and monitor proficiency in the detection of pulmonary B-lines with Point of Care Ultrasound

Objective: Point of care ultrasound (POCUS) is a widely used clinical tool. This operator-dependent technique requires methods to establish individual benchmarks and to monitor the learning process. We present the use of the learning curve cumulative summation (LC-CUSUM) and standard cumulative summation (CUSUM) control charts to establish and monitor, respectively, the proficiency of a physician in detecting pulmonary B-lines with POCUS. Materials and Methods: A training course was conducted to teach general practitioners to detect plasma leakage using POCUS. The trainees and an expert radiologist identified the number of pulmonary B-lines in the POCUS images of 53 hospitalized patients. The interpretation of one trainee was compared to that of the expert radiologist using LC-CUSUM and CUSUM, considering image quality and anatomical site. Results and Discussion: Image quality was better in the apices than in the bases of the lungs. The trainee's learning curve differed by anatomical site, and the results of LC-CUSUM and CUSUM differed depending on whether only high-quality images (first scenario) or all images (second scenario) were included in the analysis. Conclusion: The LC-CUSUM and CUSUM control charts were useful for evaluating the learning curve in this case and for identifying image quality as an important factor in the evaluation process. They warrant further study as graphical tools for real-time monitoring of POCUS training.


Introduction
Point-of-Care Ultrasound (POCUS) is becoming standard practice across numerous medical specialties and has been referred to as the "stethoscope of the future" [1]. Since 1990, the American College of Emergency Physicians (ACEP) has supported the use of POCUS in the emergency department [2]. Learning competencies for Emergency Medicine (EM) residents were defined in 2008 and have been further developed over time [3]-[7]. Following the foundations set by EM, POCUS is now an established part of medical school curricula in many institutions and has transdisciplinary applications within other sectors of the healthcare workforce [8], [9]. POCUS is particularly applicable in resource-limited regions where more advanced imaging technologies and trained radiologists may be in short supply [10]. POCUS has several clinical applications, such as the detection of pulmonary B-lines, which may indicate the presence of pulmonary edema earlier than other imaging modalities [11]-[13]. Recent studies have found that POCUS is as sensitive as high-resolution computed tomography in detecting peripheral abnormalities associated with COVID-19 pneumonia [14]-[16].
Many medical specialties and undergraduate medical education programs provide courses to establish a baseline of POCUS competency [17]. While POCUS training for U.S. medical students, residents and general internists is not yet universal, interest is growing rapidly [18]-[21]. Methods to establish individual benchmarks (i.e., the number of exams needed to reach proficiency) in POCUS are described in the literature and include real-time supervision, teaching sessions with feedback, and quality control [6], [7]. Training recommendations are based on expert consensus and vary by ultrasound protocol [22]. For example, a guideline released by the American College of Emergency Physicians recommends a benchmark of 25 to 50 quality-reviewed exams in a particular application to establish competency [7]. However, learning curves may vary between individuals, with some reaching proficiency above or below the recommended benchmark.
The U.S. Accreditation Council for Graduate Medical Education (ACGME) requires 150 total POCUS exams as the minimum experience to complete EM residency training [22]. Since 2010, the Royal College of Emergency Medicine (United Kingdom) has required POCUS training for residents to Royal College of Radiologists level 1, which includes 50 focused emergency ultrasound exams and five supervised examinations per week [23]. EM trainees need to continue performing ultrasound throughout the remainder of their training program and into their consultant appointment. Ongoing ultrasound use may be intermittent, but no more than three months should elapse without performing an ultrasound, and at least 50 scans must be performed per year. Quality control procedures have been applied to the evaluation of student learning curves and have been used to monitor the introduction of health technologies [24]. Acquiring valid POCUS images and interpreting them correctly are operator-dependent competencies. Selecting a reliable and valid statistical tool to assess the learning of those clinical competencies is important for ensuring patient safety [25]. Some studies have assessed POCUS training by measuring the trainee's performance at two or more stages of the training, but they have not evaluated or modeled the learning curve [26]-[28]. Other studies have evaluated the agreement between different operators using conventional and pocket ultrasound [29], or compared the diagnostic accuracy of different ultrasound protocols and the efficacy of new clinical procedures [30], [31]. Others have focused on validating rating scales for POCUS study quality [28], [32]. Some studies have included CUSUM in their learning curve assessment but have not considered image quality [33], [34]. Only one study included training follow-up (Table 1).

Source: Authors own creation
Standard cumulative sum (CUSUM) and learning curve CUSUM (LC-CUSUM) control charts have advantages over other methods because they are sensitive quantitative methods for detecting whether a target level of proficiency has been reached and maintained over time. LC-CUSUM control charts reveal when a pre-specified level of proficiency has been reached, and CUSUM charts are used to monitor performance after proficiency is reached [24], [36]-[38]. These tools allow changes in proficiency to be detected quickly, in real time, at the level of the individual; because the criteria for acceptable and unacceptable proficiency are determined a priori, the statistical power of the method does not depend on the number of individuals evaluated [39]. The use of these modelling tools has allowed medical and surgical providers to define and meet specific individual learning and proficiency objectives [36]. This article aims to demonstrate the usefulness of CUSUM and LC-CUSUM control charts for evaluating the learning and monitoring the maintenance of POCUS skills to detect pulmonary B-lines.

Study design
This case report was carried out between December 2018 and March 2019 in a health care institution in Cali, Colombia. The intervention was the training of general practitioners with a standardized POCUS protocol to detect the presence of pulmonary B-lines, a sign of plasma leakage. Training was provided as part of an investigation of the usefulness of POCUS to detect plasma leakage in subjects with dengue fever (POCUS-DENGUE). The POCUS-DENGUE project was approved by the Institutional Review Boards of Universidad del Valle (Number: 022-018) and the University of Minnesota (STUDY00004437).

Training intervention
Training in the POCUS-DENGUE study consisted of online modules and in-person training. The online component was three hours in length and covered general ultrasound principles, correct technique, and interpretation of the images. The in-person portion was divided across five days of practice sessions on healthy volunteers and hospitalized patients. The training was provided in Cali, Colombia by an EM physician expert sonographer and an Internal Medicine physician from the University of Minnesota. A portable ultrasound machine (Philips Lumify with a 5 to 2 MHz curved array transducer in B-mode) was used during the training and to obtain images for the corresponding training assessment. In the same week, after the training, the trainee obtained and interpreted all the POCUS images from 53 inpatients of a tertiary care hospital in Cali, Colombia. Images were obtained from four ultrasound points of the lungs (apices and bases of both right and left sides) where B-lines are expected to be found when plasma leakage is present [40]. Images for each of the 53 patients were encrypted and stored on a Samsung Galaxy Tab S4 tablet and transferred to a secure cloud service for analysis (www.box.com).

Assessment of training
In each set of ultrasound images, the maximum number of B-lines and the presence of plasma leakage (defined as three or more B-lines in any of the anatomic sites of both the right and left hemithorax) were interpreted blindly, independently, and in chronological order by the trainee and by an expert radiologist, a professor in the Radiodiagnosis Postgraduate Program at the Universidad del Valle, Cali, Colombia, with fifteen years of experience in ultrasound. The radiologist also scored the quality of each image using the generic scale recommended by the American College of Emergency Physicians (ACEP) [2]. The trainee and the expert radiologist entered their interpretations in a predesigned, encrypted case record form (CRF) using Research Electronic Data Capture (REDCap) software [41], [42].

Statistical diagnosis of learning
Using the interpretation of the expert radiologist as the gold standard, LC-CUSUM was used to determine when the trainee reached the benchmark and CUSUM to monitor performance thereafter. For the LC-CUSUM, the null hypothesis was H0: "performance is inadequate" (i.e., the process is out of control because the competency has not been achieved) and the alternative hypothesis was H1: "performance is adequate" (i.e., the process is in control because the competency has been achieved or is maintained). In terms of the failure rate P (the proportion of images misclassified by the trainee), the hypotheses can be written as H0: P ≥ P0 tested against H1: P < P0, where P0 is the unacceptable failure rate, a value defined in advance by the researcher. By contrast, in the CUSUM test the null hypothesis was H0: "the competency continues to be adequate" (i.e., the process is in control) and the alternative hypothesis was H1: "the competency is lost" (i.e., the process is out of control). This can be written as H0: P ≤ P0 tested against H1: P > P0, where P0 is now the acceptable failure rate.
The design of the LC-CUSUM and CUSUM charts requires setting the parameters that determine their performance: P0 and P1, the failure rates under H0 and H1 (for LC-CUSUM, the unacceptable and the tolerable failure rate, respectively), and h, the numerical value that locates the control limit on the chart. The acceptable and unacceptable failure rates were used to set a control limit (h) that, when crossed, indicates that proficiency has been achieved (LC-CUSUM) or lost (CUSUM). From these, we estimated the average number of procedures needed to produce a signal under H0 and under H1, that is, the average run lengths ARL0 and ARL1, respectively. Based on the literature and the judgment of an expert radiologist, we set P0 = 0.3 and P1 = 0.1 for LC-CUSUM and P0 = 0.1 and P1 = 0.2 for CUSUM [33], [34]. A Monte Carlo simulation was carried out to establish h and the ARLs. The result was h = 2.5, ARL0 = 98 and ARL1 = 17 for LC-CUSUM, and h = 1.5, ARL0 = 99 and ARL1 = 23 for CUSUM.
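The simulation step can be illustrated with a minimal sketch. It is not the authors' routine (see the appendix of [24] for that). It assumes the increments are the log-likelihood-ratio scores commonly used for binary outcomes, and the function names, the number of simulated runs, and the seed are illustrative choices.

```python
import math
import random


def llr_weights(p_null, p_alt):
    """Assumed log-likelihood-ratio increments for one binary outcome.

    Returns (w_success, w_failure) for testing failure rate p_null (H0)
    against p_alt (H1); the score drifts upward when data favour H1.
    """
    w_success = math.log((1 - p_alt) / (1 - p_null))
    w_failure = math.log(p_alt / p_null)
    return w_success, w_failure


def run_length(p_true, p_null, p_alt, h, rng, max_cases=100_000):
    """Cases needed until S_t = max(0, S_{t-1} + W_t) first exceeds h."""
    w_success, w_failure = llr_weights(p_null, p_alt)
    score = 0.0
    for t in range(1, max_cases + 1):
        failed = rng.random() < p_true  # simulate one interpretation
        score = max(0.0, score + (w_failure if failed else w_success))
        if score > h:
            return t
    return max_cases  # censored: no signal within max_cases


def average_run_length(p_true, p_null, p_alt, h, n_sim=2000, seed=0):
    """Monte Carlo estimate of the average run length at failure rate p_true."""
    rng = random.Random(seed)
    total = sum(run_length(p_true, p_null, p_alt, h, rng) for _ in range(n_sim))
    return total / n_sim


if __name__ == "__main__":
    # LC-CUSUM design from the text: P0 = 0.3, P1 = 0.1, h = 2.5
    print("LC-CUSUM ARL0:", average_run_length(0.3, 0.3, 0.1, 2.5))
    print("LC-CUSUM ARL1:", average_run_length(0.1, 0.3, 0.1, 2.5))
    # CUSUM design from the text: P0 = 0.1, P1 = 0.2, h = 1.5
    print("CUSUM ARL0:", average_run_length(0.1, 0.1, 0.2, 1.5))
    print("CUSUM ARL1:", average_run_length(0.2, 0.1, 0.2, 1.5))
```

In practice, h would be tuned by repeating such a simulation over candidate values until the resulting ARL0 and ARL1 are acceptable. The estimates from this sketch depend on the assumed increments and need not match the reported values exactly.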
The trainee's ultrasound interpretation was considered a success if it was identical to that of the radiologist and a failure if it was discordant. To assess the effect of image quality on the learning curve, analyses were done separately for two scenarios: a first scenario with high-quality images only (score ≥ 3) and a second scenario with all the images. In the latter, images not obtained or images with quality scores < 3 were defined as trainee failures. For each sequential image t, we calculated the LC-CUSUM score and, once proficiency was achieved, the CUSUM score, using equation 1 (Appendix of [24]): $S_t = \max(0, S_{t-1} + W_t)$, where $W_t$ is the increment assigned to the outcome of image t. The sequential LC-CUSUM and CUSUM scores were plotted on the control chart until a point exceeded the respective control limit ($S_t > h$), indicating that proficiency had been reached (LC-CUSUM chart) or lost (CUSUM chart).
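The two scenarios amount to two ways of coding the same readings as successes or failures. The sketch below shows one possible coding. The record layout and field names are hypothetical, and treating an image with no quality score as not high quality is an assumption the text does not settle.

```python
from dataclasses import dataclass
from typing import List, Optional

QUALITY_THRESHOLD = 3  # "high quality" cut-off stated in the text (score >= 3)


@dataclass
class Reading:
    """One image at one anatomical site (field names are hypothetical)."""
    trainee_blines: Optional[int]  # None if the image was not obtained
    expert_blines: Optional[int]   # None if the image was not obtained
    quality: Optional[int]         # None if not obtained or not scored


def is_high_quality(r: Reading) -> bool:
    # Assumption: an unscored image is treated as not high quality.
    return r.quality is not None and r.quality >= QUALITY_THRESHOLD


def first_scenario(readings: List[Reading]) -> List[bool]:
    """High-quality images only; True marks a failure (discordant reading)."""
    return [
        r.trainee_blines != r.expert_blines
        for r in readings
        if is_high_quality(r)
    ]


def second_scenario(readings: List[Reading]) -> List[bool]:
    """All images; missing or low-quality images also count as failures."""
    return [
        (not is_high_quality(r)) or (r.trainee_blines != r.expert_blines)
        for r in readings
    ]


if __name__ == "__main__":
    demo = [
        Reading(2, 2, 4),           # concordant, high quality
        Reading(1, 3, 5),           # discordant, high quality
        Reading(0, 0, 2),           # concordant but low quality
        Reading(None, None, None),  # image not obtained
    ]
    print(first_scenario(demo))   # two entries: the high-quality images
    print(second_scenario(demo))  # four entries: every image
```

Under this coding the second scenario can only add failures to the sequence, which is why it is the stricter of the two assessments.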
A detailed description of this procedure can be found in the appendix of Biau et al., 2008 [24].
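As an illustration of the LC-CUSUM phase of this recursion, the following sketch scores a sequence of outcomes and reports the first case at which the control limit is exceeded. The increments are assumed to be log-likelihood-ratio scores (the exact weights are given in [24]), the function name is hypothetical, and the default parameters are the LC-CUSUM design values quoted in the text.

```python
import math
from typing import List, Optional, Sequence, Tuple


def lc_cusum(
    failures: Sequence[bool],
    p_inadequate: float = 0.3,  # P0 for LC-CUSUM in the text
    p_adequate: float = 0.1,    # P1 for LC-CUSUM in the text
    h: float = 2.5,             # control limit for LC-CUSUM in the text
) -> Tuple[List[float], Optional[int]]:
    """Return the score path and the 1-based case at which S_t first exceeds h.

    failures[t-1] is True when the trainee's reading of image t was a failure.
    The second element is None if the limit is never exceeded.
    """
    # Assumed increments: a success pushes the score up, a failure pulls it down.
    w_success = math.log((1 - p_adequate) / (1 - p_inadequate))
    w_failure = math.log(p_adequate / p_inadequate)

    scores: List[float] = []
    first_case: Optional[int] = None
    score = 0.0
    for t, failed in enumerate(failures, start=1):
        score = max(0.0, score + (w_failure if failed else w_success))
        scores.append(score)
        if first_case is None and score > h:
            first_case = t
    return scores, first_case


if __name__ == "__main__":
    # Hypothetical sequence: three successes, one failure, then ten successes.
    outcomes = [False] * 3 + [True] + [False] * 10
    path, case = lc_cusum(outcomes)
    print("proficiency signalled at case:", case)
```

With these assumed increments, each success adds about 0.25 to the score, so at least ten consecutive successes are needed to exceed h = 2.5. A single failure subtracts about 1.10, which is why one early failure can delay the signal by several cases.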

Results
POCUS images of both lung apices and bases were available for all 53 patients, except for one left apex and four left bases. Quality was scored for all but one of the available images. The images of the right and left bases were of poorer quality than those of the apices (Table 2). For the right apex, 45 images were included in the first scenario and 53 in the second. For the right base, 32 images were included in the first scenario and 53 in the second. For the left apex, 41 images were included in the first scenario and 53 in the second. For the left base, 28 images were included in the first scenario and 52 in the second.

Source: Authors own creation.
Most patients did not have B-lines according to both the expert radiologist (29/53, 54.7%) and the trainee (32/53, 60.4%). The presence of at least one B-line was reported more frequently in the bases than in the apices. For all anatomical sites, the trainee reported less evidence of B-lines than the expert radiologist (Table 3).

Source: Authors own creation.
The results of the LC-CUSUM and CUSUM charts differed between the first and second scenarios, both for the maximum number of B-lines (Figure 1) and for the presence or absence of plasma leakage (Figure 2). The trainee's proficiency also differed by anatomical site. For the right apex, proficiency was achieved after the 10th case in the first scenario and after the 24th case in the second. For the left apex, proficiency was achieved after the 16th case in the first scenario and was not achieved in the second. Proficiency was not attained for the right or left lung base in either scenario. The CUSUM control charts showed that, for the right apex in the first scenario, the trainee maintained competence until the 33rd case, after which performance was intermittent. In the second scenario, proficiency was lost by the 40th case and never recovered. Where the trainee did not reach proficiency, no CUSUM chart could be drawn (Figure 1). For plasma leakage, the LC-CUSUM and CUSUM charts showed that in the first scenario proficiency was achieved after the 10th case and maintained until the 36th. In the second scenario, however, competence was never achieved (Figure 2).

Figure 1. Cumulative summation test for learning curve (LC-CUSUM) and standard cumulative summation test (CUSUM) control charts for the maximum number of B-lines reported by the evaluated trainee, according to anatomical site and scenario.
Source: Authors own creation.

Figure 2. Cumulative summation test for learning curve (LC-CUSUM) and standard cumulative summation (CUSUM) test control charts for evidence of plasma leakage reported by the trainee, by scenario.
Source: Authors own creation.

Discussion
Establishing when proficiency is achieved or lost for operator-dependent technologies is challenging. There are different guidelines for POCUS training, but there are few tools for validating the recommended benchmarks to establish proficiency. The development of proficiency for any individual depends on many factors, which in turn vary with the trainee, the procedure, the instructor, the setting and the required level of performance [24]. The point at which proficiency is reached varies within the same profession, for example between residents and consultants [25]. Developing tools for verifying proficiency would help fill an important learning and evaluation gap. Learning curves encourage reflection on learning, facilitate training monitoring, and identify when additional learning or updating is required [43]. Our results and those of others demonstrate that control charts such as CUSUM can be an effective tool for assessing proficiency in the context of POCUS [33], [34]. CUSUM and LC-CUSUM can be adapted to verify any level of competence chosen in advance by the evaluator.
To help further the understanding of how LC-CUSUM and CUSUM control charts may be applicable for POCUS training, we considered two different scenarios according to quality of images obtained with POCUS to detect pulmonary B-lines in hospitalized patients. Our results show that the number of exams needed to reach proficiency is influenced by the level of image quality obtained in different pulmonary anatomical sites. The image quality could be associated with the morphological conditions and clinical status of the patients, and/or the variation in the mechanical-cognitive application of POCUS by the trainee. This highlights that the acquisition of images with quality ≥ 3 is a critical component in establishing proficiency and is a foundational component of POCUS. The parameters to be evaluated in the image quality include image resolution, anatomic definition, and other image quality acquisition aspects such as gain, depth, orientation, and focus [7].
One advantage of CUSUM control charts is that they allow proficiency to be assessed in real time. For example, when proficiency is lost, a review can be triggered and the trainee can be given feedback to self-correct until proficiency is regained. This type of monitoring could be used throughout the learning process [44]. It could also be a useful way to evaluate the trainee's progress and guide the amount of supervision required. A case in point is the POCUS-EPAs (entrustable professional activities) method used with internal medicine residents. POCUS-EPAs are intended as highly practical tools that allow faculty to make competency-based decisions for well-defined pieces of clinical work and help define the level of supervision the trainee requires [21]. Follow-up after training is rarely reported in the published literature on POCUS. Because skills can attenuate without practice, CUSUM charts are well suited to this type of monitoring [34]. In this context, the Royal College of Radiologists recommends that, once proficiency is reached, practice may be intermittent, but no more than three months should elapse without the trainee using these skills [23].
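Real-time monitoring of this kind could be sketched as a small object that is updated after each exam and flags when a review should be triggered. This is an illustration only. The class name, the reset-after-feedback behaviour, and the log-likelihood-ratio increments are assumptions, and the default parameters are the CUSUM design values reported in the Methods.

```python
import math


class CusumMonitor:
    """Sketch of a CUSUM monitor for a trainee who has reached proficiency."""

    def __init__(self, p_acceptable=0.1, p_unacceptable=0.2, h=1.5):
        # Assumed increments: a failure raises the score, a success lowers it.
        self.w_success = math.log((1 - p_unacceptable) / (1 - p_acceptable))
        self.w_failure = math.log(p_unacceptable / p_acceptable)
        self.h = h
        self.score = 0.0
        self.n_cases = 0

    def update(self, failed: bool) -> bool:
        """Record one exam; return True when the score exceeds h."""
        self.n_cases += 1
        step = self.w_failure if failed else self.w_success
        self.score = max(0.0, self.score + step)
        return self.score > self.h

    def reset(self) -> None:
        """Restart the score after feedback (a design choice, not from the source)."""
        self.score = 0.0


if __name__ == "__main__":
    monitor = CusumMonitor()
    # Hypothetical stream of exams: True marks a discordant interpretation.
    for failed in [False, True, True, False, True]:
        if monitor.update(failed):
            print(f"review triggered at exam {monitor.n_cases}")
            monitor.reset()
```

Whether to reset the score after feedback or to keep accumulating is a local policy decision. Resetting treats each review as a fresh start, while continuing keeps the chart sensitive to a persistent decline.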
Some studies have considered image quality and identification of specific anatomical structures to define the criteria of success in the evaluation of the learning curves [35]. Some have used other criteria such as the identification of a clinical finding and the quality of the image, under the premise that a poor acquisition affects the clinical interpretation [22]. In our case, the quality of the image clearly influenced the plotting of the learning curve. To accurately detect plasma leakage, it was necessary for the trainee to obtain high quality images at all anatomic sites.

Limitations
The in-person training received by the general physician was given by an expert POCUS physician, while the image interpretation was done by an expert radiologist. This could have led to a more demanding standard for the interpretation and quality scoring of the ultrasound images than is typical for POCUS. Pulmonary B-lines may require greater image quality because the lines must be quantified over background noise. There may also be variability between radiologists, depending on their level of focus on ultrasound. Our study was conducted in a single hospital with a single general physician, which limits the generalizability of the results. However, we expect that this example of the use of LC-CUSUM and CUSUM control charts, even with only one trainee, will be applicable in similar contexts, because they are individualized assessment methods.

Conclusions
Methods for setting individual benchmarks in POCUS include real-time monitoring, teaching sessions with feedback, and quality control. Benchmarks for proficiency likely vary by individual, making it necessary to apply a reliable and valid assessment to each individual's learning curve. LC-CUSUM and CUSUM control charts are well-established tools for monitoring the learning process across a range of disciplines. However, they have not been widely adopted as tools for evaluating POCUS training or for monitoring POCUS performance over time.
Here, we demonstrate the application of the LC-CUSUM and CUSUM control charts in assessing how a general physician learned to detect pulmonary B-lines with POCUS, using the evaluation by an expert ultrasound radiologist as the gold standard. By considering two different scenarios, we show that image quality is an important factor that affects the assessment of the learning curve. The LC-CUSUM and CUSUM control charts are graphical tools for intuitively evaluating learning curves and can be used for real-time monitoring once the trainee reaches a predefined level of competency.