1 Introduction

In 2012, over 1 million men were diagnosed with prostate cancer (PCa) worldwide, and around 300,000 men died from the disease [1]. The current standard for PCa diagnosis is systematic transrectal ultrasound-guided biopsy (TRUS+B), performed because of a suspiciously elevated prostate-specific antigen (PSA) level and/or an abnormal digital rectal examination (DRE) [2]. The biopsies are used to grade the PCa according to the Gleason system, which describes the microscopic appearance of PCa. In practice, the Gleason score ranges from 6 to 10, with 6 indicating the lowest tumor aggressiveness and 10 the highest [3]. DRE is not effective in detecting small tumors or tumors located in the anterior or central part of the gland [4]. TRUS+B risks missing tumors that are not palpable by DRE and not visible on ultrasound [5]. With standard TRUS+B, only 0.05–0.5% of the prostate volume is sampled [6]. Thus, TRUS+B entails a risk of missing significant tumors and under-grading cancer burden and, conversely, of detecting small insignificant tumors, which may lead to overdetection and possible overtreatment [7, 8]. In patients with persistent suspicion of PCa despite previous negative or inconclusive TRUS+B, repeated biopsy procedures are performed in around 31% of the patients [9, 10]. The detection rates of the second to fifth sets of biopsies range from 12.5 to 16.9% [10].

Magnetic resonance imaging (MRI) provides excellent contrast between soft tissues, which makes it suitable for PCa examination [11, 12]. Recent studies suggest that multiparametric MRI (mpMRI) guided biopsies improve the detection of clinically significant tumors compared to TRUS+B [13]. Furthermore, mpMRI can help reduce the number of unnecessary biopsies and allows better assessment of cancer aggressiveness [14, 15].

Using mpMRI data for PCa screening is a labor-intensive task: it requires a high level of expertise, which is not widely available, and is affected by inter-observer variation [7, 16, 17]. This motivates the need for semi- or fully automatic methods, such as computer-aided detection (CAD) algorithms, which hold the potential to reduce reading time and inter-observer variation and may improve the detection rate of clinically significant PCa [18].

Different semi- or fully automatic CAD algorithms have been designed, but CAD is still a novel technique, and improving it remains challenging [7, 19]. The first CAD system to identify cancerous regions in the peripheral zone (PZ) was proposed in 2003 by Chan et al. [20]. Since then, a substantial number of papers have been published on the subject, along with detailed overviews of the current literature on prostate CAD algorithms [7, 21, 22]. The methodology behind the published algorithms varies greatly regarding region of interest (PZ or whole prostate), MRI sequences, definition of ground truth, and the features and classifiers used [7, 21]. The best combination of these parameters for a CAD algorithm remains an open question and might be scanner and dataset dependent.

The most commonly used mpMRI sequences for prostate CAD algorithms are T2-weighted imaging (T2W), diffusion-weighted imaging (DWI) with the derived apparent diffusion coefficient (ADC) map, and dynamic contrast-enhanced imaging (DCE). The first two sequences, T2W and DWI (ADC), take less than 20 min to acquire, while adding the DCE sequence prolongs scanning time by up to 45 min. Furthermore, the DCE sequence requires administration of an expensive contrast agent [23]. The long scanning time and contrast costs could limit a more widespread adoption of mpMRI diagnostics. Limiting the number of sequences used may partly resolve those limitations [24].

The aim of the present study was to establish a new algorithm for detection of PCa-suspicious foci using biparametric MRI (bpMRI) based on T2W and DWI (ADC) MRI sequences, and to compare it to expert annotations.

2 Materials and Methods

2.1 Patient Data

Eighteen patients were scanned at Herlev Hospital, Denmark, using a 3.0T MRI scanner (Ingenia, Philips Healthcare) with an anterior pelvic phased-array coil. Each patient received 1 mg of intramuscular glucagon combined with an intravenous injection of 1 mg hyoscine butylbromide (Buscopan) to reduce peristaltic movement during the MR examination. The MR series were axial T2W and DWI with four b-values (0, 100, 800, and 1400 s/mm²). An ADC map (b-values 100 and 800 s/mm²) was calculated for each patient using the MR scanner software. For details about the MRI protocol, see Table 1.
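The source states only that the scanner software computed the ADC map from the b = 100 and b = 800 images; the exact fitting method is not described. As a hedged illustration, the sketch below assumes the common mono-exponential model S(b) = S0·exp(−b·ADC), under which two b-values give a closed-form ADC. The function name `adc_two_point` and the signal values are hypothetical.

```python
import math

def adc_two_point(s_low, s_high, b_low=100.0, b_high=800.0):
    """Two-point ADC estimate in mm^2/s, assuming a mono-exponential decay.

    s_low / s_high are the DWI signal intensities at b_low / b_high (s/mm^2).
    This is an assumed model, not necessarily what the scanner software does.
    """
    return math.log(s_low / s_high) / (b_high - b_low)

# Synthetic voxel with a known ADC of 1.0e-3 mm^2/s (made-up value).
true_adc = 1.0e-3
s0 = 1000.0
s100 = s0 * math.exp(-100.0 * true_adc)
s800 = s0 * math.exp(-800.0 * true_adc)

adc = adc_two_point(s100, s800)
```

With noise-free synthetic signals the estimate recovers the assumed ADC; on real data, voxels with zero or near-zero signal would need masking before taking the logarithm.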

Table 1. Sequence parameters for 3 Tesla Ingenia MRI with pelvic phased-array coil

All patients had at least one negative or inconclusive TRUS+B prior to the MRI examination. Patients underwent a new TRUS+B with either 10 standard biopsies and 1–3 biopsies from MR-positive areas, or only biopsies from MR-positive areas (3–4 biopsies). All patients were diagnosed with local or locally advanced PCa. Patient and tumor characteristics are listed in Table 2. Fusion of MRI and real-time ultrasound was performed using a Real-time Virtual Sonography (RVS) setup (Hitachi Medical Systems).

Table 2. Patient and tumor characteristics. Prostate and tumor volumes are based on expert delineation on T2W. Gleason scores were obtained from prostate biopsies.

2.2 Image Pre-processing

T2W series were manually cropped to exclude some of the normal tissues surrounding the prostate gland. To correct for non-uniformity in MRI intensities, the images were normalized using the N3 algorithm and made isotropic using tri-linear interpolation (1 × 1 × 1 mm³ voxels) [25]. DWI (b = 1400 s/mm²) and ADC series were resampled to match the world coordinate system of the T2W series, using coordinate information from the image headers. For one patient, the DWI and ADC images were manually co-registered to the T2W images using 3D Slicer, since there was clear displacement between the image series [26, 27]. For the remaining 17 patients, the image series were visually inspected for displacement, and no co-registration was performed.
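The source does not say which software performed the resampling. As a minimal sketch of the tri-linear interpolation step only, the pure-Python function below evaluates a volume at one fractional voxel position; resampling to an isotropic grid amounts to evaluating it at every target grid point. The `vol[z][y][x]` indexing and the names `lerp` and `trilinear` are assumptions made for illustration.

```python
import math

def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t

def trilinear(vol, x, y, z):
    """Tri-linear interpolation of vol[z][y][x] at a fractional position.

    Assumes 0 <= x, y, z <= size - 1 along each axis; upper neighbours are
    clamped at the volume border.
    """
    x0, y0, z0 = math.floor(x), math.floor(y), math.floor(z)
    x1 = min(x0 + 1, len(vol[0][0]) - 1)
    y1 = min(y0 + 1, len(vol[0]) - 1)
    z1 = min(z0 + 1, len(vol) - 1)
    fx, fy, fz = x - x0, y - y0, z - z0

    # Interpolate along x on the four edges, then along y, then along z.
    c00 = lerp(vol[z0][y0][x0], vol[z0][y0][x1], fx)
    c10 = lerp(vol[z0][y1][x0], vol[z0][y1][x1], fx)
    c01 = lerp(vol[z1][y0][x0], vol[z1][y0][x1], fx)
    c11 = lerp(vol[z1][y1][x0], vol[z1][y1][x1], fx)
    c0 = lerp(c00, c10, fy)
    c1 = lerp(c01, c11, fy)
    return lerp(c0, c1, fz)

# Toy 2x2x2 volume whose intensity is linear in the coordinates.
vol = [[[float(x + 2 * y + 4 * z) for x in range(2)]
        for y in range(2)] for z in range(2)]
value = trilinear(vol, 0.5, 0.25, 0.75)
```

Because tri-linear interpolation reproduces linear intensity ramps exactly, the toy volume gives a convenient sanity check.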

2.3 Expert Delineation

The prostate contour was delineated on T2W images for all patients by an expert (>5 years of experience in prostate MRI) to focus the analysis on prostate tissue only. Furthermore, expert tumor contours on biopsy-confirmed areas were annotated on T2W images using the combined MRI series (see Fig. 1). All contours were made in the Eclipse™ Treatment Planning System (Varian Medical Systems).

Fig. 1. Example of expert delineation of prostate boundary (dashed white) and tumor boundary (solid black) on T2W (a), ADC (b) and DWI (c) for patient 1.

2.4 Voxel Feature Extraction

Intensity features from T2W, high b-value DWI (b = 1400 s/mm²) and ADC images, together with the 3D image gradient magnitude and gradient direction for T2W, ADC and DWI images, were used as features. The gradient magnitude is the square root of the sum of the squared gradients in the x, y and z directions. The gradient direction feature indicates the direction in which the image intensity changes most rapidly, expressed as the azimuth angle (measured in the xy-plane from the x-axis). Furthermore, a Euclidean distance feature, measuring the shortest distance from each voxel within the prostate to the prostate boundary, was used.
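The gradient and distance features described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the gradient is approximated with central differences on an isotropic 1 mm grid, the volume is indexed `vol[z][y][x]`, and the distance is found by brute force over a list of boundary voxels. All function names are hypothetical.

```python
import math

def gradient_features(vol, x, y, z):
    """Gradient magnitude and azimuth at an interior voxel of vol[z][y][x].

    Uses central differences with unit (1 mm) spacing, which is an assumed
    discretisation; the source does not specify the gradient operator.
    """
    gx = (vol[z][y][x + 1] - vol[z][y][x - 1]) / 2.0
    gy = (vol[z][y + 1][x] - vol[z][y - 1][x]) / 2.0
    gz = (vol[z + 1][y][x] - vol[z - 1][y][x]) / 2.0
    magnitude = math.sqrt(gx * gx + gy * gy + gz * gz)
    azimuth = math.atan2(gy, gx)  # radians, in the xy-plane from the x-axis
    return magnitude, azimuth

def distance_to_boundary(voxel, boundary_voxels):
    """Shortest Euclidean distance from one voxel to a set of boundary voxels."""
    return min(math.dist(voxel, b) for b in boundary_voxels)

# Toy 3x3x3 volume with intensity 2*x + y, so the gradient is (2, 1, 0).
vol = [[[2.0 * x + 1.0 * y for x in range(3)] for y in range(3)]
       for z in range(3)]
mag, az = gradient_features(vol, 1, 1, 1)

# Toy boundary with two points; the nearer one is 5 mm away.
dist = distance_to_boundary((0, 0, 0), [(3, 4, 0), (10, 0, 0)])
```

For full volumes, a vectorised gradient and a distance transform from an array library would be the practical choice; the brute-force version is kept here only to make the definitions explicit.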

The 10 features used for this study are listed in Table 3.

Table 3. The features used for the classifier

2.5 Voxel Classification

The intensity features (T2W, DWI and ADC) for each patient were normalized to zero mean and unit variance to account for inter-patient variation. Afterwards, all feature vectors were normalized to zero mean and unit variance.
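The two normalization stages can be illustrated with a small stdlib-only sketch for a single feature. The patient identifiers and intensity values are made up, and the use of the population standard deviation (`pstdev`) is an assumption, since the source does not say which estimator was used.

```python
from statistics import mean, pstdev

def zscore(values):
    """Scale values to zero mean and unit variance (population std assumed).

    A constant feature (std of zero) would need special handling.
    """
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]

# Hypothetical T2W intensities; the two patients differ in overall scale.
per_patient = {
    "p1": [100.0, 120.0, 140.0],
    "p2": [900.0, 1000.0, 1100.0],
}

# Stage 1: normalize the intensity feature within each patient.
stage1 = {pid: zscore(vals) for pid, vals in per_patient.items()}

# Stage 2: normalize the feature across all pooled voxels.
pooled = [v for vals in stage1.values() for v in vals]
stage2 = zscore(pooled)
```

After stage 1 the two patients share the same scale despite very different raw intensities, which is the inter-patient effect the per-patient step is meant to remove.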

The classifier used in this study was a quadratic discriminant analysis (QDA) model. The final model used the 10 features listed in Table 3 and was evaluated with leave-one-out cross-validation, in which one patient is kept outside the training set and used for subsequent testing of the model. This is repeated until all patients have been used for testing. The output of the classifier was a per-voxel probability map for each 3D prostate volume, with values between 0 and 1, where 1 indicates the highest suspicion of PCa.
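The leave-one-out scheme operates on patients rather than on individual voxels, so that no voxel from the test patient is seen during training. A minimal sketch of the fold generation is shown below; the patient identifiers and the function name are hypothetical, and fitting the QDA model on each training set is deliberately left out.

```python
def leave_one_patient_out(patient_ids):
    """Yield (training_ids, held_out_id) pairs, one fold per patient."""
    for held_out in patient_ids:
        training = [p for p in patient_ids if p != held_out]
        yield training, held_out

# Eighteen hypothetical patient identifiers, matching the cohort size.
patients = [f"patient_{i}" for i in range(1, 19)]
folds = list(leave_one_patient_out(patients))

for training, held_out in folds:
    # A classifier would be trained on all voxels from `training` here and
    # then applied to every voxel of `held_out` to produce its probability map.
    pass
```

Each of the 18 folds trains on 17 patients and tests on the remaining one.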

2.6 Evaluation

A true positive (TP) was defined as a model-detected volume (connected voxels with >0.5 probability) of >0.2 cc within the expert tumor contour. A false positive (FP) was defined as a model-detected volume of >0.2 cc outside the expert tumor contour. The number of TP and FP, and the percentage of TP and FP voxels, were evaluated at >0.5 probability in the probability map. Furthermore, the area under the receiver operating characteristic curve (ROC-AUC) was calculated voxel-wise for each patient and overall for the algorithm.
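A hedged sketch of this evaluation is given below. Several details are assumptions not stated in the source: 6-connectivity for grouping voxels, counting a detected volume as TP if it overlaps the expert contour at all, and computing the ROC-AUC through the pairwise-ranking (Mann-Whitney) formulation. With 1 mm³ voxels, 0.2 cc corresponds to 200 voxels. All names and the toy data are hypothetical.

```python
from collections import deque

VOXEL_CC = 0.001  # one 1 x 1 x 1 mm voxel expressed in cc
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def connected_components(voxels):
    """Group (x, y, z) voxels into 6-connected components (assumed connectivity)."""
    remaining = set(voxels)
    components = []
    while remaining:
        seed = remaining.pop()
        component, queue = {seed}, deque([seed])
        while queue:
            x, y, z = queue.popleft()
            for dx, dy, dz in NEIGHBOURS:
                n = (x + dx, y + dy, z + dz)
                if n in remaining:
                    remaining.remove(n)
                    component.add(n)
                    queue.append(n)
        components.append(component)
    return components

def count_detections(prob, tumor, p_min=0.5, vol_min_cc=0.2):
    """Count TP and FP volumes; prob maps voxel -> probability, tumor is a voxel set."""
    suspicious = {v for v, p in prob.items() if p > p_min}
    tp = fp = 0
    for comp in connected_components(suspicious):
        if len(comp) * VOXEL_CC <= vol_min_cc:
            continue  # too small to count as a detection
        if comp & tumor:  # assumed rule: any overlap with the expert contour
            tp += 1
        else:
            fp += 1
    return tp, fp

def roc_auc(scores_pos, scores_neg):
    """Voxel-wise ROC-AUC as the fraction of correctly ranked (pos, neg) pairs."""
    wins = sum((p > n) + 0.5 * (p == n) for p in scores_pos for n in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))

def box(x0, x1, y0, y1, z0, z1):
    return {(x, y, z) for x in range(x0, x1) for y in range(y0, y1) for z in range(z0, z1)}

# Toy probability map: two 0.3 cc blobs, one 0.05 cc blob, one low-probability gap.
prob = {}
prob.update({v: 0.9 for v in box(0, 10, 0, 10, 0, 3)})
prob.update({v: 0.1 for v in box(10, 20, 0, 10, 0, 3)})
prob.update({v: 0.8 for v in box(20, 30, 0, 10, 0, 3)})
prob.update({v: 0.9 for v in box(40, 45, 0, 5, 0, 2)})
tumor = box(0, 5, 0, 10, 0, 3)

tp, fp = count_detections(prob, tumor)
auc = roc_auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
```

In the toy map, the first blob overlaps the contour and counts as a TP, the second lies outside it and counts as an FP, and the third is below the volume threshold and is ignored. The pairwise AUC is quadratic in the number of voxels, so a rank-based implementation would be preferable for whole volumes.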

3 Results

Figure 2 shows the probability maps for image slices at tumor location (approx. center) for each patient.

Fig. 2. Probability maps (0 probability being transparent) overlaid on T2W images, presented for all patients.

Visual assessment of the probability maps shows that the highest tumor probability corresponds well with the expert-annotated area for many patients (e.g. Fig. 2a, i–j, n–o and p–s).

In several patients (Fig. 2d, g–h and m) the tumor region has been identified, although the detected area is smaller than the expert annotation. In some of the patients a high tumor probability is found near the expert annotation (Fig. 2e, k–l). One tumor shows no area with high tumor probability within the expert annotation (Fig. 2o) and some tumors only show a small area with high tumor probability (e.g. Fig. 2b and f).

Table 4 shows the quantitative performance of the algorithm with the number of TP and FP for detected volumes >0.2 cc for >0.5 probability.

Table 4. Overview of output from the algorithm showing percentage true positive (for each tumor) and false positive voxels. Furthermore, the number of TP and FP for each patient for lesion volumes >0.2 cc at >0.5 probability is shown along with overall and per-patient ROC-AUC.

Of the 22 tumors, 21 were detected by the algorithm, with a median of 1 FP per patient. The number of FP ranged from 0 to 4 per patient, with a total of 28. The detected TP volumes ranged from 10.71% to 97.31% (median: 38.58%), with actual volumes of 0.24 to 5.45 cc (median: 1.14 cc). Three FP volumes were >1.50 cc; the remaining were <1.00 cc (median: 0.52 cc). The ROC-AUC ranged from 0.69 to 0.98, with a mean of 0.83.

4 Discussion

In this study we presented a CAD algorithm based on bpMRI that can locate the majority of PCa annotated by an expert. A probability map was calculated for each patient using intensity features from T2W, DWI and ADC along with gradient magnitude and direction, and a distance feature. Both visual and quantitative evaluation showed good performance of the algorithm with only one missed tumor. Thus, the algorithm can potentially aid physicians in detecting PCa on MRI for biopsy guidance.

The most used evaluation metric for CAD algorithms is ROC-AUC [21]. We found a ROC-AUC of 0.83, which is in line with the ROC-AUC (0.80–0.89) reported by others [7, 21, 22]. However, some studies report values above 0.89. Ehrenberg et al. [28] obtained a ROC-AUC of 0.92, also detecting 21 out of 22 tumors with a low number of FP.

We found a low number of FP, ranging from 0 to 4 per patient [15]. Giannini et al. [29] found a per-lesion sensitivity of 96% with a median of 3 FP per patient when considering only PZ tumors. Their results are comparable to our per-lesion sensitivity of 95% (21/22 detected tumors) and a median of 1 FP per patient. An FP in a healthy patient will lead to unnecessary biopsy and healthcare cost, whereas in repeat-biopsy patients a high sensitivity is more important than a high specificity [30].

Support Vector Machines (SVM) are the most studied classifiers for PCa CAD algorithms. However, other classifiers, such as Random Forest, Naïve Bayes and Linear Discriminant Analysis, have also been used [21]. S. E. Viswanath [31] compared 12 different classifiers, including QDA, for PCa detection and found that QDA was the best performing classifier in terms of accuracy, execution time and overall evaluation.

Our algorithm tends to underestimate tumor volume compared to expert annotation (e.g. Fig. 2d). MRI series have been shown to generally underestimate tumor volume compared to histopathologically estimated volumes, although this is more pronounced on ADC than on T2W [24, 32]. However, the intent of the algorithm was not to segment the tumor volume but to determine the location of the tumor in order to target biopsies.

Even though tumor volumes >0.5 cc usually are deemed clinically significant, a threshold of 0.2 cc was used for detecting TP and FP in this study [33]. However, tumor volume alone does not determine the PCa risk as some small tumors (0.2–0.5 cc) have high Gleason grade components (Gleason grade 4) and are therefore clinically significant tumors [34, 35]. Thus, a volume threshold <0.5 cc on MRI seems appropriate [24].

We acknowledge certain limitations to this study. First, the prostate was not automatically segmented but annotated by an expert. For a clinically useful CAD algorithm, prostate segmentation should be automatic as well; however, much research has already been done on this subject, and it was not within the aim of this study [36]. Another limitation is the use of biopsy results with expert annotation as ground truth. The optimal ground truth would have been the pathological results from radical prostatectomy specimens. Since PCa is often a multifocal disease, it is possible that some of the FP lesions are actually TP that were not detected by the expert on bpMRI [37]. Furthermore, standard biopsies detected additional small, low-grade cancers missed by the expert, which corresponds well with the fact that MRI often overlooks clinically insignificant PCa [38]. Finally, no co-registration was done between T2W images and DWI and ADC images (except manual registration in one patient). Thus, geometric image mismatch could have affected the results of the CAD algorithm. In some patients (e.g. Fig. 2e and k–l), a volume with high tumor probability was found at the same location as the expert annotation, although with a slight displacement. This might be the result of deformation and/or movement of the prostate during the MRI examination. Automatic registration methods have been explored in the literature; however, this is not a trivial problem to solve [7, 39]. According to Wang et al. [7], registration using the coordinate information in the image header is often sufficient when there is limited patient motion. In this study, prostate movement was visually assessed and found to be minimal.

The PIRADS v2 guidelines recommend the use of DCE series for expert assessment of PCa; however, it is not clear whether DCE is necessary for CAD algorithms to obtain good performance [40]. T2W, DWI (ADC) and DCE are the most commonly used sequences for PCa diagnostics. However, other imaging modalities, such as proton density, diffusion tensor and MR spectroscopy, have been applied in CAD algorithms as well [7, 39].

CAD algorithms are intended to assist radiologists in their workflow by selecting key images and highlighting suspicious areas for further evaluation. This might decrease the workload and inter-observer variance among radiologists [7]. Hambrock et al. [16] showed that their CAD algorithm could assist less-experienced radiologists in evaluating PCa on mpMRI, reaching performance levels similar to those of experienced radiologists.

In this study, all patients had at least one prior negative biopsy and all were PCa positive. A future study is needed to assess the algorithm's performance in PCa-negative patients, in order to test the algorithm's ability to exclude such patients from further diagnostic work-up.

5 Conclusion

This study demonstrates that a new algorithm based on bpMRI can be used for PCa detection, with only one missed tumor and a low number of false positives. The quantitative results are within the range of existing CAD algorithms using MRI data for PCa detection.