Paracelsus Medical University (PMU)

Core Facility Biostatistics
Recommendation of statistical methods

Overview of efficient methods commonly used in practice:

We would like to recommend some advanced methods for data analysis and study planning that can help you address your research questions more thoroughly:

1. Resampling Methods/Monte Carlo Methods/Bootstrapping

The advantage of these simulation-based methods is that they can often detect true treatment effects or relationships (e.g., comparing blood loss between two surgical methods, 5-year breast cancer incidence rates, superiority of Therapy A vs. Therapy B, etc.) with higher statistical power than traditional parametric, semi-parametric, or non-parametric tests, particularly when the assumptions of those tests are not met.

Our research office uses a wide range of Monte Carlo, permutation, bootstrap, and jackknife methods, which often outperform traditional tests.
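
As an illustration of the resampling idea, the following is a minimal sketch of a two-sided Monte Carlo permutation test for a difference in means. It is not the facility's own implementation; the function name, the number of permutations, and the blood-loss values are hypothetical and chosen only for this example.

```python
import random

def permutation_test(a, b, n_perm=10_000, seed=0):
    """Two-sided Monte Carlo permutation test for a difference in means."""
    rng = random.Random(seed)
    n_a = len(a)

    def mean_diff(values):
        left, right = values[:n_a], values[n_a:]
        return abs(sum(left) / len(left) - sum(right) / len(right))

    pooled = list(a) + list(b)
    observed = mean_diff(pooled)
    hits = 0
    for _ in range(n_perm):
        # Under H0 the group labels are exchangeable, so reshuffle them.
        rng.shuffle(pooled)
        if mean_diff(pooled) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical blood loss (mL) under two surgical methods; not real data.
method_a = [310, 295, 340, 360, 325, 300, 350, 330]
method_b = [280, 260, 305, 290, 270, 300, 285, 275]
p_value = permutation_test(method_a, method_b)
print(f"permutation p-value: {p_value:.4f}")
```

The test makes no distributional assumption beyond exchangeability of the observations under the null hypothesis, which is one reason such methods are attractive for small or skewed samples.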
 

2. Adaptive Group-Sequential Designs

These designs can significantly reduce the expected sample size compared to fixed-sample-size designs, thereby shortening the study duration and reducing time, cost, and organizational effort.

For example, we often use "two-stage designs," such as "internal pilot study designs" and "Bauer-Köhne designs."

The concept of these designs is to allow interim data analysis during the data collection phase. The interim results yield valuable insights from the data collected so far and allow you to adjust the course of the study. In the best-case scenario, this can lead to early termination of the study with positive results or provide critical information about how long recruitment should continue.
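
The decision rule of a two-stage Bauer-Köhne design can be sketched as follows, assuming the usual formulation in which the stage-wise p-values p1 and p2 are combined with Fisher's product criterion. The one-sided level alpha = 0.025, the futility bound alpha0 = 0.5, and all function names are assumptions made for this illustration, not values taken from a specific study.

```python
import math

def bisect(f, lo, hi, tol=1e-12):
    """Root of an increasing function f on [lo, hi] by bisection."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def bauer_koehne_bounds(alpha=0.025, alpha0=0.5):
    """Return (c, alpha1) for a two-stage Fisher combination test."""
    # Under H0 with independent uniform p-values: P(p1*p2 <= c) = c*(1 - ln c).
    c = bisect(lambda v: v * (1 - math.log(v)) - alpha, 1e-12, alpha)
    # Level with early stopping: alpha1 + c*(ln alpha0 - ln alpha1) = alpha.
    alpha1 = bisect(
        lambda a: a + c * (math.log(alpha0) - math.log(a)) - alpha, c, alpha
    )
    return c, alpha1

def decide(p1, p2=None, alpha=0.025, alpha0=0.5):
    """Decision after stage 1 (p2 is None) or after stage 2."""
    c, alpha1 = bauer_koehne_bounds(alpha, alpha0)
    if p1 <= alpha1:
        return "stop early: reject H0"
    if p1 >= alpha0:
        return "stop early: futility"
    if p2 is None:
        return "continue to stage 2"
    return "reject H0" if p1 * p2 <= c else "do not reject H0"

c, alpha1 = bauer_koehne_bounds()
print(f"c = {c:.4f}, alpha1 = {alpha1:.4f}")
print(decide(0.10))        # interim look only
print(decide(0.10, 0.03))  # final analysis
```

With these settings the bounds come out at roughly c = 0.0038 and alpha1 = 0.0102, so a stage-one p-value below about 0.01 would already permit early termination with a positive result.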

 

3. Machine Learning/Artificial Intelligence (AI)

Some common machine learning methods we employ include:

  • Artificial Neural Networks: feedforward networks, convolutional neural networks (CNN), recurrent neural networks (RNN)
  • Support Vector Machines
  • Classification Tree Analyses
  • Bayes Classifier
  • ResNet-50, ResNet-101, and ResNet-152
  • Inception v1 and v3
  • VGG-16 and VGG-19 (Visual Geometry Group, University of Oxford)
  • Squeeze-and-Excitation Net
  • U-Net (University of Freiburg, Germany)

These methods are highly successful for “classification problems.” Practical applications include:

  • Ophthalmology: Detecting vessels, exudates, and hemorrhages in diabetic retinopathy patients using neural networks.
  • Internal Medicine: Predicting pulmonary embolisms, ventricular hypertrophy, and myocardial infarctions.
  • Pulmonology: Forecasting whether a patient can be successfully weaned off mechanical ventilation.
  • Urology: Determining if a laser should be used for kidney stone fragmentation or estimating the expected duration of surgery.
  • Dermatology: Using neural networks to predict melanoma occurrence.

These models undergo rigorous "cross-validation tests" with randomized, independent samples to ensure that they can be generalized to new patients or objects.
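
The cross-validation step can be sketched as below. To keep the example short and dependency-free, a simple nearest-centroid classifier stands in for the neural networks listed above; the classifier, the function names, and the synthetic two-class data are all assumptions made for illustration.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Randomly partition the indices 0..n-1 into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_centroids(X, y):
    """Compute the mean feature vector of each class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def predict(centroids, row):
    """Assign the class whose centroid is closest in Euclidean distance."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda lab: sq_dist(centroids[lab]))

def cross_validate(X, y, k=5, seed=0):
    """Return the held-out accuracy of each of the k folds."""
    accuracies = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train = [i for i in range(len(X)) if i not in held_out]
        # The model never sees the held-out fold during fitting.
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, X[i]) == y[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return accuracies

# Synthetic two-class data (hypothetical, for illustration only).
rng = random.Random(1)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(50)]
X += [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(50)]
y = [0] * 50 + [1] * 50
scores = cross_validate(X, y)
print("fold accuracies:", [round(s, 2) for s in scores])
```

Because every observation is used for testing exactly once and never for fitting the model that predicts it, the averaged fold accuracy is a more honest estimate of performance on new patients than accuracy on the training data.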

 

4. Generalized Estimating Equations (GEE), Mixed Models, Generalized Linear/Nonlinear Models (GLM/GLNM), or Generalized Additive Models

These powerful models offer significant advantages over traditional methods such as t-tests and ANOVA. Why?

In practice, we often encounter outcomes that are not normally distributed but follow other distributions (e.g., lognormal, beta, gamma, Poisson, exponential, binomial, or Tweedie distributions). These models can incorporate such distributional information directly, offering substantial benefits over non-parametric methods.
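
To show what "incorporating the distribution directly" means, here is a minimal sketch of a generalized linear model for count data: a Poisson regression with log link and a single covariate, fitted by Newton-Raphson. The function name, the fixed iteration count, and the data are assumptions for illustration; in practice an established statistics package would be used.

```python
import math

def fit_poisson(x, y, n_iter=25):
    """Fit log(mu) = b0 + b1*x for Poisson counts y by Newton-Raphson."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(n_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector: gradient of the Poisson log-likelihood.
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Fisher information, a symmetric 2x2 matrix.
        i00 = sum(mu)
        i01 = sum(xi * mi for xi, mi in zip(x, mu))
        i11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = i00 * i11 - i01 * i01
        # Newton step: beta <- beta + I^{-1} * score.
        b0 += (i11 * g0 - i01 * g1) / det
        b1 += (i00 * g1 - i01 * g0) / det
    return b0, b1

# Hypothetical dose levels and event counts; not real data.
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [1, 2, 2, 4, 5, 9, 12, 20]
b0, b1 = fit_poisson(x, y)
fitted = [math.exp(b0 + b1 * xi) for xi in x]
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
print(f"rate ratio per unit of x = {math.exp(b1):.3f}")
```

The slope is interpreted on the log scale, so exp(b1) is the multiplicative change in the expected count per unit of x, a quantity that a rank-based non-parametric test would not provide.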

Kaplan-Meier methods and Cox regression models are frequently, and justifiably, criticized by reviewers. A wider range of models is available as alternatives.


5. Models: Continuous-time inhomogeneous Markov processes with discrete state space

These models generalize Kaplan-Meier methods. Examples include "illness-death models," "competing risk models," and "Fine-Gray models," which can offer improved results compared to Kaplan-Meier models.
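
One commonly cited reason for preferring a competing risk model is that 1 minus the Kaplan-Meier estimate overstates the probability of an event when competing events are simply treated as censored. The sketch below compares that naive estimate with the Aalen-Johansen cumulative incidence estimator; the function names and the small data set are hypothetical and serve only as an illustration.

```python
def cumulative_incidence(times, causes, cause):
    """Aalen-Johansen cumulative incidence of `cause` at the last time point.

    causes: 0 = censored, 1, 2, ... = competing event types.
    Returns (cumulative incidence, all-cause survival).
    """
    data = sorted(zip(times, causes))
    at_risk, surv, cif, i = len(data), 1.0, 0.0, 0
    while i < len(data):
        t = data[i][0]
        tied = [c for tt, c in data if tt == t]
        d_all = sum(c != 0 for c in tied)
        d_k = sum(c == cause for c in tied)
        # Probability of being event-free just before t, then failing from `cause`.
        cif += surv * d_k / at_risk
        surv *= 1 - d_all / at_risk
        at_risk -= len(tied)
        i += len(tied)
    return cif, surv

def naive_one_minus_km(times, causes, cause):
    """1 - Kaplan-Meier, treating competing events as censored."""
    data = sorted(zip(times, causes))
    at_risk, surv, i = len(data), 1.0, 0
    while i < len(data):
        t = data[i][0]
        tied = [c for tt, c in data if tt == t]
        surv *= 1 - sum(c == cause for c in tied) / at_risk
        at_risk -= len(tied)
        i += len(tied)
    return 1 - surv

# Hypothetical follow-up times (months) and event types; not real data.
times = [2, 3, 3, 5, 6, 7, 8, 9, 11, 12]
causes = [1, 2, 0, 1, 2, 1, 0, 2, 1, 0]
cif1, surv_end = cumulative_incidence(times, causes, 1)
cif2, _ = cumulative_incidence(times, causes, 2)
naive1 = naive_one_minus_km(times, causes, 1)
print(f"Aalen-Johansen CIF, cause 1: {cif1:.3f}")
print(f"naive 1 - KM, cause 1:       {naive1:.3f}")
```

In the Aalen-Johansen estimate the cumulative incidences of all causes and the all-cause survival sum to one, which the naive approach does not guarantee.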

Medical applications of these models can be found in PubMed, such as studies on the progression of kidney cancer, breast cancer, or eye diseases. Below is an example of how a state-space model can be represented visually.

 

 

Figure: State space of a "dynamic treatment regime" for the treatment of HIV based on the CD4 cell count. The process over time was analyzed and described using a continuous-time, inhomogeneous Markov process with a discrete state space, comprising 5 states and 7 transitions.