I recently ran a simulation study comparing methods for handling class imbalance (in this case, when the class of interest is less than about 3% of the data) for a statistical computing course. I simulated 500 data sets, varying some characteristics like sample size and minority class size, and tested a number of preprocessing techniques (e.g., SMOTE) and algorithms (e.g., XGBoost). You can view the working paper by clicking here.

If you don't want to slog through the whole paper, the plot below shows densities of how each model (combination of sampling technique and algorithm) performed. I totally left off models that used no preprocessing and oversampling, since they made so few positive predictions that metrics like F1 scores couldn't even be calculated most of the time!

Feel free to check out the GitHub repository, as well.