Ad-RuLer: A novel rule-driven data synthesis technique for imbalanced classification

Cite as: hdl:2117/398676
Document type: Article
Defense date: 2023-11-23
Publisher: Multidisciplinary Digital Publishing Institute
Rights access: Open Access
This work is protected by the corresponding intellectual and industrial property rights.
Except where otherwise noted, its contents are licensed under a Creative Commons license: Attribution 4.0 International
Abstract
When classifiers face imbalanced class distributions, they often misclassify minority class samples, consequently diminishing the predictive performance of machine learning models. Existing oversampling techniques predominantly rely on the selection of neighboring data via interpolation, with less emphasis on uncovering the intrinsic patterns and relationships within the data. In this research, we present the usefulness of an algorithm named RuLer to deal with the problem of classification with imbalanced data. RuLer is a learning algorithm initially designed to recognize new sound patterns within the context of the performative artistic practice known as live coding. This paper demonstrates that this algorithm, once adapted (Ad-RuLer), has great potential to address the problem of oversampling imbalanced data. An extensive comparison with other mainstream oversampling algorithms (SMOTE, ADASYN, Tomek-links, Borderline-SMOTE, and KmeansSMOTE), using different classifiers (logistic regression, random forest, and XGBoost) is performed on several real-world datasets with different degrees of data imbalance. The experiment results indicate that Ad-RuLer serves as an effective oversampling technique with extensive applicability.
Citation: Zhang, X. [et al.]. Ad-RuLer: A novel rule-driven data synthesis technique for imbalanced classification. "Applied sciences (Basel)", 23 November 2023, vol. 13, no. 23, article 12636.
ISSN: 2076-3417
Publisher version: https://www.mdpi.com/2076-3417/13/23/12636
Files | Description | Size | Format | View |
---|---|---|---|---|
applsci-13-12636-v4.pdf | | 6,521Mb | PDF | View/Open |