Object

Title: Isolation Forests for Symbolic Data as a Tool for Outlier Mining

Alternative title:

Lasy separujące dla danych symbolicznych jako narzędzie wykrywania obserwacji odstających

Author:

Pełka, Marcin ; Dudek, Andrzej

Description:

Econometrics = Ekonometria, 2024, Vol. 28, No. 1, pp. 1-10

Abstract:

Aim: Outlier detection is a key part of every data analysis. Although many definitions of outliers can be found in the literature, all of them emphasise that outliers are objects that are in some way different from the other objects in the dataset. Many different approaches have been proposed, compared, and analysed for the case of classical data. However, only a few studies deal with the problem of outlier detection in symbolic data analysis. The paper aimed to propose how to adapt the isolation forest to symbolic data. Methodology: An isolation forest for symbolic data is used to detect outliers in four different artificial datasets with a known cluster structure and a known number of outliers. Results: The results show that the isolation forest for symbolic data is a fast and efficient tool for outlier mining. Implications and recommendations: As the isolation forest for symbolic data appears to be an efficient tool for outlier detection in artificial data, further studies should focus on real datasets that contain outliers (e.g. a credit card fraud dataset), and this approach should be compared with other outlier mining tools (e.g. DBSCAN). The authors recommend using the same initial settings for the isolation forest for symbolic data as those proposed for the isolation forest for classical data. Originality/value: This paper is the first of its kind, focusing not only on the problem of outlier detection in general, but also extending the well-known isolation forest model to symbolic data.

Publisher:

Publishing House of Wroclaw University of Economics and Business

Place of publication:

Wroclaw

Date of publication:

2024

Resource type:

article

Resource identifier:

doi:10.15611/eada.2024.1.01 ; oai:dbc.wroc.pl:126462

Language:

eng

Relations:

Econometrics = Ekonometria, 2024, Vol. 28, No. 1

Rights:

Some rights reserved to the Authors and the Publisher

Access rights:

Open to everyone under the terms of the licence

Licence:

CC BY-SA 4.0

Location of the original:

Uniwersytet Ekonomiczny we Wrocławiu

Group publication title:

Ekonometria = Econometrics
