Learning From High-Cardinality Categorical Features in Deep Neural Networks

Çanakkale Onsekiz Mart Üniversitesi Fen Bilimleri Enstitüsü Dergisi
Volume:8 Issue:2

Pages : 222-236

Publication Date : 2022-06-23

Article Type : Research Paper

Abstract : Some machine learning algorithms expect both input and output variables to be numeric. Therefore, feature engineering is required at an early stage of modelling when categorical variables are present in the dataset: those attributes must be encoded into an appropriate feature vector. However, categorical variables with more than 100 unique values are considered high-cardinality, and no straightforward method exists to handle them. Moreover, most of the work on categorical variable encoding in the literature assumes that the set of categories is limited, known beforehand, and made up of mutually exclusive elements, independently of the data, which is not necessarily true in real-world applications. Feature engineering typically tackles high-cardinality issues with data-cleaning techniques, which are time-consuming and often need human intervention and domain expertise, both major costs in data science projects. The most common methods of transforming categorical variables are one-hot encoding and target encoding. To address the issue of encoding categorical variables in high-cardinality settings, we seek a general-purpose approach for the statistical analysis of categorical entries that is capable of handling a very large number of categories while avoiding computational and statistical difficulties. Our proposed approach is low-dimensional; thus, it is very efficient in processing time and memory, and it can be computed in an online learning setting. Although in this paper we opt to use it in the input layer, dictionaries are typically architecture-independent and may be moved between different architectures or layers.
Keywords : deep neural networks, categorical variable, high cardinality, mean target encoding, one-hot encoding
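
The two baseline encodings named in the abstract, one-hot encoding and mean target encoding, can be illustrated with a minimal sketch. This is not the paper's proposed method, which the abstract does not specify. The function names, the optional `smoothing` parameter, and the toy `cities` / `clicked` data are all assumptions made for illustration.

```python
from collections import defaultdict


def one_hot_encode(values):
    """Map each value to a binary vector with one slot per unique category."""
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}
    vectors = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        vectors.append(row)
    return categories, vectors


def mean_target_encode(values, targets, smoothing=0.0):
    """Replace each category with the mean target observed for it.

    `smoothing` (hypothetical parameter) pulls rare categories toward
    the global mean; 0.0 gives the plain per-category mean.
    """
    global_mean = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, t in zip(values, targets):
        sums[v] += t
        counts[v] += 1
    encoding = {
        cat: (sums[cat] + smoothing * global_mean) / (counts[cat] + smoothing)
        for cat in counts
    }
    return [encoding[v] for v in values], encoding


# Toy data, invented for this example only.
cities = ["Izmir", "Ankara", "Izmir", "Bursa", "Ankara", "Izmir"]
clicked = [1, 0, 1, 0, 1, 0]

categories, one_hot = one_hot_encode(cities)
encoded, mapping = mean_target_encode(cities, clicked)

print(categories)   # unique categories, one output column each
print(one_hot[0])   # binary vector for the first row
print(mapping)      # per-category mean of the target
```

The one-hot vector has one dimension per unique category, so its width grows with cardinality, whereas mean target encoding produces a single number per entry regardless of how many categories exist.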
