Question

I'm using the NSL-KDD data set, which contains both nominal and numerical features, and I want to convert all the nominal values to numerical ones. I tried the get_dummies function in Python (pandas) and the NominalToBinary filter in WEKA. The problem is that some nominal features contain 64 distinct values, so the conversion greatly increases the dimensionality of the data, and this can create problems for the classifier.
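To make the dimensionality concern concrete, here is a minimal sketch of how one-hot encoding with `pd.get_dummies` widens a frame. The toy data frame, its column names, and its values are assumptions for illustration only; they are not the real NSL-KDD data.

```python
import pandas as pd

# Toy frame standing in for NSL-KDD (illustrative values, not the real data).
df = pd.DataFrame({
    "protocol_type": ["tcp", "udp", "icmp", "tcp"],
    "service": ["http", "ftp", "smtp", "http"],
    "duration": [0, 12, 3, 7],
})

# One-hot encode the two nominal columns; the numeric column is left untouched.
encoded = pd.get_dummies(df, columns=["protocol_type", "service"])

n_before = df.shape[1]      # columns before encoding
n_after = encoded.shape[1]  # one new column per distinct category value
print(n_before, "->", n_after)
```

Each nominal column is replaced by one column per distinct value, so a feature with 64 values would contribute 64 columns on its own.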

My question is whether I can instead convert the nominal attributes by mapping each category of a nominal feature to an integer, for example protocol_type {tcp=0, udp=1, icmp=2, etc.}. Would this undermine the validity of the resulting data set?
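For reference, the integer mapping described here could be sketched as follows. The dictionary `protocol_codes` simply copies the example codes from the question, and the toy column is an assumption for illustration. Note that the codes imply an order (tcp < udp < icmp) that the original labels do not have, which some classifiers may treat as meaningful.

```python
import pandas as pd

# Toy column standing in for the NSL-KDD protocol_type feature (illustrative).
df = pd.DataFrame({"protocol_type": ["tcp", "udp", "icmp", "tcp"]})

# Explicit mapping, using the example codes from the question.
protocol_codes = {"tcp": 0, "udp": 1, "icmp": 2}
df["protocol_code"] = df["protocol_type"].map(protocol_codes)

# Alternative: let pandas assign codes; categories are sorted alphabetically,
# so the resulting integers differ from the hand-written mapping above.
df["protocol_auto"] = df["protocol_type"].astype("category").cat.codes

print(df)
```

The explicit dictionary keeps the encoding reproducible across training and test files, whereas automatically assigned codes depend on which categories happen to be present in each file.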

No accepted solution

License: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange