Problem

I am trying to load a big CSV file (about 18 GB) into RapidMiner to build a classification model. The "import configuration wizard" seems to have difficulty loading the data, so I chose to use "Edit parameter list: data set meta data information" to set up the attribute and label information. However, the UI only allows me to set up that information column by column, and my CSV file has about 80,000 columns. How should I handle this kind of scenario? Thanks.

Solution

I haven't tried it myself yet, but you should be able to load the CSV into a MySQL database. You can then use the Stream Database operator to avoid the size limitations. Here is the description from RapidMiner:

In contrast to the Read Database operator, which loads the data into the main memory, the Stream Database operator keeps the data in the database and performs the data reading in batches. This allows RapidMiner to access data sets of arbitrary sizes without any size restrictions.
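The batching idea described above is straightforward to sketch in code. The following is a minimal illustration, not RapidMiner's actual implementation: it uses Python's standard library with SQLite as a stand-in for MySQL, and the function names (`load_csv_in_batches`, `read_in_batches`) are made up for this example. The point is that both the load and the read touch only a fixed number of rows at a time, so memory use stays bounded regardless of file size.

```python
import csv
import os
import sqlite3
import tempfile

def load_csv_in_batches(csv_path, conn, table="data", batch_size=1000):
    """Create `table` from the CSV header and insert rows in fixed-size
    batches, so the whole file never sits in memory at once."""
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        col_defs = ", ".join('"{}" TEXT'.format(c) for c in header)
        conn.execute('CREATE TABLE "{}" ({})'.format(table, col_defs))
        placeholders = ", ".join("?" for _ in header)
        insert_sql = 'INSERT INTO "{}" VALUES ({})'.format(table, placeholders)
        batch = []
        for row in reader:
            batch.append(row)
            if len(batch) >= batch_size:
                conn.executemany(insert_sql, batch)
                batch = []
        if batch:
            conn.executemany(insert_sql, batch)
        conn.commit()

def read_in_batches(conn, table="data", batch_size=1000):
    """Yield result rows in batches instead of fetching everything at once --
    the same idea as the Stream Database operator described above."""
    cur = conn.execute('SELECT * FROM "{}"'.format(table))
    while True:
        rows = cur.fetchmany(batch_size)
        if not rows:
            break
        yield rows

# Tiny demo: a 5-row CSV stands in for the real 18 GB file.
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "demo.csv")
    with open(path, "w", newline="") as f:
        w = csv.writer(f)
        w.writerow(["a", "b", "label"])
        for i in range(5):
            w.writerow([i, i * 2, "yes" if i % 2 else "no"])
    conn = sqlite3.connect(":memory:")
    load_csv_in_batches(path, conn, batch_size=2)
    total = sum(len(rows) for rows in read_in_batches(conn, batch_size=2))
    print(total)  # 5
```

Note that for the real 80,000-column file the column definitions would be generated from the header the same way, so nothing needs to be configured column by column.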

License: CC-BY-SA with attribution
Not affiliated with StackOverflow