Running multiple hadoop instances on same machine
22-09-2019
Problem
I want to run a second instance of Hadoop on a machine that already has an instance of Hadoop running. After untarring the Hadoop distribution, some config files in the hadoop-version/conf directory need to be changed. The same Linux user will run both instances. I have identified the following attributes, but I am not sure whether this is enough.
- hdfs-site.xml: dfs.data.dir and dfs.name.dir
- core-site.xml: fs.default.name and hadoop.tmp.dir
- mapred-site.xml: mapred.job.tracker
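For illustration, the overrides above might look like the following in the second instance's conf directory. The paths and port numbers here are placeholders I chose, not values from any particular setup:

```xml
<!-- hdfs-site.xml: give the second instance its own storage directories -->
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/data/hadoop2/name</value>   <!-- hypothetical path -->
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/data/hadoop2/data</value>   <!-- hypothetical path -->
  </property>
</configuration>

<!-- core-site.xml: a distinct NameNode URI and tmp dir -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9001</value>   <!-- any port not used by the first instance -->
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/hadoop2/tmp</value>
  </property>
</configuration>

<!-- mapred-site.xml: a distinct JobTracker port -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9002</value>
  </property>
</configuration>
```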
I couldn't find the attribute names for the port numbers of the JobTracker, TaskTracker, and DFS web interfaces. Their default values are 50030, 50060 and 50070 respectively.
Are there any more attributes that need to be changed to ensure that the new Hadoop instance runs in its own environment?
Solution
Look for ".address" in src/hdfs/hdfs-default.xml and src/mapred/mapred-default.xml, and you'll find plenty of attributes defined there.
By the way, I had a box with the firewall enabled, and I observed that the effective ports in the default configuration are 50010, 50020, 50030, 50060, 50070, 50075 and 50090.
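Assuming the Hadoop 1.x property names (check the *-default.xml files shipped with your version, as names changed in later releases), those ports correspond to properties like these, each of which the second instance would need to override with a free port:

```xml
<!-- hdfs-site.xml (Hadoop 1.x property names) -->
<property><name>dfs.datanode.address</name><value>0.0.0.0:50010</value></property>
<property><name>dfs.datanode.ipc.address</name><value>0.0.0.0:50020</value></property>
<property><name>dfs.http.address</name><value>0.0.0.0:50070</value></property>           <!-- NameNode web UI -->
<property><name>dfs.datanode.http.address</name><value>0.0.0.0:50075</value></property>
<property><name>dfs.secondary.http.address</name><value>0.0.0.0:50090</value></property>

<!-- mapred-site.xml (Hadoop 1.x property names) -->
<property><name>mapred.job.tracker.http.address</name><value>0.0.0.0:50030</value></property>
<property><name>mapred.task.tracker.http.address</name><value>0.0.0.0:50060</value></property>
```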