Saturday, December 29, 2012

Test Hadoop cluster on VMware

SQL Server MVP Jeremiah Peschka posted two articles about Hadoop, which got me interested in NoSQL.

I don't know much about NoSQL or Linux, so I am going to set up a test environment on my laptop over the holidays.

1. Download the CentOS Linux setup ISO file
http://www.centos.org/

2. Download Java JDK 1.6
http://www.oracle.com/technetwork/java/javase/downloads/index.html

3. Download the Hadoop setup file
http://hadoop.apache.org/#Download+Hadoop

I downloaded release 1.0.4.

4. Create the VMs with VMware Workstation
I created 3 VMs:
linux1 : 192.168.27.29   ----->master
linux2 : 192.168.27.31   ----->slave
linux3 : 192.168.27.32   ----->slave


5. Install the Linux OS

6. Configure the VM IP address
vi /etc/sysconfig/network-scripts/ifcfg-eth0

7. Configure the host name and hosts file
vi /etc/sysconfig/network          --------->set the hostname
vi /etc/hosts                              --------->add an IP-to-hostname mapping for all 3 servers, for instance
192.168.27.29 linux1
192.168.27.31 linux2
192.168.27.32 linux3
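
The mappings above can also be added with a small script instead of editing the file by hand. This is only a sketch: HOSTS_FILE and add_host are names made up for this example, and by default the script writes to a scratch file so it can be tried safely. Run it as root with HOSTS_FILE=/etc/hosts to change the real file.

```shell
#!/bin/sh
# HOSTS_FILE is a hypothetical override; by default a scratch file is used.
HOSTS_FILE="${HOSTS_FILE:-$(mktemp)}"

# Append "ip name" to the hosts file unless that exact line is already there.
add_host() {
    # -x matches the whole line, -F treats the entry as a literal string
    if ! grep -qxF "$1 $2" "$HOSTS_FILE" 2>/dev/null; then
        echo "$1 $2" >> "$HOSTS_FILE"
    fi
}

add_host 192.168.27.29 linux1
add_host 192.168.27.31 linux2
add_host 192.168.27.32 linux3

# Show the result
cat "$HOSTS_FILE"
```

Because existing lines are skipped, the script can be run again on the same file without creating duplicates.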

8. Install JDK
Copy the JDK install file to the VM through VMware shared folders and unpack it to a local folder. I installed the JDK in /usr/jdk1.6.0_37

9. Install Hadoop
Copy the install file to the VM through VMware shared folders and unpack it to a local folder. I installed the Hadoop files in /usr/hadoop-1.0.4

10. Create folders for Hadoop
Temp folder: /usr/hadoop-1.0.4/tmp
Data folder: /usr/hadoopfiles/Data
Name folder: /usr/hadoopfiles/Name

Make sure the owner of these folders is the user that will start the Hadoop processes. The Data and Name folders should have permission 755:
chmod 755 /usr/hadoopfiles/Data
chmod 755 /usr/hadoopfiles/Name
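
The Data and Name folders can be created, assigned to an owner and given permission 755 in one go. A minimal sketch, assuming the paths above: prepare_dirs and its prefix argument are hypothetical names for this example, and the call at the end is only a dry run under a scratch folder.

```shell
#!/bin/sh
# prepare_dirs PREFIX USER
# PREFIX is a hypothetical prefix for dry runs; pass "" to use the real paths.
prepare_dirs() {
    for dir in /usr/hadoopfiles/Data /usr/hadoopfiles/Name; do
        mkdir -p "$1$dir" || return 1
        chown "$2" "$1$dir" || return 1
        chmod 755 "$1$dir" || return 1
        # Show owner and permissions of the folder just prepared
        ls -ld "$1$dir"
    done
}

# Dry run: create the folders under a scratch folder, owned by the current user.
prepare_dirs "$(mktemp -d)" "$(id -un)"
```

For the real folders, run prepare_dirs "" followed by the name of the user that starts Hadoop, as root.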

11. Set environment variables
vi /etc/profile

Then add the lines below:

HADOOP_HOME=/usr/hadoop-1.0.4
JAVA_HOME=/usr/jdk1.6.0_37
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$CLASSPATH
PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$PATH
export JAVA_HOME
export HADOOP_HOME
export CLASSPATH
export PATH
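
A quick way to confirm that the new PATH works is to check whether the java and hadoop commands can be found. This is a sketch only; check_tool is a helper name made up for this example. Run it after ". /etc/profile" or in a fresh login, so the new variables are loaded.

```shell
#!/bin/sh
# Report whether a command can be found on the current PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: $(command -v "$1")"
    else
        echo "$1: not found on PATH"
        return 1
    fi
}

for tool in java hadoop; do
    check_tool "$tool" || echo "  check JAVA_HOME / HADOOP_HOME in /etc/profile"
done
```

If both are found, "java -version" and "hadoop version" should print the installed versions.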

12. Setup SSH
1) Generate an SSH key pair on all 3 servers
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

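Run "ssh localhost" to test that SSH works. Make sure the authorized_keys file has the correct permissions. This is important: with the default StrictModes setting, sshd ignores the file if group or others can write to it.
chmod 644 ~/.ssh/authorized_keys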

2) Copy the id_dsa.pub file to the other 2 servers under a new file name. For instance,
on linux1, copy id_dsa.pub to linux2 and linux3 as linux1_id_dsa.pub

3) Log on to the other 2 servers and import the new file
cat ~/.ssh/linux1_id_dsa.pub >> ~/.ssh/authorized_keys

Do these 3 steps on all 3 servers, and make sure you can ssh to any remote server without a password prompt.
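
The passwordless login can be checked from a script instead of by hand. A minimal sketch: check_ssh_hosts is a name made up for this example, and it relies only on the standard ssh options BatchMode and ConnectTimeout.

```shell
#!/bin/sh
# Try a non-interactive login to each host given as an argument.
# BatchMode=yes makes ssh fail instead of asking for a password,
# so a host that still needs a password is reported as FAILED.
check_ssh_hosts() {
    status=0
    for host in "$@"; do
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true; then
            echo "$host: ok"
        else
            echo "$host: FAILED"
            status=1
        fi
    done
    return $status
}

# Example (run on each of the 3 servers):
#   check_ssh_hosts linux1 linux2 linux3
```

A host whose key has not been accepted yet is also reported as FAILED, because ssh cannot ask for confirmation in batch mode. Log on to it once by hand first.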


13. Configure Hadoop
1) Open $HADOOP_HOME/conf/hadoop-env.sh, set the line below
export JAVA_HOME=/usr/jdk1.6.0_37

2) Open $HADOOP_HOME/conf/masters, add the line below
linux1


3) Open $HADOOP_HOME/conf/slaves, add the lines below
linux2
linux3


4) Edit $HADOOP_HOME/conf/core-site.xml

<configuration>
<!-- global properties -->
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/hadoop-1.0.4/tmp</value>
<!-- you can set your own path for the tmp folder here -->
<description>A base for other temporary directories.</description>
</property>
<!-- file system properties -->
<property>
<name>fs.default.name</name>
<value>hdfs://linux1:9000</value>
</property>
</configuration>


5) Edit $HADOOP_HOME/conf/hdfs-site.xml

<configuration>
<!-- HDFS properties -->
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>/usr/hadoopfiles/Name</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/usr/hadoopfiles/Data</value>
</property>
</configuration>

6) Edit $HADOOP_HOME/conf/mapred-site.xml

<configuration>
<property>
<name>mapred.job.tracker</name>
<value>linux1:9001</value>
</property>
</configuration>

Do the same configuration on all 3 servers.
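
Instead of repeating the edits by hand, the conf folder can be copied from linux1 to the other servers. A sketch only, assuming Hadoop is installed under the same path on every server and the passwordless ssh from step 12 is working. RUN and push_conf are names made up for this example; by default the script just prints the scp commands it would run.

```shell
#!/bin/sh
# RUN is a hypothetical switch: the default "echo" only prints the commands.
# Set RUN="" in the environment to really copy the files.
RUN="${RUN-echo}"
HADOOP_HOME="${HADOOP_HOME:-/usr/hadoop-1.0.4}"

# Copy the conf folder from this server to each host given as an argument.
push_conf() {
    for host in "$@"; do
        $RUN scp -r "$HADOOP_HOME/conf" "$host:$HADOOP_HOME/" || return 1
    done
}

push_conf linux2 linux3
```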

14. Disable the firewall on all 3 servers
service iptables stop
chkconfig iptables off

15. Format the name node
cd /usr/hadoop-1.0.4/bin
./hadoop namenode -format

16. Start Hadoop on the master (linux1)
./start-all.sh

17. Run "jps" on all 3 servers to check whether Hadoop is running,
or open the web pages below
http://linux1:50030          --------->JobTracker
http://linux1:50070          --------->NameNode

If any process fails to start, check the log files in the logs folder.
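
With the layout above (linux1 listed in conf/masters, linux2 and linux3 as slaves), jps on the master would normally list NameNode, SecondaryNameNode and JobTracker, and on each slave DataNode and TaskTracker. The sketch below compares jps output against that list. missing_daemons is a name made up for this example, and the jps output at the end is canned sample text, not a real run.

```shell
#!/bin/sh
# Read jps output on stdin and print each expected Hadoop 1.x daemon that is absent.
# Usage: jps | missing_daemons master     (or: slave)
missing_daemons() {
    case "$1" in
        master) expected="NameNode SecondaryNameNode JobTracker" ;;
        slave)  expected="DataNode TaskTracker" ;;
        *) echo "unknown role: $1" >&2; return 2 ;;
    esac
    running=$(cat)
    status=0
    for daemon in $expected; do
        # -w matches whole words, so NameNode does not match SecondaryNameNode
        if ! printf '%s\n' "$running" | grep -qw "$daemon"; then
            echo "missing: $daemon"
            status=1
        fi
    done
    return $status
}

# Canned sample: a master where JobTracker did not start.
printf '2101 NameNode\n2290 SecondaryNameNode\n2433 Jps\n' \
    | missing_daemons master || echo "master is not healthy"
```

On a real server, pipe the output of jps into the function and look in the logs folder for the log of any daemon reported as missing.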

This is a good start for learning Hadoop. Even Microsoft is developing data solutions with Hadoop on the Windows platform, so it is time to learn new things.

reference:
http://blog.csdn.net/skyering/article/details/6457466











