Let's get straight to the point: MapReduce (MR) programs can run in two modes.
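1. Local run mode
① On Windows: run the program directly from IntelliJ IDEA or Eclipse. The input and output files can live on the local file system or on HDFS.
② On Linux: run the program from Eclipse. Again, the files can live on the local file system or on HDFS.

2. Cluster run mode
① On Windows: write the MR program, run its main method to submit the job to the cluster, and let YARN schedule and run it. Because the client platform differs from the cluster platform, this approach needs quite a few adjustments:
   - unpack a copy of the Hadoop distribution on Windows
   - configure HADOOP_HOME and Path
   - recompile YARNRunner
② On Linux: run the finished MR program and submit it to the cluster.
③ Write the MR program, package it as a jar, and run it from the command line: hadoop jar **.jar *******.java

Why bring up run modes at all? Because the efficient workflow is to write the MR program, test it locally until it passes, and then run it on the cluster from the command line.

OK! Next, a brief introduction to the YARN scheduling flow. For now a diagram gives a rough picture of the whole flow; the source code analysis follows tomorrow.

[Figure: YARN scheduling flow]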
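Going back to the two run modes listed above: in Hadoop 2.x and later, which mode a job uses is normally decided by the mapreduce.framework.name property. Its default value is local (the job runs inside the client JVM), while yarn submits the job to the ResourceManager. The fragment below is only a minimal sketch of a client-side mapred-site.xml for cluster mode; it assumes the rest of the cluster configuration (HDFS and ResourceManager addresses) is already in place, which the original post does not show.

```xml
<!-- mapred-site.xml on the submitting machine (sketch, not a complete configuration) -->
<configuration>
  <property>
    <!-- "local" runs the job in-process; "yarn" hands it to the cluster -->
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```

Leaving the property unset, or setting it to local, gives the local run mode, which is handy for the test-locally-first workflow described above.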
Now run the job and watch how the processes change while it runs.
[songlj@my01 ~]$ jps
2561 ResourceManager
2417 SecondaryNameNode
2659 NodeManager
3400 Jps
2236 DataNode
2141 NameNode
[songlj@my01 ~]$ jps
2561 ResourceManager
2417 SecondaryNameNode
2659 NodeManager
3478 Jps
3434 RunJar
2236 DataNode
2141 NameNode
[songlj@my01 ~]$ jps
2561 ResourceManager
2417 SecondaryNameNode
2659 NodeManager
3525 Jps
3434 RunJar
2236 DataNode
2141 NameNode
[songlj@my01 ~]$ jps
2561 ResourceManager
2417 SecondaryNameNode
2659 NodeManager
3652 MRAppMaster
3434 RunJar
2236 DataNode
2141 NameNode
3663 Jps
[songlj@my01 ~]$ jps
3680 Jps
2561 ResourceManager
2417 SecondaryNameNode
2659 NodeManager
3652 MRAppMaster
3434 RunJar
2236 DataNode
2141 NameNode
[songlj@my01 ~]$ jps
2561 ResourceManager
2417 SecondaryNameNode
2659 NodeManager
3652 MRAppMaster
3751 Jps
3434 RunJar
2236 DataNode
2141 NameNode
[songlj@my01 ~]$ jps
2561 ResourceManager
2417 SecondaryNameNode
2659 NodeManager
3811 Jps
3652 MRAppMaster
3801 YarnChild
3434 RunJar
2236 DataNode
2141 NameNode
[songlj@my01 ~]$ jps
2561 ResourceManager
2417 SecondaryNameNode
2659 NodeManager
3652 MRAppMaster
3831 Jps
3801 YarnChild
3434 RunJar
2236 DataNode
2141 NameNode
[songlj@my01 ~]$ jsp
-bash: jsp: command not found
[songlj@my01 ~]$ jps
2561 ResourceManager
2417 SecondaryNameNode
2659 NodeManager
3652 MRAppMaster
3913 Jps
3434 RunJar
2236 DataNode
2141 NameNode
3903 YarnChild
[songlj@my01 ~]$ jps
2561 ResourceManager
2417 SecondaryNameNode
2659 NodeManager
3652 MRAppMaster
3434 RunJar
2236 DataNode
2141 NameNode
3934 Jps
3903 YarnChild
[songlj@my01 ~]$ jps
2561 ResourceManager
2417 SecondaryNameNode
2659 NodeManager
3652 MRAppMaster
3992 Jps
3434 RunJar
2236 DataNode
2141 NameNode
[songlj@my01 ~]$ jps
2561 ResourceManager
2417 SecondaryNameNode
2659 NodeManager
3652 MRAppMaster
4004 Jps
2236 DataNode
2141 NameNode
[songlj@my01 ~]$ jps
2561 ResourceManager
2417 SecondaryNameNode
2659 NodeManager
3652 MRAppMaster
2236 DataNode
2141 NameNode
4015 Jps
[songlj@my01 ~]$ jps
2561 ResourceManager
2417 SecondaryNameNode
2659 NodeManager
2236 DataNode
2141 NameNode
4030 Jps
[songlj@my01 ~]$
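Reading a long run of jps snapshots by eye is tedious, so here is a small sketch that condenses such a log into one line per process: the first and the last snapshot in which it was seen. The class name JpsTimeline and its two helper methods are made up for this illustration and are not part of Hadoop. The sample data in main is an abbreviated copy of the session above, keeping only NameNode and the job-related processes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for this post; not a Hadoop class.
public class JpsTimeline {

    /** Returns "Name (pid N)" for every "<pid> <name>" line in one jps snapshot, skipping Jps itself. */
    static List<String> processes(String snapshot) {
        List<String> result = new ArrayList<>();
        for (String line : snapshot.split("\\R")) {
            String[] parts = line.trim().split("\\s+");
            // Anything that is not exactly "<digits> <name>" (e.g. a shell error) is ignored.
            boolean isJpsLine = parts.length == 2 && parts[0].matches("\\d+");
            if (isJpsLine && !parts[1].equals("Jps")) {
                result.add(parts[1] + " (pid " + parts[0] + ")");
            }
        }
        return result;
    }

    /** Maps each process to {first snapshot index, last snapshot index} in which it appeared. */
    static Map<String, int[]> lifetimes(List<String> snapshots) {
        Map<String, int[]> seen = new LinkedHashMap<>();
        for (int i = 0; i < snapshots.size(); i++) {
            for (String process : processes(snapshots.get(i))) {
                int[] span = seen.get(process);
                if (span == null) {
                    seen.put(process, new int[] {i, i});
                } else {
                    span[1] = i;
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // Abbreviated snapshots: the always-on daemons other than NameNode are left out.
        List<String> snapshots = Arrays.asList(
            "2141 NameNode\n3400 Jps",
            "2141 NameNode\n3434 RunJar\n3478 Jps",
            "2141 NameNode\n3434 RunJar\n3652 MRAppMaster\n3663 Jps",
            "2141 NameNode\n3434 RunJar\n3652 MRAppMaster\n3801 YarnChild\n3811 Jps",
            "2141 NameNode\n3434 RunJar\n3652 MRAppMaster\n3903 YarnChild\n3913 Jps",
            "2141 NameNode\n3434 RunJar\n3652 MRAppMaster\n3992 Jps",
            "2141 NameNode\n3652 MRAppMaster\n4004 Jps",
            "2141 NameNode\n4030 Jps");

        for (Map.Entry<String, int[]> e : lifetimes(snapshots).entrySet()) {
            System.out.printf("%-28s first seen in snapshot %d, last seen in snapshot %d%n",
                    e.getKey(), e.getValue()[0], e.getValue()[1]);
        }
    }
}
```

Keying on name plus pid, rather than on name alone, keeps the two YarnChild processes (pids 3801 and 3903) apart, so the summary shows that two separate task containers were started one after the other.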
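From the jps snapshots you can see three job-related processes appear and then disappear again:

RunJar
MRAppMaster
YarnChild

Their arrival and departure show which processes YARN starts while scheduling resources for a job, and in what order.

That's all for today. Tomorrow we'll look at the source code behind the YARN scheduling process. Corrections and suggestions are very welcome!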