How to Use MapReduce's Divide-and-Conquer Strategy to Speed Up the KNN Algorithm
Cluster environment:
Hadoop 2.4.1, 64-bit
6 servers:
hadoop11: NameNode, SecondaryNameNode
hadoop22: ResourceManager
hadoop33: DataNode, NodeManager
hadoop44: DataNode, NodeManager
hadoop55: DataNode, NodeManager
hadoop66: DataNode, NodeManager
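Experiment 1: the training set train.txt contains 245057 samples (3.24 MB) and the test set test.txt contains 51444 samples (640 KB); the entire test set is stored in test.txt.
[root@hadoop11 local]# hadoop fs -lsr /dir6/
-rw-r--r--   3 root supergroup    3400816 2016-07-17 19:28 /dir6/test.txt
Note: at this point the whole test set is stored in a single text file (test.txt), which serves as the input path.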
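The article does not show the KNN job's source code, so the following is only a minimal local sketch of how KNN classification can be phrased as a map step and a reduce step. The value k = 3 is an assumption, chosen because the counters in the run log report 154332 map output records for 51444 input records (three per test sample). Euclidean distance, the function names, and the toy data are likewise hypothetical.

```python
from collections import Counter, defaultdict
import math

K = 3  # assumption: inferred from 154332 / 51444 = 3 map outputs per test sample


def knn_map(test_id, test_vec, training_set, k=K):
    """Mapper: emit (test_id, label) for each of the k nearest training samples."""
    nearest = sorted(training_set, key=lambda s: math.dist(s[0], test_vec))[:k]
    for _, label in nearest:
        yield test_id, label


def knn_reduce(test_id, labels):
    """Reducer: majority vote over the labels emitted for one test sample."""
    return test_id, Counter(labels).most_common(1)[0][0]


def run_job(test_set, training_set):
    """Simulate the shuffle: group mapper output by key, then reduce each group."""
    groups = defaultdict(list)
    for test_id, vec in test_set:
        for key, label in knn_map(test_id, vec, training_set):
            groups[key].append(label)
    return dict(knn_reduce(key, labels) for key, labels in groups.items())


# Hypothetical toy data: (feature vector, label) pairs and (id, feature vector) pairs.
training = [((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"),
            ((5.0, 5.0), "b"), ((5.1, 4.9), "b"), ((4.8, 5.2), "b")]
tests = [(0, (0.05, 0.05)), (1, (5.0, 5.1))]

print(run_job(tests, training))  # {0: 'a', 1: 'b'}
```

Because every test sample is classified independently of the others, the test set can be cut into pieces and each piece handed to a separate mapper, which is what the following experiments do.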
KNN algorithm run log:
16/07/17 19:32:24 INFO client.RMProxy: Connecting to ResourceManager at hadoop22/10.187.84.51:8032
16/07/17 19:32:25 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
16/07/17 19:32:25 INFO input.FileInputFormat: Total input paths to process : 1
16/07/17 19:32:25 INFO mapreduce.JobSubmitter: number of splits:1
16/07/17 19:32:26 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1468752229715_0016
16/07/17 19:32:26 INFO impl.YarnClientImpl: Submitted application application_1468752229715_0016
16/07/17 19:32:26 INFO mapreduce.Job: The url to track the job: http://hadoop22:8088/proxy/application_1468752229715_0016/
16/07/17 19:32:26 INFO mapreduce.Job: Running job: job_1468752229715_0016
16/07/17 19:32:32 INFO mapreduce.Job: Job job_1468752229715_0016 running in uber mode : false
16/07/17 19:32:32 INFO mapreduce.Job: map 0% reduce 0%
16/07/17 19:32:49 INFO mapreduce.Job: map 1% reduce 0%
16/07/17 19:33:05 INFO mapreduce.Job: map 2% reduce 0%
16/07/17 19:33:20 INFO mapreduce.Job: map 3% reduce 0%
16/07/17 19:33:35 INFO mapreduce.Job: map 4% reduce 0%
16/07/17 19:33:50 INFO mapreduce.Job: map 5% reduce 0%
16/07/17 19:34:02 INFO mapreduce.Job: map 6% reduce 0%
16/07/17 19:34:17 INFO mapreduce.Job: map 7% reduce 0%
16/07/17 19:34:32 INFO mapreduce.Job: map 8% reduce 0%
16/07/17 19:34:47 INFO mapreduce.Job: map 9% reduce 0%
16/07/17 19:35:02 INFO mapreduce.Job: map 10% reduce 0%
16/07/17 19:35:14 INFO mapreduce.Job: map 11% reduce 0%
16/07/17 19:35:29 INFO mapreduce.Job: map 12% reduce 0%
16/07/17 19:35:44 INFO mapreduce.Job: map 13% reduce 0%
16/07/17 19:35:59 INFO mapreduce.Job: map 14% reduce 0%
16/07/17 19:36:12 INFO mapreduce.Job: map 15% reduce 0%
16/07/17 19:36:27 INFO mapreduce.Job: map 16% reduce 0%
16/07/17 19:36:42 INFO mapreduce.Job: map 17% reduce 0%
16/07/17 19:36:57 INFO mapreduce.Job: map 18% reduce 0%
16/07/17 19:37:12 INFO mapreduce.Job: map 19% reduce 0%
16/07/17 19:37:27 INFO mapreduce.Job: map 20% reduce 0%
16/07/17 19:37:39 INFO mapreduce.Job: map 21% reduce 0%
16/07/17 19:37:54 INFO mapreduce.Job: map 22% reduce 0%
16/07/17 19:38:09 INFO mapreduce.Job: map 23% reduce 0%
16/07/17 19:38:24 INFO mapreduce.Job: map 24% reduce 0%
16/07/17 19:38:39 INFO mapreduce.Job: map 25% reduce 0%
16/07/17 19:38:51 INFO mapreduce.Job: map 26% reduce 0%
16/07/17 19:39:06 INFO mapreduce.Job: map 27% reduce 0%
16/07/17 19:39:22 INFO mapreduce.Job: map 28% reduce 0%
16/07/17 19:39:37 INFO mapreduce.Job: map 29% reduce 0%
16/07/17 19:39:52 INFO mapreduce.Job: map 30% reduce 0%
16/07/17 19:40:07 INFO mapreduce.Job: map 31% reduce 0%
16/07/17 19:40:22 INFO mapreduce.Job: map 32% reduce 0%
16/07/17 19:40:37 INFO mapreduce.Job: map 33% reduce 0%
16/07/17 19:40:52 INFO mapreduce.Job: map 34% reduce 0%
16/07/17 19:41:04 INFO mapreduce.Job: map 35% reduce 0%
16/07/17 19:41:22 INFO mapreduce.Job: map 36% reduce 0%
16/07/17 19:41:37 INFO mapreduce.Job: map 37% reduce 0%
16/07/17 19:41:52 INFO mapreduce.Job: map 38% reduce 0%
16/07/17 19:42:07 INFO mapreduce.Job: map 39% reduce 0%
16/07/17 19:42:22 INFO mapreduce.Job: map 40% reduce 0%
16/07/17 19:42:37 INFO mapreduce.Job: map 41% reduce 0%
16/07/17 19:42:53 INFO mapreduce.Job: map 42% reduce 0%
16/07/17 19:43:08 INFO mapreduce.Job: map 43% reduce 0%
16/07/17 19:43:23 INFO mapreduce.Job: map 44% reduce 0%
16/07/17 19:43:41 INFO mapreduce.Job: map 45% reduce 0%
16/07/17 19:43:56 INFO mapreduce.Job: map 46% reduce 0%
16/07/17 19:44:12 INFO mapreduce.Job: map 47% reduce 0%
16/07/17 19:44:30 INFO mapreduce.Job: map 48% reduce 0%
16/07/17 19:44:45 INFO mapreduce.Job: map 49% reduce 0%
16/07/17 19:45:00 INFO mapreduce.Job: map 50% reduce 0%
16/07/17 19:45:15 INFO mapreduce.Job: map 51% reduce 0%
16/07/17 19:45:30 INFO mapreduce.Job: map 52% reduce 0%
16/07/17 19:45:48 INFO mapreduce.Job: map 53% reduce 0%
16/07/17 19:46:03 INFO mapreduce.Job: map 54% reduce 0%
16/07/17 19:46:18 INFO mapreduce.Job: map 55% reduce 0%
16/07/17 19:46:33 INFO mapreduce.Job: map 56% reduce 0%
16/07/17 19:46:49 INFO mapreduce.Job: map 57% reduce 0%
16/07/17 19:47:07 INFO mapreduce.Job: map 58% reduce 0%
16/07/17 19:47:22 INFO mapreduce.Job: map 59% reduce 0%
16/07/17 19:47:37 INFO mapreduce.Job: map 60% reduce 0%
16/07/17 19:47:55 INFO mapreduce.Job: map 61% reduce 0%
16/07/17 19:48:10 INFO mapreduce.Job: map 62% reduce 0%
16/07/17 19:48:25 INFO mapreduce.Job: map 63% reduce 0%
16/07/17 19:48:43 INFO mapreduce.Job: map 64% reduce 0%
16/07/17 19:48:58 INFO mapreduce.Job: map 65% reduce 0%
16/07/17 19:49:13 INFO mapreduce.Job: map 66% reduce 0%
16/07/17 19:49:28 INFO mapreduce.Job: map 67% reduce 0%
16/07/17 19:49:30 INFO mapreduce.Job: map 100% reduce 0%
16/07/17 19:49:37 INFO mapreduce.Job: map 100% reduce 100%
16/07/17 19:49:38 INFO mapreduce.Job: Job job_1468752229715_0016 completed successfully
16/07/17 19:49:39 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=2892255
        FILE: Number of bytes written=5971253
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=4056338
        HDFS: Number of bytes written=861195
        HDFS: Number of read operations=7
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=1016177
        Total time spent by all reduces in occupied slots (ms)=4948
        Total time spent by all map tasks (ms)=1016177
        Total time spent by all reduce tasks (ms)=4948
        Total vcore-seconds taken by all map tasks=1016177
        Total vcore-seconds taken by all reduce tasks=4948
        Total megabyte-seconds taken by all map tasks=1040565248
        Total megabyte-seconds taken by all reduce tasks=5066752
    Map-Reduce Framework
        Map input records=51444
        Map output records=154332
        Map output bytes=2583585
        Map output materialized bytes=2892255
        Input split bytes=103
        Combine input records=0
        Combine output records=0
        Reduce input groups=51444
        Reduce shuffle bytes=2892255
        Reduce input records=154332
        Reduce output records=51444
        Spilled Records=308664
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=5836
        CPU time spent (ms)=1033510
        Physical memory (bytes) snapshot=517627904
        Virtual memory (bytes) snapshot=1786634240
        Total committed heap usage (bytes)=306774016
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=655419
    File Output Format Counters
        Bytes Written=861195
Statistics:
Accuracy: 51444 51367 (apparently 51367 of the 51444 test samples classified correctly, about 99.85%)
CPU time spent (ms)=1033510
map tasks=1
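The CPU time counter adds up the CPU time of all tasks, so it cannot by itself show whether a job finished sooner. Wall-clock time can be read from the log timestamps instead. The sketch below assumes each client log line begins with the usual `yy/MM/dd HH:mm:ss` timestamp; the helper name `parse_ts` is hypothetical, and the two lines are the "Running job" and "completed successfully" entries of this experiment.

```python
from datetime import datetime


def parse_ts(line):
    """Parse the leading 'yy/MM/dd HH:mm:ss' timestamp of a Hadoop client log line."""
    return datetime.strptime(line[:17], "%y/%m/%d %H:%M:%S")


start = parse_ts("16/07/17 19:32:26 INFO mapreduce.Job: Running job: job_1468752229715_0016")
end = parse_ts("16/07/17 19:49:38 INFO mapreduce.Job: Job job_1468752229715_0016 completed successfully")

elapsed_s = (end - start).total_seconds()  # wall-clock duration of the job
cpu_s = 1033510 / 1000                     # "CPU time spent (ms)" counter, in seconds

print(elapsed_s, cpu_s)  # 1032.0 1033.51
```

With a single mapper the two figures nearly coincide (about 17 minutes each), which is what one would expect when all the work runs in one task.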
Experiment 2: the training set train.txt is unchanged at 245057 samples and the test set still contains 51444 samples, but the test set is now split across test1.txt (25568) and test2.txt (25857).
[root@hadoop11 local]# hadoop fs -lsr /dir6/
-rw-r--r--   3 root supergroup     368774 2016-07-17 20:15 /dir6/test1.txt
-rw-r--r--   3 root supergroup     312210 2016-07-17 20:15 /dir6/test2.txt
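Each of these files is far smaller than an HDFS block, so the job gets one input split, and therefore one mapper, per file; the run log reports as many splits as there are input files. The article does not say how the test set was divided, so the following is only a sketch of one way to do it. It cuts the file into near-equal parts by line count, whereas the author's parts are uneven; the function names and the `testN.txt` naming are assumptions for illustration.

```python
from pathlib import Path


def split_lines(lines, parts):
    """Split a list of lines into `parts` contiguous chunks of near-equal size."""
    base, extra = divmod(len(lines), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + base + (1 if i < extra else 0)  # first `extra` chunks get one more line
        chunks.append(lines[start:end])
        start = end
    return chunks


def split_file(src, out_dir, parts):
    """Write the chunks of `src` to out_dir/test1.txt, test2.txt, ... and return the paths."""
    lines = Path(src).read_text().splitlines(keepends=True)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, chunk in enumerate(split_lines(lines, parts), start=1):
        path = out / f"test{i}.txt"
        path.write_text("".join(chunk))
        paths.append(path)
    return paths
```

The resulting files would then be uploaded to the job's input directory, for example with `hadoop fs -put`.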
KNN algorithm run log:
First, the process listing:
[root@hadoop66 ~]# jps
24659 YarnChild (mapper task)
22777 DataNode
25592 Jps
24660 YarnChild (mapper task)
24557 MRAppMaster
22622 NodeManager
Counter log:
[root@hadoop11 local]# app1.sh
16/07/17 20:21:03 INFO client.RMProxy: Connecting to ResourceManager at hadoop22/10.187.84.51:8032
16/07/17 20:21:03 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
16/07/17 20:21:03 INFO input.FileInputFormat: Total input paths to process : 2
16/07/17 20:21:03 INFO mapreduce.JobSubmitter: number of splits:2
16/07/17 20:21:03 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1468752229715_0019
16/07/17 20:21:04 INFO impl.YarnClientImpl: Submitted application application_1468752229715_0019
16/07/17 20:21:04 INFO mapreduce.Job: The url to track the job: http://hadoop22:8088/proxy/application_1468752229715_0019/
16/07/17 20:21:04 INFO mapreduce.Job: Running job: job_1468752229715_0019
16/07/17 20:21:10 INFO mapreduce.Job: Job job_1468752229715_0019 running in uber mode : false
16/07/17 20:21:10 INFO mapreduce.Job: map 0% reduce 0%
16/07/17 20:21:21 INFO mapreduce.Job: map 1% reduce 0%
16/07/17 20:21:30 INFO mapreduce.Job: map 2% reduce 0%
16/07/17 20:21:40 INFO mapreduce.Job: map 3% reduce 0%
16/07/17 20:21:46 INFO mapreduce.Job: map 4% reduce 0%
16/07/17 20:21:55 INFO mapreduce.Job: map 5% reduce 0%
16/07/17 20:22:01 INFO mapreduce.Job: map 6% reduce 0%
16/07/17 20:22:10 INFO mapreduce.Job: map 7% reduce 0%
16/07/17 20:22:17 INFO mapreduce.Job: map 8% reduce 0%
16/07/17 20:22:26 INFO mapreduce.Job: map 9% reduce 0%
16/07/17 20:22:35 INFO mapreduce.Job: map 10% reduce 0%
16/07/17 20:22:41 INFO mapreduce.Job: map 11% reduce 0%
16/07/17 20:22:47 INFO mapreduce.Job: map 12% reduce 0%
16/07/17 20:22:56 INFO mapreduce.Job: map 13% reduce 0%
16/07/17 20:23:05 INFO mapreduce.Job: map 14% reduce 0%
16/07/17 20:23:11 INFO mapreduce.Job: map 15% reduce 0%
16/07/17 20:23:17 INFO mapreduce.Job: map 16% reduce 0%
16/07/17 20:23:26 INFO mapreduce.Job: map 17% reduce 0%
16/07/17 20:23:35 INFO mapreduce.Job: map 18% reduce 0%
16/07/17 20:23:41 INFO mapreduce.Job: map 19% reduce 0%
16/07/17 20:23:50 INFO mapreduce.Job: map 20% reduce 0%
16/07/17 20:23:56 INFO mapreduce.Job: map 21% reduce 0%
16/07/17 20:24:05 INFO mapreduce.Job: map 22% reduce 0%
16/07/17 20:24:11 INFO mapreduce.Job: map 23% reduce 0%
16/07/17 20:24:20 INFO mapreduce.Job: map 24% reduce 0%
16/07/17 20:24:26 INFO mapreduce.Job: map 25% reduce 0%
16/07/17 20:24:35 INFO mapreduce.Job: map 26% reduce 0%
16/07/17 20:24:42 INFO mapreduce.Job: map 27% reduce 0%
16/07/17 20:24:51 INFO mapreduce.Job: map 28% reduce 0%
16/07/17 20:24:57 INFO mapreduce.Job: map 29% reduce 0%
16/07/17 20:25:06 INFO mapreduce.Job: map 30% reduce 0%
16/07/17 20:25:12 INFO mapreduce.Job: map 31% reduce 0%
16/07/17 20:25:21 INFO mapreduce.Job: map 32% reduce 0%
16/07/17 20:25:27 INFO mapreduce.Job: map 33% reduce 0%
16/07/17 20:25:36 INFO mapreduce.Job: map 34% reduce 0%
16/07/17 20:25:42 INFO mapreduce.Job: map 35% reduce 0%
16/07/17 20:25:51 INFO mapreduce.Job: map 36% reduce 0%
16/07/17 20:25:57 INFO mapreduce.Job: map 37% reduce 0%
16/07/17 20:26:06 INFO mapreduce.Job: map 38% reduce 0%
16/07/17 20:26:12 INFO mapreduce.Job: map 39% reduce 0%
16/07/17 20:26:21 INFO mapreduce.Job: map 40% reduce 0%
16/07/17 20:26:30 INFO mapreduce.Job: map 41% reduce 0%
16/07/17 20:26:36 INFO mapreduce.Job: map 42% reduce 0%
16/07/17 20:26:45 INFO mapreduce.Job: map 43% reduce 0%
16/07/17 20:26:51 INFO mapreduce.Job: map 44% reduce 0%
16/07/17 20:27:00 INFO mapreduce.Job: map 45% reduce 0%
16/07/17 20:27:06 INFO mapreduce.Job: map 46% reduce 0%
16/07/17 20:27:15 INFO mapreduce.Job: map 47% reduce 0%
16/07/17 20:27:21 INFO mapreduce.Job: map 48% reduce 0%
16/07/17 20:27:30 INFO mapreduce.Job: map 49% reduce 0%
16/07/17 20:27:36 INFO mapreduce.Job: map 50% reduce 0%
16/07/17 20:27:45 INFO mapreduce.Job: map 51% reduce 0%
16/07/17 20:27:51 INFO mapreduce.Job: map 52% reduce 0%
16/07/17 20:28:01 INFO mapreduce.Job: map 53% reduce 0%
16/07/17 20:28:07 INFO mapreduce.Job: map 54% reduce 0%
16/07/17 20:28:16 INFO mapreduce.Job: map 55% reduce 0%
16/07/17 20:28:23 INFO mapreduce.Job: map 56% reduce 0%
16/07/17 20:28:31 INFO mapreduce.Job: map 57% reduce 0%
16/07/17 20:28:38 INFO mapreduce.Job: map 58% reduce 0%
16/07/17 20:28:46 INFO mapreduce.Job: map 59% reduce 0%
16/07/17 20:28:53 INFO mapreduce.Job: map 60% reduce 0%
16/07/17 20:29:02 INFO mapreduce.Job: map 61% reduce 0%
16/07/17 20:29:10 INFO mapreduce.Job: map 62% reduce 0%
16/07/17 20:29:17 INFO mapreduce.Job: map 63% reduce 0%
16/07/17 20:29:26 INFO mapreduce.Job: map 64% reduce 0%
16/07/17 20:29:32 INFO mapreduce.Job: map 65% reduce 0%
16/07/17 20:29:41 INFO mapreduce.Job: map 66% reduce 0%
16/07/17 20:29:42 INFO mapreduce.Job: map 83% reduce 0%
16/07/17 20:29:52 INFO mapreduce.Job: map 83% reduce 17%
16/07/17 20:29:54 INFO mapreduce.Job: map 100% reduce 17%
16/07/17 20:29:55 INFO mapreduce.Job: map 100% reduce 70%
16/07/17 20:29:56 INFO mapreduce.Job: map 100% reduce 100%
16/07/17 20:29:56 INFO mapreduce.Job: Job job_1468752229715_0019 completed successfully
16/07/17 20:29:56 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=2892255
        FILE: Number of bytes written=6064619
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=7482816
        HDFS: Number of bytes written=861195
        HDFS: Number of read operations=11
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=2
        Launched reduce tasks=1
        Data-local map tasks=2
        Total time spent by all maps in occupied slots (ms)=1032086
        Total time spent by all reduces in occupied slots (ms)=11757
        Total time spent by all map tasks (ms)=1032086
        Total time spent by all reduce tasks (ms)=11757
        Total vcore-seconds taken by all map tasks=1032086
        Total vcore-seconds taken by all reduce tasks=11757
        Total megabyte-seconds taken by all map tasks=1056856064
        Total megabyte-seconds taken by all reduce tasks=12039168
    Map-Reduce Framework
        Map input records=51444
        Map output records=154332
        Map output bytes=2583585
        Map output materialized bytes=2892261
        Input split bytes=200
        Combine input records=0
        Combine output records=0
        Reduce input groups=51444
        Reduce shuffle bytes=2892261
        Reduce input records=154332
        Reduce output records=51444
        Spilled Records=308664
        Shuffled Maps =2
        Failed Shuffles=0
        Merged Map outputs=2
        GC time elapsed (ms)=8264
        CPU time spent (ms)=1045670
        Physical memory (bytes) snapshot=762257408
        Virtual memory (bytes) snapshot=2654359552
        Total committed heap usage (bytes)=496762880
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=680984
    File Output Format Counters
        Bytes Written=861195
16/07/17 20:29:58 INFO client.RMProxy: Connecting to ResourceManager at hadoop22/10.187.84.51:8032
16/07/17 20:29:59 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
16/07/17 20:29:59 INFO input.FileInputFormat: Total input paths to process : 1
16/07/17 20:29:59 INFO mapreduce.JobSubmitter: number of splits:1
16/07/17 20:29:59 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1468752229715_0020
16/07/17 20:29:59 INFO impl.YarnClientImpl: Submitted application application_1468752229715_0020
16/07/17 20:30:00 INFO mapreduce.Job: The url to track the job: http://hadoop22:8088/proxy/application_1468752229715_0020/
16/07/17 20:30:00 INFO mapreduce.Job: Running job: job_1468752229715_0020
16/07/17 20:30:05 INFO mapreduce.Job: Job job_1468752229715_0020 running in uber mode : false
16/07/17 20:30:05 INFO mapreduce.Job: map 0% reduce 0%
16/07/17 20:30:12 INFO mapreduce.Job: map 100% reduce 0%
16/07/17 20:30:18 INFO mapreduce.Job: map 100% reduce 100%
16/07/17 20:30:18 INFO mapreduce.Job: Job job_1468752229715_0020 completed successfully
16/07/17 20:30:18 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=24
        FILE: Number of bytes written=186173
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=861298
        HDFS: Number of bytes written=12
        HDFS: Number of read operations=6
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=3973
        Total time spent by all reduces in occupied slots (ms)=3243
        Total time spent by all map tasks (ms)=3973
        Total time spent by all reduce tasks (ms)=3243
        Total vcore-seconds taken by all map tasks=3973
        Total vcore-seconds taken by all reduce tasks=3243
        Total megabyte-seconds taken by all map tasks=4068352
        Total megabyte-seconds taken by all reduce tasks=3320832
    Map-Reduce Framework
        Map input records=51444
        Map output records=1
        Map output bytes=16
        Map output materialized bytes=24
        Input split bytes=103
        Combine input records=0
        Combine output records=0
        Reduce input groups=1
        Reduce shuffle bytes=24
        Reduce input records=1
        Reduce output records=1
        Spilled Records=2
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=70
        CPU time spent (ms)=2340
        Physical memory (bytes) snapshot=451612672
        Virtual memory (bytes) snapshot=1790021632
        Total committed heap usage (bytes)=309002240
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=861195
    File Output Format Counters
        Bytes Written=12
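Statistics:
Accuracy: 51444 51367
CPU time spent (ms)=1045670 (slightly higher than in Experiment 1, because creating the mapper tasks takes time and both mapper tasks ran on the same server, hadoop66. This counter sums CPU time over all tasks; by the log timestamps, the wall-clock time of the KNN job fell from about 17 minutes in Experiment 1 to about 9 minutes here.)
map tasks=2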
Experiment 3: the training set train.txt is unchanged at 245057 samples and the test set still contains 51444 samples, but the test set is now split across test1.txt (25402), test2.txt (15224) and test3.txt (10818).
[root@hadoop11 local]# hadoop fs -lsr /dir6/
lsr: DEPRECATED: Please use 'ls -R' instead.
-rw-r--r--   3 root supergroup     128161 2016-07-17 20:54 /dir6/test1.txt
-rw-r--r--   3 root supergroup     366313 2016-07-17 20:54 /dir6/test2.txt
-rw-r--r--   3 root supergroup     201566 2016-07-17 20:54 /dir6/test3.txt
First, the process listing:
[root@hadoop33 ~]# jps
26501 Jps
26279 YarnChild (mapper task)
2399 QuorumPeerMain
26280 YarnChild (mapper task)
23800 DataNode
23648 NodeManager
26133 MRAppMaster
[root@hadoop66 ~]# jps
22777 DataNode
26652 Jps
26302 YarnChild (mapper task)
22622 NodeManager
Here we can see that the mapper tasks are now executed on two servers: divide and conquer!
Detailed run log:
[root@hadoop11 local]# app1.sh
16/07/17 20:55:17 INFO client.RMProxy: Connecting to ResourceManager at hadoop22/10.187.84.51:8032
16/07/17 20:55:18 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
16/07/17 20:55:18 INFO input.FileInputFormat: Total input paths to process : 3
16/07/17 20:55:18 INFO mapreduce.JobSubmitter: number of splits:3
16/07/17 20:55:18 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1468752229715_0021
16/07/17 20:55:19 INFO impl.YarnClientImpl: Submitted application application_1468752229715_0021
16/07/17 20:55:19 INFO mapreduce.Job: The url to track the job: http://hadoop22:8088/proxy/application_1468752229715_0021/
16/07/17 20:55:19 INFO mapreduce.Job: Running job: job_1468752229715_0021
16/07/17 20:55:25 INFO mapreduce.Job: Job job_1468752229715_0021 running in uber mode : false
16/07/17 20:55:25 INFO mapreduce.Job: map 0% reduce 0%
16/07/17 20:55:37 INFO mapreduce.Job: map 1% reduce 0%
16/07/17 20:55:40 INFO mapreduce.Job: map 2% reduce 0%
16/07/17 20:55:45 INFO mapreduce.Job: map 3% reduce 0%
16/07/17 20:55:49 INFO mapreduce.Job: map 4% reduce 0%
16/07/17 20:55:54 INFO mapreduce.Job: map 5% reduce 0%
16/07/17 20:55:58 INFO mapreduce.Job: map 6% reduce 0%
16/07/17 20:56:03 INFO mapreduce.Job: map 7% reduce 0%
16/07/17 20:56:07 INFO mapreduce.Job: map 8% reduce 0%
16/07/17 20:56:12 INFO mapreduce.Job: map 9% reduce 0%
16/07/17 20:56:16 INFO mapreduce.Job: map 10% reduce 0%
16/07/17 20:56:20 INFO mapreduce.Job: map 11% reduce 0%
16/07/17 20:56:24 INFO mapreduce.Job: map 12% reduce 0%
16/07/17 20:56:29 INFO mapreduce.Job: map 13% reduce 0%
16/07/17 20:56:33 INFO mapreduce.Job: map 14% reduce 0%
16/07/17 20:56:37 INFO mapreduce.Job: map 15% reduce 0%
16/07/17 20:56:42 INFO mapreduce.Job: map 16% reduce 0%
16/07/17 20:56:47 INFO mapreduce.Job: map 17% reduce 0%
16/07/17 20:56:51 INFO mapreduce.Job: map 18% reduce 0%
16/07/17 20:56:56 INFO mapreduce.Job: map 19% reduce 0%
16/07/17 20:57:00 INFO mapreduce.Job: map 20% reduce 0%
16/07/17 20:57:05 INFO mapreduce.Job: map 21% reduce 0%
16/07/17 20:57:08 INFO mapreduce.Job: map 22% reduce 0%
16/07/17 20:57:13 INFO mapreduce.Job: map 23% reduce 0%
16/07/17 20:57:18 INFO mapreduce.Job: map 24% reduce 0%
16/07/17 20:57:23 INFO mapreduce.Job: map 25% reduce 0%
16/07/17 20:57:27 INFO mapreduce.Job: map 26% reduce 0%
16/07/17 20:57:32 INFO mapreduce.Job: map 27% reduce 0%
16/07/17 20:57:36 INFO mapreduce.Job: map 28% reduce 0%
16/07/17 20:57:41 INFO mapreduce.Job: map 29% reduce 0%
16/07/17 20:57:45 INFO mapreduce.Job: map 30% reduce 0%
16/07/17 20:57:50 INFO mapreduce.Job: map 31% reduce 0%
16/07/17 20:57:54 INFO mapreduce.Job: map 32% reduce 0%
16/07/17 20:57:59 INFO mapreduce.Job: map 33% reduce 0%
16/07/17 20:58:03 INFO mapreduce.Job: map 34% reduce 0%
16/07/17 20:58:08 INFO mapreduce.Job: map 35% reduce 0%
16/07/17 20:58:12 INFO mapreduce.Job: map 36% reduce 0%
16/07/17 20:58:15 INFO mapreduce.Job: map 37% reduce 0%
16/07/17 20:58:20 INFO mapreduce.Job: map 38% reduce 0%
16/07/17 20:58:24 INFO mapreduce.Job: map 39% reduce 0%
16/07/17 20:58:29 INFO mapreduce.Job: map 40% reduce 0%
16/07/17 20:58:33 INFO mapreduce.Job: map 41% reduce 0%
16/07/17 20:58:38 INFO mapreduce.Job: map 42% reduce 0%
16/07/17 20:58:42 INFO mapreduce.Job: map 43% reduce 0%
16/07/17 20:58:47 INFO mapreduce.Job: map 44% reduce 0%
16/07/17 20:58:51 INFO mapreduce.Job: map 45% reduce 0%
16/07/17 20:58:56 INFO mapreduce.Job: map 46% reduce 0%
16/07/17 20:59:00 INFO mapreduce.Job: map 58% reduce 0%
16/07/17 20:59:06 INFO mapreduce.Job: map 59% reduce 0%
16/07/17 20:59:11 INFO mapreduce.Job: map 59% reduce 11%
16/07/17 20:59:15 INFO mapreduce.Job: map 60% reduce 11%
16/07/17 20:59:21 INFO mapreduce.Job: map 61% reduce 11%
16/07/17 20:59:30 INFO mapreduce.Job: map 62% reduce 11%
16/07/17 20:59:39 INFO mapreduce.Job: map 63% reduce 11%
16/07/17 20:59:48 INFO mapreduce.Job: map 64% reduce 11%
16/07/17 20:59:58 INFO mapreduce.Job: map 65% reduce 11%
16/07/17 21:00:04 INFO mapreduce.Job: map 66% reduce 11%
16/07/17 21:00:13 INFO mapreduce.Job: map 67% reduce 11%
16/07/17 21:00:23 INFO mapreduce.Job: map 68% reduce 11%
16/07/17 21:00:26 INFO mapreduce.Job: map 79% reduce 11%
16/07/17 21:00:27 INFO mapreduce.Job: map 79% reduce 22%
16/07/17 21:00:35 INFO mapreduce.Job: map 80% reduce 22%
16/07/17 21:00:59 INFO mapreduce.Job: map 81% reduce 22%
16/07/17 21:01:20 INFO mapreduce.Job: map 82% reduce 22%
16/07/17 21:01:44 INFO mapreduce.Job: map 83% reduce 22%
16/07/17 21:02:08 INFO mapreduce.Job: map 84% reduce 22%
16/07/17 21:02:32 INFO mapreduce.Job: map 85% reduce 22%
16/07/17 21:02:56 INFO mapreduce.Job: map 86% reduce 22%
16/07/17 21:03:17 INFO mapreduce.Job: map 87% reduce 22%
16/07/17 21:03:41 INFO mapreduce.Job: map 88% reduce 22%
16/07/17 21:04:06 INFO mapreduce.Job: map 89% reduce 22%
16/07/17 21:04:15 INFO mapreduce.Job: map 100% reduce 22%
16/07/17 21:04:16 INFO mapreduce.Job: map 100% reduce 90%
16/07/17 21:04:17 INFO mapreduce.Job: map 100% reduce 100%
16/07/17 21:04:17 INFO mapreduce.Job: Job job_1468752229715_0021 completed successfully
16/07/17 21:04:17 INFO mapreduce.Job: Counters: 50
    File System Counters
        FILE: Number of bytes read=2892255
        FILE: Number of bytes written=6158011
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=10898788
        HDFS: Number of bytes written=861195
        HDFS: Number of read operations=15
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Killed map tasks=2
        Launched map tasks=5
        Launched reduce tasks=1
        Data-local map tasks=5
        Total time spent by all maps in occupied slots (ms)=1417294
        Total time spent by all reduces in occupied slots (ms)=313657
        Total time spent by all map tasks (ms)=1417294
        Total time spent by all reduce tasks (ms)=313657
        Total vcore-seconds taken by all map tasks=1417294
        Total vcore-seconds taken by all reduce tasks=313657
        Total megabyte-seconds taken by all map tasks=1451309056
        Total megabyte-seconds taken by all reduce tasks=321184768
    Map-Reduce Framework
        Map input records=51444
        Map output records=154332
        Map output bytes=2583585
        Map output materialized bytes=2892267
        Input split bytes=300
        Combine input records=0
        Combine output records=0
        Reduce input groups=51444
        Reduce shuffle bytes=2892267
        Reduce input records=154332
        Reduce output records=51444
        Spilled Records=308664
        Shuffled Maps =3
        Failed Shuffles=0
        Merged Map outputs=3
        GC time elapsed (ms)=9078
        CPU time spent (ms)=1054730
        Physical memory (bytes) snapshot=1011130368
        Virtual memory (bytes) snapshot=3553914880
        Total committed heap usage (bytes)=575209472
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=696040
    File Output Format Counters
        Bytes Written=861195
16/07/17 21:04:19 INFO client.RMProxy: Connecting to ResourceManager at hadoop22/10.187.84.51:8032
16/07/17 21:04:19 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
16/07/17 21:04:20 INFO input.FileInputFormat: Total input paths to process : 1
16/07/17 21:04:20 INFO mapreduce.JobSubmitter: number of splits:1
16/07/17 21:04:20 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1468752229715_0022
16/07/17 21:04:20 INFO impl.YarnClientImpl: Submitted application application_1468752229715_0022
16/07/17 21:04:20 INFO mapreduce.Job: The url to track the job: http://hadoop22:8088/proxy/application_1468752229715_0022/
16/07/17 21:04:20 INFO mapreduce.Job: Running job: job_1468752229715_0022
16/07/17 21:04:27 INFO mapreduce.Job: Job job_1468752229715_0022 running in uber mode : false
16/07/17 21:04:27 INFO mapreduce.Job: map 0% reduce 0%
16/07/17 21:04:33 INFO mapreduce.Job: map 100% reduce 0%
16/07/17 21:04:38 INFO mapreduce.Job: map 100% reduce 100%
16/07/17 21:04:38 INFO mapreduce.Job: Job job_1468752229715_0022 completed successfully
16/07/17 21:04:38 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=24
        FILE: Number of bytes written=186173
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=861298
        HDFS: Number of bytes written=12
        HDFS: Number of read operations=6
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=3580
        Total time spent by all reduces in occupied slots (ms)=3393
        Total time spent by all map tasks (ms)=3580
        Total time spent by all reduce tasks (ms)=3393
        Total vcore-seconds taken by all map tasks=3580
        Total vcore-seconds taken by all reduce tasks=3393
        Total megabyte-seconds taken by all map tasks=3665920
        Total megabyte-seconds taken by all reduce tasks=3474432
    Map-Reduce Framework
        Map input records=51444
        Map output records=1
        Map output bytes=16
        Map output materialized bytes=24
        Input split bytes=103
        Combine input records=0
        Combine output records=0
        Reduce input groups=1
        Reduce shuffle bytes=24
        Reduce input records=1
        Reduce output records=1
        Spilled Records=2
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=89
        CPU time spent (ms)=2360
        Physical memory (bytes) snapshot=435548160
        Virtual memory (bytes) snapshot=1775456256
        Total committed heap usage (bytes)=310444032
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=861195
    File Output Format Counters
        Bytes Written=12
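Statistics:
Accuracy: 51444 51367
CPU time spent (ms)=1054730 (this suggests that when the data volume is small, divide and conquer is not a good fit, which indirectly shows that Hadoop is suited to big data)
map tasks=3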
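Summary: when processing big data, MapReduce gradually brings the advantages of the cluster into play, using parallel mapper tasks to increase processing speed.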
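The pattern behind the three experiments (cut the input into slices, process the slices at the same time, merge the results) can be sketched on a single machine. This is only an illustration under stated assumptions: the function names are hypothetical, a thread pool stands in for the mapper tasks, and `square_all` is a placeholder for whatever work a mapper does on its slice, such as classifying a slice of test samples.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(items, parts):
    """Cut items into at most `parts` contiguous slices of near-equal size."""
    size = max(1, -(-len(items) // parts))  # ceiling division, at least 1
    return [items[i:i + size] for i in range(0, len(items), size)]


def divide_and_conquer(items, worker, parts=3):
    """Run `worker` on each slice concurrently and merge the results in input order."""
    slices = partition(items, parts)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        results = pool.map(worker, slices)  # map() yields results in slice order
        return [r for part in results for r in part]


def square_all(chunk):
    """Placeholder worker: stands in for processing one slice of the test set."""
    return [x * x for x in chunk]


print(divide_and_conquer(list(range(7)), square_all, parts=3))
# [0, 1, 4, 9, 16, 25, 36]
```

In CPython, threads do not speed up CPU-bound work, so this shows the structure only; on the cluster each slice is a separate mapper process, possibly on a different server, and that is where the speed-up comes from.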