How to use MapReduce's divide-and-conquer strategy to speed up the KNN algorithm
2016-12-08



Cluster environment:

Hadoop 2.4.1, 64-bit

6 servers:

hadoop11  NameNode, SecondaryNameNode
hadoop22  ResourceManager
hadoop33  DataNode, NodeManager
hadoop44  DataNode, NodeManager
hadoop55  DataNode, NodeManager
hadoop66  DataNode, NodeManager





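Experiment 1: the training set train.txt contains 245057 samples (3.24 MB) and the test set test.txt contains 51444 samples (640 KB). The entire test set is stored in test.txt.

[root@hadoop11 local]# hadoop fs -lsr /dir6/
-rw-r--r--   3 root supergroup    3400816 2016-07-17 19:28 /dir6/test.txt

Note: at this point the whole test set is stored in a single text file (test.txt), which is used as the input path.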
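Why one input file yields one mapper here can be sketched with the split rule that Hadoop's FileInputFormat applies to each input file separately. This is a simplified sketch, not the Hadoop source: it assumes a 128 MB block size (the Hadoop 2.x default, which this cluster may have changed), default minimum and maximum split sizes, and non-empty files. The function names are illustrative.

```python
SPLIT_SLOP = 1.1  # FileInputFormat lets the last split be up to 10% larger


def splits_for_file(length, block_size=128 * 1024 * 1024,
                    min_size=1, max_size=2**63 - 1):
    """Number of input splits produced for one non-empty file."""
    split_size = max(min_size, min(max_size, block_size))
    count = 0
    remaining = length
    # Carve off full-size splits while clearly more than one split remains.
    while remaining / split_size > SPLIT_SLOP:
        count += 1
        remaining -= split_size
    if remaining > 0:
        count += 1  # the tail (or the whole small file) is one more split
    return count


def total_splits(file_lengths, **kwargs):
    """Files are split independently, so small files give one split each."""
    return sum(splits_for_file(n, **kwargs) for n in file_lengths)


# Test-file sizes in bytes, taken from the listings and counters in this post.
print(total_splits([655419]))                  # one test file
print(total_splits([368774, 312210]))          # two test files
print(total_splits([128161, 366313, 201566]))  # three test files
```

Because every test file in this post is far smaller than one block, the number of splits, and therefore of mapper tasks, equals the number of input files.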



KNN algorithm run log:



16/07/17 19:32:24 INFO client.RMProxy: Connecting to ResourceManager at hadoop22/10.187.84.51:8032
16/07/17 19:32:25 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
16/07/17 19:32:25 INFO input.FileInputFormat: Total input paths to process : 1
16/07/17 19:32:25 INFO mapreduce.JobSubmitter: number of splits:1
16/07/17 19:32:26 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1468752229715_0016
16/07/17 19:32:26 INFO impl.YarnClientImpl: Submitted application application_1468752229715_0016
16/07/17 19:32:26 INFO mapreduce.Job: The url to track the job: http://hadoop22:8088/proxy/application_1468752229715_0016/
16/07/17 19:32:26 INFO mapreduce.Job: Running job: job_1468752229715_0016
16/07/17 19:32:32 INFO mapreduce.Job: Job job_1468752229715_0016 running in uber mode : false
16/07/17 19:32:32 INFO mapreduce.Job:  map 0% reduce 0%
16/07/17 19:32:49 INFO mapreduce.Job:  map 1% reduce 0%
16/07/17 19:33:05 INFO mapreduce.Job:  map 2% reduce 0%
16/07/17 19:33:20 INFO mapreduce.Job:  map 3% reduce 0%
16/07/17 19:33:35 INFO mapreduce.Job:  map 4% reduce 0%
16/07/17 19:33:50 INFO mapreduce.Job:  map 5% reduce 0%
16/07/17 19:34:02 INFO mapreduce.Job:  map 6% reduce 0%
16/07/17 19:34:17 INFO mapreduce.Job:  map 7% reduce 0%
16/07/17 19:34:32 INFO mapreduce.Job:  map 8% reduce 0%
16/07/17 19:34:47 INFO mapreduce.Job:  map 9% reduce 0%
16/07/17 19:35:02 INFO mapreduce.Job:  map 10% reduce 0%
16/07/17 19:35:14 INFO mapreduce.Job:  map 11% reduce 0%
16/07/17 19:35:29 INFO mapreduce.Job:  map 12% reduce 0%
16/07/17 19:35:44 INFO mapreduce.Job:  map 13% reduce 0%
16/07/17 19:35:59 INFO mapreduce.Job:  map 14% reduce 0%
16/07/17 19:36:12 INFO mapreduce.Job:  map 15% reduce 0%
16/07/17 19:36:27 INFO mapreduce.Job:  map 16% reduce 0%
16/07/17 19:36:42 INFO mapreduce.Job:  map 17% reduce 0%
16/07/17 19:36:57 INFO mapreduce.Job:  map 18% reduce 0%
16/07/17 19:37:12 INFO mapreduce.Job:  map 19% reduce 0%
16/07/17 19:37:27 INFO mapreduce.Job:  map 20% reduce 0%
16/07/17 19:37:39 INFO mapreduce.Job:  map 21% reduce 0%
16/07/17 19:37:54 INFO mapreduce.Job:  map 22% reduce 0%
16/07/17 19:38:09 INFO mapreduce.Job:  map 23% reduce 0%
16/07/17 19:38:24 INFO mapreduce.Job:  map 24% reduce 0%
16/07/17 19:38:39 INFO mapreduce.Job:  map 25% reduce 0%
16/07/17 19:38:51 INFO mapreduce.Job:  map 26% reduce 0%
16/07/17 19:39:06 INFO mapreduce.Job:  map 27% reduce 0%
16/07/17 19:39:22 INFO mapreduce.Job:  map 28% reduce 0%
16/07/17 19:39:37 INFO mapreduce.Job:  map 29% reduce 0%
16/07/17 19:39:52 INFO mapreduce.Job:  map 30% reduce 0%
16/07/17 19:40:07 INFO mapreduce.Job:  map 31% reduce 0%
16/07/17 19:40:22 INFO mapreduce.Job:  map 32% reduce 0%
16/07/17 19:40:37 INFO mapreduce.Job:  map 33% reduce 0%
16/07/17 19:40:52 INFO mapreduce.Job:  map 34% reduce 0%
16/07/17 19:41:04 INFO mapreduce.Job:  map 35% reduce 0%
16/07/17 19:41:22 INFO mapreduce.Job:  map 36% reduce 0%
16/07/17 19:41:37 INFO mapreduce.Job:  map 37% reduce 0%
16/07/17 19:41:52 INFO mapreduce.Job:  map 38% reduce 0%
16/07/17 19:42:07 INFO mapreduce.Job:  map 39% reduce 0%
16/07/17 19:42:22 INFO mapreduce.Job:  map 40% reduce 0%
16/07/17 19:42:37 INFO mapreduce.Job:  map 41% reduce 0%
16/07/17 19:42:53 INFO mapreduce.Job:  map 42% reduce 0%
16/07/17 19:43:08 INFO mapreduce.Job:  map 43% reduce 0%
16/07/17 19:43:23 INFO mapreduce.Job:  map 44% reduce 0%
16/07/17 19:43:41 INFO mapreduce.Job:  map 45% reduce 0%
16/07/17 19:43:56 INFO mapreduce.Job:  map 46% reduce 0%
16/07/17 19:44:12 INFO mapreduce.Job:  map 47% reduce 0%
16/07/17 19:44:30 INFO mapreduce.Job:  map 48% reduce 0%
16/07/17 19:44:45 INFO mapreduce.Job:  map 49% reduce 0%
16/07/17 19:45:00 INFO mapreduce.Job:  map 50% reduce 0%
16/07/17 19:45:15 INFO mapreduce.Job:  map 51% reduce 0%
16/07/17 19:45:30 INFO mapreduce.Job:  map 52% reduce 0%
16/07/17 19:45:48 INFO mapreduce.Job:  map 53% reduce 0%
16/07/17 19:46:03 INFO mapreduce.Job:  map 54% reduce 0%
16/07/17 19:46:18 INFO mapreduce.Job:  map 55% reduce 0%
16/07/17 19:46:33 INFO mapreduce.Job:  map 56% reduce 0%
16/07/17 19:46:49 INFO mapreduce.Job:  map 57% reduce 0%
16/07/17 19:47:07 INFO mapreduce.Job:  map 58% reduce 0%
16/07/17 19:47:22 INFO mapreduce.Job:  map 59% reduce 0%
16/07/17 19:47:37 INFO mapreduce.Job:  map 60% reduce 0%
16/07/17 19:47:55 INFO mapreduce.Job:  map 61% reduce 0%
16/07/17 19:48:10 INFO mapreduce.Job:  map 62% reduce 0%
16/07/17 19:48:25 INFO mapreduce.Job:  map 63% reduce 0%
16/07/17 19:48:43 INFO mapreduce.Job:  map 64% reduce 0%
16/07/17 19:48:58 INFO mapreduce.Job:  map 65% reduce 0%
16/07/17 19:49:13 INFO mapreduce.Job:  map 66% reduce 0%
16/07/17 19:49:28 INFO mapreduce.Job:  map 67% reduce 0%
16/07/17 19:49:30 INFO mapreduce.Job:  map 100% reduce 0%
16/07/17 19:49:37 INFO mapreduce.Job:  map 100% reduce 100%

16/07/17 19:49:38 INFO mapreduce.Job: Job job_1468752229715_0016 completed successfully
16/07/17 19:49:39 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=2892255
        FILE: Number of bytes written=5971253
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=4056338
        HDFS: Number of bytes written=861195
        HDFS: Number of read operations=7
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=1016177
        Total time spent by all reduces in occupied slots (ms)=4948
        Total time spent by all map tasks (ms)=1016177
        Total time spent by all reduce tasks (ms)=4948
        Total vcore-seconds taken by all map tasks=1016177
        Total vcore-seconds taken by all reduce tasks=4948
        Total megabyte-seconds taken by all map tasks=1040565248
        Total megabyte-seconds taken by all reduce tasks=5066752
    Map-Reduce Framework
        Map input records=51444
        Map output records=154332
        Map output bytes=2583585
        Map output materialized bytes=2892255
        Input split bytes=103
        Combine input records=0
        Combine output records=0
        Reduce input groups=51444
        Reduce shuffle bytes=2892255
        Reduce input records=154332
        Reduce output records=51444
        Spilled Records=308664
        Shuffled Maps=1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=5836
        CPU time spent (ms)=1033510
        Physical memory (bytes) snapshot=517627904
        Virtual memory (bytes) snapshot=1786634240
        Total committed heap usage (bytes)=306774016
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=655419
    File Output Format Counters
        Bytes Written=861195



Statistics:

Accuracy: 51444 51367
CPU time spent (ms)=1033510
map tasks=1
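The accuracy line gives two raw counts rather than a ratio. Assuming they are the number of test samples (51444) and the number classified correctly (51367), which the post does not state explicitly, the accuracy works out as in this short sketch:

```python
# Assumption: first number = test samples, second = correctly classified ones.
total_samples = 51444
correct = 51367

accuracy = correct / total_samples
misclassified = total_samples - correct

print(f"accuracy = {accuracy:.4%}, misclassified = {misclassified}")
```

The same two counts are reported in all three experiments, so splitting the test set does not change the classification result.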



Experiment 2: the training set train.txt is unchanged at 245057 samples and the test set still contains 51444 samples, but the test set is now stored in test1.txt (25568) and test2.txt (25857).

[root@hadoop11 local]# hadoop fs -lsr /dir6/
-rw-r--r--   3 root supergroup     368774 2016-07-17 20:15 /dir6/test1.txt
-rw-r--r--   3 root supergroup     312210 2016-07-17 20:15 /dir6/test2.txt

KNN algorithm run log:

First, the process listing:

[root@hadoop66 ~]# jps
24659 YarnChild (mapper task)
22777 DataNode
25592 Jps
24660 YarnChild (mapper task)
24557 MRAppMaster
22622 NodeManager

Counter log:

[root@hadoop11 local]# app1.sh

16/07/17 20:21:03 INFO client.RMProxy: Connecting to ResourceManager at hadoop22/10.187.84.51:8032
16/07/17 20:21:03 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
16/07/17 20:21:03 INFO input.FileInputFormat: Total input paths to process : 2
16/07/17 20:21:03 INFO mapreduce.JobSubmitter: number of splits:2
16/07/17 20:21:03 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1468752229715_0019
16/07/17 20:21:04 INFO impl.YarnClientImpl: Submitted application application_1468752229715_0019
16/07/17 20:21:04 INFO mapreduce.Job: The url to track the job: http://hadoop22:8088/proxy/application_1468752229715_0019/
16/07/17 20:21:04 INFO mapreduce.Job: Running job: job_1468752229715_0019
16/07/17 20:21:10 INFO mapreduce.Job: Job job_1468752229715_0019 running in uber mode : false
16/07/17 20:21:10 INFO mapreduce.Job:  map 0% reduce 0%
16/07/17 20:21:21 INFO mapreduce.Job:  map 1% reduce 0%
16/07/17 20:21:30 INFO mapreduce.Job:  map 2% reduce 0%
16/07/17 20:21:40 INFO mapreduce.Job:  map 3% reduce 0%
16/07/17 20:21:46 INFO mapreduce.Job:  map 4% reduce 0%
16/07/17 20:21:55 INFO mapreduce.Job:  map 5% reduce 0%
16/07/17 20:22:01 INFO mapreduce.Job:  map 6% reduce 0%
16/07/17 20:22:10 INFO mapreduce.Job:  map 7% reduce 0%
16/07/17 20:22:17 INFO mapreduce.Job:  map 8% reduce 0%
16/07/17 20:22:26 INFO mapreduce.Job:  map 9% reduce 0%
16/07/17 20:22:35 INFO mapreduce.Job:  map 10% reduce 0%
16/07/17 20:22:41 INFO mapreduce.Job:  map 11% reduce 0%
16/07/17 20:22:47 INFO mapreduce.Job:  map 12% reduce 0%
16/07/17 20:22:56 INFO mapreduce.Job:  map 13% reduce 0%
16/07/17 20:23:05 INFO mapreduce.Job:  map 14% reduce 0%
16/07/17 20:23:11 INFO mapreduce.Job:  map 15% reduce 0%
16/07/17 20:23:17 INFO mapreduce.Job:  map 16% reduce 0%
16/07/17 20:23:26 INFO mapreduce.Job:  map 17% reduce 0%
16/07/17 20:23:35 INFO mapreduce.Job:  map 18% reduce 0%
16/07/17 20:23:41 INFO mapreduce.Job:  map 19% reduce 0%
16/07/17 20:23:50 INFO mapreduce.Job:  map 20% reduce 0%
16/07/17 20:23:56 INFO mapreduce.Job:  map 21% reduce 0%
16/07/17 20:24:05 INFO mapreduce.Job:  map 22% reduce 0%
16/07/17 20:24:11 INFO mapreduce.Job:  map 23% reduce 0%
16/07/17 20:24:20 INFO mapreduce.Job:  map 24% reduce 0%
16/07/17 20:24:26 INFO mapreduce.Job:  map 25% reduce 0%
16/07/17 20:24:35 INFO mapreduce.Job:  map 26% reduce 0%
16/07/17 20:24:42 INFO mapreduce.Job:  map 27% reduce 0%
16/07/17 20:24:51 INFO mapreduce.Job:  map 28% reduce 0%
16/07/17 20:24:57 INFO mapreduce.Job:  map 29% reduce 0%
16/07/17 20:25:06 INFO mapreduce.Job:  map 30% reduce 0%
16/07/17 20:25:12 INFO mapreduce.Job:  map 31% reduce 0%
16/07/17 20:25:21 INFO mapreduce.Job:  map 32% reduce 0%
16/07/17 20:25:27 INFO mapreduce.Job:  map 33% reduce 0%
16/07/17 20:25:36 INFO mapreduce.Job:  map 34% reduce 0%
16/07/17 20:25:42 INFO mapreduce.Job:  map 35% reduce 0%
16/07/17 20:25:51 INFO mapreduce.Job:  map 36% reduce 0%
16/07/17 20:25:57 INFO mapreduce.Job:  map 37% reduce 0%
16/07/17 20:26:06 INFO mapreduce.Job:  map 38% reduce 0%
16/07/17 20:26:12 INFO mapreduce.Job:  map 39% reduce 0%
16/07/17 20:26:21 INFO mapreduce.Job:  map 40% reduce 0%
16/07/17 20:26:30 INFO mapreduce.Job:  map 41% reduce 0%
16/07/17 20:26:36 INFO mapreduce.Job:  map 42% reduce 0%
16/07/17 20:26:45 INFO mapreduce.Job:  map 43% reduce 0%
16/07/17 20:26:51 INFO mapreduce.Job:  map 44% reduce 0%
16/07/17 20:27:00 INFO mapreduce.Job:  map 45% reduce 0%
16/07/17 20:27:06 INFO mapreduce.Job:  map 46% reduce 0%
16/07/17 20:27:15 INFO mapreduce.Job:  map 47% reduce 0%
16/07/17 20:27:21 INFO mapreduce.Job:  map 48% reduce 0%
16/07/17 20:27:30 INFO mapreduce.Job:  map 49% reduce 0%
16/07/17 20:27:36 INFO mapreduce.Job:  map 50% reduce 0%
16/07/17 20:27:45 INFO mapreduce.Job:  map 51% reduce 0%
16/07/17 20:27:51 INFO mapreduce.Job:  map 52% reduce 0%
16/07/17 20:28:01 INFO mapreduce.Job:  map 53% reduce 0%
16/07/17 20:28:07 INFO mapreduce.Job:  map 54% reduce 0%
16/07/17 20:28:16 INFO mapreduce.Job:  map 55% reduce 0%
16/07/17 20:28:23 INFO mapreduce.Job:  map 56% reduce 0%
16/07/17 20:28:31 INFO mapreduce.Job:  map 57% reduce 0%
16/07/17 20:28:38 INFO mapreduce.Job:  map 58% reduce 0%
16/07/17 20:28:46 INFO mapreduce.Job:  map 59% reduce 0%
16/07/17 20:28:53 INFO mapreduce.Job:  map 60% reduce 0%
16/07/17 20:29:02 INFO mapreduce.Job:  map 61% reduce 0%
16/07/17 20:29:10 INFO mapreduce.Job:  map 62% reduce 0%
16/07/17 20:29:17 INFO mapreduce.Job:  map 63% reduce 0%
16/07/17 20:29:26 INFO mapreduce.Job:  map 64% reduce 0%
16/07/17 20:29:32 INFO mapreduce.Job:  map 65% reduce 0%
16/07/17 20:29:41 INFO mapreduce.Job:  map 66% reduce 0%
16/07/17 20:29:42 INFO mapreduce.Job:  map 83% reduce 0%
16/07/17 20:29:52 INFO mapreduce.Job:  map 83% reduce 17%
16/07/17 20:29:54 INFO mapreduce.Job:  map 100% reduce 17%
16/07/17 20:29:55 INFO mapreduce.Job:  map 100% reduce 70%
16/07/17 20:29:56 INFO mapreduce.Job:  map 100% reduce 100%

16/07/17 20:29:56 INFO mapreduce.Job: Job job_1468752229715_0019 completed successfully
16/07/17 20:29:56 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=2892255
        FILE: Number of bytes written=6064619
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=7482816
        HDFS: Number of bytes written=861195
        HDFS: Number of read operations=11
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=2
        Launched reduce tasks=1
        Data-local map tasks=2
        Total time spent by all maps in occupied slots (ms)=1032086
        Total time spent by all reduces in occupied slots (ms)=11757
        Total time spent by all map tasks (ms)=1032086
        Total time spent by all reduce tasks (ms)=11757
        Total vcore-seconds taken by all map tasks=1032086
        Total vcore-seconds taken by all reduce tasks=11757
        Total megabyte-seconds taken by all map tasks=1056856064
        Total megabyte-seconds taken by all reduce tasks=12039168
    Map-Reduce Framework
        Map input records=51444
        Map output records=154332
        Map output bytes=2583585
        Map output materialized bytes=2892261
        Input split bytes=200
        Combine input records=0
        Combine output records=0
        Reduce input groups=51444
        Reduce shuffle bytes=2892261
        Reduce input records=154332
        Reduce output records=51444
        Spilled Records=308664
        Shuffled Maps=2
        Failed Shuffles=0
        Merged Map outputs=2
        GC time elapsed (ms)=8264
        CPU time spent (ms)=1045670
        Physical memory (bytes) snapshot=762257408
        Virtual memory (bytes) snapshot=2654359552
        Total committed heap usage (bytes)=496762880
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=680984
    File Output Format Counters
        Bytes Written=861195

16/07/17 20:29:58 INFO client.RMProxy: Connecting to ResourceManager at hadoop22/10.187.84.51:8032
16/07/17 20:29:59 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
16/07/17 20:29:59 INFO input.FileInputFormat: Total input paths to process : 1
16/07/17 20:29:59 INFO mapreduce.JobSubmitter: number of splits:1
16/07/17 20:29:59 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1468752229715_0020
16/07/17 20:29:59 INFO impl.YarnClientImpl: Submitted application application_1468752229715_0020
16/07/17 20:30:00 INFO mapreduce.Job: The url to track the job: http://hadoop22:8088/proxy/application_1468752229715_0020/
16/07/17 20:30:00 INFO mapreduce.Job: Running job: job_1468752229715_0020
16/07/17 20:30:05 INFO mapreduce.Job: Job job_1468752229715_0020 running in uber mode : false
16/07/17 20:30:05 INFO mapreduce.Job:  map 0% reduce 0%
16/07/17 20:30:12 INFO mapreduce.Job:  map 100% reduce 0%
16/07/17 20:30:18 INFO mapreduce.Job:  map 100% reduce 100%
16/07/17 20:30:18 INFO mapreduce.Job: Job job_1468752229715_0020 completed successfully
16/07/17 20:30:18 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=24
        FILE: Number of bytes written=186173
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=861298
        HDFS: Number of bytes written=12
        HDFS: Number of read operations=6
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=3973
        Total time spent by all reduces in occupied slots (ms)=3243
        Total time spent by all map tasks (ms)=3973
        Total time spent by all reduce tasks (ms)=3243
        Total vcore-seconds taken by all map tasks=3973
        Total vcore-seconds taken by all reduce tasks=3243
        Total megabyte-seconds taken by all map tasks=4068352
        Total megabyte-seconds taken by all reduce tasks=3320832
    Map-Reduce Framework
        Map input records=51444
        Map output records=1
        Map output bytes=16
        Map output materialized bytes=24
        Input split bytes=103
        Combine input records=0
        Combine output records=0
        Reduce input groups=1
        Reduce shuffle bytes=24
        Reduce input records=1
        Reduce output records=1
        Spilled Records=2
        Shuffled Maps=1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=70
        CPU time spent (ms)=2340
        Physical memory (bytes) snapshot=451612672
        Virtual memory (bytes) snapshot=1790021632
        Total committed heap usage (bytes)=309002240
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=861195
    File Output Format Counters
        Bytes Written=12





Statistics:

Accuracy: 51444 51367
CPU time spent (ms)=1045670 (the time is this long because creating the mapper tasks took time, and because both mapper tasks ran on the same server, hadoop66)
map tasks=2
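One cost of adding mappers shows up in the "HDFS: Number of bytes read" counter. The figures in the logs are consistent with every mapper reading the whole training set from HDFS in addition to its share of the test data. The check below is a sketch under one assumption: that the 3400816-byte size shown in the experiment 1 listing is the training file (it matches the stated 3.24 MB). The post does not show the job code, so this is an inference from the counters only.

```python
TRAIN_BYTES = 3_400_816  # assumption: size of train.txt, from the first listing

# Values copied from the counters of the KNN job in experiments 1 and 2.
RUNS = {
    "experiment 1": {"mappers": 1, "test_bytes": 655_419,
                     "split_bytes": 103, "hdfs_read": 4_056_338},
    "experiment 2": {"mappers": 2, "test_bytes": 680_984,
                     "split_bytes": 200, "hdfs_read": 7_482_816},
}


def predicted_hdfs_read(run):
    """Each mapper reads the full training set, plus the test data and split metadata."""
    return run["mappers"] * TRAIN_BYTES + run["test_bytes"] + run["split_bytes"]


for name, run in RUNS.items():
    print(name, predicted_hdfs_read(run), run["hdfs_read"])
```

With a 3.24 MB training set the extra read is negligible, but it grows linearly with the number of mappers.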



Experiment 3: the training set train.txt is unchanged at 245057 samples and the test set still contains 51444 samples, but the test set is now stored in test1.txt (25402), test2.txt (15224) and test3.txt (10818).
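The three files here hold quite different numbers of samples, and a job finishes only when its slowest mapper does. The post does not say how the files were produced; as a hypothetical alternative, the sketch below divides a list of records into parts whose sizes differ by at most one (`split_evenly` is an illustrative name, not something used in the post).

```python
def split_evenly(records, parts):
    """Split records into `parts` consecutive chunks of near-equal size."""
    base, extra = divmod(len(records), parts)
    chunks = []
    start = 0
    for i in range(parts):
        # The first `extra` chunks take one additional record each.
        size = base + (1 if i < extra else 0)
        chunks.append(records[start:start + size])
        start += size
    return chunks


# Stand-in for the 51444 test lines; real use would read them from test.txt.
records = list(range(51444))
print([len(chunk) for chunk in split_evenly(records, 3)])
```

Each chunk would then be written to its own file in the input directory, so that each mapper receives a similar amount of work.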



[root@hadoop11 local]# hadoop fs -lsr /dir6/
lsr: DEPRECATED: Please use 'ls -R' instead.
-rw-r--r--   3 root supergroup     128161 2016-07-17 20:54 /dir6/test1.txt
-rw-r--r--   3 root supergroup     366313 2016-07-17 20:54 /dir6/test2.txt
-rw-r--r--   3 root supergroup     201566 2016-07-17 20:54 /dir6/test3.txt


First, the process listing:

[root@hadoop33 ~]# jps
26501 Jps
26279 YarnChild (mapper task)
2399 QuorumPeerMain
26280 YarnChild (mapper task)
23800 DataNode
23648 NodeManager
26133 MRAppMaster

[root@hadoop66 ~]# jps
22777 DataNode
26652 Jps
26302 YarnChild (mapper task)
22622 NodeManager

As the listings show, the mapper tasks are now executed on two servers (hadoop33 and hadoop66): divide and conquer!

Detailed run log:

[root@hadoop11 local]# app1.sh

16/07/17 20:55:17 INFO client.RMProxy: Connecting to ResourceManager at hadoop22/10.187.84.51:8032
16/07/17 20:55:18 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
16/07/17 20:55:18 INFO input.FileInputFormat: Total input paths to process : 3
16/07/17 20:55:18 INFO mapreduce.JobSubmitter: number of splits:3
16/07/17 20:55:18 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1468752229715_0021
16/07/17 20:55:19 INFO impl.YarnClientImpl: Submitted application application_1468752229715_0021
16/07/17 20:55:19 INFO mapreduce.Job: The url to track the job: http://hadoop22:8088/proxy/application_1468752229715_0021/
16/07/17 20:55:19 INFO mapreduce.Job: Running job: job_1468752229715_0021
16/07/17 20:55:25 INFO mapreduce.Job: Job job_1468752229715_0021 running in uber mode : false
16/07/17 20:55:25 INFO mapreduce.Job:  map 0% reduce 0%
16/07/17 20:55:37 INFO mapreduce.Job:  map 1% reduce 0%
16/07/17 20:55:40 INFO mapreduce.Job:  map 2% reduce 0%
16/07/17 20:55:45 INFO mapreduce.Job:  map 3% reduce 0%
16/07/17 20:55:49 INFO mapreduce.Job:  map 4% reduce 0%
16/07/17 20:55:54 INFO mapreduce.Job:  map 5% reduce 0%
16/07/17 20:55:58 INFO mapreduce.Job:  map 6% reduce 0%
16/07/17 20:56:03 INFO mapreduce.Job:  map 7% reduce 0%
16/07/17 20:56:07 INFO mapreduce.Job:  map 8% reduce 0%
16/07/17 20:56:12 INFO mapreduce.Job:  map 9% reduce 0%
16/07/17 20:56:16 INFO mapreduce.Job:  map 10% reduce 0%
16/07/17 20:56:20 INFO mapreduce.Job:  map 11% reduce 0%
16/07/17 20:56:24 INFO mapreduce.Job:  map 12% reduce 0%
16/07/17 20:56:29 INFO mapreduce.Job:  map 13% reduce 0%
16/07/17 20:56:33 INFO mapreduce.Job:  map 14% reduce 0%
16/07/17 20:56:37 INFO mapreduce.Job:  map 15% reduce 0%
16/07/17 20:56:42 INFO mapreduce.Job:  map 16% reduce 0%
16/07/17 20:56:47 INFO mapreduce.Job:  map 17% reduce 0%
16/07/17 20:56:51 INFO mapreduce.Job:  map 18% reduce 0%
16/07/17 20:56:56 INFO mapreduce.Job:  map 19% reduce 0%
16/07/17 20:57:00 INFO mapreduce.Job:  map 20% reduce 0%
16/07/17 20:57:05 INFO mapreduce.Job:  map 21% reduce 0%
16/07/17 20:57:08 INFO mapreduce.Job:  map 22% reduce 0%
16/07/17 20:57:13 INFO mapreduce.Job:  map 23% reduce 0%
16/07/17 20:57:18 INFO mapreduce.Job:  map 24% reduce 0%
16/07/17 20:57:23 INFO mapreduce.Job:  map 25% reduce 0%
16/07/17 20:57:27 INFO mapreduce.Job:  map 26% reduce 0%
16/07/17 20:57:32 INFO mapreduce.Job:  map 27% reduce 0%
16/07/17 20:57:36 INFO mapreduce.Job:  map 28% reduce 0%
16/07/17 20:57:41 INFO mapreduce.Job:  map 29% reduce 0%
16/07/17 20:57:45 INFO mapreduce.Job:  map 30% reduce 0%
16/07/17 20:57:50 INFO mapreduce.Job:  map 31% reduce 0%
16/07/17 20:57:54 INFO mapreduce.Job:  map 32% reduce 0%
16/07/17 20:57:59 INFO mapreduce.Job:  map 33% reduce 0%
16/07/17 20:58:03 INFO mapreduce.Job:  map 34% reduce 0%
16/07/17 20:58:08 INFO mapreduce.Job:  map 35% reduce 0%
16/07/17 20:58:12 INFO mapreduce.Job:  map 36% reduce 0%
16/07/17 20:58:15 INFO mapreduce.Job:  map 37% reduce 0%
16/07/17 20:58:20 INFO mapreduce.Job:  map 38% reduce 0%
16/07/17 20:58:24 INFO mapreduce.Job:  map 39% reduce 0%
16/07/17 20:58:29 INFO mapreduce.Job:  map 40% reduce 0%
16/07/17 20:58:33 INFO mapreduce.Job:  map 41% reduce 0%
16/07/17 20:58:38 INFO mapreduce.Job:  map 42% reduce 0%
16/07/17 20:58:42 INFO mapreduce.Job:  map 43% reduce 0%
16/07/17 20:58:47 INFO mapreduce.Job:  map 44% reduce 0%
16/07/17 20:58:51 INFO mapreduce.Job:  map 45% reduce 0%
16/07/17 20:58:56 INFO mapreduce.Job:  map 46% reduce 0%
16/07/17 20:59:00 INFO mapreduce.Job:  map 58% reduce 0%
16/07/17 20:59:06 INFO mapreduce.Job:  map 59% reduce 0%
16/07/17 20:59:11 INFO mapreduce.Job:  map 59% reduce 11%
16/07/17 20:59:15 INFO mapreduce.Job:  map 60% reduce 11%
16/07/17 20:59:21 INFO mapreduce.Job:  map 61% reduce 11%
16/07/17 20:59:30 INFO mapreduce.Job:  map 62% reduce 11%
16/07/17 20:59:39 INFO mapreduce.Job:  map 63% reduce 11%
16/07/17 20:59:48 INFO mapreduce.Job:  map 64% reduce 11%
16/07/17 20:59:58 INFO mapreduce.Job:  map 65% reduce 11%
16/07/17 21:00:04 INFO mapreduce.Job:  map 66% reduce 11%
16/07/17 21:00:13 INFO mapreduce.Job:  map 67% reduce 11%
16/07/17 21:00:23 INFO mapreduce.Job:  map 68% reduce 11%
16/07/17 21:00:26 INFO mapreduce.Job:  map 79% reduce 11%
16/07/17 21:00:27 INFO mapreduce.Job:  map 79% reduce 22%
16/07/17 21:00:35 INFO mapreduce.Job:  map 80% reduce 22%
16/07/17 21:00:59 INFO mapreduce.Job:  map 81% reduce 22%
16/07/17 21:01:20 INFO mapreduce.Job:  map 82% reduce 22%
16/07/17 21:01:44 INFO mapreduce.Job:  map 83% reduce 22%
16/07/17 21:02:08 INFO mapreduce.Job:  map 84% reduce 22%
16/07/17 21:02:32 INFO mapreduce.Job:  map 85% reduce 22%
16/07/17 21:02:56 INFO mapreduce.Job:  map 86% reduce 22%
16/07/17 21:03:17 INFO mapreduce.Job:  map 87% reduce 22%
16/07/17 21:03:41 INFO mapreduce.Job:  map 88% reduce 22%
16/07/17 21:04:06 INFO mapreduce.Job:  map 89% reduce 22%
16/07/17 21:04:15 INFO mapreduce.Job:  map 100% reduce 22%
16/07/17 21:04:16 INFO mapreduce.Job:  map 100% reduce 90%
16/07/17 21:04:17 INFO mapreduce.Job:  map 100% reduce 100%

16/07/17 21:04:17 INFO mapreduce.Job: Job job_1468752229715_0021 completed successfully
16/07/17 21:04:17 INFO mapreduce.Job: Counters: 50
    File System Counters
        FILE: Number of bytes read=2892255
        FILE: Number of bytes written=6158011
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=10898788
        HDFS: Number of bytes written=861195
        HDFS: Number of read operations=15
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Killed map tasks=2
        Launched map tasks=5
        Launched reduce tasks=1
        Data-local map tasks=5
        Total time spent by all maps in occupied slots (ms)=1417294
        Total time spent by all reduces in occupied slots (ms)=313657
        Total time spent by all map tasks (ms)=1417294
        Total time spent by all reduce tasks (ms)=313657
        Total vcore-seconds taken by all map tasks=1417294
        Total vcore-seconds taken by all reduce tasks=313657
        Total megabyte-seconds taken by all map tasks=1451309056
        Total megabyte-seconds taken by all reduce tasks=321184768
    Map-Reduce Framework
        Map input records=51444
        Map output records=154332
        Map output bytes=2583585
        Map output materialized bytes=2892267
        Input split bytes=300
        Combine input records=0
        Combine output records=0
        Reduce input groups=51444
        Reduce shuffle bytes=2892267
        Reduce input records=154332
        Reduce output records=51444
        Spilled Records=308664
        Shuffled Maps=3
        Failed Shuffles=0
        Merged Map outputs=3
        GC time elapsed (ms)=9078
        CPU time spent (ms)=1054730
        Physical memory (bytes) snapshot=1011130368
        Virtual memory (bytes) snapshot=3553914880
        Total committed heap usage (bytes)=575209472
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=696040
    File Output Format Counters
        Bytes Written=861195

16/07/17 21:04:19 INFO client.RMProxy: Connecting to ResourceManager at hadoop22/10.187.84.51:8032
16/07/17 21:04:19 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
16/07/17 21:04:20 INFO input.FileInputFormat: Total input paths to process : 1
16/07/17 21:04:20 INFO mapreduce.JobSubmitter: number of splits:1
16/07/17 21:04:20 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1468752229715_0022
16/07/17 21:04:20 INFO impl.YarnClientImpl: Submitted application application_1468752229715_0022
16/07/17 21:04:20 INFO mapreduce.Job: The url to track the job: http://hadoop22:8088/proxy/application_1468752229715_0022/
16/07/17 21:04:20 INFO mapreduce.Job: Running job: job_1468752229715_0022
16/07/17 21:04:27 INFO mapreduce.Job: Job job_1468752229715_0022 running in uber mode : false
16/07/17 21:04:27 INFO mapreduce.Job:  map 0% reduce 0%
16/07/17 21:04:33 INFO mapreduce.Job:  map 100% reduce 0%
16/07/17 21:04:38 INFO mapreduce.Job:  map 100% reduce 100%
16/07/17 21:04:38 INFO mapreduce.Job: Job job_1468752229715_0022 completed successfully
16/07/17 21:04:38 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=24
        FILE: Number of bytes written=186173
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=861298
        HDFS: Number of bytes written=12
        HDFS: Number of read operations=6
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=3580
        Total time spent by all reduces in occupied slots (ms)=3393
        Total time spent by all map tasks (ms)=3580
        Total time spent by all reduce tasks (ms)=3393
        Total vcore-seconds taken by all map tasks=3580
        Total vcore-seconds taken by all reduce tasks=3393
        Total megabyte-seconds taken by all map tasks=3665920
        Total megabyte-seconds taken by all reduce tasks=3474432
    Map-Reduce Framework
        Map input records=51444
        Map output records=1
        Map output bytes=16
        Map output materialized bytes=24
        Input split bytes=103
        Combine input records=0
        Combine output records=0
        Reduce input groups=1
        Reduce shuffle bytes=24
        Reduce input records=1
        Reduce output records=1
        Spilled Records=2
        Shuffled Maps=1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=89
        CPU time spent (ms)=2360
        Physical memory (bytes) snapshot=435548160
        Virtual memory (bytes) snapshot=1775456256
        Total committed heap usage (bytes)=310444032
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=861195
    File Output Format Counters
        Bytes Written=12



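Statistics:

Accuracy: 51444 51367
CPU time spent (ms)=1054730 (this suggests that divide and conquer is not well suited to very small data sets, which indirectly shows that Hadoop is meant for big data)
map tasks=3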
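The "CPU time spent" counter is summed over every task in a job, so it stays roughly constant however the test set is divided. Elapsed time is a separate measure and can be read from the log timestamps. The sketch below takes the "Running job" and "completed successfully" timestamps of the three KNN jobs exactly as printed in the logs; the helper name `elapsed_seconds` is illustrative.

```python
from datetime import datetime

FMT = "%y/%m/%d %H:%M:%S"  # timestamp layout used in the console logs

# (start, end) = "Running job" and "completed successfully" lines of each KNN job.
JOBS = {
    "experiment 1 (1 file)": ("16/07/17 19:32:26", "16/07/17 19:49:38"),
    "experiment 2 (2 files)": ("16/07/17 20:21:04", "16/07/17 20:29:56"),
    "experiment 3 (3 files)": ("16/07/17 20:55:19", "16/07/17 21:04:17"),
}


def elapsed_seconds(start, end):
    """Wall-clock seconds between two log timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return int(delta.total_seconds())


ELAPSED = {name: elapsed_seconds(*span) for name, span in JOBS.items()}
for name, seconds in ELAPSED.items():
    print(f"{name}: {seconds} s")
```

By this measure the KNN job ran for about 17 minutes with one input file and about 9 minutes with two or three, while total CPU time stayed near 1040 seconds in all three runs.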



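Summary: when processing big data, MapReduce gradually brings the advantages of the cluster into play. Running mapper tasks in parallel raises the speed at which large data sets are processed.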
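The post does not list the KNN job itself, so the following is only a single-process sketch of how such a job could be organised, not the author's code. The counters show three map output records per test sample (154332 = 3 x 51444), which suggests k = 3; that value, the Euclidean distance, and every function name here are assumptions. Each call to `run_split` stands in for one mapper working on its own test file.

```python
from collections import Counter, defaultdict
import heapq


def squared_distance(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_map(test_id, features, train, k=3):
    """Map step: emit (test_id, label) for the k nearest training samples."""
    nearest = heapq.nsmallest(
        k, train, key=lambda sample: squared_distance(features, sample[0]))
    for _, label in nearest:
        yield test_id, label


def knn_reduce(test_id, labels):
    """Reduce step: majority vote over the labels emitted for one test sample."""
    return test_id, Counter(labels).most_common(1)[0][0]


def run_split(test_split, train, k=3):
    """Simulate one mapper plus the reducer for a single test file."""
    grouped = defaultdict(list)
    for test_id, features in test_split:
        for key, label in knn_map(test_id, features, train, k):
            grouped[key].append(label)
    return dict(knn_reduce(key, labels) for key, labels in grouped.items())


# Tiny made-up data set: two clusters labelled "a" and "b".
train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
         ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]
split_1 = [(0, (0.2, 0.1))]
split_2 = [(1, (5.5, 5.5))]

predictions = {**run_split(split_1, train), **run_split(split_2, train)}
print(predictions)
```

Because each test sample is classified independently of the others, the splits can be processed on different servers at the same time, which is what makes the divide-and-conquer approach applicable to KNN.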
