Hi
We are trying to use the YARN framework with Lily:
http://www.lilyproject.org/lily/index.html
Lily uses a MapReduce job for batch indexing of records.
The map task reads each record from HBase and sends it to Solr for indexing.
A notable feature of this implementation is that it does not use a reduce step:
http://docs.ngdata.com/lily-docs-current/415-lily/441-lily.html#ThebatchbuildMapReducejob
The batch build MapReduce job
The batch build MR job is a map task for the MapReduce programming
model that takes as input the records stored in Lily, and calls the
indexer engine for each of these records.
It is based on Lily's MR integration
<http://docs.ngdata.com/lily-docs-current/g4/570-lily.html> support,
which means that the input will be split into as many parts as there
are HBase regions in the records table.
There is no reduce part to this job, and neither does the map task
produce any output key-values. It simply calls Solr directly. This
approach is used since it allows to run the batch build concurrently
with an ongoing incremental update of the index.
Since the map task spends time waiting on IO (as it reads records from
HBase and sends to Solr), it uses multiple threads to perform the
indexing.
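As we understand it, the multi-threaded indexing described above works roughly like the sketch below. This is not Lily's actual code: the class name `ThreadedIndexingSketch`, the method `indexAll`, and the `indexer` callback (standing in for the call to Solr) are hypothetical names chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical sketch; not taken from the Lily source.
class ThreadedIndexingSketch {

    // Hands every record to a fixed pool of worker threads and blocks
    // until all of them have been processed. Returns the number of
    // records that were indexed.
    static int indexAll(List<String> recordIds, int threadCount, Consumer<String> indexer)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (String id : recordIds) {
                // The callback stands in for the IO-bound "send to Solr" call.
                pending.add(pool.submit(() -> indexer.accept(id)));
            }
            int done = 0;
            for (Future<?> f : pending) {
                f.get(); // a failure in the callback surfaces as ExecutionException
                done++;
            }
            return done;
        } finally {
            pool.shutdown();
        }
    }
}
```

The point of the pool is that while one thread waits on HBase or Solr, the others keep working, so a single map task is not limited by the latency of each individual request.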
With the original MapReduce (MR1) everything works correctly, but when we
try to use YARN we get the following messages in the NodeManager log:
yarn-hadoop-nodemanager-node4.nevod.ru.log:
2013-04-11 11:04:07,325 ERROR org.apache.hadoop.mapred.ShuffleHandler: Shuffle error:
java.lang.IllegalArgumentException: empty text
    at org.jboss.netty.handler.codec.http.HttpVersion.<init>(HttpVersion.java:103)
    at org.jboss.netty.handler.codec.http.HttpVersion.valueOf(HttpVersion.java:68)
    at org.jboss.netty.handler.codec.http.HttpRequestDecoder.createMessage(HttpRequestDecoder.java:81)
    at org.jboss.netty.handler.codec.http.HttpMessageDecoder.decode(HttpMessageDecoder.java:198)
    at org.jboss.netty.handler.codec.http.HttpMessageDecoder.decode(HttpMessageDecoder.java:107)
    at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:470)
    at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:443)
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:274)
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:261)
    at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:349)
    at org.jboss.netty.channel.socket.nio.NioWorker.processSelectedKeys(NioWorker.java:280)
    at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:200)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)
2013-04-11 11:04:07,325 ERROR org.apache.hadoop.mapred.ShuffleHandler: Shuffle error [id: 0x3bd2d042, /195.222.150.35:50577 => /195.222.150.35:8080] EXCEPTION: java.lang.IllegalArgumentException: empty text
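For reference, the top frames of the trace are in Netty's HTTP request decoding (`HttpRequestDecoder.createMessage` calling `HttpVersion.valueOf`), so the "empty text" message seems to be raised while parsing the first line of an incoming request on port 8080. The sketch below is a simplified re-creation of that situation under our own assumptions, not Netty's actual source: `RequestLineSketch`, `splitInitialLine` and `parseVersion` are hypothetical names. It only shows how a request line with no protocol version token could lead to an exception with this text.

```java
// Simplified, hypothetical re-creation; NOT Netty's actual source.
class RequestLineSketch {

    // Splits a request line into exactly three tokens
    // (method, URI, version); missing tokens become "".
    static String[] splitInitialLine(String line) {
        String[] parts = line.trim().split("\\s+", 3);
        String[] out = {"", "", ""};
        for (int i = 0; i < parts.length; i++) {
            out[i] = parts[i];
        }
        return out;
    }

    // Rejects an empty version token with the same message as in the log.
    static String parseVersion(String text) {
        String version = text.trim().toUpperCase();
        if (version.isEmpty()) {
            throw new IllegalArgumentException("empty text");
        }
        return version;
    }
}
```

With this sketch, `parseVersion(splitInitialLine("GET /path HTTP/1.1")[2])` returns the version string, while `parseVersion(splitInitialLine("GET /path")[2])` throws `IllegalArgumentException: empty text`. Whether the real request on our node looked like this is something we have not verified.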
I suspect that this message appears because the job has no reduce phase.
How can this problem be solved?
--
Consultant, 1st category
Костарев А.Ф.