Hi

We are trying to use the YARN framework with Lily - http://www.lilyproject.org/lily/index.html

Lily uses a MapReduce job for batch indexing of records.
The map task reads each record from HBase and sends it to Solr for indexing.
A peculiarity of this implementation is that it has no reduce step:
http://docs.ngdata.com/lily-docs-current/415-lily/441-lily.html#ThebatchbuildMapReducejob


  The batch build MapReduce job

The batch build MR job is a map task for the MapReduce programming model that takes as input the records stored in Lily, and calls the indexer engine for each of these records.

It is based on Lily's MR integration <http://docs.ngdata.com/lily-docs-current/g4/570-lily.html> support, which means that the input will be split into as many parts as there are HBase regions in the records table.

There is no reduce part to this job, and neither does the map task produce any output key-values. It simply calls Solr directly. This approach is used since it allows to run the batch build concurrently with an ongoing incremental update of the index.

Since the map task spends time waiting on IO (as it reads records from HBase and sends to Solr), it uses multiple threads to perform the indexing.
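
To make the pattern in the quoted text concrete, here is a minimal sketch in plain Java of a map-style worker that hands each record to a thread pool and produces no output key-values. This is not Lily's actual code: the class name MapOnlyIndexerSketch, the method indexAll and the indexer callback are hypothetical, and the callback merely stands in for the call to Solr. In a real Hadoop job the absence of a reduce step would normally be declared with Job.setNumReduceTasks(0).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

public class MapOnlyIndexerSketch {

    /**
     * Hypothetical stand-in for one map task: every input record is passed
     * to the indexer callback on a worker thread, and nothing is emitted
     * for a reduce phase. Returns the number of records processed.
     */
    public static int indexAll(List<String> records, Consumer<String> indexer, int threads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger indexed = new AtomicInteger();
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (String record : records) {
                pending.add(pool.submit(() -> {
                    indexer.accept(record);      // assumed to be the IO-bound call (e.g. to Solr)
                    indexed.incrementAndGet();
                }));
            }
            for (Future<?> f : pending) {
                f.get();                         // wait for completion and surface any failure
            }
        } finally {
            pool.shutdown();
        }
        return indexed.get();
    }

    public static void main(String[] args) throws Exception {
        List<String> records = Arrays.asList("rec-1", "rec-2", "rec-3");
        int n = indexAll(records, r -> System.out.println("indexing " + r), 2);
        System.out.println(n + " records indexed");
    }
}
```

The thread pool is what lets a single map task overlap the waits on HBase reads and Solr writes; the order in which records are indexed is not guaranteed.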

With the original MapReduce (MR1) everything works correctly, but when we run the job on YARN we get the following errors in the NodeManager log:
yarn-hadoop-nodemanager-node4.nevod.ru.log:
2013-04-11 11:04:07,325 ERROR org.apache.hadoop.mapred.ShuffleHandler: Shuffle error:
java.lang.IllegalArgumentException: empty text
        at org.jboss.netty.handler.codec.http.HttpVersion.<init>(HttpVersion.java:103)
        at org.jboss.netty.handler.codec.http.HttpVersion.valueOf(HttpVersion.java:68)
        at org.jboss.netty.handler.codec.http.HttpRequestDecoder.createMessage(HttpRequestDecoder.java:81)
        at org.jboss.netty.handler.codec.http.HttpMessageDecoder.decode(HttpMessageDecoder.java:198)
        at org.jboss.netty.handler.codec.http.HttpMessageDecoder.decode(HttpMessageDecoder.java:107)
        at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:470)
        at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:443)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:274)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:261)
        at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:349)
        at org.jboss.netty.channel.socket.nio.NioWorker.processSelectedKeys(NioWorker.java:280)
        at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:200)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:722)
2013-04-11 11:04:07,325 ERROR org.apache.hadoop.mapred.ShuffleHandler: Shuffle error [id: 0x3bd2d042, /195.222.150.35:50577 => /195.222.150.35:8080] EXCEPTION: java.lang.IllegalArgumentException: empty text

I suspect that this message appears because the job has no reduce phase.

How can this problem be solved?


--
Consultant, 1st category
Костарев А.Ф.
