After I ran spark-submit on my YARN cluster with Spark 1.6.2:
./bin/spark-submit --class TerasortApp \
--master yarn \
--deploy-mode cluster \
--driver-memory 4G \
--executor-memory 12G \
--executor-cores 4 \
--num-executors 16 \
--conf spark.yarn.executor.memoryOverhead=4000 \
--conf spark.memory.useLegacyMode=true \
--conf spark.shuffle.memoryFraction=0.6 \
--conf "spark.executor.extraJavaOptions=-XX:+UseG1GC -XX:ArrayAllocationWarningSize=2048M" \
--queue spark \
/home/sanbai/myspark/target/scala-2.10/test_2.10-1.0.jar
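For a longer-lived setup, the same options can also be kept in conf/spark-defaults.conf instead of the command line. This is a sketch using the standard Spark 1.6 property names (--num-executors corresponds to spark.executor.instances); values should be adjusted to your cluster:

spark.driver.memory                  4g
spark.executor.memory                12g
spark.executor.cores                 4
spark.executor.instances             16
spark.yarn.executor.memoryOverhead   4000
spark.memory.useLegacyMode           true
spark.shuffle.memoryFraction         0.6

Note that --conf flags passed to spark-submit take precedence over values in spark-defaults.conf.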
The job failed, and the log reported:
com.esotericsoftware.kryo.KryoException: java.io.IOException: failed to uncompress the chunk: PARSING_ERROR(2)
Serialization trace:
bytes (org.apache.hadoop.io.Text)
    at com.esotericsoftware.kryo.io.Input.fill(Input.java:142)
    at com.esotericsoftware.kryo.io.Input.require(Input.java:169)
    at com.esotericsoftware.kryo.io.Input.readBytes(Input.java:317)
    at com.esotericsoftware.kryo.io.Input.readBytes(Input.java:297)
    at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ByteArraySerializer.read(DefaultArraySerializers.java:35)
    at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ByteArraySerializer.read(DefaultArraySerializers.java:18)
    at com.esotericsoftware.kryo.Kryo.readObjectOrNull(Kryo.java:699)
    at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:611)
    at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
    at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
    at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:228)
    at org.apache.spark.serializer.DeserializationStream.readValue(Serializer.scala:171)
    at org.apache.spark.serializer.DeserializationStream$$anon$2.getNext(Serializer.scala:201)
    at org.apache.spark.serializer.DeserializationStream$$anon$2.getNext(Serializer.scala:198)
    at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
    at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
Somebody on the internet suggested this might be caused by a compatibility problem between Spark 1.6.2 and Snappy. Therefore I added
--conf spark.io.compression.codec=lz4
to my spark-submit shell script to change the compression codec from Snappy to LZ4. This time everything went fine.
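If you prefer to pin the codec inside the application rather than on the command line, the same property can be set on the SparkConf before the context is created. This is a minimal sketch; the real setup code of TerasortApp is not shown in this post:

import org.apache.spark.{SparkConf, SparkContext}

object TerasortApp {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("TerasortApp")
      // use LZ4 instead of the default Snappy for block compression
      .set("spark.io.compression.codec", "lz4")
    val sc = new SparkContext(conf)
    // ... job logic goes here ...
    sc.stop()
  }
}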
Thank you mate... I know this is an old and concise article, but it helped me a lot.