I recently installed Cloudera 5.8.3 on my CentOS 7 machines. Here are the steps I followed, along with some troubleshooting tips:
0. If you are not in the USA, access to the Cloudera repository of RPMs (or Parcels) is painfully slow, so we need to mirror the CM (Cloudera Manager) repo and the CDH repo locally.
Create local CM Repo
Create local CDH Repo
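For the CM side, a minimal sketch of pointing yum at a local mirror could look like the following. It assumes the Cloudera Manager RPMs have already been copied to a web server of your own; the URL http://repo.example.local/cm5/, the repo id cloudera-manager-local and the function name are all made up for illustration, and gpgcheck is switched off only to keep the example short.

```shell
#!/usr/bin/env bash
# Write a yum .repo file that points at a local mirror of the CM RPMs.
# The mirror URL and the repo id "cloudera-manager-local" are hypothetical.
write_cm_repo() {
    local baseurl="$1"                        # e.g. http://repo.example.local/cm5/
    local dest_dir="${2:-/etc/yum.repos.d}"   # override to try it in a scratch dir
    cat > "${dest_dir}/cloudera-manager-local.repo" <<EOF
[cloudera-manager-local]
name=Cloudera Manager (local mirror)
baseurl=${baseurl}
enabled=1
gpgcheck=0
EOF
}

# Possible usage as root:
#   write_cm_repo "http://repo.example.local/cm5/"
#   yum clean all && yum makecache
```

The CDH parcels are served differently (a plain HTTP directory that the wizard is pointed at), so this sketch only covers the RPM repo.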
1. Install Cloudera Manager (steps)
2. Start Cloudera Manager
sudo cmf-server start
But it reports:
org.springframework.beans.factory.support.FactoryBeanRegistrySupport.doGetObjectFromFactoryBean(FactoryBeanRegistrySupport.java:142)
... 22 more
Caused by: org.hibernate.service.classloading.spi.ClassLoadingException: HHH010003: JDBC Driver class not found: com.mysql.jdbc.Driver
at org.hibernate.service.jdbc.connections.internal.C3P0ConnectionProvider.configure(C3P0ConnectionProvider.java:142)
at org.hibernate.service.internal.StandardServiceRegistryImpl.configureService(StandardServiceRegistryImpl.java:75)
On CentOS 7, the solution is:
# Install the MySQL driver for Java
sudo yum install mysql-connector-java -y
# Tell Cloudera Manager where the driver jar is
export CMF_JDBC_DRIVER_JAR=/usr/share/java/mysql-connector-java.jar
# Start Cloudera Manager again
sudo cmf-server start
You also need to run “sudo ./cloudera-manager-installer.bin --skip_repo_package=1” to create “db.properties”.
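Before restarting the server, it can be worth checking that the jar really contains the class named in the stack trace. This is just a sketch: the function name is my own, and it assumes unzip is installed on the machine.

```shell
#!/usr/bin/env bash
# Return 0 if the given jar contains the class Hibernate failed to load
# (com.mysql.jdbc.Driver). A missing or unreadable jar returns non-zero.
jar_has_mysql_driver() {
    local jar="${1:-/usr/share/java/mysql-connector-java.jar}"
    [ -r "$jar" ] || return 1
    # List the archive contents and look for the driver class entry.
    unzip -l "$jar" 2>/dev/null | grep -q 'com/mysql/jdbc/Driver\.class'
}

# Possible usage:
#   jar_has_mysql_driver && echo "driver found" || echo "driver missing"
```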
3. Log in to Cloudera Manager (port 7180) and follow the wizard to create a new cluster. (Choosing the local repository makes the installation much faster 🙂)
Make sure the hostname of every node is correct. The “Host Inspector” can also reveal many potential problems on these machines.
After many attempts to set up the cluster, I found this error in the logs of some nodes:
Error, CM server guid updated, expected 85587073-270d-43d9-a44a-e213d9f7e45b, received 4c1402a5-8364-4598-a382-0c760710e897
The solution is simple:
# On each node that reports the error
sudo rm -rf /var/lib/cloudera-scm-agent/cm_guid
and restart the Cloudera Manager Agent on these nodes.
I also hit a problem where the installation hung on this message:
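The two steps can be wrapped in a small helper, sketched below. The function name is made up, and the optional directory argument exists only so it can be tried on a scratch directory first. The restart line is left as a comment because the service name cloudera-scm-agent is what a standard package install gives you, which may differ on your setup.

```shell
#!/usr/bin/env bash
# Remove the cached CM server GUID so the agent accepts the new server identity.
reset_cm_guid() {
    local agent_dir="${1:-/var/lib/cloudera-scm-agent}"
    local guid_file="${agent_dir}/cm_guid"
    if [ -e "$guid_file" ]; then
        rm -f -- "$guid_file"
        echo "removed ${guid_file}"
    else
        echo "no cm_guid in ${agent_dir}, nothing to do"
    fi
}

# Possible usage as root on an affected node:
#   reset_cm_guid && service cloudera-scm-agent restart
```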
Acquiring installation lock...
No “yum” process was running on the node, so why was it still waiting for the installation lock? Most likely a stale lock file was left over from an earlier attempt. The fix is:
sudo rm -rf /tmp/.scm_prepare_node.lock
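To be a bit safer, the removal can be guarded by a check that nothing named yum is actually running. A rough sketch, with a function name of my own; the second argument (the process name to look for) is there only to make the helper easy to try out.

```shell
#!/usr/bin/env bash
# Remove a stale installation lock only when no matching process is running.
clear_stale_lock() {
    local lock="${1:-/tmp/.scm_prepare_node.lock}"
    local proc="${2:-yum}"
    # pgrep -x matches the process name exactly.
    if pgrep -x "$proc" >/dev/null 2>&1; then
        echo "${proc} is still running, leaving ${lock} alone" >&2
        return 1
    fi
    rm -rf -- "$lock"
}

# Possible usage as root on the stuck node:
#   clear_stale_lock
```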
4. After many failures and retries, I eventually got the CDH Hadoop ecosystem up and running.
When upgrading or downgrading a Cloudera cluster, you may see this problem:
The solution is (if in ‘single user mode’):
sudo chown -R cloudera-scm:cloudera-scm /run/cloudera-scm-agent/
sudo chown -R cloudera-scm:cloudera-scm /var/lib/cloudera-scm-agent/
and then try again.
When starting the ResourceManager, it failed and reported:
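To confirm that the ownership change covered everything, something like the following can list whatever is still owned by someone else. It is only a sketch and the function name is invented; an empty result means nothing was missed.

```shell
#!/usr/bin/env bash
# Print entries under a directory that are NOT owned by the given user.
list_not_owned_by() {
    local dir="$1"
    local user="$2"      # user name or numeric uid
    find "$dir" ! -user "$user" -print
}

# Possible usage on an agent host:
#   list_not_owned_by /var/lib/cloudera-scm-agent cloudera-scm
#   list_not_owned_by /run/cloudera-scm-agent cloudera-scm
```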
2017-06-05 16:31:58,812 WARN org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Update thread interrupted. Exiting.
2017-06-05 16:31:58,813 WARN org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Continuous scheduling thread interrupted. Exiting.
java.lang.InterruptedException: sleep interrupted
at java.lang.Thread.sleep(Native Method)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:319)
2017-06-05 16:31:58,814 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService: Interrupted while waiting to reload alloc configuration
2017-06-05 16:31:58,814 INFO org.apache.hadoop.yarn.util.AbstractLivelinessMonitor: AMLivelinessMonitor thread interrupted
2017-06-05 16:31:58,814 ERROR org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: ExpiredTokenRemover received java.lang.InterruptedException: sleep interrupted
2017-06-05 16:31:58,814 INFO org.apache.hadoop.yarn.util.AbstractLivelinessMonitor: org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.ContainerAllocationExpirer thread interrupted
2017-06-05 16:31:58,814 INFO org.apache.hadoop.yarn.util.AbstractLivelinessMonitor: AMLivelinessMonitor thread interrupted
2017-06-05 16:31:58,815 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioned to standby state
2017-06-05 16:31:58,816 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting ResourceManager
org.apache.hadoop.yarn.webapp.WebAppException: Error starting http server
at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:278)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startWepApp(ResourceManager.java:990)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1090)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1222)
Caused by: java.io.IOException: Problem in starting http server. Server handlers failed
at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:912)
at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:273)
... 4 more
2017-06-05 16:31:58,818 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: SHUTDOWN_MSG:
The cause of this error is that a non-Cloudera version of ZooKeeper is installed on the host. Remove it and reinstall ZooKeeper from CDH, and the YARN ResourceManager will start successfully.
If you hit “Deploy Client Configuration failed” when creating a new service, grant the cloudera-scm user passwordless sudo by adding this line to the sudoers file:
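One way to spot such a stray package is to filter the installed RPM list. The sketch below assumes that CDH-built RPMs carry “cdh” somewhere in their version string, which you should verify on your own hosts; the function name is made up.

```shell
#!/usr/bin/env bash
# Read package names on stdin and print the ZooKeeper ones that do not
# look like CDH builds (no "cdh" in the name-version-release string).
list_foreign_zookeeper() {
    grep -i 'zookeeper' | grep -vi 'cdh' || true
}

# Possible usage on the failing host:
#   rpm -qa | list_foreign_zookeeper
```

Anything it prints is a candidate for removal with yum before reinstalling ZooKeeper from CDH.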
cloudera-scm ALL=(ALL) NOPASSWD: ALL
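Rather than editing /etc/sudoers by hand, the rule can go into a drop-in file and be syntax-checked with visudo. A sketch, assuming the stock CentOS 7 setup where /etc/sudoers includes /etc/sudoers.d; the function name and the drop-in file name are arbitrary choices of mine.

```shell
#!/usr/bin/env bash
# Print the sudoers rule for the given account (defaults to cloudera-scm).
scm_sudoers_rule() {
    local user="${1:-cloudera-scm}"
    printf '%s ALL=(ALL) NOPASSWD: ALL\n' "$user"
}

# Possible usage as root:
#   scm_sudoers_rule > /etc/sudoers.d/cloudera-scm
#   chmod 0440 /etc/sudoers.d/cloudera-scm
#   visudo -cf /etc/sudoers.d/cloudera-scm   # syntax check before relying on it
```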