Product: TIBCO Spotfire®
Problem:
Troubleshooting the Error: Agent is down
Solution:
The error "Agent is down" can appear when you preview a dataset or run a workflow, and it occasionally appears after a reboot. Each agent connects TIBCO Spotfire Data Science to a particular data source. The error occurs when not enough RAM is available, when the maximum number of processes per OS user is set too low, or when the agent's port is unavailable.
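Two of the causes named above, available RAM and the per-user process limit, can be inspected from a shell before going further. The following is a minimal sketch, not part of the product: the function names are hypothetical, `ulimit -u` is assumed to run in bash as the OS user that owns the agent processes, and memory is read from the standard Linux /proc/meminfo file.

```shell
# Hypothetical helper: print the per-user process limit for the current
# shell session (a number, or "unlimited"). Run it as the agent's OS user.
max_user_procs() {
    ulimit -u
}

# Hypothetical helper: print free memory in MB from a /proc/meminfo-style
# file. With no argument it reads /proc/meminfo.
mem_free_mb() {
    awk '/^MemFree:/ { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}
```

Calling `max_user_procs` and `mem_free_mb` right after a failed workflow run gives a quick view of both values. The agent's port is not checked here because the article does not state which port the agent uses.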
- Verify only the appropriate agent is enabled. In the example below, only Agent 7, for CDH5.3 and above, is enabled.
vi $CHORUS_HOME/shared/ALPINE_DATA_REPOSITORY/configuration/alpine.conf

agent {
    1.enabled=false # PHD2.0
    2.enabled=false # CDH4
    3.enabled=false # MAPR3
    4.enabled=false # CDH5.0 to CDH5.2
    5.enabled=false # Hortonworks HDP2.1
    6.enabled=false # MapR4
    7.enabled=true # CDH5.3 and above
    8.enabled=false # Hortonworks HDP2.2 and PHD3.0
}
In this configuration file, set enabled=false for every agent associated with a Hadoop distribution you do not need. If you are connecting to a database, any of the agents will suffice.
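To confirm the result of the edit without opening the file again, the enabled entries can be listed with grep. This is a minimal sketch; the function name `enabled_agents` is hypothetical, and the pattern assumes entries are written as `N.enabled=true`, as in the example above.

```shell
# Hypothetical helper: print every "N.enabled=true" entry found in an
# alpine.conf-style file. Prints nothing if no agent is enabled.
enabled_agents() {
    grep -oE '[0-9]+\.enabled[[:space:]]*=[[:space:]]*true' "$1"
}
```

For example, `enabled_agents "$CHORUS_HOME/shared/ALPINE_DATA_REPOSITORY/configuration/alpine.conf"` should print a single line for the configuration shown above. More than one line of output means more than one agent is still enabled.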
- Verify the agent status by checking the modification time of each agent log. An active agent updates its log when a user tries to run a workflow; if the agent is down, the log is not updated. In this example, only the Agent_2 and Agent_4 logs have been updated on the current date, May 25. The other agents have not been used today.
$ cd /home/<chorus_username>
$ ll AlpineAgent_*.log
-rw-rw-r-- 1 <chorus_username> <chorus_usergroup> 10150225 May 20 15:53 AlpineAgent_1.log
-rw-rw-r-- 1 <chorus_username> <chorus_usergroup>  1922200 May 25 07:12 AlpineAgent_2.log
-rw-rw-r-- 1 <chorus_username> <chorus_usergroup>  5443924 Apr  6 04:04 AlpineAgent_3.log
-rw-rw-r-- 1 <chorus_username> <chorus_usergroup> 10079669 May 25 12:14 AlpineAgent_4.log
-rw-rw-r-- 1 <chorus_username> <chorus_usergroup>  4778048 May 23 04:05 AlpineAgent_5.log
$ date
Mon May 25 13:44:22 PDT 2015
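Instead of comparing timestamps by eye, find can list only the agent logs that were modified recently. This is a minimal sketch; the function name `recent_agent_logs` and the default 60-minute window are assumptions chosen for illustration.

```shell
# Hypothetical helper: list AlpineAgent_*.log files in directory $1 that were
# modified within the last $2 minutes (default: 60).
recent_agent_logs() {
    find "$1" -maxdepth 1 -name 'AlpineAgent_*.log' -mmin "-${2:-60}"
}
```

Run a workflow, then call the function on the directory that holds the logs. If the log of the enabled agent is missing from the output, that agent is probably not running.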
- Check that the Java version is correct:
- Use the alternatives command:
$ cd /usr/sbin
$ alternatives --config java

There are 2 programs which provide 'java'.

  Selection    Command
-----------------------------------------------
*+ 1           /usr/java/jdk1.7.0_65/bin/java
   2           /usr/java/jdk1.7.0_65/

Enter to keep the current selection[+], or type selection number:
The Oracle JDK is required, not the JRE.
$ java -version
java version "1.7.0_65"
Java(TM) SE Runtime Environment (build 1.7.0_65-b17)
Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)
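The two requirements above (Oracle build, JDK rather than JRE) can be checked with a short script. This is a minimal sketch with hypothetical function names. It relies on two general facts: Oracle builds print "Java(TM) SE Runtime Environment" in the version banner, and a JDK ships the javac compiler while a JRE does not. Finding javac on the PATH does not prove it belongs to the same installation as the java binary, so treat the result as a hint.

```shell
# Hypothetical helper: succeed if the version banner read from stdin
# comes from an Oracle build of Java.
is_oracle_java() {
    grep -q 'Java(TM) SE Runtime Environment'
}

# Hypothetical helper: succeed if a javac compiler is on the PATH,
# which suggests a JDK rather than a JRE.
has_javac() {
    command -v javac >/dev/null 2>&1
}
```

Because `java -version` writes to standard error, redirect it when piping: `java -version 2>&1 | is_oracle_java && has_javac && echo "Oracle JDK found"`.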
- Collect information from the relevant log files, located here (# is the agent number):
./AlpineAgent_#.log
./alpine-current/agent/scripts/#/[date].stderrout.log
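When collecting information from these logs, a quick scan for error lines can narrow down where to look. This is a minimal sketch; the function name `scan_agent_logs` is hypothetical, and the search terms "error" and "exception" are assumptions about what is worth reading first, not a documented log format.

```shell
# Hypothetical helper: print file name, line number, and text of every line
# that mentions "error" or "exception" (case-insensitive) in the given files.
scan_agent_logs() {
    grep -HinE 'error|exception' "$@"
}
```

For example, `scan_agent_logs ./AlpineAgent_7.log` scans the log of Agent 7. Include the full log files, not only the matching lines, when sending information to Customer Support.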
- Make sure to specify enough Jetty threads.
- Check the number of processors reported in the file /proc/cpuinfo. In this case, the file lists 8 processors and no 'core id' entries, so the system has 8 processing units.
$ cat /proc/cpuinfo | grep processor
processor : 0
processor : 1
processor : 2
processor : 3
processor : 4
processor : 5
processor : 6
processor : 7
$ cat /proc/cpuinfo | grep 'core id'
$ cat /proc/cpuinfo | grep processor | wc -l
8
If the server has more than 16 processors, set maxThreads to 120
<Set name='maxThreads'>120</Set>
in each of these jetty.xml files:
- $CHORUS_HOME/current/vendor/jetty/jetty.xml
- $CHORUS_HOME/current/vendor/jetty/etc/jetty.xml
- $CHORUS_HOME/alpine-current/agent/templates/jetty.xml
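The processor check and the maxThreads review can be combined in a few helper functions. This is a minimal sketch with hypothetical function names; the threshold of 16 processors comes from the step above, and everything else is plain grep.

```shell
# Hypothetical helper: count "processor" entries in a cpuinfo-style file.
# With no argument it reads /proc/cpuinfo.
count_processors() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# Hypothetical helper: succeed if the system has more than 16 processors,
# i.e. if maxThreads should be raised as described above.
needs_more_threads() {
    [ "$(count_processors "$1")" -gt 16 ]
}

# Hypothetical helper: show the current maxThreads line of each given file.
show_max_threads() {
    grep -H 'maxThreads' "$@"
}
```

For example, `needs_more_threads && show_max_threads "$CHORUS_HOME/current/vendor/jetty/jetty.xml" "$CHORUS_HOME/current/vendor/jetty/etc/jetty.xml" "$CHORUS_HOME/alpine-current/agent/templates/jetty.xml"` prints the current setting of all three files only when a change is called for.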
- Stop all processes
source chorus_path.sh
chorus_control.sh stop
ps -ef | grep chorus   # note any chorus processes still running
# force any remaining process to stop
ps -ef | grep alpine   # note any alpine processes still running
# force any remaining process to stop
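The two ps checks above can be folded into one filter that prints only the process IDs still running. This is a minimal sketch; the function name `leftover_pids` is hypothetical, and it assumes the `ps -ef` column layout in which the PID is the second field.

```shell
# Hypothetical helper: read "ps -ef" output on stdin and print the PID of
# every line that mentions chorus or alpine. The bracketed first letters
# keep grep from matching its own command line.
leftover_pids() {
    grep -E '[c]horus|[a]lpine' | awk '{print $2}'
}
```

Run it as `ps -ef | leftover_pids`. Review the listed processes before stopping them; `kill <pid>` asks a process to exit, and `kill -9 <pid>` forces it if it does not respond.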
- Start chorus again
source chorus_path.sh
chorus_control.sh start
Run a workflow to verify that the error has been resolved.
If the error still appears, contact Customer Support.