
Ambari Failed To Start After Server Reboot

Recently, we needed to shut down the HDP cluster in our development environment for regular maintenance.

After gracefully stopping the HDP cluster services from the Ambari console, I shut down the Ambari server gracefully as well.

When I started the Ambari server after the maintenance, I expected a clean start, but that is not what happened. The startup process threw the exception below.

[jadmin@cpaz ~]$ sudo ambari-server start
[sudo] password for jadmin:
Using python  /usr/bin/python
Starting ambari-server
Ambari Server running with administrator privileges.
Organizing resource files at /var/lib/ambari-server/resources...
Ambari database consistency check started...
Server PID at: /var/run/ambari-server/ambari-server.pid
Server out at: /var/log/ambari-server/ambari-server.out
Server log at: /var/log/ambari-server/ambari-server.log
Waiting for server start.........Unable to determine server PID. Retrying...
......Unable to determine server PID. Retrying...
......Unable to determine server PID. Retrying...
ERROR: Exiting with exit code -1.
REASON: Ambari Server java process died with exitcode 1. Check /var/log/ambari-server/ambari-server.out for more information.

After going through the detailed log at /var/log/ambari-server/ambari-server.log, I found an exception saying that the table oozie.metainfo does not exist.

I was clueless, because I had started the Ambari server smoothly many times before. This time, all of a sudden, the server threw the error shown below.

*****************************************************************
Details in Server log at: /var/log/ambari-server/ambari-server.log
*****************************************************************
07 Apr 2020 18:16:25,063  INFO [main] HostRoleCommandDAO:261 - Host role command status summary cache enabled !
07 Apr 2020 18:16:25,065  INFO [main] TransactionalLock$LockArea:121 - LockArea HRC_STATUS_CACHE is enabled
07 Apr 2020 18:16:25,215  INFO [main] LockFactory:53 - Lock profiling is disabled
07 Apr 2020 18:16:25,216  INFO [main] AmbariServer:1044 - Getting the controller
07 Apr 2020 18:16:26,178  INFO [main] AbstractPoolBackedDataSource:212 - Initializing c3p0 pool... com.mchange.v2.c3p0.ComboPooledDataSource [ acquireIncrement -> 5, acquireRetryAttempts -> 30, acquireRetryDelay -> 1000, autoCommitOnClose -> false, automaticTestTable -> null, breakAfterAcquireFailure -> false, checkoutTimeout -> 0, connectionCustomizerClassName -> null, connectionTesterClassName -> com.mchange.v2.c3p0.impl.DefaultConnectionTester, contextClassLoaderSource -> caller, dataSourceName -> 2y5zpsa9nhtgwh1jcfbh|28f3b248, debugUnreturnedConnectionStackTraces -> false, description -> null, driverClass -> com.mysql.jdbc.Driver, extensions -> {}, factoryClassLocation -> null, forceIgnoreUnresolvedTransactions -> false, forceSynchronousCheckins -> false, forceUseNamedDriverClass -> false, identityToken -> 2y5zpsa9nhtgwh1jcfbh|28f3b248, idleConnectionTestPeriod -> 7200, initialPoolSize -> 5, jdbcUrl -> jdbc:mysql://Ja.iv502ww4s2ludo45iz2y1gzvxg.ix.internal.cloudapp.net:3306/oozie, maxAdministrativeTaskTime -> 0, maxConnectionAge -> 0, maxIdleTime -> 14400, maxIdleTimeExcessConnections -> 0, maxPoolSize -> 32, maxStatements -> 0, maxStatementsPerConnection -> 0, minPoolSize -> 5, numHelperThreads -> 3, preferredTestQuery -> SELECT 1, privilegeSpawnedThreads -> false, properties -> {user=******, password=******}, propertyCycle -> 0, statementCacheNumDeferredCloseThreads -> 0, testConnectionOnCheckin -> false, testConnectionOnCheckout -> false, unreturnedConnectionTimeout -> 0, userOverrides -> {}, usesTraditionalReflectiveProxies -> false ]
07 Apr 2020 18:16:26,514 ERROR [main] AmbariServer:1073 - Failed to run the Ambari Server
Local Exception Stack:
Exception [EclipseLink-4002] (Eclipse Persistence Services - 2.6.2.v20151217-774c696): org.eclipse.persistence.exceptions.DatabaseException
Internal Exception: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Table 'oozie.metainfo' doesn't exist
Error Code: 1146
Call: SELECT `metainfo_key`, `metainfo_value` FROM metainfo WHERE (`metainfo_key` = ?)
        bind => [1 parameter bound]
Query: ReadObjectQuery(name="readMetainfoEntity" referenceClass=MetainfoEntity sql="SELECT `metainfo_key`, `metainfo_value` FROM metainfo WHERE (`metainfo_key` = ?)")
        at org.eclipse.persistence.exceptions.DatabaseException.sqlException(DatabaseException.java:340)
        at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.basicExecuteCall(DatabaseAccessor.java:684)
        at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.executeCall(DatabaseAccessor.java:560)
        at org.eclipse.persistence.internal.sessions.AbstractSession.basicExecuteCall(AbstractSession.java:2055)
        at org.eclipse.persistence.sessions.server.ServerSession.executeCall(ServerSession.java:570)
        at org.eclipse.persistence.internal.queries.DatasourceCallQueryMechanism.executeCall(DatasourceCallQueryMechanism.java:242)
        at org.eclipse.persistence.internal.queries.DatasourceCallQueryMechanism.executeCall(DatasourceCallQueryMechanism.java:228)
        at org.eclipse.persistence.internal.queries.DatasourceCallQueryMechanism.selectOneRow(DatasourceCallQueryMechanism.java:714)
        at org.eclipse.persistence.internal.queries.ExpressionQueryMechanism.selectOneRowFromTable(ExpressionQueryMechanism.java:2803)
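
With a long stack trace like this, the line that matters is the one starting with "Internal Exception:". The following is a minimal sketch, not part of the original troubleshooting, that pulls those lines out of a log. The function name find_root_causes and the sample text are illustrative assumptions; the regular expression only assumes the "Internal Exception: <class>: <message>" shape seen in the log above.

```python
import re

# Matches lines shaped like "Internal Exception: <class>: <message>",
# as printed in the EclipseLink stack trace above.
INTERNAL_EXCEPTION = re.compile(
    r"^Internal Exception:\s*(?P<cls>[\w.$]+):\s*(?P<msg>.*)$"
)


def find_root_causes(log_text):
    """Return (exception_class, message) pairs for every 'Internal Exception' line."""
    causes = []
    for line in log_text.splitlines():
        match = INTERNAL_EXCEPTION.match(line.strip())
        if match:
            causes.append((match.group("cls"), match.group("msg")))
    return causes


# A shortened excerpt of the log shown above.
sample = """07 Apr 2020 18:16:26,514 ERROR [main] AmbariServer:1073 - Failed to run the Ambari Server
Internal Exception: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Table 'oozie.metainfo' doesn't exist
Error Code: 1146"""

for cls, msg in find_root_causes(sample):
    print(cls, "->", msg)
```

In practice you would read the text from /var/log/ambari-server/ambari-server.log instead of the inline sample.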

After this, I checked the database settings in /etc/ambari-server/conf/ambari.properties and got the output below.

[hapad@J ~]$ grep 'jdbc' /etc/ambari-server/conf/ambari.properties
custom.mysql.jdbc.name=mysql-connector-java.jar
previous.custom.mysql.jdbc.name=mysql-connector-java.jar
server.jdbc.connection-pool=c3p0
server.jdbc.connection-pool.acquisition-size=5
server.jdbc.connection-pool.idle-test-interval=7200
server.jdbc.connection-pool.max-age=0
server.jdbc.connection-pool.max-idle-time=14400
server.jdbc.connection-pool.max-idle-time-excess=0
server.jdbc.database=mysql
server.jdbc.database_name=oozie
server.jdbc.driver=com.mysql.jdbc.Driver
server.jdbc.hostname=J.iv502ww4s2ludo45iz2y1gzvxg.ix.internal.cloudapp.net
server.jdbc.port=3306
server.jdbc.postgres.schema=ambari
server.jdbc.rca.driver=com.mysql.jdbc.Driver
server.jdbc.rca.url=jdbc:mysql://J.iv502ww4s2ludo45iz2y1gzvxg.ix.internal.cloudapp.net:3306/oozie
server.jdbc.rca.user.name=oozie
server.jdbc.rca.user.passwd=/etc/ambari-server/conf/password.dat
server.jdbc.url=jdbc:mysql://J.iv502ww4s2ludo45iz2y1gzvxg.ix.internal.cloudapp.net:3306/oozie
server.jdbc.user.name=oozie
server.jdbc.user.passwd=/etc/ambari-server/conf/password.dat

From the properties file, it looked like Ambari was pointing to the MySQL database and the MySQL driver was in place, yet the error persisted.

I decided to check the Ambari server setup for the MySQL database and re-apply the settings via the setup script. When I ran ambari-server setup, however, I got a warning message asking me to run “/var/lib/ambari-server/resources/Ambari-DDL-MySQL-CREATE.sql”.

I assumed that there might be an issue with the MySQL database; after all, this is what the earlier error pointed to: “table oozie.metainfo does not exist”.

I also checked the MySQL database to confirm the state of the oozie.metainfo table and found that it was missing.

I executed the command below to source the SQL DDL script into the MySQL database, but that did not help either.

mysql -h 10.1xx.2xx.1x -u oozie -poozie oozie < /var/lib/ambari-server/resources/Ambari-DDL-MySQL-CREATE.sql > output.log

Then I realized that I had never run ambari-server setup with the MySQL database. I had set it up with Postgres instead, so I decided to check the schema in the Postgres database, and I found the metainfo table there.

The table list of the PostgreSQL ambari database is shown below. Why the server would not start even though everything required was in place remained an open question for quite some time.
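
The mismatch described here (a Postgres-based setup, but JDBC properties pointing at MySQL) can be checked mechanically. Below is a minimal sketch, assuming the simple key=value layout seen in the grep output earlier. The helper names parse_properties and jdbc_findings, the URL_PREFIX mapping, and the placeholder host db.example.internal are my own assumptions for illustration, not part of Ambari.

```python
# Assumed mapping from the server.jdbc.database value to the JDBC URL prefix.
URL_PREFIX = {
    "postgres": "jdbc:postgresql:",
    "mysql": "jdbc:mysql:",
}


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def jdbc_findings(props, expected_database):
    """List the ways the JDBC settings disagree with the expected database type."""
    findings = []
    database = props.get("server.jdbc.database", "")
    if database != expected_database:
        findings.append(
            "server.jdbc.database is %r, expected %r" % (database, expected_database)
        )
    url = props.get("server.jdbc.url", "")
    prefix = URL_PREFIX.get(expected_database)
    if prefix and not url.startswith(prefix):
        findings.append("server.jdbc.url does not start with %r: %s" % (prefix, url))
    return findings


# Sample values modelled on the grep output; the host name is a placeholder.
sample = """
server.jdbc.database=mysql
server.jdbc.database_name=oozie
server.jdbc.url=jdbc:mysql://db.example.internal:3306/oozie
"""

for finding in jdbc_findings(parse_properties(sample), "postgres"):
    print(finding)
```

An empty result would mean the two properties checked agree with the expected database type. It says nothing about whether the database itself is reachable or populated.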

ambari-> \dt
                    List of relations
 Schema |             Name              | Type  | Owner
--------+-------------------------------+-------+--------
 ambari | adminpermission               | table | ambari
 ambari | adminprincipal                | table | ambari
 ambari | adminprincipaltype            | table | ambari
 ambari | adminprivilege                | table | ambari
 ambari | adminresource                 | table | ambari
 ambari | adminresourcetype             | table | ambari
 ambari | alert_current                 | table | ambari
 ambari | alert_definition              | table | ambari
 ambari | alert_group                   | table | ambari
 ambari | alert_group_target            | table | ambari
 ambari | alert_grouping                | table | ambari
 ambari | alert_history                 | table | ambari
 ambari | alert_notice                  | table | ambari
 ambari | alert_target                  | table | ambari
 ambari | alert_target_states           | table | ambari
 ambari | ambari_operation_history      | table | ambari
 ambari | ambari_sequences              | table | ambari
 ambari | artifact                      | table | ambari
 ambari | blueprint                     | table | ambari
 ambari | blueprint_configuration       | table | ambari
 ambari | blueprint_setting             | table | ambari
 ambari | clusterconfig                 | table | ambari
 ambari | clusterhostmapping            | table | ambari
 ambari | clusters                      | table | ambari
 ambari | clusterservices               | table | ambari
 ambari | clusterstate                  | table | ambari
 ambari | confgroupclusterconfigmapping | table | ambari
 ambari | configgroup                   | table | ambari
 ambari | configgrouphostmapping        | table | ambari
 ambari | execution_command             | table | ambari
 ambari | extension                     | table | ambari
 ambari | extensionlink                 | table | ambari
 ambari | groups                        | table | ambari
 ambari | host_role_command             | table | ambari
 ambari | host_version                  | table | ambari
 ambari | hostcomponentdesiredstate     | table | ambari
 ambari | hostcomponentstate            | table | ambari
 ambari | hostconfigmapping             | table | ambari
 ambari | hostgroup                     | table | ambari
 ambari | hostgroup_component           | table | ambari
 ambari | hostgroup_configuration       | table | ambari
 ambari | hosts                         | table | ambari
 ambari | hoststate                     | table | ambari
 ambari | kerberos_descriptor           | table | ambari
 ambari | kerberos_principal            | table | ambari
 ambari | kerberos_principal_host       | table | ambari
 ambari | key_value_store               | table | ambari
 ambari | members                       | table | ambari
 ambari | metainfo                      | table | ambari
 ambari | permission_roleauthorization  | table | ambari
 ambari | qrtz_blob_triggers            | table | ambari
 ambari | qrtz_calendars                | table | ambari
 ambari | qrtz_cron_triggers            | table | ambari
 ambari | qrtz_fired_triggers           | table | ambari
 ambari | qrtz_job_details              | table | ambari
 ambari | qrtz_locks                    | table | ambari
 ambari | qrtz_paused_trigger_grps      | table | ambari
 ambari | qrtz_scheduler_state          | table | ambari
 ambari | qrtz_simple_triggers          | table | ambari
 ambari | qrtz_simprop_triggers         | table | ambari
 ambari | qrtz_triggers                 | table | ambari
 ambari | remoteambaricluster           | table | ambari
 ambari | remoteambariclusterservice    | table | ambari
 ambari | repo_version                  | table | ambari
 ambari | request                       | table | ambari
 ambari | requestoperationlevel         | table | ambari
 ambari | requestresourcefilter         | table | ambari
 ambari | requestschedule               | table | ambari
 ambari | requestschedulebatchrequest   | table | ambari
 ambari | role_success_criteria         | table | ambari
 ambari | roleauthorization             | table | ambari

To summarize at this point: according to the ambari.properties file, the system was pointing to MySQL, which is why it complained about oozie.metainfo rather than ambari.metainfo; the table it could not find was oozie.metainfo in the MySQL database. After sourcing the DDL script, the metainfo table was created in the oozie database, but the system still threw the same error.

MySQL [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| hive               |
| mysql              |
| oozie              |
| performance_schema |
+--------------------+

After all this, I concluded that my Ambari server was somehow pointing to the wrong database, and I eventually decided to run ambari-server setup again for the default database, which was Postgres.

The database list and the data in the ambari.metainfo table are shown below.

ambari=> \l
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges
-----------+----------+----------+-------------+-------------+-----------------------
 ambari    | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =Tc/postgres         +
           |          |          |             |             | postgres=CTc/postgres+
           |          |          |             |             | ambari=CTc/postgres
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
(4 rows)

ambari=> select * from metainfo;
 metainfo_key | metainfo_value
--------------+----------------
 version      | 2.6.0
(1 row)

The Postgres service was also running.

Redirecting to /bin/systemctl status postgresql.service
● postgresql.service - PostgreSQL database server
   Loaded: loaded (/usr/lib/systemd/system/postgresql.service; disabled; vendor preset: disabled)
   Active: active (running) since Mon 2020-03-09 11:28:48 IST; 4 weeks 1 days ago
 Main PID: 38719 (postgres)
   CGroup: /system.slice/postgresql.service
           ├─38719 /usr/bin/postgres -D /var/lib/pgsql/data -p 5432
           ├─38720 postgres: logger process
           ├─38722 postgres: checkpointer process
           ├─38723 postgres: writer process
           ├─38724 postgres: wal writer process
           ├─38725 postgres: autovacuum launcher process
           └─38726 postgres: stats collector process

Before re-running ambari-server setup, I collected all the details of the Postgres database and simply entered the existing settings in the setup script.

ambari-server setup

After the setup, I started ambari-server using the start script and, to my surprise, this time Ambari started successfully. I then logged in to Ambari and verified that my cluster information was intact, without any change or loss of data.

I am not sure about the root cause (RCA) of this issue, as I am still new to HDP. But eventually, re-running the setup script resolved the issue.
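
One way to narrow down the root cause would be to compare the JDBC settings in ambari.properties before and after re-running the setup. I did not keep a copy of the old file, so the following is only a sketch of the idea. The helper names and both sample snippets are hypothetical, and the "after" values are assumed rather than taken from my server.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def diff_properties(before_text, after_text, prefix="server.jdbc."):
    """Map each changed key under `prefix` to an (old, new) pair; None means absent."""
    before = parse_properties(before_text)
    after = parse_properties(after_text)
    changes = {}
    for key in sorted(set(before) | set(after)):
        if not key.startswith(prefix):
            continue
        old, new = before.get(key), after.get(key)
        if old != new:
            changes[key] = (old, new)
    return changes


# "before" is modelled on the grep output above; "after" is an assumed example.
before_sample = """
server.jdbc.database=mysql
server.jdbc.database_name=oozie
server.jdbc.postgres.schema=ambari
"""
after_sample = """
server.jdbc.database=postgres
server.jdbc.database_name=ambari
server.jdbc.postgres.schema=ambari
"""

for key, (old, new) in diff_properties(before_sample, after_sample).items():
    print(key, ":", old, "->", new)
```

Copying /etc/ambari-server/conf/ambari.properties aside before running the setup would provide the "before" text for such a comparison.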

There were a few risks involved in this trial-and-error investigation, and choosing to re-run ambari-server setup was not as easy as it looks at first glance, because I had apprehensions about the integrity of the existing cluster data.

To overcome this doubt, I also tried resetting ambari-server using the reset option, and at this step I realized that ambari-server reset wipes out any existing cluster (I did not complete this process and aborted it after the system warned me about the loss of existing data). Hence, re-running the setup with the existing settings was a safe bet for me, and it did the trick without any harm to the cluster data.
