Re: [onap-discuss] #mariadb-galera #sdc #so

Dmitry Puzikov
 

Hi, Marko,

All MariaDB data is kept under /dockerdata-nfs/dev-mariadb-galera/mariadb-galera/*{0..2}
I'd rather scale mariadb down to 0 pods, back up those data dirs, then delete the dirs for pod1 and pod2 (*{1..2}), keeping the pod0 data dir intact. After that, modify grastate.dat as proposed in the log:
"edit the grastate.dat file manually and set safe_to_bootstrap to 1", then scale mariadb up to 1 pod to see whether it starts successfully. If it does, just scale up to 3 or whatever number you need.
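The grastate.dat edit in the steps above could be sketched roughly like this. The snippet works on a local sample file mirroring the format mysqld writes; on the real system the file lives under the /dockerdata-nfs/dev-mariadb-galera/mariadb-galera/*0/ data dir, and the StatefulSet name is taken from the thread, so verify both against your deployment before running anything:

```shell
# Sample grastate.dat as mysqld writes it (uuid/seqno values are illustrative):
cat > grastate.dat <<'EOF'
# GALERA saved state
version: 2.1
uuid:    00000000-0000-0000-0000-000000000000
seqno:   -1
safe_to_bootstrap: 0
EOF

# Flip the flag so pod0 is allowed to bootstrap a new cluster:
sed -i 's/^safe_to_bootstrap: 0$/safe_to_bootstrap: 1/' grastate.dat
grep safe_to_bootstrap grastate.dat
# safe_to_bootstrap: 1

# Then scale back up, e.g. (names assumed from the thread):
#   kubectl -n onap scale statefulset dev-mariadb-galera-mariadb-galera --replicas=1
#   ...wait until pod0 is Running/Ready...
#   kubectl -n onap scale statefulset dev-mariadb-galera-mariadb-galera --replicas=3
```

Only one node should ever have safe_to_bootstrap set to 1; the other pods then rejoin via SST from the bootstrapped node.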

Regards,
Dmitry


From: onap-discuss@... <onap-discuss@...> on behalf of Ferrero Marco Angelo via Lists.Onap.Org <marco.ferrero=telecomitalia.it@...>
Sent: Monday, October 21, 2019 11:35
To: onap-discuss@... <onap-discuss@...>
Subject: [onap-discuss] #mariadb-galera #sdc #so
 
Hi all, today, after 73 days of correct operation (from August 9 to October 21), all three pods of mariadb-galera are in CrashLoopBackOff state.
I tried to delete pods dev-mariadb-galera-mariadb-galera-1 and dev-mariadb-galera-mariadb-galera-2, hoping Kubernetes would re-create them.
That didn't happen.
Pod dev-mariadb-galera-mariadb-galera-0 fails because of the readiness check: MySQL is not able to start correctly before the readiness check runs.
Here is the log of the pods dev-mariadb-galera-mariadb-galera-0

root@cp-1:~/oom/kubernetes# kubectl logs -n onap -f dev-mariadb-galera-mariadb-galera-0 >  faulty_mariadb_galera
root@cp-1:~/oom/kubernetes# cat faulty_mariadb_galera
+ CONTAINER_SCRIPTS_DIR=/usr/share/container-scripts/mysql
+ EXTRA_DEFAULTS_FILE=/etc/my.cnf.d/galera.cnf
+ '[' -z onap ']'
+ echo 'Galera: Finding peers'
Galera: Finding peers
++ hostname -f
++ cut -d. -f2
+ K8S_SVC_NAME=mariadb-galera
+ echo 'Using service name: mariadb-galera'
+ cp /usr/share/container-scripts/mysql/galera.cnf /etc/my.cnf.d/galera.cnf
Using service name: mariadb-galera
+ /usr/bin/peer-finder -on-start=/usr/share/container-scripts/mysql/configure-galera.sh -service=mariadb-galera
2019/10/21 09:01:46 Peer list updated
was []
now [dev-mariadb-galera-mariadb-galera-0.mariadb-galera.onap.svc.cluster.local]
2019/10/21 09:01:46 execing: /usr/share/container-scripts/mysql/configure-galera.sh with stdin: dev-mariadb-galera-mariadb-galera-0.mariadb-galera.onap.svc.cluster.local
2019/10/21 09:01:46
2019/10/21 09:01:47 Peer finder exiting
+ '[' '!' -d /var/lib/mysql/mysql ']'
+ exec mysqld
2019-10-21  9:01:47 140401625143552 [Note] mysqld (mysqld 10.1.24-MariaDB) starting as process 1 ...
2019-10-21  9:01:47 140401625143552 [Note] WSREP: Read nil XID from storage engines, skipping position init
2019-10-21  9:01:47 140401625143552 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera/libgalera_smm.so'
2019-10-21  9:01:47 140401625143552 [Note] WSREP: wsrep_load(): Galera 25.3.20(r3703) by Codership Oy <info@...> loaded successfully.
2019-10-21  9:01:47 140401625143552 [Note] WSREP: CRC-32C: using hardware acceleration.
2019-10-21  9:01:47 140401625143552 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1, safe_to_bootsrap: 0
2019-10-21  9:01:47 140401625143552 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = dev-mariadb-galera-mariadb-galera-0.mariadb-galera.onap.svc.cluster.local; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.recover = no; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.versi
2019-10-21  9:01:47 140401625143552 [Note] WSREP: GCache history reset: old(00000000-0000-0000-0000-000000000000:0) -> new(00000000-0000-0000-0000-000000000000:-1)
2019-10-21  9:01:47 140401625143552 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
2019-10-21  9:01:47 140401625143552 [Note] WSREP: wsrep_sst_grab()
2019-10-21  9:01:47 140401625143552 [Note] WSREP: Start replication
2019-10-21  9:01:47 140401625143552 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
2019-10-21  9:01:47 140401625143552 [ERROR] WSREP: It may not be safe to bootstrap the cluster from this node. It was not the last one to leave the cluster and may not contain all the updates. To force cluster bootstrap with this node, edit the grastate.dat file manually and set safe_to_bootstrap to 1 .
2019-10-21  9:01:47 140401625143552 [ERROR] WSREP: wsrep::connect(gcomm://) failed: 7
2019-10-21  9:01:47 140401625143552 [ERROR] Aborting
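The error above means this node shut down uncleanly and refuses to bootstrap on its own. Before forcing a bootstrap, it is worth checking the saved Galera state of all three data dirs to see which node was most advanced. A small self-contained sketch (the demo/node-N dirs stand in for the real /dockerdata-nfs/dev-mariadb-galera/mariadb-galera/*{0..2} paths):

```shell
# Create three sample data dirs with the grastate.dat format seen in the log:
mkdir -p demo/node-0 demo/node-1 demo/node-2
for n in 0 1 2; do
    cat > "demo/node-$n/grastate.dat" <<'EOF'
# GALERA saved state
version: 2.1
uuid:    00000000-0000-0000-0000-000000000000
seqno:   -1
safe_to_bootstrap: 0
EOF
done

# Compare the saved state across nodes; the node with the highest seqno
# should be the one bootstrapped:
for d in demo/node-*; do
    echo "== $d =="
    grep -E 'seqno|safe_to_bootstrap' "$d/grastate.dat"
done
```

A seqno of -1 everywhere (as in the log above) means no node shut down cleanly; in that case `mysqld --wsrep-recover` can be run against a data dir to recover the actual position before choosing the bootstrap node.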
 

Does someone have an idea of what happened?
Does someone know how to re-deploy mariadb-galera KEEPING all the data in the underlying database?
Thank you for the help.

Marco F.
