Docker galera



  • Is there any way to automate a Docker Swarm Galera MariaDB cluster deployment?


  • administrators

    @jafinn Hey there. Do you already have a preferred deployment method that you'd like to automate, or do you need something built from the ground up?


  • administrators

    @jafinn you didn't get an email notification for @ajvn's reply because of a missed config tweak in NodeBB (I don't know why email notifications aren't enabled by default). It has been corrected now 🙂


  • administrators

    Check this out, let us know how it works:
    https://git.shell5.dev/shell5dev/installation-scripts/tree/master/jafinn-galera-cluster

    Master repo gitlab: git clone https://git.shell5.dev/shell5dev/installation-scripts.git
    Master repo github (usually takes 5-10 minutes to sync with gitlab): git clone https://github.com/shell5dev/Installation-Scripts.git
    Tested on:

    • Fedora 29
    • Ubuntu 16.04 server edition
    • CentOS 7


  • @ajvn I'm actually running a Galera cluster directly on the hosts, as I could never get any of the Docker versions working properly. I'd prefer dockerizing it, as that makes migrating between hosts a lot easier.



  • @ajvn thank you, that looks like it could work. I'll try it out tonight once the kids are put to bed and give you some feedback. Appreciate the speedy delivery;)

    The only thing I'd "complain" about is remove.sh: I'd have it remove the cluster but prompt before pruning or leaving the swarm. I'd just comment those commands out before running it, but someone else might run it as is.


  • administrators

    @jafinn Thank you for checking it out. That remove script was mostly for my own use while testing, and there is a warning to be very careful when using it, as it will remove all unused containers, images, networks, and ALL the volumes. So yeah, be careful when running it 🙂
    Keep in mind that prune won't leave the swarm, and if you're running everything on the same host, that host will also be the manager/master.
    So if you want to rerun the script and redeploy everything without an issue, those are the commands you'll have to run.

    Let me know how the install script works when you have some time to check it out, thanks.
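
A minimal sketch of the prompting idea raised above, not the repo's actual remove.sh: it removes the stack, then asks before pruning and before leaving the swarm. The stack name `galera` is an assumption (inferred from the `galera_seed` service name used in this thread), and the `DOCKER` and `STACK` variables and the `confirm`/`cleanup` function names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical cleanup sketch; NOT the remove.sh from the repo.
# DOCKER can be overridden (e.g. DOCKER=echo) for a dry run.
DOCKER="${DOCKER:-docker}"
STACK="${STACK:-galera}"   # assumed stack name

# Ask a yes/no question. The prompt goes to stderr, the answer is read
# from stdin. Only "y" or "Y" counts as yes; anything else (or EOF) is no.
confirm() {
  printf '%s [y/N] ' "$1" >&2
  read -r answer || answer=""
  case "$answer" in
    y|Y) return 0 ;;
    *)   return 1 ;;
  esac
}

cleanup() {
  # Remove the services and networks that belong to the stack.
  $DOCKER stack rm "$STACK"

  # Destructive step: also deletes unused volumes, so ask first.
  if confirm "Prune unused containers, images, networks and volumes?"; then
    $DOCKER system prune --force --volumes
  fi

  # A manager node cannot leave the swarm without --force.
  if confirm "Leave the swarm?"; then
    $DOCKER swarm leave --force
  fi
}
```

Calling `cleanup` runs the steps interactively. Setting `DOCKER=echo` first prints the commands instead of running them, which is a cheap way to check what would happen on a host that is also the swarm manager.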



  • I did actually try that same image earlier but couldn't get it working properly. Now it seems to have stalled out with the seed node. I've had this for the last 20 mins:

    2019-03-30 18:44:19 11 [Warning] Aborted connection 11 to db: 'unconnected' user: 'system' host: '127.0.0.1' (Got timeout reading communication packets)
    

    I added this to the compose file for both services, but other than that it's pulled straight from the repo:

    deploy:
      placement:
        constraints:
          - node.labels.app == nextcloud
    

    I'm on Debian 9 and Docker version 18.09.4. I really don't think it should matter but I can spin up a fresh Ubuntu 16.04 instance since that's what you've tested it on.

    Edit:
    The lines in the log before it stalled out:

    Phase 2/7: Installing used storage engines... Skipped
    Phase 3/7: Fixing views
    Phase 4/7: Running 'mysql_fix_privilege_tables'
    Phase 5/7: Fixing table and database names
    Phase 6/7: Checking and upgrading tables
    Processing databases
    database
    information_schema
    performance_schema
    test
    Phase 7/7: Running 'FLUSH PRIVILEGES'
    2019-03-30 18:34:28 23 [Warning] 'user' entry '[email protected]' ignored in --skip-name-resolve mode.
    2019-03-30 18:34:28 23 [Warning] 'user' entry '@2977db8295c6' ignored in --skip-name-resolve mode.
    2019-03-30 18:34:28 23 [Warning] 'proxies_priv' entry '@% [email protected]' ignored in --skip-name-resolve mode.
    OK
    

  • administrators

    @jafinn The constraint doesn't work for me either. It won't get past the health check.



  • @ajvn Ah, ok. That explains why I couldn't get the image running initially either:) I'll try it without for the seed but I need to constrain it for the nodes themselves.


  • administrators

    @jafinn I'm not sure you can actually constrain the seed. It works for me if you add the constraint to the node section only. Since the seed is removed after the nodes are connected anyway, it should fit your use case. Try adding the constraint only to the node section and not to the seed.

    Edit:
    Like this:

    node:
        image: colinmollenhour/mariadb-galera-swarm:10.3-2018-11-07
        deploy:
          placement:
            constraints:
              - node.labels.app == nextcloud
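
A placement constraint such as `node.labels.app == nextcloud` only matches nodes that actually carry that label, so it can be worth confirming the label before deploying. This is a general check, not a diagnosis of the stall described above. A small sketch using `docker node update` and `docker node inspect`, to be run on a manager; the `DOCKER` variable, the function names and the node name `worker-1` in the usage note are placeholders.

```shell
#!/bin/sh
# Hypothetical helpers; DOCKER can be overridden (e.g. DOCKER=echo) for a dry run.
DOCKER="${DOCKER:-docker}"

# Attach the label app=nextcloud to one swarm node.
# $1: node name or ID as shown by "docker node ls".
label_node() {
  $DOCKER node update --label-add app=nextcloud "$1"
}

# Print the labels currently set on a node as JSON.
# $1: node name or ID.
show_labels() {
  $DOCKER node inspect --format '{{ json .Spec.Labels }}' "$1"
}
```

For example, `label_node worker-1` followed by `show_labels worker-1` should list `app` among the labels of that node if the update was accepted.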
    


  • @ajvn It seems to stop at the same place with

    2019-03-30 20:17:39 12 [Warning] Aborted connection 12 to db: 'unconnected' user: 'system' host: '127.0.0.1' (Got timeout reading communication packets)
    

    I'm not sure why it would freak out about being constrained?

    I've got 7 nodes in my current cluster and I'm doing this to get a failover if anything goes down. If I'm unable to place the database on the nodes I want it won't really work. But that's not your issue, you provided a script that seems to do what it should, thank you very much.


  • administrators

    @jafinn We would need your exact environment and the exact versions of everything to troubleshoot this, and it would be hard, if not nearly impossible, to reproduce with all the dependencies, services and containers you're running that we're unaware of. Sometimes logs aren't that useful, and it takes a trial-and-error approach with maximum verbose logging to get to the root cause of a problem.


  • administrators

    @jafinn Interesting, where are you getting that log from?



  • @ajvn the logs are pulled from docker

    docker service logs -f galera_seed
    

  • administrators

    @jafinn That's because the script scales the seed back to 0 after the 2 initial Galera containers are created. If we don't scale it back to 0, you won't be able to scale the cluster to more than 2 nodes. I just tested replication on a 5-node test cluster (it does take some time for the initial connection between them to happen), and it works great. I'm able to create tables on each one of them, and they replicate instantly to the other nodes in the cluster. I assume this is what you want, correct?
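
The scaling step described above can be sketched with `docker service scale`. This is an illustration under stated assumptions, not an excerpt from the install script: `galera_seed` is the service name shown earlier in the thread, while `galera_node`, the `DOCKER`, `SEED_SERVICE` and `NODE_SERVICE` variables and both function names are assumptions. It also assumes the first two nodes have already joined the cluster.

```shell
#!/bin/sh
# Hypothetical sketch; DOCKER can be overridden (e.g. DOCKER=echo) for a dry run.
DOCKER="${DOCKER:-docker}"
SEED_SERVICE="${SEED_SERVICE:-galera_seed}"
NODE_SERVICE="${NODE_SERVICE:-galera_node}"   # assumed service name

# Scale the seed down to 0, then scale the node service to the wanted size.
# $1: desired number of Galera nodes.
scale_cluster() {
  $DOCKER service scale "$SEED_SERVICE=0"
  $DOCKER service scale "$NODE_SERVICE=$1"
}

# Ask one Galera container which cluster size it sees.
# $1: ID of a container running on the local host. Assumes the mysql client
# inside the container can log in without extra flags; add -u/-p otherwise.
cluster_size() {
  $DOCKER exec "$1" mysql -e "SHOW STATUS LIKE 'wsrep_cluster_size'"
}
```

After `scale_cluster 5`, the `wsrep_cluster_size` status variable reported by any node should eventually match the number of running nodes once they have all joined.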



  • @ajvn I set up a new swarm just to test it, and it does work as intended. Something is breaking it when it's constrained (or there's something fishy in my other swarm...)


  • administrators

    @jafinn Hey Jafinn, glad to hear that it's working as intended in a separate swarm. Not sure if/how we can help you with your other swarm setup 🤔.



  • @ajvn No worries, you've done more than enough. I just need to figure out if I should just migrate to new machines or just keep using the locally installed database. I might try just replicating the old swarm onto new machines and try to run the script again.

    Thank you again for the script:)


  • administrators

    You are very welcome. Please let me know if you'd like help with scripting anything else. 👍

