Friday, 2 February 2018

Install DCOS Cluster

Over the past couple of months, DCOS has grown rapidly in popularity. It has moved from version 1.6 to 1.10, adding integrations with many applications along the way. It supports a range of tools from the Hadoop ecosystem, so that they can all be managed quickly and efficiently from a single console.

To install HDFS on DCOS we need a healthy cluster of at least 11 nodes, plus 1 external node on which to install the DCOS CLI. For better performance, or for a production environment, it is better to have at least 25 well-provisioned nodes.


Machine          Cores    RAM (GB)    HDD
BootstrapNode    >= 8     >= 16       40GB
MesosMaster1     >= 8     >= 12       40GB
MesosMaster2     >= 8     >= 12       40GB
MesosMaster3     >= 8     >= 12       40GB
MesosAgent1      >= 6     >= 6        40GB
MesosAgent2      >= 6     >= 6        40GB
MesosAgent3      >= 6     >= 6        40GB
MesosAgent4      >= 6     >= 6        40GB
MesosAgent5      >= 6     >= 6        40GB
MesosAgent6      >= 6     >= 6        40GB
MesosPublic      >= 6     >= 6        40GB
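
The minimums in the table can be checked on each machine before going any further. Below is a minimal sketch, assuming a Linux host with nproc and /proc/meminfo available; the helper names (meets_minimum, local_cores, local_ram_gb) are made up for illustration and are not part of DCOS.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: compare a machine's resources with the table minimums.

# meets_minimum CORES RAM_GB MIN_CORES MIN_RAM_GB
# Succeeds only when both values reach their minimum.
meets_minimum() {
    local cores=$1 ram_gb=$2 min_cores=$3 min_ram_gb=$4
    [ "$cores" -ge "$min_cores" ] && [ "$ram_gb" -ge "$min_ram_gb" ]
}

# Number of CPU cores visible to this machine.
local_cores() { nproc; }

# Total RAM in GB, rounded to the nearest whole number (approximate, because
# MemTotal is always slightly below the installed RAM).
local_ram_gb() {
    awk '/^MemTotal:/ { print int($2 / 1048576 + 0.5) }' /proc/meminfo
}
```

On a master node, for example, this could be used as: meets_minimum "$(local_cores)" "$(local_ram_gb)" 8 12 || echo "machine is undersized"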

With the machines above, the DCOS cluster has 1 Bootstrap node, from which all the packages are installed, 3 Master nodes, 6 Private Agent nodes and 1 Public Agent node. We can increase the number of HDFS data nodes as required, provided the DCOS cluster has enough Mesos agents.

DCOS can be installed either on-premise or on a cloud platform (AWS/Azure), on operating systems such as CentOS, RHEL and CoreOS. The steps given below have been verified on CentOS 7.2, both on-premise and on AWS.

Pre-requisites for DCOS Installation

Certain pre-requisite steps need to be followed on each of the machines listed in the above table:

1. Log in as root and create a new group and user. Give the user a UID of 1000 or above and set a password for the mesos user

              sudo su -
              groupadd -g 1001 mesos
              useradd -g mesos -u 1001 mesos
              passwd mesos

2. Edit the config file (as root) and reboot the system for the change to take effect

             vi /etc/selinux/config

                  :%s/SELINUX=enforcing/SELINUX=disabled/g

            :wq!
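
When the same change has to be made on 11 machines, a non-interactive edit can be easier than opening vi each time. Here is a small sketch, assuming GNU sed (as shipped with CentOS); disable_selinux is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# disable_selinux FILE: rewrite SELINUX=enforcing to SELINUX=disabled in FILE.
# The ^ anchor leaves commented lines and SELINUXTYPE untouched.
disable_selinux() {
    local file=$1
    sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' "$file"
}
```

Run as root it would be called as disable_selinux /etc/selinux/config, and the reboot is still needed afterwards.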

3. Set the hostname as below
            
            sudo hostnamectl set-hostname mesos-agent1.machine.com --static

4. Edit the network file and add the below lines

            vi /etc/sysconfig/network

                HOSTNAME=mesos-agent1.machine.com       (same name as in step 3)
                NETWORKING_IPV6=no
                IPV6INIT=no

           :wq!

Note: When creating the cluster in an EC2 environment, derive the hostname from the instance's complete private domain name
     E.g.: if the private domain name is ip-192.X.X.X.internal, assign ip-192.X.X.X as the hostname


5. Log in as root again and edit the sudoers file

           su - root

           vi /etc/sudoers

               #Search for "wheel" and add the below line

               mesos    ALL=(ALL)             NOPASSWD:ALL

          :wq!

6. Set the ulimit values

          ulimit  -n             32768
          ulimit  -u             60000

7. Edit the limits.conf file

          vi /etc/security/limits.conf

             #Append the below lines
             mesos      hard    nofile    65536
             mesos      soft    nofile    65536
             mesos      hard    nproc     65536
             mesos      soft    nproc     65536

        :wq!
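
If the steps are re-run on a machine, the same lines can end up in limits.conf more than once. The sketch below appends each line only when it is not already present; append_once and add_mesos_limits are hypothetical helper names, and the values are the ones from step 7.

```shell
#!/usr/bin/env bash
# append_once FILE LINE: append LINE to FILE unless an identical line exists.
append_once() {
    local file=$1 line=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# add_mesos_limits FILE: add the four mesos limits from step 7 to FILE.
add_mesos_limits() {
    local file=$1 kind item
    for kind in hard soft; do
        for item in nofile nproc; do
            append_once "$file" "mesos $kind $item 65536"
        done
    done
}
```

As root this would be called as add_mesos_limits /etc/security/limits.conf; calling it a second time leaves the file unchanged.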

8. Edit the /etc/sysctl.conf file and add the below lines at the end
         vi    /etc/sysctl.conf

             kernel.pid_max = 4194303
             net.ipv4.ip_local_port_range = 1024      64000
             net.ipv6.conf.all.disable_ipv6 = 1

         :wq!


9. Disable the firewall on all the machines

         sudo systemctl stop firewalld && sudo systemctl disable firewalld


10. Disable iptables


        sudo chkconfig iptables off

        sudo chkconfig ip6tables off

11. Add the IP and hostname of every machine to the hosts file on each machine in the DCOS cluster by executing the below command with the IPs of your cluster

        sudo tee -a /etc/hosts <<'EOF'
10.5.3.80        bootstrap.machine.com

10.5.3.10        mesos-master3.machine.com
10.5.3.11        mesos-master2.machine.com
10.5.3.12        mesos-master1.machine.com

10.5.3.20        mesos-agent6.machine.com
10.5.3.21        mesos-agent5.machine.com
10.5.3.22        mesos-agent4.machine.com
10.5.3.23        mesos-agent3.machine.com
10.5.3.24        mesos-agent2.machine.com
10.5.3.25        mesos-agent1.machine.com

10.5.2.40        mesos-public.machine.com
EOF

Note: In an EC2 environment, add the internal IP address, the private domain name and the short hostname as one complete line, as below
                  192.X.X.X    ip-192.X.X.X.internal      ip-192.X.X.X
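
It is easy to miss one entry when the hosts file is edited on 11 machines. A quick sketch for spotting missing names is shown below; missing_hosts is a hypothetical helper that only reads the file and does not test the network.

```shell
#!/usr/bin/env bash
# missing_hosts FILE NAME...: print every NAME that has no entry in FILE.
missing_hosts() {
    local file=$1 name
    shift
    for name in "$@"; do
        # A name counts as present when it appears as a whole field after
        # the IP address on a non-comment line.
        awk -v n="$name" '
            /^[[:space:]]*#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit found ? 0 : 1 }
        ' "$file" || printf '%s\n' "$name"
    done
}
```

For example: missing_hosts /etc/hosts bootstrap.machine.com mesos-master1.machine.com mesos-public.machine.com prints nothing when all three names are present.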

12. SSH key-based authentication is required so that the machines can communicate with each other without a password

         su - mesos
         ssh-keygen -t rsa            (Press Enter for default)

         #Press Enter again for the default (empty) passphrase
         cat /home/mesos/.ssh/id_rsa.pub >> /home/mesos/.ssh/authorized_keys
         ssh localhost     (Enter yes)

         exit        (connection should be closed)

Note: Copy the contents of id_rsa, id_rsa.pub and authorized_keys into the same files on each machine, under the same user and path and with the same permissions, so that each machine can talk to the others without a password. Reboot all the machines for the changes made in the above steps to take effect.
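
After the reboot it is worth confirming that passwordless login really works to every machine before starting the installer. Below is a minimal sketch; check_ssh and the SSH_CMD override are hypothetical names, while BatchMode and ConnectTimeout are standard OpenSSH options.

```shell
#!/usr/bin/env bash
# check_ssh USER HOST...: try a non-interactive login to every HOST.
# BatchMode=yes makes ssh fail instead of prompting for a password.
# SSH_CMD can be overridden (for example SSH_CMD=echo) for a dry run.
check_ssh() {
    local user=$1 host failed=0
    shift
    for host in "$@"; do
        if ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "$user@$host" true; then
            echo "ok: $host"
        else
            echo "FAILED: $host"
            failed=1
        fi
    done
    return "$failed"
}
```

From the bootstrap node, as the mesos user, this could be run as: check_ssh mesos mesos-master1.machine.com mesos-agent1.machine.com mesos-public.machine.com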

13. Update the yum repo and install certain packages on each machine in the cluster

        sudo tee /etc/yum.repos.d/docker.repo <<'EOF'
[dockerrepo]
name=Docker Repository
baseurl=https://yum.dockerproject.org/repo/main/centos/$releasever/
enabled=1
gpgcheck=1
gpgkey=https://yum.dockerproject.org/gpg
EOF


      sudo mkdir -p /etc/systemd/system/docker.service.d && sudo tee /etc/systemd/system/docker.service.d/override.conf <<'EOF'
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd --storage-driver=overlay
EOF



      sudo yum upgrade --assumeyes --tolerant && sudo yum update --assumeyes  

      sudo yum install -y mlocate.x86_64 tree screen net-tools.x86_64 bind-utils

      sudo yum -y install epel-release 

      sudo yum -y groupinstall "Development Tools"

      sudo yum -y install python-devel libcurl-devel cyrus-sasl-devel cyrus-sasl-md5 maven apr-devel subversion-devel curl wget ntp htop



      #Install Docker

      sudo yum install -y docker-engine-1.13.1 docker-engine-selinux-1.13.1

     #Start Services
     sudo systemctl start docker && sudo systemctl enable docker && sudo service ntpd start &&     sudo sysctl -w net.bridge.bridge-nf-call-iptables=1 && sudo sysctl -w net.bridge.bridge-nf-call-ip6tables=1 

     # Find the Java home path and export it in the bash profile
      readlink -f /usr/bin/java | sed "s:bin/java::"

      vi ~/.bashrc

               export JAVA_HOME=$(readlink -f /usr/bin/java | sed "s:bin/java::")

      :wq!


Note: The Java version must be >= 1.7
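
The version requirement can be checked with a few lines of shell. This is only a sketch: java_version_ok and installed_java_version are hypothetical helpers, and the parsing assumes the usual quoted version string that java -version prints, such as "1.8.0_151".

```shell
#!/usr/bin/env bash
# java_version_ok VERSION: succeed when VERSION (e.g. 1.8.0_151 or 11.0.2)
# is at least 1.7.
java_version_ok() {
    local major minor rest
    IFS=. read -r major minor rest <<< "$1"
    minor=${minor:-0}
    minor=${minor%%[!0-9]*}      # drop anything after the leading digits
    [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "${minor:-0}" -ge 7 ]; }
}

# installed_java_version: print the quoted version from `java -version`,
# which writes its report to stderr.
installed_java_version() {
    java -version 2>&1 | awk -F'"' '/version/ { print $2; exit }'
}
```

On a node it could be used as: java_version_ok "$(installed_java_version)" || echo "Java is older than 1.7"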


Additional Steps On BootStrap Node

Log in to the Bootstrap node and follow the below steps to launch DCOS

1. Go to the home directory, create a genconf directory and write the config file with the IPs of your cluster as below

          mkdir -p genconf && tee genconf/config.yaml <<'EOF'
agent_list:
- 10.5.3.20
- 10.5.3.21
- 10.5.3.22
- 10.5.3.23
- 10.5.3.24
- 10.5.3.25
bootstrap_url: file:///opt/dcos_install_tmp
cluster_name: DEMO-DCOS
exhibitor_storage_backend: static
ip_detect_public_filename: genconf/ip-detect-public
ip_detect_path: genconf/ip-detect
master_discovery: static
master_list:
- 10.5.3.10
- 10.5.3.11
- 10.5.3.12
public_agent_list:
- 10.5.2.40
process_timeout: 10000
resolvers:
- 8.8.8.8
- 8.8.4.4
ssh_key_path: genconf/ssh_key
ssh_port: 22
ssh_user: mesos
EOF


Note: This is a minimum configuration; it can be customized further as required
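
Before running the installer, the node counts in config.yaml can be compared with the cluster plan (3 masters, 6 private agents and 1 public agent, which is the 10 hosts the installer reports on later). The sketch below counts list entries with awk; it is not a YAML parser and assumes the simple one-entry-per-line layout used above. count_list_items and total_hosts are hypothetical helper names.

```shell
#!/usr/bin/env bash
# count_list_items FILE KEY: count the "- value" entries listed under "KEY:".
count_list_items() {
    awk -v key="$2" '
        $1 == (key ":")        { in_list = 1; next }   # start of the wanted list
        in_list && $1 == "-"   { n++; next }           # one list entry
        in_list && NF          { in_list = 0 }         # any other line ends the list
        END                    { print n + 0 }
    ' "$1"
}

# total_hosts FILE: masters + private agents + public agents.
total_hosts() {
    local file=$1 key total=0
    for key in master_list agent_list public_agent_list; do
        total=$(( total + $(count_list_items "$file" "$key") ))
    done
    echo "$total"
}
```

For the configuration above, count_list_items genconf/config.yaml master_list should report 3 and total_hosts genconf/config.yaml should report 10.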

2. Create the ip-detect script as below

        tee genconf/ip-detect <<'EOF'
#!/usr/bin/env bash
ip addr show | grep "inet " | grep -v 127.0.0. | head -1 | cut -d" " -f6 | cut -d/ -f1
EOF
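
The ip-detect pipeline prints the first non-loopback IPv4 address it finds in the ip addr show output. To see what it picks without touching a real machine, the same pipeline can be fed a canned sample, as in the sketch below. The sample text is made up for illustration, and first_ipv4 is a hypothetical wrapper name.

```shell
#!/usr/bin/env bash
# first_ipv4: read `ip addr show` output on stdin and print the first
# non-loopback IPv4 address (same pipeline as genconf/ip-detect).
first_ipv4() {
    grep "inet " | grep -v 127.0.0. | head -1 | cut -d" " -f6 | cut -d/ -f1
}

# Made-up sample; the "inet" lines are indented by four spaces, as in real
# output, which is why the address is field 6 for cut.
sample='1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536
    inet 127.0.0.1/8 scope host lo
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001
    inet 10.5.3.20/24 brd 10.5.3.255 scope global eth0'

printf '%s\n' "$sample" | first_ipv4    # prints 10.5.3.20
```

On a machine with several interfaces the first match may not be the address you want the cluster to use, so it is worth running genconf/ip-detect by hand on a node and checking the result.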

3. Download the installer script to the home directory

        curl -O https://downloads.dcos.io/dcos/stable/dcos_generate_config.sh

4. Execute the below commands

       sudo bash dcos_generate_config.sh --genconf &&
       sudo bash dcos_generate_config.sh --install-prereqs &&
       sudo bash dcos_generate_config.sh --preflight &&
       sudo bash dcos_generate_config.sh --deploy &&
       sudo bash dcos_generate_config.sh --postflight;
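
The && chain stops at the first failing stage but does not say which one failed. A small wrapper can make that explicit. This is only a sketch: run_stages and the RUNNER override are hypothetical names, and the stage flags are the same five used above.

```shell
#!/usr/bin/env bash
# run_stages INSTALLER: run each installer stage in order and stop at the
# first failure. RUNNER defaults to "sudo bash"; RUNNER=echo gives a dry run.
run_stages() {
    local installer=$1 stage
    for stage in --genconf --install-prereqs --preflight --deploy --postflight; do
        echo "==> $stage"
        ${RUNNER:-sudo bash} "$installer" "$stage" || {
            echo "stage $stage failed" >&2
            return 1
        }
    done
}
```

A dry run with RUNNER=echo run_stages dcos_generate_config.sh only prints the commands; without RUNNER set, the stages are executed for real.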

The below messages should be displayed upon successful completion.


====> EXECUTING INSTALL PREREQUISITES
====> START install_prereqs
====> STAGE install_prereqs
====> STAGE install_prereqs
====> STAGE install_prereqs
====> STAGE install_prereqs
====> STAGE install_prereqs
====> STAGE install_prereqs
====> STAGE install_prereqs
====> STAGE install_prereqs
====> STAGE install_prereqs
====> STAGE install_prereqs
====> OUTPUT FOR install_prereqs


====> ACTION run_preflight COMPLETE
====> SUMMARY FOR run_preflight
10 out of 10 hosts successfully completed run_preflight stage.
====> END OF SUMMARY FOR run_preflight



====> EXECUTING DC/OS INSTALLATION
====> START install_dcos


====> ACTION run_postflight COMPLETE
====> SUMMARY FOR run_postflight
10 out of 10 hosts successfully completed run_postflight stage.
====> END OF SUMMARY FOR run_postflight
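
When the installer output is saved to a file, the summary lines can be checked automatically instead of by eye. The sketch below relies only on the "N out of M hosts successfully completed ... stage." line format shown above; all_hosts_passed is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# all_hosts_passed STAGE: read installer output on stdin and succeed only if a
# summary line for STAGE exists and reports N out of M hosts with N equal to M.
all_hosts_passed() {
    awk -v stage="$1" '
        $2 == "out" && $3 == "of" && $NF == "stage." && $(NF - 1) == stage {
            seen = 1
            if ($1 != $4) bad = 1
        }
        END { exit (seen && !bad) ? 0 : 1 }
    '
}
```

For example, assuming the output was saved to a file named install.log: all_hosts_passed run_postflight < install.log || echo "some hosts failed postflight"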

5. Open a browser and go to the below URLs

        http://<master-public-ip>:8181/exhibitor/v1/ui/index.html
        http://<master-public-ip>/
        http://<master-public-ip>:8080



Log in to the DCOS console by selecting one of the account types and enjoy your DCOS cluster!



I hope you enjoyed the post and were able to set up the environment without any issues. If you run into problems, feel free to reach out; I am always ready to help troubleshoot.

For a custom setup that expands an existing DCOS cluster, stay tuned to my blog.




Reference:

https://docs.mesosphere.com/1.10/
