#328: The 18 Steps In A Software Development Workflow

Have you ever wondered how software gets developed and eventually pushed out for the world to use? Ever wondered what #devops looks like from a developer’s perspective? There are many ways to develop software and make the end product available to users, and depending on the design and complexity of the software, engineers or software teams have developed workflows that help them produce better software.

Below is my workflow for developing Geldzin, my perpetually evolving web application. Apart from an IDE in which I do actual code development (Eclipse), the other major components in this #devops environment are Vagrant+Virtualbox, Bitbucket, Jenkins, Artifactory, and of course the pre-production and production servers.

(1) Most projects start with access to a source code repository. Always fork your own copy of the project repository on the remote host. Access assumes you have the necessary SSH keys registered in your profile to allow Git access via SSH.

(2) To have the source code locally available for development, initially clone the forked project repository off “master”. If the source code already exists locally, set up the remotes needed and pull to synchronize local with remote.

(3) Whatever feature or issue you are working on, check out a local branch in which it will be developed. This allows you to work on a few different features or issues at a time, keeping the changes isolated until they are ready.
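As a minimal sketch of steps 1–3 in Git terms (the repository URLs and branch name here are hypothetical):

git clone git@bitbucket.org:myuser/geldzin.git              # step 2: clone your fork
cd geldzin
git remote add upstream git@bitbucket.org:team/geldzin.git  # track the main project
git pull upstream master                                    # synchronize local with remote
git checkout -b feature/issue-123                           # step 3: isolated local branch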

(4) I always have a virtual environment which I use to test or debug the application. I favor running applications within the IDE, but it is better to also test in a setup similar to production, with the expected dependencies in place (though without security, load-balancing, or caching).

(5) When you are done developing a feature or fixing an issue, merge the changes from the feature branch into the local “master” branch. Note that the local feature branches are not remotely linked.

(6) Resolve any merge conflicts and pull before you push the changes to your remote forked repository branch.
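A sketch of steps 5 and 6, again with a hypothetical branch name:

git checkout master
git merge feature/issue-123    # step 5: merge the feature into local master
git pull origin master         # step 6: resolve conflicts and synchronize first
git push origin master         # then publish to your forked repository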

(7) Create a pull request, which triggers notification to other team members to review the code changes.

(8) The changes are voted on, or comments are added that may require additional code changes (putting you back at #3). Code should not be integrated until it has been approved by the team.

(9) I usually have a Jenkins job off my forked repository “master” that will build and run unit tests. Code should not be integrated until it has passed tests and coding standards. The test profile here should not require a VM; it simply serves as a sanity check.

(10) Integrating approved changes into the project’s “master” is usually a task for the team lead. At this point, I merge pull requests from other developers into the main “master”, and close the pull requests.

(11) Nightly, a Jenkins job will grab all the changes from the day, build the software, initialize a virtual environment, test the software, and publish reports or documentation. Any failures cause a jump back to #3 for the offending developer.

(12) If the nightly Jenkins job is successful, code is automatically tagged with a version number and promoted to the “candidate” branch. Another round of testing can be done here if any cherry-picking (to exclude unwanted features) was done. Just remember to merge back to “master” (step #10).
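In Git terms, the promotion might look like the sketch below (the version number is hypothetical; in practice the Jenkins job computes and applies it):

git tag -a v1.4.2 -m "Nightly build"   # hypothetical version number
git checkout candidate
git merge master                       # promote the tagged code
git push origin candidate --tags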

(13) If all is well in #12, another Jenkins job immediately builds the software, posts the product in Artifactory with the appropriate version tag, and triggers #14 below.

(14) The Jenkins job in #13 above also deploys the software into a pre-production environment where quality analysis and stakeholder preview can happen. Any feedback from this point that needs changes causes a jump back to #3.

(15) At the appointed time for release, code is promoted to the “production” branch. This is our golden version that is released. I used to do build optimizations here, but I only generate production configurations/settings these days.

(16) The software is then deployed into production, and users have access to the latest and greatest. Notice that no further building of the software should happen here: the artifact from #13 should be imported instead.

(17) The process doesn’t end here for a developer: you usually need to update your development environment with all the changes from the latest release, so you fetch them from upstream.

(18) You finally merge the fetched changes into your local “master”, ready for the next feature or issue you will be working on. At this point, the cycle proceeds from #3 onwards.
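A sketch of steps 17 and 18, assuming the main project repository is registered as the “upstream” remote:

git fetch upstream             # step 17: fetch the release changes
git checkout master
git merge upstream/master      # step 18: update your local master
git push origin master         # keep your fork synchronized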

As a developer, you always need to be aware of the distinct parts of the workflow: the #dev part is obviously what you do in development, and the #ops part is how your software ends up in operation. Knowing this forces you to design your projects for testability and externalize configuration settings, for the software to flow well in a #devops pipeline.

#327: Eight Concepts for Building Better Software

You won’t believe how many software projects I have worked on that are monoliths of source code! There is no concept of artifact reuse, maintenance is a headache, testing is a nightmare, and the side effects of changes in one area are sorely felt all over the application. Don’t even mention trying to upgrade or extend the application; the importance of good application architecture is paramount.

That is why over the years I have adopted eight principles in my projects. Below is a common pattern for standard web applications that I am using in my perpetual development project (Geldzin), showing the components and their dependencies from a user perspective.

(1) The application is accessible by three kinds of actors: regular users (User), Jenkins (for #devops), and other applications (Integration). The principle here is to design your applications for various kinds of access.

  1. Regular users interface through a desktop UI (browser-based), a mobile app, or other documents and reports that are produced by the application.
  2. Jenkins is used for building, testing, and deploying the software. This user mostly interacts with the application from the #devops perspective.
  3. The Integration actor is basically programmatic access to the application via web services (SOAP/REST) or other means. I will be using a Mule ESB integration to manage the application, perform special operations, feed in externally-sourced data, or obtain data for analysis. My whole “big data” experiment will live in Mule, alongside other integrations the application will need, such as updating my stock portfolio or feeding in results of web crawlers.

(2) All interaction with the application is proxied through Apache HTTP Server in production or Nginx elsewhere. Security is enforced at that point (SSL, authentication, filtering), and conceivably redirection to the various servers running each of the modules, or to load-balanced instances running the whole application. We could also implement caching at this point, if needed, as well as tap into external security mechanisms such as CAS or Federation (to facilitate SSO or multi-factor authentication). The principle here is that proxying, security, load-balancing, and caching should not exist in the application code, and should be separately configurable without needing to rebuild the application.

(3) The project itself is standard Maven, but multi-modular. Each of the modules (Webservice API, Webapp, Mobile app, and Reporting/Documents) is a subproject of the overall “parent” application project. These modules depend on the Core subproject, which itself depends on the Common subproject. All plugins and libraries that will be used in the project are defined in the parent project. The principle here is to promote code reuse, separate concerns, set up for easier maintenance, and produce artifacts that can be reused elsewhere.
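As an illustration of the multi-module layout, the parent POM might declare its subprojects as below (module names are illustrative, not the project’s actual artifact IDs):

<!-- parent pom.xml: subproject declarations (illustrative names) -->
<modules>
  <module>common</module>
  <module>core</module>
  <module>webservice-api</module>
  <module>webapp</module>
  <module>mobile</module>
  <module>reporting</module>
</modules>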

(4) Notice that there is only one place where all the business logic lives (in Core). This module is logically layered: DAOs and messaging in the backend, and business services on the front that provide an API over which the dependent modules interface. [See: DAO pattern].

(5) Separating out common utilities and resources (in Common) is a recent practice in my world, for three reasons: (i) it hosts DTOs that will be used by actor-facing modules, preventing explicit use of backend entity and model classes, (ii) it presents or consumes data the way it is used in actor-facing modules, independent of how it is conceived in the backend, and (iii) it provides a single packaging of common code or resources used across all modules. [See: DTO pattern].
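A minimal Java sketch of the idea, with hypothetical class names: the entity stays behind Core’s service API, while the DTO in Common is all that actor-facing modules see.

// Backend entity: persistence-specific, never exposed to actor-facing modules
public class AccountEntity {
    private Long id;
    private java.math.BigDecimal balance;
    // ... persistence fields and accessors
}

// DTO in Common: shaped for how the data is presented or consumed
public class AccountDto {
    private final String accountNumber;
    private final String displayBalance;   // pre-formatted for the UI

    public AccountDto(String accountNumber, String displayBalance) {
        this.accountNumber = accountNumber;
        this.displayBalance = displayBalance;
    }

    public String getAccountNumber() { return accountNumber; }
    public String getDisplayBalance() { return displayBalance; }
}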

(6) The Data component needs no further explanation; it is basically a database or file system. In practice, most of my data lives in a database, but items such as email templates or produced documents are stored on the file system. The principle here is to use the proper storage medium for your data.

(7) The Messaging component is how the application itself communicates with the outside world. In this application, I am using it to send error messages to a JMS queue monitored by another of my Mule integrations, and to send out emails or text messages to users. This facility will be extended to communicate with, say, Google Calendar, to obtain users’ tasks/events, or their banks to synchronize transactions.

(8) The Devops package at the bottom is where the build/test/deploy scripts live. These scripts provision and configure a server on which the application will run, build and deploy the software, and perform tests, including spinning up virtual environments if needed. Project configuration should also live in this space, including strategies for generating the property/config files needed by the application. I’ve seen situations where test suites also live here, but it is not my practice.

When you start a project along these principles, you ought to begin with the Devops package as part of the initial scaffolding, because it sets you up nicely for Agile incremental/regression build/test. Personally, I develop applications backend-to-frontend, and have found it useful to do Test-Driven Development when developing the backend, then do functional/behavioral development and testing when developing the actor-facing/UI modules.

#326: Build and Test Software in a DevOps Environment

Your software projects must have three characteristics, at a minimum, in order to be conducive to a continuous integration strategy that includes automated build and testing:
(1) they should be source-controlled in an SCM repository where automated CI tools can access code and scripts.
(2) the devops environment must already have all necessary resources required to build a testable artifact of the software.
(3) the project should define or follow a standard way to build and test the software.

As an example, I have a perpetually-evolving web application project (Geldzin) built in Java. I use Bitbucket for SCM, which satisfies the first requirement. I use Maven for build and dependency management (third requirement), therefore the project itself follows common Maven practice. I have a devops environment driven by Jenkins that has Java, Maven, Ansible, Vagrant+Virtualbox, MySQL, and Git preinstalled. The Jenkins job will build and test the software whenever there are changes. It is probably a good idea for each developer’s branch to be associated with their own Jenkins job, and only merge into the main branch after successful build/test.

To build and test my application, I use Ansible to prepare the test environment. This includes setting up the database that will be used (refreshed from production), and preparing the properties needed.

[Geldzin4DEV] $ /usr/bin/ansible-playbook /data/jenkins/workspace/Geldzin4DEV/ansible/test-preparation.yml -i /data/jenkins/workspace/Geldzin4DEV/ansible/ansible.inventory -f 1 --vault-password-file /etc/ansible/vault_password
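For illustration, such a preparation playbook might look like the sketch below (the host group, task names, and variables are hypothetical, not the project’s actual playbook):

---
# test-preparation.yml (illustrative sketch)
- hosts: build_servers
  become: yes
  tasks:
  - name: Refresh the test database from a production dump
    mysql_db:
      name: "{{ test_db_name }}"
      state: import
      target: "{{ production_dump_file }}"
  - name: Generate the properties file the tests expect
    template:
      src: templates/test.properties.j2
      dest: "{{ test_properties_file }}"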

Then I launch a Maven build that will use Cobertura during test, along with FindBugs and other code-analysis tools, as well as generate code documentation (JavaDocs). The unit tests are TestNG-driven, covering the Spring services (and in effect the data/persistence layer behind the services, all the way to the database). I do not call Maven tests directly because Cobertura does it after instrumenting the source code for analysis.

/usr/shared/maven/bin/mvn -gs /usr/shared/maven/conf/settings.xml clean cobertura:cobertura site:site site:stage -Pjenkins_sanity

What happens during build and test is largely defined in a Maven profile.

<profile>
  <id>jenkins_sanity</id>
  <properties>
    <log4j.geldzin.core>DEBUG</log4j.geldzin.core>
    <log4j.geldzin.webapp>DEBUG</log4j.geldzin.webapp>
  </properties>
  <reporting>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-site-plugin</artifactId>
        <version>3.4</version>
        <configuration>
          <reportPlugins>
            <plugin>
              <groupId>org.apache.maven.plugins</groupId>
              <artifactId>maven-project-info-reports-plugin</artifactId>
            </plugin>
          </reportPlugins>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-javadoc-plugin</artifactId>
        <version>2.10.3</version>
        <configuration>
          <aggregate>true</aggregate>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-jxr-plugin</artifactId>
        <version>2.5</version>
      </plugin>
      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>cobertura-maven-plugin</artifactId>
        <version>2.7</version>
        <configuration>
          <aggregate>true</aggregate>
          <formats>
            <format>html</format>
            <format>xml</format>
          </formats>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>findbugs-maven-plugin</artifactId>
        <version>3.0.3</version>
        <configuration>
          <excludeFilterFile>../findbugs/exclusions.xml</excludeFilterFile>
        </configuration>
      </plugin>
    </plugins>
  </reporting>
</profile>

After testing, an Ansible script will clean up the test environment, deleting the database that was set up earlier and clearing the properties.

/usr/bin/ansible-playbook /data/jenkins/workspace/Geldzin4DEV/ansible/test-conclusion.yml -i /data/jenkins/workspace/Geldzin4DEV/ansible/ansible.inventory -f 1 --vault-password-file /etc/ansible/vault_password

One important metric to keep track of is how long it takes to build and test your projects, and how much memory is used. The build and test process itself can become inefficient, affecting the overall performance of the CI strategy.

[INFO] ------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] Geldzin ................................ SUCCESS [ 32.065 s]
[INFO] Geldzin Core ........................... SUCCESS [03:14 min]
[INFO] Geldzin Web Application ................ SUCCESS [01:37 min]
[INFO] ------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------
[INFO] Total time: 05:35 min
[INFO] Finished at: 2017-01-27T03:19:36-07:00
[INFO] Final Memory: 81M/808M
[INFO] ------------------------------------------------------------

#325: Setup Jenkins Jobs with Ansible

Now that we have a fully provisioned #devops environment, it is time to begin making use of it in our software development process. (See #324: Provisioning with Ansible, Part 9 – Ansible for the conclusion of the provisioning series).

For the most part, my build/test/deploy strategy loosely follows modern continuous integration (CI) practice. All my projects are source-code-versioned on a Git host (Github or Bitbucket), and I use Jenkins for continuous integration. In Jenkins, each developer has a job that builds and tests their code as soon as it is committed into their branch, itself forked off “master”. If everything passes, the developer then creates a pull request to merge their changes into master.

So you don’t have to manually set up these build jobs in Jenkins, we can have Ansible set them up when/after it provisions the Jenkins server, or as needed, to promote consistency. I prefer separating the job setup from Jenkins server provisioning, so we’ll create a new playbook just for managing Jenkins jobs.

(1) The playbook is jenkins.yml. It’ll run on servers in the “jenkins_servers” group and use variables specified in settings.yml. It’ll set up the “jenkins” user as a sudoer, and then configure each defined job.

(2) Jobs are defined in their own directories under the jenkins/ directory. For example, I have the “Geldzin4DEV” job defined, composed of tasks (*/job.yml), variables (*/variables.yml), and configuration (in */templates/config.xml.j2). The idea here is to have a standard layout and nomenclature of files so we can abstract the play as much as possible.

(3) Setting up a job is as simple as processing the */templates/config.xml.j2 and saving it remotely in the Jenkins installation’s jobs/ directory as config.xml. When you restart Jenkins, it’ll pick up the job and ready it for building.
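The task itself could be as simple as the sketch below (paths, variable names, and the restart handler are hypothetical):

- name: Ensure the job directory exists
  file:
    path: "{{ jenkins_home }}/jobs/{{ job_name }}"
    state: directory
    owner: jenkins
    group: jenkins

- name: Render the job configuration from its template
  template:
    src: "{{ job_name }}/templates/config.xml.j2"
    dest: "{{ jenkins_home }}/jobs/{{ job_name }}/config.xml"
  notify: restart jenkins   # assumes a handler that restarts the service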

(4) As specified in the job’s config.xml, we take advantage of plugins we installed during Jenkins provisioning. The job does three things:

  1. Uses Ansible to setup the test environment, including creating a properties file and configuring the database as needed for test.
  2. Uses Maven to run the tests.
  3. Uses Ansible again to clean up, including deletion of any temporary files, the properties file, and the database that was created before.

(5) It is expected that the project being tested define all resources needed to complete this testing. The Ansible scripts and Maven profiles are defined in the projects we intend to build and deploy.

I should note that Geldzin is a simple financial management web application that my family and friends have used since about 2013. I developed it as a replacement for the Excel spreadsheets I used to manage my finances, but have also since used it as a proving ground for various technologies I take interest in.

You can review the sources for setting up build jobs in Jenkins with Ansible in my Github/09_BUILDS branch.

#324: Provisioning with Ansible, Part 9 – Ansible

Since I intend to use Ansible for deployments of software into the various tiers (development, integration, production), the last thing I will provision in the #devops environment is Ansible itself. In #316: Install Ansible on CentOS7, I had installed it manually but will now roll it into the general provisioning of the environment, and stick with a specific version.

(1) In the local inventory (ansible.inventory), define a new role “ansible” for build servers on which Ansible will be installed.

(2) Set up the role with a directory layout similar to what’s in the image below.

The role’s playbook (tasks/main.yml) first checks whether Ansible is already installed, and if not, proceeds with its installation. It uses variables in vars/main.yml (unencrypted) and vars/secrets.yml (encrypted).
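The check-then-install pattern might look like the sketch below (the version variable is hypothetical):

- name: Check whether Ansible is already installed
  command: ansible --version
  register: ansible_installed
  ignore_errors: yes
  changed_when: false

- name: Install the pinned Ansible version when absent
  yum:
    name: "ansible-{{ ansible_version }}"
    state: present
  when: ansible_installed.rc != 0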

The secret here is the Ansible Vault password, which will be written to all build servers. The strategy is that sensitive data in source code is encrypted using this same Vault password across the board, and that the build servers should be able to decrypt the data in flight during tier configuration, builds, and deployments.

Vault is a huge bonus for Ansible in the world of infrastructure automation because it is part-and-parcel of the framework, whereas secret encryption is not as natively integrated in Chef or Puppet. I also like that secrets are only decrypted in memory (and fast), eliminating the chance that temporary files with the decrypted secrets could linger post-process in the build environment. Beware that logging should be tweaked to not print secrets to logs and consoles during the process.

With Ansible installed, we have a system fully provisioned and ready to perform #devops duties, which will include obtaining source code from a Git repository, building the software and running tests in Jenkins, and deploying the final product to the designated tier as needed.

The complete sources can be reviewed from my Github/08_ANSIBLE branch.

#323: Provisioning with Ansible, Part 8 – Vagrant and VirtualBox

Before software is deployed to production, I often do integration, performance, load, and UI tests in a pre-production environment. The strategy involves building the applications, spinning up a virtual environment as similar to production as possible, deploying the applications to the environment, and running tests to establish benchmarks of how the applications will behave in production. By virtual environments, I mean Vagrant and VirtualBox, which we will now have provisioned on the #devops machines.

(1) In the local inventory (ansible.inventory), create a new group that will list the servers on which VirtualBox and Vagrant will be installed. I am calling this group “virtual_machines”.

[virtual_machines]
devops_vm

(2) We want all our build servers to have these components installed, so we add a role to that grouping, along with other roles we had specified before.

# Build servers
- hosts: build_servers
  become: yes
  gather_facts: yes
  roles:
  - role: java
    tags: java
  - role: maven
    tags: maven
  - role: git
    tags: git
  - role: vm
    tags: vm

(3) Now we can set up the “vm” role the usual way, with a directory layout similar to what’s in the image below.

The role’s playbook (tasks/main.yml) first checks whether VirtualBox is already installed, and if not, proceeds with its installation (tasks/virtualbox.yml). Then it checks whether Vagrant is installed, and if not, proceeds with its installation (tasks/vagrant.yml).

VirtualBox was trickier to install because it requires the kernel-devel package, without which you cannot instantiate virtual machines. My setup includes the package (files/kernel-devel-3.10.0-327.36.3.el7.x86_64.rpm), which is installed via YUM before installing VirtualBox.

Vagrant is a more straightforward installation using YUM. We essentially ask YUM to install the RPM of the specific version we want.
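A sketch of that task (the version variable and download URL pattern are assumptions based on HashiCorp’s release site):

- name: Install the specific Vagrant version from its RPM
  yum:
    name: "https://releases.hashicorp.com/vagrant/{{ vagrant_version }}/vagrant_{{ vagrant_version }}_x86_64.rpm"
    state: present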

The complete sources can be reviewed from my Github/07_VM branch. At this point, you’ll have a provisioned system that has Java, Maven, MySQL, Artifactory, Git, Jenkins, and now Vagrant+VirtualBox installed.

#322: Provisioning with Ansible, Part 7 – Jenkins

In #312: Install Jenkins on CentOS7, I manually installed Jenkins in a CentOS 7 devops environment. I will now use Ansible to automate the process.

(1) In the local inventory (ansible.inventory), create a new group that will list the servers on which Jenkins will be installed. I am calling this group “jenkins_servers”.

[jenkins_servers]
devops_vm

(2) Then update the playbook (provision.yml) with the role and hosts for which Jenkins will be installed, which I am calling simply “jenkins”.

# Jenkins servers
- hosts: jenkins_servers
  become: yes
  gather_facts: yes
  roles:
  - role: jenkins
    tags: jenkins

(3) If you are using Vagrant+VirtualBox as a test environment, such as my use of host “devops_vm”, you need to specify the port from which Jenkins can be accessed on the local/host machine. So, in the Vagrantfile, add the line below, which simply means that on my laptop (which is hosting the VM), I can just hit http://localhost:28081 to access the Jenkins running in the VM.

devops.vm.network :forwarded_port, guest: 18082, host: 28081, id: 'jenkins'

(4) The various Jenkins hosts can have different settings for the instance of Jenkins running on them. So I take advantage of host_vars to configure these settings. For example, to host_vars/devops_vm I add:

# Jenkins
jenkins_admin_email: 'TEST Jenkins <jenkins@strive-ltd.com>'
jenkins_external_url: http://{{ ansible_default_ipv4.address }}:{{ jenkins_http_port }}

(5) Now build the role, with the following file layout:

(6) One of the differences between this automated provisioning of Jenkins and the manual installation I had done before is that I further customize and pre-configure the instance with the plugins I will likely need later. So tasks/configure.yml will do the customization, and tasks/plugins.yml will set up the plugins. Furthermore, the Jinja2 templates/*.j2 will further pre-configure Jenkins users, security, and plugins.

(7) The vault-encrypted vars/secrets.yml declares the following variables (values omitted here):

---
# Admin user
jenkins_admin_username:
jenkins_admin_fullnames:
jenkins_admin_apitoken:
jenkins_admin_password:
jenkins_admin_password_hash:
jenkins_admin_email:

# Mailer plugin
jenkins_mailer_username:
jenkins_mailer_password:
jenkins_mailer_smtp_host:
jenkins_mailer_smtp_ssl:
jenkins_mailer_smtp_port:

# Public key
jenkins_public_key:

# Private key
jenkins_private_key:

# Credentials plugin
jenkins_credentials_username:
jenkins_ssh_keys_dir:
jenkins_ssh_private_key_file:
jenkins_ssh_public_key_file:

One of the strategies here is to have the same private/public SSH key for all the Jenkins installations so that you will only need to add one SSH key in your Git repositories for any of the Jenkins jobs to access. We configure the instance’s credentials to use this SSH key when we customize the Credentials plugin.

Of course you can further set up Jenkins build jobs and email templates, etc., but at this point I only care about having a functioning Jenkins server, ready with plugins and security (users and credentials) for projects down the road. Build jobs will be dealt with in another post.

The complete sources can be reviewed from my Github/06_JENKINS branch. At this point, provisioning will give you an evolved system incrementally installed with Java, Maven, MySQL, Artifactory, Git, and now Jenkins.