2016. december 8.

JIRA - Confluence connector in PHP - Part I


For long time, our partner has been using Atlassian JIRA and Confluence to manage their projects, and to organize and discuss work with their team.


Quality Score of JIRA projects

A few people started to extract certain key metrics from JIRA in order to be able to quantify the quality across the ongoing and historical projects. The metrics were used to calculate a single Quality Score (QS) value, which clearly indicates the progress and health of any given project, without requiring the understanding of intricate details within each project. Additionally, historical analysis of this metric allows management to use to compare and validate internal changes (e.g. a change of team composition, introduction of new testing approaches etc are directly measurable via the Quality Score).

Initially this process was performed manually over the course of a few days monthly, so it was an excellent candidate for replacing with an automated workflow.

The following blog posts will describe how we have extracted information from JIRA, how we implemented the logics and processing in the middleware, and then integrated with Confluence, to generate a human digestible report.

Get data from JIRA

The task

I had to calculate a Quality Score (QS) of selected products consisting of a few projects and present the calculated score and data backing the calculations in Confluence pages. It would be a plus to have a Table of contents with links to the corresponding pages.  

The first question that arose was how to get data out of JIRA. Fortunately it has a moderately documented REST API, so the answer came easily.

The solution

After reading the documentation I decided to use json and php curl.

I planned to go along with the following tasks:

  • Create a base class to connect JIRA and execute my queries.
  • Create a Config class to manage (read and write) configuration data. 
  • Create a webpage to set up parameters, like username, password needed to connect JIRA and Confluence, products details etc. and press a button to start the process. 
  • It was their responsibility to ensure the security of this page so I didn’t have to worry about it.
  • Create classes for each Score - needed to the equation - to extract data from JIRA and calculate the Quality Score.
  • Create pages in Confluence for each time interval, for each project and product, including a table of contents.
    • Create a class to handle managing (creating, updating and deleting) pages in Confluence.

The calculation was not only a complicated but also a time consuming one. I had to send more than 40 queries to collect all data needed for processing the QS for one project:

  • Get the statuses of the issues belonging to the selected project
    • Open
    • In Progress
    • Reopened
    • Ready to Release
    • etc.
  • Get the resolution of issues such as:
    • Cannot Reproduce
    • Duplicate
    • Bug, but won’t fix
    • etc.
  • Get priority of issues
    • Critical
    • High
    • Urgent

When I had all the necessary data I had to calculate the Assumed Risk Score, the Average Age of an issue, the Defect Effectiveness Score, Escaped Defects, Regression and many more. I also had to calculate a so called Confidence Modifier. The Confidence Modifier is used to qualify a score and give it weight.  The higher the number, the more confidence in the score we have.  Low confidence means that we should take the resulting score with a grain of salt in that it is derived from a small set of data and may not remain true given larger volumes of data.  Higher Confidence Modifiers demonstrate that the dataset used to derive the score is sufficiently large to believe what the data is telling us.

I had to do the calculation for all projects, all products, and for every quarter and every year for four years back and for lifetime. 

Initially we had seven products consisting of a total of 30 projects. This means that to have a QS calculated for one project only, I had to send 40 * (4 years * 4 quarters + 4 years + 1 lifetime) queries against jira. Multiply it with 30 projects and 7 products and we will get a number around 30 thousand. 

It was clear that this task couldn’t be run in a web browser; however, the client needed an admin page to set up the parameters and start the process.

When he pressed the start button, it started a background process to do all the calculations:

  1. if (isset($_POST['start'])) {
  2.  $msg = '<div class="message">';
  3.  $msg .= '<p>Processing data has just started. It might take longer than one hour.</p>';
  4.  $msg .= "<p>An email notification will be sent to the email address provided in settings when it's finished.</p>";
  5.  $msg .= '</div>';
  7.  $cmd = "php process_data.php $date > /dev/null &";
  8.  exec($cmd);
  9. }

Pay attention to  & at the end of line. This way we can execute the command without waiting to finish it so the web browser won’t send a “Maximum execution time of 30 seconds exceeded” message.

After I started to run my queries, I realized that the connection was sometimes lost and in such cases I had to start the calculation from the beginning because I needed all the figures for my equation. Therefore, in case of error, I added a loop to repeat the request five times and wait an increasing amount of time, each time.

This way the lost-connection error was solved.

  1. public function exec($url) {
  2.   $username = USER_NAME;
  3.   $password = USER_PASSWORD;
  5.   $curl = curl_init();
  6.   curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
  7.   curl_setopt($curl, CURLOPT_USERPWD, "$username:$password");
  8.   curl_setopt($curl, CURLOPT_URL, $url);
  9.   curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);
  10.   curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, 0);
  11.   curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 0);
  13.   $response = curl_exec($curl);
  14.   // If no response, repeat the request.
  15.   if (empty($response)) {
  16.     $seconds = 5;
  17.     $i = 1;
  18.     $count = 5;
  19.     while (empty($response) && $count >= 0) {
  20.       sleep($i * $seconds);
  21.       $response = curl_exec($curl);
  22.       $i++;
  23.       $count--;
  24.     }
  25.   }
  26.   $result = $this->getResult($response, $curl, $url);
  27.   return $result;
  28. }

Just for the record, the getResult function does nothing else, just handles the possible errors and, in case of success, returns the result as an array.

  1. private function getResult($response, $curl, $url) {
  2.   // If request failed.
  3.   if (!$response) {
  4.     $http_response = curl_getinfo($curl, CURLINFO_HTTP_CODE);
  5.     $body = curl_error($curl);
  6.     curl_close($curl);
  7.     // The server successfully processed the request, but is not returning any content.
  8.     if ($http_response == 204) {
  9.       return '';
  10.     }
  11.     $error = 'CURL Error (' . get_class($this) . ")\n
  12.       url: $url\n
  13.       body: $body";
  14.     throw new Exception($error);
  15.   }
  16.   else {
  17.     // If request was ok, parsing http response code.
  18.     $http_response = curl_getinfo($curl, CURLINFO_HTTP_CODE);
  19.     curl_close($curl);
  21.     // don't check 301, 302 because setting CURLOPT_FOLLOWLOCATION
  22.     if ($http_response != 200 && $http_response != 201) {
  23.       $error = "CURL HTTP Request Failed: Status Code :
  24.            $http_response, URL: $url
  25.           \nError Message : $response";
  26.          throw new Exception($error);
  27.     }
  28.   }
  29.   $result = json_decode($response, TRUE);
  30.   return $result;
  31. }

How to build the queries:

  1. private function initQuery($query = array()) {
  2.   $host = YOUR_HOST;
  3.   $api_uri = YOUR_API_URI;
  5.   $result = $host . $api_uri . '/';
  6.   if (!empty($path)) {
  7.     $result .= $path;
  8.   }
  9.   if (!empty($query)) {
  10.     $result .= '?' . http_build_query($query);
  11.   }
  12.   $this->query = $result;
  13. }

Where $query usually looks like this:

  1. $query = array(
  2.   'jql' => '',
  3.   'fields' => '*none',
  4.   'startAt' => 0,
  5.   'maxResults' => 0,
  6. );

For ‘jql’ I usually put together a query on JIRA UI then passed it to the query.  

fields =*none’ because I only needed the count of results. For the same reason, startAt and maxResults are 0.

But if you want to select some fields and you expect more than, let’s say 500 records, as a result you might do something like this:

First we need to set up our query:

  1. $query = array(
  2.   'jql' => '',
  3.   'fields' =>  'created, resolutiondate', // select created and resolutiondate fields
  4.   'startAt' => 0,
  5.   'maxResults' => 500,
  6. );

To get all issues just create a function like this:

  1. protected function getAllProjectIssues($query, $total_count) {
  2.   $result = array();
  3.   $max_results = 100;
  4.   for ($i = 0; $i <= $total_count; $i += $max_results) {
  5.     $query['startAt'] = $i;
  6.     $query['maxResults'] = $max_results;
  7.     $url = $jira->createUrl('search', $query);
  8.     $response = $jira->exec($url);
  9.     $result = array_merge($result, $response['issues']);
  10.   }
  11.   $this->setIssues($result);
  12. }

When running the calculations the first time it took slightly more than 24 hours to get the results. This was, of course, unacceptable, so, first I reduced the number of queries to ten thousand by eliminating the redundant queries. But it still took too long, so I started to think about multiprocessing.

I separated the calculation of one single score to a get_score.php file and saved the results into a .json file with a unique filename where the name is based on timeframe ($indicator), the product ($product_id) and project ($project_id). This file holds an array with the results of the 40 queries necessary to calculate the QS of one project within a timeframe (quarter, year, lifetime) and the results’ calculation is based on the results of queries. Saving the results of one single calculation to a file saved me time: when I occasionally restarted the process I was looking for the file with results. If it existed, then I skipped that calculation.

In the process_data.php file in a foreach loop I called the get_score.php file with the corresponding parameters such as $product_id, $project_id and $indicator. 

  1. try {
  2.   $fname = isset($product_id) ? $product_id : '';
  3.   $fname .= isset($project_id) && is_string($project_id) ? "_$project_id" : '';
  4.   $fname .= isset($indicator) ? "_$indicator" : '';
  5.   $cmd = "php get_score.php $date $indicator $product_id &";
  6.   // We just echo the command to inform the user about the progress.
  7.   echo $cmd . "\n";
  8.   // We check whether the calculation with the given parameters has been already done.   
  9.   if (!file_exists($json_dir_path . DIRECTORY_SEPARATOR . $fname . '.json')) {
  10.     $descriptor = array(
  11.       // stdin is a pipe that the child will read from
  12.       0 => array("pipe", "r"),
  13.       // stdout is a pipe that the child will write to
  14.       1 => array("pipe", "w"),
  15.       // stderr is a file to write to
  16.       2 => array('file', $dir_root . DIRECTORY_SEPARATOR . 'error.log', 'a'),
  17.     );
  18.     $pipes = array();
  19.     // If no file exists with results then we start our calculation.
  20.     $process = proc_open($cmd, $descriptor, $pipes);
  21.     $result = array();
  22.     if (is_resource($process)) {
  23.       fclose($pipes[0]);
  24.       fclose($pipes[1]);
  25.       $return_value = proc_close($process);
  26.     }
  27.     sleep($sleep);
  28.   }
  29. }
  30. catch (Exception $exc) {
  31.   echo $exc->getTraceAsString();
  32. }

There are many examples on the internet about starting processes with the help of the proc_open function. You can find the essence of my solution in this row, pay attention to the & mark at the end of the row:

$cmd = "php get_score.php $date $indicator $product_id &";

The lone & at the end causes the program to be executed in the background. This means that you get a shell prompt back immediately. Without the &, you would get a shell prompt back only when the program finishes executing.

When I started to test this solution the first time, the client’s JIRA server was not prepared to handle about ten thousand requests within a second and considered it as a DOS attack, so I had to add a delay of ten seconds between all calculations with this command:


The value of $sleep variable is set up by configuration. This way the total time needed for calculations was reduced to one hour, which was acceptable to the client.

In the second part I  will talk about creating HTML pages in Confluence using PHP and Confluence REST API. 
If you cannot wait until that, the JIRA and Confluence REST API client can be downloaded from here: https://github.com/Amice/atlassian

Related posts

2019. február 21.
Strategic consulting

In the 20th century, founders had an exact product vision, and based on that, they built an entire product. They formulated a business plan, and within a couple of years, the product was launched. The next step was finding customers.

2016. december 13.

In this part I will talk about creating HTML pages in Confluence using PHP and Confluence REST API.