Steps to Create a Metric API Endpoint
Summary
There are many paths, but we usually follow something along these lines:
1. Decide which CHAOSS metric we want to develop. Sometimes a metrics endpoint integrates or visualizes several metrics.
2. Determine which tables in the Augur Schema contain the data we need to develop this metric.
3. Construct a very basic query that joins those tables in a minimal way, so we have a “baseline query.”
4. Refine the query so that it takes the standard inputs for a “standard metric,” if that is its type. Alternatively, look at non-standard metrics as they are defined in AUGUR_HOME/augur/routes, or at one of the visualization metrics in AUGUR_HOME/augur/routes/contributor.py, AUGUR_HOME/augur/routes/pull_requests.py, or AUGUR_HOME/augur/routes/nonstandard_metrics.py. (This step is explained in the next section.)
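The last step above, refining a baseline query so that it accepts standard inputs, can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name `labor_lines_query`, the parameter names `repo_group_id` and `repo_id`, and the `repo.repo_group_id` column are hypothetical here and are not taken from Augur's code. The toy tables live in an in-memory SQLite database rather than in the Augur Schema.

```python
import sqlite3

# Hypothetical helper: the parameter names are assumptions for
# illustration, not Augur's actual function signature.
def labor_lines_query(repo_group_id, repo_id=None):
    """Return (sql, params) that filter by repo group, or by a single repo."""
    sql = (
        "SELECT a.repo_id, SUM(a.code_lines) AS code_lines "
        "FROM repo_labor a JOIN repo b ON a.repo_id = b.repo_id "
    )
    if repo_id is None:
        # No repo given: report every repo in the group.
        sql += "WHERE b.repo_group_id = :repo_group_id "
        params = {"repo_group_id": repo_group_id}
    else:
        # A repo was given: narrow the result to that repo only.
        sql += "WHERE a.repo_id = :repo_id "
        params = {"repo_id": repo_id}
    sql += "GROUP BY a.repo_id ORDER BY a.repo_id"
    return sql, params

# Toy in-memory schema holding only the columns this sketch needs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE repo (repo_id INTEGER, repo_group_id INTEGER, repo_name TEXT);
    CREATE TABLE repo_labor (repo_id INTEGER, code_lines INTEGER);
    INSERT INTO repo VALUES (1, 10, 'alpha'), (2, 10, 'beta'), (3, 20, 'gamma');
    INSERT INTO repo_labor VALUES (1, 100), (1, 50), (2, 30), (3, 7);
""")

group_rows = conn.execute(*labor_lines_query(10)).fetchall()
repo_rows = conn.execute(*labor_lines_query(10, repo_id=2)).fetchall()
print(group_rows)  # rows for every repo in group 10
print(repo_rows)   # rows for repo 2 only
```

Binding the inputs as named parameters, rather than formatting them into the SQL string, keeps the query safe to expose through an API endpoint.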
Example Query
This example query gets us started on a labor effort and cost endpoint.
What tables?
repo
repo_group
If we look at the Augur Schema, we can see that effort and cost are contained in the repo_labor table.
What might our initial query to explore building the endpoint be?
SELECT
    c.repo_id,
    c.repo_name,
    programming_language,
    SUM ( estimated_labor_hours ) AS labor_hours,
    SUM ( estimated_labor_hours * 50 ) AS labor_cost,
    analysis_date
FROM
    (
        SELECT
            a.repo_id,
            b.repo_name,
            programming_language,
            SUM ( total_lines ) AS repo_total_lines,
            SUM ( code_lines ) AS repo_code_lines,
            SUM ( comment_lines ) AS repo_comment_lines,
            SUM ( blank_lines ) AS repo_blank_lines,
            AVG ( code_complexity ) AS repo_lang_avg_code_complexity,
            AVG ( code_complexity ) * SUM ( code_lines ) + 20 AS estimated_labor_hours,
            MAX ( a.rl_analysis_date ) AS analysis_date
        FROM
            repo_labor a,
            repo b
        WHERE
            a.repo_id = b.repo_id
        GROUP BY
            a.repo_id,
            programming_language,
            repo_name
        ORDER BY
            repo_name,
            a.repo_id,
            programming_language
    ) c
GROUP BY
    repo_id,
    repo_name,
    programming_language,
    c.analysis_date
ORDER BY
    repo_id,
    programming_language;
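To see the arithmetic of this query on a small scale, here is a minimal sketch that runs a trimmed version of it against toy data. It makes several assumptions: the data is invented, the database is in-memory SQLite rather than the database Augur uses, only the columns needed for the estimate are kept, and the constant 50 from the query is read as an hourly rate and named `HOURLY_RATE` (a hypothetical name).

```python
import sqlite3

# Toy data: one repo with two Python files analyzed on the same date.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE repo (repo_id INTEGER, repo_name TEXT);
    CREATE TABLE repo_labor (
        repo_id INTEGER, programming_language TEXT,
        code_lines INTEGER, code_complexity INTEGER, rl_analysis_date TEXT
    );
    INSERT INTO repo VALUES (1, 'alpha');
    INSERT INTO repo_labor VALUES
        (1, 'Python', 100, 2, '2020-01-01'),
        (1, 'Python', 50, 4, '2020-01-01');
""")

# Assumption: the 50 in the original query is treated as an hourly rate.
HOURLY_RATE = 50

rows = conn.execute("""
    SELECT c.repo_id, c.repo_name, c.programming_language,
           SUM(c.estimated_labor_hours) AS labor_hours,
           SUM(c.estimated_labor_hours * ?) AS labor_cost
    FROM (
        -- Per repo and language: average complexity times total code
        -- lines, plus 20, as in the example query.
        SELECT a.repo_id, b.repo_name, a.programming_language,
               AVG(a.code_complexity) * SUM(a.code_lines) + 20
                   AS estimated_labor_hours
        FROM repo_labor a JOIN repo b ON a.repo_id = b.repo_id
        GROUP BY a.repo_id, b.repo_name, a.programming_language
    ) c
    GROUP BY c.repo_id, c.repo_name, c.programming_language
""", (HOURLY_RATE,)).fetchall()

# AVG(2, 4) = 3.0 and SUM(100, 50) = 150, so hours = 3.0 * 150 + 20 = 470.0
# and cost = 470.0 * 50 = 23500.0.
print(rows)  # [(1, 'alpha', 'Python', 470.0, 23500.0)]
```

Passing the rate as a bound parameter makes it easy to swap in a different figure if the formula changes.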
Over time, as CHAOSS develops a metric for labor investment, the way this query calculates hours and cost will adapt to whatever formula the CHAOSS community determines is apt.
We will fit this metric into one of the different types of metric API Endpoints discussed in the next section.
Note
Augur uses https://github.com/boyter/scc to calculate information contained in the labor_value table, which is populated by the value_worker tasks.