Steps to Create a Metric API Endpoint
==========================================

Summary
---------------------------------------------------------------------------

There are many paths, but we usually follow something along these lines:

1. Identify the CHAOSS metric we want to develop.
2. Decide whether the endpoint serves a single metric or, as is sometimes the case, integrates or visualizes several metrics.
3. Determine which tables in the Augur Schema contain the data we need to develop this metric.
4. Construct a very basic query that joins those tables in a minimal way, so we have a "baseline query."
5. Refine the query so that it takes the standard inputs for a "standard metric," if that is the type of metric it is. Alternatively, look at the non-standard metrics as they are defined in ``AUGUR_HOME/augur/routes``, or at one of the visualization metrics in ``AUGUR_HOME/augur/routes/contributor.py``, ``AUGUR_HOME/augur/routes/pull_requests.py`` or ``AUGUR_HOME/augur/routes/nonstandard_metrics.py``. (This step is explained in the next section.)

Example Query
---------------------------------------------------------------------

This is an example query to get us started on a labor effort and cost endpoint.

1. What tables?

   .. code-block:: text

      repo
      repo_group

   If we look at the Augur Schema, we can see that effort and cost are contained in the ``repo_labor`` table.

2. What might our initial query to explore building the endpoint be?

   .. code-block:: sql

      SELECT
          c.repo_id,
          c.repo_name,
          c.programming_language,
          SUM(c.estimated_labor_hours) AS labor_hours,
          -- cost is hours multiplied by a flat rate of 50 per hour
          SUM(c.estimated_labor_hours * 50) AS labor_cost,
          c.analysis_date
      FROM (
          -- one row per repository and programming language
          SELECT
              a.repo_id,
              b.repo_name,
              a.programming_language,
              SUM(a.total_lines) AS repo_total_lines,
              SUM(a.code_lines) AS repo_code_lines,
              SUM(a.comment_lines) AS repo_comment_lines,
              SUM(a.blank_lines) AS repo_blank_lines,
              AVG(a.code_complexity) AS repo_lang_avg_code_complexity,
              AVG(a.code_complexity) * SUM(a.code_lines) + 20 AS estimated_labor_hours,
              MAX(a.rl_analysis_date) AS analysis_date
          FROM repo_labor a
          JOIN repo b ON a.repo_id = b.repo_id
          GROUP BY a.repo_id, a.programming_language, b.repo_name
      ) c
      GROUP BY c.repo_id, c.repo_name, c.programming_language, c.analysis_date
      ORDER BY c.repo_id, c.programming_language;

3. Over time, as CHAOSS develops a metric for labor investment, the way this query calculates hours and cost will change to match whatever formula the CHAOSS community settles on.
4. We will fit this metric into one of the different types of metric API endpoints discussed in the next section.

.. note::

   Augur uses https://github.com/boyter/scc to calculate the information contained in the ``repo_labor`` table, which is populated by the ``value_worker`` tasks.
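Before refining the baseline query, it can help to run it against a throwaway database and check the arithmetic by hand. The following is a minimal sketch, not part of Augur: SQLite stands in for Augur's database, the two tables are trimmed-down, hypothetical versions of ``repo`` and ``repo_labor`` with made-up rows, and the names ``build_demo_db`` and ``labor_effort_and_cost`` are invented for illustration. Only the columns that feed the hours and cost formula are selected in the inner query.

```python
import sqlite3

# Hypothetical, trimmed-down stand-ins for the two tables the query touches.
SCHEMA = """
CREATE TABLE repo (repo_id INTEGER PRIMARY KEY, repo_name TEXT);
CREATE TABLE repo_labor (
    repo_id INTEGER,
    programming_language TEXT,
    total_lines INTEGER,
    code_lines INTEGER,
    comment_lines INTEGER,
    blank_lines INTEGER,
    code_complexity INTEGER,
    rl_analysis_date TEXT
);
"""

# Same hours and cost formula as the baseline query in this section.
BASELINE_QUERY = """
SELECT repo_id, repo_name, programming_language,
       SUM(estimated_labor_hours) AS labor_hours,
       SUM(estimated_labor_hours * 50) AS labor_cost,
       analysis_date
FROM (
    SELECT a.repo_id AS repo_id,
           b.repo_name AS repo_name,
           a.programming_language AS programming_language,
           AVG(a.code_complexity) * SUM(a.code_lines) + 20 AS estimated_labor_hours,
           MAX(a.rl_analysis_date) AS analysis_date
    FROM repo_labor a
    JOIN repo b ON a.repo_id = b.repo_id
    GROUP BY a.repo_id, a.programming_language, b.repo_name
) c
GROUP BY repo_id, repo_name, programming_language, analysis_date
ORDER BY repo_id, programming_language
"""


def build_demo_db():
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO repo VALUES (1, 'demo-repo')")
    conn.executemany(
        "INSERT INTO repo_labor VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        [
            (1, "Python", 130, 100, 20, 10, 2, "2021-01-01"),
            (1, "Python", 260, 200, 40, 20, 4, "2021-01-01"),
            (1, "Go", 60, 50, 5, 5, 1, "2021-01-01"),
        ],
    )
    return conn


def labor_effort_and_cost(conn):
    """Run the baseline query and return one dict per repo and language."""
    cur = conn.execute(BASELINE_QUERY)
    columns = [d[0] for d in cur.description]
    return [dict(zip(columns, row)) for row in cur.fetchall()]


if __name__ == "__main__":
    for row in labor_effort_and_cost(build_demo_db()):
        print(row)
```

With these made-up rows, the Python files have an average complexity of 3 and 300 code lines, so the formula gives 3 * 300 + 20 = 920 hours and 920 * 50 = 46000 in cost. Checking a small case like this by hand makes it easier to notice when a later change to the formula has an unintended effect.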