Effective Tokens Specification

Version: 0.2.0
Status: Draft
Publication Date: 2026-04-02
Editor: GitHub Agentic Workflows Team
This Version: effective-tokens-specification
Latest Published Version: This document

Abstract

This specification defines Effective Tokens (ET), a normalized unit for measuring Large Language Model (LLM) usage across token classes, model-relative computational intensity, and multi-invocation execution graphs. ET provides a single unified metric for composite LLM workloads including multi-step pipelines, tool-augmented calls, sub-agent orchestration, and recursive inference.

Status of This Document

This section describes the status of this document at the time of publication. This is a draft specification and may be updated, replaced, or made obsolete by other documents at any time.

This document is governed by the GitHub Agentic Workflows project specifications process.

Table of Contents

1. Introduction
2. Conformance
3. Terminology
4. Token Accounting Model
5. Multi-Invocation Aggregation
6. Execution Graph Requirements
7. Reporting
8. Implementation Requirements
9. Extensibility
10. Compliance Testing
Appendices
References
Change Log

1. Introduction

1.1 Purpose

Token counts reported by LLM APIs are not directly comparable: different token classes (input, cached, output, reasoning) carry different computational costs, and different models have different relative costs. Effective Tokens normalizes these variables into a single scalar that approximates relative computational intensity, enabling consistent measurement and comparison across complex multi-agent systems.

1.2 Scope

This specification covers:

Definition of token classes and their default weights
The per-invocation ET computation formula
Aggregation across multi-invocation execution graphs
Structural requirements for invocation nodes and summary reports

This specification does NOT cover:

Billing, pricing, or cost allocation
Model selection or routing strategies
Streaming or partial token reporting

1.3 Design Goals

An ET implementation:

Preserves raw token counts per invocation
Normalizes across token classes using disclosed weights
Normalizes across models using per-model multipliers
Supports aggregation across any number of invocations
Produces a single reproducible metric from identical inputs
Carries no dependency on billing or pricing systems

2. Conformance

2.1 Conformance Classes

Conforming implementation: An implementation that satisfies all MUST/SHALL requirements in this specification.

Partially conforming implementation: An implementation that satisfies core accounting requirements (Sections 4–5) but omits optional fields or extensions.

2.2 Requirements Notation

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “NOT RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119.

2.3 Compliance Levels

Level 1 – Basic: Single-invocation ET computation (Section 4)
Level 2 – Standard: Multi-invocation aggregation and execution graph (Sections 5–6)
Level 3 – Complete: Full reporting and extensibility support (Sections 7–9)

3. Terminology

3.1 Token Classes

Class	Symbol	Description
Input Tokens	I	Tokens newly processed by the model
Cached Input Tokens	C	Tokens served via cache or prefix reuse
Output Tokens	O	Tokens generated by the model
Reasoning Tokens	R	Internal tokens used during inference (optional)

3.2 Model Multiplier

The Model Multiplier (m), also called the Copilot Multiplier and recorded as copilot_multiplier in the node schema (Section 6.1), is a scalar representing the computational intensity of a model relative to a defined baseline. Its value is model-specific and MUST be disclosed by the implementation.

3.3 Invocation

A single LLM request-response cycle. Each invocation produces one set of token counts and yields one ET value.

3.4 Sub-Agent

Any invocation triggered by another LLM call or orchestration layer. Examples include tool-using agents, retrieval-augmented calls, planning/execution agents, and recursively delegated LLM calls.

3.5 Execution Graph

A directed structure representing all invocations associated with a single top-level request. The root node has no parent; sub-agents reference their triggering invocation as their parent.

4. Token Accounting Model

4.1 Raw Token Count

For each invocation, the raw total is:

raw_total_tokens = I + C + O + R

4.2 Token Class Weights

Default weights for the four token classes are:

Token Class	Symbol	Default Weight
Input	w_in	1.0
Cached Input	w_cache	0.1
Output	w_out	4.0
Reasoning	w_reason	4.0

Implementations MAY override these values but MUST disclose the weights used in any reported output.

4.3 Base Weighted Tokens

Per invocation:

base_weighted_tokens =
    (w_in × I) + (w_cache × C) + (w_out × O) + (w_reason × R)

4.4 Effective Tokens Per Invocation

effective_tokens = m × base_weighted_tokens
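
The formulas in 4.3 and 4.4 can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the names Weights, DEFAULT_WEIGHTS, base_weighted_tokens, and effective_tokens are hypothetical helpers, and the defaults are copied from the table in 4.2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Weights:
    """Token class weights; defaults mirror the table in Section 4.2."""
    w_in: float = 1.0
    w_cache: float = 0.1
    w_out: float = 4.0
    w_reason: float = 4.0


DEFAULT_WEIGHTS = Weights()


def base_weighted_tokens(i, c, o, r=0, weights=DEFAULT_WEIGHTS):
    """Section 4.3: weighted sum over the four token classes."""
    return (weights.w_in * i + weights.w_cache * c
            + weights.w_out * o + weights.w_reason * r)


def effective_tokens(m, i, c, o, r=0, weights=DEFAULT_WEIGHTS):
    """Section 4.4: scale the weighted sum by the model multiplier m."""
    return m * base_weighted_tokens(i, c, o, r, weights)


# Root invocation from Appendix A: I=500, C=200, O=150, R=0, m=2.0
print(effective_tokens(2.0, 500, 200, 150))
```

Passing a different Weights instance models the override permitted in 4.2; an implementation doing so would still need to disclose the values it used.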

5. Multi-Invocation Aggregation

5.1 Total Effective Tokens

For a request involving N invocations, indexed i = 1 to N:

ET_total = Σ (m_i × base_weighted_tokens_i)

Each invocation MAY use a different model and multiplier.

5.2 Total Raw Tokens

raw_total_tokens = Σ (I_i + C_i + O_i + R_i)

5.3 Invocation Count

total_invocations = N

This count MUST include the root call, all sub-agent calls, and all tool-triggered LLM calls.
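
The three aggregates in this section could be computed in a single pass over the invocation list, as in the sketch below. The function name aggregate and the dictionary DEFAULT_WEIGHTS are assumptions made for illustration; the input is assumed to use the field names from the node schema in Section 6.1.

```python
# Default weights from Section 4.2, keyed by the usage field names in Section 6.1.
DEFAULT_WEIGHTS = {
    "input_tokens": 1.0,
    "cached_input_tokens": 0.1,
    "output_tokens": 4.0,
    "reasoning_tokens": 4.0,
}


def aggregate(invocations, weights=DEFAULT_WEIGHTS):
    """Return ET_total, raw_total_tokens, and total_invocations for one request."""
    et_total = 0.0
    raw_total = 0
    for inv in invocations:
        usage = inv["usage"]
        m = inv["model"]["copilot_multiplier"]  # may differ per invocation
        base = sum(w * usage.get(cls, 0) for cls, w in weights.items())
        et_total += m * base
        raw_total += sum(usage.get(cls, 0) for cls in weights)
    return {
        # The caller is responsible for passing every call: root,
        # sub-agent, and tool-triggered invocations alike.
        "total_invocations": len(invocations),
        "raw_total_tokens": raw_total,
        "effective_tokens": et_total,
    }
```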

6. Execution Graph Requirements

Implementations MUST represent multi-call workflows as a directed execution graph.

6.1 Node Schema

Each node (invocation) MUST conform to:

{
  "id": "string",
  "parent_id": "string | null",
  "model": {
    "name": "string",
    "copilot_multiplier": number
  },
  "usage": {
    "input_tokens": number,
    "cached_input_tokens": number,
    "output_tokens": number,
    "reasoning_tokens": number
  },
  "derived": {
    "base_weighted_tokens": number,
    "effective_tokens": number
  }
}

6.2 Root Invocation

The root invocation MUST have parent_id = null. It represents the user-facing request that initiates the execution graph.

6.3 Sub-Agent Invocations

Each sub-agent invocation MUST reference a valid parent_id. Sub-agent invocations MAY recursively spawn further invocations.
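
A structural check for the requirements in 6.2 and 6.3 might look like the sketch below. The function validate_graph is hypothetical. It also assumes exactly one root per graph and rejects cycles in the parent chain; the specification implies both but does not state them as explicit requirements.

```python
def validate_graph(nodes):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    by_id = {}
    for node in nodes:
        if node["id"] in by_id:
            problems.append(f"duplicate id: {node['id']}")
        by_id[node["id"]] = node

    roots = [n["id"] for n in nodes if n["parent_id"] is None]
    if len(roots) != 1:
        # Assumption: one top-level request means exactly one root node.
        problems.append(f"expected exactly one root, found {len(roots)}")

    for node in nodes:
        parent = node["parent_id"]
        if parent is not None and parent not in by_id:
            problems.append(f"{node['id']}: unknown parent_id {parent}")

    # Walk parent links from each node; revisiting a node means a cycle.
    for node in nodes:
        seen = set()
        current = node["id"]
        while current is not None and current in by_id:
            if current in seen:
                problems.append(f"{node['id']}: cycle in parent chain")
                break
            seen.add(current)
            current = by_id[current]["parent_id"]
    return problems
```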

7. Reporting

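A conforming response MUST include a summary object alongside the invocations array. In the summary, raw_total_tokens, base_weighted_tokens, and effective_tokens are totals summed over all invocations: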

{
  "summary": {
    "total_invocations": number,
    "raw_total_tokens": number,
    "base_weighted_tokens": number,
    "effective_tokens": number
  },
  "invocations": [ ... ]
}
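
As an illustration of this response shape, the sketch below assembles a summary from nodes that already carry their derived values and serializes the result as JSON. The function name summarize is an assumption, and the sample node reuses the root invocation from Appendix A.

```python
import json


def summarize(nodes):
    """Build the Section 7 response from nodes that already carry `derived` values."""
    usage_keys = ("input_tokens", "cached_input_tokens",
                  "output_tokens", "reasoning_tokens")
    summary = {
        "total_invocations": len(nodes),
        "raw_total_tokens": sum(n["usage"][k] for n in nodes for k in usage_keys),
        "base_weighted_tokens": sum(n["derived"]["base_weighted_tokens"] for n in nodes),
        "effective_tokens": sum(n["derived"]["effective_tokens"] for n in nodes),
    }
    return {"summary": summary, "invocations": nodes}


node = {
    "id": "root",
    "parent_id": None,
    "model": {"name": "model-a", "copilot_multiplier": 2.0},
    "usage": {"input_tokens": 500, "cached_input_tokens": 200,
              "output_tokens": 150, "reasoning_tokens": 0},
    "derived": {"base_weighted_tokens": 1120.0, "effective_tokens": 2240.0},
}
print(json.dumps(summarize([node]), indent=2))
```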

8. Implementation Requirements

8.1 Completeness

All LLM calls MUST be included in the execution graph. Hidden or system-triggered calls MUST be counted.

8.2 Determinism

Given identical inputs and multipliers, ET MUST be reproducible. Implementations SHOULD NOT introduce non-deterministic factors into the computation.
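
One way non-determinism can creep in, assuming values are held as binary floating-point numbers, is summation order: naive float addition may produce slightly different totals when the same invocations are summed in a different order. The sketch below uses Python's math.fsum, which tracks intermediate partial sums to avoid that loss of precision. The helper name total_effective_tokens is hypothetical, and this specification does not prescribe a numeric representation.

```python
import math


def total_effective_tokens(per_invocation_et):
    """Sum per-invocation ET values without depending on iteration order."""
    return math.fsum(per_invocation_et)


values = [0.1] * 10
print(sum(values) == 1.0)                     # naive left-to-right float addition
print(total_effective_tokens(values) == 1.0)  # accurately rounded sum
```

Exact types such as decimal.Decimal or fractions.Fraction would be an alternative when weights and multipliers are disclosed as decimal strings.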

8.3 Versioning

Implementations SHOULD version their token weights and model multipliers so that historical reports remain interpretable.

8.4 Partial Visibility

When sub-agents are not fully observable, implementations MUST still report aggregate totals. Invocation nodes with incomplete data SHOULD be flagged to indicate missing information.
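
The specification does not define how a node is flagged, so the sketch below is only one possible convention. It assumes unobserved counts arrive as None or are absent, treats them as zero so that totals can still be reported (which makes those totals a lower bound), and marks the node with hypothetical incomplete and missing_fields entries.

```python
USAGE_KEYS = ("input_tokens", "cached_input_tokens",
              "output_tokens", "reasoning_tokens")


def flag_incomplete(node):
    """Return a copy of the node with missing usage counts zeroed and flagged."""
    usage = node.get("usage", {})
    missing = [k for k in USAGE_KEYS if usage.get(k) is None]
    filled = {k: (usage.get(k) or 0) for k in USAGE_KEYS}
    flagged = {**node, "usage": filled}
    if missing:
        # Hypothetical extension fields; not part of the Section 6.1 schema.
        flagged["incomplete"] = True
        flagged["missing_fields"] = missing
    return flagged
```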

9. Extensibility

Implementations MAY:

Add new token classes (e.g., tool_tokens)
Add latency or compute metadata per invocation node
Support streaming or partial progress updates

Extensions MUST NOT alter the core ET definition, and MUST NOT change the default weight values without disclosure.
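
A sketch of how an extension token class might be layered on without touching the core metric follows. The weight of 1.0 for tool_tokens, the decision to scale the extension by the same multiplier m, and the extension_effective_tokens field are all assumptions chosen for illustration; none of them is defined by this specification.

```python
CORE_WEIGHTS = {
    "input_tokens": 1.0,
    "cached_input_tokens": 0.1,
    "output_tokens": 4.0,
    "reasoning_tokens": 4.0,
}

# Hypothetical extension class and weight, chosen only for illustration.
EXTENSION_WEIGHTS = {"tool_tokens": 1.0}


def extended_effective_tokens(m, usage, extension_weights=EXTENSION_WEIGHTS):
    """Return core ET, the extension's contribution, and the weights used."""
    core = m * sum(w * usage.get(k, 0) for k, w in CORE_WEIGHTS.items())
    extra = m * sum(w * usage.get(k, 0) for k, w in extension_weights.items())
    return {
        "effective_tokens": core,             # core definition left unchanged
        "extension_effective_tokens": extra,  # reported separately
        "weights": {**CORE_WEIGHTS, **extension_weights},  # disclosed with the result
    }
```

Keeping the extension's contribution in its own field leaves effective_tokens comparable with reports from implementations that do not support the extension.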

10. Compliance Testing

10.1 Test Suite Requirements

10.1.1 Token Accounting Tests

T-ET-001: Single invocation with all four token classes produces correct base_weighted_tokens
T-ET-002: Single invocation ET equals m × base_weighted_tokens
T-ET-003: A token class with a count of zero contributes nothing to the result
T-ET-004: Custom weights are applied when default weights are overridden

10.1.2 Aggregation Tests

T-ET-010: Multi-invocation ET_total equals the sum of per-invocation ET values
T-ET-011: raw_total_tokens equals the sum of all raw tokens across all invocations
T-ET-012: total_invocations count includes root, sub-agents, and tool-triggered calls

10.1.3 Execution Graph Tests

T-ET-020: Root node has parent_id = null
T-ET-021: All sub-agent nodes reference a valid parent_id
T-ET-022: Node schema includes all required fields

10.1.4 Reporting Tests

T-ET-030: Summary object is present in all conforming responses
T-ET-031: Summary values are consistent with per-invocation data
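
A check along the lines of T-ET-031 could recompute the totals from per-invocation usage and compare them with the reported summary, as sketched below. The function name summary_is_consistent and the relative tolerance are assumptions; the default weights are those of Section 4.2.

```python
import math

DEFAULT_WEIGHTS = {
    "input_tokens": 1.0,
    "cached_input_tokens": 0.1,
    "output_tokens": 4.0,
    "reasoning_tokens": 4.0,
}


def summary_is_consistent(report, weights=DEFAULT_WEIGHTS, rel_tol=1e-9):
    """Recompute totals from per-invocation usage and compare with the summary."""
    raw = 0
    base_total = 0.0
    et_total = 0.0
    for inv in report["invocations"]:
        usage = inv["usage"]
        base = sum(w * usage.get(k, 0) for k, w in weights.items())
        raw += sum(usage.get(k, 0) for k in weights)
        base_total += base
        et_total += inv["model"]["copilot_multiplier"] * base
    s = report["summary"]
    return (
        s["total_invocations"] == len(report["invocations"])
        and s["raw_total_tokens"] == raw
        and math.isclose(s["base_weighted_tokens"], base_total, rel_tol=rel_tol)
        and math.isclose(s["effective_tokens"], et_total, rel_tol=rel_tol)
    )
```

The floating-point totals are compared with a tolerance rather than for exact equality, since the reporting implementation may have summed in a different order.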

10.2 Compliance Checklist

Requirement	Test ID	Level	Status
Per-invocation base weighted tokens	T-ET-001–004	1	Required
Per-invocation ET computation	T-ET-002	1	Required
Multi-invocation aggregation	T-ET-010–012	2	Required
Execution graph node schema	T-ET-020–022	2	Required
Summary reporting	T-ET-030–031	3	Required
Custom weight disclosure	T-ET-004	1	Required
Versioning of weights/multipliers	—	3	Recommended
Partial visibility flagging	—	2	Recommended

Appendices

Appendix A: Worked Example

A.1 Scenario

A request triggers three invocations: a root call and two sub-agent calls that it spawns, one for retrieval and one for final synthesis.

A.2 Input Data

{
  "invocations": [
    {
      "id": "root",
      "parent_id": null,
      "model": { "name": "model-a", "copilot_multiplier": 2.0 },
      "usage": {
        "input_tokens": 500,
        "cached_input_tokens": 200,
        "output_tokens": 150,
        "reasoning_tokens": 0
      }
    },
    {
      "id": "retrieval",
      "parent_id": "root",
      "model": { "name": "model-b", "copilot_multiplier": 1.0 },
      "usage": {
        "input_tokens": 300,
        "cached_input_tokens": 0,
        "output_tokens": 100,
        "reasoning_tokens": 0
      }
    },
    {
      "id": "synthesis",
      "parent_id": "root",
      "model": { "name": "model-a", "copilot_multiplier": 2.0 },
      "usage": {
        "input_tokens": 200,
        "cached_input_tokens": 100,
        "output_tokens": 250,
        "reasoning_tokens": 0
      }
    }
  ]
}

A.3 Computation

root:
  base = (1.0 × 500) + (0.1 × 200) + (4.0 × 150) = 500 + 20 + 600 = 1120
  ET   = 2.0 × 1120 = 2240

retrieval:
  base = (1.0 × 300) + (4.0 × 100) = 300 + 400 = 700
  ET   = 1.0 × 700 = 700

synthesis:
  base = (1.0 × 200) + (0.1 × 100) + (4.0 × 250) = 200 + 10 + 1000 = 1210
  ET   = 2.0 × 1210 = 2420

totals:
  raw  = 850 + 400 + 550 = 1800
  base = 1120 + 700 + 1210 = 3030
  ET   = 2240 + 700 + 2420 = 5360

A.4 Output

{
  "summary": {
    "total_invocations": 3,
    "raw_total_tokens": 1800,
    "base_weighted_tokens": 3030,
    "effective_tokens": 5360
  }
}

Appendix B: Core Formula Reference

ET_total = Σ [ m_i × (w_in × I_i + w_cache × C_i + w_out × O_i + w_reason × R_i) ]

With default weights:

ET_total = Σ [ m_i × (I_i + 0.1 C_i + 4 O_i + 4 R_i) ]

Appendix C: Security Considerations

ET values are derived from token usage metadata. Implementations SHOULD treat per-invocation token data as potentially sensitive since usage patterns may reveal information about system prompts, model configurations, or user behavior. Aggregate ET values suitable for observability dashboards SHOULD be separated from detailed per-invocation data in access-controlled reporting systems.

References

Normative References

[RFC 2119] Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels”, BCP 14, RFC 2119, March 1997. https://www.ietf.org/rfc/rfc2119.txt

Informative References

[OPENAI-USAGE] OpenAI API Reference — Usage Objects. https://platform.openai.com/docs/api-reference
[ANTHROPIC-USAGE] Anthropic API Reference — Token Usage. https://docs.anthropic.com/en/api/getting-started

Change Log

Version 0.2.0 (Draft)

Adopted W3C-style specification format
Added conformance levels (Basic, Standard, Complete)
Added compliance testing section with test IDs
Added Appendix C: Security Considerations
Clarified partial visibility requirements

Version 0.1.0 (Draft)

Initial definition of Effective Tokens metric
Defined four token classes and default weights
Defined per-invocation and multi-invocation formulas
Defined execution graph node schema
