Effective Tokens Specification
- Version: 0.2.0
- Status: Draft
- Publication Date: 2026-04-02
- Editor: GitHub Agentic Workflows Team
- This Version: effective-tokens-specification
- Latest Published Version: This document
Abstract
This specification defines Effective Tokens (ET), a normalized unit for measuring Large Language Model (LLM) usage across token classes, model-relative computational intensity, and multi-invocation execution graphs. ET provides a single unified metric for composite LLM workloads, including multi-step pipelines, tool-augmented calls, sub-agent orchestration, and recursive inference.
Status of This Document
This section describes the status of this document at the time of publication. This is a draft specification and may be updated, replaced, or made obsolete by other documents at any time.
This document is governed by the GitHub Agentic Workflows project specifications process.
Table of Contents
- Introduction
- Conformance
- Terminology
- Token Accounting Model
- Multi-Invocation Aggregation
- Execution Graph Requirements
- Reporting
- Implementation Requirements
- Extensibility
- Compliance Testing
- Appendices
- References
- Change Log
1. Introduction
1.1 Purpose
Token counts reported by LLM APIs are not directly comparable: different token classes (input, cached, output, reasoning) carry different computational costs, and different models have different relative costs. Effective Tokens normalizes these variables into a single scalar that approximates relative computational intensity, enabling consistent measurement and comparison across complex multi-agent systems.
1.2 Scope
This specification covers:
- Definition of token classes and their default weights
- The per-invocation ET computation formula
- Aggregation across multi-invocation execution graphs
- Structural requirements for invocation nodes and summary reports
This specification does NOT cover:
- Billing, pricing, or cost allocation
- Model selection or routing strategies
- Streaming or partial token reporting
1.3 Design Goals
An ET implementation:
- Preserves raw token counts per invocation
- Normalizes across token classes using disclosed weights
- Normalizes across models using per-model multipliers
- Supports aggregation across any number of invocations
- Produces a single reproducible metric from identical inputs
- Carries no dependency on billing or pricing systems
2. Conformance
2.1 Conformance Classes
Conforming implementation: An implementation that satisfies all MUST/SHALL requirements in this specification.
Partially conforming implementation: An implementation that satisfies core accounting requirements (Sections 4–5) but omits optional fields or extensions.
2.2 Requirements Notation
The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “NOT RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119.
2.3 Compliance Levels
- Level 1 – Basic: Single-invocation ET computation (Section 4)
- Level 2 – Standard: Multi-invocation aggregation and execution graph (Sections 5–6)
- Level 3 – Complete: Full reporting and extensibility support (Sections 7–9)
3. Terminology
3.1 Token Classes
| Class | Symbol | Description |
|---|---|---|
| Input Tokens | I | Tokens newly processed by the model |
| Cached Input Tokens | C | Tokens served via cache or prefix reuse |
| Output Tokens | O | Tokens generated by the model |
| Reasoning Tokens | R | Internal tokens used during inference (optional) |
3.2 Model Multiplier
The Model Multiplier (m), also called the Copilot Multiplier and recorded as copilot_multiplier in the node schema (Section 6.1), is a scalar representing the relative computational intensity of a model versus a defined baseline. Its value is model-specific and MUST be disclosed by the implementation.
3.3 Invocation
A single LLM request-response cycle. Each invocation produces one set of token counts and yields one ET value.
3.4 Sub-Agent
Any invocation triggered by another LLM call or orchestration layer. Examples include tool-using agents, retrieval-augmented calls, planning/execution agents, and recursively delegated LLM calls.
3.5 Execution Graph
A directed structure representing all invocations associated with a single top-level request. The root node has no parent; sub-agents reference their triggering invocation as their parent.
4. Token Accounting Model
4.1 Raw Token Count
For each invocation, the raw total is:

    raw_total_tokens = I + C + O + R

4.2 Token Class Weights
Default weights for the four token classes are:
| Token Class | Symbol | Default Weight |
|---|---|---|
| Input | w_in | 1.0 |
| Cached Input | w_cache | 0.1 |
| Output | w_out | 4.0 |
| Reasoning | w_reason | 4.0 |
Implementations MAY override these values but MUST disclose the weights used in any reported output.
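One way to keep overrides and disclosure together is to hold the weights in a single immutable value that can be serialized next to any reported output. The sketch below assumes a hypothetical `TokenWeights` class and `disclose` helper; neither name comes from this specification, and the specification does not prescribe where disclosed weights appear in a report.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class TokenWeights:
    """Token class weights; the defaults mirror the table in Section 4.2."""
    w_in: float = 1.0
    w_cache: float = 0.1
    w_out: float = 4.0
    w_reason: float = 4.0


DEFAULT_WEIGHTS = TokenWeights()


def disclose(weights: TokenWeights) -> dict:
    """Return the weights as a plain dict that can be attached to a report."""
    return asdict(weights)


# Override one weight and keep the remaining defaults (hypothetical value).
custom = TokenWeights(w_out=3.0)
print(disclose(custom))
```

Freezing the dataclass is a design choice rather than a requirement: it prevents weights from changing between computing ET and disclosing them.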
4.3 Base Weighted Tokens
Per invocation:

    base_weighted_tokens = (w_in × I) + (w_cache × C) + (w_out × O) + (w_reason × R)

4.4 Effective Tokens Per Invocation

    effective_tokens = m × base_weighted_tokens

5. Multi-Invocation Aggregation
5.1 Total Effective Tokens
For a request involving N invocations:

    ET_total = Σ (m_i × base_weighted_tokens_i)

Each invocation MAY use a different model and multiplier.
5.2 Total Raw Tokens

    raw_total_tokens = Σ (I_i + C_i + O_i + R_i)

5.3 Invocation Count

    total_invocations = N

This count MUST include the root call, all sub-agent calls, and all tool-triggered LLM calls.
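The formulas in Sections 4 and 5 can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names are hypothetical, the field names follow the node schema in Section 6.1, and the sample data is the scenario from Appendix A.

```python
# Default weights from Section 4.2.
DEFAULT_WEIGHTS = {"w_in": 1.0, "w_cache": 0.1, "w_out": 4.0, "w_reason": 4.0}


def raw_tokens(usage):
    """Section 4.1: unweighted sum of the four token classes."""
    return (usage["input_tokens"] + usage["cached_input_tokens"]
            + usage["output_tokens"] + usage["reasoning_tokens"])


def base_weighted_tokens(usage, weights=DEFAULT_WEIGHTS):
    """Section 4.3: weighted sum of the four token classes."""
    return (weights["w_in"] * usage["input_tokens"]
            + weights["w_cache"] * usage["cached_input_tokens"]
            + weights["w_out"] * usage["output_tokens"]
            + weights["w_reason"] * usage["reasoning_tokens"])


def effective_tokens(invocation, weights=DEFAULT_WEIGHTS):
    """Section 4.4: model multiplier times base weighted tokens."""
    m = invocation["model"]["copilot_multiplier"]
    return m * base_weighted_tokens(invocation["usage"], weights)


def aggregate(invocations, weights=DEFAULT_WEIGHTS):
    """Section 5: totals across every invocation in the request."""
    return {
        "total_invocations": len(invocations),
        "raw_total_tokens": sum(raw_tokens(inv["usage"]) for inv in invocations),
        "base_weighted_tokens": sum(
            base_weighted_tokens(inv["usage"], weights) for inv in invocations),
        "effective_tokens": sum(effective_tokens(inv, weights) for inv in invocations),
    }


def _inv(multiplier, i, c, o, r):
    """Build a minimal invocation record (helper for this example only)."""
    return {
        "model": {"copilot_multiplier": multiplier},
        "usage": {"input_tokens": i, "cached_input_tokens": c,
                  "output_tokens": o, "reasoning_tokens": r},
    }


# Root, retrieval, and synthesis invocations from Appendix A.
INVOCATIONS = [_inv(2.0, 500, 200, 150, 0), _inv(1.0, 300, 0, 100, 0), _inv(2.0, 200, 100, 250, 0)]
summary = aggregate(INVOCATIONS)
print(summary)
```

With the default weights, the printed totals should agree with the summary object in Appendix A.4. Because the weights are floating-point values, comparisons against expected totals are safer with a small tolerance than with exact equality.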
6. Execution Graph Requirements
Implementations MUST represent multi-call workflows as a directed execution graph.
6.1 Node Schema
Each node (invocation) MUST conform to:

    {
      "id": "string",
      "parent_id": "string | null",
      "model": {
        "name": "string",
        "copilot_multiplier": number
      },
      "usage": {
        "input_tokens": number,
        "cached_input_tokens": number,
        "output_tokens": number,
        "reasoning_tokens": number
      },
      "derived": {
        "base_weighted_tokens": number,
        "effective_tokens": number
      }
    }

6.2 Root Invocation
The root invocation MUST have parent_id = null. It represents the user-facing request that initiates the execution graph.
6.3 Sub-Agent Invocations
Each sub-agent invocation MUST reference a valid parent_id. Sub-agent invocations MAY recursively spawn further invocations.
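The parent and root rules above lend themselves to a simple structural check. The sketch below uses a hypothetical `validate_graph` function and assumes exactly one root per execution graph, which this specification implies (a single top-level request) but does not state as a MUST. It does not detect cycles.

```python
def validate_graph(nodes):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    ids = {node["id"] for node in nodes}

    # Section 6.2: the root invocation has parent_id = null (None in Python).
    roots = [node["id"] for node in nodes if node["parent_id"] is None]
    if len(roots) != 1:
        # Assumption: one top-level request means exactly one root node.
        problems.append(f"expected exactly one root, found {len(roots)}")

    # Section 6.3: every sub-agent references a parent that exists in the graph.
    for node in nodes:
        parent = node["parent_id"]
        if parent is None:
            continue
        if parent == node["id"]:
            problems.append(f"{node['id']}: node is its own parent")
        elif parent not in ids:
            problems.append(f"{node['id']}: unknown parent_id {parent!r}")
    return problems


good = [
    {"id": "root", "parent_id": None},
    {"id": "retrieval", "parent_id": "root"},
    {"id": "synthesis", "parent_id": "root"},
]
bad = good + [{"id": "orphan", "parent_id": "missing"}]

print(validate_graph(good))
print(validate_graph(bad))
```

Returning a list of problems instead of raising on the first one makes it easier to report every structural issue in a single pass.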
7. Reporting
A conforming response MUST include a summary object alongside the invocations array:

    {
      "summary": {
        "total_invocations": number,
        "raw_total_tokens": number,
        "base_weighted_tokens": number,
        "effective_tokens": number
      },
      "invocations": [ ... ]
    }

8. Implementation Requirements
8.1 Completeness
All LLM calls MUST be included in the execution graph. Hidden or system-triggered calls MUST be counted.
8.2 Determinism
Given identical inputs and multipliers, ET MUST be reproducible. Implementations SHOULD NOT introduce non-deterministic factors into the computation.
8.3 Versioning
Implementations SHOULD version their token weights and model multipliers so that historical reports remain interpretable.
8.4 Partial Visibility
When sub-agents are not fully observable, implementations MUST still report aggregate totals. Invocation nodes with incomplete data SHOULD be flagged to indicate missing information.
9. Extensibility
Implementations MAY:
- Add new token classes (e.g., tool_tokens)
- Add latency or compute metadata per invocation node
- Support streaming or partial progress updates
Extensions MUST NOT alter the core ET definition or the default weight values without disclosure.
10. Compliance Testing
10.1 Test Suite Requirements
10.1.1 Token Accounting Tests
- T-ET-001: Single invocation with all four token classes produces correct base_weighted_tokens
- T-ET-002: Single invocation ET equals m × base_weighted_tokens
- T-ET-003: Zero-value token classes do not affect the result
- T-ET-004: Custom weights are applied when default weights are overridden
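The token accounting tests above could be expressed as plain assertions. The sketch below is illustrative only: the function names, the token counts, and the multiplier of 1.5 are hypothetical test vectors, not values defined by this specification.

```python
import math

# Default weights from Section 4.2.
DEFAULT_WEIGHTS = {"w_in": 1.0, "w_cache": 0.1, "w_out": 4.0, "w_reason": 4.0}


def base_weighted(i, c, o, r, weights=DEFAULT_WEIGHTS):
    """Base weighted tokens for one invocation (Section 4.3)."""
    return (weights["w_in"] * i + weights["w_cache"] * c
            + weights["w_out"] * o + weights["w_reason"] * r)


def effective(m, i, c, o, r, weights=DEFAULT_WEIGHTS):
    """Effective tokens for one invocation (Section 4.4)."""
    return m * base_weighted(i, c, o, r, weights)


def run_token_accounting_checks():
    # T-ET-001: all four classes present: 100 + 0.1*50 + 4*20 + 4*10 = 225.
    assert math.isclose(base_weighted(100, 50, 20, 10), 225.0)
    # T-ET-002: ET equals m times base weighted tokens.
    assert math.isclose(effective(1.5, 100, 50, 20, 10), 1.5 * 225.0)
    # T-ET-003: zero-valued classes contribute nothing.
    assert math.isclose(base_weighted(100, 0, 0, 0), 100.0)
    # T-ET-004: an overridden output weight changes the result: 100 + 5 + 40 + 40.
    custom = {**DEFAULT_WEIGHTS, "w_out": 2.0}
    assert math.isclose(base_weighted(100, 50, 20, 10, custom), 185.0)
    return True


print(run_token_accounting_checks())
```

`math.isclose` is used instead of `==` because the cached-input weight of 0.1 is not exactly representable in binary floating point.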
10.1.2 Aggregation Tests
- T-ET-010: Multi-invocation ET_total equals the sum of per-invocation ET values
- T-ET-011: raw_total_tokens equals the sum of all raw tokens across all invocations
- T-ET-012: total_invocations count includes root, sub-agents, and tool-triggered calls
10.1.3 Execution Graph Tests
- T-ET-020: Root node has parent_id = null
- T-ET-021: All sub-agent nodes reference a valid parent_id
- T-ET-022: Node schema includes all required fields
10.1.4 Reporting Tests
- T-ET-030: Summary object is present in all conforming responses
- T-ET-031: Summary values are consistent with per-invocation data
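A consistency check in the spirit of T-ET-031 can recompute each summary field from the invocations array and compare. The sketch below assumes the report shape from Section 7 and the node schema from Section 6.1; `summary_mismatches` is a hypothetical name, and a missing summary field raises `KeyError` rather than being reported, since presence is the subject of T-ET-030.

```python
import math

TOKEN_FIELDS = ("input_tokens", "cached_input_tokens",
                "output_tokens", "reasoning_tokens")


def summary_mismatches(report, rel_tol=1e-9):
    """Return the names of summary fields that disagree with per-invocation data."""
    invocations = report["invocations"]
    expected = {
        "total_invocations": len(invocations),
        "raw_total_tokens": sum(
            inv["usage"][field] for inv in invocations for field in TOKEN_FIELDS),
        "base_weighted_tokens": sum(
            inv["derived"]["base_weighted_tokens"] for inv in invocations),
        "effective_tokens": sum(
            inv["derived"]["effective_tokens"] for inv in invocations),
    }
    return [
        name for name, value in expected.items()
        if not math.isclose(report["summary"][name], value, rel_tol=rel_tol)
    ]


# A single-invocation report using the root node values from Appendix A.
report = {
    "summary": {"total_invocations": 1, "raw_total_tokens": 850,
                "base_weighted_tokens": 1120, "effective_tokens": 2240},
    "invocations": [{
        "usage": {"input_tokens": 500, "cached_input_tokens": 200,
                  "output_tokens": 150, "reasoning_tokens": 0},
        "derived": {"base_weighted_tokens": 1120, "effective_tokens": 2240},
    }],
}
# Same report with a deliberately wrong effective_tokens total.
tampered = {**report, "summary": {**report["summary"], "effective_tokens": 9999}}

print(summary_mismatches(report))
print(summary_mismatches(tampered))
```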
10.2 Compliance Checklist
| Requirement | Test ID | Level | Status |
|---|---|---|---|
| Per-invocation base weighted tokens | T-ET-001–004 | 1 | Required |
| Per-invocation ET computation | T-ET-002 | 1 | Required |
| Multi-invocation aggregation | T-ET-010–012 | 2 | Required |
| Execution graph node schema | T-ET-020–022 | 2 | Required |
| Summary reporting | T-ET-030–031 | 3 | Required |
| Custom weight disclosure | T-ET-004 | 1 | Required |
| Versioning of weights/multipliers | — | 3 | Recommended |
| Partial visibility flagging | — | 2 | Recommended |
Appendices
Appendix A: Worked Example
A.1 Scenario
A request triggers three invocations: a root call, a retrieval sub-agent, and a final synthesis call.
A.2 Input Data

    {
      "invocations": [
        {
          "id": "root",
          "parent_id": null,
          "model": { "name": "model-a", "copilot_multiplier": 2.0 },
          "usage": { "input_tokens": 500, "cached_input_tokens": 200, "output_tokens": 150, "reasoning_tokens": 0 }
        },
        {
          "id": "retrieval",
          "parent_id": "root",
          "model": { "name": "model-b", "copilot_multiplier": 1.0 },
          "usage": { "input_tokens": 300, "cached_input_tokens": 0, "output_tokens": 100, "reasoning_tokens": 0 }
        },
        {
          "id": "synthesis",
          "parent_id": "root",
          "model": { "name": "model-a", "copilot_multiplier": 2.0 },
          "usage": { "input_tokens": 200, "cached_input_tokens": 100, "output_tokens": 250, "reasoning_tokens": 0 }
        }
      ]
    }

A.3 Computation

    root:
      base = (1.0 × 500) + (0.1 × 200) + (4.0 × 150) = 500 + 20 + 600 = 1120
      ET   = 2.0 × 1120 = 2240

    retrieval:
      base = (1.0 × 300) + (4.0 × 100) = 300 + 400 = 700
      ET   = 1.0 × 700 = 700

    synthesis:
      base = (1.0 × 200) + (0.1 × 100) + (4.0 × 250) = 200 + 10 + 1000 = 1210
      ET   = 2.0 × 1210 = 2420

A.4 Output

    {
      "summary": {
        "total_invocations": 3,
        "raw_total_tokens": 1800,
        "base_weighted_tokens": 3030,
        "effective_tokens": 5360
      }
    }

Appendix B: Core Formula Reference

    ET_total = Σ [ m_i × (w_in × I_i + w_cache × C_i + w_out × O_i + w_reason × R_i) ]

With default weights:

    ET_total = Σ [ m_i × (I_i + 0.1 C_i + 4 O_i + 4 R_i) ]

Appendix C: Security Considerations
ET values are derived from token usage metadata. Implementations SHOULD treat per-invocation token data as potentially sensitive since usage patterns may reveal information about system prompts, model configurations, or user behavior. Aggregate ET values suitable for observability dashboards SHOULD be separated from detailed per-invocation data in access-controlled reporting systems.
References
Normative References
- [RFC 2119] Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels”, BCP 14, RFC 2119, March 1997. https://www.ietf.org/rfc/rfc2119.txt
Informative References
- [OPENAI-USAGE] OpenAI API Reference — Usage Objects. https://platform.openai.com/docs/api-reference
- [ANTHROPIC-USAGE] Anthropic API Reference — Token Usage. https://docs.anthropic.com/en/api/getting-started
Change Log
Version 0.2.0 (Draft)
- Adopted W3C-style specification format
- Added conformance levels (Basic, Standard, Complete)
- Added compliance testing section with test IDs
- Added Appendix C: Security Considerations
- Clarified partial visibility requirements
Version 0.1.0 (Draft)
- Initial definition of Effective Tokens metric
- Defined four token classes and default weights
- Defined per-invocation and multi-invocation formulas
- Defined execution graph node schema
Copyright 2026 GitHub Agentic Workflows Team. All rights reserved.