What makes an API self-documenting for AI agents?

For agents, self-documenting means the interface exposes machine-readable discovery, validation, and semantics. In practice that means MCP metadata for capability discovery, JSON Schema contracts for validation, and JSON-LD when field meaning must stay consistent across tools or systems.

Do I need JSON-LD if my MCP tool already has JSON Schema?

Not always. JSON Schema handles structure and validation, but it does not fully express semantic identity across tools. Add JSON-LD when multiple endpoints reuse entities, when field names drift between teams, or when downstream agents need stable meaning beyond simple type checks.

Should MCP annotations like readOnlyHint be enforced as security rules?

No. MCP annotations are descriptive hints for clients, not hard security guarantees. Enforce real access control, validation, and audit behavior on the server side, then publish annotations as supplemental metadata.

Where should I publish glossary docs for an MCP server?

Expose them as resources discoverable through resources/list and retrievable through resources/read. That keeps the docs close to the API contract and lets clients fetch examples or field definitions at runtime instead of depending on static wikis.

Self-Documenting APIs for AI Agents with MCP & JSON-LD

Most API docs are written for people and then awkwardly re-explained to agents in prompts. That breaks down fast once multiple tools, schemas, and field names start drifting. A better pattern is to make the API describe itself in two layers: MCP for capability discovery and invocation contracts, and JSON-LD for semantic meaning. The result is an interface an AI client can discover, validate, and interpret with far less custom instruction.

MCP already gives agents discovery via tools/list, resources/list, and JSON Schema contracts.
JSON-LD adds stable meaning for fields like status, version, and updatedAt.
Use outputSchema plus structuredContent so the same response is both machine-validated and semantically typed.
Publish short glossary resources and examples so agents can inspect docs instead of relying on hidden prompt lore.

Prerequisites

What you need before you start

A server that exposes MCP tools or resources over JSON-RPC.
A domain with stable nouns and verbs, such as deployments, incidents, invoices, or tickets.
JSON Schema familiarity, especially object types, enums, required fields, and additionalProperties.
A place to publish machine-readable examples or field glossaries as MCP resources.
Sanitized sample payloads. If your examples contain real customer data, clean them first with the Data Masking Tool.

Bottom Line

Use MCP to describe what the agent can call, and JSON-LD to describe what the returned data means. That split produces APIs that are easier to discover, safer to automate, and less dependent on brittle hand-written prompts.

Step 1: Publish capabilities as contracts, not prose

Your first job is to make every callable operation self-describing through MCP. The current MCP specification defines tool metadata, resource metadata, and JSON-RPC transports such as stdio and Streamable HTTP. For self-documenting APIs, the key move is simple: treat inputSchema, outputSchema, and resource descriptors as the primary documentation surface, not a sidecar.

Start with a narrow tool definition

{
  "name": "get_deployment_status",
  "title": "Get Deployment Status",
  "description": "Returns rollout health for one service in one environment.",
  "inputSchema": {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "object",
    "additionalProperties": false,
    "properties": {
      "service": {
        "type": "string",
        "description": "Internal service slug, for example billing-api"
      },
      "environment": {
        "type": "string",
        "enum": ["staging", "prod"]
      }
    },
    "required": ["service", "environment"]
  },
  "outputSchema": {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "object",
    "additionalProperties": false,
    "properties": {
      "@context": { "type": "object" },
      "@type": { "type": "string" },
      "service": { "type": "string" },
      "environment": { "type": "string", "enum": ["staging", "prod"] },
      "status": { "type": "string", "enum": ["healthy", "degraded", "down"] },
      "version": { "type": "string" },
      "latencyMs": { "type": "integer", "minimum": 0 },
      "updatedAt": { "type": "string", "format": "date-time" }
    },
    "required": ["service", "environment", "status", "version", "latencyMs", "updatedAt"]
  },
  "annotations": {
    "readOnlyHint": true,
    "idempotentHint": true,
    "openWorldHint": false
  }
}

This does three important things:

It tells the agent exactly which arguments are valid.
It constrains the response shape before any model reasoning starts.
It exposes operational hints without bloating the natural-language description.

Watch out: MCP tool annotations are hints, not trust boundaries. A client should not treat readOnlyHint or openWorldHint as security guarantees, and your server still needs normal authorization, validation, and audit controls.
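
To make the server-side validation point concrete, here is a minimal sketch of argument checking against the inputSchema above. It is an illustration under stated assumptions: the validate_arguments helper is hypothetical and covers only the keywords this tool uses (type, enum, required, additionalProperties). A real server would more likely delegate to a full JSON Schema validator library.

```python
# Minimal sketch: checks only the JSON Schema keywords used by the tool above.
# It is NOT a complete JSON Schema validator.
INPUT_SCHEMA = {
    "type": "object",
    "additionalProperties": False,
    "properties": {
        "service": {"type": "string"},
        "environment": {"type": "string", "enum": ["staging", "prod"]},
    },
    "required": ["service", "environment"],
}

# bool is excluded from "integer" because Python treats True as an int.
TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "object": lambda v: isinstance(v, dict),
}


def validate_arguments(args, schema):
    """Return a list of human-readable problems; an empty list means valid."""
    if not isinstance(args, dict):
        return ["arguments must be an object"]
    problems = []
    props = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in args:
            problems.append(f"missing required field: {name}")
    for name, value in args.items():
        if name not in props:
            if schema.get("additionalProperties") is False:
                problems.append(f"unexpected field: {name}")
            continue
        rule = props[name]
        check = TYPE_CHECKS.get(rule.get("type"))
        if check and not check(value):
            problems.append(f"{name} must be of type {rule['type']}")
        elif "enum" in rule and value not in rule["enum"]:
            problems.append(f"{name} must be one of {rule['enum']}")
    return problems


# A valid call produces no problems.
print(validate_arguments({"service": "billing-api", "environment": "prod"}, INPUT_SCHEMA))  # prints []
# An unknown environment plus a stray field produces two problems.
print(validate_arguments({"service": "billing-api", "environment": "qa", "x": 1}, INPUT_SCHEMA))
```

Rejecting the call before any business logic runs is what keeps the schema a contract rather than a suggestion.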

Publish reference docs as resources

Next, expose a glossary or field dictionary through resources/list and resources/read. That gives agents a place to inspect domain meaning without smuggling everything into a system prompt.

{
  "uri": "docs://glossary/deployment-status",
  "name": "deployment-status-glossary",
  "title": "Deployment Status Glossary",
  "description": "Definitions for rollout status, latency, version, and freshness fields.",
  "mimeType": "application/ld+json"
}

If you want those snippets consistently readable by humans too, run them through TechBytes' Code Formatter before publishing them as examples.
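
As a rough sketch of how the glossary could be served, the handlers below build result objects for resources/list and resources/read from an in-memory registry. The registry, the handler names, and the glossary body (including the tb:definition term) are assumptions made for illustration; only the descriptor fields come from the example above.

```python
import json

# Hypothetical glossary body; tb:definition is an illustrative term, not a standard one.
GLOSSARY = {
    "@context": {"tb": "https://example.com/ns#"},
    "@id": "tb:status",
    "tb:definition": "Rollout health of one service in one environment.",
}

# Hypothetical in-memory registry keyed by resource URI.
RESOURCES = {
    "docs://glossary/deployment-status": {
        "descriptor": {
            "uri": "docs://glossary/deployment-status",
            "name": "deployment-status-glossary",
            "title": "Deployment Status Glossary",
            "mimeType": "application/ld+json",
        },
        "text": json.dumps(GLOSSARY),
    }
}


def handle_resources_list():
    """Build the result object for a resources/list request."""
    return {"resources": [entry["descriptor"] for entry in RESOURCES.values()]}


def handle_resources_read(uri):
    """Build the result object for a resources/read request."""
    entry = RESOURCES.get(uri)
    if entry is None:
        raise KeyError(f"unknown resource: {uri}")
    return {
        "contents": [
            {
                "uri": uri,
                "mimeType": entry["descriptor"]["mimeType"],
                "text": entry["text"],
            }
        ]
    }
```

Serving the MIME type from the same descriptor that resources/list advertises keeps the two responses from drifting apart.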

Step 2: Add JSON-LD semantics to the fields that matter

MCP's JSON Schema contracts tell the agent the structure of a payload. JSON-LD tells the agent what the fields mean in a graph-aware, reusable way. That distinction matters when multiple tools return similar fields with different semantics, or when the same entity appears across deployment, incident, and release workflows.

Define a reusable context

{
  "@context": {
    "tb": "https://example.com/ns#",
    "schema": "https://schema.org/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "service": "tb:service",
    "environment": "tb:environment",
    "status": "tb:status",
    "latencyMs": {
      "@id": "tb:latencyMs",
      "@type": "xsd:integer"
    },
    "version": "schema:softwareVersion",
    "updatedAt": {
      "@id": "schema:dateModified",
      "@type": "xsd:dateTime"
    }
  }
}

A few design rules keep this maintainable:

Reuse standard vocabularies for generic concepts such as software version and modification time.
Create a small custom namespace only for domain-specific terms you actually own.
Version the context carefully; changing field meaning is an API change, not a docs edit.
Keep the JSON-LD context stable across every tool that returns the same entity shape.

Pro tip: Do not JSON-LD-wrap every payload in your platform. Start with the shared entities agents repeatedly join across tools, such as deployments, incidents, users, or documents.
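
To show what the context buys you, the sketch below resolves short field names to full IRIs. It is deliberately simplified: the expand_term helper is hypothetical and handles only the plain term and prefix forms used in the context above, not the full JSON-LD expansion algorithm, which a real JSON-LD processor would implement.

```python
CONTEXT = {
    "tb": "https://example.com/ns#",
    "schema": "https://schema.org/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "service": "tb:service",
    "environment": "tb:environment",
    "status": "tb:status",
    "latencyMs": {"@id": "tb:latencyMs", "@type": "xsd:integer"},
    "version": "schema:softwareVersion",
    "updatedAt": {"@id": "schema:dateModified", "@type": "xsd:dateTime"},
}


def expand_term(term, context):
    """Resolve a short field name to its full IRI using prefix definitions.

    Simplified: returns None for unknown terms and supports only the
    "prefix:suffix" form, not every feature of JSON-LD contexts.
    """
    definition = context.get(term)
    if definition is None:
        return None
    compact = definition["@id"] if isinstance(definition, dict) else definition
    prefix, sep, suffix = compact.partition(":")
    base = context.get(prefix)
    # "https://..." also contains a colon, so skip values that look absolute.
    if sep and isinstance(base, str) and not suffix.startswith("//"):
        return base + suffix
    return compact  # already an absolute IRI


print(expand_term("version", CONTEXT))    # prints https://schema.org/softwareVersion
print(expand_term("latencyMs", CONTEXT))  # prints https://example.com/ns#latencyMs
```

Two tools that both expand version to the same IRI are talking about the same thing, whatever their local field names look like.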

Map shape and meaning separately

This is the design pattern that works in practice:

Let JSON Schema enforce required fields, types, enums, and object boundaries.
Let JSON-LD explain semantic identity and cross-system meaning.
Keep business descriptions short; the schema and context should carry most of the load.

That split makes your interface easier to evolve. You can tighten validation without changing meaning, or extend meaning without rewriting every tool description.
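
One way to keep the two layers honest with each other is a coverage check: every property the schema allows should have a term in the context. The sketch below assumes abbreviated copies of the schema and context from this article; unmapped_fields is a hypothetical helper name.

```python
# Abbreviated copies of the article's schema and context, for illustration only.
OUTPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "service": {"type": "string"},
        "status": {"type": "string", "enum": ["healthy", "degraded", "down"]},
        "latencyMs": {"type": "integer", "minimum": 0},
    },
}

CONTEXT = {
    "tb": "https://example.com/ns#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "service": "tb:service",
    "status": "tb:status",
    "latencyMs": {"@id": "tb:latencyMs", "@type": "xsd:integer"},
}


def unmapped_fields(output_schema, context):
    """Return schema properties that have no term in the JSON-LD context.

    JSON-LD keywords such as @context and @type are skipped because they
    are not mapped through the context.
    """
    properties = output_schema.get("properties", {})
    return sorted(
        name
        for name in properties
        if not name.startswith("@") and name not in context
    )


print(unmapped_fields(OUTPUT_SCHEMA, CONTEXT))  # prints []
```

A non-empty result means a field has shape but no declared meaning, which is the gap this step is trying to close.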

Step 3: Return structured results that agents can validate and interpret

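Once you advertise an outputSchema, return structured data that matches it. Then embed the JSON-LD form in the same result so the client gets both validation and semantics. Because @context and @type travel inside structuredContent, the outputSchema has to permit those keys; a schema with additionalProperties: false that omits them would reject the response.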

{
  "structuredContent": {
    "@context": {
      "tb": "https://example.com/ns#",
      "schema": "https://schema.org/",
      "xsd": "http://www.w3.org/2001/XMLSchema#",
      "service": "tb:service",
      "environment": "tb:environment",
      "status": "tb:status",
      "latencyMs": { "@id": "tb:latencyMs", "@type": "xsd:integer" },
      "version": "schema:softwareVersion",
      "updatedAt": { "@id": "schema:dateModified", "@type": "xsd:dateTime" }
    },
    "@type": "tb:DeploymentStatus",
    "service": "billing-api",
    "environment": "prod",
    "status": "healthy",
    "version": "2.8.1",
    "latencyMs": 143,
    "updatedAt": "2026-05-11T10:12:00Z"
  },
  "content": [
    {
      "type": "text",
      "text": "billing-api in prod is healthy at version 2.8.1 with 143 ms latency."
    }
  ]
}

The text content helps with direct rendering and operator readability. The structured object is what downstream agents should key off for workflows, planning, and cross-tool joins.
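
A result like the one above can be assembled from a plain record in a few lines. This is a sketch under assumptions: build_tool_result, RECORD, and the shortened CONTEXT are illustrative names and values, not part of MCP or of any particular server.

```python
# Shortened context for the example; a real server would reuse the shared one.
CONTEXT = {
    "tb": "https://example.com/ns#",
    "schema": "https://schema.org/",
    "status": "tb:status",
    "version": "schema:softwareVersion",
}

# Illustrative record, mirroring the sample response above.
RECORD = {
    "service": "billing-api",
    "environment": "prod",
    "status": "healthy",
    "version": "2.8.1",
    "latencyMs": 143,
    "updatedAt": "2026-05-11T10:12:00Z",
}


def build_tool_result(record, context):
    """Wrap one status record as structuredContent plus a text summary."""
    structured = {"@context": context, "@type": "tb:DeploymentStatus", **record}
    summary = (
        f"{record['service']} in {record['environment']} is {record['status']} "
        f"at version {record['version']} with {record['latencyMs']} ms latency."
    )
    return {
        "structuredContent": structured,
        "content": [{"type": "text", "text": summary}],
    }


result = build_tool_result(RECORD, CONTEXT)
print(result["content"][0]["text"])
```

Deriving the summary from the same record that fills structuredContent means the human-readable text cannot disagree with the machine-readable fields.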

Verification / Expected Output

You know this design is working when a new MCP client can discover the tool, inspect the schemas, call it with valid arguments, and reason about the result without custom prompt instructions.

Run this checklist

Confirm the tool appears in tools/list with an inputSchema whose root type is object and, if present, an outputSchema whose root type is also object.
Read the glossary resource through resources/read and verify the returned MIME type matches what the descriptor advertises, such as application/ld+json.
Call the tool with one valid payload and one invalid payload to ensure the server enforces the schema instead of relying on the model.
Inspect the result and verify that structuredContent validates against the advertised outputSchema, with no missing or undeclared fields.
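
The first checklist item can be automated. The sketch below inspects the result of a tools/list call for one tool; check_tool_listing is a hypothetical helper, and how the request is actually sent depends on your transport, so only the request and result shapes appear here.

```python
# JSON-RPC request body a client would send; transport details are out of scope.
TOOLS_LIST_REQUEST = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}


def check_tool_listing(tools_list_result, tool_name):
    """Return problems found for one tool in a tools/list result object."""
    tools = {tool["name"]: tool for tool in tools_list_result.get("tools", [])}
    tool = tools.get(tool_name)
    if tool is None:
        return [f"{tool_name} is not listed"]
    problems = []
    if tool.get("inputSchema", {}).get("type") != "object":
        problems.append("inputSchema root type must be object")
    # outputSchema is optional, so it is checked only when present.
    if "outputSchema" in tool and tool["outputSchema"].get("type") != "object":
        problems.append("outputSchema root type must be object")
    return problems


# Illustrative result, trimmed to the fields the check reads.
listing = {
    "tools": [
        {
            "name": "get_deployment_status",
            "inputSchema": {"type": "object"},
            "outputSchema": {"type": "object"},
        }
    ]
}
print(check_tool_listing(listing, "get_deployment_status"))  # prints []
```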

Expected outcome

A generic MCP client can infer required arguments without a handwritten how-to prompt.
A second agent can reuse the same JSON-LD context across multiple tools and compare entities safely.
Your docs resource becomes a live source of truth instead of a stale wiki page.

Troubleshooting / What's Next

Troubleshooting: top 3 failures

The tool is discoverable but still called incorrectly. Your schema is probably too loose. Add enums, required fields, format, and additionalProperties: false where appropriate.
Two tools return the same field name with different meanings. Fix the JSON-LD context and normalize the entity model so each meaning maps to its own IRI. Semantic drift is exactly what a shared context is meant to surface.
Clients ignore the annotations and make unsafe choices. That is normal. Move safety guarantees into policy, authz, and runtime checks; keep annotations as informational metadata only.
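
For the second failure, a quick diagnostic is to compare the contexts of two tools and list the terms they define differently. The sketch below is an assumption-laden illustration: the incident context and its tb:incidentState term are invented for the example, and the comparison works on compact IRIs, so it assumes both contexts use the same prefixes.

```python
DEPLOYMENT_CONTEXT = {
    "tb": "https://example.com/ns#",
    "status": "tb:status",
    "updatedAt": {"@id": "schema:dateModified", "@type": "xsd:dateTime"},
}

# Hypothetical second context in which "status" means something else.
INCIDENT_CONTEXT = {
    "tb": "https://example.com/ns#",
    "status": "tb:incidentState",
    "updatedAt": "schema:dateModified",
}


def conflicting_terms(context_a, context_b):
    """Return terms defined in both contexts that point at different IRIs."""

    def target(definition):
        # Expanded definitions keep the IRI under "@id"; plain ones are the IRI.
        return definition["@id"] if isinstance(definition, dict) else definition

    shared = set(context_a) & set(context_b)
    return sorted(
        term for term in shared if target(context_a[term]) != target(context_b[term])
    )


print(conflicting_terms(DEPLOYMENT_CONTEXT, INCIDENT_CONTEXT))  # prints ['status']
```

Here updatedAt is not flagged because both definitions point at the same IRI, even though one of them also declares a datatype.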

What's next

Add shared entity contexts for your top three cross-tool objects, not just one endpoint.
Publish example resources for successful and error responses so agents can inspect edge cases.
Introduce contract tests that validate tools/list, response schemas, and JSON-LD context stability in CI.
If you expose prompts, keep them thin. They should orchestrate documented tools and resources, not replace them.
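
A contract test for context stability can be as small as a fingerprint comparison. The sketch below hashes a key-sorted JSON serialization of the context; context_fingerprint is a hypothetical helper, and this is plain JSON normalization rather than RDF canonicalization, so it detects textual changes to the context, not semantically equivalent rewrites.

```python
import hashlib
import json


def context_fingerprint(context):
    """Hash a JSON-LD context so CI can detect unreviewed changes.

    Keys are sorted so that reordering fields does not change the hash.
    """
    canonical = json.dumps(context, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


original = {"tb": "https://example.com/ns#", "status": "tb:status"}
reordered = {"status": "tb:status", "tb": "https://example.com/ns#"}
changed = {"tb": "https://example.com/ns#", "status": "tb:rolloutState"}

# Reordering keys keeps the fingerprint; remapping a term changes it.
print(context_fingerprint(original) == context_fingerprint(reordered))  # prints True
print(context_fingerprint(original) == context_fingerprint(changed))    # prints False
```

In CI, the expected fingerprint would live next to the context file, so changing a term's meaning forces an explicit update in the same review.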

The practical target is not “more metadata.” It is fewer hidden assumptions. When MCP carries the callable contract and JSON-LD carries the semantic contract, your API becomes easier for both humans and agents to use correctly on first contact.
