An HTML Standard for AI Use Disclosure
A proposal for a new HTML attribute and meta tag to enable element-level disclosure of AI involvement in web content, addressing the lack of granular transparency.
Explainer: AI Content Disclosure for HTML — element-level markup for AI authorship transparency
Web pages increasingly contain text produced with varying degrees of AI
involvement — from light AI-assisted editing to fully autonomous generation.
There is currently no standard HTML mechanism for authors to disclose AI
involvement at element-level granularity within a page.
This explainer proposes an aidisclosure HTML attribute and a companion
meta tag, enabling authors to declare the degree
of AI involvement in any section of a web page.
A modern news article page might contain a human-written investigation
alongside an AI-generated summary sidebar and AI-moderated user comments.
Today, there is no standard way to label these sections differently.
Existing approaches operate at coarser granularity:
WHATWG HTML #9479 proposes
a page-level meta tag with four values. It does not support marking
individual elements. Commenters on that issue (42+ comments) identified
element-level granularity as the critical missing capability.
IETF draft-abaris-aicdh-00
defines an AI-Disclosure HTTP response header. It applies to entire
HTTP responses and cannot distinguish mixed content within a page.
C2PA 2.2 provides cryptographic provenance
for media files (images, video, audio). It does not support HTML text
content and is designed for file-level, not element-level, assertions.
The EU AI Act Article 50
(effective August 2026) requires that AI-generated text content be "marked in
a machine-readable format and detectable as artificially generated or
manipulated." Major platforms (YouTube, Meta, TikTok) already require AI
disclosure in their policies. A standard mechanism would serve both regulatory
compliance and voluntary transparency.
For pages with uniform AI involvement, the meta tag declares a single level
for the whole page. The value mixed signals that different sections have
different levels; inspect element-level attributes for detail.
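One way a CMS might derive the page-level value from its sections is sketched below in Python. The function name page_level and the handling of a page with no declared sections are assumptions made for illustration; the explainer only states that mixed indicates differing levels.

```python
def page_level(section_levels):
    """Summarize element-level levels into one page-level value.

    Returns the shared level when every section agrees, "mixed" when
    they differ, and None when no section declares anything
    (an assumption: the proposal does not cover this case here).
    """
    distinct = set(section_levels)
    if not distinct:
        return None
    if len(distinct) == 1:
        return next(iter(distinct))
    return "mixed"


if __name__ == "__main__":
    print(page_level(["ai-assisted", "ai-assisted"]))
    print(page_level(["none", "ai-generated"]))
```

A consumer that sees mixed would then fall back to reading the element-level attributes, as described above.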
The aidisclosure global attribute can be set on any HTML element. It takes
four values (none, ai-assisted, ai-generated, autonomous), aligned with the
IETF AI-Disclosure header and the IPTC Digital Source Type vocabulary.
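The four levels can be modeled as a small enumeration. The Python sketch below is illustrative only: the class name AIDisclosure, the helper parse_level, and the choice to ignore case and surrounding whitespace are assumptions, not part of the proposal.

```python
from enum import Enum


class AIDisclosure(Enum):
    """The four levels named in this explainer."""
    NONE = "none"
    AI_ASSISTED = "ai-assisted"
    AI_GENERATED = "ai-generated"
    AUTONOMOUS = "autonomous"


def parse_level(raw):
    """Return the matching level, or None when absent or unrecognized."""
    if raw is None:
        return None
    try:
        # Assumption: values are compared case-insensitively after trimming.
        return AIDisclosure(raw.strip().lower())
    except ValueError:
        return None
```

Returning None rather than raising keeps unrecognized values separate from the positive assertion none.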
For search engine discoverability, the same information can be expressed as
structured data. The simplest form is just the level string:
An expanded form supports optional metadata. All fields except level are
strictly optional — publishers may have legitimate reasons not to disclose
specific tools or providers:
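A consumer of the structured data has to accept both forms. The following Python sketch normalizes either one to an object with a level field. The function name normalize_disclosure and the tool field in the example are hypothetical; only the level field and the string shorthand come from the text above.

```python
import json


def normalize_disclosure(value):
    """Accept the short form (a level string) or the expanded form (an object).

    Only "level" is required; any other fields are passed through untouched.
    """
    if isinstance(value, str):
        return {"level": value}
    if isinstance(value, dict) and isinstance(value.get("level"), str):
        return dict(value)
    raise ValueError("expected a level string or an object with a 'level' field")


# Short form: just the level string.
simple = normalize_disclosure(json.loads('"ai-assisted"'))

# Expanded form: "tool" is a hypothetical optional field for illustration.
expanded = normalize_disclosure(
    json.loads('{"level": "ai-generated", "tool": "example-llm"}')
)
```

Because every field other than level is optional, the sketch validates only that one field.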
(Proposed as a comment on
schemaorg/schemaorg#3391.)
A newsroom publishes an investigative piece with a human-written article
and an AI-generated summary:
A blogger writes a post and uses an LLM for grammar, style, and clarity
improvements:
An automated system generates weather reports without per-instance human
oversight:
A literary journal positively asserts that no AI was used:
Note: aidisclosure="none" is a positive assertion. The absence of the
attribute means "unknown," not "none."
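The newsroom case and the "unknown" rule can be sketched with Python's standard html.parser module. The markup in the page string is an illustrative guess at how the attribute would be used, and the names declared_level and DisclosureCollector are hypothetical. Treating an unrecognized value as "unknown" is also an assumption.

```python
from html.parser import HTMLParser

LEVELS = {"none", "ai-assisted", "ai-generated", "autonomous"}


def declared_level(attrs):
    """Map a start tag's attributes to a level.

    A missing attribute means "unknown", not "none". An empty or
    unrecognized value is also reported as "unknown" (an assumption).
    """
    value = dict(attrs).get("aidisclosure")
    return value if value in LEVELS else "unknown"


class DisclosureCollector(HTMLParser):
    """Collect (tag, level) for each element that carries the attribute."""

    def __init__(self):
        super().__init__()
        self.labeled = []

    def handle_starttag(self, tag, attrs):
        if any(name == "aidisclosure" for name, _ in attrs):
            self.labeled.append((tag, declared_level(attrs)))


page = """
<article aidisclosure="none"><p>Human-written investigation.</p></article>
<aside aidisclosure="ai-generated"><p>Summary of the article.</p></aside>
<footer>Site footer</footer>
"""

collector = DisclosureCollector()
collector.feed(page)
print(collector.labeled)
```

The footer carries no attribute, so it is not collected, and declared_level reports "unknown" for it rather than "none".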
These are complementary. A CDN or reverse proxy can set the HTTP header;
a CMS can set the meta tag; an author or AI tool can set element-level
attributes. None supersedes the others.
To address a frequently raised concern ("where do you draw the line with
grammar checkers?"), here is boundary guidance:
Not AI (no disclosure needed):
ai-assisted:
ai-generated:
autonomous:
The boundary is generative/inferential AI — systems trained on data
that produce novel outputs. Deterministic tools that apply fixed rules are
not covered.
The page-level tag proposed in WHATWG HTML #9479 does not handle
mixed-content pages, the single most requested feature in that issue's
42+ comments.
The IETF AI-Disclosure HTTP header offers no element-level granularity, is
not accessible to client-side tools processing the DOM, and cannot
distinguish mixed content within a page.
C2PA is file-based cryptographic provenance. HTML pages are dynamically
assembled from templates, databases, and user input — they are not single
files with stable hashes. C2PA and this proposal are complementary, not
competing.
Using data-ai-disclosure instead of a dedicated attribute was considered.
Trade-off: data-* attributes have no semantic meaning to browsers or
assistive technology. A dedicated attribute signals intent for future browser
integration (e.g., address bar indicators, accessibility announcements) and
is consistent with how other proposals (like containertiming) have
proceeded.
RDFa is too verbose for common cases: it requires namespace declarations and
multi-attribute markup for simple assertions. The aidisclosure attribute
provides a lightweight default; RDFa or JSON-LD can supplement it for richer
structured data needs.
Responses to the W3C Security and Privacy Self-Review Questionnaire:
This standard serves responsible publishers, regulated industries, and AI
tool vendors who want to be transparent. It is not a detection mechanism.
The EU AI Act makes compliance mandatory for covered entities, and major
platforms already require disclosure in their terms of service.
The analogy is rel=nofollow: voluntary, widely adopted because it aligns
incentives, and useful despite being ignorable by bad actors.
See "What Counts as AI?" above. The boundary is
generative/inferential AI: systems trained on data that produce novel
outputs. Deterministic spell-check and thesaurus tools are excluded.
That a publisher can make a false declaration is true of any self-declared
metadata. The standard enables honest disclosure; verification requires
pairing with C2PA or regulatory auditing. The value is in the signal for
those who choose to use it honestly, as with Schema.org structured data,
which search engines use despite its spoofability.
The granular levels (four values, not binary) allow publishers to distinguish
"AI helped me edit" from "AI wrote everything." The ai-assisted level
should carry no more stigma than acknowledging the use of a human copy
editor.
As AI tools become ubiquitous, granularity matters more, not less. Binary
"AI/not-AI" is already inadequate. The spectrum from none to autonomous
reflects the reality of modern content workflows and stays meaningful even
when most content involves some AI.