
Commit 9e0cbda

Finish up the rest of the privacy considerations
1 parent ca140e0 commit 9e0cbda

3 files changed: +33, -19 lines changed


README.md (+1, -13)

@@ -417,19 +417,7 @@ For the time being, the Chrome built-in AI team is moving forward more aggresive

 ## Privacy considerations

-### General concerns about language-model based APIs
-
-If cloud-based language models are exposed through this API, then there are potential privacy issues with exposing user or website data to the relevant cloud and model providers. This is not a concern specific to this API, as websites can already choose to expose user or website data to other origins using APIs such as `fetch()`. However, it's worth keeping in mind, and in particular as discussed in our [Goals](#shared-goals), perhaps we should make it easier for web developers to know whether a cloud-based model is in use, or which one.
-
-If on-device language models are updated separately from browser and operating system versions, this API could enhance the web's fingerprinting surface by providing extra identifying bits. Mandating that older browser versions not receive updates or be able to download models from too far into the future might be a possible remediation for this.
-
-Finally, we intend to prohibit (in the specification) any use of user-specific information that is not directly supplied through the API. For example, it would not be permissible to fine-tune the language model based on information the user has entered into the browser in the past.
-
-### Detecting available options
-
-The [`availability()` API](#testing-available-options-before-creation) specified here provides some bits of fingerprinting information, since the availability status of each option and language can be one of four values, and those values are expected to be shared across a user's browser or browsing profile.
-
-This privacy threat, and how the API mitigates it, is discussed in detail [in the specification](https://webmachinelearning.github.io/writing-assistance-apis/#privacy-availability).
+Please see [the specification](https://webmachinelearning.github.io/writing-assistance-apis/#privacy).

 ## Stakeholder feedback

index.bs (+30, -4)

@@ -425,6 +425,8 @@ The <dfn attribute for="Summarizer">inputQuota</dfn> getter steps are to return

 The summarization should conform to the guidance given by |type|, |format|, and |length|, in the definitions of each of their enumeration values.

+The summarization process must conform to the privacy guidance given in [[#privacy-user-input]].
+
 If |outputLanguage| is non-null, the summarization should be in that language. Otherwise, it should be in the language of |input| (which might not match that of |context| or |sharedContext|). If |input| contains multiple languages, or the language of |input| cannot be detected, then either the output language is [=implementation-defined=], or the implementation may treat this as an error, per the guidance in [[#summarizer-errors]].

 1. While true:
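For orientation, a usage sketch of the options this hunk references (|type|, |format|, |length|, and an explicit output language). The `Summarizer` interface and its `create()`/`availability()` methods come from this specification, but the particular enumeration values and option shapes below are illustrative assumptions to check against the current spec text:

```js
// Sketch only (module context, top-level await); option values are assumptions.
const options = {
  type: "key-points",
  format: "plain-text",
  length: "short",
  outputLanguage: "ja", // request Japanese output regardless of input language
};

if ((await Summarizer.availability(options)) !== "unavailable") {
  const summarizer = await Summarizer.create(options);

  const articleText = "…long article text…";
  // The result should follow the type/format/length guidance above and be in
  // the requested output language.
  const summary = await summarizer.summarize(articleText, {
    context: "An article from a technology news site.",
  });
  console.log(summary);
}
```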
@@ -1868,7 +1870,7 @@ Every [=interface=] [=interface/including=] the {{DestroyableModel}} interface m

 1. Let |progressFraction| be [$floor$](|rawProgressFraction| &times; 65,536) &divide; 65,536.

-<div class="note">
+<div class="note" id="note-download-progress-fraction">

 <p>We use a fraction, instead of firing a progress event with the number of bytes downloaded, to avoid giving precise information about the size of the model or other material being downloaded.</p>

 <p>|progressFraction| is calculated from |rawProgressFraction| to give a precision of one part in 2<sup>16</sup>. This ensures that over most internet speeds and with most model sizes, the {{ProgressEvent/loaded}} value will be different from the previous one that was fired ~50 milliseconds ago.</p>
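A minimal sketch of the quantization described in this note; the function and the event-firing helper below are hypothetical user-agent-side code, not spec algorithms:

```js
// Quantize the raw progress fraction to one part in 2^16, so that
// downloadprogress events do not leak the exact byte size of the download.
function quantizeProgressFraction(rawProgressFraction) {
  return Math.floor(rawProgressFraction * 65536) / 65536;
}

// Hypothetical: called roughly every 50 ms while downloading; fires an event
// only when the quantized value has actually changed since the last one.
let lastFiredFraction = -1;
function maybeFireDownloadProgress(monitor, bytesReceived, totalBytes) {
  const fraction = quantizeProgressFraction(bytesReceived / totalBytes);
  if (fraction !== lastFiredFraction) {
    lastFiredFraction = fraction;
    monitor.dispatchEvent(
      // loaded carries the censored fraction; total is assumed to be 1 here.
      new ProgressEvent("downloadprogress", { loaded: fraction, total: 1 })
    );
  }
}
```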
@@ -2403,6 +2405,8 @@ A <dfn export>quota exceeded error information</dfn> is a [=struct=] with the fo

 <h2 id="privacy">Privacy considerations</h2>

+<em>Unlike many "privacy considerations" sections, which only summarize and restate privacy considerations that are already normatively specified elsewhere in the document, this section contains some normative requirements that are not present elsewhere, and adds more detail to the normative requirements present elsewhere. The novel normative requirements are called out using <strong>strong emphasis</strong>.</em>
+
 <h3 id="privacy-availability">Model availability</h3>

 For any of the APIs that use the infrastructure described in [[#supporting]], the exact download status of the AI model or fine-tuning data can present a fingerprinting vector. How many bits this vector provides depends on the options provided to the API creation, and how they influence the download.
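To make the fingerprinting vector concrete, a sketch of how a page could harvest such bits by probing `availability()` across option combinations; the option names and language tags below are illustrative assumptions:

```js
// Each option combination whose download status varies across users can
// contribute fingerprinting bits, and probing is cheap for the page.
const probes = [
  { type: "key-points", outputLanguage: "en" },
  { type: "key-points", outputLanguage: "ja" },
  { type: "headline", outputLanguage: "de" },
];

const bits = [];
for (const options of probes) {
  // One of four statuses, e.g. "unavailable", "downloadable", "downloading",
  // or "available", i.e. up to ~2 bits per probe.
  bits.push(await Summarizer.availability(options));
}
// `bits` now distinguishes users whose download states differ.
```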
@@ -2416,7 +2420,7 @@ One of the specification's mitigations is to suggest that the user agent mask th

 Because implementation strategies differ (e.g. in how many bits they expose), and other mitigations such as permission prompts are available, a specific masking scheme is not mandated. For APIs where the user agent believes such masking is necessary, a suggested heuristic is to mask by default, subject to a masking state that is established for each (API, options, [=storage key=]) tuple. This state can be set to "unmasked" once a web page in a given [=storage key=] calls the relevant `create()` method with a given set of options, and successfully starts a download or creates a model object. Since [=create an AI model object=] has stronger requirements (see [[#privacy-availability-creation]]), this ensures that web pages only get access to the true download status after taking a more costly and less-repeatable action.

-Implementations which use such a [=storage key=]-based masking scheme should ensure that the masking state is reset when other storage for that origin is reset.
+<strong>Implementations which use such a [=storage key=]-based masking scheme should ensure that the masking state is reset when other storage for that origin is reset.</strong>

 <h4 id="privacy-availability-creation">Creation-time friction</h4>
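A sketch of the suggested masking heuristic, written as hypothetical user-agent-internal logic; every name below is an assumption for illustration, not a spec identifier:

```js
// Masking state, keyed per (API, normalized options, storage key) tuple.
const maskingState = new Map(); // key -> "masked" | "unmasked"

function tupleKey(api, normalizedOptions, storageKey) {
  return JSON.stringify([api, normalizedOptions, storageKey]);
}

// What availability() reports to a page for this tuple.
function reportedAvailability(api, options, storageKey, trueAvailability) {
  if (maskingState.get(tupleKey(api, options, storageKey)) === "unmasked") {
    return trueAvailability;
  }
  // While masked, hide the true download status behind a coarser answer,
  // e.g. always claim that a download would be needed.
  return trueAvailability === "unavailable" ? "unavailable" : "downloadable";
}

// Unmask only after the page pays the cost of a successful create() call,
// i.e. it actually started a download or created a model object.
function onSuccessfulCreate(api, options, storageKey) {
  maskingState.set(tupleKey(api, options, storageKey), "unmasked");
}

// Per the requirement above: clearing storage for the origin also clears the
// masking state, so it cannot outlive other site data.
function onStorageCleared(storageKey) {
  for (const key of [...maskingState.keys()]) {
    if (JSON.parse(key)[2] === storageKey) maskingState.delete(key);
  }
}
```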

@@ -2434,13 +2438,13 @@ An important part of making the download status into a less-useful fingerprintin

 The part of these APIs which, on the surface, gives developers control over the download process is the {{AbortSignal}} passed to the `create()` methods. This allows developers to signal that they are no longer interested in creating a model object, and immediately causes the promise returned by `create()` to become rejected. The specification has a "should"-level <a href="#warning-download-cancelation">requirement</a> that the user agent not actually cancel the underlying download when the {{AbortSignal}} is aborted. The web developer will still receive a rejected promise, but the download will continue in the background, and the availability status (as seen by future calls to the `availability()` method) will update accordingly.

-User agents might be inclined to cancel the download in other situations not covered in the specification, such as when the page is unloaded. This needs to be handled with caution, as if the page can initiate these operations using JavaScript (for example, by navigating away to another origin) that would re-open the privacy hole. So, user agents should not cancel the download in response to any page-controlled actions. Canceling in response to user-controlled actions, however, is fine.
+User agents might be inclined to cancel the download in other situations not covered in the specification, such as when the page is unloaded. This needs to be handled with caution, as if the page can initiate these operations using JavaScript (for example, by navigating away to another origin) that would re-open the privacy hole. So, <strong>user agents should not cancel the download in response to any page-controlled actions</strong>. Canceling in response to user-controlled actions, however, is fine.

 <h4 id="privacy-availability-eviction">Download eviction</h4>

 Another ingredient in ensuring that websites cannot toggle the availability state back and forth is to ensure that user agents don't use a fixed quota system for the downloaded material. For example, if a user agent implemented the translator API with one download per language arc, supported 100 language arcs, and evicted all but the 30 most-recently-used language arcs, then web pages could toggle the readable-via-`create()` availability state of language arcs from "{{Availability/available}}" back to "{{Availability/downloadable}}" by creating translators for 30 new language arcs.

-The simplest mitigation to this is to avoid any API-specific quota, and instead rely on a per-user disk space-based quota. This specification does not mandate that particular solution, but does require that user agent should not implement systems which allow web pages to control the eviction of downloaded material.
+The simplest mitigation to this is to avoid any API-specific quota, and instead rely on a per-user disk space-based quota. This specification does not mandate that particular solution, but does require that <strong>user agents should not implement systems which allow web pages to control the eviction of downloaded material</strong>.

 <h4 id="privacy-availability-alternatives">Alternate options</h4>
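From a page's point of view, the cancelation behavior described earlier in this hunk looks roughly like the following sketch; the timing and the eventual status are illustrative:

```js
// Aborting create() rejects the promise the page holds...
const controller = new AbortController();
const created = Summarizer.create({ signal: controller.signal });
controller.abort();

try {
  await created;
} catch (e) {
  // Rejected because of the abort; the page has given up on this object.
}

// ...but, per the "should"-level guidance, the user agent keeps the
// underlying download going, so a later availability() check can still
// move from "downloading" to "available".
setTimeout(async () => {
  console.log(await Summarizer.availability());
}, 60_000);
```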

@@ -2451,3 +2455,25 @@ The simplest of these is to treat model downloads like most other stored resourc
 A slight variant of this is to re-download the model every time it is requested by a new [=storage key=], while re-using the on-disk storage. This still uses the user's time and bandwidth, but at least saves on disk space.

 Going further, a user agent could attempt to fake the download for new [=storage keys=] by just waiting for a similar amount of time as the real download originally took. This then only spends the user's time, sparing their bandwidth and disk space. However, this is less private than the above alternatives, due to the presence of network side channels. For example, a web page could attempt to detect the fake downloads by issuing network requests concurrent to the `create()` call, and noting that there is no change to network throughput. The scheme of remembering the time the real download originally took can also be dangerous, as the first site to initiate the download could attempt to artificially inflate this time (using concurrent network requests) in order to communicate information to other sites that will initiate a fake download in the future, from which they can read the time taken. Nevertheless, something along these lines might be useful in some cases, implemented with caution and combined with other mitigations.
+
+<h3 id="privacy-model-version">Model version</h3>
+
+Separate from the availability of a model, the specific version or behavior of a model can also be a fingerprinting vector.
+
+For this reason, these APIs do not expose model versions directly. And they make some effort to avoid exposing the model version indirectly, for example by <a href="#note-download-progress-fraction">censoring the download size</a> in the [=create an AI model object=] algorithm, so that {{CreateMonitor/downloadprogress}} events do not directly expose the size of the model. This also encourages interoperability, by making it harder for web pages to safelist specific models, and instead encouraging them to program against the general API surface.
+
+However, such mitigations are not foolproof. They only protect against simple attempts to passively discover the model version; behavioral probing can still reveal it (for example, by sending a number of inputs and checking the output against known patterns for different versions).
+
+The best way to prevent the model version from becoming a fingerprinting vector is to tie it to the user agent's version, such that the model's version (and thus behavior) only updates in lockstep with already-exposed information such as {{NavigatorID/userAgent|navigator.userAgent}}. <strong>User agents should limit the number of possible model versions that a single user agent version can be paired with</strong>, for example by not providing model updates to older user agent versions. (However, this may not always be possible, for example because the user agent might be obtaining results by using a model bundled with the operating system, whose updates are not under the user agent's control.)
+
+<h3 id="privacy-user-input">User input</h3>
+
+<strong>Implementations must not train or fine-tune models on user input, or otherwise store user input in a way that models can consult in the future.</strong> (For example, using retrieval-augmented generation technology.) <strong>Instead, implementations should expose roughly "the same" capabilities to all sites.</strong>
+
+Using user input in such a way would provide a vector for exposing the user's information to web pages, or for exposing information derived from the user's interactions with one site to another site, both of which are unacceptable privacy leaks.
+
+<h3 id="privacy-cloud-implementations">Cloud-based implementations</h3>
+
+The implementation-defined parts of these APIs can be implemented by delegating to user-agent-provided cloud-based services. This is not, in itself, a significant privacy risk: web developers already have the ability to send arbitrary data (including user-provided data) to cloud services via APIs such as {{WindowOrWorkerGlobalScope/fetch()}}. Indeed, it's likely that web developers will fall back to such cloud services when these APIs are not present.
+
+However, this is something for web developers to be aware of when they use these APIs, in case their web page has requirements about not sending certain information to third parties. We're contemplating giving control over this possibility to web developers in <#38>.
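A sketch of the fallback pattern mentioned above; the feature check and the cloud endpoint are illustrative assumptions, not part of the specification:

```js
async function summarizeText(text) {
  if ("Summarizer" in self) {
    // Built-in API path: the user agent may run the model on-device or
    // delegate to a user-agent-provided cloud service.
    const summarizer = await Summarizer.create();
    return summarizer.summarize(text);
  }

  // Fallback path: the developer knowingly sends the text to their own
  // (hypothetical) cloud endpoint, which has its own privacy implications.
  const response = await fetch("https://example.com/api/summarize", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text }),
  });
  const { summary } = await response.json();
  return summary;
}
```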

security-privacy-questionnaire.md (+2, -2)

@@ -9,7 +9,7 @@ This feature exposes two large categories of information:

 - The availability information for various capabilities of the API, so that web developers know what capabilities are available in the current browser, and whether using them will require a download or the capability can be used readily.

-The privacy implications of both of these are discussed [in the explainer](./README.md#privacy-considerations).
+The privacy implications of both of these are discussed [in the specification](https://webmachinelearning.github.io/writing-assistance-apis/#privacy).

 > 02. Do features in your specification expose the minimum amount of information
 > necessary to implement the intended functionality?
@@ -83,7 +83,7 @@ Otherwise, we do not anticipate any differences.

 Not quite yet.

-We have [privacy considerations](./README.md#privacy-considerations) section in the explainer, and the start of a [privacy considerations section](https://webmachinelearning.github.io/writing-assistance-apis/#privacy) in the specification. For now it covers only the fingerprinting issue, but we anticipate moving over more content from the explainer over time.
+We have a [privacy considerations section](https://webmachinelearning.github.io/writing-assistance-apis/#privacy) in the specification.

 We do not anticipate significant security risks for these APIs at this time, although we will add a section discussing some basics like how to avoid allowing websites to use up all of the user's disk space.