Identity Assertion

The C2PA technical specification allows actors in a workflow to make cryptographically signed assertions about the produced C2PA asset. This signature is issued by the vendor whose software or hardware was used to create the C2PA assertions and the C2PA claim, which is why it is called the C2PA claim generator.

This specification describes a C2PA assertion referred to here as the identity assertion that can be added to a C2PA Manifest to enable a credential holder to prove control over a digital identity and to use that identity to document a named actor’s role in the C2PA asset’s lifecycle.

Version 1.0 Draft 29 April 2024 · Version history

Maintainers:

License

This specification is subject to the Community Specification License 1.0.

Additional information about this specification’s scope and governance can be found at the project’s GitHub repository (creator-assertions/identity-assertion). The Community Specification License documents at the root of that repository are the authoritative governance documents for this specification.

Contributing

This section is non-normative.

This specification is an active working draft. If you wish to contribute to its development, you are invited to:

Foreword

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. No party shall be held responsible for identifying any or all such patent rights.

Any trade name used in this document is information given for the convenience of users and does not constitute an endorsement.

This document was prepared by the Creator Assertions Working Group.

Known patent licensing exclusions are available in the specification’s notices.md file.

Any feedback or questions on this document should be directed to the specifications repository (GitHub: creator-assertions/identity-assertion).

THESE MATERIALS ARE PROVIDED “AS IS.” The Contributors and Licensees expressly disclaim any warranties (express, implied, or otherwise), including implied warranties of merchantability, non-infringement, fitness for a particular purpose, or title, related to the materials. The entire risk as to implementing or otherwise using the materials is assumed by the implementer and user. IN NO EVENT WILL THE CONTRIBUTORS OR LICENSEES BE LIABLE TO ANY OTHER PARTY FOR LOST PROFITS OR ANY FORM OF INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES OF ANY CHARACTER FROM ANY CAUSES OF ACTION OF ANY KIND WITH RESPECT TO THIS DELIVERABLE OR ITS GOVERNING AGREEMENT, WHETHER BASED ON BREACH OF CONTRACT, TORT (INCLUDING NEGLIGENCE), OR OTHERWISE, AND WHETHER OR NOT THE OTHER MEMBER HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

1. Introduction

This section is non-normative.

1.1. Scope

This specification describes a C2PA assertion that allows a named actor to document their relationship to a C2PA asset produced by them or on their behalf, independently of the C2PA claim generator. It also allows consumers of a C2PA asset to independently verify that the received asset was in fact produced by the named actor and has not been tampered with.

For purposes of the Community Specification License, the scope.md document at the root of this project’s GitHub repository is the governing document of this specification’s scope.

This specification draws upon and extends existing work from two standards organizations:

1.2. C2PA technical specification

The Coalition for Content Provenance and Authenticity (C2PA) has developed a technical specification for providing content provenance and authenticity. It is designed to enable global, opt-in, adoption of digital provenance techniques through the creation of a rich ecosystem of digital provenance enabled applications for a wide range of individuals and organizations while meeting appropriate security requirements.

The C2PA technical specification allows other standards bodies to define additional metadata that can be incorporated into a C2PA Manifest. These statements come in the form of a C2PA assertion.

This specification describes a specific C2PA assertion that can be added to a C2PA Manifest to allow an actor to prove control over a digital identity and to bind that identity to a C2PA asset (and, at their option, one or more C2PA assertions) produced by them or on their behalf.

This document references version 2.0 of the C2PA technical specification, but it is also compatible with all 1.x versions of that specification. Meaningful differences between versions of the C2PA specification are noted in this specification where they apply.

1.3. Trust over IP technical architecture

The Trust over IP Foundation (ToIP) has developed a technical architecture specification for describing a decentralized digital trust infrastructure.

This specification refers to a “public review draft” of the ToIP technical architecture specification published on 14 November 2022, which is current as of the writing of this specification.

The ToIP architecture specification introduces the concept of verifiable identifier as follows:

Design Principle #5 (Cryptographic Verifiability) states that “messages and data structures exchanged between parties should be verifiable as authentic using standard cryptographic algorithms and protocols”. This requires that Endpoint Systems be able to associate, discover and verify the cryptographic keys associated with a ToIP identifier. This specification will refer to identifiers that meet this basic requirement of cryptographic verifiability as verifiable identifiers (VIDs).

The ToIP definition of the term “verifiable identifier” is well-aligned with the design goals of this specification. As of this writing, ToIP has just begun the process of producing a technical standard that describes how to implement verifiable identifiers. This specification will incorporate relevant technical standards for verifiable identifiers as they become available.

This specification describes mechanisms to support two common individual and organizational identity mechanisms that fit the conceptual description of ToIP verifiable identifier:

1.4. Conceptual overview

Each identity assertion instance allows exactly one credential holder to sign a data structure known as signer_payload, which will subsequently be included in a C2PA claim. The signer_payload and the corresponding signature from the credential holder bind this identity assertion to a specific C2PA asset and allow the credential holder to endorse or make additional statements about the named actor’s relationship to that C2PA asset.

The credential holder’s signature should generally be construed as reflective of the named actor’s authorization of or active participation in the production of the C2PA asset in which it appears, as described by the specific C2PA assertions that are referenced in the identity assertion.

This is intended to provide an additional trust signal over and above the signature provided by the C2PA claim generator, as illustrated in the following diagram, which is non-normative:

Figure 1. Overview of identity assertion

The C2PA claim, which is signed by the C2PA claim generator, contains a list of assertions which may contain any number of identity assertions.

Each identity assertion contains a signer_payload with hashlink references to one or more other C2PA assertions, known here as referenced assertions, and a signature from the credential holder over that payload, thus providing a non-repudiable, tamper-evident binding between the named actor and the list of referenced assertions.

This trust signal should be thought of as independent from (and thus, in addition to) the trust signal provided by the C2PA claim generator itself.

This trust signal may be repeated any number of times to convey information specific to distinct named actors. The following diagram, which is non-normative, illustrates two credential holders, each independently signing different actions and metadata assertions:

Figure 2. Example with multiple identity assertions

1.5. Use cases and examples

The following is a non-exhaustive list of potential and general use cases for the identity assertion. Some of these are taken from, or built upon, the use cases developed within the Project Origin Alliance and the Content Authenticity Initiative (CAI) frameworks. Each use case will be described using some generic personas to help make the flow clear.

The identity assertion SHOULD NOT be construed to convey ownership of a C2PA asset. Ownership and rights transfers often take place outside of the digital realm. It is outside the scope of this specification to describe such transfers. It is possible that some other assertion could convey such information; such an assertion could be included in the set of referenced assertions.

1.5.1. Enhancing clarity around provenance and edits for journalistic work

A photojournalist Alice uses a Content Credentials-enabled capture device during a newsworthy event she is covering. Her capture device is configured with her professional credentials and records her identity in each captured asset. She then imports these assets into a Content Credentials-enabled editing application. Her identity is again associated with the edits that she has made. After editing, she sends her assets to her photo editor, Bob. Bob makes additional edits also using a Content Credentials-enabled application. A new identity assertion is recorded for these edits identifying Bob as the person responsible for those edits. The finalized asset is moved into the content management system of a news organization, which is also Content Credentials-enabled, before posting the asset to social media. Anyone viewing these assets can see the identity of Alice, Bob, and the news organization in a Content Credentials-enabled viewer.

1.5.2. Enhance the evidentiary value of critical footage

A human rights defender Jane manages to capture footage containing Content Credentials-enabled provenance of violence during a protest. Jane’s capture device is configured with her professional credentials and records her identity in each captured asset. Jane sends the footage to a human rights organization that verifies that the asset meets video-as-evidence criteria. The human rights organization determines that releasing Jane’s identity to the public could pose a safety or privacy risk, so they redact the identity assertion using a Content Credentials-enabled editing application before publishing the footage.

1.5.3. Election integrity and protecting political campaigns from deepfakes

Aarav is the director of social communications for a political candidate in an upcoming election. He feels concerned that in efforts to manipulate voters, bad actors will create deepfake videos to misrepresent the candidate. To protect the campaign, Aarav decides that all official communications will be created, produced, and published with Content Credentials-enabled tools. The identity assertion equips Aarav and his team with the ability to represent the campaign as the organizational author of the content that they publish. Aarav encourages voters to verify any digital campaign content to ensure that the material has Content Credentials that link the assets to the campaign.

1.5.4. Brand protection in digital marketing

An advertising agency representing a popular sneaker brand wants to ensure that consumers are only purchasing the shoes through official channels. They have found several fraudulent campaigns online claiming to offer the shoes at a discount. To address these scams, the agency decides to incorporate Content Credentials into their creative process, using the identity assertion to represent the designer responsible for the campaign. Before the campaign goes live, the agency redacts the designer’s name and publishes the assets with Content Credentials using the identity assertion to represent the sneaker brand. Now consumers can refer to Content Credentials to identify legitimate promotional campaigns from the sneaker brand.

1.5.5. Attribution for digital creators

One morning, Charlie, an up-and-coming digital artist, woke up to find that one of their designs went viral on social media. Charlie felt upset to see their art detached from their name. Charlie had spent months working on their artwork and was disappointed not to receive credit or compensation for their work. Moving forward, Charlie decides to use the identity assertion to link their name, social media, and copyright information to the art they create. Now, when Charlie posts their content to social media platforms that display Content Credentials, viewers can easily link the artwork back to Charlie.

1.5.6. Audio sampling and artistic collaboration

Eve is a musician with a talent for releasing songs featuring clever lyrics and catchy beats. Ever since Eve’s music gained traction on audio streaming apps, artists have reached out asking permission to sample her beats. Eve views these collaborations as an opportunity to help new listeners discover her music, so it is important to her that the samples are traceable back to her. She decides to begin recording her songs using a Content Credentials-enabled device configured with her identity and her profile on music streaming applications. Now, when other artists release songs that sample Eve’s tunes, she can demonstrate her contribution to the final work.

3. Terms and definitions

3.1. Concepts adapted from C2PA technical specification

The following definitions are adapted from the glossary provided in the C2PA technical specification. This specification uses the prefix “C2PA” to denote data structures incorporated from that specification.

3.1.1. Actor

A human or non-human (hardware or software) that is participating in the C2PA ecosystem. For example: a camera (capture device), image editing software, cloud service, or the person using such tools.

An organization or group of actors may also be considered an actor in the C2PA ecosystem.

3.1.2. C2PA asset

A file or stream of data containing digital content, asset metadata, and a C2PA Manifest.

For the purposes of this definition, we will extend the typical definition of “file” to include cloud-native and dynamically-generated data.

The definition of “C2PA asset” in this specification differs from the definition of “asset” given in the C2PA technical specification. A “C2PA asset” as defined in this specification MUST contain a C2PA Manifest.

3.1.3. C2PA Manifest

The set of information about the provenance of a C2PA asset based on the combination of one or more C2PA assertions (including content bindings), a single C2PA claim, and a claim signature. A C2PA Manifest is part of a C2PA Manifest Store.

A C2PA Manifest can reference other C2PA Manifests.

See Section 11, “Manifests,” of the C2PA technical specification.

3.1.4. C2PA claim

A digitally signed and tamper-evident data structure that references a set of assertions by one or more actors, concerning a C2PA asset, and the information necessary to represent the content binding. If any C2PA assertions were redacted, then a declaration to that effect is included. This data is a part of the C2PA Manifest.

See Section 10, “Claims,” of the C2PA technical specification.

3.1.5. C2PA claim generator

The non-human (hardware or software) actor that generates the C2PA claim about a C2PA asset as well as the claim signature, thus leading to the C2PA asset's associated C2PA Manifest.

3.1.6. C2PA assertion

A data structure which represents a statement asserted by an actor concerning the C2PA asset. This data is a part of the C2PA Manifest.

See Section 6, “Assertions,” of the C2PA technical specification.

3.1.7. C2PA Manifest Consumer

An actor who consumes a C2PA asset with an associated C2PA Manifest for the purpose of obtaining the provenance data from the C2PA Manifest.

3.1.8. Content binding

Information that associates digital content to a specific C2PA Manifest associated with a specific C2PA asset, either as a hard binding or a soft binding.

Content bindings are described in Section 9, “Binding to content,” of the C2PA technical specification.

3.1.9. Hard binding

One or more cryptographic hashes that uniquely identify either the entire C2PA asset or a portion thereof.

Hard bindings are described in Section 9.2, “Hard bindings,” of the C2PA technical specification.

3.2. Concepts adapted from ToIP technical architecture

The following definitions are adapted from the Trust over IP (ToIP) technology architecture specification. This specification uses the prefix “ToIP” to denote concepts incorporated from that specification.

3.2.1. ToIP verifiable identifier

Any identifier for which an endpoint system is “able to associate, discover and verify the cryptographic keys associated with a ToIP identifier.” This satisfies the ToIP design principle that “messages and data structures exchanged between parties should be verifiable as authentic using standard cryptographic algorithms and protocols.”

See ToIP identifiers in the ToIP technology architecture specification.

3.3. Concepts specific to this specification

3.3.1. Credential holder

The actor that has control (specifically signature authority) over a ToIP verifiable identifier that describes a specific named actor.

3.3.2. Identity assertion

A C2PA assertion that allows a credential holder to prove control over a digital identity and bind the identity to a set of C2PA assertions produced by them or on their behalf.

3.3.3. Identity assertion consumer

A C2PA Manifest Consumer who also consumes and interprets the content of any identity assertions contained within the C2PA Manifest.

This role can also be thought of as a relying party or verifier as defined in specifications such as the W3C verifiable credentials data model.

3.3.4. Named actor

The actor whose relationship to a C2PA asset is documented by an identity assertion. This may also be referred to as a credential subject when identified by the subject field of a ToIP verifiable identifier.

The named actor is not necessarily the same actor as the credential holder, though there is an implied trust relationship between the two actors.

3.3.5. Placeholder assertion

A temporary C2PA assertion that is created during C2PA claim generation which reserves space for the eventual content of the identity assertion. A placeholder assertion MUST be used when the final file layout of the C2PA asset is required for the hard binding assertion, as described in Section 6.3, “Interaction with data hash assertion”.

3.3.6. Referenced assertions

The set of C2PA assertions that are referenced by an identity assertion and thus bound to (i.e. authorized by or created by) the credential holder named in the identity assertion.

4. Standard terms

The key words “MUST,” “MUST NOT,” “REQUIRED,” “SHALL,” “SHALL NOT,” “SHOULD,” “SHOULD NOT,” “RECOMMENDED,” “NOT RECOMMENDED,” “MAY,” and “OPTIONAL” in this document are to be interpreted as described in BCP 14, RFC 2119, and RFC 8174 when they appear in any casing (upper, lower, or mixed).

5. Assertion definition

5.1. Overview

This specification defines a C2PA assertion known as an identity assertion which MAY be used to bind one named actor to a set of referenced assertions. This binding SHOULD generally be construed as authorization of or participation in the creation of the statements described by those assertions and corresponding portions of the C2PA asset in which they appear.

The identity assertion contains the following data fields:

  • signer_payload contains the set of data to be signed by this credential holder.

  • signature contains the raw byte stream of the credential holder’s signature.

  • sig_type defines the data type of signature.

  • pad1 and pad2 are byte strings filled with binary 0x00 values used to fill space.

The signer_payload field is a structure of type signer-payload-map which is signed by the credential holder. It contains the following fields:

The content of signature depends on the type of credential that is used. Valid credential types and the corresponding signature data structures and sig_type values are defined in Section 8, “Credentials, signatures, and validation methods”.

The identity assertion shall have a label of cawg.identity.

Multiple identity assertions may be used in the same C2PA Manifest to describe the distinct roles of multiple actors in creating a single C2PA asset. This is illustrated in the multi-author example from the conceptual overview. If this occurs, these assertions shall be given unique labels as described by Section 6.4, “Multiple instances,” of the C2PA technical specification.

TO DO (issue #14): Describe how to allow actors to describe their role in the content (previously described via CreativeWork assertion) and potentially in specific steps referenced in C2PA actions assertion.
TO DO (issue #15): Describe where would things like social media accounts live.
TO DO (issue #26): Describe credential holder’s role in relation to the asset.

5.1.1. Referenced assertions requirements

The list of referenced assertions contained in signer_payload.referenced_assertions is subject to the following requirements:

  • For each assertion listed, an assertion with the same url, alg, and hash values MUST also be listed in the created_assertions, gathered_assertions, or assertions field of the C2PA claim in which the identity assertion appears.

The assertions field appears only in version 1.x of the C2PA technical specification. It was replaced with created_assertions and gathered_assertions in version 2.0 of the C2PA technical specification.

The requirement that the hash value for a referenced assertion be known prior to presenting signer_payload for signature implies that an identity assertion MUST NOT refer to itself.

5.2. CBOR schema

The schema for this type is defined by the identity rule in the following CDDL definition:

identity = {
  "signer_payload": $signer-payload-map, ; content to be signed by credential holder
  "sig_type": tstr .size (1..max-tstr-length), ; a string identifying the data type of the signature field that follows
  "signature": bstr, ; byte string of the signature
  "pad1": bstr, ; byte strings filled with binary `0x00` values used for filling up space
  ? "pad2": bstr, ; optional byte strings filled with binary `0x00` values used for filling up space
}

signer-payload-map = {
  "referenced_assertions": [1* $hashed-uri-map],
}

Future minor version updates (1.1, 1.2, etc.) to this specification MAY add new fields to the signer-payload-map description, provided that such new data members are optional and there is a well-specified default meaning that is compatible with the 1.0 version of this specification. Such updates to the specification SHOULD continue to use the cawg.identity assertion label.

Possible values for the sig_type field and the corresponding interpretations of the signature field are described in Section 8, “Credentials, signatures, and validation methods”.

The hashed-uri-map rule is defined in Section 8.3.1, “Hashed URIs,” of the C2PA technical specification.

5.2.1. Example

An example in CBOR-Diag is shown below, which is non-normative:

{
  "signer_payload": {
    "referenced_assertions": [
      {
        "url": "self#jumbf=c2pa/urn:uuid:F9168C5E-CEB2-4faa-B6BF-329BF39FA1E4/c2pa.assertions/c2pa.hash.data",
        "hash": b64'U9Gyz05tmpftkoEYP6XYNsMnUbnS/KcktAg2vv7n1n8='
      },
      {
        "url": "self#jumbf=c2pa/urn:uuid:F9168C5E-CEB2-4faa-B6BF-329BF39FA1E4/c2pa.assertions/c2pa.thumbnail.claim.jpeg",
        "hash": b64'G5hfJwYeWTlflxOhmfCO9xDAK52aKQ+YbKNhRZeq92c='
      },
      {
        "url": "self#jumbf=c2pa/urn:uuid:F9168C5E-CEB2-4faa-B6BF-329BF39FA1E4/c2pa.assertions/c2pa.ingredient.v2",
        "hash": b64'Yzag4o5jO4xPyfANVtw7ETlbFSWZNfeM78qbSi8Abkk='
      }
    ],
  },
  "sig_type": "cawg.x509.cose",
  "signature": b64'....', // COSE signature
  "pad1": b64'....', // zero-filled pad buffer
  "pad2": b64'....'  // zero-filled pad buffer
}

6. Creating the identity assertion

6.1. Presenting the signer_payload data structure for signature

Prior to presenting the credential holder with the signer_payload data structure for signature, the referenced assertions MUST themselves be created. This process is described in Section 10.3.1, “Creating assertions,” of the C2PA technical specification.

The list of referenced assertions MUST include the same hard binding assertion that is present in the C2PA claim itself. The list of referenced assertions SHOULD include any assertions necessary to allow the actor to accurately describe their relationship to the content. For example, a c2pa.actions assertion could be referenced to attest that the actor performed those specific actions.

The signer_payload data structure MUST be presented to be signed by the credential holder corresponding to each identity assertion. This process is described in more detail in Section 8, “Credentials, signatures, and validation methods”.

If a data hash assertion is being used, the C2PA claim generator MUST also follow the process described in Section 6.3, “Interaction with data hash assertion”.

6.2. Creating the assertion

Once the signature is obtained, the identity assertion can be created and added to the C2PA Manifest’s assertion store, and then referenced in the C2PA claim. If a placeholder assertion was previously added to the C2PA claim, its content MUST now be replaced with the final assertion content as described below.

The signer_payload data structure MUST be unchanged from the data structure that was presented to the credential holder for signature.

The values for the sig_type and signature fields depend on the nature of credential used. Some common signature types are described in Section 8, “Credentials, signatures, and validation methods”.

The C2PA claim generator SHOULD independently validate the signature from the credential holder before proceeding.

If a placeholder assertion was used, the values of the pad1 and pad2 fields MUST now be recomputed (adjusted in size) such that the resulting identity assertion exactly matches the size in bytes of the placeholder assertion. If the signature exceeds the space available in the placeholder assertion, the claim generation process as described in Section 6.3, “Interaction with data hash assertion” MUST be repeated with a larger placeholder assertion.

Preferred/deterministic CBOR serialization of byte arrays uses a variable-length integer to specify the length of the encoded binary data. When the size of that length field grows from zero to one additional byte, or from one to two additional bytes (etc.), the length of the resulting pad jumps by two bytes. This means that not all paddings can be expressed using a single padding field. For example, 24-byte and 26-byte pads can be created, but a 25-byte pad cannot. If this situation arises, the desired padding can be split between pad1 and pad2. For example, to make a 25-byte pad, an implementation can encode 19 bytes into pad1 (resulting in an encoded length of 20 bytes) and 4 bytes into pad2 (resulting in an encoded length of 5 bytes).
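The padding arithmetic described above can be sketched as follows. This is a minimal, non-normative illustration in Python; the function names are hypothetical and are not defined by this specification. Like the worked example in the text, it counts only the encoded size of the byte string values themselves and ignores the extra bytes that the pad2 map key adds when that optional field is present.

```python
from typing import Optional, Tuple


def bstr_header_len(payload_len: int) -> int:
    """Size in bytes of the CBOR byte-string head for a given payload length."""
    if payload_len < 24:
        return 1  # length is carried in the initial byte
    if payload_len < 2**8:
        return 2  # initial byte + 1-byte length
    if payload_len < 2**16:
        return 3  # initial byte + 2-byte length
    if payload_len < 2**32:
        return 5  # initial byte + 4-byte length
    return 9      # initial byte + 8-byte length


def bstr_encoded_len(payload_len: int) -> int:
    """Total encoded size of a byte string holding payload_len bytes."""
    return bstr_header_len(payload_len) + payload_len


def payload_len_for(encoded_len: int) -> Optional[int]:
    """Payload length whose encoding is exactly encoded_len bytes, if any."""
    for header in (1, 2, 3, 5, 9):
        payload = encoded_len - header
        if payload >= 0 and bstr_header_len(payload) == header:
            return payload
    return None  # e.g. 25: no single byte string encodes to this size


def split_padding(total: int) -> Tuple[int, Optional[int]]:
    """Return (pad1 payload length, pad2 payload length or None).

    The encoded sizes of the returned pads add up to `total`.
    """
    single = payload_len_for(total)
    if single is not None:
        return single, None
    # Fall back to a short pad2 and let pad1 absorb the remainder.
    for pad2_payload in range(24):
        pad1_payload = payload_len_for(total - bstr_encoded_len(pad2_payload))
        if pad1_payload is not None:
            return pad1_payload, pad2_payload
    raise ValueError(f"cannot build {total} bytes of padding")
```

For a 25-byte target this sketch returns 23 bytes for pad1 and an empty pad2 (24 + 1 encoded bytes). The 19-byte and 4-byte split given in the text is an equally valid solution.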

If no placeholder assertion was used, the values of the pad1 and pad2 fields MAY be empty.

The pad1 and pad2 fields of an identity assertion MUST contain only zero-value (0x00) bytes.

6.3. Interaction with data hash assertion

The process described in this section MUST be followed when using a data hash assertion as described by Section 18.5, “Data hash,” of the C2PA technical specification. This process MAY be followed when using other hard binding assertions.

The C2PA technical specification explains the need for pre-computing the C2PA asset’s final file layout when using a data hash assertion as follows:

Some asset file formats require file offsets of the C2PA Manifest Store and asset content to be fixed before the manifest is signed, so that hard bindings will correctly align with the content they authenticate. Unfortunately, the size of a manifest and its signature cannot be precisely known until after signing, which could cause file offsets to change. For example, in JPEG-1 files, the entire C2PA Manifest Store must appear in the file before the image data, and so its size will affect the file offsets of content being authenticated.

Similarly, the size of the identity assertion cannot be known until its signature is obtained. Changing the size of the identity assertion after file layout is completed would invalidate the file offsets contained within the data hash assertion.

In this case, it is necessary to use a placeholder assertion to reserve space for the content of the final identity assertion (including its signature) which will be created later.

When using a data hash assertion, a C2PA claim generator MUST follow the process described in Section 10.3, “Creating a claim,” and Section 10.4, “Multiple step processing,” of the C2PA technical specification with additional steps as described below:

  1. Section 10.3.1, “Creating assertions.” Any identity assertion that will be added to the claim MUST be represented during this step by a placeholder assertion using the same label as the final identity assertion. The content of the placeholder assertion is unimportant, except that the size in bytes of the placeholder assertion MUST be large enough to accommodate the final identity assertion.

  2. Section 10.3.2.1, “Adding assertions and redactions”

  3. Section 10.3.2.2, “Adding ingredients”

  4. Section 10.3.2.3, “Connecting the signature”

  5. If using C2PA 1.x process, Section 11.4.1, “Prepare the XMP”.

  6. Section 10.4.1, “Create content bindings”

  7. For each identity assertion to be added, the list of referenced assertions (including the hard binding assertion) MUST be presented to the corresponding credential holder for signature, as described in Section 6.1, “Presenting the signer_payload data structure for signature”. Once each signature has been obtained, the placeholder assertion content MUST be replaced with the final identity assertion content incorporating that signature. The C2PA claim generator SHOULD independently validate the signature from the credential holder before proceeding.

  8. The remaining steps from Section 10.4, “Multiple step processing,” MUST now be completed.

  9. Section 10.3.2.4, “Signing a claim”

  10. Section 10.3.2.5, “Time stamps”

  11. Section 10.3.2.6, “Credential revocation information”

These steps are also represented by the following sequence diagram, which is non-normative:

sequenceDiagram
    participant G as C2PA claim generator
    participant H as Credential holder
    G->>G: Create assertions
    Note right of G: Includes placeholder assertions
    G->>G: Create content bindings
    loop For each credential holder
        G->>H: Request signature over list of referenced assertions
        H->>G: Provide signature over list of referenced assertions
        G->>G: Independently verify signature
    end
    G->>G: Replace placeholder assertions with final identity assertions
    G->>G: Create claim
    G->>G: Issue claim generator signature for final claim
    G->>G: Create manifest store
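The signing loop in this sequence can be sketched in Python as follows. This is a non-normative sketch under stated assumptions: CredentialHolder and collect_identity_assertions are hypothetical names, and signing and verification are delegated to caller-supplied callables because the signature format depends on the credential type. Sizing pad1 and pad2 so that each result matches its placeholder assertion is left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class CredentialHolder:
    """Hypothetical stand-in for one credential holder."""
    label: str                             # label of the identity assertion to fill in
    sig_type: str                          # value to record in the sig_type field
    sign: Callable[[dict], bytes]          # signs the signer_payload
    verify: Callable[[dict, bytes], bool]  # claim generator's independent check


def collect_identity_assertions(
    holders: Iterable[CredentialHolder],
    referenced_assertions: List[dict],
) -> Dict[str, dict]:
    """Request one signature per credential holder and build the final content.

    The referenced assertions (including the hard binding assertion) must
    already exist, so this runs after content bindings have been created.
    """
    final: Dict[str, dict] = {}
    for holder in holders:
        signer_payload = {"referenced_assertions": list(referenced_assertions)}
        signature = holder.sign(signer_payload)
        # Independent verification by the claim generator (a SHOULD in the text).
        if not holder.verify(signer_payload, signature):
            raise ValueError(f"signature for {holder.label} did not verify")
        final[holder.label] = {
            "signer_payload": signer_payload,  # unchanged from what was signed
            "sig_type": holder.sig_type,
            "signature": signature,
            "pad1": b"",  # padding to the placeholder size is not shown here
        }
    return final
```

The returned mapping is keyed by assertion label, so the claim generator can replace each placeholder assertion with the matching final content before creating and signing the claim.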

7. Validating the identity assertion

7.1. Validation method

Validation of the C2PA Manifest MUST be completed with a finding that the manifest is at least well-formed as per Section 14.3.2, “Well-formed manifest,” of the C2PA technical specification before a validator attempts to report on the validity of an identity assertion.

An identity assertion MUST contain a valid CBOR data structure that contains the required fields as documented in the identity rule in Section 5.2, “CBOR schema”. The cawg.identity.cbor.invalid error code SHALL be used to report assertions that do not follow this rule. A validator SHALL NOT consider any extra fields not documented in the identity rule during the validation process.

Extra fields can be read and processed in non-validation scenarios.
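The structural rule above can be sketched as follows. This minimal, non-normative Python sketch assumes the assertion has already been decoded from CBOR into built-in types (dict, list, str, bytes) by a CBOR library of the implementer's choice; the function name check_identity_structure is hypothetical. It does not check the upper bound on the length of sig_type (max-tstr-length) or the internal structure of each hashed-uri-map entry, both of which are defined elsewhere.

```python
from typing import Any, Optional

CBOR_INVALID = "cawg.identity.cbor.invalid"  # failure code named in this section


def check_identity_structure(assertion: Any) -> Optional[str]:
    """Return None if the required fields are present, else the error code.

    Assumes `assertion` was already decoded from CBOR into Python built-ins.
    Extra, undocumented fields are ignored during validation.
    """
    if not isinstance(assertion, dict):
        return CBOR_INVALID

    payload = assertion.get("signer_payload")
    if not isinstance(payload, dict):
        return CBOR_INVALID
    refs = payload.get("referenced_assertions")
    # The CDDL rule `[1* $hashed-uri-map]` requires at least one entry.
    if not isinstance(refs, list) or len(refs) == 0:
        return CBOR_INVALID
    if not all(isinstance(ref, dict) for ref in refs):
        return CBOR_INVALID

    sig_type = assertion.get("sig_type")
    if not isinstance(sig_type, str) or len(sig_type) == 0:
        return CBOR_INVALID
    if not isinstance(assertion.get("signature"), bytes):
        return CBOR_INVALID
    if not isinstance(assertion.get("pad1"), bytes):
        return CBOR_INVALID
    # pad2 is optional, but must be a byte string when present.
    if "pad2" in assertion and not isinstance(assertion["pad2"], bytes):
        return CBOR_INVALID
    return None
```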

For each entry in signer_payload.referenced_assertions, the validator MUST verify that the same entry exists in either the created_assertions or gathered_assertions entry of the C2PA claim. (For version 1 claims, the entry must appear in the assertions entry.) The cawg.identity.assertion.mismatch error code SHALL be used to report violations of this rule.

The validator SHOULD verify that no entry in signer_payload.referenced_assertions is duplicated. The cawg.identity.assertion.duplicate error code SHALL be used to report violations of this rule.
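A possible way to detect duplicates is shown below. As a sketch, it assumes the decoded `signer_payload` is a Python dictionary and that two references count as duplicates when their `url`, `alg`, and `hash` values all match; the function name `check_duplicates` is hypothetical.

```python
def check_duplicates(signer_payload: dict) -> list:
    """Return failure codes for this rule; an empty list means it passed."""
    seen = set()
    for ref in signer_payload["referenced_assertions"]:
        # Assumption: equality of (url, alg, hash) defines a duplicate.
        key = (ref["url"], ref.get("alg"), bytes(ref["hash"]))
        if key in seen:
            return ["cawg.identity.assertion.duplicate"]
        seen.add(key)
    return []
```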

The validator MUST ensure that signer_payload.referenced_assertions contains at least one hard binding assertion as described in Section 9.2, “Hard bindings” of the C2PA technical specification. The cawg.identity.hard_binding_missing error code SHALL be used to report a missing hard binding assertion.
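One way to approximate this check is to inspect the label at the end of each referenced assertion URL. The sketch below assumes that hard binding assertions can be recognized by the label prefixes listed in `HARD_BINDING_LABEL_PREFIXES`; that list is an assumption made for illustration and should be confirmed against the C2PA technical specification. The helper names are hypothetical.

```python
# Assumption: label prefixes that identify hard binding assertions.
# Verify this list against Section 9.2 of the C2PA technical specification.
HARD_BINDING_LABEL_PREFIXES = (
    "c2pa.hash.data",
    "c2pa.hash.bmff",
    "c2pa.hash.boxes",
    "c2pa.hash.collection.data",
)


def assertion_label(url: str) -> str:
    # Take the final path segment of a JUMBF URI as the assertion label.
    return url.rsplit("/", 1)[-1]


def check_hard_binding(signer_payload: dict) -> list:
    """Return failure codes for this rule; an empty list means it passed."""
    for ref in signer_payload["referenced_assertions"]:
        if assertion_label(ref["url"]).startswith(HARD_BINDING_LABEL_PREFIXES):
            return []
    return ["cawg.identity.hard_binding_missing"]
```

Matching by prefix also accepts versioned or instance-suffixed labels, which may be broader than a production validator wants.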

The validator MUST maintain a list of valid sig_type values and corresponding code paths for the signature values that it is prepared to accept. Validators SHOULD be prepared to accept all signature types described in Section 8, “Credentials, signatures, and validation methods”. The cawg.identity.sig_type.unknown error code SHALL be used to report assertions that contain unrecognized sig_type values.
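The "list of valid sig_type values and corresponding code paths" could be modeled as a simple registry that maps each accepted `sig_type` to a handler. The following sketch is illustrative only: the handler signature, the registry name `SIG_TYPE_HANDLERS`, and the placeholder `validate_x509_cose` function are all assumptions, and the placeholder deliberately does no real cryptographic work.

```python
from typing import Callable, Dict, List

# Hypothetical handler shape: (signature bytes, serialized signer_payload)
# -> list of failure codes (empty when the signature is valid).
SignatureValidator = Callable[[bytes, bytes], List[str]]


def validate_x509_cose(signature: bytes, signer_payload_cbor: bytes) -> List[str]:
    # Placeholder: a real validator would perform the COSE validation
    # described in Section 8.2.2 here.
    raise NotImplementedError("COSE validation is outside this sketch")


SIG_TYPE_HANDLERS: Dict[str, SignatureValidator] = {
    "cawg.x509.cose": validate_x509_cose,
}


def dispatch_signature_check(
    sig_type: str,
    signature: bytes,
    signer_payload_cbor: bytes,
    handlers: Dict[str, SignatureValidator] = SIG_TYPE_HANDLERS,
) -> List[str]:
    handler = handlers.get(sig_type)
    if handler is None:
        return ["cawg.identity.sig_type.unknown"]
    return handler(signature, signer_payload_cbor)
```

Keeping the registry as data makes it straightforward to add signature types defined by later minor versions without touching the dispatch logic.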

The signature field of an identity assertion MUST contain a valid signature. The procedure for validating each signature type and corresponding status codes are described in Section 8, “Credentials, signatures, and validation methods”.

The pad1 and pad2 fields of an identity assertion MUST contain only zero-value (0x00) bytes. The cawg.identity.pad.invalid error code SHALL be used to report assertions that contain other values in these fields.
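The padding rule reduces to a byte scan. In this sketch the decoded assertion is assumed to be a Python dictionary whose `pad1` and `pad2` values are `bytes`, and a missing padding field is treated as acceptable here; whether a field may be absent is governed by the CBOR schema, not by this example. The function name is hypothetical.

```python
def check_padding(assertion: dict) -> list:
    """Return failure codes for this rule; an empty list means it passed."""
    for name in ("pad1", "pad2"):
        value = assertion.get(name)
        # Assumption: an absent padding field is not flagged by this check.
        if value is not None and any(value):
            # any() over bytes is True if at least one byte is non-zero.
            return ["cawg.identity.pad.invalid"]
    return []
```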

7.2. Status codes

The standard success and failure codes for identity assertion validation are defined below. They follow the format defined in Section 15.1, “Status codes,” of the C2PA technical specification.

The url field for a status code MUST always be the label of the identity assertion. It is omitted from the tables below.

7.2.1. Success codes

Value | Meaning
cawg.identity.validated | The identity assertion, including the referenced credentials and signature binding the credential holder to the C2PA claim, is validated.

7.2.2. Failure codes

Value | Meaning
cawg.identity.cbor.invalid | The CBOR of the identity assertion is not valid.
cawg.identity.assertion.mismatch | The identity assertion contains an assertion reference that could not be found in the C2PA claim.
cawg.identity.assertion.duplicate | The identity assertion contains one or more duplicate assertion references.
cawg.identity.hard_binding_missing | The identity assertion does not reference a hard binding assertion.
cawg.identity.sig_type.unknown | The sig_type of the identity assertion is not recognized.
cawg.identity.pad.invalid | The pad1 or pad2 field contains non-zero bytes.

Additional failure codes relating to specific signature types are defined in Section 8, “Credentials, signatures, and validation methods”.
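As an illustration of how these codes might be reported, the sketch below builds status entries whose `url` is the label of the identity assertion. The dictionary shape (`code`, `url`, optional `explanation`) is an assumption modeled on the C2PA status code format, and the function names are hypothetical. The success code is emitted only when no failure code, including any signature-specific code, was collected.

```python
from typing import List, Optional


def make_status(code: str, assertion_label: str,
                explanation: Optional[str] = None) -> dict:
    # The url field is always the label of the identity assertion.
    entry = {"code": code, "url": assertion_label}
    if explanation:
        entry["explanation"] = explanation
    return entry


def summarize(failure_codes: List[str], assertion_label: str) -> List[dict]:
    """Turn collected failure codes into status entries for one assertion."""
    if not failure_codes:
        return [make_status("cawg.identity.validated", assertion_label)]
    return [make_status(code, assertion_label) for code in failure_codes]
```

For example, `summarize([], "cawg.identity")` yields a single `cawg.identity.validated` entry, while a non-empty list yields one entry per failure.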

8. Credentials, signatures, and validation methods

The identity assertion allows multiple signature types to be represented, although only one ToIP verifiable identifier and corresponding signature can be used in any single assertion.

The signature type is represented by the sig_type field. Some credential types are described in this specification. It is strongly recommended that identity assertion validators be prepared to read all of the signature types described in this specification.

Other specifications MAY define additional sig_type values and the corresponding definition of signature with the understanding that some identity assertion validators may not be prepared to understand such assertions. Values of sig_type that begin with the prefix cawg. are reserved for use of the Creator Assertions Working Group and MUST NOT be used in any specification not produced by this group.

Credential types in minor version updates

Future minor version updates (1.1, 1.2, etc.) to this specification MAY:

  1. Add new sections to this specification defining new credential types and their corresponding sig_type values.

  2. Mark existing sections of this specification defining existing credential types and their corresponding sig_type values as deprecated.

Such updates to the specification SHOULD continue to use the cawg.identity assertion label.

8.1. W3C verifiable credentials

This portion of the specification is still undergoing significant exploration and revision. It will be added in a subsequent version.

8.2. X.509 certificates and COSE signatures

In some use cases, an actor in the system may wish to provide an X.509 certificate to have an organizational or individual identity described by the certificate associated with the list of referenced assertions.

The sig_type value for such an assertion MUST be cawg.x509.cose. The signature value MUST be a COSE signature as described below.

8.2.1. Generating the COSE signature

To generate the COSE signature for an identity assertion, the steps described in the listed sections of the C2PA technical specification MUST be followed with adaptations as described subsequently:

In each of the above sections, the following changes MUST be applied:

  • Any reference to the claim MUST be replaced with the CBOR serialization of the signer_payload field from the identity assertion. The signer_payload data structure MUST be serialized as described in Section 4.2.1, “Core Deterministic Encoding”, of RFC 8949: Concise Binary Object Representation. The resulting byte string is presented to the credential holder for signature.

  • Any reference to the claim generator MUST be replaced with the actor whose X.509 certificate is being used for this assertion.
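To make the serialization requirement concrete, the following sketch encodes a small subset of CBOR (integers, byte strings, text strings, arrays, and maps) using the core deterministic rules of RFC 8949, Section 4.2.1: shortest-form heads, definite lengths, and map keys sorted by the bytewise lexicographic order of their encodings. It is a teaching aid under those assumptions, not a complete encoder; a production implementation would normally rely on a vetted CBOR library. The function names are hypothetical.

```python
import struct


def _head(major: int, value: int) -> bytes:
    """Encode a CBOR head with the shortest possible argument."""
    if value < 24:
        return bytes([(major << 5) | value])
    if value < 1 << 8:
        return bytes([(major << 5) | 24, value])
    if value < 1 << 16:
        return bytes([(major << 5) | 25]) + struct.pack(">H", value)
    if value < 1 << 32:
        return bytes([(major << 5) | 26]) + struct.pack(">I", value)
    if value < 1 << 64:
        return bytes([(major << 5) | 27]) + struct.pack(">Q", value)
    raise ValueError("integer too large for this sketch")


def encode_deterministic(obj) -> bytes:
    """Deterministically encode ints, bytes, str, lists, and dicts."""
    if isinstance(obj, bool):
        raise TypeError("booleans are not handled by this sketch")
    if isinstance(obj, int):
        return _head(0, obj) if obj >= 0 else _head(1, -1 - obj)
    if isinstance(obj, (bytes, bytearray)):
        return _head(2, len(obj)) + bytes(obj)
    if isinstance(obj, str):
        raw = obj.encode("utf-8")
        return _head(3, len(raw)) + raw
    if isinstance(obj, (list, tuple)):
        return _head(4, len(obj)) + b"".join(encode_deterministic(x) for x in obj)
    if isinstance(obj, dict):
        pairs = [(encode_deterministic(k), encode_deterministic(v))
                 for k, v in obj.items()]
        # Sort by the encoded key bytes, compared bytewise.
        pairs.sort(key=lambda kv: kv[0])
        return _head(5, len(pairs)) + b"".join(k + v for k, v in pairs)
    raise TypeError(f"unsupported type: {type(obj).__name__}")
```

Because the key order depends only on the encoded bytes, two implementations that build the same `signer_payload` in different insertion orders produce the same byte string, which is what allows the signature to be verified later.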

8.2.2. Validating the COSE signature

To validate the COSE signature for an identity assertion, the steps described in the listed sections of the C2PA technical specification MUST be followed with adaptations as described:

In each of the above sections, the following changes MUST be applied:

  • Any reference to the claim MUST be replaced with the CBOR serialization of the signer_payload field from the identity assertion.

  • Any reference to the claim generator MUST be replaced with the actor whose X.509 certificate is being used for this assertion.

  • The validator SHALL maintain one or more trust lists, which need not be the same as the trust lists used to validate claim generator signatures as described in Section 14.4.1, “C2PA Signers,” of the C2PA technical specification.

9. Trust model

This section augments Section 14, “Trust model,” of the C2PA technical specification by adding additional trust signals related to the identity of actors involved in creation of a C2PA asset. It does not replace any portion of the C2PA trust model.

9.1. Digital trust introduction

This section is non-normative.

Digital trust is typically described using a trust triangle as shown in the following diagram:

Figure 3. Basic trust triangle

The three roles depicted can each be performed by a human, organization, machine, or some combination thereof. A credential holder establishes a relationship with a credential issuer. If the issuer trusts the credential holder, it will then issue a digital credential which makes statements about the credential subject (which may or may not be the same actor as the credential holder) and is signed by the issuer.

Later, the credential holder can present this credential to a verifier (also often known as a “relying party”). If the verifier has an existing trust relationship with the credential issuer, then the verifier can choose to trust the credential and the statements made within it.

This pattern can be repeated if there is not a direct trust relationship between the verifier and issuer. The issuer itself might have a credential that is issued by another issuer that is known to the verifier as shown in the following diagram:

Figure 4. Transitive trust triangle

In this scenario, issuer 2 is playing a dual role as credential holder and credential issuer. The verifier does not have a direct relationship with the issuer of the credentials that were presented (issuer 2). However, it can inspect issuer 2’s credentials and find that they were issued by issuer 1, with whom it does have a direct relationship. Based on the nature of that relationship, it may choose to extend transitive trust to issuer 2 and thus to the credential that issuer 2 issued.

Web browsers provide a well-known example of transitive trust. Browsers have direct relationships with relatively few root trust anchors. Those anchors, known as root certificate authorities, in turn issue credentials to certificate authorities who then issue credentials to individual web site operators who then sign the content presented to a browser. This pattern may be repeated with multiple layers of intermediate certificate authorities. The web browser evaluates this entire chain of credentials when deciding whether to present a web site as trusted or not.

9.2. Trust scenarios in identity assertion

The signature in an identity assertion could be considered a new credential documenting the relationship between the named actor and the C2PA asset. This section refers to it as an asset-specific credential.

“Content Credential” is a trademarked term that refers to C2PA Manifests and MUST NOT be used in reference to identity assertion signatures.

For each form of credential that an identity assertion consumer is prepared to accept, it SHOULD maintain:

  1. a list of trust relationships that it is prepared to accept when interpreting any identity assertion, and

  2. one or more mechanisms to check for credentials that have been revoked by the issuer.

There are a few possible relationships between the implementation of the identity assertion, named actor, and credential issuer, as documented in the following subsections.

The trust decisions described in the scenarios should only be evaluated once the identity assertion and the signature material within have been successfully validated as described in Section 7, “Validating the identity assertion”.

9.2.1. Named actor as issuer

In this scenario, the credential holder possesses a credential that describes the named actor and is provisioned with the ability to generate digital signatures on the named actor’s behalf.

This scenario is implicit in the X.509 certificate-based workflow as described in Section 8.2, “X.509 certificates and COSE signatures”. Other credential types MAY also follow this scenario.

The credential holder uses this signature authority directly to generate the asset-specific credential, as depicted in the following diagram, which is non-normative:

Figure 5. Named actor as issuer

In this scenario, the identity assertion consumer SHOULD make its trust decision based on the following predicates:

  1. Is there a direct trust relationship with the named actor? If so, the identity assertion SHOULD be treated as trusted.

  2. Is there a transitive trust relationship with the named actor via its credential issuer? (In other words, does the identity assertion consumer trust the credential issuer to issue valid signature credentials?)

    1. If so, has the credential issuer issued a revocation for the named actor’s credential? If so, the identity assertion SHOULD be treated as untrusted.

    2. If the transitive trust relationship exists and the credential has not been revoked, the identity assertion SHOULD be treated as trusted.

  3. If neither relationship can be demonstrated, the identity assertion SHOULD be treated as untrusted.

The direct trust relationship case is possible, but relatively uncommon.
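The predicates for this scenario can be expressed as a short decision function. The sketch below is one possible reading: the identifiers, the two trust sets, and the `is_revoked` callback are hypothetical stand-ins for whatever trust lists and revocation mechanisms a consumer maintains.

```python
from typing import Callable, Set


def decide_named_actor_as_issuer(
    actor_id: str,
    issuer_id: str,
    trusted_actors: Set[str],
    trusted_issuers: Set[str],
    is_revoked: Callable[[str, str], bool],
) -> bool:
    """Return True if the identity assertion should be treated as trusted."""
    # Predicate 1: direct trust relationship with the named actor.
    if actor_id in trusted_actors:
        return True
    # Predicate 2: transitive trust via the credential issuer,
    # provided the issuer has not revoked the actor's credential.
    if issuer_id in trusted_issuers:
        return not is_revoked(issuer_id, actor_id)
    # Predicate 3: neither relationship can be demonstrated.
    return False
```

Note that, following the order of the predicates, a directly trusted actor is accepted without a revocation lookup; a consumer may choose to be stricter.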

9.2.2. Named actor without signature authority

In this scenario, the credential holder possesses a credential that describes the named actor but does not have the ability to generate digital signatures on the named actor’s behalf.

In this scenario, the hardware or software implementation that is generating the identity assertion MAY request a summary of the named actor’s credential from the credential holder, and gather that information into the identity assertion, which it will then sign using its own credentials.

Example using W3C verifiable credentials

This example, which is non-normative, depicts a possible workflow for this scenario. In this scenario, the credential holder wishes to use a W3C verifiable credential held in a wallet to generate an identity assertion on behalf of the credential’s named actor.

In this example, the wallet is prepared to selectively disclose portions of the credential via W3C verifiable presentation, but can neither reveal the entire credential nor issue other forms of signature.

sequenceDiagram
    participant G as C2PA claim generator
    participant W as Wallet
    participant H as Credential holder
    Note right of G: Create signer_payload
    G->>W: Presentation request including signer_payload
    W->>H: Request consent for presentation
    H->>W: Consent granted
    W->>G: Verifiable presentation
    Note right of G: Generate new asset-specific credential using VP content

In this scenario, the issuer of the asset-specific credential is not the credential holder but the actor that is generating the identity assertion, as depicted in the following diagram, which is non-normative:

Figure 6. Named actor without signature authority

In this scenario, the identity assertion consumer SHOULD make its trust decision based on the following predicates:

  1. Does the identity assertion consumer trust the identity assertion generator to request a credential summary from the credential holder and accurately reflect that credential summary into the identity assertion?

    1. Is there a direct trust relationship with the identity assertion generator? If so, proceed to step 2.

    2. Is there a transitive trust relationship with the identity assertion generator via its credential issuer? (In other words, does the identity assertion consumer trust the identity assertion generator’s credential issuer to issue valid signature credentials?)

    3. If so, has the credential issuer issued a revocation for the identity assertion generator’s credential? If so, do not proceed. The identity assertion SHOULD be treated as untrusted.

    4. If the transitive trust relationship exists and the credential has not been revoked, proceed to step 2.

    5. If neither relationship can be demonstrated, do not proceed. The identity assertion SHOULD be treated as untrusted.

  2. Does the identity assertion consumer trust the named actor’s credential issuer to issue valid credentials?

    1. Is there a direct trust relationship with the named actor’s credential issuer? If so, proceed to step 3.

    2. Is there a transitive trust relationship with the named actor’s credential issuer via its credential issuer? (In other words, does the identity assertion consumer trust the named actor’s credential issuer to issue valid credentials?) If so, proceed to step 3.

    3. If neither relationship can be demonstrated, do not proceed. The identity assertion SHOULD be treated as untrusted.

  3. Has the credential issuer issued a revocation for the named actor’s credential?

    1. If so, the identity assertion SHOULD be treated as untrusted.

    2. If no such revocation has been issued, the identity assertion SHOULD be treated as trusted.
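The three-step evaluation above might be sketched as follows. All names are hypothetical: `anchors` stands for the set of parties the consumer trusts directly, `actor_issuer_parent` for the issuer of the named actor's credential issuer, and `is_revoked` for the consumer's revocation mechanism. Modeling every trust relationship as membership in a single set is a simplification made for brevity.

```python
from typing import Callable, Optional, Set

RevocationCheck = Callable[[str, str], bool]


def trusted_directly_or_via_issuer(
    party: str,
    issuer: Optional[str],
    anchors: Set[str],
    is_revoked: Optional[RevocationCheck] = None,
) -> bool:
    if party in anchors:  # direct trust relationship
        return True
    if issuer is not None and issuer in anchors:  # transitive trust
        return not (is_revoked is not None and is_revoked(issuer, party))
    return False


def decide_without_signature_authority(
    generator: str,
    generator_issuer: Optional[str],
    actor: str,
    actor_issuer: str,
    actor_issuer_parent: Optional[str],
    anchors: Set[str],
    is_revoked: RevocationCheck,
) -> bool:
    """Return True if the identity assertion should be treated as trusted."""
    # Step 1: trust in the identity assertion generator, with revocation.
    if not trusted_directly_or_via_issuer(
            generator, generator_issuer, anchors, is_revoked):
        return False
    # Step 2: trust in the named actor's credential issuer. The text lists
    # no revocation check for this step, so none is applied here.
    if not trusted_directly_or_via_issuer(
            actor_issuer, actor_issuer_parent, anchors):
        return False
    # Step 3: the named actor's credential must not have been revoked.
    return not is_revoked(actor_issuer, actor)
```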

9.3. Threats to trust model

This section is non-normative.

This section enumerates a number of potential attacks on the identity assertion trust model. If concrete guidance to mitigate or prevent a specific attack is available, that guidance should be incorporated as specific normative requirements elsewhere in this specification and referenced here.

9.3.1. Replay attacks

An attacker could, in theory, extract an identity assertion and place it into a new C2PA Manifest as part of an unrelated C2PA asset.

To prevent this attack, the signer_payload.referenced_assertions field of a valid identity assertion must include a hard binding assertion that properly describes the C2PA asset. A compliant identity assertion consumer should detect that the hard binding assertion referenced by the original identity assertion does not match the attacker’s C2PA asset and fail validation.

9.3.2. Name collisions

  • How do we differentiate between two John Does registered to the same identity provider?

  • How do we differentiate between two John Does registered to different identity providers?

  • How do we differentiate between anonymous reporters who need to remain anonymous at a technical level?

  • How do we differentiate between anonymous reporters in the UI? If we allow them to specify what gets displayed for their ID, then that opens the door for impersonation attacks. As an example, consider an "anonymous" user who asks to be identified as “Barack Obama” to the end-user.

  • Is the end user required to memorize serial numbers, cryptographic identifiers, certificate chains, and the like? Do end users need to keep a “phone book” of serial numbers that they trust?

Whether a name collision is intentional or coincidental, careful attention should be paid to how to gather the technical details needed to differentiate distinct actors and to meaningfully expose that differentiation in the user experience.

TO DO (add new GH issue): Think through identity presentation so as to provide meaningful differentiation between similarly-named actors.

9.3.3. Parsing and validation errors

Any content, including but not limited to the named actor’s identity, could be subject to a number of parsing or validation attacks:

  • Injection of code (HTML, JavaScript, etc.) into a text field so that the attacker can attempt to control what is displayed to the end user. Does the specification support markup in the text fields? Should all fields be considered Unicode strings?

  • Text fields of excessive length: These can cause buffer overflows or could be an attempt to "push" trusted UI indicators out of the rendered view of the user. Should the specification place an upper bound on the length of given fields?

  • Injection of special characters: These can be truncation attacks. For instance, if the UI parser is written in C, then an attacker might try to inject a null byte to cause discrepancies in the code about what should be displayed. Are there any special characters necessitated by the specification that need to be escaped before being placed into the text field of an assertion?

TO DO (add new GH issue): Refine discussion of above items and add specific guidance.

9.3.4. Homoglyph and typo-squatting attacks

An attacker could use visually similar Unicode characters to mislead an end user into accepting a mistaken assertion of identity on behalf of a specific named actor. Such attacks are common in phishing and impersonation attacks conducted on domain names and social media.

TO DO (add new GH issue): Discuss and add guidance.

9.3.5. Validation of credential status

  • Does the identity assertion consumer need to do real-time checks to ensure that the named actor’s credential is still considered valid? What harms or risks may accrue to the identity assertion consumer in the process of making online inquiries about such status?

  • If a bad actor makes it into the system, then how does that ID get blocked? Is there an OCSP responder, CRL, bad actor database, etc.? If so, what is the governance for such a list and who maintains it?

  • What if the bad actor is in an ingredient and not the primary piece of content?

  • What is displayed to the end-user in both the primary content and ingredient scenarios?

TO DO (add new GH issue): Discuss and add guidance.

9.3.6. Compromise of private key material

In practice, the credential holder’s signing keys will be issued to systems that perform identity assertion signing operations. These systems may make these operations available to end users and/or be deployed to user-owned platforms (e.g., mobile phones). Issuance or disclosure of signing keys to malicious actors enables attackers to create claim signatures on arbitrary assets using the compromised identity. The resulting identity assertions will be valid in terms of the identity assertion specification, but effectively allow for spoofing identity.

It is therefore important that systems that manage identity assertion signing keys adhere to security and key management best practices. This includes leveraging platform-specific features (e.g., hardware security modules and cloud key management services), minimizing key reuse, and revoking keys when compromise is suspected. For more information on key management, see the NIST Key Management Guidelines.

Some identity assertion generation and signing systems may be exposed to untrusted users. Exploitation or misuse of these systems may allow attackers to create identity assertion signatures on arbitrary assets using identities provided by the system. The resulting identity assertions will be valid in terms of the identity assertion specification, but effectively allow for spoofing identity. The impact of such an attack may be amplified if identities are shared between users, and/or if the attack goes undetected for an extended period of time.

Identity assertion generation and signing systems should consider industry best practices for information security, secure development and operation, and anti-abuse practices, including leveraging available platform-specific features for deployment (e.g., Android SafetyNet, Apple DeviceCheck and App Attest).

9.3.7. Tampering with identity assertion

TO DO: Write new section describing potential attacks on the content of the identity assertion itself. Signature within IA should protect against signer_payload modifications and C2PA hash-link references should protect against substitution of a new identity assertion.

9.3.8. Re-signing by an adversarial claim generator

TO DO: Update following discussion based on #95: Security fixes to identity map and subsequent related discussion in #97: Create a CAWG identity threat model.

Appendix A: Version history

This section is non-normative.

27 November 2023

  • Initial private review draft.

29 November 2023

  • Added this version history section.

30 November 2023

  • Minor proofreading edits.

01 December 2023

02 December 2023

03 December 2023

  • Add a TO DO item in Section 9, “Trust model”.

  • Proofreading: Correct references to distributed identifiers to decentralized identifiers.

  • Incorporate additional review feedback.

13 December 2023

15 December 2023

18 December 2023

20 December 2023

11 January 2024

  • Minor edits throughout the document based on feedback received over past few weeks.

24 January 2024

02 February 2024

04 February 2024

05 February 2024

  • Refined description of roles in governance.md in project repository.

19 February 2024

20 February 2024

  • Promoted from pre-draft to draft status.

26 February 2024

28 February 2024

  • Prepare 1.0-draft version.

  • Remove discussion of W3C VCs. (This section will be restored in a post-1.0 version.)

18 March 2024

  • Remove user experience section. (This section will be restored in a post-1.0 version.)

  • Remove W3C VC concepts from terms and definitions section. (This section will be restored in a post-1.0 version.)

  • Clarify usage of credential holder versus credential subject.

19 March 2024

  • Close open issue regarding EKU requirements for X.509 credentials.

  • Clarify wording regarding prohibition on identity assertion self-references.

25 March 2024

  • Create a top-level tbs map which contains referenced_assertions and may be extended to include other material which will be signed by the credential holder.

  • Add language stating that this assertion is not intended to convey ownership of a C2PA asset.

  • Clarify wording about zero-fill bytes in pad1 and pad2 fields.

  • Add requirement on validator to report duplicate assertion references if found.

01 April 2024

  • Change validation language to be more permissive of extra fields in CBOR map data structure.

08 April 2024

  • Rename tbs (to be signed) to signer_payload.

  • Change sig_type value for X.509 to cawg.x509.cose.

  • Reserve sig_type values starting with cawg. for future CAWG specifications.

  • State that future versions of this specification may add new sig_type values without breaking the identity assertion format or requiring a major version change.

29 April 2024

Pending merge