Architecture

Overview

Witness's design is that of a Web2 service with a minimal Web3 component that gives it superpowers. The Web2 portions handle connecting to many clients and batch-merklizing their data as efficiently as possible. The Web3 component creates a publicly accessible anchor of trust for potentially untrusting clients and consumers to coordinate on. With this combination, Witness is able to provide Web3 utility at Web2 scale.

When comparing this architecture to other blockchain layers and resources, we describe Witness as a Data Existence Layer. It solely extends verifiability and existence guarantees from existing blockchains.

Data Existence

Components

To reason about Witness, consider a few components:

Client: a user or application that wants to produce provenance of digital data
Witness server: an API server that accepts hashes from clients and adds them to an ever-growing merkle tree, to be checkpointed and proven against
Witness contract: a smart contract on Ethereum that stores Witness merkle roots and provides proof verification
Consumer: a user or application that wants to verify the provenance of digital data

With these pieces, we can go through the flows of producing and consuming provenance.

Producing Provenance

Clients that produce provenance may be applications or users. For example, a user may want to prove that they were the first to create a piece of digital content, or an application may want to record an interaction at a point in time. In either case, the client will need to produce a hash of the data they want to prove provenance for.
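As a minimal sketch of this first step, the client can hash its data locally so that only the digest ever leaves the machine. The choice of SHA-256 and the helper name hash_payload are assumptions made for illustration; the hash function Witness expects is not specified in this section.

```python
import hashlib


def hash_payload(data: bytes) -> bytes:
    """Return a 32-byte digest of the data. Only this digest is shared."""
    # Assumption: SHA-256 stands in for whatever hash function is actually used.
    return hashlib.sha256(data).digest()


# Illustrative payload; the raw bytes never need to be sent anywhere.
digest = hash_payload(b"my original artwork, v1")
print(digest.hex())
```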

With this hash, they can initiate the following steps which end with them receiving a proof of its provenance:

1. Client submits hash to Witness's server

The client submits the hash to Witness's server. The server queues the hash, immediately returning a sequence ID to the client. The queue of hashes is frequently flushed and merklized in a batch, in anticipation of the next step.
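The queue-and-flush behaviour described above can be sketched as follows. This is a simplified illustration, not the server's implementation: HashQueue, submit, flush, and merkle_root are hypothetical names, SHA-256 is an assumed hash function, odd levels are padded by duplicating the last node (one common convention among several), and each batch is merklized on its own rather than appended to an ever-growing tree.

```python
import hashlib
from typing import List


def _pair(left: bytes, right: bytes) -> bytes:
    # Assumption: a parent node is SHA-256 over the concatenated children.
    return hashlib.sha256(left + right).digest()


def merkle_root(leaves: List[bytes]) -> bytes:
    """Fold a batch of leaf hashes into a single 32-byte root."""
    if not leaves:
        raise ValueError("cannot merklize an empty batch")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # pad odd levels by duplicating the last node
        level = [_pair(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


class HashQueue:
    """Hypothetical in-memory queue: accept hashes now, merklize them later."""

    def __init__(self) -> None:
        self._next_id = 0
        self._pending: List[bytes] = []

    def submit(self, leaf: bytes) -> int:
        """Queue a 32-byte hash and immediately return its sequence ID."""
        if len(leaf) != 32:
            raise ValueError("expected a 32-byte hash")
        seq = self._next_id
        self._next_id += 1
        self._pending.append(leaf)
        return seq

    def flush(self) -> bytes:
        """Empty the queue and return the merkle root of the flushed batch."""
        batch, self._pending = self._pending, []
        return merkle_root(batch)


queue = HashQueue()
ids = [queue.submit(hashlib.sha256(bytes([i])).digest()) for i in range(3)]
root = queue.flush()
print(ids, root.hex())
```

Returning the sequence ID before any merklization happens is what lets the client-facing path stay fast; the expensive work is deferred to the flush.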

2. Witness's server submits merkle root to Ethereum

After flushing and merklizing the queue of hashes accumulated from clients, the server submits the resulting merkle root to the Witness smart contract on Ethereum. The Witness smart contract stores the merkle root, creating a public record of its existence at a given point in time. Once the transaction is settled, clients have an established "checkpoint" covering their hash.

3. Server distributes proof

Once a merkle root "checkpoint" is settled, the server can distribute a proof to the client that verifies the hash's inclusion. In combination with the onchain record of that root's submission, the proof can be used to verify the hash's provenance.
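To make the idea of an inclusion proof concrete, here is a small sketch of building and checking one. The proof layout (a list of sibling hashes, each flagged as left or right), the names build_proof and verify_proof, the SHA-256 hash, and the duplicate-last-node padding are all assumptions for illustration; the actual Witness proof format may differ.

```python
import hashlib
from typing import List, Tuple

# Each step is (sibling hash, True if the sibling sits to the right of our node).
Proof = List[Tuple[bytes, bool]]


def _pair(left: bytes, right: bytes) -> bytes:
    # Assumption: a parent node is SHA-256 over the concatenated children.
    return hashlib.sha256(left + right).digest()


def build_proof(leaves: List[bytes], index: int) -> Tuple[bytes, Proof]:
    """Return (root, proof) showing that leaves[index] is part of the tree."""
    if not 0 <= index < len(leaves):
        raise IndexError("leaf index out of range")
    level = list(leaves)
    proof: Proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # pad odd levels by duplicating the last node
        sibling = index ^ 1  # the neighbour sharing our parent
        proof.append((level[sibling], sibling > index))
        level = [_pair(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify_proof(leaf: bytes, proof: Proof, root: bytes) -> bool:
    """Recompute the path from leaf to root and compare with the expected root."""
    node = leaf
    for sibling, sibling_is_right in proof:
        node = _pair(node, sibling) if sibling_is_right else _pair(sibling, node)
    return node == root


leaves = [hashlib.sha256(bytes([i])).digest() for i in range(5)]
root, proof = build_proof(leaves, 2)
print(verify_proof(leaves[2], proof, root))
```

The proof grows with the logarithm of the batch size, so even very large batches yield proofs of only a few dozen sibling hashes.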

Consuming Provenance

The consumer may be an application that wants to verify the provenance of a piece of content before displaying it to users, or a user that wants to verify the provenance of a piece of content before purchasing it. The consumer may even be a smart contract on Ethereum looking to verify provenance of submitted data before acting upon it.

The consumer is given a hash of the data they want to verify. With this hash, they can initiate the following steps:

1. Consumer fetches the proof from the Witness Server

For convenience, the Witness server will serve proofs indexed by hash. However, it's worth noting that the proof could also be provided by any party, including the owner of the hash.

2. Consumer verifies the proof against the Witness Contract

When verifying a proof, the consumer is essentially verifying that the hash is included in a merkle tree. The Witness contract is used as a bulletin board for merkle roots, so the merkle root behind any proof can be identified as existing at a given point in time. Onchain consumers have the option of verifying their proof directly against the Witness contract, creating provenance-aware contracts.
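The "bulletin board" check can be sketched offchain with an in-memory stand-in for the contract. Everything here is hypothetical: MockRootRegistry, record, timestamp_of, and provenance_timestamp are illustrative names rather than the Witness contract's interface, and the proof layout and SHA-256 hashing are assumptions.

```python
import hashlib
from typing import Dict, List, Optional, Tuple

# Each step is (sibling hash, True if the sibling sits to the right of our node).
Proof = List[Tuple[bytes, bool]]


def root_from_proof(leaf: bytes, proof: Proof) -> bytes:
    """Recompute the merkle root implied by a leaf and its proof."""
    node = leaf
    for sibling, sibling_is_right in proof:
        pair = node + sibling if sibling_is_right else sibling + node
        node = hashlib.sha256(pair).digest()  # assumption: SHA-256 parent hashing
    return node


class MockRootRegistry:
    """In-memory stand-in for a contract that stores roots with timestamps."""

    def __init__(self) -> None:
        self._roots: Dict[bytes, int] = {}

    def record(self, root: bytes, timestamp: int) -> None:
        # Keep the earliest timestamp if the same root is recorded twice.
        self._roots.setdefault(root, timestamp)

    def timestamp_of(self, root: bytes) -> Optional[int]:
        return self._roots.get(root)


def provenance_timestamp(
    leaf: bytes, proof: Proof, registry: MockRootRegistry
) -> Optional[int]:
    """Return when the implied root was recorded, or None if it is unknown."""
    return registry.timestamp_of(root_from_proof(leaf, proof))


a = hashlib.sha256(b"a").digest()
b = hashlib.sha256(b"b").digest()
registry = MockRootRegistry()
registry.record(hashlib.sha256(a + b).digest(), 1_700_000_000)  # illustrative time
print(provenance_timestamp(a, [(b, True)], registry))
```

Because the consumer recomputes the root itself, it does not matter who supplied the proof; a forged proof simply produces a root the registry has never seen.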

Properties

Web2 scale, Web3 Utility

At a high level, Witness does the following:

batch-merklize hashes submitted by clients
submit the merkle root onto the chain
serve proofs for the hashes

Submitting the root onto the chain is the only step that involves a Web3-metered resource, and its cost is amortized across all the hashes in the batch. Merklizing and serving proofs are fairly straightforward operations and involve purely Web2-metered resources. By divorcing the amount of utility gained from the amount of gas paid, Witness can scale to Web2 levels of throughput, while still providing Web3 utility.
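The amortization argument reduces to simple division, sketched below. The gas figure is a made-up illustrative number, not a measurement of the Witness contract, and gas_per_hash is a hypothetical helper.

```python
def gas_per_hash(gas_per_root_submission: int, batch_size: int) -> float:
    """Onchain cost attributable to each hash when one root covers a batch."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return gas_per_root_submission / batch_size


# Illustrative only: assume a root submission costs 50,000 gas.
for size in (1, 100, 10_000):
    print(size, gas_per_hash(50_000, size))
```

The onchain cost per submission stays fixed while the batch grows, so the per-hash share shrinks in proportion to the batch size.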

Private data, publicly verifiable provenance

Witness's design allows for the production and consumption of provenance for private data. This is because the data itself is never exposed to any parties, only its hash. This gives ultimate power to the creator of the data, who can choose to reveal the data at any point in the future, while still being able to prove its provenance in private or ZK settings.

Favorable trust boundaries

While the current design of Witness involves a server as a privileged actor, the degree to which it can misbehave (and therefore the amount a client must trust it) is minimal. Consider a malicious server: because it doesn't know the underlying data of any received hash, it can't effectively "frontrun". Further, because the hash reveals no information about the underlying data, the only grounds upon which things like censorship or ordering unfairness can occur are other sources of information, such as who the submitter is.

As a client attempting to produce provenance, you have to trust the server to conduct the above flow and ultimately distribute a proof to you. While this is less than ideal compared to a truly trustless option, the flow will typically take less than 30 seconds, and once a client receives their proof they no longer have to rely on the server for anything.

A consumer of provenance needs only a single party to serve them the proof for the data whose provenance they want to verify. The Witness server will serve proofs for convenience, but any party, including the party serving the data hash, can include the proof data for the consumer.

Overall, there is a small trust component to being "included" in a Witness update and receiving your proof. In exchange, the underlying data stays private, and clients end up with a trustless proof akin to having submitted the hash directly to the chain, but at zero cost and within 30 minutes. As an escape hatch, users are always able to bail out and submit their hash directly to the underlying chain to capture its provenance.

Schema agnostic

Witness is schema agnostic, meaning it can be used to produce and consume provenance for any type of data. This is because Witness only deals with hashes of data, which are always a uniform 32 bytes. A simple schema may involve just the hash of the data payload, while a more complex one may involve signatures from corresponding private keys, or backlinks to previous provenance hashes.

This means Witness can capture the provenance of anything from large files to small incremental datapoints along a supply chain.
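As a rough illustration of schema agnosticism, the two sketches below both reduce to a uniform 32-byte value that could be submitted in the same way. The function names, the use of SHA-256, and the particular backlink encoding (previous hash concatenated with the payload hash) are assumptions chosen for the example, not a schema defined by Witness.

```python
import hashlib
from typing import Optional


def simple_leaf(payload: bytes) -> bytes:
    """Simplest schema: the leaf is just the hash of the data payload."""
    return hashlib.sha256(payload).digest()


def chained_leaf(payload: bytes, previous_leaf: Optional[bytes] = None) -> bytes:
    """Hypothetical backlinked schema: commit to the payload and the prior leaf."""
    # Assumption: the first entry in a chain links to 32 zero bytes.
    prev = previous_leaf if previous_leaf is not None else bytes(32)
    return hashlib.sha256(prev + hashlib.sha256(payload).digest()).digest()


# A large file and a chain of small supply-chain datapoints both yield 32 bytes.
file_leaf = simple_leaf(b"\x00" * 1_000_000)
step1 = chained_leaf(b"harvested")
step2 = chained_leaf(b"shipped", previous_leaf=step1)
print(len(file_leaf), len(step1), len(step2))
```

Since each chained leaf commits to the one before it, proving the provenance of the latest entry also fixes the order of everything it links back to.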

Provenance Usecases