Gateway

Winterhuman edited this page Apr 14, 2022 · 52 revisions

IPNS-Link Gateways, like IPFS Gateways, act as a bridge between IPFS and HTTP. Their purpose is to take the libp2p connection, formed between an internal go-ipfs node and the Listener of an Exposer, and use that connection to serve a website over a path or subdomain.

Components

Internal IPFS node

This is an internal go-ipfs or js-ipfs node that runs within every IPNS-Link Gateway. This node is responsible for the following:

  • Resolving the OriginIDs of Manifests.
  • Decrypting the ciphertext within the Manifest to find the Listener's multiaddresses.
  • Establishing a connection to Listener nodes and then peering with them.
  • Forwarding the ports given by p2p-proxy over the libp2p connection to the Webserver.
  • Caching and serving files found under {cache,stream}_path on behalf of the Listener.

However, unlike a normal IPFS Gateway, this internal IPFS node is not responsible for caching all /ipfs/* and /ipns/* addresses. Caching the content of the regular IPFS network on top of the content IPNS-Link Gateways already cache would dramatically increase the amount of disk space required by Gateway operators. Instead, the Gateway will do a 303 redirect to a random Public IPFS Gateway, such as https://ipfs.io or https://dweb.link, as governed by the following rules:

dnslink.ipns.gateway.tld

  • This address is resolved by the internal IPFS node.
  • Once the DNSLink is resolved, resolve the /ipfs/cid or /ipns/oid address it points to.
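The second step above depends on reading the /ipfs/cid or /ipns/oid address out of the DNSLink record. A minimal sketch of that parsing step is shown below, assuming the TXT record value has already been fetched and uses the usual `dnslink=/ipfs/...` or `dnslink=/ipns/...` form. The function name `parse_dnslink` is hypothetical and not part of IPNS-Link.

```python
from typing import Optional, Tuple


def parse_dnslink(txt_record: str) -> Optional[Tuple[str, str]]:
    """Extract (namespace, identifier) from a DNSLink TXT value.

    Hypothetical helper: returns None when the value is not a DNSLink
    entry pointing at an /ipfs/ or /ipns/ address.
    """
    prefix = "dnslink="
    if not txt_record.startswith(prefix):
        return None
    # "/ipfs/cid/sub/path" -> ["", "ipfs", "cid", "sub/path"]
    parts = txt_record[len(prefix):].split("/", 3)
    if len(parts) < 3 or parts[0] != "":
        return None
    namespace, identifier = parts[1], parts[2]
    if namespace not in ("ipfs", "ipns") or not identifier:
        return None
    return namespace, identifier
```

The returned pair tells the Gateway whether to continue with the cid rules or the oid/key rules described in the following sections.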

oid.ipns.gateway.tld

  • This address is resolved by the internal IPFS node. However, it is only resolved if oid.json is more than 24 hours old, or if the current multiaddresses in oid.json don't resolve.
  • If the IPNS record we retrieve points to the same CID that /man-cache/oid.man has, read from /json-cache/oid.json.
  • If the IPNS record we retrieve points to a new CID, but oid.man already exists, overwrite oid.man with the new Manifest file and rewrite oid.json with the JSON within the new ciphertext.
  • If the IPNS record we retrieve points to a new CID, and no oid.man file exists for it, move on to the inline check.
  • If the IPNS record we retrieve does not have an inline file, treat it as key.ipns.gateway.tld.
  • If the IPNS record we retrieve does have an inline file, fetch the inline file and move on to the <!--IPNS-Link-- comment block check.
  • If <!--IPNS-Link-- is present, then it's a Manifest. Save it to /man-cache/oid.man in the internal IPFS node. Simultaneously, decrypt the ciphertext in the Manifest and save the JSON to /json-cache/oid.json in the Fileserver, so that /man-cache/oid.man doesn't need to be decrypted every time the JSON is needed.
  • If <!--IPNS-Link-- is not present, treat it as key.ipns.gateway.tld.
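The branching in the rules above can be summarised as a small decision function. This is only a sketch of the decision order as described here, not the Gateway's actual code: the function name, its parameters, and the returned step labels are all hypothetical, and fetching, decrypting, and writing files are left out.

```python
from typing import Optional


def oid_next_step(
    record_cid: str,
    cached_manifest_cid: Optional[str],
    has_inline_file: bool,
    has_marker: bool,
) -> str:
    """Pick the next action after the IPNS record for an oid is retrieved.

    cached_manifest_cid is the CID that /man-cache/oid.man was saved from,
    or None when no oid.man exists yet (hypothetical representation).
    has_marker says whether the inline file contains <!--IPNS-Link--.
    """
    if cached_manifest_cid is not None:
        if record_cid == cached_manifest_cid:
            # Same Manifest as before: the decrypted JSON is still valid.
            return "read-json-cache"
        # New CID but oid.man exists: replace oid.man and rewrite oid.json.
        return "overwrite-manifest"
    # No oid.man yet: run the inline check, then the comment block check.
    if not has_inline_file:
        return "treat-as-key"
    if has_marker:
        return "save-manifest"
    return "treat-as-key"
```

Keeping the decision separate from the I/O makes each branch easy to check against the list above.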

key.ipns.gateway.tld and cid.ipfs.gateway.tld

  • If key or cid is cached in the internal IPFS node, then the internal IPFS node will resolve them.
  • If key or cid is not cached in the internal IPFS node, do a 303 redirect to a random Public IPFS Gateway.
  • The only way key and cid can be present in the internal IPFS node's cache is if they were added by the node itself from the {cache,stream}_path of all connected Origins.
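As a rough illustration of the cached-or-redirect rule above, the sketch below serves locally cached identifiers and otherwise builds a 303 redirect to a randomly chosen Public IPFS Gateway. The gateway list comes from the examples on this page. The path-style redirect URL, the `route_key_or_cid` name, and the in-memory `cached` set are assumptions made for the example.

```python
import random
from typing import Optional, Set, Tuple

# Example Public IPFS Gateways named on this page.
PUBLIC_GATEWAYS = ["https://ipfs.io", "https://dweb.link"]


def route_key_or_cid(
    namespace: str,
    identifier: str,
    path: str,
    cached: Set[str],
    rng: Optional[random.Random] = None,
) -> Tuple[str, Optional[int], Optional[str]]:
    """Return ("internal", None, None) or ("redirect", 303, location).

    namespace is "ipfs" or "ipns"; cached is a hypothetical stand-in for
    the set of keys and CIDs held by the internal IPFS node.
    """
    if identifier in cached:
        return ("internal", None, None)
    chooser = rng if rng is not None else random
    base = chooser.choice(PUBLIC_GATEWAYS)
    # Assumption: the public gateway is addressed in path form.
    return ("redirect", 303, f"{base}/{namespace}/{identifier}{path}")
```

Passing a seeded `random.Random` instance as `rng` makes the choice reproducible in tests.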

oid.ipns.gateway.tld/{cache,stream}_path

  • This address is resolved by the internal IPFS node.
  • stream_path and cache_path are directories at the Origin. However, they can never be accessed directly.
  • The Listener has already cached every file and directory under these paths within itself. The Gateway will redirect all requests for these files and directories to the /{ipfs,ipns}/ addresses listed in oid.json. cache and stream content is stored in the Listener at first, but once it is resolved, the internal IPFS node will cache these files and directories until they are no longer listed.
  • However, cache and stream addresses are not treated equally by the internal IPFS node.
  • When cache_path is requested, the CID is resolved and cached once. This CID is kept until it is no longer present in any oid.json file.
  • When stream_path is requested, stream_key is resolved continuously to watch for changes. New CIDs added under the key are resolved and cached until those CIDs are no longer under any key in any oid.json file.
  • For stream_key, the Gateway will read from oid.json and look for /ipns/stream_key addresses in the JSON. The internal IPFS node will resolve all stream_key addresses and will then cache the content if it isn't cached already. When the user requests oid.ipns.gateway.tld/stream_path, the user will be served the content at /ipns/stream_key instead of the content at the Origin.
  • For cache_cid, the Gateway will read from oid.json and look for /ipfs/cache_cid addresses in the JSON. The internal IPFS node will resolve all cache_cid addresses and will then cache the content if it isn't cached already. When the user requests oid.ipns.gateway.tld/cache_path, the user will be served the content at /ipfs/cache_cid instead of the content at the Origin.
  • Because of how IPFS works, requests for stream_key and cache_cid will be sent to all known peers of the internal IPFS node. This includes Listeners as well as peers in the wider network. However, since Listeners are peered with the Gateway, they should always be the first to respond and serve the content. If they don't, the request will either be served by a random IPFS node or time out.
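The retention rule above (content is kept only while some oid.json still lists it) amounts to a set difference. The sketch below assumes the CIDs referenced by each oid.json have already been collected into a set per OriginID. The layout of oid.json is not specified on this page, so `referenced_by_oid` and the function name are purely illustrative.

```python
from typing import Dict, Set


def cids_to_drop(pinned: Set[str], referenced_by_oid: Dict[str, Set[str]]) -> Set[str]:
    """Return cached CIDs that no oid.json lists any more.

    pinned: CIDs currently cached by the internal IPFS node.
    referenced_by_oid: hypothetical mapping from each OriginID to the
    CIDs its oid.json currently lists (cache CIDs and CIDs resolved
    from stream keys).
    """
    still_listed: Set[str] = set().union(*referenced_by_oid.values())
    return pinned - still_listed
```

A CID shared by several Origins stays cached until the last oid.json that lists it drops it, which matches the "any oid.json" wording above.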

gateway.tld/{ipfs,ipns}/*

  • If this address is given, do a 307 redirect to its subdomain equivalent.
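A minimal sketch of the path-to-subdomain rewrite follows. It assumes the redirect target uses https and that the identifier can be placed into the subdomain unchanged. Any re-encoding a real deployment might need for DNS-safe labels is outside the scope of this example, and the function name is hypothetical.

```python
from typing import Optional, Tuple


def subdomain_redirect(host: str, path: str) -> Optional[Tuple[int, str]]:
    """Map gateway.tld/{ipfs,ipns}/id/rest to id.{ipfs,ipns}.gateway.tld/rest.

    Returns (307, location), or None when the path is not an
    /ipfs/ or /ipns/ address.
    """
    # "/ipfs/cid/a/b.txt" -> ["", "ipfs", "cid", "a/b.txt"]
    parts = path.split("/", 3)
    if len(parts) < 3 or parts[1] not in ("ipfs", "ipns") or not parts[2]:
        return None
    rest = "/" + parts[3] if len(parts) > 3 else "/"
    # Assumption: the subdomain form is always served over https.
    return 307, f"https://{parts[2]}.{parts[1]}.{host}{rest}"
```

For example, `subdomain_redirect("gateway.tld", "/ipfs/cid/a/b.txt")` yields a 307 to `https://cid.ipfs.gateway.tld/a/b.txt`.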

gateway.tld/* (with the exception of gateway.tld/{ipfs,ipns}/*)

  • This address is resolved by the Webserver.
  • The Webserver will check with the Fileserver whether the requested files or directories are allowed to be exposed. For instance, /json-cache/* should never be exposed to the public.
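One way to picture the exposure check is a prefix denylist applied to the normalised request path. This is a sketch under assumptions: the only private prefix taken from this page is /json-cache, and both `PRIVATE_PREFIXES` and `is_exposable` are hypothetical names.

```python
import posixpath

# /json-cache is the only private location named on this page; a real
# Gateway may protect more paths.
PRIVATE_PREFIXES = ("/json-cache",)


def is_exposable(request_path: str) -> bool:
    """Return False for paths that fall under a private prefix."""
    # Normalise first so "/static/../json-cache/x" cannot slip through.
    normalized = posixpath.normpath("/" + request_path.lstrip("/"))
    return not any(
        normalized == prefix or normalized.startswith(prefix + "/")
        for prefix in PRIVATE_PREFIXES
    )
```

Normalising before matching is a deliberate choice here, since a plain string prefix test on the raw path would miss `..` segments.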

Fileserver

This is a simple flat-file datastore which stores files and their timestamps. The Fileserver serves the following:

  • The internal files of the Gateway served by the Webserver, such as index.html, etc.
  • The public key of the Gateway available at gateway.tld/publickey, which is used for trusting Gateways.
  • oid.json files stored under the directory /json-cache/. The timestamps of these files are used for the expiry date check.

Note that the Fileserver could be a part of the Webserver and is not necessarily a separate application.
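The expiry date check can be sketched as a comparison between a file's timestamp and the 24-hour limit mentioned for oid.json. The example assumes the timestamp is the file's modification time on disk and that a missing file counts as expired. Both are assumptions, as is the function name.

```python
import os
import time
from typing import Optional

MAX_AGE_SECONDS = 24 * 60 * 60  # oid.json is considered stale after 24 hours


def json_cache_expired(path: str, now: Optional[float] = None) -> bool:
    """Return True when the oid.json at path is missing or older than 24 hours.

    Assumption: the Fileserver's timestamp is the file's mtime.
    """
    current = time.time() if now is None else now
    try:
        modified = os.path.getmtime(path)
    except FileNotFoundError:
        return True
    return current - modified > MAX_AGE_SECONDS
```

The optional `now` argument exists only so the check can be tested without waiting for real time to pass.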

Webserver

This is a simple webserver which does the following:

  • It serves the internal files for the main website at gateway.tld.
  • It serves the Origins, which are made accessible through the ports exposed by the p2p-proxy connection, under the subdomain oid.ipns.gateway.tld.
  • It serves the responses given by the internal IPFS node and Public IPFS Gateways when gateway.tld/{ipfs,ipns}/* or *.{ipfs,ipns}.gateway.tld is requested.
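Since the Webserver treats the main site and the two subdomain families differently, the first routing step is to classify the Host header. The sketch below illustrates one way to do that. The `classify_host` name and its return shape are hypothetical, and `gateway.tld` is the placeholder domain used throughout this page.

```python
from typing import Optional, Tuple


def classify_host(host: str, base: str = "gateway.tld") -> Optional[Tuple[str, Optional[str]]]:
    """Classify a Host header value.

    Returns ("site", None) for the main website, ("ipns", label) or
    ("ipfs", label) for subdomain requests, and None for unknown hosts.
    """
    # Host names are case-insensitive; drop any ":port" suffix.
    name = host.lower().split(":", 1)[0]
    if name == base:
        return ("site", None)
    for namespace in ("ipfs", "ipns"):
        suffix = f".{namespace}.{base}"
        if name.endswith(suffix) and len(name) > len(suffix):
            return (namespace, name[: -len(suffix)])
    return None
```

The label returned for an ipns subdomain still has to be run through the dnslink, oid, and key rules to decide how it is served.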

Failure Conditions

  1. If gateway.tld/{ipfs,ipns}/* is given, do a 307 redirect to its subdomain equivalent.
  2. If dnslink.ipns.gateway.tld, the DNSLink's /{ipfs,ipns}/ addresses, {oid,key}.ipns.gateway.tld, cid.ipfs.gateway.tld, or oid.ipns.gateway.tld/{cache,stream}_path can't be resolved, return a 404 error.
  3. If oid.ipns.gateway.tld can be resolved, but the ciphertext can't be decrypted, do a 307 redirect to a randomly chosen trusted Gateway from those listed below the ciphertext. If no trusted Gateways are listed, return a 403 error.
  4. If oid.ipns.gateway.tld can be resolved and the ciphertext can be decrypted, but the multiaddresses for the Listener don't lead anywhere, do a 307 redirect to the URL specified by on_fail in the JSON. If on_fail is empty, return a 404 error.
  5. If oid.ipns.gateway.tld can be resolved, the ciphertext can be decrypted, and the Listener can be contacted, but the connection has no p2p-proxy ports exposed for the Gateway to use (or the ports don't function), return a 500 error.
  6. For requests to gateway.tld that are not GET or HEAD, return a 405 error.
  7. If the Gateway is requested by its IP address (for example, http://1.2.3.4/), do a 307 redirect to https://gateway.tld.
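Conditions 2 to 5 for oid.ipns.gateway.tld form a chain in which each check only matters if the previous one passed. The sketch below encodes that order. The `OidState` fields and the `oid_failure` function are hypothetical names chosen for the example, and the redirect target for a trusted Gateway is returned exactly as listed because this page does not say how the final URL is built.

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class OidState:
    """Hypothetical summary of what the Gateway learned about one oid."""
    resolved: bool
    decrypted: bool = False
    trusted_gateways: List[str] = field(default_factory=list)
    listener_reachable: bool = False
    on_fail: str = ""
    proxy_ports_ok: bool = False


def oid_failure(
    state: OidState, rng: Optional[random.Random] = None
) -> Optional[Tuple[int, Optional[str]]]:
    """Return (status, redirect_target) for a failure, or None on success."""
    chooser = rng if rng is not None else random
    if not state.resolved:
        return 404, None  # condition 2
    if not state.decrypted:  # condition 3
        if state.trusted_gateways:
            return 307, chooser.choice(state.trusted_gateways)
        return 403, None
    if not state.listener_reachable:  # condition 4
        return (307, state.on_fail) if state.on_fail else (404, None)
    if not state.proxy_ports_ok:  # condition 5
        return 500, None
    return None
```

Conditions 1, 6, and 7 apply to gateway.tld itself rather than to an oid subdomain, so they are left out of this function.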