How to Retrieve Consistent Data in a Distributed Environment

The two how-to scenarios below show how to keep data consistent when querying The Graph in a distributed setting.

Following these steps helps you avoid inconsistencies caused by block reorganizations (re-orgs) or by successive queries being served by indexers that have synced to different blocks.

How to Poll for Updated Data

When you need to fetch the newest information from The Graph without stepping back to an older block:

Initialize a minimal block target: Start by setting minBlock to 0 (or a known block number). With no meaningful lower bound yet, the first query is served from the latest block the responding indexer has indexed.
Set up a periodic polling cycle: Choose a delay close to the average block time of your chain (e.g., 14 seconds), so that a new block is likely to be available by the next poll.
Use the block: { number_gte: $minBlock } argument: This guarantees that the response comes from a block at or above minBlock, so time never moves backward between polls.
Handle logic inside the loop: After each response, set minBlock to the block number reported in _meta { block { number } }.
Process the fetched data: Implement the necessary actions (e.g., updating internal state) with the newly polled data.

/// Example: Polling for updated data
async function updateProtocolPaused() {
  let minBlock = 0

  for (;;) {
    // Wait for the next block.
    const nextBlock = new Promise((f) => {
      setTimeout(f, 14000)
    })

    const query = `
      query GetProtocol($minBlock: Int!) {
          protocol(block: { number_gte: $minBlock }, id: "0") {
            paused
          }
          _meta {
            block {
              number
            }
          }
      }
    `

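    const variables = { minBlock }
    // 'graphql' is assumed to be a helper that sends the query and resolves to the response's 'data' object.
    const response = await graphql(query, variables)
    // Remember the block just served so the next poll never goes back to an older one.
    minBlock = response._meta.block.number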

    // TODO: Replace this placeholder with handling of 'response.protocol.paused'.
    console.log(response.protocol.paused)

    // Wait to poll again.
    await nextBlock
  }
}
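
The examples in this guide call a graphql(query, variables) helper without defining it. One possible shape for that helper is sketched below, assuming a runtime with a global fetch (modern browsers or Node.js 18+). The ENDPOINT value and the buildRequest and unwrapResponse names are illustrative assumptions, not part of The Graph's tooling.

```javascript
// Hypothetical endpoint; substitute the query URL of your own subgraph.
const ENDPOINT = 'https://example.com/subgraphs/name/your-subgraph'

// Build the fetch options for a GraphQL-over-HTTP POST request.
function buildRequest(query, variables = {}) {
  return {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ query, variables }),
  }
}

// Return 'data' from a parsed GraphQL response body, or throw if the
// server reported any errors.
function unwrapResponse(body) {
  if (body.errors && body.errors.length > 0) {
    throw new Error(body.errors.map((e) => e.message).join('; '))
  }
  return body.data
}

// Send a query and resolve to its 'data' object.
async function graphql(query, variables, endpoint = ENDPOINT) {
  const res = await fetch(endpoint, buildRequest(query, variables))
  if (!res.ok) {
    throw new Error(`HTTP ${res.status}`)
  }
  return unwrapResponse(await res.json())
}
```

Throwing on a non-empty errors array keeps the polling loop from reading fields on a missing data object. Whether to retry or stop after such an error is left to the caller.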

How to Fetch a Set of Related Items from a Single Block

If you must retrieve multiple related items or a large set of data from the same point in time:

Fetch the initial page: Use a query that includes _meta { block { hash } } to capture the hash of the block that served the response. Passing that hash to later queries pins them to the same block.
Store the block hash: Keep the hash from the first response. This becomes your reference point for the rest of the items.
Paginate the results: Make additional requests using the same block hash and a pagination strategy (e.g., id_gt or other filtering) until you have fetched all relevant items.
Handle re-orgs: If the block hash becomes invalid due to a re-org, discard the partial results and retry from the first request so that the new hash refers to a non-uncle block.

/// Example: Fetching a large set of related items
async function getDomainNames() {
  // Cap the total number of requests at 5 pages of up to 'perPage' items each.
  let pages = 5
  const perPage = 1000

  // First request captures the block hash.
  const listDomainsQuery = `
    query ListDomains($perPage: Int!) {
      domains(first: $perPage) {
        name
        id
      }
      _meta {
        block {
          hash
        }
      }
    }
  `

  let data = await graphql(listDomainsQuery, { perPage })
  let result = data.domains.map((d) => d.name)
  let blockHash = data._meta.block.hash

  // Paginate until fewer than 'perPage' results are returned or you reach the page limit.
  while (data.domains.length === perPage && --pages) {
    let lastID = data.domains[data.domains.length - 1].id
    let query = `
      query ListDomains($perPage: Int!, $lastID: ID!, $blockHash: Bytes!) {
        domains(
          first: $perPage
          where: { id_gt: $lastID }
          block: { hash: $blockHash }
        ) {
          name
          id
        }
      }
    `

    data = await graphql(query, { perPage, lastID, blockHash })

    for (const domain of data.domains) {
      result.push(domain.name)
    }
  }

  // TODO: Do something with the full result.
  return result
}
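
The re-org step above says to retry from the first request, but the example itself does not retry. A minimal sketch of a wrapper that does so follows. The withReorgRetry name, the isReorgError predicate, and the maxAttempts default are all assumptions made for illustration; how a stale block hash is reported depends on your client and endpoint, so the check is left to the caller.

```javascript
// Retry 'fetchAll' from scratch when a re-org invalidates the pinned block.
// 'isReorgError' is a caller-supplied predicate (hypothetical) that decides
// whether an error means the pinned block hash is no longer valid.
async function withReorgRetry(fetchAll, isReorgError, maxAttempts = 3) {
  for (let attempt = 1; ; attempt++) {
    try {
      // Each attempt starts over, so it captures a fresh block hash.
      return await fetchAll()
    } catch (err) {
      // Give up on unrelated errors or once the attempt budget is spent.
      if (attempt >= maxAttempts || !isReorgError(err)) {
        throw err
      }
      // Otherwise loop again; partial results from the failed attempt are discarded.
    }
  }
}
```

It could be used as withReorgRetry(getDomainNames, myReorgCheck), where myReorgCheck is whatever test fits the errors your client surfaces. Restarting the whole function, rather than resuming mid-pagination, is what keeps every page tied to a single block.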

Recap and Next Steps

By using the number_gte argument in a polling loop, you ensure that time never moves backward between updates. By pinning queries to a specific block hash, you can retrieve multiple pages of related data consistently from the same block.

If a re-org invalidates a pinned block hash, discard any partial results and retry from the first request.
