Reporting non-nullable violations

Photo by Fleur on Unsplash

The complete code for this article is available on GitHub.

Over the first three parts of this series, we've built an Apollo Server plugin that reports every field in each request, along with the possible types it could belong to. Today we'll extend the plugin to analyse the actual response and report any nulls that appear in places where they shouldn't.

For each node in the response tree, we find any violations by:

  1. Determining the concrete types the data might be

  2. Determining which of the fields we should examine

  3. Checking that each field does not violate our proposed schema

  4. For each field that is a branch (non-scalar) field:

    1. Get the possible types of the field

    2. Recurse on each of the value(s)
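Before going through the real implementation, here is a minimal, self-contained sketch of the overall walk. It is not the plugin's code: ToyTypeMap, ToySelectionTree and walkResponse() are hypothetical stand-ins, and it assumes every node has exactly one known concrete type, so steps 1 and 2 collapse into a simple lookup.

```typescript
// Hypothetical toy schema: type name -> field name -> field description
type ToyFieldDef = { proposedNonNullable?: boolean; childType?: string }
type ToyTypeMap = Record<string, Record<string, ToyFieldDef>>

// A selection maps each requested field to its own (possibly empty) sub-selection
interface ToySelectionTree {
  [field: string]: ToySelectionTree
}

function walkResponse(
  types: ToyTypeMap,
  typeName: string,
  selection: ToySelectionTree,
  data: Record<string, unknown>,
  path: string[] = []
): string[] {
  const violations: string[] = []

  for (const fieldName of Object.keys(selection)) {
    const fieldDef = types[typeName][fieldName]
    const value = data[fieldName]
    const isArray = Array.isArray(value)
    const values: unknown[] = isArray ? (value as unknown[]) : [value]

    values.forEach((item, idx) => {
      const itemPath = [...path, isArray ? `${fieldName}[${idx}]` : fieldName]

      if (item === null || item === undefined) {
        // Step 3: a null in a field marked as proposed non-nullable
        if (fieldDef.proposedNonNullable) {
          violations.push(itemPath.join('.'))
        }
      } else if (fieldDef.childType !== undefined) {
        // Step 4: recurse into each non-null value of a branch field
        violations.push(
          ...walkResponse(
            types,
            fieldDef.childType,
            selection[fieldName],
            item as Record<string, unknown>,
            itemPath
          )
        )
      }
    })
  }

  return violations
}

const toyTypes: ToyTypeMap = {
  Query: { books: { childType: 'Book' } },
  Book: { title: { proposedNonNullable: true }, author: { childType: 'Author' } },
  Author: { name: { proposedNonNullable: true } }
}
const toySelection: ToySelectionTree = {
  books: { title: {}, author: { name: {} } }
}
const toyResponse = {
  books: [
    { title: 'Dune', author: { name: null } },
    { title: null, author: null }
  ]
}

const found = walkResponse(toyTypes, 'Query', toySelection, toyResponse)
// found is ['books[0].author.name', 'books[1].title']
```

The null author in the second book is not reported because, in this toy schema, author is not marked as proposed non-nullable.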

Step 1: Determine the concrete parent types

As you may recall from part 3, we receive an array of parent types in the possibleParents argument. In order to check all of the possible fields to see if they've been declared @proposedNonNullable, we need to expand any interfaces or unions into their concrete types:

export function getProposedNonNullableViolations(
  requestContext: GraphQLRequestContext<BaseContext>,
  possibleParents: readonly (GraphQLObjectType | GraphQLAbstractType)[],
  selectionSet: SelectionSetNode,
  path: readonly string[] = [],
  data: Record<string, unknown>
): ProposedNonNullableViolation[] {
  const allPossibleParents = possibleParents
    .flatMap(getPossibleTypes(requestContext))
    .filter(isPossibleParent(requestContext, data))
  // ...
}

Let's have a look at the code for getPossibleTypes():

export function getPossibleTypes(
  requestContext: GraphQLRequestContext<BaseContext>
): (
  graphQLType: GraphQLAbstractType | GraphQLObjectType
) => GraphQLObjectType[] {
  return (graphqlType) => {
    if (isAbstractType(graphqlType)) {
      return requestContext.schema
        .getPossibleTypes(graphqlType)
        .flatMap(getPossibleTypes(requestContext))
    } else {
      return [graphqlType]
    }
  }
}

This function accepts the request context and returns a function that can be called on a GraphQLAbstractType or GraphQLObjectType. If it's an abstract type, it asks the schema for the types that implement it (for an interface) or belong to it (for a union). Since these might also be abstract, we need to recurse on each of the returned types. If it's a concrete type, we simply return it in a single-element array.
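To see the expansion in isolation, here is a small sketch that does not depend on graphql-js. ToyType, ToySchema and expandToConcreteTypes() are hypothetical names invented for this illustration; the real function gets the same information from requestContext.schema.

```typescript
// Hypothetical stand-ins for graphql-js types, just enough for this sketch
type ToyType = { kind: 'object' | 'abstract'; name: string }

type ToySchema = {
  types: Record<string, ToyType>
  // Maps each abstract type name to the names of the types it can resolve to
  possibleTypes: Record<string, string[]>
}

function expandToConcreteTypes(schema: ToySchema, type: ToyType): ToyType[] {
  if (type.kind === 'object') {
    // Concrete types are returned as a single-element array
    return [type]
  }

  const concrete: ToyType[] = []
  for (const name of schema.possibleTypes[type.name] || []) {
    // Recurse on each member, mirroring the structure of getPossibleTypes()
    concrete.push(...expandToConcreteTypes(schema, schema.types[name]))
  }
  return concrete
}

const toySchema: ToySchema = {
  types: {
    Media: { kind: 'abstract', name: 'Media' },
    Book: { kind: 'object', name: 'Book' },
    Movie: { kind: 'object', name: 'Movie' }
  },
  possibleTypes: { Media: ['Book', 'Movie'] }
}

const expandedNames = expandToConcreteTypes(
  toySchema,
  toySchema.types['Media']
).map((type) => type.name)
// expandedNames is ['Book', 'Movie']
```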

Here is the code for isPossibleParent(). It's a simple function that uses data.__typename (if it was requested by the client) to narrow down allPossibleParents to a single element:

export function isPossibleParent(
  requestContext: GraphQLRequestContext<BaseContext>,
  data: { __typename?: string }
): (parent: GraphQLObjectType) => boolean {
  return (parent) => {
    return isNil(data.__typename) || parent.name === data.__typename
  }
}
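As a quick illustration of the narrowing, the sketch below applies the same predicate to plain objects. NamedType and matchesTypename() are hypothetical simplifications that drop the request context and the graphql-js types.

```typescript
// Hypothetical minimal shape: we only need the type's name here
type NamedType = { name: string }

// Same logic as isPossibleParent(): keep everything when __typename is absent
function matchesTypename(data: { __typename?: string }) {
  return (parent: NamedType): boolean =>
    data.__typename == null || parent.name === data.__typename
}

const candidates: NamedType[] = [{ name: 'Book' }, { name: 'Movie' }]

// The client requested __typename, so only the matching type survives
const withTypename = candidates.filter(matchesTypename({ __typename: 'Book' }))

// No __typename in the data, so every candidate remains possible
const withoutTypename = candidates.filter(matchesTypename({}))
```

When the client does not request __typename, nothing is filtered out, which is why a later step has to cope with more than one possible parent.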

Step 2: Determine which fields to examine

We need to get the list of fields to examine, like we did in parts 2 and 3. However, if we have data.__typename, we can use it to ignore fields that aren't relevant. This is done with the isPossibleField() function, added to the chain of functions that compute fieldNodes:

  const fieldNodes = selectionSet.selections
    .flatMap((selection) => getFields(requestContext, selection))
    .filter(isPossibleField(requestContext, data))
    .map(prop('fieldNode'))
    .filter((fieldNode) => fieldNode.name.value !== '__typename')
  const fieldNodesByDataProperty = groupBy(getDataProperty, fieldNodes)
  const uniqueFieldNodes = Object.values(fieldNodesByDataProperty).map(
    head
  ) as FieldNode[]

To allow us to do this, we have extended getFields() to return an array of { parent?: string, fieldNode: FieldNode }, so that when we have a fragment that's limited to a particular type (eg ...on Book { }) we can exclude those fields if data is a different type of object.

Here is the code for isPossibleField(). It looks very similar to isPossibleParent():

export function isPossibleField(
  requestContext: GraphQLRequestContext<BaseContext>,
  data: { __typename?: string }
): (requestedField: RequestedField) => boolean {
  return (requestedField) => {
    if (!requestedField.parent) {
      return true
    } else {
      const parentType = requestContext.schema.getType(
        requestedField.parent
      ) as GraphQLObjectType | GraphQLAbstractType
      const possibleParentTypes = getPossibleTypes(requestContext)(parentType)

      return possibleParentTypes.some(isPossibleParent(requestContext, data))
    }
  }
}

After filtering out the fields that aren't relevant to this object, we need to extract the FieldNodes from the objects returned by getFields(), and finally we want to ignore the __typename field because it's not part of our schema.
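The effect of the type-condition filter can be sketched with plain data. Everything here is hypothetical: ToyRequestedField mimics the { parent, fieldNode } objects described above, and concreteTypesFor replaces the schema lookup with a hard-coded table.

```typescript
// Hypothetical simplification of the objects returned by getFields()
type ToyRequestedField = { parent?: string; fieldName: string }

// Hypothetical lookup: type-condition name -> concrete type names it covers
const concreteTypesFor: Record<string, string[]> = {
  Media: ['Book', 'Movie'],
  Book: ['Book'],
  Movie: ['Movie']
}

function toyIsPossibleField(data: { __typename?: string }) {
  return (requested: ToyRequestedField): boolean => {
    // No type condition, or no __typename to compare against: keep the field
    if (requested.parent === undefined || data.__typename === undefined) {
      return true
    }
    const covered = concreteTypesFor[requested.parent] || []
    return covered.indexOf(data.__typename) !== -1
  }
}

const requestedFields: ToyRequestedField[] = [
  { fieldName: 'id' },
  { parent: 'Book', fieldName: 'isbn' },
  { parent: 'Movie', fieldName: 'director' }
]

const fieldsForMovie = requestedFields
  .filter(toyIsPossibleField({ __typename: 'Movie' }))
  .map((field) => field.fieldName)
// fieldsForMovie is ['id', 'director']

const fieldsWithoutTypename = requestedFields.filter(toyIsPossibleField({}))
// all three fields are kept when __typename is missing
```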

To avoid walking the same parts of the tree multiple times, we filter out any duplicate fields. We do this in a slightly convoluted way because we also need to combine the selection sets of any branch nodes. First we group all the fields by their data property (their alias if they have one, otherwise the field's name) and store the result in fieldNodesByDataProperty, which is a Record<string, FieldNode[]>. We then take the first element (head()) of each set of field nodes. We need to add as FieldNode[] to the end because head() is typed as returning FieldNode | undefined, since it returns undefined when given an empty array. We know that every array it is passed will contain at least one element, so we can safely narrow the type here.
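The grouping step can be tried out without Ramda or graphql-js. In this sketch ToyFieldNode, toyDataProperty() and groupByDataProperty() are hypothetical stand-ins for FieldNode, getDataProperty() and groupBy().

```typescript
// Simplified, hypothetical stand-in for a graphql-js FieldNode
type ToyFieldNode = { name: string; alias?: string; selections: string[] }

// The alias if present, otherwise the field name
function toyDataProperty(node: ToyFieldNode): string {
  return node.alias !== undefined ? node.alias : node.name
}

function groupByDataProperty(
  nodes: ToyFieldNode[]
): Record<string, ToyFieldNode[]> {
  const groups: Record<string, ToyFieldNode[]> = {}
  for (const node of nodes) {
    const key = toyDataProperty(node)
    if (!groups[key]) {
      groups[key] = []
    }
    groups[key].push(node)
  }
  return groups
}

const requestedNodes: ToyFieldNode[] = [
  { name: 'authors', selections: ['name'] },
  { name: 'authors', selections: ['location'] },
  { name: 'title', alias: 'heading', selections: [] }
]

const groupedNodes = groupByDataProperty(requestedNodes)

// Every group holds at least one node, so taking element 0 is always safe
const uniqueNodes = Object.keys(groupedNodes).map((key) => groupedNodes[key][0])
// uniqueNodes has two entries: the first authors node and the aliased title
```

Keeping the full groups around, rather than only the first node of each, is what later lets us merge the selection sets of duplicated branch fields.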

Step 3: Checking node violations

This is where we actually check for null values. Here's the code:

  function getNodeViolations(
    leafNode: FieldNode
  ): ProposedNonNullableViolation[] {
    const isProposedNonNullable = allPossibleParents.some(
      fieldIsProposedNonNullable(leafNode)
    )
    const violationsAreDefinite = !allPossibleParents.some(
      valueCanBeNull(leafNode)
    )

    if (!isProposedNonNullable) {
      return []
    }

    const dataProperty = getDataProperty(leafNode)
    const value = data[dataProperty]
    const isArray = Array.isArray(value)
    const values = isArray ? value : [value]
    const violatingPaths: string[] = []

    values.forEach((item, idx) => {
      if (isNil(item)) {
        violatingPaths.push(
          [...path, isArray ? `${dataProperty}[${idx}]` : dataProperty].join(
            '.'
          )
        )
      }
    })

    return violatingPaths.map((path) => ({
      path,
      isDefinite: violationsAreDefinite
    }))
  }

The first task is to decide whether any of the possible fields has been declared as @proposedNonNullable. That information is in the astNode of the schema field (AST stands for Abstract Syntax Tree), so we can check it as follows:

export function isProposedNonNullable(
  field: GraphQLField<unknown, unknown>
): boolean {
  return Boolean(
    field.astNode?.directives?.some(
      (directive) => directive.name.value === 'proposedNonNullable'
    )
  )
}

We then determine whether any violations are definite. A violation is definite when the relevant field in all the possible parents is either non-nullable or @proposedNonNullable. Most of the time this will be true; however, it will be false when you have a schema like this and the client has not requested __typename:

type Book {
  name: String @proposedNonNullable
}

type Movie {
  name: String
}

union Media = Book | Movie
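The sketch below works through this schema. The article does not show valueCanBeNull(), so toyValueCanBeNull() is an assumption: it treats a null as acceptable only when the field exists on the parent, is nullable, and is not marked @proposedNonNullable. ToyField and ToyParent are likewise hypothetical.

```typescript
// Hypothetical, simplified field description (not the graphql-js GraphQLField)
type ToyField = { nullable: boolean; proposedNonNullable: boolean }
type ToyParent = { name: string; fields: Record<string, ToyField> }

// Assumption: null is acceptable only if the field is nullable and unmarked
function toyValueCanBeNull(fieldName: string) {
  return (parent: ToyParent): boolean => {
    const field = parent.fields[fieldName]
    return field !== undefined && field.nullable && !field.proposedNonNullable
  }
}

// Mirrors: !allPossibleParents.some(valueCanBeNull(leafNode))
function violationsAreDefiniteFor(
  parents: ToyParent[],
  fieldName: string
): boolean {
  return !parents.some(toyValueCanBeNull(fieldName))
}

const bookType: ToyParent = {
  name: 'Book',
  fields: { name: { nullable: true, proposedNonNullable: true } }
}
const movieType: ToyParent = {
  name: 'Movie',
  fields: { name: { nullable: true, proposedNonNullable: false } }
}

// Without __typename the data could be a Movie, where a null name is allowed
const definiteWithoutTypename = violationsAreDefiniteFor(
  [bookType, movieType],
  'name'
) // false

// With __typename narrowed to Book, a null name is a definite violation
const definiteForBook = violationsAreDefiniteFor([bookType], 'name') // true
```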

Next, we get the property that the data is actually stored in, taking into account the field's alias if any, and normalise the value to an array. We do that normalisation so that the following code can treat array and singular values identically. If @proposedNonNullable has been declared on a list field, we want to treat both null and [null] as violations.

Finally, we iterate through the normalised array, recording the full path of each null value we find.
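Here is the normalisation and path-building logic on its own, as a standalone sketch. findNullPaths() is a hypothetical helper that takes the data property directly instead of a FieldNode and skips the @proposedNonNullable check.

```typescript
function findNullPaths(
  data: Record<string, unknown>,
  dataProperty: string,
  path: readonly string[] = []
): string[] {
  const value = data[dataProperty]
  const isArray = Array.isArray(value)
  // Wrap singular values so both cases share the same loop
  const values: unknown[] = isArray ? (value as unknown[]) : [value]
  const nullPaths: string[] = []

  values.forEach((item, idx) => {
    if (item === null || item === undefined) {
      const segment = isArray ? `${dataProperty}[${idx}]` : dataProperty
      nullPaths.push([...path, segment].join('.'))
    }
  })

  return nullPaths
}

const singularNull = findNullPaths({ title: null }, 'title', ['book'])
// singularNull is ['book.title']

const nullInsideList = findNullPaths({ tags: ['a', null, 'c'] }, 'tags', ['book'])
// nullInsideList is ['book.tags[1]']

const nullList = findNullPaths({ tags: null }, 'tags')
// nullList is ['tags']
```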

Step 4: Checking the children of branch nodes

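We need a list of the unique branch nodes, which we build at the end of the main function by keeping only the field nodes that have a selection set.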

Our code to check a branch node looks like this:

  function processBranchNode(
    branchNode: FieldNode
  ): ProposedNonNullableViolation[] {
    const dataProperty = getDataProperty(branchNode)
    const value = data[dataProperty]
    const isArray = Array.isArray(value)
    const values = isArray ? value : [value]
    const possibleNodeTypes = allPossibleParents
      .map(getParentField(branchNode))
      .filter((value): value is GraphQLField<unknown, unknown> =>
        Boolean(value)
      )
      .map(prop('type'))
      .map(getBaseType)
      .filter(anyPass([isObjectType, isAbstractType]))
    const uniquePossibleNodeTypes = uniqBy(prop('name'), possibleNodeTypes)
    const violations: ProposedNonNullableViolation[] = []

    values.forEach((item, idx) => {
      if (!isNil(item)) {
        violations.push(
          ...getProposedNonNullableViolations(
            requestContext,
            uniquePossibleNodeTypes,
            selectionSetsByDataProperty[dataProperty],
            [...path, isArray ? `${dataProperty}[${idx}]` : dataProperty],
            item as Record<string, unknown>
          )
        )
      }
    })

    return violations
  }

The first four lines of the function match the ones in getNodeViolations(). After that, we work out the types this node could possibly be. There's a bit to unpack here, but it's not too complex:

  1. For each of our possible parents, we get the relevant field (ie the one having the same name as our branch node)

  2. Some of our parents may not have a field with that name, so filter out any undefined values in the array

  3. We're only interested in the field's type, so extract the type property of each field

  4. Remove any modifiers like list or non-nullable, ie convert [MyType!]! to MyType

  5. Remove any scalar types

We then remove any duplicates by using uniqBy(). uniquePossibleNodeTypes will be passed in as the list of possibleParents when we recurse.
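Steps 4 and 5 and the de-duplication can be sketched with a toy model of GraphQL's type wrappers. ToyNamed, ToyWrapped, toyBaseType() and uniqueByName() are hypothetical; the real code relies on graphql-js types, the article's getBaseType() helper and uniqBy().

```typescript
// Hypothetical toy model of GraphQL types and their list / non-null wrappers
type ToyNamed = { kind: 'named'; name: string; isScalar: boolean }
type ToyWrapped = { kind: 'list' | 'nonNull'; ofType: ToyTypeRef }
type ToyTypeRef = ToyNamed | ToyWrapped

// Strip wrappers until we reach the named type, e.g. [Author!]! -> Author
function toyBaseType(type: ToyTypeRef): ToyNamed {
  return type.kind === 'named' ? type : toyBaseType(type.ofType)
}

// Keep the first occurrence of each type name
function uniqueByName(types: ToyNamed[]): ToyNamed[] {
  const seen: Record<string, boolean> = {}
  return types.filter((type) => {
    if (seen[type.name]) {
      return false
    }
    seen[type.name] = true
    return true
  })
}

const authorType: ToyNamed = { kind: 'named', name: 'Author', isScalar: false }
const stringType: ToyNamed = { kind: 'named', name: 'String', isScalar: true }

// Represents [Author!]!
const authorListType: ToyTypeRef = {
  kind: 'nonNull',
  ofType: { kind: 'list', ofType: { kind: 'nonNull', ofType: authorType } }
}

const fieldTypes: ToyTypeRef[] = [authorListType, authorType, stringType]

const branchTypeNames = uniqueByName(
  fieldTypes.map(toyBaseType).filter((type) => !type.isScalar)
).map((type) => type.name)
// branchTypeNames is ['Author']
```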

Finally, we check for violations within each non-null element of the values array by calling getProposedNonNullableViolations(). selectionSetsByDataProperty is a Record<string, SelectionSetNode> and is created by the main function with the following code:

  const selectionSetsByDataProperty = mapObjIndexed(
    getCombinedSelectionSet,
    fieldNodesByDataProperty
  )

// getCombinedSelectionSet.ts
export function getCombinedSelectionSet(
  fieldNodes: readonly FieldNode[]
): SelectionSetNode {
  const selections = fieldNodes.flatMap(
    (fieldNode) => fieldNode.selectionSet?.selections ?? []
  )

  return { kind: Kind.SELECTION_SET, selections }
}

We do this instead of using the current branch node's selection set because we may have a query like this:

query {
  authors {
    name
  }
  authors {
    location
  }
}

We don't want to perform the recursion multiple times, so we only directly process one of the authors field nodes, but we still need to check both name and location for violations.
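Applied to the query above, the merge behaves as in this sketch. ToySelection, ToyBranch and combineSelections() are hypothetical simplifications of SelectionNode, FieldNode and getCombinedSelectionSet().

```typescript
// Hypothetical simplified shapes: a selection is just a field name here
type ToySelection = { field: string }
type ToyBranch = { name: string; selections?: ToySelection[] }

// Concatenate the selections of every node that shares a data property
function combineSelections(nodes: ToyBranch[]): ToySelection[] {
  const combined: ToySelection[] = []
  for (const node of nodes) {
    combined.push(...(node.selections || []))
  }
  return combined
}

// The two authors fields from the query above
const authorsNodes: ToyBranch[] = [
  { name: 'authors', selections: [{ field: 'name' }] },
  { name: 'authors', selections: [{ field: 'location' }] }
]

const combinedFields = combineSelections(authorsNodes).map(
  (selection) => selection.field
)
// combinedFields is ['name', 'location']
```

A single recursion over the merged selections then covers both name and location.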

Returning the violations

Finally, we need to call our getNodeViolations() and processBranchNode() functions. We determine the branch nodes by filtering only the field nodes that have a selection set.

  const uniqueBranchNodes = uniqueFieldNodes.filter(
    (fieldNode) =>
      selectionSetsByDataProperty[getDataProperty(fieldNode)].selections.length
  )

  return [
    ...uniqueFieldNodes.flatMap(getNodeViolations),
    ...uniqueBranchNodes.flatMap(processBranchNode)
  ]

And there we have it - our getProposedNonNullableViolations() function will scan a response and return all the paths that violate our @proposedNonNullable directive.

Appendix: Why not use willResolveField instead?

Apollo Server provides an additional hook called willResolveField, which is called immediately before each call to a field resolver. The willResolveField hook can return a function that's then called once the field's value has been resolved.

Writing the plugin to use this instead of analysing the response at the end would have been much simpler. Unfortunately we can't do that because the hook is only called when there is a field resolver, so we can't check the majority of fields that don't have a field resolver.