GQL

Blog

  • Complex Pattern Matching in GQL

    The workhorse of Graph Query Language (GQL) is pattern matching. Pattern matching is done by specifying one or more path patterns in the MATCH clause. A single path pattern matches a linear path of vertices and edges, while more complex patterns can be matched by combining multiple path patterns, separated by comma. In this blogpost we will explore complex patterns.

    We will use the following sample Graph of Restaurants, Likes, Cities and Persons

    Let’s create some sample data

    create graph restaurant_graph {
      node person({name string})
      , node restaurant({name string})
      , node country({name string})
      , node city({name string})
      , node state({name string})
      , relationship located_in ()-[]->()
      , relationship lives_in ()-[]->()
      , relationship likes ()-[]->()
    };
    
    use restaurant_graph;
    
    insert (n1:person {_id:'uroosa', name:'Uroosa'})
      , (n2:person {_id:'fatima', name:'Fatima'})
      , (n3:person {_id:'maryam', name:'Maryam'})
      , (n4:person {_id:'zainab', name:'Zainab'})
      , (n5:person {_id:'hannah', name:'Hannah'})
      , (n6:restaurant {_id:'vostok', name:'Vostok'})
      , (n7:restaurant {_id:'kusan', name:'Kusan '})
      , (n8:restaurant {_id:'navat', name:'Navat'})
      , (n9:city {_id:'Santa Clara', name:'Santa Clara'})
      , (n10:city {_id:'San Jose', name:'San Jose'})
      , (n11:city {_id:'Los Angeles', name:'Los Angeles'})
      , (n12:state {_id:'Calfornia', name:'California'})
      , (n1)-[:likes]->(n6)
      , (n4)-[:likes]->(n6)
      , (n1)-[:likes]->(n8)
      , (n2)-[:likes]->(n8)
      , (n3)-[:likes]->(n8)
    
      , (n1)-[:lives_in]->(n9)
      , (n2)-[:lives_in]->(n9)
      , (n3)-[:lives_in]->(n10)
      , (n4)-[:lives_in]->(n9)
      , (n5)-[:lives_in]->(n10)
      
      , (n6)-[:located_in]->(n9)
      , (n7)-[:located_in]->(n10)
      , (n8)-[:located_in]->(n11)
      , (n9)-[:located_in]->(n12);

    Let’s say we are interested in the finding all the Liked Restaurants where the Restaurant is in the same City where the Person who Liked it.

    match (n1:person)-[:lives_in]->(n2:city)
    , (n3:restaurant)-[:located_in]->(n2)
    , (n1)-[:likes]->(n3:restaurant)
    
    return table(n1.name, n2.name as lives_in, n3.name as restaurant);
    

    This will return

    Notice the use of 3 Paths separated to commas to match this complex pattern. Note that the MATCH pattern share the variable n1, n2, and n3 . This means that they are the same variable rather than three different variables. This is what enables us to express the original requirements as MATCH patterns.

    GQL Binding Variables (like n1, n2, and n3 in the aforementioned example) carry data between statements. When you reuse a variable name, GQL automatically ensures it refers to the same data, creating powerful join conditions.

    Now let’s say we interested in finding all the Liked Restaurants where the Restaurant is NOT in the same City where the Person who Liked it. We can use the NOT EXISTS clause for this.

    match (n1:person)-[:lives_in]->(n2:city)
    , (n1)-[:likes]->(n3:restaurant)
    WHERE NOT EXISTS {
        match (n3)-[:located_in]->(n2)
    }
    return table(n1.name, n2.name as lives_in, n3.name as restaurant);

    Observe that designing the same request in SQL using recursive Common Table Expressions would be way more complex, difficult to read, and so complicated to maintain and update.