Engineers tap machine learning to improve graph analytics

Augmenting graph analytics with AI can detect more complicated anomalies, vendors say.

Graph databases hold numerous attractions for financial services users, among them the ability to detect hidden patterns in data that could be harder to spot otherwise.

Some financial institutions are now looking to go a step further by augmenting graph analytics—the process of analyzing data in graph format to show relationships between data points—with machine learning to identify more complex data patterns.  

Andy McMahon, machine learning engineering lead at UK bank NatWest, says that in the context of graph databases, machine learning can leverage algorithmic and computational power to answer more complex questions without the need for explicit programming.

“You can catch a lot of cases by writing specific queries against a database, but you can do so much more with machine learning,” he says. “In any potential application of machine learning, there are cases where you want to extend your capability and catch cases that would have been hard to anticipate.”

A common industry use case for graph databases is the detection of fraudulent behavior, such as a web of suspicious transactions at a bank. “Seeing these types of patterns can be much harder when using classical or traditional tabular and flatter data models,” McMahon says.

As a hypothetical example, this technology could be used to classify customers based on their relationships with other customers. “The customers could be nodes on the graph and we want to classify if they belong to cohort A or B in the cases where we don’t have that information. If you were to try and write a set of queries or rules to do this accurately, it would be extremely difficult, or perhaps impossible, whereas this is quite a standard supervised machine learning problem,” he says.
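To make the cohort example concrete, here is a minimal sketch in Python. It is not NatWest's method: a simple neighbor-voting heuristic (a basic form of label propagation) stands in for a trained model, and the customer IDs, links and seed labels are all invented for illustration.

```python
from collections import Counter

# Hypothetical toy data: customer IDs, links and known cohorts are invented.
edges = [
    ("c1", "c2"), ("c1", "c3"), ("c2", "c3"),
    ("c4", "c5"), ("c5", "c6"), ("c4", "c6"),
    ("c3", "c7"), ("c6", "c8"), ("c5", "c8"),
]
known_labels = {"c1": "A", "c2": "A", "c4": "B", "c5": "B"}


def build_adjacency(edge_list):
    """Turn an undirected edge list into a node -> set-of-neighbors map."""
    adjacency = {}
    for left, right in edge_list:
        adjacency.setdefault(left, set()).add(right)
        adjacency.setdefault(right, set()).add(left)
    return adjacency


def propagate_labels(adjacency, seed_labels, max_rounds=10):
    """Give each unlabeled node the majority cohort of its labeled neighbors.

    Repeats until no new node can be labeled or max_rounds is reached.
    Ties are broken alphabetically so the result is deterministic.
    """
    labels = dict(seed_labels)
    for _ in range(max_rounds):
        updates = {}
        for node in sorted(adjacency):
            if node in labels:
                continue
            votes = Counter(labels[n] for n in adjacency[node] if n in labels)
            if votes:
                top = max(votes.values())
                updates[node] = min(c for c, count in votes.items() if count == top)
        if not updates:
            break
        labels.update(updates)
    return labels


predicted = propagate_labels(build_adjacency(edges), known_labels)
```

In this toy network, customers connected mainly to cohort A inherit label A, and likewise for B. A production system would more likely train a classifier on graph-derived features, but the sketch shows why the relationships themselves carry the signal.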

Trevor Belstead, chief information officer for wholesale banking and post-trade at technology provider and consultancy Delta Capita, says that while graph analytics and machine learning on their own are not enough to find connections, a combination of the two can be significant. “The benefit of the graph analytics being driven by machine learning or deep learning [is in being able to] provide the usable interface to understand information quicker,” he says.

Belstead says combining graph-based anti-money-laundering (AML) and machine learning models can also help reduce people risk. “People are still the fallible part of AML. We have seen multiple articles of people doing the wrong thing inside the bank, deliberately pushing through transactions [and] fraudulent things,” he says.

On the know-your-customer front, Delta Capita says its platform can look at connected parties and their transactions. By using graph models to understand each party's radius of influence, it can start to gauge the scope of impact of a particular instance of fraud.
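One way to picture a "radius of influence" is as the set of parties reachable within a fixed number of hops from a flagged party. The Python sketch below assumes that interpretation; it does not describe the vendor's platform, and the party IDs, transfers and the `max_hops` cut-off are hypothetical.

```python
from collections import deque

# Hypothetical transaction links between parties (treated as undirected).
transfers = [
    ("p1", "p2"), ("p2", "p3"), ("p3", "p4"),
    ("p1", "p5"), ("p6", "p7"),
]


def build_adjacency(edge_list):
    """Turn an undirected edge list into a node -> set-of-neighbors map."""
    adjacency = {}
    for left, right in edge_list:
        adjacency.setdefault(left, set()).add(right)
        adjacency.setdefault(right, set()).add(left)
    return adjacency


def parties_within(adjacency, start, max_hops):
    """Breadth-first search returning {party: hop distance} within max_hops."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if distance[current] == max_hops:
            continue  # do not expand beyond the chosen radius
        for neighbor in sorted(adjacency.get(current, ())):
            if neighbor not in distance:
                distance[neighbor] = distance[current] + 1
                queue.append(neighbor)
    del distance[start]  # report only the other parties
    return distance


exposed = parties_within(build_adjacency(transfers), "p1", max_hops=2)
```

Here `exposed` holds the parties one or two hops from `p1`; anything further out, or in a disconnected part of the graph, is left out.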

Belstead says machine learning can identify anomalies to a degree of accuracy and understanding that would be beyond the capabilities of graph analytics.

“Deep learning and machine learning will provide those anomalies. Build them into the machine model and build them into the predictive model—that is something you do over time to keep the model relevant—and then the graph analytics will use that as well as further information,” he says.

Just the beginning

Rik Van Bruggen, vice president for Europe, the Middle East and Africa at graph database technology provider Neo4J, says transforming a dataset into a graph-based network can reveal interesting patterns that were not visible before.

Take data lineage, for instance. Regulatory reporting rules require banks to be able to go back along the complicated chains their data passes through, understand how it is manipulated, and trace it back to an authoritative source. Using a graph database, such connections can be queried and revealed within milliseconds, which helps make the governance process more transparent and auditable.
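The lineage idea can be sketched as a walk upstream through a dependency graph. The following Python example is illustrative only: the dataset names and the `derived_from` mapping are invented, and a graph database would express the same traversal in its own query language rather than in application code.

```python
from collections import deque

# Hypothetical lineage: each dataset maps to the datasets it is derived from.
derived_from = {
    "regulatory_report": ["risk_aggregate"],
    "risk_aggregate": ["positions_clean", "fx_rates"],
    "positions_clean": ["positions_raw"],
    "positions_raw": [],
    "fx_rates": [],
}


def trace_to_sources(dataset, lineage):
    """Walk upstream from a dataset, breadth first.

    Returns the upstream datasets in visit order and the subset that has
    no parents, which this sketch treats as the authoritative sources.
    """
    seen = {dataset}
    queue = deque([dataset])
    upstream = []
    sources = []
    while queue:
        current = queue.popleft()
        parents = lineage.get(current, [])
        if not parents:
            sources.append(current)
        for parent in parents:
            if parent not in seen:
                seen.add(parent)
                upstream.append(parent)
                queue.append(parent)
    return upstream, sources


upstream, sources = trace_to_sources("regulatory_report", derived_from)
```

Starting from the report, the walk passes through every intermediate dataset and ends at the two raw inputs, which is the audit trail a regulator would ask for.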

“What is happening right now—and I think the next two or three years are going to be full of this—is that we apply algorithms and machine learning techniques through these network structures,” Van Bruggen says.

On top of its graph database, Neo4J offers a set of plug-ins that allow clients to explore graph data using machine learning. “It leverages the infrastructure that we build up with the graph database and is another way of providing value based on that dataset,” he says.

Its collection of algorithms is grouped into categories, such as “centrality”, which focuses on how important an entity is in a network, or “similarity”, which allows clients to calculate, based on the structure of the network, how similar one node or piece of information is to another. If one entity is already suspect, for instance, then a similar one might also be suspect.
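For readers unfamiliar with these categories, the Python sketch below shows one common measure from each: degree centrality and Jaccard similarity of neighbor sets. These are standard textbook definitions chosen for illustration, not Neo4j's implementation, and the account names and links are hypothetical.

```python
# Hypothetical account network; names and links are invented.
links = [
    ("acct_1", "acct_2"), ("acct_1", "acct_3"), ("acct_1", "acct_4"),
    ("acct_5", "acct_2"), ("acct_5", "acct_3"), ("acct_6", "acct_4"),
]


def build_adjacency(edge_list):
    """Turn an undirected edge list into a node -> set-of-neighbors map."""
    adjacency = {}
    for left, right in edge_list:
        adjacency.setdefault(left, set()).add(right)
        adjacency.setdefault(right, set()).add(left)
    return adjacency


def degree_centrality(adjacency):
    """Share of the other nodes that each node is directly linked to."""
    others = len(adjacency) - 1
    return {node: len(nbrs) / others for node, nbrs in adjacency.items()}


def jaccard_similarity(adjacency, a, b):
    """Overlap of two nodes' neighbor sets: |intersection| / |union|."""
    union = adjacency[a] | adjacency[b]
    if not union:
        return 0.0
    return len(adjacency[a] & adjacency[b]) / len(union)


adjacency = build_adjacency(links)
centrality = degree_centrality(adjacency)
most_central = max(centrality, key=centrality.get)
similarity_1_5 = jaccard_similarity(adjacency, "acct_1", "acct_5")
similarity_1_6 = jaccard_similarity(adjacency, "acct_1", "acct_6")
```

In this toy data, `acct_1` is the most connected account, and `acct_5` scores as more similar to it than `acct_6` does because it shares more counterparties. If `acct_1` were already under suspicion, that score is the kind of signal that might prompt a closer look at `acct_5`.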

Van Bruggen says such techniques are going to allow users to make more sense of these network structures. “We have only seen the early beginnings of that,” he says.

NatWest’s McMahon says machine learning won’t replace graph analytics completely. “If you can write a simple rule to capture something then it’s very computationally cheap to run that rule, and you should do that. What machine learning will help do, though, is remove the need to worry about writing 100 rules or 1,000 rules or 10,000 rules to capture all cases, so it allows you to scale much more,” he says.

In terms of the types of machine learning that could be applied to graph analytics, McMahon sees many options available. These include standard algorithms that perform searches inside a graph or carry out various types of sorting, all the way up to graph embeddings or neural networks. 

“The wider data science and machine learning community are at a very early stage of exploring the real capabilities of applying machine learning to graphs, which makes it a very exciting area,” he says. 
