The Cutting Edge: Network Analytics for Financial Fraud Detection and Mitigation

The application of network analysis to the growing challenge of fraud and financial crime is a fast-emerging frontier in advanced data analytics. As any good fraud investigator knows, fraud and financial crime are as much deeply social phenomena as they are financial transactions gone awry. Social network analysis can therefore provide deep insights for detecting and preventing tangled, complex cases of fraud.

Fraud is estimated to consume approximately 5% of annual global gross commercial revenues, resulting in a loss of more than €2.6 trillion*. Further, with the rise of globalization and the inexorable advance of communication technology, fraud is growing each year in volume, scope, and sophistication. Traditional fraud detection and mitigation approaches involve highly manual forensics efforts and a ‘roll-up-your-sleeves-and-dig’ approach. However, the growing scale and complexity of fraud schemes mean that fraudsters are increasingly able to circumvent and evade such techniques. The increasingly ineffective strategy of random spot-checking allows profits to be bled from businesses and institutions.

Network analysis addresses the growing challenge of fraud and financial crime. By applying data analytics to networks, this approach provides deep insights for detecting and preventing tangled and complex cases of fraud.

Want to learn more about network analysis?  Overview and demonstration of network analytics from a lecture by drs. Mongeau

Advanced analytics methods such as machine learning are applied to detect fraudulent transactions. With this approach the false positive ratio in fraud detection can be reduced dramatically, resulting in levels of operational efficiency and effectiveness not achievable via traditional fraud detection methods. The results are compelling: firms and agencies that apply machine learning to fraud detection have significantly improved their detection rates.

Along with machine learning, integrated ‘network graph’ analytics can be applied to detect and mitigate fraud. This end-to-end approach results in new insights for detecting and mitigating fraud: storing and retrieving interconnected information in a native ‘network graph’ format, delivering interactive network visualizations to discover hidden structures, locating clusters and patterns, identifying links in transaction chains, and applying specialized network-focused statistical algorithms to identify and extract patterns.

Want to know more about graph databases?  See Neo4J’s website…

Figure 1 An end-to-end network analytics approach: encoding data in a graph format, producing interactive visualizations, identifying patterns, and conducting statistical analysis.

Fraud is a willful act combining highly social factors: incentives, means, opportunities, and a dose of rationalization. This confluence of enabling factors can be tracked and detected as phenomena that occur in social networks: individuals doing wrong and breaking rules in highly interconnected webs of trust, institutions, transactions, and exchanges.

Network graph analytics allows for a comprehensive and ‘native’ examination of the world as sets of overlapping networks. The general method works by obtaining seemingly simple heterogeneous datasets describing connections between associated elements. For instance, a single dataset might mix a large set of tax and banking transaction records, company ownership data, property ownership information, cellphone records, and email exchanges.
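As a minimal sketch of this loading step, the Python below reduces three small sources to typed edges and merges them into one structure. The record layouts, names, and values are hypothetical and chosen only for illustration; a real project would more likely load such data into a graph database.

```python
from collections import defaultdict

# Hypothetical sample records; layouts and values are illustrative assumptions.
bank_transfers = [("alice", "acme_bv", 12000.0), ("acme_bv", "bob", 11500.0)]
company_owners = [("bob", "acme_bv")]               # (owner, company)
phone_calls = [("alice", "bob"), ("alice", "bob")]  # (caller, callee)


def build_network(transfers, owners, calls):
    """Merge heterogeneous records into one network keyed by node.

    Returns {node: [(neighbor, layer), ...]} with one entry per record,
    so repeated contacts remain visible as parallel edges.
    """
    network = defaultdict(list)
    for payer, payee, _amount in transfers:
        network[payer].append((payee, "transfer"))
    for owner, company in owners:
        network[owner].append((company, "owns"))
    for caller, callee in calls:
        network[caller].append((callee, "call"))
    return dict(network)


def layers_between(network, a, b):
    """Return the set of layers that connect a to b directly."""
    return {layer for neighbor, layer in network.get(a, []) if neighbor == b}


network = build_network(bank_transfers, company_owners, phone_calls)
print(layers_between(network, "alice", "bob"))
```

Keeping the layer name on every edge is what later allows questions that cross datasets, such as whether two parties who call each other are also linked through a company.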

Figure 2 The pillars of fraud: highly social fraud factors are difficult to identify via structured datasets, but emerge via the agglomeration of data into a network, allowing for latent pattern detection.

By loading such seemingly simple ‘metadata’ into a native network format, insightful and powerful visualizations of hidden patterns and connections in networks of exchanges can be produced. This approach also supports advanced statistical analysis of the network exchanges, identifying ‘normal’ types of transactions and quickly isolating ‘abnormal’ exchanges.
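One simple way to separate ‘normal’ from ‘abnormal’ exchanges can be sketched as follows. This is only an illustration: the transfers are hypothetical, and the modified z-score with a 3.5 cutoff is a common rule of thumb swapped in here, not a method taken from the article.

```python
from statistics import median

# Hypothetical transfers (payer, payee, amount); values are illustrative.
transfers = [
    ("a", "b", 100.0), ("a", "c", 120.0), ("b", "c", 90.0),
    ("c", "d", 110.0), ("d", "a", 95.0), ("d", "e", 5000.0),
]


def flag_abnormal(transfers, threshold=3.5):
    """Flag transfers whose amount is far from the median.

    Uses the modified z-score 0.6745 * |x - median| / MAD; the 3.5 cutoff
    is a rule-of-thumb assumption, not a tuned value.
    """
    amounts = [amount for _, _, amount in transfers]
    center = median(amounts)
    mad = median(abs(x - center) for x in amounts)
    if mad == 0:
        return []  # no spread, so nothing can be ranked as unusual
    return [
        t for t in transfers
        if 0.6745 * abs(t[2] - center) / mad > threshold
    ]


print(flag_abnormal(transfers))
```

The median-based score is used because a few very large transfers would distort a mean and standard deviation, which is exactly the situation a fraud dataset tends to present.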

Figure 3 A network representation of email communication patterns preceding the collapse of the U.S. energy trading company Enron. Key actors emerge as central ‘nodes’, along with lesser known facilitators. The same technique can be applied to complex forensics investigations, providing for the detection of communication patterns and identifying key parties. This approach can be enhanced with semantic analytics to highlight key terms and sentiments used in communications.

Traditional SQL-based relational database management system (RDBMS) approaches have inherent limitations in storing and extracting highly interconnected information. While RDBMS solutions are a powerful method for ensuring data integrity and retrieving structured information, they are poorly suited to representing networks. For example, identifying chains of friends-of-friends-of-friends, a common feature of social networking sites such as Facebook and LinkedIn, is best served by Not-Only-SQL (NOSQL) solutions such as graph databases.

Figure 4 Seemingly straightforward questions, such as “who co-owns a house with a friend-of-a-friend”, quickly become complex and computationally intensive when using relational databases.

SQL QUERY – Who co-owns a house with a friend-of-a-friend?
SELECT [1Person].Person_Name,
       [2Address].Address, [2Address].City, [2Address].Country,
       [4Friend].ContactName,
       [2Address_2].Address, [2Address_2].City, [2Address_2].Country,
       [6Friend_of_Friend].ContactName,
       [2Address_1].Address, [2Address_1].City, [2Address_1].Country
FROM ((([6Friend_of_Friend]
        INNER JOIN (([4Friend]
                     INNER JOIN ([1Person]
                                 INNER JOIN [3Person_Friends]
                                 ON [1Person].Person_Key = [3Person_Friends].Person)
                     ON [4Friend].Person_Key = [3Person_Friends].Friend)
                    INNER JOIN [5Friend_Friends]
                    ON [4Friend].Person_Key = [5Friend_Friends].Person)
        ON [6Friend_of_Friend].Person_Key = [5Friend_Friends].Friend)
       INNER JOIN [2Address] AS [2Address_1]
       ON [6Friend_of_Friend].Person_Key = [2Address_1].Person_ForKey)
      INNER JOIN [2Address] AS [2Address_2]
      ON [4Friend].Person_Key = [2Address_2].Person_ForKey)
     INNER JOIN [2Address]
     ON [1Person].Person_Key = [2Address].Person_ForKey
ORDER BY [1Person].Person_Name, [4Friend].ContactName;
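For contrast, the same question can be sketched as a traversal over a network structure. The Python below is not graph database query syntax; it is a plain stand-in for the kind of hop-by-hop traversal a graph database performs natively, and the people, friendships, and houses are hypothetical. Note that the hop count is a parameter, so the code does not change when the question moves from friend-of-a-friend to a longer chain.

```python
from collections import deque

# Hypothetical friendship and ownership data; names are illustrative.
friends = {
    "ann": {"ben"},
    "ben": {"ann", "cat"},
    "cat": {"ben", "dan"},
    "dan": {"cat"},
}
owners_by_house = {"12 Canal St": {"ann", "cat"}, "3 Mill Rd": {"dan"}}


def within_hops(graph, start, max_hops):
    """Breadth-first search: map each reachable person to the hop count."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if hops[person] == max_hops:
            continue  # do not expand beyond the requested distance
        for friend in graph.get(person, ()):
            if friend not in hops:
                hops[friend] = hops[person] + 1
                queue.append(friend)
    return hops


def co_owners_at_distance(person, distance):
    """People exactly `distance` hops away who share a house with person."""
    hops = within_hops(friends, person, distance)
    result = set()
    for owners in owners_by_house.values():
        if person in owners:
            result |= {o for o in owners if hops.get(o) == distance}
    return result


print(co_owners_at_distance("ann", 2))  # friend-of-a-friend co-owners
```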

NOSQL graph databases store and retrieve data in a native network format. Neo4J is a market-leading graph database which can be rapidly implemented (a free O’Reilly book covering graph databases is available on the Neo4J website). With network data storage, management, and retrieval in place, advanced network analytics can be applied to quickly detect potential fraud. Techniques include advanced network pattern discovery, cluster analysis, applied graph mathematics, statistical analysis of transaction chains, and transaction chain identification and retrieval.

This approach can be used to detect possible tax fraud schemes, EU VAT carousel fraud for instance, given a set of tax posting, invoicing, banking transaction, and company ownership data supplemented with select third-party data (credit risk and criminal records, for instance). The same approach can be applied to credit card fraud risk. Each additional dataset is ‘layered’ onto the base network, creating increasingly rich patterns. The approach is also useful in identifying structural weaknesses and areas where increased monitoring and control should be applied.

Figure 5 An example of a particular EU cross-border tax fraud scheme encoded as a basic network pattern in the Neo4J graph database. Searches can be quickly conducted across very large datasets based on known patterns. In addition, discovery-focused statistical analysis can be conducted to identify unusual patterns appropriate for follow-up investigation.
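Searching for a known pattern can be illustrated with a short sketch. It assumes, purely for illustration, that the pattern of interest is a closed chain of invoices passing through companies in more than one country; the exact pattern in the figure is not reproduced here, and the company names and countries are hypothetical.

```python
# Hypothetical invoice edges (seller, buyer) and company countries.
invoices = [("nl_co", "de_co"), ("de_co", "be_co"), ("be_co", "nl_co"),
            ("nl_co", "nl_shop")]
country = {"nl_co": "NL", "de_co": "DE", "be_co": "BE", "nl_shop": "NL"}


def find_cycles(edges, max_length=5):
    """Return simple directed cycles of up to max_length nodes.

    Each cycle is reported once, starting from its smallest node name.
    """
    graph = {}
    for seller, buyer in edges:
        graph.setdefault(seller, set()).add(buyer)

    cycles = []

    def walk(start, node, path):
        for nxt in sorted(graph.get(node, ())):
            if nxt == start:
                cycles.append(path[:])  # chain closed back on its origin
            elif nxt > start and nxt not in path and len(path) < max_length:
                walk(start, nxt, path + [nxt])

    for start in sorted(graph):
        walk(start, start, [start])
    return cycles


def cross_border(cycle):
    """True when the cycle involves companies in more than one country."""
    return len({country[c] for c in cycle}) > 1


suspicious = [c for c in find_cycles(invoices) if cross_border(c)]
print(suspicious)
```

A closed invoice chain is not proof of fraud; in this sketch it only marks a group of companies as a candidate for follow-up investigation.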

The value in native graph storage and retrieval for fraud detection and mitigation is multi-layered:

  1. Establishes a native machine-readable and transferable language for describing fraud schemes
  2. Establishes a method for storing fraud scheme patterns in terms of both general, high-level implementation (of interest to computational detection and machine learning) and lower-level detail (of interest to investigators and forensics specialists)
  3. Provides a platform for computational and forensics / investigative specialists to fluidly collaborate
  4. Provides a platform for mixing and ‘layering’ details concerning fraud cases in terms of multiple domains (e.g. financial transactions, communications, associations of individuals, business ownership)
  5. Provides a basis for interactive visualization for pattern discovery
  6. Provides a basis for pattern detection in large datasets based on pre-identified schemes
  7. Facilitates the identification of new schemes via statistical pattern analysis and machine learning
  8. Provides a basis for detecting potential systemic fraud via risk-weighted analysis of localized clusters
  9. Provides a quantitative measure of fraud patterns which can be used more generally to identify statistical propensity (e.g. path length and transaction volume along set paths in isolated clusters being a measurable risk factor for carousel fraud)

These uses can easily overlap and integrate with existing approaches to fraud detection and analysis.
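The quantitative measures mentioned in the last item can be sketched in a few lines. The example groups parties into isolated clusters and totals the transaction volume inside each one; the transactions are hypothetical, and cluster size and volume are used here as simple stand-ins for a fuller risk measure.

```python
from collections import defaultdict

# Hypothetical transactions (payer, payee, amount); values are illustrative.
transactions = [("a", "b", 500.0), ("b", "c", 480.0), ("c", "a", 470.0),
                ("x", "y", 40.0)]


def cluster_metrics(transactions):
    """Group parties into isolated clusters and summarize each one.

    Returns a list of (members, total_volume) tuples, largest volume first.
    Direction is ignored when deciding which parties belong together.
    """
    neighbors = defaultdict(set)
    for payer, payee, _ in transactions:
        neighbors[payer].add(payee)
        neighbors[payee].add(payer)

    cluster_of = {}
    for node in sorted(neighbors):
        if node in cluster_of:
            continue
        stack = [node]
        cluster_of[node] = node  # label each cluster by its first member
        while stack:
            current = stack.pop()
            for other in neighbors[current]:
                if other not in cluster_of:
                    cluster_of[other] = node
                    stack.append(other)

    volume = defaultdict(float)
    members = defaultdict(set)
    for payer, payee, amount in transactions:
        label = cluster_of[payer]
        volume[label] += amount
        members[label].update((payer, payee))

    return sorted(
        ((sorted(members[k]), volume[k]) for k in volume),
        key=lambda item: item[1],
        reverse=True,
    )


print(cluster_metrics(transactions))
```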

At the advanced forefront, network simulations can be run on network graph data. Once a particular market or set of transactions is sufficiently represented as a network phenomenon, simulations can be run to better understand the nature of the market or phenomenon under investigation.

Figure 6 Multi-agent simulation utilizing network data can be used to examine the dynamic nature of networks, for instance to identify potential structural weaknesses in financial control or compliance systems based on game theory models of behavior.

Techniques such as multi-agent simulation of game theory scenarios can thus be applied to understand structural weaknesses in controls, markets, or transaction chains. For instance, by modeling a large trading operation as a network of transactions, trust, and incentives, trading operations can be simulated in order to detect and better understand the risk of trading fraud, a persistent problem causing ever-spiraling losses for financial institutions.
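A toy version of such a simulation is sketched below. Everything in it is an assumption made for illustration: the four traders, the payoff numbers, the rule that each control catches a fraudulent trade with a fixed probability, and the rule that each cheating peer cancels out one control. It is meant to show the mechanics of repeated best-response updates on a network, not to model any real trading operation.

```python
# Hypothetical trading desk: peers each trader observes, and how many
# independent controls review each trader. All numbers are assumptions.
peers = {"t1": ["t2"], "t2": ["t1", "t3"], "t3": ["t2"], "t4": []}
controls = {"t1": 0, "t2": 1, "t3": 1, "t4": 3}

GAIN = 10.0        # payoff from an undetected fraudulent trade
PENALTY = 40.0     # loss if the trade is detected
CATCH_RATE = 0.25  # chance that a single control catches the trade


def detection_probability(trader, cheating):
    """Chance of being caught; each cheating peer is assumed to cancel
    one control (a modeling assumption, not an empirical finding)."""
    cheating_peers = sum(p in cheating for p in peers[trader])
    effective = max(controls[trader] - cheating_peers, 0)
    return 1 - (1 - CATCH_RATE) ** effective


def simulate(rounds=5):
    """Repeat best-response updates; return the sorted cheaters per round."""
    cheating = set()
    history = []
    for _ in range(rounds):
        # Each trader responds to the previous round's set of cheaters.
        cheating = {
            t for t in peers
            if GAIN * (1 - detection_probability(t, cheating))
            > PENALTY * detection_probability(t, cheating)
        }
        history.append(sorted(cheating))
    return history


for round_number, cheaters in enumerate(simulate(), start=1):
    print(round_number, cheaters)
```

Under these assumptions a single uncontrolled trader gradually erodes the controls on connected peers, while the well-controlled, unconnected trader never cheats. That is the kind of structural weakness a fuller simulation would be used to look for.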

WANT TO KNOW MORE? RECENT ACFE PRESENTATION ON ADVANCED ANALYTICS FOR FRAUD DETECTION AND MITIGATION
http://www.academia.edu/6645077/Continuous_Fraud_Monitoring_and_Detection_via_Advanced_Analytics_State-of-the-Art_Trends_and_Directions

* Source: ACFE ‘Report to the Nations 2012 Global Fraud Study’

http://www.acfe.com/uploadedFiles/ACFE_Website/Content/rttn/2012-report-to-nations.pdf

ABOUT THE AUTHOR

Scott Mongeau, MA MA GD MBA PhD (ABD)
Analytics Manager, Risk Services
Deloitte Netherlands

Scott Mongeau, Analytics Manager at Deloitte, has more than 20 years of experience in project-focused analytics functions in a range of industries. He is an active university researcher, lecturer, conference presenter and writer in the areas of data analytics, fraud analytics, and social network analysis (SNA).

About SARK7

Scott Allen Mongeau (SARK7) is an INFORMS Certified Analytics Professional (CAP) and a Data Scientist in the Cybersecurity business unit at SAS Institute. Scott has over 20 years of experience in project-focused analytics functions in a range of industries, including IT, biotech, pharma, materials, insurance, law enforcement, financial services, and start-ups. Scott is a part-time PhD (ABD) researcher at Nyenrode Business University. He holds a Global Executive MBA (OneMBA) and a Master's in Financial Management from Erasmus Rotterdam School of Management (RSM). He has a Certificate in Finance from University of California at Berkeley Extension, an MA in Communication from the University of Texas at Austin, and a Graduate Degree (GD) in Applied Information Systems Management from the Royal Melbourne Institute of Technology (RMIT). He holds a BPhil from Miami University of Ohio. Having lived and worked in a number of countries, Scott is a dual American (native) and Dutch citizen. He may be contacted at: webmaster@sark7.com

All posts are copyright © 2015 SARK7. All external materials utilized imply no ownership rights and are presented purely for educational purposes.

10 Comments on “The Cutting Edge: Network Analytics for Financial Fraud Detection and Mitigation”

  1. Errol Says:

    Great article!

    Regarding the RDBMS setup, you should be able to meet most of the requirements stated in the article with the traditional many-to-many approach. The weakness of the tables represented in the picture is poor design, which limits the model to a fixed number of friends. The improved design could then be visualized using software like i2 or even free open source libraries like D3.js.

    • sctr7 Says:

      Agreed that a relational database approach can be used for this type of problem (a simple extension of join recursion). Performance, query simplicity, and flexibility in data modeling are three benefits of native graph data storage and retrieval.

      There are some computational problems which can be more elegantly addressed via a native graph approach, travelling salesman type problems for example: http://www.codeproject.com/Articles/536506/MappingplusShortestplusRoutesplusUsingplusaplusGra

      LinkedIn, Twitter, and many online dating services utilize graph databases, as there are clear benefits of native graph storage in their highly networked problem sets. Facebook apparently utilizes a hybrid relational / graph data storage and retrieval approach.

      Recently I was on a large forensics project where we were working with hundreds of RDBMS tables requiring complex SQL procedures that were at times 70 pages long. As the problem set involved highly networked phenomena, this would have been a good case for using graph database storage and retrieval.

      The value proposition is covered in more detail in the O’Reilly book ‘Graph Databases’, which is available for free here: http://www.neo4j.org/learn .
      NOSQL means Not Only SQL, so the implication is that graph databases can work nicely side-by-side with RDBMS solutions. Indeed, I often store my core data in a traditional DB and export to a graph database. Likewise, queries on the graph can be exported and returned to an RDBMS for analysis (or exposure via software such as i2). This is thus a new tool that can supplement and extend traditional RDBMS approaches, especially when investigating complex phenomena such as fraud.

  2. sctr7 Says:

    Another good example – bank fraud detection via graph / network analysis: http://gist.neo4j.org/?github-neo4j-contrib%2Fgists%2F%2Fother%2FBankFraudDetection.adoc

    • casualwear85 Says:

      Thanks,

      I have been looking into this a bit but haven’t found a good internal use case yet. Our focus is mostly to follow the money and to do probability scoring by applying different classification algorithms; this adds another dimension.

      I see the benefits with more complex fraud and money laundering schemes. I worked with some of your UK and US colleagues on an assignment, and I'm surprised we didn’t touch this approach.
