Summary

In this developer code pattern, we analyze insurance claims data to determine whether users have filed fraudulent claims. We do this by analyzing data patterns with IBM Db2 Graph: a query extracts claims from the database, and the visualization library renders them as a graph. Insurance company analysts can then inspect the graph visually, looking for patterns among patients, doctor visits, multiple claims, and so on, to determine whether any of the filed claims are suspicious.

Description

As the volume of data grows, analyzing vast networks of connected data has become a challenge. In response, graph database technologies are being rapidly adopted, because they’re built around relationships and represent data in a way that is more intuitive to read and draw insights from. A number of graph databases handle graph-only use cases well, but enterprises require analytic systems that can also perform data transformations, aggregations, and other operations alongside graph analytics. While graph databases perform exceptionally well for certain types of analysis, they are not suitable for workloads that aggregate over large amounts of data, they are not ideal for transactional processing, and they don’t scale well. Most existing graph databases are also stand-alone and cannot easily integrate with other analytics systems.

In this code pattern, we learn about IBM Db2 Graph, which enables graph analytics on top of Db2. It lets you run graph analytics and SQL (for transactional processing, transformations, and other analytics use cases) on the same copy of the data, without duplicating data or changing the underlying database structure.

IBM Db2 Graph creates a virtual graph view of the underlying data using the relationships already defined in Db2. Alternatively, you can create your own graph model by defining how the tables and views in Db2 map to nodes and edges in your graph. IBM Db2 Graph then exposes the graph model so you can execute Gremlin queries against it. Because IBM Db2 Graph fetches only the necessary data from Db2 at query execution time, any updates made to data in Db2 are reflected in query results.
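To make the idea of a graph view over relational tables more concrete, here is a minimal sketch in Python. It is not how IBM Db2 Graph is implemented: SQLite from the standard library stands in for Db2, and the `patient` and `claim` tables, the `GRAPH_MODEL` mapping, and the `out_neighbors` helper are all hypothetical names invented for this illustration. The sketch only shows the general principle that an edge can be resolved from the tables at query time, so a later SQL insert is visible to the next graph lookup.

```python
import sqlite3

# Stand-in for Db2 (assumption): an in-memory SQLite database with two
# hypothetical tables. Rows become vertices; the foreign key becomes an edge.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patient (patient_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE claim (
        claim_id   INTEGER PRIMARY KEY,
        patient_id INTEGER REFERENCES patient(patient_id),
        amount     REAL
    );
    INSERT INTO patient VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO claim VALUES (100, 1, 250.0), (101, 1, 900.0);
""")

# Hypothetical graph model: each edge label names the table that stores the
# relationship and the columns holding the source and destination vertex ids.
GRAPH_MODEL = {
    "filed": {"table": "claim", "src_col": "patient_id", "dst_col": "claim_id"},
}

def out_neighbors(conn, edge_label, src_id):
    """Follow one edge label from a source vertex, reading live from the tables."""
    edge = GRAPH_MODEL[edge_label]
    sql = (
        f"SELECT {edge['dst_col']} FROM {edge['table']} "
        f"WHERE {edge['src_col']} = ? ORDER BY {edge['dst_col']}"
    )
    return [row[0] for row in conn.execute(sql, (src_id,))]

claims_before = out_neighbors(conn, "filed", 1)

# A plain SQL insert is picked up by the next lookup; nothing is copied or reloaded.
conn.execute("INSERT INTO claim VALUES (102, 1, 40.0)")
claims_after = out_neighbors(conn, "filed", 1)

print(claims_before)
print(claims_after)
```

Because the lookup runs a query against the tables each time instead of keeping its own copy of the graph, the second call includes the newly inserted claim.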

After completing this code pattern, you will be able to:

Load data into an IBM Db2 instance on the cloud.
Create an IBM Db2 Graph instance running locally.
Connect IBM Db2 Graph to the IBM Db2 instance on the cloud and create an IBM Db2 Graph database.
Run Gremlin queries on top of the IBM Db2 database.
Analyze the data using Jupyter Notebook.

Flow

Load data into the IBM Db2 database, or use data that is already there.
Run the IBM Db2 Graph Docker container. IBM Db2 Graph connects to the IBM Db2 database on the cloud and creates an overlay file, which allows users to run Gremlin queries using IBM Db2 Graph.
Analyze the data using Jupyter Notebook.
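As a rough illustration of the analysis step in the flow above, the following sketch flags one kind of pattern an analyst might look for: several claims filed by the same patient for the same doctor visit. It is plain Python rather than a Gremlin query, and the records, field layout, threshold, and `find_repeated_visit_claims` helper are assumptions made for this example, not the data or logic used in the code pattern's notebooks. A flagged group is only a candidate for review, not proof of fraud.

```python
from collections import defaultdict

# Hypothetical claim records: (claim_id, patient, doctor, visit_date).
CLAIMS = [
    ("C1", "patient-1", "doctor-A", "2020-01-05"),
    ("C2", "patient-1", "doctor-A", "2020-01-05"),
    ("C3", "patient-2", "doctor-A", "2020-01-07"),
    ("C4", "patient-3", "doctor-B", "2020-02-11"),
    ("C5", "patient-1", "doctor-A", "2020-01-05"),
]

def find_repeated_visit_claims(claims, threshold=2):
    """Group claims by (patient, doctor, visit_date) and keep the groups
    that contain at least `threshold` claims."""
    groups = defaultdict(list)
    for claim_id, patient, doctor, visit_date in claims:
        groups[(patient, doctor, visit_date)].append(claim_id)
    return {key: ids for key, ids in groups.items() if len(ids) >= threshold}

suspicious = find_repeated_visit_claims(CLAIMS)
for (patient, doctor, visit_date), ids in suspicious.items():
    print(f"{patient} filed {len(ids)} claims for {doctor} on {visit_date}: {ids}")
```

In a notebook, the flagged groups could then be passed to a visualization library to highlight the corresponding nodes and edges.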

Instructions

Ready to get started? Check out the full instructions in the README, where you will step through the details of:

Cloning the repo
Creating the IBM Db2 service on cloud
Creating schemas and loading data
Running IBM Db2 Graph locally
Connecting IBM Db2 Graph with the IBM Db2 database
Running Gremlin queries
Installing an Anaconda environment to run Jupyter Notebooks
Configuring the IBM Db2 driver
Running Jupyter Notebooks to view visualizations
