@@ -14,82 +14,67 @@ Fraud Detection Accelerator Using AWS SageMaker
1414 :depth: 1
1515 :class: singlecol
1616
17- Revolutionize fraud detection in finance with MongoDB Atlas and Amazon
18- SageMaker Canvas. Leverage real-time data and AI for stronger defenses
19- against cybercrime.
20-
2117- **Use cases:** `Gen AI <https://www.mongodb.com/use-cases/artificial-intelligence>`__,
2218 `Fraud Prevention <https://www.mongodb.com/industries/financial-services/fraud-prevention>`__
2319
2420- **Industries:** `Financial Services <https://www.mongodb.com/industries/financial-services>`__,
2521 `Insurance <https://www.mongodb.com/industries/insurance>`__
2622
27- - **Products and tools:** `Atlas <https://www.mongodb.com/atlas/database>`__,
28- `Atlas Charts <https://www.mongodb.com/products/charts>`__,
29- `Data Federation <https://www.mongodb.com/products/platform/atlas-data-federation>`__
23+ - **Products and tools:** `MongoDB Atlas <https://www.mongodb.com/atlas/database>`__,
24+ `MongoDB Atlas Charts <https://www.mongodb.com/products/charts>`__,
25+ `MongoDB Data Federation <https://www.mongodb.com/products/platform/atlas-data-federation>`__
3026
3127- **Partners:** `Amazon S3 <https://aws.amazon.com/s3/>`__,
3228 `Amazon SageMaker Canvas <https://aws.amazon.com/pm/sagemaker/>`__
3329
3430Solutions Overview
3531------------------
36- Financial services organizations face growing risks from cybercriminals. High-profile
37- hacks and fraudulent transactions undermine faith in the industry. As technology evolves,
38- so do the techniques employed by these perpetrators, making the battle against fraud a
39- perpetual challenge. Existing fraud detection systems often grapple with a critical
40- limitation: relying on stale data. The newest tactics often can be seen in the data.
41- That's where the power of operational data comes into play.
42-
43- By harnessing `real-time data
44- <https://www.mongodb.com/basics/real-time-analytics-examples>`__, fraud detection models
45- can be trained on the most accurate and relevant clues available. |service-fullname|, a highly
46- scalable and flexible developer data platform, coupled with Amazon SageMaker Canvas, an
47- advanced `machine learning <https://www.mongodb.com/basics/machine-learning>`__ tool,
48- presents a groundbreaking opportunity to revolutionize fraud detection. By harnessing
49- operational data and leveraging the power of real-time insights, financial institutions
50- can fortify their defenses against cybercriminals who seek to exploit vulnerabilities for
51- illicit gains. |service-fullname| proves its strength as an operational data store,
52- accommodating high-volume transactional data with exceptional performance and flexibility.
53- Meanwhile, Amazon SageMaker Canvas empowers business analysts to leverage AI/ML solutions
54- effortlessly, providing a no-code platform that brings the power of advanced analytics to
55- their fingertips.
56-
57- Challenges with Legacy Fraud Systems
58- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
59-
60- - **Incomplete data visibility from legacy systems:** Lack of access to relevant data
61- sources hampers fraud pattern detection.
62-
63- - **Latency issues in fraud prevention systems:** Legacy systems lack real-time
64- processing, causing delays in fraud detection.
65-
66- - **Difficulty in adapting legacy systems:** Inflexibility hinders the adoption of
67- advanced fraud prevention technologies.
68-
69- - **Weak security protocols in legacy systems:** Outdated security exposes vulnerabilities
70- to cyber attacks.
71-
72- - **Operational challenges due to technical sprawl:** Diverse technologies complicate
73- maintenance and updates.
74-
75- - **Lack of collaboration between teams:** Siloed approach leads to delayed solutions and
76- higher overhead.
32+
33+ Financial institutions face growing risks from cybercriminals, including high-profile
34+ hacks and fraudulent transactions. Cyber incidents undermine customer trust and
35+ can result in significant financial losses. Many companies struggle to
36+ implement secure systems because of the limitations of legacy fraud systems, which
37+ include:
38+
39+ - **Incomplete data visibility:** Lack of access to relevant data sources for pattern
40+ detection.
41+
42+ - **Latency within fraud systems:** Lack of real-time processing capabilities
43+ that causes fraud detection delays.
44+
45+ - **Weak security protocols:** Outdated security that exposes vulnerabilities to
46+ cyber attacks.
47+
48+ - **Technical sprawl:** Diverse technologies that complicate maintenance and updates.
49+
50+ - **Poor team collaboration:** Siloed approaches that lead to delayed responses.
51+
52+ To overcome these challenges, financial companies can use
53+ `real-time analytics <https://www.mongodb.com/basics/real-time-analytics-examples>`__
54+ solutions powered by MongoDB Atlas and Amazon SageMaker Canvas. Together, these tools
55+ support fraud detection systems that work from the most current and accurate data
56+ available.
57+
58+ In this system, MongoDB Atlas stores the operational data and processes
59+ high-volume transactions. Meanwhile, Amazon SageMaker Canvas uses sophisticated AI
60+ and `machine learning <https://www.mongodb.com/basics/machine-learning>`__ (ML)
61+ tools to power advanced analytics for fraud detection.
7762
7863Reference Architectures
7964-----------------------
80- Below, you will find the architecture used to build this fraud solution. The architecture
81- includes an end-to-end solution for detecting different types of fraud in the banking
82- sector, including card fraud detection, identity theft detection, account takeover
83- detection, money laundering detection, consumer fraud detection, insider fraud detection,
84- and mobile banking fraud detection to name a few.
65+ Below is the architecture used to build this fraud detection solution.
66+ The architecture includes an end-to-end solution for detecting different types
67+ of fraud in the banking sector, including card fraud detection, identity theft
68+ detection, and consumer fraud detection.
8569
8670The architecture diagram illustrates model training and near real-time inference. The
8771operational data stored in `MongoDB Atlas <https://www.mongodb.com/atlas/database>`__ is
88- written to the `Amazon S3 <https://aws.amazon.com/s3/>`__ bucket using the Triggers
89- feature in {+atlas-app-services+}. Thus stored, data is used to create and train the
90- model in `Amazon SageMaker Canvas <https://aws.amazon.com/pm/sagemaker/>`__. The SageMaker
91- Canvas stores the metadata for the model in the |s3| bucket and exposes the model endpoint
92- for inference.
72+ written to the `Amazon S3 <https://aws.amazon.com/s3/>`__ bucket using
73+ `MongoDB Atlas Triggers <https://www.mongodb.com/docs/atlas/atlas-ui/triggers/>`__.
74+ The exported data is then used to create and train the model in
75+ `Amazon SageMaker Canvas <https://aws.amazon.com/pm/sagemaker/>`__. SageMaker
76+ Canvas stores the metadata for the model in the |s3| bucket and exposes the model
77+ endpoint for inference.
9378
9479.. figure:: /includes/images/industry-solutions/fraud-prevention-architecture.png
9580 :figwidth: 750px
@@ -99,12 +84,16 @@ for inference.
9984
10085Data Model Approach
10186-------------------
102- The data is divided into two separate files: one containing identity information and the
103- other containing transaction data. These files are connected through the TransactionID.
104- It's important to note that not every transaction includes associated identity details.
87+ The data is divided into two separate files:
88+
89+ - Transaction
90+ - Identity
10591
106- Based on the above two datasets, we prepare a test join on the TransactionID, adding the
107- target column as Fraud.
92+ These files are connected through the ``TransactionID``.
93+ However, not every transaction includes associated identity details.
94+
95+ Using these two datasets, prepare a test join on ``TransactionID``,
96+ adding Fraud as the target column.
10897
10998*Data courtesy of* `Kaggle <https://www.kaggle.com/c/ieee-fraud-detection/data>`__.
11099
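The join described above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the sample rows are invented, the ``DeviceType`` field and the ``left_join_on_transaction_id`` helper are assumptions made for the example, and only ``TransactionID``, ``TransactionAmt``, and ``isFraud`` come from the data description. A left join is used because not every transaction has an identity record.

```python
# Hypothetical sample rows. TransactionID, TransactionAmt, and isFraud follow
# the data description; DeviceType is an illustrative identity field.
transactions = [
    {"TransactionID": 1001, "TransactionAmt": 68.5, "isFraud": 0},
    {"TransactionID": 1002, "TransactionAmt": 1250.0, "isFraud": 1},
    {"TransactionID": 1003, "TransactionAmt": 29.0, "isFraud": 0},
]
identities = [
    {"TransactionID": 1002, "DeviceType": "mobile"},
]


def left_join_on_transaction_id(transactions, identities):
    """Left-join identity rows onto transaction rows by TransactionID."""
    # Index identity rows by key for constant-time lookup.
    identity_by_id = {row["TransactionID"]: row for row in identities}
    # Collect every identity field except the join key.
    identity_fields = sorted(
        {key for row in identities for key in row} - {"TransactionID"}
    )
    joined = []
    for txn in transactions:
        identity = identity_by_id.get(txn["TransactionID"], {})
        merged = dict(txn)
        for field in identity_fields:
            # None marks a transaction that has no identity record.
            merged[field] = identity.get(field)
        joined.append(merged)
    return joined


joined_rows = left_join_on_transaction_id(transactions, identities)
```

Every transaction is kept, and identity fields are set to ``None`` where no matching record exists, so the target column stays available for all rows.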
@@ -143,76 +132,51 @@ target column as Fraud.
143132 TransactionAmt,
144133 isFraud
145134
146- Building the Solution
147- ---------------------
148- The detailed step-by-step guide to build this solution can be found in this `Github repo
135+ Build the Solution
136+ ------------------
137+ The detailed step-by-step guide to build this solution is available in this `GitHub repo
149138<https://github.com/mongodb-partners/Frauddetection_with_MongoDBAtlas_and_SageMakerCanvas/blob/main/README.md>`__.
150- Below you will find an overview of those steps taken:
139+ Below is an overview of the steps:
151140
1521411. Set up the `S3 bucket
153142 <https://docs.aws.amazon.com/AmazonS3/latest/userguide/create-bucket-overview.html>`__
154143 to which the |service-fullname| data needs to be exported.
1551442. `Set up
156145 <https://www.mongodb.com/basics/clusters/mongodb-cluster-setup#:~:text=about%20storage%20capacity.-,Creating,-a%20MongoDB%20Cluster>`__
157146 an |service-fullname| Cluster.
158- 3. Set up {+atlas-app-services+}.
147+ 3. Set up `MongoDB Atlas Triggers and Functions <https://www.mongodb.com/docs/atlas/atlas-ui/triggers/>`__.
1591484. Set up the `Amazon SageMaker
160149 <https://docs.aws.amazon.com/sagemaker/latest/dg/onboard-quick-start.html>`__ domain.
161150
162- MongoDB Atlas as the Operational Data Store
163- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
164- The MongoDB Atlas developer data platform is an integrated suite of data services centered
165- on a `cloud database <https://www.mongodb.com/cloud-database>`__ designed to accelerate
166- and simplify how developers build with data. Its ability to handle large amounts of data
167- in a flexible schema empowers financial institutions to effortlessly capture, store, and
168- process high-volume transactional data in real-time. This means that every transaction,
169- every interaction, and every piece of operational data can be seamlessly integrated into
170- the fraud detection pipeline, ensuring that the models are continuously trained on the
171- most current and relevant information available. With MongoDB Atlas, financial
172- institutions gain an unrivaled advantage in their fight against fraud, unleashing the full
173- potential of operational data to create a robust and proactive defense system.
174-
175- Amazon SageMaker Canvas as an AI/ML Solution
176- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
177- Amazon SageMaker Canvas revolutionizes the way business analysts leverage AI/ML solutions
178- by offering a powerful no-code platform. Traditionally, implementing AI/ML models required
179- specialized technical expertise, making it inaccessible for many business analysts.
180- However, SageMaker Canvas eliminates this barrier by providing a visual point-and-click
181- interface to generate accurate ML predictions for classification, regression, forecasting,
182- natural language processing (NLP), and computer vision (CV). SageMaker Canvas empowers
183- business analysts to unlock valuable insights, make data-driven decisions, and harness the
184- power of AI without being hindered by technical complexities. It boosts collaboration
185- between business analysts and data scientists by sharing, reviewing, and updating ML
186- models across tools. It brings the realm of AI/ML within reach, allowing analysts to
187- explore new frontiers and drive innovation within their organizations.
188-
189151Key Learnings
190152-------------
191- - Understand the use of Atlas Application Services and Atlas Charts to build products at
192- scale.
193- - How MongoDB integrates natively with external services (such as AWS SageMaker, AWS S3)
194- to provide even more powerful applications.
195153
196- Technologies and Products Used
197- ------------------------------
154+ - **Develop real-time fraud detection solutions:** MongoDB Atlas handles large
155+ amounts of data in a flexible schema, which lets financial institutions capture,
156+ store, and process high-volume transactional data in real time.
198157
199- MongoDB Developer Data Platform
200- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
158+ - **Update fraud detection models:** Real-time processing with MongoDB's aggregation
159+ pipeline ensures that models are continuously trained with the most current and
160+ relevant information available. This capability gives financial institutions a
161+ powerful tool for building a robust fraud detection system.
201162
202- - `Atlas Database <https://www.mongodb.com/atlas/database>`__
163+ - **Integrate sophisticated AI and ML tools:** MongoDB integrates with
164+ external services, such as Amazon SageMaker Canvas, which offers AI
165+ and ML solutions in a no-code platform. This user-friendly interface makes models
166+ accessible to analysts, enabling them to easily generate accurate ML predictions
167+ for classification, regression, forecasting, natural language processing (NLP),
168+ and computer vision (CV).
203169
204- - `Atlas Charts <https://www.mongodb.com/products/charts>`__
205-
206- - `Atlas Data Federation <https://www.mongodb.com/products/platform/atlas-data-federation>`__
170+ Authors
171+ -------
172+ - Babu Srinivasan, Partner Solutions Architect at MongoDB
173+ - Igor Alekseev, Partner Solutions Architect at AWS
207174
208- Partner Technologies
209- ~~~~~~~~~~~~~~~~~~~~
175+ Learn More
176+ ----------
210177
211- - `AWS S3 <https://aws.amazon.com/s3/>`__
178+ - :ref:`arch-center-hasura-fintech-services`
212179
213- - `AWS SageMaker Canvas <https://aws.amazon.com/pm/sagemaker/>`__
180+ - :ref:`arch-center-is-payments-solution`
214181
215- Authors
216- -------
217- - Babu Srinivasan, Partner Solutions Architect at MongoDB
218- - Igor Alekseev, Partner Solutions Architect at AWS
182+ - :ref:`arch-center-is-card-fraud-solution`