Thursday, November 28, 2019

Fraud Detection in Banking Transactions

The purpose of this document is to detail the description of the Real Time (Active) Fraud Detection in Banking Transactions (FDBT) Project. This document is a key project artifact and is used during the design, construction, and rollout phases.

Scope

The objective of this project report is to capture the functional and non-functional requirements for the Real Time FDBT project. It lists the complete system requirements and the design architecture of the project. The requirements contained herein include, but are not limited to:

Capabilities or system functionality (what the system does), including:
- Interfaces (internal and external hardware)
- Business rules
- Data sources and destinations
- Exact sequence of operations and the algorithms used in those operations
- Triggers or stimuli to initiate operations or to force a change in state
- Error handling, recovery, and responses to abnormal situations
- Validity checks
- Input/output sequences and conversion algorithms
- Frequency of use and update

Constraints (limitations imposed on the solution by circumstance, force, or compulsion), including:
- Design constraints based on TrinucInc IT standards
- Control and governance constraints (internal and external)

Non-functional requirements, including:
- Performance requirements and usability
- Quality requirements (auditability, reliability, maintainability, etc.)
- Business continuity and operational support
- Security, control, and training

Introduction

Background

According to the National Check Fraud Center in Charleston, South Carolina, bank fraud alone is a $10 billion a year problem, roughly 150 times the $65 million taken in bank robberies annually.
The Concise Oxford Dictionary defines fraud as 'criminal deception; the use of false representations to gain an unjust advantage'. Fraud is as old as humanity itself and can take an unlimited variety of forms. However, in recent years the development of new technologies (which have made it easier for us to communicate and helped increase our spending power) has also provided yet further ways in which criminals may commit fraud. As fraud attempts grow in both number and variety, financial institutions are challenged with the need for comprehensive, yet cost-effective, risk management solutions. It is our belief that these fraudulent or suspicious financial transactions can be identified, characterized, and red-flagged in real time, providing vital information to reduce their occurrence. For example, a check deposit followed almost immediately by a cash withdrawal would be a suspicious activity and would warrant a red flag to check the customer's motives. Banking databases with all the transaction information are readily available. We use this information, coupled with our business logic, to detect fraud and to develop the real-time fraud detector.

Types of bank/financial fraud:
- Check fraud
- New account fraud
- Identity fraud
- Credit/debit card fraud
- ATM transaction fraud
- Wire fraud
- Loan fraud

Research

The research for the project was twofold: business issues and technical research. The first step was to identify the various ways in which bank fraud occurs and come up with common-sense solutions to them based on our technical knowledge base. The next was to come up with the software architecture, with technical decisions on the choice of RDBMS, ETL tool, and OLAP tool.

Business Issues

A detailed list of the ways fraud occurs and the activities that could red-flag a transaction as suspicious (note: these activities generally pertain to personal banking, not corporate accounts):
- A check deposit is closely followed by a cash withdrawal, within say 10 hours.
- The number of transactions of a given type exceeds a specified count in 48 hours.
- More than one session is active at the same time.
- An attempt is made to withdraw more money than the credit limit.
- An attempt is made to withdraw more money than the debit balance.
- More than 3 logon attempts are made at once.
- Any transaction exceeds 80% of the credit limit in 48 hours (one transaction, or the sum of transactions in the 48-hour period).
- Deposit activity is out of the normal range for the account.
- Invalid routing transit numbers.
- Excessive numbers of deposited items.
- Total deposit amounts greater than average.
- Large deposited items masked by smaller deposit transactions.
- The amount exceeds the historical average deposit amount by more than a specified percentage.
- A duplicate deposit is detected.
- Deposited checks contain invalid routing or transit numbers.
- The level of risk can be managed based on the age of the account (e.g., a closed account suddenly receiving many transactions).
- The number of deposits exceeds the customer's normal activity.
- The proximity of the customer's residence or place of business should be considered.
- Wire transfers, letters of credit, and non-customer transactions, such as funds transfers, should be compared with the OFAC lists before being conducted.
- A customer's home/business telephone is disconnected.
- A customer makes frequent or large transactions and has no record of past or present employment.
- A customer uses the automated teller machine to make several bank deposits below a specified threshold.
- Wire transfer activity to/from a financial secrecy haven or high-risk geographic location without an apparent business reason, or inconsistent with the customer's business or history.
- Many small incoming wire transfers of funds are received, or deposits are made using checks and money orders, and almost immediately all or most are wired to another city or country in a manner inconsistent with the customer's business or history.
- Large incoming wire transfers on behalf of a foreign client with little or no explicit reason.
- Wire activity that is unexplained, repetitive, or shows unusual patterns.
- Payments or receipts with no apparent links to legitimate contracts, goods, or services.
- A customer purchases a number of cashier's checks, money orders, or traveler's checks for large amounts under a specified threshold.
- Money orders deposited by mail which are numbered sequentially or have unusual symbols or stamps on them.
- Suspicious movements of funds from one bank into another, then back into the first bank: 1) purchasing cashier's checks from bank A; 2) opening a checking account at bank B; 3) depositing the cashier's checks into the checking account at bank B; and 4) wire transferring the funds from the checking account at bank B into an account at bank A.
- A rapid increase in the size and frequency of cash deposits with no corresponding increase in non-cash deposits.
- Significant turnover in large-denomination bills that would appear uncharacteristic given the bank's location.

Different banks take different actions when confronted with a fraudulent transaction. The following table describes some of the actions taken, with an ID number assigned to each:

| Bank Action | ID No. |
| Freeze account (no future transactions) | 8 |
| Deny transaction | 7 |
| Teller warning: confirmed fraud, call security (e.g., blacklisted check) | 6 |
| Teller warning: double-check ID, customer has bad history | 5 |
| Teller warning: call the person from whom the check originated (large check amount, etc.) | 4 |
| Deny ATM/online banking access | 3 |
| Reduce line of credit | 2 |
| Report to collection agency | 1 |

We further assign risk ranks to the various fraud detection rules, along with their dependent transaction parameters:

| Fraud Activity Description | Dependent Parameters | Business/Detection Rule | Activity Scale |
| Any transaction(s) exceed 80% of the credit limit in 48 hours | Old amount, requested amount (sum of amounts) | Requested amount >= 0.8 x old amount | 4 |
| More than ... transactions of a type in 48 hours | Type of transaction | Count of similar transactions in a specified time span | 4 |
| More than one session active at the same time | Transaction time | Same transaction times | 6 |
| Withdrawal of more money than the credit limit | Line of credit, requested amount | LOC minus requested amount < 0 | 7 |
| More than 3 logon attempts at once | Transaction time | Same transaction times | 6 |
| Check deposit closely followed by cash withdrawal | Transaction time, transaction type | Time between deposit and withdrawal <= 10 hrs | 6 |
| ... | | | |

This transaction data, provided by Teradata, acts as the input to Ab Initio, where we apply our business rules in the transformation stage and obtain our output: red flags on the fraudulent transactions.

Technical Research

Relational Database Management System: Teradata

The Teradata data warehouse brings together all of the obtained data into a single repository for a completely integrated, 360-degree view of the truth. The Teradata Warehouse is a complete solution that combines Teradata's high-performance parallel database technology, a full suite of data access and management tools, robust data mining capabilities, world-class scalable hardware, and experienced data warehousing consultants. With the Teradata RDBMS, you can access, store, and operate on data using Teradata Structured Query Language (Teradata SQL), which is broadly compatible with IBM and ANSI SQL. The Teradata RDBMS provides:
- Capacity to hold terabytes of data.
- Parallel processing, which makes it faster than other relational databases.
- Parallel Database Extensions (PDE), a software interface layer on top of the operating system that enables the RDBMS to operate in a parallel environment.
- A single data store that can be accessed from anywhere.
- Fault tolerance: hardware failures are automatically detected and corrected.
- Data integrity: the transaction completes, or rolls back to a stable state if a fault occurs.

The architecture includes both Symmetric Multi-Processing (SMP) and Massively Parallel Processing (MPP) systems. They communicate through a fast interconnect: the BYNET for MPP systems and a board-less (virtual) BYNET for SMP systems. Users of the client system send requests to the Teradata RDBMS through a choice of supported utilities and interfaces: BTEQ, Teradata SQL Assistant, Teradata WinDDI, Teradata MultiTool, FastExport, FastLoad, MultiLoad, TPump, and Teradata Visual Explain. These benefits have led many Oracle customers to shift their data warehouses from Oracle to Teradata in the past 3 years. In addition, Teradata provides the ability to:
- Establish a single, centralized data warehouse with direct access by hundreds or even thousands of users across the organization.
- Integrate data from multiple sources and functional areas using one data model.
- Handle large and growing data volumes, complexity, and numbers of users without restructuring or reorganization.
- Allow ad hoc, complex queries at any time, with no scheduling necessary.

This technical prowess has won Teradata, a division of NCR, a customer list of about 650 large companies and government agencies, including Albertsons, FedEx, Ford, the U.S. Postal Service, and Wal-Mart. Characteristics of the Teradata RDBMS that led us to use it in our project:
- Extensive customer references.
- Time to solution is enhanced by Teradata's flexibility, which supports rapid initial implementation and ongoing extensions.
- Teradata can significantly lower total cost of ownership.
- The Teradata Database is supported by a 24/7/365 support infrastructure.
- As data volume grows, so will the system, and the growth is relatively effortless from a hardware and software point of view.
- More and more users will need access to the data warehouse, but adding them will not impact performance.
- Teradata's performance with complex and ad hoc queries is unequaled.
- Loading and reloading the database is fast and fail-safe with Teradata.
- If the warehouse needs to interface with the mainframe, Teradata offers seamless interoperability.
- The Teradata Database is a mature product, first released in 1984.

Extraction, Transformation and Loading Tool

Ab Initio is an ETL tool for enterprise-class, mission-critical applications such as data warehousing, batch processing, data movement, data analysis, and analytics. Ab Initio helps to build large-scale data processing applications and run them in parallel environments. Ab Initio software consists of two main programs:
- The Co>Operating System, which the system administrator installs on a host UNIX or Windows NT server, as well as on the processing nodes. (The host is also referred to as the control node.)
- The Graphical Development Environment (GDE), which uses graphical components to build applications that transform large amounts of data.

Business rules are applied to the datasets in the data transformation step, for example using the Filter by Expression component. Characteristics of the Ab Initio ETL tool needed for the project:
- Parallel data cleansing and validation.
- Parallel data transformation and filtering.
- Real-time, parallel data capture.
- High-performance analytics.
- Integration with the Teradata RDBMS.
- Integration and parallel execution of custom fraud detection code.
- Handles data volumes of 30 terabytes.

Online Analytical Processing Tool

Erwin is used in the design, generation, and maintenance of high-quality, high-performance databases, data warehouses, and final reports and graphs.
From a logical model of the information requirements and business rules that define the database, to a physical model optimized for the specific characteristics of the target database, Erwin helps to visually determine the proper structure, key elements, and optimal design for the database. Its Complete Compare technology allows iterative development, keeping the model synchronized with the database at all times. Erwin Data Modeler scales across the enterprise by integrating with CA's Model Manager. This model management system provides an important part of the solution to the security management requirement. By dividing, sharing, and reusing designs across different development efforts, modeling productivity can be maximized and corporate standards can be easily established and enforced.

System Design and Architecture

The data warehouse functional architecture, as well as the technical architecture, serves as a blueprint for the data warehousing effort. Throughout the project, along with the architectural details, various fraud detection issues and some solutions to them have been discussed in technical detail. The functional requirements express how the system behaves, focusing on the inputs, outputs, and processing details of the system. The technical architecture centralizes the information and proposes how to integrate it. The technical architecture model depicts the overall framework and displays physical and logical interconnections. The technical architecture plan provides further insight on details dictated by the model, some of which were not apparent during previous project phases. The plan shows not only what has to be done, but also gives some indication as to why, by its placement in the overall architecture model. The technical architecture model and plan provide two key benefits to the data warehouse project: improved communications and enhanced adaptability.
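The end-to-end flow this architecture describes, transaction records in, business rules applied in the transformation stage, red flags out, can be sketched in miniature. This is a hedged illustration in plain Python, not the project's actual code: the 10-hour window and activity scale 6 come from the rule table above, but the record fields and sample values are made up.

```python
from datetime import datetime, timedelta

# Illustrative transactions; the field names and values are assumptions, not the real schema.
transactions = [
    {"cust_id": 1, "type": "deposit",    "amount": 5000.0, "time": datetime(2019, 11, 1, 9, 0)},
    {"cust_id": 1, "type": "withdrawal", "amount": 4800.0, "time": datetime(2019, 11, 1, 15, 0)},
    {"cust_id": 2, "type": "withdrawal", "amount": 100.0,  "time": datetime(2019, 11, 1, 10, 0)},
]

def deposit_then_withdrawal(txns, window=timedelta(hours=10)):
    """Red-flag a check deposit closely followed by a cash withdrawal (activity scale 6)."""
    flags = []
    deposits = [t for t in txns if t["type"] == "deposit"]
    for w in txns:
        if w["type"] != "withdrawal":
            continue
        for d in deposits:
            # Flag when the same customer's withdrawal falls within the window after a deposit.
            if d["cust_id"] == w["cust_id"] and timedelta(0) <= w["time"] - d["time"] <= window:
                flags.append((w["cust_id"], 6))
    return flags

print(deposit_then_withdrawal(transactions))  # → [(1, 6)]
```

Customer 1's withdrawal comes 6 hours after a deposit and is flagged; customer 2 has no matching deposit. In the real system this logic lives in the ETL transformation stage rather than in application code.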
Furthermore, by keeping these documentation items synchronized with project modifications, one has a ready, reliable data warehouse reference source. The data obtained from different financial institutions may vary in format and in the information provided. For the sake of generality and flexibility, we use data generated according to the official data architecture standards as presented on the Center for Information Management website (www.cimu.gov.mt). We will start the description of the system design by showing the steps involved in the creation of datasets in Teradata. In our project we use the 6.1 demo version of Teradata.

1. Start the Teradata instance from the Windows START menu by selecting the Teradata Service Control icon. Minimize the screen after it says "Teradata Running".

2. Now start Teradata SQL Assistant, also from the Windows START menu. It opens with Query and History windows by default. In the Query window we can write all the queries that we want to run; the History window keeps track of all the queries that we run.

3. Click the top-left Connect button to connect to an existing database or create a new one. (Fig. 1: Selection of database menu; here the financial database is being selected. We can also click the New button and create a new database.) Fig. 2 shows an instance of retrieving the Teradata database. Database specifications are entered: the name of the database, along with fields for the username and password of the database we are about to create. Thus a Teradata database is created.

4. The next step is to create some tables in this database. We can write a query in the Query window to create a table, then click the Execute button to run it. (Fig. 3: A test table is created with two columns.)
5. To see the newly created table, click the top-left Disconnect button and then click Connect again. Similarly, data can be inserted into it using an INSERT query. Note: right-clicking in the Query window brings up the Query Builder, which can be used to build all sorts of queries and to extract information from the tables.

The tables created for this project are:
- customer (customer details)
- cc_acct (credit card account details)
- credit_tran (credit card transaction details)
- checking_acct (checking account details)
- checking_tran (checking transaction details)
- savings_acct (savings account details)
- savings_tran (savings transaction details)
- accts (account types)

These tables, with their columns, are shown below. (Fig. 4: Customer details table. Fig. 5: Credit card account details table. Fig. 6: Credit card transaction details table. Fig. 7: Checking account details table. Fig. 8: Checking account transaction details table. Fig. 9: Savings account details table. Fig. 10: Savings account transaction details table. Fig. 11: Account details table.)

Teradata is also used to provide the following data, though we do not show its creation or use in this project report: customer information such as full name, address, SSN, number and types of accounts, credit limit, and contact phone numbers; and bank information such as bank branches, branch locations, and sales representatives.

6. Our table creation process in Teradata is now over. We now use the ETL tool Ab Initio to create the graph implementing our business rules. For the sake of presentation simplicity and project confidentiality, we concentrate only on the following fraud detection rules:
- The customer transaction is not part of the bank (a fake transaction).
- The customer's withdrawals are more than 80% of the balance in the span of 24 hours.

There are basically three components in the creation of graphs in Ab Initio: an input component, a transform component, and an output component.
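Before walking through the Ab Initio screens, the roles of these three components can be sketched as a tiny pipeline. This is plain Python standing in for Ab Initio's graph components, a hedged illustration only: the dropped fields mirror the tran_time/channel/tran_code example used later in the walkthrough, while the sample record itself is made up.

```python
def input_component(records):
    """Stand-in for the input component: read records from the dataset."""
    yield from records

def transform_component(records, drop_fields=("tran_time", "channel", "tran_code")):
    """Stand-in for the reformat/transform component: drop fields not wired to the output port."""
    for r in records:
        yield {k: v for k, v in r.items() if k not in drop_fields}

def output_component(records):
    """Stand-in for the output component: collect the transformed records."""
    return list(records)

# A single illustrative record flowing through the three-stage graph.
data = [{"cust_id": 7, "amount": 120.0, "tran_time": "09:14", "channel": "ATM", "tran_code": "W"}]
result = output_component(transform_component(input_component(data)))
print(result)  # → [{'cust_id': 7, 'amount': 120.0}]
```

The generators here echo Ab Initio's streaming design: each component processes records as they arrive rather than materializing the whole dataset between stages.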
7. The next step is to specify the input component in Ab Initio. We do this by double-clicking the input component and entering the URL path of the input file location, as well as the record format of the file. (Fig. 12: The input component step of the graph creation process. Fig. 13: The record format step.) The file we have used in this example is 9,649 KB.

8. We can view the data of the input file by right-clicking the component and selecting the View Data option. There are almost 112,000 records in this file, but for simplicity only 100 records are being viewed. (Fig. 14: View of the input table.)

9. Now we join the input port to the reformat component, and the output port of the reformat component to the output component. We can apply a transform to the input file either by selecting the transform option in the Parameters tab of the reformat component or by entering the record format in the output component. (Fig. 15: Reformatting of input components. Fig. 16: Applying the transform option in the reformat component.) In this example the tran_time, channel, and tran_code fields are not wired to the output port. As a result, the output file will not have these fields in it after the graph is executed.

10. The next step is to build the graph. (Fig. 17: The graph building process.)

11. Upon successful completion of the graph build, we can view our output file by opening it in any editor, such as Notepad. (Fig. 18: The output file of the Ab Initio graph as viewed in Notepad. Fig. 19: Full view of all the datasets in the output file.) The output file is 8,650 KB.

12. Similarly, a flat file (any database table) can be read and written, or loaded into a Teradata table, by following the above-mentioned steps. The config .dbc file and the path of the output, with the file name, are specified. (Fig. 20: Transforming flat file data.) Thus, data can be loaded into any database table using Ab Initio.
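The flat-file load just described can be sketched with Python's sqlite3 standing in for the Teradata database. This is illustrative only: the column names are borrowed from the report's checking_tran table, but the exact layout is an assumption, and a real load would go through Ab Initio's Teradata components and the config .dbc file rather than SQL inserts.

```python
import csv
import io
import sqlite3

# A tiny flat file held in memory; in practice this would be the file path given
# to the input component. The columns are an assumed subset of checking_tran.
flat_file = io.StringIO("cust_id,tran_type,amount\n1,deposit,500.00\n1,withdrawal,450.00\n")

conn = sqlite3.connect(":memory:")  # sqlite3 stands in for the Teradata RDBMS here
conn.execute("CREATE TABLE checking_tran (cust_id INTEGER, tran_type TEXT, amount REAL)")
with conn:
    # DictReader yields one dict per line, matching the named INSERT parameters.
    conn.executemany(
        "INSERT INTO checking_tran VALUES (:cust_id, :tran_type, :amount)",
        csv.DictReader(flat_file),
    )
row_count = conn.execute("SELECT COUNT(*) FROM checking_tran").fetchone()[0]
print(row_count)  # → 2
```

The same pattern (parse the flat file record by record, then bulk-insert) is what utilities such as FastLoad and MultiLoad perform at scale on Teradata.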
13. The next step is to sort the data in the desired order, on the desired variable, and store it as flat files. This step is basic to most data warehousing work. We then apply our business rules to the data and obtain our output, which here is the set of red flags. (Fig. 14: This graph shows how a red flag is triggered if any customer crosses 80% of the limit on any type of account; its components are labeled A through F.) A detailed description of the components involved:
- An Ab Initio component sorts the input data on Cust_ID.
- A join component joins the sorted customer details with the account types the customer holds, on Cust_ID. The customer_details_summary is taken from the normal output port, while unmatched_customer_details and unmatched_customer_accounts are taken from the first and second unused output ports. This join takes care of the business rule checking whether the customer really exists.
- A join component with three input ports (the number of input ports can be specified in the Count field of the component's Parameters tab). The FD file used here is a lookup file with a weight specified for fraud. With this component, customer_details_summary, unmatched_customer_accounts, and the FD file are joined on Cust_ID.
- A reformat component, one form of transform component. The business rule that red-flags customers who have used more than 80% of their account limit is applied here in the transform function.
- In this example there are a total of 16 records; flagged records are given a weight of 1 and unflagged records a weight of 0.
- The output gives the details of the customers, with the fraud detection result.

The FD lookup file has a column of 1s and 0s which, when joined with customer_details_summary on Cust_ID, flags the customers who cross more than 80% of their limit.
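The 80% rule and its FD lookup described above can be sketched as follows. The balances, limits, and customer IDs are made-up illustrations; only the 0.8 threshold and the 1/0 flag weights come from the report.

```python
# Hypothetical per-customer summaries: account limit and amount withdrawn in the window.
customer_summary = [
    {"cust_id": 101, "limit": 1000.0, "withdrawn": 850.0},
    {"cust_id": 102, "limit": 2000.0, "withdrawn": 300.0},
]

def build_fd_lookup(summaries, threshold=0.8):
    """Stand-in for the FD lookup file: 1 if the customer crossed the threshold, else 0."""
    return {s["cust_id"]: int(s["withdrawn"] > threshold * s["limit"]) for s in summaries}

fd = build_fd_lookup(customer_summary)

# Join the flag back onto each summary on cust_id, as the graph's join component does.
flagged = [dict(s, red_flag=fd[s["cust_id"]]) for s in customer_summary]
print([(s["cust_id"], s["red_flag"]) for s in flagged])  # → [(101, 1), (102, 0)]
```

Customer 101 has withdrawn 850 against a 1,000 limit (85% > 80%) and is flagged; customer 102 is well under the threshold.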
14. A new graph applies the second business rule. (Fig. 15: This graph shows how the detailed checking transactions performed by a customer within 24 hours can be found; its components are numbered in the figure.)
- An Ab Initio component sorts the input data on Cust_ID.
- A join component joins the sorted checking accounts of customers with the sorted checking transactions on Cust_ID, producing checking_account_transactions on the normal output port, the unused checking accounts on the first unused port, and the unused checking transactions on the second unused port. This join takes care of the business rule checking whether the customer is valid.
- The output is sorted on customer ID and then rolled up by day to get the transactions performed within 24 hours. The reject_rollup_checking records are sent to a reject output port.

Thus, all the checking transactions of a customer within 24 hours are calculated.

Logical Model 1 (Erwin)

Conclusion

In this project we have concentrated on financial fraud detection in banking transactions. However, the generic nature of the project architecture would enable us to use the same software, with minor changes to the input data, business rules, etc., for fraud detection in other areas such as company financial bookkeeping, telemarketing, and pay-per-call scams. There is also a wealth of information that can be inferred from the output of our project. Some points on the further scope of our project:
- This architecture could be used not only to detect fraud but also to identify target customer groups for marketing.
- We could provide monthly or weekly reports of sensitive and high-risk customers, and even of ATM and banking locations that are fraud-prone.
- Improve the overall security standards of the company.
- Develop a progress report for individual bank branches and identify their individual areas of improvement.

References

Kimball, R., Reeves, L., Ross, M.
and Thornthwaite, W. (1998). The Data Warehouse Lifecycle Toolkit. John Wiley & Sons, Inc.
Silverston, L., Inmon, W. H. and Graziano, K. (1997). The Data Model Resource Book. John Wiley & Sons, Inc.
www.teradata.com
www.abinitio.com
www.ca.com/db
www.fraud.org
http://teradata.uark.edu/research/
