Thursday, May 30, 2013

Class diagram, Sequence, Use case, Activity, Component, and system diagram of warehouse based banking analytics

Outline
banking analytics is a warehouse based project. This is a project of analysis of transactional data in banks and tried to analyse the three objectives

  1. fraud detection
  2. churn analysis 
  3. customer relationship management
This project entitled “Warehouse based intelligent banking transaction analysis system” aims to develop a data warehouse with intelligent enough to analysis the day to day transaction occurring in the banks.  The key aims of the project  is to make the banking transaction more reliable and secure which has been a major problems in today’s age of increasing information technology. Furthermore this project aims to withstand a competitive market and bring itself in a strong position with the analysis of customer behavior and activities.


  •  1      System Architecture







The application system architecture consists of four blocks including the data sources, data warehouse servers, OLAP servers and reporting & data mining block. The functionality of each of the blocks is illustrated below:

                        Block –I: Operational & External Data Sources


For the implementation of a data warehouse & business intelligence system, the availability of reliable and actual data sources is essential and the most important without which the information reported, mined and forecasted may not be fruitful. For our system, the vendor for the operational & the external data sources is Nepal Investment Bank. The bank provided the bank’s customers’ profile and transaction databases in various format such as .txt and .sql. These data sources are flat files and need to be converted in multi-dimensional format for OLAP operations.

                         Block – II: Data Warehouse Servers

This block contains the staging area, warehouse database servers & metadata repository. There is physical data movement from source database to data warehouse database. Staging area is primarily designed to serve as intermediate resting place for data before it is processed and integrated into the target data warehouse. This staging are serves many purposes above and beyond the primary function:
                    The data is most consistent with the source. It is devoid of any transformation or has only minor      
                     format changes.

                     The staging area in a relation database can be read/ scanned/ queried using SQL without the  
                     need of logging into the source system or reading files (text/xml/binary).

                       It is a prime location for validating data quality from source or auditing and tracking down data  
                      issues.

                    Staging area acts as a repository for historical data if not truncated.

The next component is a warehouse database server that is almost always a relational database system. Back-end tools and utilities are used to feed data into the bottom tier from operational databases or other external sources (such as customer profile information provided by external consultants). These tools and utilities perform data extraction, cleaning, and transformation (e.g., to merge similar data from different sources into a unified format), as well as load and refresh functions to update the data warehouse.
This block also contains a metadata repository, which stores information about the data warehouse and its contents.

                     Block – III: OLAP Server

The middle block is an OLAP server that is typically implemented using either
(1) a relational OLAP (ROLAP) model, that is, an extended relational DBMS that maps operations on multidimensional data to standard relational operations; or
(2) a multidimensional OLAP (MOLAP) model, that is, a special-purpose server that directly implements multidimensional data and operations.

                Block – IV: Reporting & Data Mining Tools

The front-end client layer in data warehousing is the presentation phase which contains query and reporting tools, analysis tools and data mining tools for trend analysis, fraud detection and customer churn behavior analysis. The reporting tool that we’ve used for this purpose is Oracle Business Intelligence Enterprise Edition (OBIEE) 11g.

For providing the analytical result, we will be using some of the Online Analytical Processing (OLAP) operations such as slicing & dicing, roll up & roll down and pivoting. The analytical results will be provided in a multi-dimensional view using OLAP Cube Technology projected to assist decision makers such as visualization with comparison to different dimensions e.g. locations, time etc.

For trend & prediction analysis featuring churn analysis and CRM, we will be using some of the data mining algorithms such as CART, C5.0 & Rule based algorithms. Transactions made by fraudsters using counterfeit cards and making cardholder-not-present purchases will be detected through methods which seek changes in transaction patterns, as well as checking for particular patterns which are known to be indicative of counterfeiting. 


  • 2     USE CASE DIAGRAM

                          loading datawarehouse

use case diagram of loading datawarehouse


As shown in diagram above, the loadDataWarehouse subsystem contains two actors one primary actor- Database Administrator and the other supporting actor- Bash shell ETL module. Each of the actors show their one behaviour in context to the system. The relationships between the actors and the use case scenarios have been shown by the association lines and those between use cases have been shown by the dependency lines with appropriate stereotypes shown within guillemets. The specification of the actors with main success scenarios and other alternate scenarios have been shown as follows:

                                         Specification of Actors

element
Details
Description
Database Administrator is the primary actor who is responsible for the extraction, transformation and loading of the data to the staging area. This actor is also responsible for the handling of new transaction data.


element
Details
Description
Bash Shell ETL Module is the supporting actor who is responsible for actual extraction, transformation and loading of the data to the staging area. This actor is also responsible for the loading of the target area and handling updates, i.e, refreshment of the warehouse.






As shown in the figure above, the use case diagram modeling the report generation and data mining subsystem consists of two actors- the database administrator and the end users which are the executives belonging to the tactical level of the management in the bank. The relationships between the actors and the use case scenarios have been shown by the association lines and those between use cases have been shown by the dependency lines with appropriate stereotypes shown within guillemets. The specification of the actors with main success scenarios and other alternate scenarios have been shown as follows:

 Specification of Actors


element
Details
Description
Database Administrator is the supporting actor who is responsible for the handling of different multidimensional processing of data. This actor is also responsible for providing the supporting environment for the end users.


element
Details
Description
Bank Manager is the primary actor who is responsible for the generation of different analytical reports, dashboards and also perform the fraud detection and churn prediction including the CRM analysis provided by the subsystem.








  • SEQUENCE DIAGRAM
                    system sequence diagram




Data source are heterogeneous may have different storage file. Data is extracted from the distributed heterogeneous data sources and stored in DSA (Data Staging Area) and is used in the next level of ETL process, i.e., cleaning and transformation. The cleaned and transformed data is then loaded into the data warehouse

                          report generation sequence diagram








As in the figure there is three class OBI, OBIBackend, DataWarehouse and BankManger is user.  Synchronous message passing is done between classes which is shown by the solid and dotted line. OBI get the request from user and OBI send request to OBIBackend to generated report. OBIBackend gets necessary data from dataWarehouse class and generated report is send to OBI and report is displayed to the user.


ACTIVITY DIAGRAM


activity diagram of reporting system



The facts and dimension from the warehouse are arranged in snowflake schema. OLAP cube are formed through which operations like slicing, dicing, roll up & roll down are performed and then report are generated according to those fact tables.


CLASS DIAGRAM






COMPONENT DIAGRAM

The above component diagram displays the major components involved in this sub system and communication between them. The ATM Machine component denotes the ATM Machine where a transaction takes place, the users must authorize themselves before performing any transaction. The ATM transaction is passed to the Bank Database component via LAN interface. The data stored in Bank Database are accessed by ETL tools component with necessary credentials and authority to perform extraction, transformation and loading of the raw data from Bank Database.  Finally the transformed and refined data are loaded into the data warehouse.





Above component diagram displays the components involved in report generation using OBIEE tool. The data from Data Warehouse component are fed into the Oracle BI server. Oracle Bi server provides efficient processing of data and structure information intelligently. It uses metadata to direct processing. The Metadata Repository component stores the metadata used by Oracle BI server. The query client component contains two components, Analysis Editor Component and Dashboards component. Analysis Editor Component contains set of graphical tools that enable users to build, view, and modify analyses that provide analytical information. Dashboards display results the analyses and other items. The Administrator tool component exposes the Oracle BI repository as three separate panes of layers: Physical, Business Model and Mapping and Presentation






solution of This computer has no detected O.S and linux not showing up partition of windows in dual boot


Problem description:

I had windows 8 installed in my laptop Acer 4750G and I was intrested in running linux and windows side by side so I tried to dual boot linux mint along with windows 8 but as i was proceeding, linux showed that "This computer has no detected O.S " and showed the options of completely deleting the other files and installing linux and the other was something else. Then I clicked the something else and the suprising thing that there was no partition of the hard drive of the windows .that is the hard disk was shown with no partition, it showed the harddisk as a single.I had 500gb disk and it showed 500gb as a single partition.
  then I was worried what to do, If i selected that partition then window would be uninstalled  and beware that  dont install there we can't get dual boot there.


 this is how I solved the problem

 first I partitioned the hard drive through linux using Gparted available in linux
then i installed linux in one partition removing the windows. Now I had single linux in my pc.
then I again  clean install windows 8, linux partition was showed in windows. Then we can install the windows 8 and enjoy dual boot

 But I wanted to install windows first and install linux so I  clean installed windows 8
 after the above steps, Then i tried to install linux this time current O.S was surely  up with the partition of
 windows

how to solve windows 8 cannot shutdown


I had also faced the problem of windows 8 not shutting down properly.  I have Acer  4750G ,when i pressed the shutdown , it acts as if it is shutting in a normal way displaying "shutting down" in the screen and windows  screen goes black but the fan and power light keeps on running. Even the shut down in safe mode made the laptop  hanged up . So I had only options of pressing the power button to turn the laptop off. Even I was adviced to update the bios and I did it but no any benefit . windows 8 forum suggested that the problem was the some of  the windows 8 software might be preventing windows shutdown and asked to check if any software is doinf so but  same output.

 this is how I ran into this problem

 previously I was running windows 8 properly, then I was intrested in dual boot along with linux. So  I tried to install linux mint and windows 8 side by side but some problem ran and I could not successed so I again finallly clean install the windows 8 alone but now the problem bagan with windows 8 not shutting down. even i tried with  other version of windows like windows 7, xp and so on.... but the problem didnot solved.  even looking at the internet and trying several optoions couldnot get me out of this. I had to pressed the power button.

this is how ny problem was solved

I clean installed the linux mint removing the windows 8 and tried the shut down option .wait, wait,............. and my words were WOW I have successed. now my laptop was able to shut down properly.no any pressing of shut down button.Then again I installed the windows 8 removing linux , and It is not showing any problem.But remember that when I live boot linux and tried shutdown, It didn't shutdown but hanged up. so clean  linux install was needed , no dual boot.

hope this my experience of windows not shutting down can help others who might be suffering the same problem.

Now I am running dual boot with windows 8 and linux mint side by side and no problem in shutting down.

Do you find this helpful?


Friday, May 17, 2013

class diagram, use case diagram, activity , sequence,collaborative and state chart diagram of hostel management system



The best practice of starting any project proceeds with the modelling of the various aspect of the project. The project are visualized through the diagrams like class diagram, activity diagram,use case diagram and so on. such diagrams are very usefulin the sense that the actual flow and the order of the project is reflected through this diagrams. Each diagrams has its own
significance and are very important.

outline of the problem

we have a hostel named New Boys hostel. it has 3 blocks named Block-A, Block-B and Block-C with each block is 3 storied thus 
each blocks are identified with ground, mid and top like A-top,A-mid,A-ground, B-top,B-mid and so on............

Here now we want to make a website to manage the hostel. The website keeps the record of each students in a room. We have assumed that each room has 2 to 3 students. Admin has only previlage to add or delete the record. non admin can only view the data.
students registered in the hostel have access to the notice board facilities . students can post the notice if any materials is lost or have any information that is to be shared. Admin can delete the post if any violation of rules is identified.


1: Class Diagram





2:Use Case Diagrams
The use case diagram below shows that the actor User is inherited by three users and is allowed to access the particular part as depicted in the diagram. The User must login in in the system to get the right to use it. Here General User needs to register to the system first and if the User is already registered , they are identified as Admin or Registered User and are allowed the particular privilege.

In next diagram, it is shown that first the user are validated by matching their password and are allowed to view the students residing in the particular block only if they are either Admin or Registered User.



3:Activity Diagrams
Activity diagram below shows the flow from activity to Activity with the available of input in the project NB Hostel Management .It shows the overall activity diagram of the system which first checks the register user, Admin and the new user and give access to certain attributes according to their identity




The next  figure depicts the activity for message posting process. Admin and Registered_User can post the message and is displayed for both user .However In order to Delete the post, Admin can delete any post but User can delete their own post only.And to edit the post ,User who has posted the message can only edit it.



4:Sequence Diagrams

Following Diagram represents the user login process in our project. First, the user is checked whether he/she is logged in or not by authorizing with the DB server. The privileges provided in the site are different for different users such as an administrator can do updates, deletion and insertion of the students’ records in the database which the others can’t.



In the above diagram, we have used the ‘alt’ frame operator [UML 2.0] as alternative frame for mutually exclusive conditional messages.

Next diagram is the sequence diagram for the students’ records insertion process in the database by the administrator. The first message fillForm(record) also called found message represents that personal info, campus related info and permanent addresses of the student are filled and then sent to the admin for insertion.




5:Collaboration Diagrams
The following collaboration diagram models the message posting process by a student or by the admin.


Next, the following collaboration diagram is for the new user insertion and information update of the students



6: State chart Diagrams
The following state chart diagram represents the states while navigating through different pages in the project’s site indicating the overall features –insert, update, view messages/notices etc.


The next statechart diagram represents the states of a process while posting messages to the message board by a residential student or while  posting notices by the hostel administrator.




Conceptual Model for Database Design [ER Modelling]
The entity- relationship diagram below depicts the database design of the project:




Tuesday, May 14, 2013

How to load data in data warehouse of oracle database using sqlloader in bash scripting



Loading is an essential and important part in a data warehouse. It is a part of ETL process in data warehouse design. Once all the data has been cleaned and transformed into a structure consistent with the data warehouse requirements, data is ready for loading into the data warehouse. We will load the data in data warehouse using ETL scripting like Korn scripting, Bash scripting and others scripting options as well into Oracle Database.
The key point which we should keep in mind is that there is mechanism in the client tools like sql developer that imports the data form the file  but in the data warehouse we need a automatic data loading so we should avoid such manual insertion of data through the tools . We need to write the scripting program which loads the data.
Loading of data in the oracle database:
The insertion of data in the Oracle database is supported through the SQL LOADER. Here in this project we have used the bash shell scripting. Loading of data consists of two steps
1.      Creation of the control file
2.      Execution of control file using SQL LOADER

1.      Creation of control file:
Creating of control file is a first step in loading data in oracle database using sql loader.
 This sample control file (loader.ctl) will load an external data file containing delimited data:
                        

load data

infile 'c:\data\mydata.csv'

into table emp

fields terminated by "," optionally enclosed by '"'

( empno, empname, sal, deptno

2.      Execution of control file
One can load data into an Oracle database by using the sqlldr (sqlload on some platforms) utility as

sqlldr username@server/password control=loader.ctl

e.g sqlldr BANKING_TBL01/yomari@BANKINDEV control=loader.ctl  log=loader.log bad=income.bad discard=income.discard

The above steps is the basic steps in the loading of data in oracle database. However in the case of data warehouse loading follows the same basic steps however we  create the staging and target tables in the database and hence we need to load the from the  source to staging table then to the target table through scripting like korn or bash shell scripting.

1.      Loading data from source to the staging table
The staging table consists of the identifier. The data from the source are loader through two basic steps as mentioned above.
e.g  let us consider a  table ATM_LOC_STG with
               ATM_LOC_IDNT,
             DISTT_IDNT,­­­­­­
             ZNE_IDNT,­­
             ATM_PLACE,
             ATM_NUM,
             ATM_LOC_DESC


  •     Now we create a control file load-items.ctl as

OPTIONS (skip=1, errors=100, direct=True)
LOAD DATA
INFILE 'nibl_atm_loc.csv'
TRUNCATE into table ATM_LOC_STG
FIELDS TERMINATED BY ',' optionally enclosed by '"'
(
             ATM_LOC_IDNT,
             DISTT_IDNT,
             ZNE_IDNT,
             ATM_PLACE,
             ATM_NUM,
             ATM_LOC_DESC
)
NOTE:  if the datatype is a date type , an extra code is needed as

OPTIONS (skip=1, errors=100, direct=True)
LOAD DATA
INFILE 'nibl_atm_loc.csv'
TRUNCATE into table ATM_LOC_STG
FIELDS TERMINATED BY ',' optionally enclosed by '"'
(
             ATM_LOC_IDNT,
             DISTT_IDNT,
             ZNE_IDNT,
             ATM_PLACE,
             ATM_NUM,
             ATM_LOC_DESC,
            CUST_DATE  date ‘yyyy/mm/dd’ 
                                     
)
Ø if the data is like 2013/05/23,  place ‘yyyy/mm/dd’
Ø if date is like 2013/may/23, place ‘yyyy/mon/dd’
Ø if the date is like 2013-05-23, place ‘yyyy-mm-dd’
Ø if the date is like 2013-may-23, place’yyyy-mon-dd’



  •        Then we create the bash script load-items.sh as
!# /bin/sh
sqlldr BANKING_TBL01/yomari@BANKINDEV control=load_items.ctl log=load_items.log bad=load_items.bad discard=load_items.discard


  •        finally we execute the scripting as /.load-items.sh

2.      loading data from staging to the target tables
The staging tables can contain more columns like surrogate key. Surrogate  keys are generated through sequences and triggers like in the client like sql developer

create or replace
TRIGGER ATM_LOC_TRIGGER
BEFORE INSERT ON ATM_LOC_DM
REFERENCING NEW AS NEW
FOR EACH ROW
BEGIN

  if(:NEW.ATM_LOC_KEY is null) then
  select ATM_LOC_SEQUENCE.nextval
  into :NEW.ATM_LOC_KEY
  From dual;
  end if;
END;






Finally the target tables are loaded using the query as:

insert into atm_loc_dm(atm_loc_idnt,distt_idnt,zne_idnt,atm_place,atm_num,atm_loc_desc)
select atm_loc_idnt,distt_idnt,zne_idnt,atm_place,atm_num,atm_loc_desc
from atm_loc_stg;

Note: here the target tables contains following columns
                             ATM_LOC_KEY     
             ATM_LOC_IDNT,
            DISTT_IDNT,­­­­­­
            ZNE_IDNT,­­
            ATM_PLACE,
            ATM_NUM,
            ATM_LOC_DESC




How to insert Date datatype in Oracle database using bash script


Oracle database supports the representation of date like 2013/5/22 or 2013/feb/3 etc with datatype DATE datatype in Oracle. however we
can have the misconception that inorder to insert the data in the database the database must be in specific format like 2013/feb/4 but the truth
is the data can  be in format like year/month/day or day/month/year seperated by either / or -.

and the most important thing is


  1. if the data is like 2013/05/22 then be sure to represent as yyyy/mm/dd



OPTIONS (skip=1, errors=100, direct=True)
LOAD DATA
INFILE 'CustomerDataVs.csv'
TRUNCATE into table CUSTOMER_STG
FIELDS TERMINATED BY ',' optionally enclosed by '"'
(
CUST_FIRST_NAME,
CUST_LAST_NAME,
CUST_RELIGION_IDNT,
CUST_EDU_QUALI,
CUST_INCOME_IDNT,
CUST_IDNT,
ADDR_ZNE,
ADDR_DISTT,
ADDR_MUN_VDC,
CUST_MIDDLE_NAME,
CUST_DOB date 'yyyy/mm/dd',
MARITAL_STATUS,
CUST_CONTACT

)


2.           if the data is like 2013-05-22 then be sure to represent as yyyy-mm-dd


OPTIONS (skip=1, errors=100, direct=True)
LOAD DATA
INFILE 'CustomerDataVs.csv'
TRUNCATE into table CUSTOMER_STG
FIELDS TERMINATED BY ',' optionally enclosed by '"'
(
CUST_FIRST_NAME,
CUST_LAST_NAME,
CUST_RELIGION_IDNT,
CUST_EDU_QUALI,
CUST_INCOME_IDNT,
CUST_IDNT,
ADDR_ZNE,
ADDR_DISTT,
ADDR_MUN_VDC,
CUST_MIDDLE_NAME,
CUST_DOB date 'yyyy-mm-dd',
MARITAL_STATUS,
CUST_CONTACT

)



3           if the data is like 2013/may/22 then be sure to represent as yyyy/mon/dd and data of                   
              month must be 3 letter word like jan,feb.....
              but not janu or ja and so on.


OPTIONS (skip=1, errors=100, direct=True)
LOAD DATA
INFILE 'CustomerDataVs.csv'
TRUNCATE into table CUSTOMER_STG
FIELDS TERMINATED BY ',' optionally enclosed by '"'
(
CUST_FIRST_NAME,
CUST_LAST_NAME,
CUST_RELIGION_IDNT,
CUST_EDU_QUALI,
CUST_INCOME_IDNT,
CUST_IDNT,
ADDR_ZNE,
ADDR_DISTT,
ADDR_MUN_VDC,
CUST_MIDDLE_NAME,
CUST_DOB date 'yyyy/mon/dd',
MARITAL_STATUS,
CUST_CONTACT

)
                   4.  if the data is like 2013-may-22 then be sure to represent as yyyy-mon-dd


OPTIONS (skip=1, errors=100, direct=True)
LOAD DATA
INFILE 'CustomerDataVs.csv'
TRUNCATE into table CUSTOMER_STG
FIELDS TERMINATED BY ',' optionally enclosed by '"'
(
CUST_FIRST_NAME,
CUST_LAST_NAME,
CUST_RELIGION_IDNT,
CUST_EDU_QUALI,
CUST_INCOME_IDNT,
CUST_IDNT,
ADDR_ZNE,
ADDR_DISTT,
ADDR_MUN_VDC,
CUST_MIDDLE_NAME,
CUST_DOB date 'yyyy-mon-dd',
MARITAL_STATUS,
CUST_CONTACT

)