Wednesday, July 17, 2019
Data Mining and Data Warehouse Essay
ABSTRACT

Data mining, the extraction of hidden predictive information from large databases, is a powerful new technology with great potential to help companies focus on the most important information in their data warehouses. Data mining tools predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions. A data warehouse is a computer system designed to give business decision-makers instant access to information. The warehouse copies its data from existing systems like order entry, general ledger, and human resources and stores it for use by executives rather than programmers. Data warehouse users use special software that allows them to create and access information when they need it, as opposed to a reporting schedule defined by the information systems (IS) department. This paper describes the meaning of data warehouse and data mining, the basic architecture of data warehousing and data mining, and the functions and working of data mining. It also presents data mining from a data warehouse.

INTRODUCTION

Modern organizations are under enormous pressure with the recent development of technology. Clearly we need rapid access to all kinds of information. To assist this we need to study the past and identify relevant trends. So to perform any trend analysis we must have a database. In most organizations you will find really large databases in operation for normal daily transactions.
These types of databases are known as operational databases; in most cases they have not been designed to store historical data or to respond to queries, but simply to support all the applications for day-to-day transactions.

The second type of database found in organizations is the data warehouse. This is designed for strategic decision support and is largely built up from the databases that make up the operational database. The basic characteristic of a data warehouse is that it contains vast amounts of data, which can mean billions of records. Smaller, local data warehouses are called data marts. A data warehouse is designed in particular for decision support queries; therefore only data that is needed for decision support is extracted from the operational data and stored in the data warehouse, along with the time when it was retrieved from the operational databases.

DEFINITION: DATA WAREHOUSING

A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision-making process.
Subject-Oriented: A data warehouse can be used to analyze a particular subject area. For example, sales can be a particular subject.
Integrated: A data warehouse integrates data from multiple data sources. For example, source A and source B may have different ways of identifying a product, but in a data warehouse there will be only a single way of identifying a product.
Time-Variant: Historical data is kept in a data warehouse. For example, one can retrieve data from 3 months, 6 months, 12 months, or even older data from a data warehouse. This contrasts with a transaction system, where often only the most recent data is kept. For example, a transaction system may hold the most recent address of a customer, where a data warehouse can hold all addresses associated with a customer.
Non-volatile: Once data is in the data warehouse, it will not change. So, historical data in a data warehouse should never be altered.

The following are the typical steps involved in the data warehousing project cycle.
* Requirement Gathering
* Physical Environment Setup
* Data Modeling
* ETL
* OLAP Cube Design
* Front End Development
* Report Development
* Performance Tuning
* Query Optimization
* Quality Assurance
* Rolling out to Production
* Production Maintenance
* Incremental Enhancements

Benefits of a data warehouse

A data warehouse maintains a copy of information from the source transaction systems. This architectural complexity provides the opportunity to:
* Maintain data history, even if the source transaction systems do not.
* Integrate data from multiple source systems, enabling a central view across the enterprise. This benefit is always valuable, but particularly so when the organization has grown by merger.
* Improve data quality, by providing consistent codes and descriptions, flagging or even fixing bad data.
* Present the organization's information consistently.
* Provide a single common data model for all data of interest regardless of the data's source.
* Restructure the data so that it makes sense to the business users.
* Restructure the data so that it delivers excellent query performance, even for complex analytic queries, without impacting the operational systems.
* Add value to operational business applications, notably customer relationship management (CRM) systems.

Data mining (DM)

Data mining, also known as knowledge discovery, refers to computer-assisted tools and techniques for sifting through and analyzing these vast data stores in order to find trends, patterns, and correlations that can guide decision making and increase understanding. Data mining covers a wide variety of uses, from analyzing customer purchases to discovering galaxies.

In essence, data mining is the equivalent of finding gold nuggets in a mountain of data.
The monumental task of finding hidden gold depends heavily upon the power of computers. The purpose of DM is to analyze and understand past trends and predict future trends.

By predicting future trends, business organizations can better position their products and services for financial gain. Nonprofit organizations have also achieved significant benefits from data mining, such as in the area of scientific progress. The concept of data mining is simple yet powerful. The simplicity of the concept is deceiving, however. Traditional methods of analyzing data, involving query-and-report approaches, cannot handle tasks of such magnitude and complexity.

Data mining consists of five major elements:
* Extract, transform, and load transaction data onto the data warehouse system.
* Store and manage the data in a multidimensional database system.
* Provide data access to business analysts and information technology professionals.
* Analyze the data by application software.
* Present the data in a useful format, such as a graph or table.

Data mining services can be used for the following functions:
* Research and surveys: Data mining can be used for product research, surveys, market research and analysis. Information can be gathered that is quite useful in driving new marketing campaigns and promotions.
* Information collection: Through the web scraping process it is possible to collect information regarding investors, investments and funds by scraping through related websites and databases.
* Customer opinions: Customer views and suggestions play an important role in the way a company operates. The information can readily be found on forums, blogs and other resources where customers freely provide their views.
* Data scanning: Data collected and stored will not be important unless scanned.
Scanning is important to identify patterns and similarities contained in the data.
* Extraction of information: This is the process of identifying the useful patterns in data that can be used in the decision-making process. This is so because decision making must be based on sound information and facts.
* Pre-processing of data: Usually the data collected is stored in the data warehouse. This data needs to be pre-processed. Pre-processing means that some data that is deemed unimportant may be removed manually by data mining experts.
* Web data: Web data usually poses many challenges in mining. This is so because of its nature. For instance, web data is dynamic, meaning it keeps changing from time to time. Therefore the process of data mining should be repeated at regular intervals.
* Competitor analysis: There is a need to understand how your competitors are faring in the business market. You need to know both their weaknesses and strengths. Their methods of marketing and distribution can be mined. How they reduce their overall costs is also quite important.
* Online research: The internet is highly regarded for its huge volume of information. It is evident that it is the largest source of information. It is possible to gather a lot of information regarding different companies, customers and your business clients. It is possible to detect fraud through online means.
* News: Nowadays, with almost all major newspapers and news sources posting their news online, it is possible to gather information regarding trends and other critical areas. In this way, it is possible to be in a better position to compete in the market.
* Updating data: This is quite important. Data collected will be useless unless it is updated.
This is to ensure that the information is relevant so that decisions can be made from it.

How does data mining work?

While large-scale information technology has been evolving separate transaction and analytical systems, data mining provides the link between the two. Data mining software analyzes relationships and patterns in stored transaction data based on open-ended user queries. Several types of analytical software are available: statistical, machine learning, and neural networks. Generally, any of four types of relationships are sought:
* Classes: Stored data is used to locate data in predetermined groups. For example, a restaurant chain could mine customer purchase data to determine when customers visit and what they typically order. This information could be used to increase traffic by having daily specials.
* Clusters: Data items are grouped according to logical relationships or consumer preferences. For example, data can be mined to identify market segments or consumer affinities.
* Associations: Data can be mined to identify associations. The beer-diaper example is an example of associative mining.
* Sequential patterns: Data is mined to anticipate behavior patterns and trends. For example, an outdoor equipment retailer could predict the likelihood of a backpack being purchased based on a consumer's purchase of sleeping bags and hiking shoes.

Industries/fields where data mining is currently applied are as follows:

1. Data Mining in the Banking Sector

Worldwide, the banking sector is ahead of many other industries in using mining techniques for its vast customer databases. Although banks have employed statistical analysis tools with some success for several years, previously unseen patterns of customer behavior are now coming into clear focus with the aid of new data mining tools. These statistical tools and even OLAP find the answers, but more advanced data mining tools provide insight into the answers.
Some of the applications of data mining in this industry are:
(i) Predict customer reaction to the change of interest rates
(ii) Identify customers who will be most receptive to new product offers
(iii) Identify loyal customers
(iv) Pinpoint which clients are at the highest risk of defaulting on a loan
(v) Find out persons or groups who will opt for each type of loan in the following year
(vi) Detect fraudulent activities in credit card transactions
(vii) Predict clients who are likely to change their credit card affiliation in the next quarter
(viii) Determine customer preference among the different modes of transaction, namely through teller or through credit cards, etc.

2. Data Mining in the Insurance Sector

Insurance companies can benefit from modern data mining methodologies, which help companies to reduce costs, increase profits, retain current customers, acquire new customers, and develop new products. This can be done through:
(1) Evaluating the risk of the assets being insured, taking into account the characteristics of the asset as well as the owner of the asset.
(2) Formulating statistical modeling of insurance risks
(3) Using the Joint Poisson/Log-Normal Model of mining to optimize insurance policies
(4) And finally finding the actuarial credibility of the risk groups among insurers

3. Data Mining in Telecommunication

To date, data mining techniques have been used across telecommunication activities, including:
(1) Analysis of telecom service purchases
(2) Prediction of telephone calling patterns
(3) Management of resources and network traffic
(4) Automation of network management and maintenance using artificial intelligence to diagnose and repair network transmission problems, etc.

4. Data Mining in Fraud Detection

Data dredging has found wide and useful application in various fraud detection processes like:
(1) Credit card fraud detection using a combined parallel approach
(2) Fraud detection in the voters list using neural networks in combination with symbolic and analog data mining.
(3) Fraud detection in passport applications by designing a specific online learning diagnostic system.
(4) Rule and analog based detection of false medical claims and so on.

An Architecture for Data Mining

To best apply these advanced techniques, they must be fully integrated with a data warehouse as well as flexible interactive business analysis tools. Many data mining tools currently operate outside of the warehouse, requiring extra steps for extracting, importing, and analyzing the data. Furthermore, when new insights require operational implementation, integration with the warehouse simplifies the application of results from data mining. The resulting analytic data warehouse can be applied to improve business processes throughout the organization, in areas such as promotional campaign management, fraud detection, new product rollout, and so on. Figure 1 illustrates an architecture for advanced analysis in a large data warehouse.

Figure 2: Integrated Data Mining Architecture

FROM DATA WAREHOUSE TO DATA MINING

DM is a set of methods for data analysis, created with the aim of finding specific dependences, relations and rules related to data and presenting them as new, higher-level quality information. As distinguished from the data warehouse, which has a unique data approach, DM gives results that show relations and interdependence of data. These dependences are mostly based on various mathematical and statistical relations.

Figure 3: Process of knowledge data discovery

EMERGING TRENDS IN DATA MINING

Web mining is the application of data mining techniques to discover patterns from the Web. According to analysis targets, web mining can be divided into three different types, which are Web usage mining, Web content mining and Web structure mining.

Web usage mining

Web usage mining is the process of extracting useful information from server logs, i.e. users' history. Web usage mining is the process of finding out what users are looking for on the Internet.
Some users might be looking at only textual data, whereas some others might be interested in multimedia data.

Web structure mining

Web structure mining is the process of using graph theory to analyze the node and connection structure of a web site. According to the type of web structural data, web structure mining can be divided into two kinds:
1. Extracting patterns from hyperlinks in the web: a hyperlink is a structural component that connects the web page to a different location.
2. Mining the document structure: analysis of the tree-like structure of page structures to describe HTML or XML tag usage.

Web content mining

Web content mining is the mining, extraction and integration of useful data, information and knowledge from Web page contents.

Data stream mining is the process of extracting knowledge structures from continuous, rapid data records. A data stream is an ordered sequence of instances that in many applications of data stream mining can be read only once or a small number of times using limited computing and storage capabilities. Examples of data streams include computer network traffic, phone conversations, ATM transactions, web searches, and sensor data.
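
The "read only once, with limited storage" constraint on data streams can be illustrated with a small Python sketch. It is not taken from any particular stream mining tool: it uses Welford's online algorithm, chosen here only as an example of a single-pass technique, to keep a running mean and variance without storing the records. The sensor_stream function and its five readings are hypothetical stand-ins for a real feed.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class RunningStats:
    """Running count, mean, and variance, updated one record at a time."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the current mean

    def update(self, value: float) -> None:
        # Welford's online update: constant memory, each value is seen once.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    @property
    def variance(self) -> float:
        # Sample variance; reported as 0.0 until two values have arrived.
        return self.m2 / (self.count - 1) if self.count > 1 else 0.0


def sensor_stream() -> Iterator[float]:
    """Hypothetical stand-in for a live sensor feed; yields each reading once."""
    for reading in (20.0, 22.0, 21.0, 23.0, 24.0):
        yield reading


def summarize(stream: Iterable[float]) -> RunningStats:
    """Consume the stream in a single pass and return its summary statistics."""
    stats = RunningStats()
    for value in stream:
        stats.update(value)
    return stats


if __name__ == "__main__":
    stats = summarize(sensor_stream())
    print(stats.count, stats.mean, stats.variance)
```

Because only three numbers are kept in memory, the same loop works whether the stream holds five readings or five billion; the trade-off is that anything needing the raw records again (for example, an exact median) is out of reach once a record has passed.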