Many of the Business Intelligence tools look way cool. They provide graphs, moving targets, drill-downs, and drill-through. But much of the work in an operational data warehouse involves getting the data from operational systems into the data warehouse so that business intelligence tools can display those pretty pictures.

This paper addresses the extraction, transformation, and load components of data warehousing. We'll look at issues in extraction, transformation, and loading and common approaches to loading data. We assume that source data structures are generally not similar to target data structures (e.g., flat files and normalized tables).

At the risk of being a bit simplistic, extraction, transformation, and load requires three main steps:

1. Read the source data
2. Apply business, transformation, and technical rules
3. Load the data

Figure 1 shows this data flow.

After the process reads the data, it must transform the data by applying technology, transformation, and business rules to it. One such rule is: “Convert values in field x to integer.” An example of a business rule is: “Customers must purchase products in the list of ‘Washer’, ‘Dryer’, ‘Refrigerator’.”

Applying business, transformation, and technology rules to data means generating keys, transforming codes, converting datatypes, parsing, merging, and many other operations. Once the data is in an appropriate technical and business format, the ETL process can load it into target tables.

Note that these steps can potentially be performed in many places in the system chain. Programs could transform code values as they read from DB2 tables or VSAM datasets. A Perl program could parse a file and generate keys for use by bcp or DTS. SQL could transform staging table rows in a set-oriented fashion.

The data architecture includes the data itself and its quality as well as the various models that represent the data, data structures, business and transformation rules, and business meaning embodied in the data and data structures. Data architecture models include conceptual data models, logical data models, physical data models, and physical representations such as COBOL copybooks, C structures, and SQL DDL statements.

Technology architecture includes computers, operating systems, data management systems (e.g., Oracle, Sybase, Btrieve), networks, network elements, and the models that represent these. Models include network diagrams, computer specifications, and technology standards such as TCP/IP, SQL, and ODBC.

The application architecture includes operational systems, including any support programs such as the ETL programs. Application models include context diagrams, functional decomposition diagrams, state transition diagrams, and data flow diagrams.

The data, technology, and application model(s) that describe the source and target systems are extremely important to the success of extraction, transformation, and load development. If these models don’t exist, you’ll have to create many of them. Good data, application, and technology models are essential to creating a data warehouse.

ETL processes populate SQL Server tables primarily from operational systems, such as Accounts Receivable, Customer Care, and Billing systems. Other data sources may include files from external data providers, the Internet, or data from user desktops (such as a list of products). File formats may include spreadsheets, text files, dBASE, or image files in JPG or GIF format. Mainframe files (in EBCDIC) usually get translated to ASCII either by FTP or a gateway. These days, almost all relational systems support ODBC or OLE-DB connections.

The data source also depends on what subject area(s) is implemented in the data warehouse. Common subject areas include customer, product, marketing, and sales. Customer (people) data may come from customer care systems, marketing databases, or third-party databases. Products, features, and prices often come from marketing.

While an ETL solution belongs in the application architecture, the data, technology, and “person” architectures also influence the ETL approach.

How similar are the source and target data structures?
What kinds of dependencies exist in the data?
How “complex” are the data relationships?
What technology does management feel comfortable with?
What in-house expertise does your shop have?
What is the volume and frequency of load?
To what extent are source and target platforms interoperable?
How will the ETL processes be scheduled?
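To make the three steps described above (read the source, apply rules, load the target) a little more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the approach this article prescribes: the pipe-delimited sample data, the column names (`cust_code`, `product`, `qty`), the `sales` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for the SQL Server target. It applies the two example rules quoted in the text (convert a field to integer; products must be in the list of Washer, Dryer, Refrigerator) and generates simple surrogate keys.

```python
import csv
import io
import sqlite3

# Hypothetical source extract: a pipe-delimited flat file from an operational system.
SOURCE_FILE = """cust_code|product|qty
A17|Washer|2
B03|Dryer|1
A17|Toaster|4
C55|Refrigerator|x
"""

# Business rule quoted in the article: products must be in this list.
VALID_PRODUCTS = {"Washer", "Dryer", "Refrigerator"}


def extract(text):
    """Step 1: read the source data into one dictionary per record."""
    return list(csv.DictReader(io.StringIO(text), delimiter="|"))


def transform(records):
    """Step 2: apply the rules; return (clean rows, rejected records)."""
    keys = {}  # source customer code -> generated surrogate key
    clean, rejected = [], []
    for rec in records:
        try:
            qty = int(rec["qty"])  # rule: convert values in the field to integer
        except ValueError:
            rejected.append(rec)
            continue
        if rec["product"] not in VALID_PRODUCTS:  # business rule check
            rejected.append(rec)
            continue
        # Key generation: reuse the key if this customer was seen before.
        key = keys.setdefault(rec["cust_code"], len(keys) + 1)
        clean.append((key, rec["cust_code"], rec["product"], qty))
    return clean, rejected


def load(conn, rows):
    """Step 3: load the transformed rows into the (hypothetical) target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(customer_key INTEGER, cust_code TEXT, product TEXT, qty INTEGER)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def run_etl(text, conn):
    """Run extract, transform, and load in order; return what was kept and rejected."""
    clean, rejected = transform(extract(text))
    load(conn, clean)
    return clean, rejected


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stand-in for the real target database
    clean, rejected = run_etl(SOURCE_FILE, conn)
    print("loaded:", clean)
    print("rejected:", rejected)
```

Keeping rejected records rather than silently dropping them is a design choice in this sketch; a real ETL job would likely write them to an exception table or file so someone can review them.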