Foundations of BusinessIntelligence: Databasesand InformationManagement5C H A P T E RSTUDENT LEARNING OBJECTIVESAfter completing this chapter, you will be able to answer thefollowing questions:1.How does a relational database organize data, and how does itdiffer from an object-oriented database?2.What are the principles of a database management system?3.What are the principal tools and technologies for accessinginformation from databases to improve business performanceand decision making?4.What is the role of information policy and data administrationin the management of organizational data resources?5.Why is data quality assurance so important for a business?ISBN 1-269-41688-X146Essentials of Management Information Systems, Tenth Edition, by Kenneth C. Laudon and Jane P. Laudon. Published by Prentice Hall.Copyright 2013 by Pearson Education, Inc.

C HAPTER O UTLINEChapter-Opening Case: Banco de Crédito del PerúBanks on Better Data Management5.1 The Database Approach to Data Management5.2 Database Management Systems5.3 Using Databases to Improve BusinessPerformance and Decision Making5.4 Managing Data ResourcesBusiness Problem-Solving Case: Text Mining for Gold?BANCO DE CREDITO DEL PERU BANKS ON BETTER DATA MANAGEMENTBancode Crédito del Perú (BCP) is Peru's largest commercial bank and also offersinvestment and retail banking services, including bank accounts, insurance policies,credit cards, mortgage loans, mutual funds, and mobile banking services. With 326offices in Panama, Miami, and Bolivia, BCP employs 15,000 people and generatedUS 425 million in net income in 2010. Running this firm obviously requires manydifferent pieces of information.Although BCP has grown and prospered, its business performance had beenhampered by outdated information systems that made it difficult to use data efficientlyfor operational activities and reporting. Over time, different BCP departments hadacquired a large number of applications to support processes in administration, humanresources, and accounting that were not integrated. These isolated systems were verytime-consuming and expensive to update and use. Each kept track of many differentpieces of data. The same piece of data, such as customer name, might be found inmultiple systems, but use different spellings or formats or be updated at different times.Accessing the data from these multiple systems was a complicated undertaking, severely hindering reporting procedures and delaying business decision making.ISBN 1-269-41688-X Andrey Prokhorov, 2011, iStockPhoto LP.147Essentials of Management Information Systems, Tenth Edition, by Kenneth C. Laudon and Jane P. Laudon. Published by Prentice Hall.Copyright 2013 by Pearson Education, Inc.

148Part II: Information Technology InfrastructureBCP’s information systems staff was collecting data from 15 to 16 satellite applications andMicrosoft Excel spreadsheets before manually collating and analyzing the data in responseto simple queries from managers. The fact that the data were not always accurate and consistent added an extra layer of uncertainty and complexity to any reports that were prepared.What BCP needed was a single repository for business information that would supporta more streamlined set of business applications. The bank decided to replace its legacysystems with an integrated software suite from SAP that included modules for enterpriseresource planning (ERP) and a data warehouse supporting enterprise-wide reporting. BCPimplemented IBM’s DB2 version 9 database management system for Linux, UNIX andWindows to replace its older systems for managing data. DB2 organizes data so that theyappear to come from a single source, enabling individual business applications to extract thedata they need without having to create separate data files for each application. All of thissoftware runs on IBM Power 595, IBM BladeCenter JS43, and IBM Power 570 servers running the AIX operating system, IBM’s version of UNIX.Combining IBM DB2 and SAP NetWeaver Business Warehouse technology has givenBanco de Crédito del Perú comprehensive data management capabilities. The warehouseincludes tools for analyzing the data along multiple dimensions. The bank’s key decisionmakers can use real-time enterprise-wide data to keep track of company performance andcapitalize on opportunities immediately, boosting responsiveness and agility. Sophisticatedbusiness analytics and modeling tools are available.The new solution has also produced significant cost savings at BCP in terms of datamanagement and storage, while boosting performance. With fewer systems and data filesto manage, BCP cut its data management costs in half and lowered data storage costs by 45percent.Sources: IBM, “Banco de Crédito del Peru Vaults over the Competition with Help from IBM and SAP,” January 20,2011 and “Banco Credito Del Peru C,”, accessed April 19, 2011.Essentials of Management Information Systems, Tenth Edition, by Kenneth C. Laudon and Jane P. Laudon. Published by Prentice Hall.Copyright 2013 by Pearson Education, Inc.ISBN 1-269-41688-XThe experience of Banco de Crédito del Perú illustrates the importance of data management.Business performance depends on what a firm can or cannot do with its data. The bank hadgrown its business, but both operational efficiency and management decision making werehampered by fragmented data stored in outdated systems that were difficult to access. Howbusinesses store, organize, and manage their data has an enormous impact on organizationaleffectiveness.The chapter-opening diagram calls attention to important points raised by this caseand this chapter. Banco de Crédito del Perú’s management decided that the firm needed toimprove the management of its data. Pieces of data about customers, accounts, and employees had been stored in a bewildering number of systems that made it extremely difficultfor the data to be retrieved and analyzed. The data were often redundant and inconsistent,limiting their usefulness. Management was unable to obtain an enterprise-wide view of thecompany.In the past, Banco de Crédito del Perú had heavily used manual paper processes to reconcile its inconsistent and redundant data and to assemble data for management reporting.This solution was extremely time-consuming and costly and prevented the bank’s informationtechnology department from performing higher-value work. A more appropriate solution wasto install new hardware and software to create an enterprise-wide repository for businessinformation that would support a more streamlined set of business applications. Enterprisesoftware was integrated with an up-to-date database management system and a data warehousethat could supply data for enterprise-wide reporting. The bank had to reorganize its data into astandard company-wide format, eliminate redundancies, and establish rules, responsibilities,and procedures for updating and using the data.A modern database management system and data warehouse helps Banco de Créditodel Perú boost efficiency by making it easier to locate and assemble data for managementreporting and for processing day-to-day financial transactions. The data are more accurateand reliable, and costs for managing and storing the data have been reduced considerably.

Chapter 5: Foundations of Business Intelligence: Databases and Information Management5.1 The Database Approach to Data ManagementA database is a collection of related files containing records on people, places, or things.One of the most successful databases in modern history is the telephone book. The telephonebook is a collection of records on people and businesses who use telephones. The telephonebook lists four pieces of information for each phone user: last name, first name, address,and phone number. It also contains information on businesses and business categories, suchas auto dealers or plumbing suppliers. The telephone book draws its information from adatabase with files for customers, business classifications, and area codes and geographicregions.Prior to the development of digital databases, a business would use large filing cabinetsfilled with paper files to store information on transactions, customers, suppliers, inventory,and employees. They would also use lists, laboriously collated and typed by hand, to quicklysummarize the information in paper files. You can still find paper-based manual databasesin most doctors’ offices where patient records are stored in thousands of paper files.Needless to say, paper-based databases are extremely inefficient and costly to maintain,often contain inaccurate data, are slow, and make it difficult to access the data in a timelyfashion. Paper-based databases are also extremely inflexible. For instance, it would benearly impossible for a paper-based doctor’s office to combine its files on prescriptions withits files on patients in order to produce a list of all people for whom they had prescribed aspecific drug. For a modern computer database, this would be very easy. In fact, a powerfulfeature of computer databases is the ability to quickly relate one set of files to another.ISBN 1-269-41688-XENTITIES AND ATTRIBUTESHow do you start thinking about the data for your business and how to manage them? Thefirst step is to identify the data you will need to run your business. Typically, you will beusing data on categories of information, such as customers, suppliers, employees, orders,products, shippers, and perhaps parts. Each of these generalized categories representing aperson, place, or thing on which we store and maintain information is called an entity. Eachentity has specific characteristics, called attributes. For example, the entity SUPPLIERhas specific attributes, such as the supplier’s name and address, which would most likelyinclude street, city, state, and zip code. The entity PART typically has attributes such as partdescription, price of each part (unit price), and supplier who produced the part.Essentials of Management Information Systems, Tenth Edition, by Kenneth C. Laudon and Jane P. Laudon. Published by Prentice Hall.Copyright 2013 by Pearson Education, Inc.149

150Part II: Information Technology InfrastructureORGANIZING DATA IN A RELATIONAL DATABASEIf you stored this information in paper files, you would probably have a file on each entityand its attributes. In an information system, a database organizes the data much the sameway, grouping related pieces of data together. The relational database is the most commontype of database today. Relational databases organize data into two-dimensional tables(called relations) with columns and rows. Each table contains data on an entity and itsattributes. For the most part, there is one table for each business entity. So, at the most basiclevel, you will have one table for customers, and a table each for suppliers, parts in inventory,employees, and sales transactions.Let’s look at how a relational database would organize data about suppliers and parts.Take the SUPPLIER table, which is illustrated in Figure 5.1. It consists of a grid of columnsand rows of data. Each individual element of data about a supplier, such as the suppliername, street, city, state, and zip code, is stored as a separate field within the SUPPLIERtable. Each field represents an attribute for the entity SUPPLIER. Fields in a relationaldatabase are also called columns.The actual information about a single supplier that resides in a table is called a row.Rows are commonly referred to as records, or, in very technical terms, as tuples.Note that there is a field for Supplier Number in this table. This field uniquely identifieseach record so that the record can be retrieved, updated, or sorted, and it is called a keyfield. Each table in a relational database has one field designated as its primary key. Thiskey field is the unique identifier for all the information in any row of the table, and this primary key cannot be duplicated.We could use the supplier’s name as a key field. However, if two different suppliers hadthe same name (which does happen from time to time), supplier name would not uniquelyidentify each, so it is necessary to assign a special identifier field for this purpose. Forexample, if you had two suppliers, both named “CBM,” but one was based in Dayton andanother in St. Louis, it would be easy to confuse them. However, if each has a unique suppliernumber, such confusion is prevented.We also see that the address information has been separated into four separate fields:Supplier Street, Supplier City, Supplier State, and Supplier Zip. Data are separated intothe smallest elements that one would want to access separately to make it easy to select onlythe rows in the table that match the contents of one field, such as all the suppliers in Ohio(OH). The rows of data can also be sorted by the contents of the Supplier State field to get alist of suppliers by state regardless of their cities.So far, the SUPPLIER table does not have any information about the parts that aparticular supplier provides for your company. PART is a separate entity from SUPPLIER,and fields with information about parts should be stored in a separate PART table (seeFigure 5.2).Figure 5.1A RelationalDatabase TableISBN 1-269-41688-XA relational databaseorganizes data in theform of two-dimensionaltables. Illustrated hereis a table for the entitySUPPLIER showinghow it represents theentity and its attributes.Supplier Number is thekey field.Essentials of Management Information Systems, Tenth Edition, by Kenneth C. Laudon and Jane P. Laudon. Published by Prentice Hall.Copyright 2013 by Pearson Education, Inc.

Chapter 5: Foundations of Business Intelligence: Databases and Information Management151Figure 5.2The PART TableData for the entity PARThave their own separatetable. Part Number isthe primary key andSupplier Number is theforeign key, enablingusers to find relatedinformation from theSUPPLIER table aboutthe supplier for eachpart.Why not keep information on parts in the same table as suppliers? If we did that, eachrow of the table would contain the attributes of both PART and SUPPLIER. Because onesupplier could supply more than one part, the table would need many extra rows for a singlesupplier to show all the parts that supplier provided. We would be maintaining a great dealof redundant data about suppliers, and it would be difficult to search for the information onany individual part because you would not know whether this part is the first or fiftieth partin this supplier’s record. A separate table, PART, should be created to store these three fieldsand solve this problem.The PART table would also have to contain another field, Supplier Number, so that youwould know the supplier for each part. It would not be necessary to keep repeating all theinformation about a supplier in each PART record because having a Supplier Number fieldin the PART table allows you to “look up” the data in the fields of the SUPPLIER table.Notice that Supplier Number appears in both the SUPPLIER and PART tables. In theSUPPLIER table, Supplier Number is the primary key. When the field Supplier Numberappears in the PART table it is called a foreign key and is essentially a look-up field to finddata about the supplier of a specific part. Note that the PART table would itself have its ownprimary key field, Part Number, to uniquely identify each part. This key is not used to linkPART with SUPPLIER but might be used to link PART with a different entity.As we organize data into tables, it is important to make sure that all the attributes for aparticular entity apply only to that entity. If you were to keep the supplier’s address with thePART record, that information would not really relate only to PART; it would relate to bothPART and SUPPLIER. If the supplier’s address were to change, it would be necessary toalter the data in every PART record rather than only once in the SUPPLIER record.ISBN 1-269-41688-XESTABLISHING RELATIONSHIPSNow that we’ve broken down our data into a SUPPLIER table and a PART table, we mustmake sure we understand the relationship between them. A schematic called an entityrelationship diagram is used to clarify table relationships in a relational database. Themost important piece of information provided by an entity-relationship diagram is themanner in which two tables are related to each other. Tables in a relational database mayhave one-to-one, one-to-many, and many-to-many relationships.An example of a one-to-one relationship might be a situation where a human resourcessystem must store confidential data about employees. It might store data, such as theemployee name, date of birth, address, and job position in one table, and confidential dataEssentials of Management Information Systems, Tenth Edition, by Kenneth C. Laudon and Jane P. Laudon. Published by Prentice Hall.Copyright 2013 by Pearson Education, Inc.

152Part II: Information Technology InfrastructureFigure 5.3A Simple EntityRelationshipDiagramThis diagram shows therelationship between theentities SUPPLIER andPART.about that employee, such as salary or pension benefits, in another table. These two tablespertaining to a single employee would have a one-to-one relationship because each recordin the EMPLOYEE table with basic employee data has only one related record in the tablestoring confidential data.The relationship between the SUPPLIER and PART entities in our database is a one-tomany relationship: Each supplier can supply more that one part, but each part has only onesupplier. For every record in the SUPPLIER table, there may be many related records in thePART table.Figure 5.3 illustrates how an entity-relationship diagram would depict this one-to-manyrelationship. The boxes represent entities. The lines connecting the boxes represent relationships. A line connecting two entities that ends in two short marks designates a one-to-onerelationship. A line connecting two entities that ends with a crow’s foot preceded by a shortmark indicates a one-to-many relationship. Figure 5.3 shows that each part has only one supplier, but many parts can be provided by the same supplier.We would also see a one-to-many relationship if we wanted to add a table about ordersto our database because one supplier services many orders. The ORDER table would contain only the Order Number and Order Date fields. Figure 5.4 illustrates a report showingan order of parts from a supplier. If you look at the report, you can see that the informationon the top-right portion of the report comes from the ORDER table. The actual line itemsordered are listed in the lower portion of the report.Because one order can be for many parts from a supplier, and a single part can be orderedmany times on different orders, this creates a many-to-many relationship between the PARTand ORDER tables. Whenever a many-to-many relationship exists between two tables, it isnecessary to link these two tables in a table that joins this information. Creating a separatetable for a line item in the order would serve this purpose. This table is often called a jointable or an intersection relation. This join table contains only three fields: Order Number andPart Number, which are used only to link the ORDER and PART tables, and Part Quantity.Figure 5.4Sample OrderReportISBN 1-269-41688-XThe shaded areas showwhich data came fromthe ORDER, SUPPLIER,and LINE ITEM tables.The database doesnot maintain data onextended price or ordertotal because they canbe derived from otherdata in the tables.Essentials of Management Information Systems, Tenth Edition, by Kenneth C. Laudon and Jane P. Laudon. Published by Prentice Hall.Copyright 2013 by Pearson Education, Inc.

Chapter 5: Foundations of Business Intelligence: Databases and Information ManagementIf you look at the bottom-left part of the report, this is the information coming from theLINE ITEM table.We would thus wind up with a total of four tables in ou