
Data Sheet

Informatica Enterprise Data Catalog

Benefits
- Automatically catalog and classify all types of data across the enterprise using an AI-powered catalog
- Provide a metadata system of record for the enterprise with a catalog of catalogs
- Automatically extract the most granular metadata from a wide array of data sources, including complex enterprise systems
- Find data assets through powerful Google-like semantic search
- Discover and understand your data assets with a holistic view including lineage, relationship views, data profiling stats, and quality scorecards
- Identify domains and entities with intelligent curation
- Enrich data assets with governed and crowdsourced annotations, ratings, and reviews
- Automatically associate business glossary terms to technical data assets
- Open APIs to integrate into your environment and expose intelligent metadata anywhere
- Measure and optimize the value of your data assets with Data Asset Analytics

Unleash the Power of Data With an Intelligent Data Catalog

Data is the lifeblood of our economy, and data-driven companies turn their data assets into revenue and profits. The first step in any data-driven digital transformation initiative is to manage your data as an enterprise asset: take inventory of it, assess its value, and maximize its use, just like you do with other significant capital and operational investments.

Data is diverse and distributed across many different departments, applications, and data warehouses and data lakes (some on-premises, others in the cloud), making it a challenge to know exactly what data you have and where. As data sources proliferate, the data landscape becomes even more complex.

Informatica Enterprise Data Catalog is an AI-powered data catalog that provides a machine-learning-based discovery engine to scan and catalog data assets across the enterprise, both multi-cloud and on-premises. Enterprise Data Catalog is powered by the CLAIRE engine, which provides intelligence by leveraging metadata to deliver recommendations, suggestions, and automation of data management tasks. This enables IT users to be more productive and business users to be full partners in the management and use of data.

Informatica Enterprise Data Catalog provides data analysts and IT users with powerful semantic search and dynamic facets to filter search results, detailed data lineage, profiling statistics, data quality scorecards, holistic relationship views, data similarity recommendations, and an integrated business glossary. Collaboration capabilities leverage subject matter expertise and social curation combined with the power of AI to guide user experience and automate data curation. Users can quickly find data and easily manage the life cycle of business terms, definitions, reference data, and more.

With Data Asset Analytics in Enterprise Data Catalog, you get insights on the usage of data within your organization, enabling you to proactively manage and optimize the value of your data assets.

Key Features

Semantic Search With Intelligent Facets

Find and discover the most relevant datasets for your analysis using powerful semantic search with intelligent facets. Advanced keyword search with token matching finds the most relevant data assets, and semantic search is even applied to inferred data domains. Intelligent facets, based on the search results, allow users to narrow the search to the datasets of interest.

Holistic Relationship Discovery

Get a holistic view of data in a knowledge graph that lets you quickly search, discover, and understand enterprise data and meaningful data relationships. Automatically discover related datasets along with technical, business, semantic, and usage-based relationships. The holistic data view shows related datasets, tables, views, data domains, reports, and users. This aids in progressive discovery of other datasets of interest.

Automated Classifications With Intelligent Domain and Entity Recognition

Automatically classify and identify domains and entities such as customer, product, and order across all structured and unstructured data assets at the field, column, and table level. This is a crucial step in enabling companies to catalog, govern, and extract value from their data assets. This classified data enables better search, filtering of search results, and business glossary recommendations. Informatica provides over 60 packaged data domains such as email, credit card number, social security number, country, city, URL, and company name. Users can add their own custom domains too.
Data assets can be classified using data rules (which match columns whose data satisfies the logic defined in the rule) or column name rules (which match columns whose names satisfy the logic defined in the rule).

Figure 1: Quickly find datasets with smart semantic search and dynamic facets. View ratings and certified datasets.

Data Lineage and Impact Analysis

Interactively trace data origin through lineage views at any level, from business-friendly, system-level views that highlight the endpoints to granular views that include all the complex details in between. A drill-down lineage view expands any lineage path to show granular column- and metric-level lineage. Users can perform detailed impact analysis on upstream and downstream data assets.
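Upstream and downstream impact analysis can be pictured as a walk over a lineage graph. The sketch below is a minimal illustration and not Informatica's implementation: it assumes lineage is available as a list of (source column, target column) edges, and the column names and the helper names `build_graph` and `reachable` are hypothetical.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Index lineage edges in both directions for upstream/downstream walks."""
    downstream = defaultdict(set)
    upstream = defaultdict(set)
    for source, target in edges:
        downstream[source].add(target)
        upstream[target].add(source)
    return downstream, upstream


def reachable(graph, start):
    """Breadth-first walk returning every asset reachable from `start`."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)  # a cycle should not report the start as its own impact
    return seen


# Hypothetical column-level lineage edges: (source column, target column).
edges = [
    ("crm.customers.email", "staging.customers.email"),
    ("staging.customers.email", "dw.dim_customer.email"),
    ("dw.dim_customer.email", "report.churn.email"),
    ("erp.orders.customer_id", "dw.fact_orders.customer_id"),
]

downstream, upstream = build_graph(edges)

# Downstream impact: what is affected if the staging column changes?
impacted = reachable(downstream, "staging.customers.email")
print(sorted(impacted))
# ['dw.dim_customer.email', 'report.churn.email']

# Upstream origin: where does the report column come from?
origins = reachable(upstream, "report.churn.email")
print(sorted(origins))
# ['crm.customers.email', 'dw.dim_customer.email', 'staging.customers.email']
```

Keeping both edge directions indexed means the same traversal serves both questions: "what breaks if this changes" (downstream) and "where did this come from" (upstream).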

Collaboration and Social Curation

Informatica Enterprise Data Catalog empowers data analysts and data scientists to easily find the most relevant and trusted data for analytics by harnessing the combined power of AI and human expertise and collaboration. Data owners and subject matter experts can certify datasets. Data consumers can provide ratings and reviews for datasets, enabling social curation of data. Users can follow datasets of interest and get notified of changes, and a Q&A platform allows subject matter experts to answer common questions from users. In addition, users can add custom attributes and annotations to datasets, further enhancing business-IT collaboration and search results.

Figure 2: Enable collaboration with Q&A capabilities.

Integrated Data Quality

View data profiling statistics, data quality rules, scorecards, and metric groups alongside technical metadata to understand the quality of data assets before using data for analysis. Profiling statistics include value distributions, patterns, and data type and data domain inference.

Automatic Association of Business Glossary Terms

Informatica Enterprise Data Catalog allows for easy import of business glossary assets such as terms, policies, and classifications from Informatica Axon™ as well as third-party tools. Add rich business context to the data by automatically associating business terms with the right technical metadata, eliminating a tedious manual process and allowing business and IT stewards to collaboratively manage business metadata with efficient human workflow automation.

Intelligent Data Similarity

Advanced statistical and machine learning algorithms identify similar data and subsets of data. This powerful capability helps users find the most relevant and trusted data they need. For example, a telecom analyst interested in customer churn analysis might query data containing pre-paid customer activity for the current quarter.
Informatica Enterprise Data Catalog can recommend a cleaner version of the data (substitute data), data containing customer activity for the previous quarter (union-able data), and a customer detail table to enrich the dataset (joinable data).
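To make the union-able and joinable categories concrete, here is a small sketch. It does not reproduce the statistical and machine learning algorithms the product uses. It assumes, purely for illustration, that two tables are union-able when they share the same column names, and that two columns are joinable when most values in one also appear in the other. The tables, the 0.9 threshold, and the function names are all hypothetical.

```python
def is_unionable(table_a, table_b):
    """Treat two tables as union-able when they have the same column names."""
    return set(table_a) == set(table_b)


def containment(values, candidates):
    """Fraction of distinct `values` that also occur in `candidates`."""
    values = set(values)
    if not values:
        return 0.0
    return len(values & set(candidates)) / len(values)


def joinable_columns(table_a, table_b, threshold=0.9):
    """Return (column_a, column_b, score) pairs whose values overlap enough to join on."""
    matches = []
    for col_a, vals_a in table_a.items():
        for col_b, vals_b in table_b.items():
            score = containment(vals_a, vals_b)
            if score >= threshold:
                matches.append((col_a, col_b, score))
    return matches


# Hypothetical tables, each stored as {column name: list of values}.
current_quarter = {
    "customer_id": [101, 102, 103, 104],
    "minutes_used": [30, 45, 12, 60],
}
previous_quarter = {
    "customer_id": [101, 102, 105],
    "minutes_used": [25, 40, 33],
}
customer_detail = {
    "customer_id": [101, 102, 103, 104, 105, 106],
    "plan": ["pre-paid"] * 6,
}

print(is_unionable(current_quarter, previous_quarter))  # True
print(is_unionable(current_quarter, customer_detail))   # False
print(joinable_columns(current_quarter, customer_detail))
# [('customer_id', 'customer_id', 1.0)]
```

Containment is used rather than a symmetric overlap measure because a join key in a small activity table only needs to be covered by the larger detail table, not the other way around.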

Data Asset Analytics for Data Value

Data Asset Analytics provides prepackaged reports and dashboards on data asset inventory, usage, enrichment, level of collaboration, and more. Reports are extensible and can be exported, enabling data leaders to share business adoption and value metrics with stakeholders. The Automated Data Value Calculator, a first-of-its-kind capability, allows an enterprise to measure and optimize the value of its data assets based on key factors that impact data value.

Universal Metadata Connectivity With Advanced Scanners

Extract metadata from any type of data source across the enterprise, such as databases, data warehouses, cloud-based data lakes, BI tools, Hadoop clusters, NoSQL, and complex enterprise systems including legacy and mainframe systems, multi-vendor ETL tools, SQL dialects, and various enterprise applications, across multi-cloud environments.

With Enterprise Data Catalog Advanced Scanners, you can visually inspect every script, procedure, or process to fully understand its logic and internal data flow.
You can obtain complete column-level data lineage, including a full inventory of all the potential lineage sources with rich details. The Advanced Scanners allow you to scan both static and dynamic code, as well as perform language parsing to obtain automated data lineage.

Below are some examples of data sources supported for metadata extraction:
- Databases/Data warehouses: Oracle, MS SQL Server, SQL Scripts, Sybase ASE, IBM Netezza, Teradata, JDBC, SAP HANA, SAP BW, SAP BW/4HANA, Stored Procedures
- Big Data: Cloudera Navigator, Hive (Cloudera/Hortonworks/MapR/IBM BigInsights/EMR), HDFS, Hortonworks Atlas, Cassandra, MongoDB, Kafka
- Mainframes: DB2 z/OS, DB2 i5/OS, COBOL, JCL
- BI and Analytics: SAP BusinessObjects, Tableau, Microsoft Power BI, Cognos, MicroStrategy, OBIEE, QlikView, Qlik Sense, Microsoft SSRS and SSAS, SAS
- ETL: Informatica PowerCenter, Informatica Data Engineering Integration, Informatica Intelligent Cloud Services℠, Informatica Data Integration Hub, Microsoft SSIS, IBM InfoSphere DataStage, Oracle Data Integrator, Talend Data Integration, AWS Glue
- Business Glossary: Informatica Axon Data Governance, Informatica Business Glossary
- Data Modeling: Erwin Data Modeler, SAP PowerDesigner
- Enterprise Applications: Salesforce, Oracle, Workday, Informatica MDM, SAP ECC, SAP S/4 HANA
- File Systems: Microsoft SharePoint, Microsoft OneDrive, Windows/Linux Filesystems
- File Formats: MS Excel, MS Word, MS PowerPoint, Adobe PDF, Flat Files, CSV, Delimited, XML, JSON, Avro, Parquet
- Cloud Platforms: AWS S3, AWS Redshift, Azure SQL DB, Azure Synapse Analytics, Azure ADLS, Azure ADLS Gen 2, Azure Blob, Google Cloud Storage, Snowflake, Google BigQuery

Figure 3: Informatica Enterprise Data Catalog supports universal metadata connectivity.

Self-Service Data Provisioning

After you find the relevant datasets for your analysis, easily move your dataset to the target of your choice with simple click-through provisioning from within Informatica Enterprise Data Catalog. You can choose from a broad range of sources and targets, including Amazon Redshift, Azure Synapse Analytics, Google BigQuery, Snowflake, and BI tools like Tableau. This capability leverages the integration of Informatica Enterprise Data Catalog with Informatica Cloud Data Integration.

Metadata APIs to Integrate Into Your Environment

Informatica Enterprise Data Catalog includes REST-based APIs that enable you to integrate it into your environment and consume catalog content anywhere. Organizations can share any intelligent metadata (applications, BI reports, and dashboards) with business users. Users can export and share selected catalog content and associated enrichment metadata.

Tableau Integration for Governed Self-Service Analytics

The Chrome browser plug-in and Tableau extension for Informatica Enterprise Data Catalog provide two different options for Tableau users to access the full resources of Informatica Enterprise Data Catalog from within the native Tableau user interface. Without leaving the Tableau interface, users can leverage an intelligent search bar to find trusted data assets, access business and technical context, and collaborate with their peers.

Resource-Level Security

Grant user and group read/write permissions at the resource level to allow users to view or edit custom attributes, perform domain curation, and associate business glossary terms.

Enterprise-Scale Deployments

Informatica Enterprise Data Catalog is built for true enterprise-scale deployments with the ability to scan tens of millions of datasets across hundreds of data sources. It supports parallel metadata ingestion and high-speed distributed indexing to quickly update catalog content and deliver unmatched search performance and fault-tolerant high availability for 24x7 implementations. With Spark-based data profiling, you can profile massive amounts of data at scale to get a deeper understanding of enterprise data.

Unified Administration

Manage and monitor the catalog resources, metadata extract schedules, profiling runs, and more from one unified admin console. A job control dashboard provides widgets for task monitoring and resource views. Email alerts assist administrators in proactively responding to catalog issues.

Figure 4: Understand your data with holistic data relationship views.

Benefits

Intelligently Catalog All Types of Data Across the Enterprise

Informatica Enterprise Data Catalog intelligently discovers many types of data and their relationships across the enterprise. Pre-built scanners collect metadata from databases, data warehouses, data lakes, cloud data stores, applications, BI tools, ETL tools, third-party metadata catalogs, NoSQL, and more. All the metadata is indexed and cataloged in a highly scalable graph database architected for fast updates, smart search, and fast queries. As more and more data is created and propagated throughout the enterprise, similar and duplicate datasets inevitably arise. Informatica Enterprise Data Catalog leverages advanced statistical and machine learning algorithms to discover similar data and subsets of data, helping users find the most relevant and trusted data they need.

About Informatica

Digital transformation changes expectations: better service, faster delivery, with less cost. Businesses must transform to stay relevant and data holds the answers. As the world’s leader in Enterprise Cloud Data Management, we’re prepared to help you intelligently lead in any sector, category, or niche. Informatica provides you with the foresight to become more agile, realize new growth opportunities, or create new inventions. With 100% focus on everything data, we offer the versatility needed to succeed. We invite you to explore all that Informatica has to offer and unleash the power of data to drive your next intelligent disruption.

Find Data Assets Quickly Through Powerful, Google-Like Semantic Search

Trying to find the data you need across hundreds of enterprise systems may sometimes seem futile. Only through powerful semantic search built on comprehensive metadata-driven intelligence and a scalable infrastructure can one even hope to find relevant data. Informatica Enterprise Data Catalog delivers semantic search with intelligent facets to further refine search results. Because Informatica uniquely associates business, technical, and operational metadata, business users can search with business terms to find their data and then browse holistic relationship views to find related data assets.

Discover and Understand Your Data Assets With Holistic Relationship Views and Lineage

The classic saying, “You can’t manage what you can’t measure,” is true when it comes to managing data assets.
To get the most value from data, you need to understand what you have, where it came from, how it has changed, and what level of trust you have in the data. Informatica Enterprise Data Catalog answers all these questions and more with complete end-to-end summary and detailed lineage, profiling statistics, data quality scorecards, and holistic relationship views, providing a clear picture of your data.

Enrich Data Assets With Business Context Through Governed and Crowdsourced Annotations

Informatica Enterprise Data Catalog maximizes the reuse and value of data by automatically classifying enterprise data assets down to the field/column level. To further increase the value of data, Informatica Enterprise Data Catalog captures the context of who is using the data and for what purpose, along with crowdsourced tags, annotations, ratings, and reviews. This “wisdom of crowds” helps to enrich and curate data, making it even more valuable throughout the enterprise. Informatica Enterprise Data Catalog integrates with Informatica Axon for easy import of business glossary assets such as business terms, definitions, and policies from Axon. This business metadata is automatically associated with technical metadata and operational metadata so that business analysts, data stewards, and other users can quickly find, understand, and collaborate on data assets.

Gain Insight Into Data Usage, Share Best Practices, and Estimate Asset Value

With Data Asset Analytics in Enterprise Data Catalog, you gain insights into data usage and users, with visibility into what data assets are in demand, who is using them, and more, enabling you to discover the most valuable data assets within your enterprise. Visual dashboards and exportable reports empower data leaders to share best practices, socialize data catalog adoption, and drive data-driven decision-making.
By calculating data asset value according to parameters you provide, the Automated Data Value Calculator helps you proactively manage and optimize your most important data assets.

Learn More

To learn more about Informatica Enterprise Data Catalog, please visit html.

Worldwide Headquarters 2100 Seaport Blvd., Redwood City, CA 94063, USA
Phone: 650.385.5000, Toll-free in the US: 1.800.653.3871

IN06 0421 03238

Copyright Informatica LLC 2021. Informatica, the Informatica logo, CLAIRE, Axon, and PowerCenter are trademarks or registered trademarks of Informatica LLC in the United States and other countries. A current list of Informatica trademarks is available on the web at https://www.informatica.com/trademarks.html. Other company and product names may be trade names or trademarks of their respective owners. The information in this documentation is subject to change without notice and provided “AS IS” without warranty of any kind, express or implied.