杯子茶室

Data Warehouse Modeling

DW Design and Data Modeling

  • Warehouse data are organized by important business subjects (customer, product, etc.). 仓库数据按重要业务主题(客户、产品等)进行组织。
  • Warehouse data are organized to facilitate ease of access and aggregation. 仓库数据经过组织,便于访问和聚合。
  • Warehouse structure is decomposed into dimensions and facts. 仓库结构分解为维度和事实。

    • Dimensions, like ‘independent variables’, represent the entities to be analyzed. 维度(类似“自变量”)代表要分析的实体。
    • A fact represents business data and relates to a set of dimensions. 事实代表业务数据,并与一组维度相关。
    • E.g., customer, time, type of account are dimensions, and balances are facts. 例如,客户、时间、账户类型是维度,余额是事实。
  • The complex network of business entities and their relationships, as depicted in an operational DB (using, say, an ER model), is difficult to navigate and analyze. 操作型数据库中(例如使用 ER 模型)描绘的业务实体及其关系的复杂网络难以导航和分析。
  • Instead, a ‘2-level’ structure defined by a ‘star schema’ is used, where a fact is at the center and the dimensions form the ‘spokes’. 改用由“星型模式”定义的“2 级”结构,其中事实位于中心,维度形成“辐条”。
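  • Data are not stored in ‘normalized’ form. 数据不以“规范化”形式存储。
  • Data normalization aims at reducing data redundancy and improving data integrity; a warehouse accepts some redundancy in exchange for simpler access and aggregation. 数据规范化旨在减少数据冗余、提高数据完整性;数据仓库则接受一定的冗余,以换取更简单的访问和聚合。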
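The star schema described above can be sketched with Python's built-in `sqlite3` module. This is only a minimal illustration of the bank example (customer, time and account type as dimensions, balance as the fact); all table names, column names and sample rows are hypothetical and do not come from a real system.

```python
import sqlite3

# In-memory database; every table and column name below is an assumption
# chosen to mirror the customer / time / account type / balance example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT, city TEXT);
CREATE TABLE dim_time (time_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE dim_account_type (account_type_key INTEGER PRIMARY KEY, type_name TEXT);

-- The fact table sits at the center and points at each dimension (the spokes).
CREATE TABLE fact_balance (
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    time_key INTEGER REFERENCES dim_time(time_key),
    account_type_key INTEGER REFERENCES dim_account_type(account_type_key),
    balance REAL
);

INSERT INTO dim_customer VALUES (1, 'Alice', 'Sydney'), (2, 'Bob', 'Perth');
INSERT INTO dim_time VALUES (1, 2024, 1), (2, 2024, 2);
INSERT INTO dim_account_type VALUES (1, 'savings'), (2, 'cheque');
INSERT INTO fact_balance VALUES
    (1, 1, 1, 100.0), (1, 2, 1, 150.0), (2, 1, 2, 40.0), (2, 2, 2, 60.0);
""")

# Analysis needs only one join per dimension: total balance by account type and month.
rows = conn.execute("""
    SELECT a.type_name, t.month, SUM(f.balance)
    FROM fact_balance AS f
    JOIN dim_account_type AS a ON a.account_type_key = f.account_type_key
    JOIN dim_time AS t ON t.time_key = f.time_key
    GROUP BY a.type_name, t.month
    ORDER BY a.type_name, t.month
""").fetchall()
print(rows)
```

Every query follows the same shape: start from the fact table, join outward to the dimensions that are needed, then group by their attributes.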

What is Data Modeling?

  • Data modeling is the process of creating a data model for the data to be stored in a DW/Database. 数据建模是为要存储在 DW/数据库中的数据创建数据模型的过程。
  • This data model is a conceptual representation of: 该数据模型是以下内容的概念表示:

    • Data objects; 数据对象;
    • The associations between different data objects; and 不同数据对象之间的关联;和
    • The rules. 规则
  • Data modeling helps in the visual representation of data and enforces business rules, regulatory compliance, and government policies on the data. 数据建模有助于直观地表示数据,并对数据执行业务规则、法规遵从要求和政府政策。
  • Data models ensure consistency in naming conventions, default values, semantics, and security, while ensuring the quality of the data. 数据模型确保命名约定、默认值、语义和安全性的一致性,同时确保数据质量。
  • A data model emphasizes what data is needed and how it should be organized, rather than what operations need to be performed on the data. 数据模型强调需要什么数据以及如何组织数据,而不是需要对数据执行哪些操作。
  • A data model is like an architect’s building plan: it helps to build a conceptual model and set the relationships between data items. 数据模型就像建筑师的建筑图纸,有助于建立概念模型并设置数据项之间的关系。

Why Use Data Model?

  • Ensures that all data objects required by the database are accurately represented. 确保数据库所需的所有数据对象都准确表示。
  • Omission of data will lead to creation of faulty reports and produce incorrect results. 遗漏数据将导致创建错误的报告并产生不正确的结果。
  • A data model helps design the database at the conceptual, logical and physical levels. 数据模型有助于在概念、逻辑和物理层面设计数据库。
  • Data Model structure helps to define the relational tables, primary and foreign keys and stored procedures. 数据模型结构有助于定义关系表、主键和外键以及存储过程。
  • It provides a clear picture of the base data. 它提供了基础数据的清晰图像。
  • It can be used by database developers to create a physical database. 数据库开发人员可以使用它来创建物理数据库。
  • It is also helpful to identify missing and redundant data. 它也有助于识别缺失和冗余数据。
  • Though the initial creation of a data model is labor-intensive and time-consuming, in the long run it makes upgrading and maintaining your IT infrastructure cheaper and faster. 虽然最初创建数据模型很费力费时,但从长远来看,它可以让您的 IT 基础设施升级和维护更便宜、更快捷。

Types of Data Models

  1. Conceptual: This Data Model defines WHAT the system contains. 概念:此数据模型定义系统包含的内容。

    • This model is typically created by Business stakeholders and Data Architects. The purpose is to organize, scope and define business concepts and rules 此模型通常由业务利益相关者和数据架构师创建。目的是组织、确定范围和定义业务概念和规则
  2. Logical: Defines HOW the system should be implemented regardless of the DBMS 逻辑:定义无论使用哪种 DBMS,系统应如何实现

    • This model is typically created by Data Architects and Business Analysts. The purpose is to develop a technical map of rules and data structures. 此模型通常由数据架构师和业务分析师创建。目的是开发规则和数据结构的技术图。
  3. Physical: This Data Model describes HOW the system will be implemented using a specific DBMS. 物理:此数据模型描述如何使用特定 DBMS 实现系统。

    • This model is typically created by DBA and developers. The purpose is actual implementation of the database. 此模型通常由 DBA 和开发人员创建。目的是实际实现数据库。

Conceptual Model

  • Defines WHAT the system contains. 定义系统包含的内容。
  • Entity: A real-world thing 实体:现实世界中的事物
  • Attribute: Characteristics or properties of an entity 属性:实体的特征或属性
  • Relationship: Dependency or association between two entities. 关系:两个实体之间的依赖关系或关联。
  • Characteristics of a conceptual data model:

    • Offers organization-wide coverage of the business concepts. 提供组织范围内的业务概念覆盖。
    • This type of data model is designed and developed for a business audience. 此类数据模型是为业务受众设计和开发的。
    • The conceptual model is developed independently of hardware specifications like data storage capacity, location or software specifications like DBMS vendor and technology. 概念模型的开发与硬件规格(如数据存储容量、位置)或软件规格(如 DBMS 供应商和技术)无关。
    • The focus is to represent data as a user will see it in the “real world” 重点是将数据表示为用户在“现实世界”中看到的样子

Logical Model

  • Defines HOW the system should be implemented, regardless of the DBMS. 定义无论使用哪种 DBMS,系统应如何实现。
  • Logical data models add further information to the conceptual model elements. 逻辑数据模型向概念模型元素添加了更多信息。
  • It defines the structure of the data elements and sets the relationships between them. 它定义了数据元素的结构并设置了它们之间的关系。
  • The advantage of the logical data model is that it provides the foundation for the physical model. 逻辑数据模型的优点是为物理模型提供了基础。
  • However, the modeling structure remains generic. 但是,建模结构仍然是通用的。
  • At this Data Modeling level, no primary or secondary key is defined. 在此数据建模级别,未定义主键或辅助键。
  • At this Data Modeling level, you need to verify and adjust the connector details that were set earlier for relationships. 在此数据建模级别,您需要验证和调整先前为关系设置的连接器详细信息。

Physical Model

  • A Physical Data Model describes the database specific implementation of the data model. 物理数据模型描述数据模型的数据库特定实现。
  • It offers an abstraction of the database and helps generate schema. 它提供数据库的抽象并帮助生成Schema。
  • This is because of the richness of meta-data offered by a Physical Data Model. 这是因为物理数据模型提供了丰富的元数据。

    • The physical data model describes the data needs of a single project or application, though it may be integrated with other physical data models depending on project scope. 物理数据模型描述单个项目或应用程序的数据需求,尽管它可能根据项目范围与其他物理数据模型集成。
    • The data model contains relationships between tables that address the cardinality and nullability of those relationships. 数据模型包含表之间的关系,这些关系说明了基数和可空性。
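As a rough illustration of how a physical model pins down cardinality and nullability, the sketch below uses SQLite through Python's `sqlite3` module. The `customer`, `branch` and `account` tables are hypothetical: each account must belong to exactly one customer (a `NOT NULL` foreign key), while its branch is optional (a nullable foreign key).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE branch (branch_id INTEGER PRIMARY KEY, city TEXT NOT NULL);
CREATE TABLE account (
    account_id INTEGER PRIMARY KEY,
    -- mandatory side of a one-to-many relationship: NOT NULL foreign key
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    -- optional relationship: the foreign key may be NULL
    branch_id INTEGER REFERENCES branch(branch_id)
);
INSERT INTO customer VALUES (1, 'Alice');
""")


def try_insert(sql):
    """Run one INSERT and report whether the schema accepted it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


no_branch = try_insert("INSERT INTO account VALUES (10, 1, NULL)")         # optional branch omitted
missing_owner = try_insert("INSERT INTO account VALUES (11, NULL, NULL)")  # violates NOT NULL
unknown_owner = try_insert("INSERT INTO account VALUES (12, 99, NULL)")    # violates the foreign key
print(no_branch, missing_owner, unknown_owner, sep="\n")
```

The first insert succeeds and the other two are rejected by the database itself, so the relationship rules live in the schema rather than in application code.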

Advantages of Data Model

  • The main goal of designing a data model is to make certain that the data objects offered by the functional team are represented accurately. 设计数据模型的主要目标是确保功能团队提供的数据对象得到准确表示。
  • The data model should be detailed enough to be used for building the physical database. 数据模型应足够详细,以用于构建物理数据库。
  • The information in the data model can be used to define the relationships between tables, the primary and foreign keys, and the stored procedures. 数据模型中的信息可用于定义表之间的关系、主键和外键以及存储过程。
  • A data model helps document the data mappings in the ETL process. 数据模型有助于记录 ETL 过程中的数据映射。

Disadvantages of Data Model

  • To develop a data model, one should know the characteristics of the physically stored data. 要开发数据模型,应该了解物理数据存储的特性。
  • Even a small change in the structure can require modifications across the entire application. 即使结构上的较小改变,也可能需要对整个应用程序进行修改。

Dimensional Models

A dimensional model is a data structure technique optimized for data warehousing tools. 维度模型是针对数据仓库工具优化的数据结构技术。

Elements

  • Fact: the measurements/metrics or facts from your business process. 事实:来自业务流程的测量/指标或事实。
  • Dimension: provides the context surrounding a business process event. 维度:提供业务流程事件的上下文。
  • Attributes: the various characteristics of the dimension. 属性:维度的各种特征。
  • Fact Table: a primary table in a dimensional model. 事实表:维度模型中的主表。
  • Dimension Table: contains the dimensions of a fact; it is a de-normalized table whose attributes describe the facts. 维度表:包含事实的维度;它是非规范化的表,借助其属性提供事实的描述性特征。

Data Model and Data Cube

A data warehouse is based on a multidimensional data model, which views data in the form of a data cube. 数据仓库基于多维数据模型,该模型以数据立方体的形式查看数据。
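One way to picture a data cube is as the set of aggregates over every combination of dimensions. The sketch below builds such a cube in plain Python; the `build_cube` helper, the dimension names and the sample deposit rows are all hypothetical and only serve to illustrate the idea.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: three dimensions plus one additive measure ("amount").
DIMENSIONS = ("customer", "month", "account_type")
facts = [
    {"customer": "Alice", "month": "2024-01", "account_type": "savings", "amount": 100.0},
    {"customer": "Alice", "month": "2024-02", "account_type": "savings", "amount": 150.0},
    {"customer": "Bob", "month": "2024-01", "account_type": "cheque", "amount": 40.0},
]


def build_cube(rows, dimensions, measure):
    """Sum the measure for every subset of dimensions (one 'cuboid' per subset)."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for group in combinations(dimensions, size):
            cuboid = defaultdict(float)
            for row in rows:
                cuboid[tuple(row[d] for d in group)] += row[measure]
            cube[group] = dict(cuboid)
    return cube


cube = build_cube(facts, DIMENSIONS, "amount")
print(cube[("month",)])             # rolled up to month only
print(cube[("customer", "month")])  # one cell per customer and month
print(cube[()])                     # grand total over all dimensions
```

With three dimensions the cube holds 2^3 = 8 cuboids, from the full-detail view down to the grand total; rolling up or drilling down means moving between them.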

Steps of Dimensional Modelling

The model should describe the Why, How much, When/Where/Who and What of your business process. 该模型应描述业务流程的为什么、多少、何时/何地/谁和什么。
  1. Identify Business Process: Why 确定业务流程:为什么
  2. Identify Grain (level of detail): How Much 确定粒度(详细程度):多少
  3. Identify Dimensions: the 3 Ws (When, Where, Who) 确定维度:何时、何地、谁
  4. Identify Facts: What 确定事实:什么
  5. Build Star 构建星型

Rules of Dimensional Modelling

  • Load atomic data into dimensional structures. 将原子数据加载到维度结构中。
  • Build dimensional models around business processes. 围绕业务流程构建维度模型。
  • Ensure that every fact table has an associated date dimension table. 确保每个事实表都有一个关联的日期维度表。
  • Ensure that all facts in a single fact table are at the same grain or level of detail. 确保单个事实表中的所有事实都处于相同的粒度或详细程度。
  • Store report labels and filter domain values in dimension tables. 在维度表中存储报告标签和过滤域值。
  • Ensure that dimension tables use a surrogate key. 确保维度表使用代理键。
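Several of the rules above can be expressed directly in a schema. The following sketch, again using `sqlite3`, assumes a hypothetical daily-balance fact table: it has a date dimension, the customer dimension uses a surrogate key alongside the source system's natural key, and a composite primary key keeps every fact row at the same grain (one row per customer per day). The helper names `load_dates` and `customer_key` are illustrative only.

```python
import sqlite3
from datetime import date, timedelta

DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday")

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY,   -- e.g. 20240101
    full_date TEXT NOT NULL,
    year INTEGER NOT NULL,
    month INTEGER NOT NULL,
    day_name TEXT NOT NULL          -- report label kept in the dimension
);
CREATE TABLE dim_customer (
    customer_key INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
    customer_code TEXT NOT NULL UNIQUE,              -- natural key from the source system
    name TEXT NOT NULL
);
CREATE TABLE fact_daily_balance (
    date_key INTEGER NOT NULL REFERENCES dim_date(date_key),
    customer_key INTEGER NOT NULL REFERENCES dim_customer(customer_key),
    balance REAL NOT NULL,
    PRIMARY KEY (date_key, customer_key)  -- grain: one row per customer per day
);
""")


def load_dates(start, days):
    """Populate the date dimension with one row per calendar day."""
    for offset in range(days):
        d = start + timedelta(days=offset)
        conn.execute(
            "INSERT INTO dim_date VALUES (?, ?, ?, ?, ?)",
            (int(d.strftime("%Y%m%d")), d.isoformat(), d.year, d.month, DAY_NAMES[d.weekday()]),
        )


def customer_key(code, name):
    """Return the surrogate key for a natural key, creating the row if needed."""
    conn.execute(
        "INSERT OR IGNORE INTO dim_customer (customer_code, name) VALUES (?, ?)",
        (code, name),
    )
    row = conn.execute(
        "SELECT customer_key FROM dim_customer WHERE customer_code = ?", (code,)
    ).fetchone()
    return row[0]


load_dates(date(2024, 1, 1), 3)
key = customer_key("C-001", "Alice")
conn.execute("INSERT INTO fact_daily_balance VALUES (?, ?, ?)", (20240101, key, 100.0))
```

Because the fact table refers only to `customer_key`, the natural key `customer_code` can change format in the source system without touching any fact rows.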

Benefits of Dimensional Modelling

  • Dimensions are standardized. 维度标准化。
  • Dimension tables store the history of the dimensional information. 维度表保存维度信息的历史。
  • An entirely new dimension can be introduced without major disruption to the fact table. 可以引入全新的维度,而不会对事实表造成重大破坏。
  • It is easier to retrieve information from the data. 更容易从数据中检索信息。
  • Dimension tables are easier to understand. 维度表更容易理解。