Article from: https://www.cnblogs.com/reycg-blog/p/9075597.html

A data warehouse is a theme-oriented (subject-oriented), integrated, nonvolatile, time-variant collection of data used to support management decision making.

 

Data in a data warehouse is usually loaded and accessed in batch mode, but it is not updated in place in the data warehouse environment.

Data is loaded into the data warehouse in the form of static snapshots. When a subsequent change occurs, a new snapshot record is written to the data warehouse.

As a result, the history of the data is preserved in the data warehouse.
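The snapshot behavior described above can be sketched in a few lines of Python. This is only an illustration under assumed names: the `Snapshot` record, the `SnapshotTable` class, and the customer address example are hypothetical and do not come from the original article.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Snapshot:
    """One immutable snapshot row: the state of a customer as of a load date."""
    customer_id: int
    address: str
    as_of: date


class SnapshotTable:
    """Append-only store: rows are added, never updated in place."""

    def __init__(self):
        self._rows = []

    def load(self, snapshot):
        # A change in the source produces a new row; old rows are left untouched.
        self._rows.append(snapshot)

    def history(self, customer_id):
        rows = [r for r in self._rows if r.customer_id == customer_id]
        return sorted(rows, key=lambda r: r.as_of)

    def as_of(self, customer_id, when):
        """Return the latest snapshot taken on or before `when`, or None."""
        candidates = [r for r in self.history(customer_id) if r.as_of <= when]
        return candidates[-1] if candidates else None


table = SnapshotTable()
table.load(Snapshot(42, "1 Old Road", date(2018, 1, 1)))
table.load(Snapshot(42, "9 New Street", date(2018, 5, 1)))  # the customer moved
```

Because nothing is overwritten, a query such as `table.as_of(42, date(2018, 3, 1))` can still see the earlier address.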

The structure of a data warehouse

The data in a data warehouse environment exists at different levels of detail.

  1. Older detail level
  2. Current detail level
  3. Lightly summarized data level (data mart)
  4. Highly summarized data level

Data flows into the data warehouse from the operational environment. A considerable amount of data transformation usually occurs as data moves from the operational level to the data warehouse level.

Theme-oriented

A data warehouse is oriented to the enterprise themes (subject areas) defined in the high-level enterprise data model, such as customers, products, transactions or activities, policies, claims, and accounts.

DASD: direct access storage device.

Each theme is implemented as a set of tables linked in a star schema. For example, the tables of the customer theme are linked by customer ID.
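As a rough illustration of tables in one theme being linked by a shared key, the sketch below uses Python's built-in `sqlite3` module. The table names `customer` and `customer_activity`, their columns, and the sample rows are assumptions made for this example only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables of the customer theme, linked by customer_id.
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE customer_activity (
        customer_id INTEGER REFERENCES customer(customer_id),
        activity_date TEXT,
        amount REAL
    );
    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO customer_activity VALUES
        (1, '2018-05-01', 20.0),
        (1, '2018-05-03', 5.0),
        (2, '2018-05-02', 7.5);
""")

# Join the tables of the theme on the shared customer ID.
totals = conn.execute("""
    SELECT c.name, SUM(a.amount)
    FROM customer AS c
    JOIN customer_activity AS a ON a.customer_id = c.customer_id
    GROUP BY c.customer_id
    ORDER BY c.customer_id
""").fetchall()
```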

The day 1 to day n phenomenon

A data warehouse can only be designed and loaded step by step. That is to say, it is evolutionary rather than revolutionary.

Granularity

Granularity is the level of detail or degree of summarization of the data units in a data warehouse.

The higher the degree of detail, the lower the granularity level; the lower the degree of detail, the higher the granularity level.

In a data warehouse environment, granularity is the most important design issue because it profoundly affects both the volume of data stored in the data warehouse and the types of query the data warehouse can answer. The lower the granularity level, the wider the range of queries that can be answered; conversely, the higher the granularity level, the narrower the range of queries.
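The trade-off can be seen in a small sketch. The phone call records, the field names, and the `summarize_by_month` helper below are hypothetical and serve only to contrast a low granularity level (one row per call) with a higher one (one row per customer per month).

```python
from collections import defaultdict

# Low granularity level (high detail): one row per phone call.
calls = [
    {"customer": "ann", "date": "2018-05-01", "minutes": 3},
    {"customer": "ann", "date": "2018-05-09", "minutes": 12},
    {"customer": "ann", "date": "2018-06-02", "minutes": 7},
    {"customer": "bob", "date": "2018-05-04", "minutes": 5},
]


def summarize_by_month(rows):
    """Raise the granularity level: one row per (customer, month)."""
    totals = defaultdict(lambda: {"calls": 0, "minutes": 0})
    for row in rows:
        key = (row["customer"], row["date"][:7])  # "YYYY-MM"
        totals[key]["calls"] += 1
        totals[key]["minutes"] += row["minutes"]
    return dict(totals)


monthly = summarize_by_month(calls)

# A detailed question ("did ann make a call on 2018-05-09?") needs the call-level rows.
ann_called = any(
    c["customer"] == "ann" and c["date"] == "2018-05-09" for c in calls
)
```

The summary holds fewer rows than the detail, but it can no longer answer the question about a specific day; only the call-level data can.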

 

When the data warehouse of a business or organization holds a large amount of data, it makes good sense to use dual or multiple levels of granularity in the detailed portion of the warehouse.

 

Living sample database

A living sample database is a subset of the true archival data or the lightly summarized data in the data warehouse. "Sample" means that it is a subset of a larger database; "living" means that the database needs to be refreshed periodically.

Living sample data is used for statistical analysis and for observing trends. When the data must be viewed as a whole, a living sample database can give very good results, but it is not suitable for processing individual data records.
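A minimal sketch of drawing such a sample is shown below, assuming simple random sampling with Python's `random` module. The `draw_living_sample` function, the 1% fraction, and the synthetic `warehouse` rows are illustrative assumptions; the article does not prescribe a sampling method.

```python
import random


def draw_living_sample(warehouse_rows, fraction, seed=None):
    """Draw a random subset; call again on a schedule to refresh the sample."""
    rng = random.Random(seed)
    size = max(1, round(len(warehouse_rows) * fraction))
    return rng.sample(warehouse_rows, size)


# Synthetic stand-in for detailed warehouse data.
warehouse = [{"id": i, "amount": float(i % 50)} for i in range(10_000)]

sample = draw_living_sample(warehouse, fraction=0.01, seed=0)

# Aggregate statistics can be estimated from the much smaller sample...
estimated_mean = sum(r["amount"] for r in sample) / len(sample)
true_mean = sum(r["amount"] for r in warehouse) / len(warehouse)
# ...but a lookup of one specific record will usually miss, since most rows are absent.
```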

 

Partitioning as a design approach

Data partitioning is the breakup of data into separate physical units that can be processed independently.

The question in the data warehouse environment is how to partition the current detail data.
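One common choice is to partition by date. The sketch below assumes a month-based partition key; the `partition_by_month` helper and the sample rows are hypothetical, and other keys (line of business, geography, and so on) would follow the same pattern.

```python
from collections import defaultdict
from datetime import date


def partition_by_month(rows):
    """Split detail rows into independent units keyed by (year, month)."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[(row["date"].year, row["date"].month)].append(row)
    return dict(partitions)


detail = [
    {"id": 1, "date": date(2018, 4, 28)},
    {"id": 2, "date": date(2018, 5, 2)},
    {"id": 3, "date": date(2018, 5, 20)},
]

partitions = partition_by_month(detail)

# Each unit can be handled on its own, e.g. archiving the oldest month
# without touching the others.
oldest = min(partitions)
archived = partitions.pop(oldest)
```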

 

Data organization in the data warehouse

1. Simple cumulative data

2. Rolling summary data

3. Simple direct file

4. Continuous file
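To make the second form more concrete, here is a minimal sketch of rolling summary data, assuming daily totals are rolled up into a weekly total once a week is complete. The `RollingSummary` class and the seven-day week length are assumptions for illustration, not details given in the article.

```python
class RollingSummary:
    """Keeps daily totals for the current week; full weeks roll up into weekly totals."""

    def __init__(self, days_per_week=7):
        self.days_per_week = days_per_week
        self.daily = []   # detail for the week in progress
        self.weekly = []  # one summarized total per completed week

    def add_day(self, total):
        self.daily.append(total)
        if len(self.daily) == self.days_per_week:
            # Roll the week up; the daily detail is given up at this point.
            self.weekly.append(sum(self.daily))
            self.daily.clear()


rolling = RollingSummary()
for day_total in range(1, 11):  # ten days with totals 1..10
    rolling.add_day(day_total)
```

After ten days the structure holds one weekly total plus three daily totals, so storage stays small at the cost of losing the daily detail for completed weeks. Simple cumulative data, by contrast, would keep all ten daily entries.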

The life cycle of data in a data warehouse includes data purging.

Data is purged, or its level of detail is transformed, mainly in the following ways.

  • Data is added to a rolling summary file, where its original detail is lost.
  • Data is transferred from high-performance media such as DASD to bulk storage media.
  • Data is actually purged from the system.
  • Data is transferred from one level of the architecture to another.

 
