<?xml version="1.0" encoding="UTF-8"?>
<!-- generator="wordpress/2.3.1" -->
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	>

<channel>
	<title>blobgle.com</title>
	<link>http://blobgle.com/blog</link>
	<description>Create your own blog for FREE</description>
	<pubDate>Wed, 23 Jan 2008 12:28:09 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.3.1</generator>
	<language>en</language>
			<item>
		<title>New Solutions Make BI Attractive to All Companies</title>
		<link>http://blobgle.com/blog/?p=110</link>
		<comments>http://blobgle.com/blog/?p=110#comments</comments>
		<pubDate>Wed, 23 Jan 2008 12:28:09 +0000</pubDate>
		<dc:creator>jzarate</dc:creator>
		
		<category><![CDATA[Ciencia y Tecnologia]]></category>

		<category><![CDATA[bi]]></category>

		<category><![CDATA[bpm]]></category>

		<category><![CDATA[bsc]]></category>

		<category><![CDATA[business intelligence]]></category>

		<category><![CDATA[business objects]]></category>

		<category><![CDATA[business performance]]></category>

		<category><![CDATA[cargas]]></category>

		<category><![CDATA[datamarting]]></category>

		<category><![CDATA[datawarehouse]]></category>

		<category><![CDATA[etl]]></category>

		<category><![CDATA[etling]]></category>

		<category><![CDATA[gestion]]></category>

		<category><![CDATA[kimball]]></category>

		<category><![CDATA[manager]]></category>

		<category><![CDATA[managment]]></category>

		<category><![CDATA[mba]]></category>

		<category><![CDATA[microstrategy]]></category>

		<category><![CDATA[modelamiento multidimensional]]></category>

		<category><![CDATA[modelo]]></category>

		<category><![CDATA[ods]]></category>

		<category><![CDATA[Open Source]]></category>

		<category><![CDATA[operational data store]]></category>

		<category><![CDATA[oracle]]></category>

		<category><![CDATA[PALO]]></category>

		<category><![CDATA[performance]]></category>

		<category><![CDATA[sap]]></category>

		<category><![CDATA[sdd]]></category>

		<guid isPermaLink="false">http://blobgle.com/blog/?p=110</guid>
		<description><![CDATA[New Solutions Make BI Attractive to All Companies

William Copacino, Mark Pendrock
DM Review Magazine, November 2007

The ability to assemble and analyze information has become a powerful source of competitive advantage for many organizations. However, most companies have not harnessed this capability and are operating at a clear competitive disadvantage. The reasons? It has been excessively [...]]]></description>
			<content:encoded><![CDATA[<h1>New Solutions Make BI Attractive to All Companies</h1>
<ul>
<li><a href="http://blobgle.com/authors/2000020.html">William Copacino</a>, <a href="http://blobgle.com/authors/2000021.html">Mark Pendrock</a></li>
<li>DM Review Magazine, November 2007</li>
</ul>
<p>The ability to assemble and analyze information has become a powerful source of competitive advantage for many organizations. However, most companies have not harnessed this capability and are operating at a clear competitive disadvantage. The reasons? It has been excessively expensive and difficult to retrieve and analyze corporate information - until now. In particular, affordable business intelligence (BI) solutions have not been available to the small and medium-sized business (SMB) market, and while many large companies have been able to afford these solutions, they have been frustrated in their attempts to effectively implement and utilize BI capabilities within a reasonable timeframe.</p>
<p>The BI environment is changing rapidly as new solutions emerge that allow many more companies (and a broader group of users within these companies) to access key information and create significant business value. These next-generation solutions offer full BI capabilities, rapid implementation with many fewer demands on a company’s business and technology resources, dramatically lower cost and lower risk than traditional solutions, and the flexibility to accommodate business changes as they occur. These new solutions will have a major impact on the BI marketplace, providing companies with critical business information and a valuable competitive edge.</p>
<p>Increasingly, many leading companies recognize that access to information and analytics is integral to successfully operating and growing their businesses. By being able to view and analyze information throughout their organization, they are able to understand the profitability of market segments and sharpen marketing approaches; improve their ability to serve their customers; better manage their supply base; more rapidly integrate acquisitions; and enhance their overall efficiency and operational performance. 
Companies such as Wal-Mart, Dell and Amazon have focused analytics on supply chain costs and inventory optimization; Harrah’s, Capital One and Barclays have leveraged analytic and BI tools to enhance customer selection, loyalty and service; and MCI, Marriott and Verizon have focused on pricing, profitability and financial analysis.<sup>1</sup></p>
<p>The abundance of data now available from both internal and external sources presents a tremendous opportunity for companies to leverage information for competitive advantage. Over the past two decades, large and small companies have implemented new enterprise software, particularly transaction systems, to help them manage their businesses. These new systems include broad enterprise resource planning (ERP) or related management capabilities such as financial management and accounting, order management, asset management, customer relationship management, marketing and sales management, supply chain management, transportation, procurement and related functionality.</p>
<p>Transaction systems provide essential capabilities and allow companies to operate much more efficiently, but also generate large volumes of very valuable data. Paradoxically, access to this transaction data is often limited. The data that is organized and stored to optimize the transaction system has fundamentally different requirements than the data needed for rapid access, analysis and reporting. Unfortunately, end users cannot easily extract data, combine it with data from other transaction systems, analyze it or generate reports - precluding their ability to look at data in new ways or gain new insights. The compelling need to access and leverage this valuable data, which is resident but not available to many companies, has driven the growth of BI.</p>
<h4>Challenges of Traditional BI</h4>
<p>Historically, implementing BI solutions has not been for the faint of heart or those lacking money, resources or time. BI programs are large, lengthy, complex and expensive undertakings that demand careful management. According to industry experts, more than 50 percent of large BI programs fail to meet expectations, and many of these projects are abandoned as a direct result of their implementation complexities and/or their failure to deliver anticipated value.</p>
<p>Traditional approaches to BI systems necessarily lead to multiphased and very complex projects. Providing information from multiple transaction systems to senior executives and managers for business analytics involves an extensive set of activities - data movement, data transformation, presentation layers, data analytics tools, master data management (MDM), metadata management and overarching data governance. And the lifecycle for implementing BI systems will progress through many steps, including requirements analysis, source systems analysis, gap analysis against existing capabilities, solution architecture, data model development, change management for data governance and implementation.</p>
<p>Serving the analytic needs across all functions of the company complicates the design of data models and data transformation processes. Inevitably, companies will start with a goal of a single data management framework, but then implement multiple instances at any or all of the layers of the operational data store (ODS), dimensional data warehouse and data marts. Solution architecture and design must involve the evaluation of multiple special purpose technologies for extract, transform and load (ETL), standard and ad hoc reporting, custom analytics, MDM and metadata management. The project lifecycle is extended by the need for validation of findings at each phase of the project and the complexity of vendor technology selection for multiple best-of-breed technologies or integrated BI platforms. 
Further, the organization is faced with obtaining the core competencies for each of the technologies and layers of the data management process. In the best case, the typical implementation of BI entails lengthy, multiphase project lifecycles and the investment in multiple technologies. The obvious issues with this are the investment in project resources, the investment in technology, managing data quality and organizational acceptance. The more subtle risk is the ability to respond to changing business strategy, both in the initial project lifecycle as well as for system upgrades. When a company attempts to seize new opportunities, the BI infrastructure must keep pace with the need to evaluate metrics of market entry and performance management.</p>
<p>Faced with this challenge, business analysts in both large and small companies often choose to move some or much of their reporting and analysis to technologies that they can maintain and understand, most notably Excel spreadsheets and Access databases. While the data management and data governance process is severely compromised when data lands in the desktop tools environment, the more serious issue is how the data is sourced. Business users will extract data from whichever environment is available to them: source transaction systems, ODS, data warehouse or data marts. Inevitably, companies lose any chance of achieving a single version of the truth. Data management and traditional data warehouse techniques (e.g., conformed dimensions) are lost because there is no consistency (and no predictability) in the sourcing of data for business analysis. So while these solutions fill a gaping hole in information access, they are neither an enduring nor an ideal answer.</p>
<h4>Broader Audience for BI</h4>
<p>As you can see, the challenges of creating practical, flexible and economical data access and data integration are significant. These challenges have become preemptive barriers for SMBs and major failure points for too many large companies. But these challenges also form principles that must be incorporated in the next generation of BI solutions. We need simplified data integration and a much faster time frame to implement a BI solution. Why must we procure multiple technologies (ETL tool, data model/database and analytic/reporting software), spend many months (or years) and require enormous business and technical resources for data mapping and weaving these applications together? We need more flexibility to accommodate changes in business strategy and company direction. Why must we distract and waste considerable business and technical resources to revisit our entire data integration framework when we have a new acquisition or an upgrade to a new version of our BI software? We need easier user adoption and the ability to make BI available to a broader group of users. Why can’t we have more easy-to-use, intuitive user interfaces that do not require endless training in order to be fully utilized? We need the ability to rapidly access the full data set and to drill down to the most detailed level of data without losing the ability to scale. We need the ability to report off of large data sets (down to the individual transaction and not from some high-level data summary in a data cube), so we can solve business problems and not just identify that we have a problem. Why must we organize our data in subsets of data marts and data cubes and be limited to the data we can access? We ask these questions to be somewhat provocative, but also to break through traditional barriers that many in the market assumed were immovable. The good news is that we are finding that these barriers can be overcome. 
New technologies, unfettered by the constraints of the old approach, are now designed to address the needs of business users. There are several next-generation BI software products emerging on the market today. Some of these solutions provide an integrated data extraction, data schema and analytics/reporting solution built from the ground up as a single, integrated product. This approach provides a tremendous advantage in the cost, resources, time and risk of implementation because little to no data mapping/integration is required across the solution stack. And a few of the next-gen BI solutions have revolutionized database design, offering reporting with drill downs off the full data set. These solutions eliminate the need for data marts and data cubes, while providing powerful information to users. Some of the more mature next-gen products have demonstrated repeatedly that a full BI solution can be implemented as quickly as in six or eight weeks, versus the six to 12-plus months required for traditional approaches. These solutions have proven to be robust and scalable, handling up to billions of rows of data in several customer implementations. These next-generation BI solutions deliver a truly integrated BI platform that offers great promise for the SMB market and for larger companies as well. The SMB model, with its need for flexibility, will benefit in additional ways from the rapid implementation and “resource lite” approach that some of the next-generation solutions provide. Some solutions enable companies to incorporate changes to their business model (e.g., acquisitions, new data sources, etc.) even more rapidly than the six-week time frame it takes for the initial implementation. BI technology has achieved great success through the years, enabling many large companies to leverage their information assets to compete on analytics. 
Today, next-generation BI solutions are providing these same capabilities to SMBs - and at the same time, delivering rapid implementation, lower costs, the flexibility to accommodate business changes and dramatically lower risk of failure. Some of these new solutions are delivered through a software as a service (SaaS) model and offer a very intuitive, easy-to-learn and easy-to-use Web interface that simplifies and accelerates user adoption. These solutions enable SMBs to grow profitability without adding IT headcount for training, software management and troubleshooting. Understandably, the benefits of next-gen solutions are valuable to companies of all sizes, giving them a 360-degree view into what’s happening in their organizations today and enabling them to realize business results immediately. The next wave of BI solutions is meeting evolving industry challenges, delivering a powerful value proposition to the market and providing cost-effective, competitive advantage for companies of all sizes.</p>
<p><em>References:</em></p>
<ol>
<li>Thomas H. Davenport. “Competing on Analytics.” <em>Harvard Business Review,</em> January 2006.</li>
</ol>
<p><em>Bill Copacino is the president and CEO of Oco, Inc. He can be reached at (781) 810-2100.</em></p>
<p><em>Mark Pendrock is president of Pendrock Consulting LLC. He can be reached at (914) 924-1273.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://blobgle.com/blog/?feed=rss2&amp;p=110</wfw:commentRss>
		</item>
		<item>
		<title>The Operational Data Store (ODS)</title>
		<link>http://blobgle.com/blog/?p=109</link>
		<comments>http://blobgle.com/blog/?p=109#comments</comments>
		<pubDate>Wed, 23 Jan 2008 12:23:42 +0000</pubDate>
		<dc:creator>jzarate</dc:creator>
		
		<category><![CDATA[Ciencia y Tecnologia]]></category>

		<category><![CDATA[bi]]></category>

		<category><![CDATA[bsc]]></category>

		<category><![CDATA[business intelligence]]></category>

		<category><![CDATA[cargas]]></category>

		<category><![CDATA[datamarting]]></category>

		<category><![CDATA[etl]]></category>

		<category><![CDATA[etling]]></category>

		<category><![CDATA[kimball]]></category>

		<category><![CDATA[modelamiento multidimensional]]></category>

		<category><![CDATA[modelo]]></category>

		<category><![CDATA[Open Source]]></category>

		<category><![CDATA[oracle]]></category>

		<category><![CDATA[PALO]]></category>

		<category><![CDATA[sdd]]></category>

		<guid isPermaLink="false">http://blobgle.com/blog/?p=109</guid>
		<description><![CDATA[Designing the Operational Data Store

Bill Inmon
DM Review Magazine, July 1998

Recently there has been controversy over the validity and makeup of the architectural structure known as the operational data store (ODS). Some skeptics question the existence of the ODS. This argument is quite strange because the ODS is one of the most pervasive architectural structures found [...]]]></description>
			<content:encoded><![CDATA[<h3>Designing the Operational Data Store</h3>
<ul>
<li><a href="http://blobgle.com/authors/30038.html">Bill Inmon</a></li>
<li>DM Review Magazine, July 1998</li>
</ul>
<p>Recently there has been controversy over the validity and makeup of the architectural structure known as the operational data store (ODS). Some skeptics question the existence of the ODS. This argument is quite strange because the ODS is one of the most pervasive architectural structures found in information systems today. The notion that the ODS is not a legitimate structure is news to SAP, Oracle Financials and PeopleSoft&#8211;three of the most widely implemented pieces of software in the 1990s, which happen to contain major components that are decidedly ODSs. While some aspects of these software packages are beyond the bounds of an ODS, many of the features of these software packages squarely fit the paradigm of an ODS. As further evidence of the health of the ODS, in a recent private conference, seven information systems directors of large, well-known companies spent time describing their environment. The ODS was a prominent feature of each of these companies&#8217; information systems architecture. So it is peculiar that industry experts are questioning the validity of the ODS. Perhaps these experts simply do not understand what an ODS is and what functions it performs.</p>
<h4>The Architectural Positioning</h4>
<p>In order to have a discussion about ODSs, the conversation best begins with a schematic that shows how an ODS is architecturally positioned. Figure 1 shows the classical positioning of the ODS.</p>
<p> <a href="http://blobgle.com/blog/wp-content/uploads/2008/01/primera.gif" title="Figure 1"><img src="http://blobgle.com/blog/wp-content/uploads/2008/01/primera.gif" alt="Figure 1" /></a></p>
<p>In Figure 1 the ODS is seen to be an architectural structure that is fed by integration and transformation (i/t) programs. These i/t programs can be the same programs as the ones that feed the data warehouse or they can be separate programs. The ODS, in turn, feeds data to the data warehouse.</p>
<p>Some operational data traverses directly into the data warehouse through the i/t layer while other operational data passes from the operational foundation into the i/t layer, then into the ODS and on into the data warehouse.</p>
<p>An ODS is an <em>integrated, subject-oriented, volatile (including update), current-valued structure</em> designed to serve operational users as they do high performance integrated processing. (Note: For a comprehensive discussion of the subject of operational data stores, refer to the book <em>Building the Operational Data Store,</em> by W. H. Inmon, Claudia Imhoff and Greg Battas, published by John Wiley &amp; Sons. This article will not try to restate concepts and descriptions that have been in the public domain for quite a while.)</p>
<p>The essence of an ODS is the enablement of integrated, collective on-line processing. An ODS delivers consistent high transaction performance&#8211;two to three seconds. An ODS supports on-line update. An ODS is integrated across many applications. An ODS provides a foundation for collective, up-to-the-second views of the enterprise. And, at the same time, the ODS supports decision support processing.</p>
<p>Because of the many roles that an ODS fulfills, it is a complex structure. Its underlying technology is complex. Its design is complex. Monitoring and maintaining the ODS is complex.</p>
<p>The ODS takes a long time to implement (e.g., SAP). The ODS requires changing or replacing old legacy systems that are unintegrated.</p>
<h4>The Dual Role of the ODS</h4>
<p>The ODS plays a dual role. On the one hand, the ODS is decidedly operational. The ODS provides fast response time and high availability and is certainly qualified to act as the basis of mission-critical systems. On the other hand, the ODS has some very clear DSS characteristics. The ODS is integrated, subject oriented and supports some important kinds of decision support.</p>
<h4>The Users&#8211;Farmers and Explorers</h4>
<p>This article will focus on one of the more misunderstood aspects of the ODS&#8211;the foundation of the design. In order to understand the foundation of the design of the ODS, you first need to understand that two very different types of users are attracted to the ODS&#8211;farmers and explorers.</p>
<p>The first user of the ODS is a user who can be called a &#8220;farmer.&#8221; Farmers are those people who do the same task repetitively. Farmers know what they want when they set out to search for something. Farmers look at small amounts of data with each transaction. Farmers almost always find what they want. Farmers usually find small flakes of gold, not huge nuggets, at the completion of their transaction. Farmers operate in a world of structure&#8211;structured data, structured processing, structured procedures and so forth.</p>
<p>The other type of user that is served by the ODS is the &#8220;explorer.&#8221; The explorer is the antithesis of the farmer. The explorer operates in a random manner. The explorer does not know what he/she is looking for at the outset of the analysis. Explorers operate in a heuristic mode. Explorers look at very large sets of data. Explorers look for associations between types of data, patterns that are useful and relationships that have heretofore never been discovered. The explorer often finds nothing as a result of an analysis, but occasionally the explorer finds huge nuggets of gold. Explorers operate in a pattern that defies prediction. The explorer operates in an almost completely unstructured manner.</p>
<h4>The ODS and Explorers and Farmers</h4>
<p>The ODS must satisfy the needs of both the farmer and the explorer; and because of this paradox, the design of the ODS is a difficult task in the best of circumstances.</p>
<h4>The Basis of Design in DSS</h4>
<p>The classical design of the structures found in the DSS environment begins with a data model, which reflects the informational needs of the corporation. Figure 2 shows the steps leading to a DSS design.</p>
<p> <a href="http://blobgle.com/blog/wp-content/uploads/2008/01/segunda.gif" title="Figure 2"><img src="http://blobgle.com/blog/wp-content/uploads/2008/01/segunda.gif" alt="Figure 2" /></a></p>
<p>Normalized tables are generated from the data model. These tables constitute what can be described as a logical design. The many normalized tables are combined into a form of physical design that can be described as lightly normalized design. In a lightly normalized design, tables are combined on the basis of containing common keys and general common usage.</p>
<p>The design technique described here, creating normalized or lightly normalized structures based on a data model, fits many instances of DSS design. But there is a fly in the ointment. Light normalization yields marginal results once three issues are considered: performance <em>where many tables must be joined</em>; performance <em>where many occurrences of data will populate the design</em>; and simplicity, <em>where users find it unnatural to join many tables together, each time they do a transaction, to represent data in a form they can comprehend</em>.</p>
<p>An alternate design approach is to take into consideration the volume and usage of the data. When the volume and usage of the data are factored into the design, a mutant form of normalization is achieved. The light normalization turns into heavy normalization, and a structure known as the &#8220;star join&#8221; is created. (See Figure 3.)</p>
<p> <a href="http://blobgle.com/blog/wp-content/uploads/2008/01/tercera.gif" title="Figure 3"><img src="http://blobgle.com/blog/wp-content/uploads/2008/01/tercera.gif" alt="Figure 3" /></a></p>
<p>There are two essential parts to a star join&#8211; fact tables and dimension tables. (Note: For an in-depth discussion of the subject of multidimensional design, refer to Ralph Kimball&#8217;s book, <em>The Data Warehouse Toolkit: Practical Techniques for Building Dimensional Data Warehouses,</em> published by John Wiley &amp; Sons. This book is the definitive source for the subject of multidimensional database design.) The fact table represents the structure that holds the majority of the occurrences of the data. Fact tables typically combine data and cross reference keys from a variety of other tables.</p>
<p>The other type of table that participates in a star join is the dimension table. Dimension tables contain data which is not terribly voluminous. Dimension tables are related to fact tables by means of a foreign key relationship.</p>
<p>Fact tables are efficient to access because data has been prejoined into the table at the moment of loading. The end user is able to access fact tables efficiently because the fact tables are extremely streamlined in their design. In addition, the fact table is familiar to the end user, in terms of the day-to-day structuring of data that the end user is accustomed to seeing.</p>
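<p>The two parts of a star join can be sketched in a few lines. The example below is only an illustration, not the article's own design: the table names (<em>fact_sales</em>, <em>dim_product</em>, <em>dim_store</em>), the columns and the sample rows are all assumptions, and Python's built-in sqlite3 module stands in for whatever database actually hosts the ODS.</p>

```python
import sqlite3


def build_star_join():
    """Create a tiny in-memory star join: one fact table, two dimension tables.

    All table names, columns and rows are hypothetical sample data.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE dim_store (store_id INTEGER PRIMARY KEY, city TEXT);
        -- The fact table holds most of the rows and carries a foreign key
        -- to each dimension table.
        CREATE TABLE fact_sales (
            product_id INTEGER REFERENCES dim_product(product_id),
            store_id INTEGER REFERENCES dim_store(store_id),
            quantity INTEGER,
            amount REAL
        );
    """)
    conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                     [(1, "widget"), (2, "gadget")])
    conn.executemany("INSERT INTO dim_store VALUES (?, ?)",
                     [(10, "Denver"), (20, "Boston")])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                     [(1, 10, 3, 30.0), (1, 20, 1, 10.0), (2, 10, 2, 50.0)])
    return conn


def sales_by_city(conn):
    """Join the fact table to one dimension and total the amounts per city."""
    rows = conn.execute("""
        SELECT s.city, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_store AS s ON s.store_id = f.store_id
        GROUP BY s.city
        ORDER BY s.city
    """).fetchall()
    return dict(rows)
```

<p>Calling <em>sales_by_city(build_star_join())</em> totals the sample sales per city with a single join from the fact table to one dimension, which is the kind of predictable, repetitive query the structure is built for.</p>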
<p>By building star joins, the designer has created a structure for efficient access, large volumes of data and natural end-user viewing. However, there is a problem with star joins. In order to know how to create the star join, the designer must make assumptions about the usage of the data. Stated differently, without knowing the predominant pattern of access and usage of the data, you cannot create a star join. At the heart of the design of any star join is the implicit understanding of how the data in the star join is to be used. Unfortunately, one department will look at data very differently from another department. The star join for finance will be very different than the star join for production, for example.</p>
<p>There is a second problem with star join structures, and that problem is that on-line update plays havoc with the underlying data management required to make the star join complete. In a DSS world where there is no update, this is not a problem. But in an ODS world where on-line update is a normal event, the inability of the star join to gracefully handle updates presents a special challenge.</p>
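<p>One way to picture the update problem is to count the rows a single change touches. The sketch below assumes a prejoined table (the hypothetical <em>fact_orders</em>) that copies a customer's city into every order row at load time. That layout, the table names and the sample data are assumptions made for illustration, not something the article specifies.</p>

```python
import sqlite3


def count_rows_touched_by_update():
    """Compare one attribute change in a normalized vs. a prejoined layout.

    Returns a tuple (rows updated in the normalized table,
    rows updated in the prejoined table). All names are hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Normalized: the city is stored once per customer.
        CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, city TEXT);
        CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER);
        -- Prejoined: the city is copied into every order row at load time.
        CREATE TABLE fact_orders (order_id INTEGER PRIMARY KEY,
                                  customer_id INTEGER, city TEXT);
    """)
    conn.execute("INSERT INTO customer VALUES (1, 'Denver')")
    for order_id in range(1, 6):
        conn.execute("INSERT INTO orders VALUES (?, 1)", (order_id,))
        conn.execute("INSERT INTO fact_orders VALUES (?, 1, 'Denver')",
                     (order_id,))
    # The same business event (the customer moves) hits one row or many.
    normalized = conn.execute(
        "UPDATE customer SET city = 'Boston' WHERE customer_id = 1").rowcount
    prejoined = conn.execute(
        "UPDATE fact_orders SET city = 'Boston' WHERE customer_id = 1").rowcount
    conn.close()
    return normalized, prejoined
```

<p>With five sample orders, the normalized layout changes one row while the prejoined layout changes five. The gap widens with every order loaded, which is why on-line update sits uneasily with a prejoined structure.</p>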
<h4>A Dilemma</h4>
<p>Thus, the ODS designer has a dilemma. On the one hand, the designer wishes to have efficiency of access and the ability to handle large amounts of data. On the other hand, the ODS designer must design the system to be able to accommodate a wide variety of users. The following table illustrates the dilemma of the ODS database designer:</p>
<p>The designer in the ODS environment faces Hobson&#8217;s choice. Neither design approach&#8211;normalized or star join&#8211;is optimal for the ODS. Both approaches have their strengths and weaknesses.</p>
<p>The way the sophisticated designer goes about solving this apparent contradiction is to go back to the users of the system. For those parts of the system used primarily by explorers, a normalized design is optimal. Explorers do not know how they are going to use the system, so normalization suits them just fine. For those parts of the system used primarily by farmers, a star join approach is optimal. Since farmers have a predictable and repetitive usage pattern, a star join can be created to allow them optimal access. Figure 4 shows this dual design approach for the ODS.</p>
<p> <a href="http://blobgle.com/blog/wp-content/uploads/2008/01/cuarta.gif" title="Figure 4"><img src="http://blobgle.com/blog/wp-content/uploads/2008/01/cuarta.gif" alt="Figure 4" /></a></p>
<p>The next factor that must be accounted for is the issue of update or pure DSS processing. Some farmers do no update. They are the &#8220;pure&#8221; DSS processors. Other farmers do update as a regular part of their ODS processing.</p>
<p>Explorers, however, seldom do on-line update. If explorers do update at all, it is by creating sweeping batch programs that march across entire tables and make massive changes. But explorers are not known for making changes, certainly not on-line updates. Figure 5 shows that the proper basis of design for an ODS is entirely dependent on who is using the ODS and what kind of work they are doing.</p>
<p> <a href="http://blobgle.com/blog/wp-content/uploads/2008/01/quinta.gif" title="Figure 5"><img src="http://blobgle.com/blog/wp-content/uploads/2008/01/quinta.gif" alt="Figure 5" /></a></p>
<p>If the ODS is used <strong>only </strong>by farmers doing DSS processing, then an exclusive star join approach is in order for the entire ODS. But if update processing is being done by farmers or if there is usage of the ODS by explorers to any extent, then one or the other form of normalization is in order. If the ODS is used<strong> only </strong>by explorers, then a normalized approach is in order for the entire ODS.</p>
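<p>The rule in the preceding paragraph can be written down as a small decision function. This is one reading of the rule, not a formula from the article: the function name, the three boolean flags and the returned labels are assumptions, and &#8220;one or the other form of normalization&#8221; is left as a generic label because the article does not say which form applies in each mixed case.</p>

```python
def choose_ods_design(farmers_dss_only, farmers_update, explorers):
    """Return a design basis for an ODS from the kinds of usage present.

    Each argument is a bool: are there farmers doing pure DSS processing,
    farmers doing update, and explorers? The labels are illustrative only.
    """
    if not (farmers_dss_only or farmers_update or explorers):
        raise ValueError("the ODS has no users to design for")
    if explorers and not (farmers_dss_only or farmers_update):
        # Used only by explorers: normalize the entire ODS.
        return "normalized"
    if farmers_dss_only and not (farmers_update or explorers):
        # Used only by farmers doing DSS processing: star join throughout.
        return "star join"
    # Updating farmers, or any mix that includes explorers.
    return "some form of normalization"
```

<p>For example, <em>choose_ods_design(True, False, False)</em> yields a star join for the whole ODS, while adding any explorer usage or farmer update moves the answer toward normalization.</p>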
<p>This article has addressed the architectural structure of an ODS and how it is architecturally positioned. The ODS has a dual design objective, which is quite different from other database structures found in the world of DSS and operational systems.<br />
<em>Bill Inmon is universally recognized as the father of the data warehouse. He has more than 35 years of database technology management experience and data warehouse design expertise. His books have been translated into nine languages. He is known globally for his seminars on developing data warehouses and has been a keynote speaker for many major computing associations. For more information, visit <a href="http://www.inmongif.com/">www.inmongif.com</a> and <a href="http://www.inmoncif.com/">www.inmoncif.com</a>. Inmon may be reached at (303) 681-6772.</em></p>
<p>For more information on related topics, visit the following channels:</p>
<ul>
<li><a href="http://blobgle.com/channels/dw_basics.html">DW Basics</a></li>
<li><a href="http://blobgle.com/channels/operational_data_store.html">Operational Data Store</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://blobgle.com/blog/?feed=rss2&amp;p=109</wfw:commentRss>
		</item>
		<item>
		<title>What Is an Operational Data Store (ODS)?</title>
		<link>http://blobgle.com/blog/?p=103</link>
		<comments>http://blobgle.com/blog/?p=103#comments</comments>
		<pubDate>Wed, 23 Jan 2008 12:04:46 +0000</pubDate>
		<dc:creator>jzarate</dc:creator>
		
		<category><![CDATA[Ciencia y Tecnologia]]></category>

		<category><![CDATA[bi]]></category>

		<category><![CDATA[bsc]]></category>

		<category><![CDATA[business intelligence]]></category>

		<category><![CDATA[cargas]]></category>

		<category><![CDATA[datamarting]]></category>

		<category><![CDATA[etl]]></category>

		<category><![CDATA[etling]]></category>

		<category><![CDATA[kimball]]></category>

		<category><![CDATA[modelamiento multidimensional]]></category>

		<category><![CDATA[modelo]]></category>

		<category><![CDATA[Open Source]]></category>

		<category><![CDATA[oracle]]></category>

		<category><![CDATA[PALO]]></category>

		<category><![CDATA[sdd]]></category>

		<guid isPermaLink="false">http://blobgle.com/blog/?p=103</guid>
		<description><![CDATA[According to Bill Inmon, an operational data store (ODS) is a subject-oriented, integrated, volatile, current-valued, detailed-only collection of data in support of an organization&#8217;s need for up-to-the-second, operational, integrated, collective information.
An operational data store (or &#8220;ODS&#8221;) is a database designed to integrate data from multiple sources to facilitate operations, analysis and reporting. Because the data [...]]]></description>
			<content:encoded><![CDATA[<p>According to <a href="http://en.wikipedia.org/wiki/Bill_Inmon" title="Bill Inmon">Bill Inmon</a>, an <strong>operational data store</strong> (<strong>ODS</strong>) is a subject-oriented, integrated, volatile, current-valued, detailed-only collection of data in support of an organization&#8217;s need for up-to-the-second, operational, integrated, collective information.</p>
<p>An <strong>operational data store</strong> (or &#8220;<strong>ODS</strong>&#8221;) is a <a href="http://en.wikipedia.org/wiki/Database" title="Database">database</a> designed to integrate data from multiple sources to facilitate operations, analysis and reporting. Because the <a href="http://en.wikipedia.org/wiki/Data" title="Data">data</a> originates from multiple sources, the integration often involves cleaning, redundancy resolution and business rule enforcement. An ODS is usually designed to contain low-level or atomic (indivisible) data such as transactions and prices, as opposed to aggregated or summarized data such as net contributions. Aggregated data is usually stored in the <a href="http://en.wikipedia.org/wiki/Data_warehouse" title="Data warehouse">data warehouse</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blobgle.com/blog/?feed=rss2&amp;p=103</wfw:commentRss>
		</item>
		<item>
		<title>A Few Notes on Data Warehouse Design</title>
		<link>http://blobgle.com/blog/?p=102</link>
		<comments>http://blobgle.com/blog/?p=102#comments</comments>
		<pubDate>Tue, 22 Jan 2008 16:02:26 +0000</pubDate>
		<dc:creator>jzarate</dc:creator>
		
		<category><![CDATA[blobgle]]></category>

		<category><![CDATA[bi]]></category>

		<category><![CDATA[bsc]]></category>

		<category><![CDATA[business intelligence]]></category>

		<category><![CDATA[cargas]]></category>

		<category><![CDATA[datamarting]]></category>

		<category><![CDATA[etl]]></category>

		<category><![CDATA[etling]]></category>

		<category><![CDATA[kimball]]></category>

		<category><![CDATA[modelamiento multidimensional]]></category>

		<category><![CDATA[modelo]]></category>

		<category><![CDATA[Open Source]]></category>

		<category><![CDATA[oracle]]></category>

		<category><![CDATA[PALO]]></category>

		<category><![CDATA[sdd]]></category>

		<guid isPermaLink="false">http://blobgle.com/blog/?p=102</guid>
		<description><![CDATA[Assuming that a thorough requirements analysis has already been done (a review of existing reports, interviews with key users, and so on) and that it is already clear which subject areas the DWH will focus on, we ask which facts matter most to [...]]]></description>
			<content:encoded><![CDATA[<p>Assuming that a thorough requirements analysis has already been done (a review of existing reports, interviews with key users, and so on) and that it is already clear which subject areas the DWH will focus on, we ask which facts matter most to the company and to the users, the business analysts who will work with the information.<br />
Above all, always keep in mind that the goal of the DWH is not to produce detail reports, but to let the user analyze the information, navigate it, and see business information grouped from different points of view.</p>
<p>Returning to the subject areas: they are usually relatively easy to identify and are closely tied to the main activity of the company's departments. Some examples:<br />
Sales Dept. &#8211;&gt; Sales, Commissions<br />
Purchasing Dept. &#8211;&gt; Purchases, Suppliers<br />
Marketing Dept. &#8211;&gt; Campaigns, Promotions<br />
Accounting Dept. &#8211;&gt; Payments, Expenses<br />
Human Resources Dept. &#8211;&gt; Attendance, Training, Hiring<br />
Logistics Dept. &#8211;&gt; Stock, Distribution<br />
Each subject area becomes a fact table and the center of one star in the Data Warehouse.</p>
<p>Most methodologies recommend starting the Data Warehouse implementation with a single subject area and then extending it with the others, always one at a time. This produces separate, business-oriented Data Marts. Although they are built separately, they will share many dimensions, which are the points of view from which the information is analyzed, and a common repository can be created that grows with each Data Mart. This repository is also known as the ODS (Operational Data Store).<br />
The ODS thus stores all the shared information, already timestamped, and prepares it to feed the Data Marts. This environment is not mandatory either, but it is advisable, because centralizing all corporate information also makes it usable for certain non-analytical reports.</p>
<p>At the logical design level, the priority is to design the star well: define the fact table with all the indicators that matter for the part of the business the Data Mart will cover, and choose the minimum granularity at which the data will be stored. It is worth making an effort not to store more detail than necessary; otherwise disk usage will balloon and queries will run into performance problems. This granularity has to be chosen for each &#8216;key&#8217; of a fact table.<br />
For example, we could design a sales Data Mart with the information grouped by Salesperson, Store, Customer, Product, and Day. Only very unusual analysis needs would justify burdening the system with sales stored at the level of hours or minutes.</p>
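<p>As a rough illustration of choosing the grain, the following Python sketch rolls timestamped sales up to one row per Salesperson, Store, Customer, Product, and Day. The sample rows, the codes such as V01 and T01, and the function name aggregate_to_day are hypothetical and only serve the example.</p>

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical source rows:
# (timestamp, salesperson, store, customer, product, amount)
sales = [
    ("2008-01-22 09:15:00", "V01", "T01", "C01", "P01", 10.0),
    ("2008-01-22 17:40:00", "V01", "T01", "C01", "P01", 5.0),
    ("2008-01-23 11:05:00", "V01", "T01", "C01", "P01", 7.5),
]


def aggregate_to_day(rows):
    """Sum amounts per (salesperson, store, customer, product, day)."""
    totals = defaultdict(float)
    for ts, salesperson, store, customer, product, amount in rows:
        # Drop the time of day: the chosen grain is one calendar day.
        day = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").date()
        totals[(salesperson, store, customer, product, day)] += amount
    return dict(totals)


fact_sales = aggregate_to_day(sales)
for key, total in sorted(fact_sales.items()):
    print(key, total)
```

<p>The two sales made on the same day collapse into a single fact row, which is where the space saving of a coarser grain comes from.</p>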
<p>These same keys already define the starting point of five dimensions; all that remains is to complete each hierarchy with the remaining levels, which are simply groupings of the elements of the level below. It is also worth asking whether the information should be analyzed along any further dimension, such as geography, one of the most common dimensions along with time, which is never missing.</p>
<p>From here on we enter physical design. A structure called the Stage Area is usually created to collect snapshots from the operational systems, keeping the structure exactly as it is in the source system. Next comes the ODS, where the first data transformations are applied and the information starts to be integrated. In the ODS the necessary timestamps are usually created, data and structures are unified, and the information is organized so that it is easy to feed the different stars afterwards. Some entities can begin to be denormalized, grouping some tables by dimension, although the model still looks more like a relational one.</p>
<p>Finally, one star is created for each Data Mart (if the star design technique is followed), along with its dimensions. For the first Data Mart all dimensions will be new; from the second one on, some will start to be shared and possibly redesigned to meet the requirements of the incoming model.</p>
]]></content:encoded>
			<wfw:commentRss>http://blobgle.com/blog/?feed=rss2&amp;p=102</wfw:commentRss>
		</item>
		<item>
		<title>How Do I Store a Two-Dimensional (2D) Array as a Result in a Database in TestStand?</title>
		<link>http://blobgle.com/blog/?p=101</link>
		<comments>http://blobgle.com/blog/?p=101#comments</comments>
		<pubDate>Tue, 22 Jan 2008 16:00:33 +0000</pubDate>
		<dc:creator>jzarate</dc:creator>
		
		<category><![CDATA[Ciencia y Tecnologia]]></category>

		<category><![CDATA[database]]></category>

		<category><![CDATA[oracle]]></category>

		<category><![CDATA[recordset]]></category>

		<category><![CDATA[visual basic]]></category>

		<guid isPermaLink="false">http://blobgle.com/blog/?p=101</guid>
		<description><![CDATA[Taken from: http://digital.ni.com/public.nsf/allkb/F6BA269D2958B065862571A300277074 
Problem:
I have created a custom step that returns a two-dimensional (2D) array of integers as its result. I would like to store this step's result in a TestStand results database. This seems possible with a one-dimensional (1D) array, but how do I do it with a [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Taken from: <a href="http://digital.ni.com/public.nsf/allkb/F6BA269D2958B065862571A300277074">http://digital.ni.com/public.nsf/allkb/F6BA269D2958B065862571A300277074</a> </strong></p>
<p><strong>Problem:</strong><br />
I have created a custom step that returns a two-dimensional (2D) array of integers as its result. I would like to store this step's result in a TestStand results database. This seems possible with a one-dimensional (1D) array, but how do I do it with a two-dimensional (2D) array?</p>
<p><strong>Solution:</strong><br />
TestStand has a method for storing two-dimensional (2D) arrays in a database. It differs from the method used for one-dimensional (1D) arrays, where each element of the array is stored in its own record of the table. To store a two-dimensional (2D) array, you must store it in a binary column of your table. To do this, create a new table with the following properties:</p>
<ul>
<li><strong>Type:</strong> Recordset</li>
<li><strong>Command Text:</strong> &#8220;SELECT * from [Table Name]&#8221;</li>
<li><strong>Apply to:</strong> Step Result</li>
<li><strong>Types to Log:</strong> [Step type with which you acquire your two-dimensional (2D) array]</li>
<li><strong>Lock Type:</strong> Optimistic</li>
</ul>
<p>The remaining table properties can be left at their default values. Once you have your table, add an ID column as the primary key and a second column that will hold the two-dimensional (2D) array. This column should have the following properties:</p>
<ul>
<li><strong>Type:</strong> Binary</li>
<li><strong>Size:</strong> 1.5 times the size of the array in bytes</li>
<li><strong>Expected Properties:</strong> Logging.StepResult.[Property that contains the array]</li>
<li><strong>Expression:</strong> Logging.StepResult.[Property that contains the array]</li>
</ul>
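<p>The Size value can be estimated with simple arithmetic. The sketch below is only an illustration in Python: the function name binary_column_size and the assumption of 4 bytes per integer element are not taken from the TestStand documentation.</p>

```python
import math


def binary_column_size(rows, cols, bytes_per_element=4, factor=1.5):
    """Estimate the binary column size as factor x the raw array size.

    bytes_per_element=4 assumes 32-bit integers (an assumption made
    for this example, not a TestStand requirement).
    """
    array_bytes = rows * cols * bytes_per_element
    return math.ceil(array_bytes * factor)


# A 10 x 10 array of 32-bit integers occupies 400 bytes of raw data.
print(binary_column_size(10, 10))
```

<p>For a different element type, pass the matching bytes_per_element value.</p>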
<p>The remaining properties can be left at their default values.</p>
<p>To view the values stored in the database, open the Database Viewer (<strong>Tools » Database Viewer</strong>) on the table you just created. The field that contains the arrays will show a value of &#8220;Binary Data&#8221;. Right-click a value and select <strong>Evaluate Data</strong>. The View Binary Data window lets you see all the values stored in the array.</p>
]]></content:encoded>
			<wfw:commentRss>http://blobgle.com/blog/?feed=rss2&amp;p=101</wfw:commentRss>
		</item>
		<item>
		<title>Who Will Buy MicroStrategy?</title>
		<link>http://blobgle.com/blog/?p=100</link>
		<comments>http://blobgle.com/blog/?p=100#comments</comments>
		<pubDate>Sun, 25 Nov 2007 00:54:09 +0000</pubDate>
		<dc:creator>jzarate</dc:creator>
		
		<category><![CDATA[Curiosidades]]></category>

		<category><![CDATA[Informatica]]></category>

		<category><![CDATA[Juegos]]></category>

		<category><![CDATA[bi]]></category>

		<category><![CDATA[bsc]]></category>

		<category><![CDATA[business intelligence]]></category>

		<category><![CDATA[cargas]]></category>

		<category><![CDATA[datamarting]]></category>

		<category><![CDATA[etl]]></category>

		<category><![CDATA[etling]]></category>

		<category><![CDATA[kimball]]></category>

		<category><![CDATA[modelamiento multidimensional]]></category>

		<category><![CDATA[modelo]]></category>

		<category><![CDATA[Open Source]]></category>

		<category><![CDATA[oracle]]></category>

		<category><![CDATA[PALO]]></category>

		<category><![CDATA[sdd]]></category>

		<guid isPermaLink="false">http://blobgle.com/blog/?p=100</guid>
		<description><![CDATA[
Since we are in the middle of a wave of Business Intelligence tool acquisitions, Bitool.com and the Datamarting Institute, together with Blobgle.com, have launched this fun bet: US$1,000 in cash will be raffled among everyone who guesses correctly, so all you have to do is submit your bets through this blog&#8230;
Good luck, everyone.
]]></description>
			<content:encoded><![CDATA[
<p>Since we are in the middle of a wave of Business Intelligence tool acquisitions, Bitool.com and the Datamarting Institute, together with Blobgle.com, have launched this fun bet: US$1,000 in cash will be raffled among everyone who guesses correctly, so all you have to do is submit your bets through this blog&#8230;</p>
<p>Good luck, everyone.</p>
]]></content:encoded>
			<wfw:commentRss>http://blobgle.com/blog/?feed=rss2&amp;p=100</wfw:commentRss>
		</item>
		<item>
		<title>Datamarting Day 2007 - Peru</title>
		<link>http://blobgle.com/blog/?p=99</link>
		<comments>http://blobgle.com/blog/?p=99#comments</comments>
		<pubDate>Sun, 25 Nov 2007 00:25:45 +0000</pubDate>
		<dc:creator>jzarate</dc:creator>
		
		<category><![CDATA[Informatica]]></category>

		<category><![CDATA[bi]]></category>

		<category><![CDATA[bsc]]></category>

		<category><![CDATA[business intelligence]]></category>

		<category><![CDATA[cargas]]></category>

		<category><![CDATA[datamarting]]></category>

		<category><![CDATA[etl]]></category>

		<category><![CDATA[etling]]></category>

		<category><![CDATA[kimball]]></category>

		<category><![CDATA[modelamiento multidimensional]]></category>

		<category><![CDATA[modelo]]></category>

		<category><![CDATA[Open Source]]></category>

		<category><![CDATA[oracle]]></category>

		<category><![CDATA[PALO]]></category>

		<category><![CDATA[sdd]]></category>

		<guid isPermaLink="false">http://blobgle.com/blog/?p=99</guid>
		<description><![CDATA[
By: Datamarting Institute
Datamarting Day 2007 took place in Peru on November 16, 2007. This event, held year after year in different cities in Peru and Bolivia, covers how to correctly implement an analytical Business Intelligence project, with seminars given by notable experts in the field.
This year [...]]]></description>
			<content:encoded><![CDATA[
<p>By: Datamarting Institute</p>
<p>Datamarting Day 2007 took place in Peru on November 16, 2007. This event, held year after year in different cities in Peru and Bolivia, covers how to correctly implement an analytical Business Intelligence project, with seminars given by notable experts in the field.</p>
<p>This year's speakers came from companies including Arson Group (Peru), IdeaSoft (Uruguay), and Urudata (Uruguay).</p>
<p>We hope that in 2008 the event will be held in Santiago (Chile) and Lima (Peru).</p>
]]></content:encoded>
			<wfw:commentRss>http://blobgle.com/blog/?feed=rss2&amp;p=99</wfw:commentRss>
		</item>
		<item>
		<title>JPALO Open Source</title>
		<link>http://blobgle.com/blog/?p=98</link>
		<comments>http://blobgle.com/blog/?p=98#comments</comments>
		<pubDate>Sun, 25 Nov 2007 00:06:23 +0000</pubDate>
		<dc:creator>jzarate</dc:creator>
		
		<category><![CDATA[Informatica]]></category>

		<category><![CDATA[bi]]></category>

		<category><![CDATA[bsc]]></category>

		<category><![CDATA[business intelligence]]></category>

		<category><![CDATA[cargas]]></category>

		<category><![CDATA[datamarting]]></category>

		<category><![CDATA[etl]]></category>

		<category><![CDATA[etling]]></category>

		<category><![CDATA[kimball]]></category>

		<category><![CDATA[modelamiento multidimensional]]></category>

		<category><![CDATA[modelo]]></category>

		<category><![CDATA[Open Source]]></category>

		<category><![CDATA[oracle]]></category>

		<category><![CDATA[PALO]]></category>

		<category><![CDATA[sdd]]></category>

		<guid isPermaLink="false">http://blobgle.com/blog/?p=98</guid>
		<description><![CDATA[
A few days ago I found this interesting Open Source tool called PALO 2.0. The tool is a free pivot viewer that runs on a multidimensional engine written in Java, and this piece of software is German-made. PALO is a cell-oriented, multidimensional engine specifically designed to display information from Excel, for all kinds of [...]]]></description>
			<content:encoded><![CDATA[
<p><a href="http://blobgle.com/blog/wp-content/uploads/2007/11/palo.JPG" title="palo.JPG"><img src="http://blobgle.com/blog/wp-content/uploads/2007/11/palo.JPG" alt="palo.JPG" /></a></p>
<p>A few days ago I found this interesting Open Source tool called PALO 2.0. The tool is a free pivot viewer that runs on a multidimensional engine written in Java, and this piece of software is German-made. <strong>PALO</strong> is a cell-oriented, multidimensional engine specifically designed to display information from Excel, for all kinds of analysis. There are also Eclipse-based and web versions, which we will comment on later.</p>
<p>The data is stored hierarchically (multidimensionally), which allows very fast queries. It supports ‘write back’, which makes budgeting, simulations, and the creation of all kinds of new scenarios possible.</p>
<p>If you want to try it, go to this URL:</p>
<p><a href="http://www.tensegrity-services.com:8080/web-palo">http://www.tensegrity-services.com:8080/web-palo</a><br />
User: guest<br />
Password: pass</p>
<p>To be honest, when I opened the URL above, the product looked good at first glance: it has a nice appearance, is easy to understand, and above all seems to work well.</p>
<p>It provides the basic features every viewer should have. At first glance it is still in its infancy compared with the top paid tools on the market (MicroStrategy, BO, Cognos, BIquery, O3 Business Performance, etc.); however, it does 90% of what users actually use, so it is worth evaluating before buying a product. In the next few days I am going to watch a video I found at this address (<a href="http://www.jedox.com/assets/files/downloads/misc/videotour1/index.html">http://www.jedox.com/assets/files/downloads/misc/videotour1/index.html</a>), then install it (<a href="http://www.jedox.com/en/enterprise-spreadsheet-server/excel-olap-server/palo-server_download.html">http://www.jedox.com/en/enterprise-spreadsheet-server/excel-olap-server/palo-server_download.html</a>) and report on my experience.</p>
<p>I hope to receive comments from anyone who has tried it.</p>
]]></content:encoded>
			<wfw:commentRss>http://blobgle.com/blog/?feed=rss2&amp;p=98</wfw:commentRss>
		</item>
		<item>
		<title>Multi-Date Analysis in a Fact Table</title>
		<link>http://blobgle.com/blog/?p=95</link>
		<comments>http://blobgle.com/blog/?p=95#comments</comments>
		<pubDate>Wed, 21 Nov 2007 12:30:06 +0000</pubDate>
		<dc:creator>jzarate</dc:creator>
		
		<category><![CDATA[Informatica]]></category>

		<category><![CDATA[bi]]></category>

		<category><![CDATA[bsc]]></category>

		<category><![CDATA[business intelligence]]></category>

		<category><![CDATA[cargas]]></category>

		<category><![CDATA[datamarting]]></category>

		<category><![CDATA[etl]]></category>

		<category><![CDATA[etling]]></category>

		<category><![CDATA[kimball]]></category>

		<category><![CDATA[modelamiento multidimensional]]></category>

		<category><![CDATA[modelo]]></category>

		<category><![CDATA[oracle]]></category>

		<category><![CDATA[sdd]]></category>

		<guid isPermaLink="false">http://blobgle.com/blog/?p=95</guid>
		<description><![CDATA[
by: Jose Zarate Sousa  
Problem
It is common for a fact table to contain a variety of dates, for example: Request Date, Delivery Date, Order Date, Invoice Date, Collection Date, etc.
These dates are too valuable to leave out of the model. For example, the difference in days between the [...]]]></description>
			<content:encoded><![CDATA[
<h5>by: Jose Zarate Sousa  </h5>
<h2>Problem</h2>
<p>It is common for a fact table to contain a variety of dates, for example: Request Date, Delivery Date, Order Date, Invoice Date, Collection Date, etc.</p>
<p>These dates are too valuable to leave out of the model. For example, the difference in days between the delivery or dispatch date and the invoice date lets us measure the level of customer service, and the difference between the invoice date and the collection date tells us the financing period.</p>
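<p>As a small illustration of the measures these dates make possible, the following Python sketch computes both differences for a single sale. The dictionary keys and the sample dates are hypothetical.</p>

```python
from datetime import date

# Hypothetical dates for one sale (keys mirror the Spanish date names).
sale = {
    "fecha_despacho": date(2007, 11, 5),
    "fecha_facturacion": date(2007, 11, 8),
    "fecha_cobranza": date(2007, 12, 8),
}


def days_between(start, end):
    """Return the whole number of days from start to end."""
    return (end - start).days


# Days from dispatch to invoice: a proxy for customer service level.
service_lag = days_between(sale["fecha_despacho"], sale["fecha_facturacion"])
# Days from invoice to collection: the financing period.
financing_days = days_between(sale["fecha_facturacion"], sale["fecha_cobranza"])
print(service_lag, financing_days)
```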
<h2>Solution</h2>
<p>First of all, the dates must be grouped by context, which means they &#8220;could&#8221; belong to one of the dimensions or be data from another fact table. For example:<br />
 Request Date (Fecha de Pedido) =&gt; Date on which the customer formally requests the product.<br />
 Delivery Date (Fecha de Entrega) =&gt; Date on which the customer receives the product.<br />
 Order Date (Fecha de Orden) =&gt; Date on which the order is issued in the system.<br />
 Invoice Date (Fecha de Facturacion) =&gt; Date on which the invoice is issued to the customer.<br />
 Collection Date (Fecha de Cobranza) =&gt; Date on which the collection notice is sent to the customer.<br />
 Dispatch Date (Fecha de Despacho) =&gt; Date on which the product leaves the warehouse.<br />
 Payment Date (Fecha de Pago) =&gt; Date on which the customer makes the payment.<br />
 Due Date (Fecha de Vencimiento) =&gt; Date on which the invoice falls due.<br />
 Credit Line Approval Date (Fecha de Aprobacion de Linea) =&gt; Date on which a credit line was approved for the customer.<br />
Let us see what happens if we group them by business process:</p>
<p> Requests =&gt; Request Date</p>
<p> Orders =&gt; Order Date</p>
<p> Delivery =&gt; Delivery Date</p>
<p> Sales =&gt; Invoice Date</p>
<p> Collections =&gt; Collection Date</p>
<p> Warehouse =&gt; Dispatch Date</p>
<p> Payments =&gt; Payment Date</p>
<p> Receivables =&gt; Due Date<br />
  <br />
 <strong>What happened?</strong></p>
<p> Each date falls into a different business process.</p>
<p><strong>Why?</strong><br />
 Because each date represents a different business process that takes place over time.<br />
So&#8230;</p>
<p>If each business process has its own analysis dimensions (its dates among them) and its own indicators, there should be no more than one date entry in the fact table; in other words, one fact table should be created per business process.</p>
<p>In principle this is 100% true, but in practice it is often not possible to build that many fact tables for lack of time, resources, or motivation. Performance or disk space constraints may also lead to fewer fact tables being created and to unifying them into one. In that case the dates should be grouped by context as follows:<br />
 SALES TRANSACTION<br />
  Request Date<br />
  Order Date<br />
  Delivery Date<br />
  Invoice Date<br />
  Dispatch Date<br />
  <br />
 COLLECTIONS<br />
  Collection Date<br />
  Payment Date<br />
  Due Date<br />
Let us look at the different ways to design it:</p>
<p><font color="#800000">Solution 1:</font></p>
<p>Suppose we want to design it as a single fact table. The model would look like this:<br />
 ========================================<br />
 FACT_VENTAS<br />
 ========================================<br />
 ClienteID : Grans<br />
 ProductoID : Grans<br />
 ZonaVentaID : Grans <br />
     .<br />
     .<br />
     .  : All Grans</p>
<p> Fecha_Pedido_ID      <br />
 Fecha_Orden_ID   <br />
 Fecha_Entrega_ID  <br />
 Fecha_Facturacion_ID  <br />
 Fecha_Despacho_ID  <br />
 Fecha_Cobranza_ID  <br />
 Fecha_Pago_ID   <br />
 Fecha_Vencimiento_ID <br />
Advantages:<br />
 . Fast query times.<br />
 . Aggregations are easy to create.<br />
Disadvantages:<br />
 . The fact table is hard to load, because the date values may come from different source tables.<br />
 . More disk space is used, because rarely queried information is stored.<br />
 . Risk of reporting two different figures, because with so many dates the user could get confused.<br />
 . If a Collections fact table is also created, the browser and/or the user may get confused.</p>
<p><font color="#800000">Solution 2:</font></p>
<p>Suppose we want to design it as a single fact table but handle the dates as junk dimensions. The model would look like this:</p>
<p> ========================================<br />
 DIM_TRANSACCION_VENTAS<br />
 ========================================<br />
 Transaccion_SK<br />
      .<br />
 Fecha_Pedido_ID      <br />
 Fecha_Orden_ID   <br />
 Fecha_Entrega_ID  <br />
 Fecha_Facturacion_ID  <br />
 Fecha_Despacho_ID  <br />
 Fecha_Cobranza_ID  <br />
 Fecha_Pago_ID   <br />
 Fecha_Vencimiento_ID<br />
 ========================================<br />
 FACT_VENTAS<br />
 ========================================<br />
 ClienteID : Grans<br />
 ProductoID : Grans<br />
 ZonaVentaID : Grans <br />
     .<br />
     .<br />
     .  : All Grans<br />
   <br />
 Transaccion_Sk</p>
<p>Note that the Transaction dimension can include certain junk attributes that are part of the transaction itself (be careful not to include degenerate dimensions such as Num_Pedido, Num_Factura, etc., which could harm the model).<br />
Advantages:<br />
 . Aggregations are easy to create.<br />
 . Less disk space is used, because many fields are grouped into a single 4-byte field.<br />
 . The data load process (ETL) is simpler.<br />
 . The user understands the model better because the dates are grouped in one context.</p>
<p>Disadvantages:<br />
 . Query time depends on the aggregates that are created.</p>
<p><font color="#800000">Solution 3:</font></p>
<p>This is really an improvement on Solution 2, in which the dates are grouped by the analysis context they belong to. For example, this same model will probably also need the ID of the employee who collected the money, the ID of the warehouse that dispatched the product, etc.</p>
<p>As we can see, none of these attributes is directly related to sales, so they are junk attributes that should be grouped by context as follows:<br />
 ========================================<br />
 DIM_TRANSACCION_VENTAS<br />
 ========================================<br />
 Transaccion_SK<br />
      .<br />
 Fecha_Entrega_ID  <br />
 Fecha_Facturacion_ID  <br />
 Fecha_Vencimiento_ID<br />
 ========================================<br />
 DIM_COBRANZA<br />
 ========================================<br />
 Cobranza_SK<br />
      .<br />
 CobradorID<br />
      .<br />
      .  <br />
 Fecha_Pago_ID   <br />
 Fecha_Cobranza_ID  <br />
 ========================================<br />
 DIM_ENTREGA<br />
 ========================================<br />
 Entrega_SK<br />
      .<br />
 AlmacenID<br />
 AlmaceneroID<br />
 SectorID<br />
 VehiculoID<br />
      .<br />
      .  <br />
 Fecha_Entrega_ID  <br />
 Fecha_Despacho_ID  <br />
 </p>
<p> ========================================<br />
 FACT_VENTAS<br />
 ========================================<br />
 ClienteID : Grans<br />
 ProductoID : Grans<br />
 ZonaVentaID : Grans <br />
     .<br />
     .<br />
     .  : All Grans<br />
   <br />
 Transaccion_Sk<br />
 Entrega_Sk<br />
 Cobranza_Sk</p>
<p align="center"><font color="#333399"><em>NOTE: Remember that grouping junk dimensions by context helps enrich the model with greater analytical capability.</em></font><br />
 </p>
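<p>To make Solution 3 more concrete, here is a minimal sketch using Python's sqlite3 module with an in-memory database. It is only an illustration under several assumptions: the dates are stored as ISO text instead of date IDs, the monto column and all sample values are invented, and only a few of the attributes listed above are included.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Context dimensions (reduced; dates kept as ISO text for simplicity).
CREATE TABLE dim_transaccion_ventas (
    transaccion_sk INTEGER PRIMARY KEY,
    fecha_facturacion TEXT,
    fecha_vencimiento TEXT
);
CREATE TABLE dim_cobranza (
    cobranza_sk INTEGER PRIMARY KEY,
    cobrador_id INTEGER,
    fecha_pago TEXT,
    fecha_cobranza TEXT
);
CREATE TABLE dim_entrega (
    entrega_sk INTEGER PRIMARY KEY,
    almacen_id INTEGER,
    fecha_entrega TEXT,
    fecha_despacho TEXT
);
-- The fact table carries one surrogate key per context dimension.
CREATE TABLE fact_ventas (
    cliente_id INTEGER,
    producto_id INTEGER,
    zona_venta_id INTEGER,
    transaccion_sk INTEGER REFERENCES dim_transaccion_ventas,
    entrega_sk INTEGER REFERENCES dim_entrega,
    cobranza_sk INTEGER REFERENCES dim_cobranza,
    monto REAL
);
INSERT INTO dim_transaccion_ventas VALUES (1, '2007-11-08', '2007-12-08');
INSERT INTO dim_cobranza VALUES (1, 7, '2007-12-10', '2007-12-08');
INSERT INTO dim_entrega VALUES (1, 3, '2007-11-06', '2007-11-05');
INSERT INTO fact_ventas VALUES (100, 200, 300, 1, 1, 1, 250.0);
""")

# Join the fact row to its context dimensions and derive two date measures:
# days from dispatch to invoice, and days from invoice to payment.
row = conn.execute("""
    SELECT f.monto,
           CAST(julianday(t.fecha_facturacion)
                - julianday(e.fecha_despacho) AS INTEGER),
           CAST(julianday(c.fecha_pago)
                - julianday(t.fecha_facturacion) AS INTEGER)
    FROM fact_ventas f
    JOIN dim_transaccion_ventas t ON t.transaccion_sk = f.transaccion_sk
    JOIN dim_entrega e ON e.entrega_sk = f.entrega_sk
    JOIN dim_cobranza c ON c.cobranza_sk = f.cobranza_sk
""").fetchone()
print(row)
```

<p>Each analysis context adds a single key to the fact table, while the dates and related attributes stay in their own dimension.</p>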
]]></content:encoded>
			<wfw:commentRss>http://blobgle.com/blog/?feed=rss2&amp;p=95</wfw:commentRss>
		</item>
		<item>
		<title>Using Bridge Tables in BI</title>
		<link>http://blobgle.com/blog/?p=94</link>
		<comments>http://blobgle.com/blog/?p=94#comments</comments>
		<pubDate>Tue, 20 Nov 2007 15:11:30 +0000</pubDate>
		<dc:creator>jzarate</dc:creator>
		
		<category><![CDATA[Informatica]]></category>

		<guid isPermaLink="false">http://blobgle.com/blog/?p=94</guid>
		<description><![CDATA[
By: Jose Zarate Sousa
Problem:
When designing a multidimensional data model, we find that many users want to group some tables differently, for example: products, zones, customers, etc.
There is no doubt that standardizing the groupings of these dimensions for all users across different areas is almost impossible. Therefore [...]]]></description>
			<content:encoded><![CDATA[
<h4>By: Jose Zarate Sousa</h4>
<h3><font color="#800000">Problem:</font></h3>
<p>When designing a multidimensional data model, we find that many users want to group some tables differently, for example: products, zones, customers, etc.</p>
<p>There is no doubt that trying to standardize the groupings of these dimensions for all users across different areas is nearly impossible. Therefore, if we cannot create a &#8220;corporate data grouping&#8221;(1), we must build all of the groupings.</p>
<p>At this point we might have identified two or three different groupings; however, while reviewing them with the users, they will tell you:</p>
<ul>
<li> Oh, but I forgot to mention that we change these groupings every year.</li>
<li> For the record, we sometimes create other groupings for products or zones.</li>
<li> We would like to keep the new grouping without losing the old one.</li>
<li> Note that if we &#8220;regroup&#8221; the products, we want to compare them under both the new grouping and the old one.</li>
<li> What happens to the aggregate, snapshot, and partitioned tables?</li>
</ul>
<h3><font color="#800000">Solution:</font></h3>
<p>Data model designers usually choose one of these approaches:<br />
MODE 1.- Create every grouping explicitly, that is, create one structure per grouping that is required. For example, suppose we want to group customers by zone, and marketing and operations group the zones differently; the resulting dimension then carries one grouping column per area:</p>
<p>DIMENSION_CLIENTE</p>
<p>CLIENTESK<br />
CLIENTEDESC  <br />
GRUPO_ZONA_MKT_ID : Marketing grouping<br />
GRUPO_ZONA_OPE_ID : Operations grouping<br />
GRUPO_ZONA_REP_ID : Grouping for corporate reports</p>
<p>Disadvantages:</p>
<p>1.- It cannot handle an additional grouping except by adding columns.<br />
2.- It cannot handle historical regrouping except by adding columns.</p>
<p>Advantages:</p>
<p>. Response time is very good because queries go directly to the data, which can be pre-aggregated by that grouping.<br />
. It makes it easy to cross data between groupings.<br />
. Each area's grouping can have a different number of grouping levels.</p>
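<p>As a rough illustration of MODE 1, the sketch below builds DIMENSION_CLIENTE with one column per grouping and totals sales under two of them. It uses Python's built-in sqlite3 module only for convenience. The FACT_VENTAS layout, the MONTO measure, the sample rows, and the sales_by helper are assumptions made for this example, not part of the original model.</p>

```python
import sqlite3

# In-memory database; all rows below are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DIMENSION_CLIENTE (
    CLIENTESK         INTEGER PRIMARY KEY,
    CLIENTEDESC       TEXT,
    GRUPO_ZONA_MKT_ID TEXT,  -- Marketing grouping
    GRUPO_ZONA_OPE_ID TEXT,  -- Operations grouping
    GRUPO_ZONA_REP_ID TEXT   -- Grouping for corporate reports
);
-- MONTO is an assumed sales-amount measure.
CREATE TABLE FACT_VENTAS (CLIENTESK INTEGER, MONTO REAL);

INSERT INTO DIMENSION_CLIENTE VALUES
    (1, 'Customer A', 'MKT-NORTH', 'OPE-1', 'REP-X'),
    (2, 'Customer B', 'MKT-NORTH', 'OPE-2', 'REP-X'),
    (3, 'Customer C', 'MKT-SOUTH', 'OPE-2', 'REP-Y');
INSERT INTO FACT_VENTAS VALUES (1, 100.0), (2, 50.0), (3, 25.0);
""")

# Only these columns may be interpolated into the SQL text.
GROUPING_COLUMNS = {
    "GRUPO_ZONA_MKT_ID",
    "GRUPO_ZONA_OPE_ID",
    "GRUPO_ZONA_REP_ID",
}


def sales_by(group_column):
    """Return total MONTO per value of one grouping column."""
    if group_column not in GROUPING_COLUMNS:
        raise ValueError("unknown grouping column: " + group_column)
    rows = conn.execute(
        f"SELECT d.{group_column}, SUM(f.MONTO) "
        "FROM FACT_VENTAS f "
        "JOIN DIMENSION_CLIENTE d ON d.CLIENTESK = f.CLIENTESK "
        f"GROUP BY d.{group_column} ORDER BY d.{group_column}"
    ).fetchall()
    return dict(rows)


by_marketing = sales_by("GRUPO_ZONA_MKT_ID")
by_operations = sales_by("GRUPO_ZONA_OPE_ID")
print(by_marketing)
print(by_operations)
```

<p>Each new grouping in this sketch would require another column in the dimension and another entry in the whitelist, which is the limitation listed under the disadvantages.</p>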
<p>MODE 2.- Create a bridge table, that is, create one table that holds the different hierarchies to be defined and, separately, a bridge table that holds the relationship for the grain that is going to be grouped, for example:</p>
<p>DIMENSION_CLIENTE<br />
================<br />
CLIENTEID<br />
CLIENTEDESC<br />
CODIGOPOSTAL : Grain for the bridge</p>
<p>JERARQUIASCLIENTEZONA<br />
================<br />
CODIGOPOSTAL<br />
JERARQUIA_ID<br />
SUBGRUPO_ID<br />
GRUPO_ID<br />
    .<br />
    .<br />
    .</p>
<p>DIM_JERARQUIA<br />
================<br />
JERARQUIA_ID<br />
JERARQUIA_DESC<br />
JERARQUIA_OWNER<br />
JERARQUIA_DATE</p>
<p>ROLAP CASE: These tables should be used only with the detail-level dimensions. For aggregate, partitioned, or snapshot fact tables, the groupings should be built in directly to get a better response time, and those tables must be rebuilt whenever a reprocess occurs.<br />
If you want to avoid reprocessing after a regrouping, create aggregates at each grain so that they never need to be reprocessed.</p>
<p>Disadvantages:</p>
<p>1.- If the aggregates are not built correctly, queries risk going directly to the detail fact table.<br />
2.- Crossing groupings can be slower unless the right aggregates are created.</p>
<p>Advantages:</p>
<p>. Handles &#8220;N&#8221; groupings<br />
. Supports reprocessing<br />
. Avoids rebuilding the aggregate and snapshot tables<br />
. Can preserve history</p>
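<p>A minimal sketch of MODE 2 follows, again using Python's built-in sqlite3 module. It loads two versions of a zone hierarchy into the bridge table JERARQUIASCLIENTEZONA and totals the same sales under each one. The FACT_VENTAS layout, the MONTO measure, the postal codes, the hierarchy rows, and the sales_by_group helper are assumptions made for this example.</p>

```python
import sqlite3

# In-memory database; all rows below are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DIMENSION_CLIENTE (
    CLIENTEID    INTEGER PRIMARY KEY,
    CLIENTEDESC  TEXT,
    CODIGOPOSTAL TEXT              -- grain for the bridge
);
CREATE TABLE DIM_JERARQUIA (
    JERARQUIA_ID    INTEGER PRIMARY KEY,
    JERARQUIA_DESC  TEXT,
    JERARQUIA_OWNER TEXT,
    JERARQUIA_DATE  TEXT
);
-- Bridge table: one row per (grain value, hierarchy).
CREATE TABLE JERARQUIASCLIENTEZONA (
    CODIGOPOSTAL TEXT,
    JERARQUIA_ID INTEGER,
    SUBGRUPO_ID  TEXT,
    GRUPO_ID     TEXT,
    PRIMARY KEY (CODIGOPOSTAL, JERARQUIA_ID)
);
-- MONTO is an assumed sales-amount measure.
CREATE TABLE FACT_VENTAS (CLIENTEID INTEGER, MONTO REAL);

INSERT INTO DIMENSION_CLIENTE VALUES
    (1, 'Customer A', 'P01'),
    (2, 'Customer B', 'P02'),
    (3, 'Customer C', 'P03');
INSERT INTO DIM_JERARQUIA VALUES
    (1, 'Marketing zones 2007', 'Marketing', '2007-01-01'),
    (2, 'Marketing zones 2008', 'Marketing', '2008-01-01');
-- Hierarchy 2 moves postal code P02 from NORTH to SOUTH;
-- hierarchy 1 is left untouched, so the old grouping is kept.
INSERT INTO JERARQUIASCLIENTEZONA VALUES
    ('P01', 1, 'SUB-A', 'NORTH'),
    ('P02', 1, 'SUB-B', 'NORTH'),
    ('P03', 1, 'SUB-C', 'SOUTH'),
    ('P01', 2, 'SUB-A', 'NORTH'),
    ('P02', 2, 'SUB-B', 'SOUTH'),
    ('P03', 2, 'SUB-C', 'SOUTH');
INSERT INTO FACT_VENTAS VALUES (1, 100.0), (2, 50.0), (3, 25.0);
""")


def sales_by_group(jerarquia_id):
    """Return total MONTO per GRUPO_ID under a single hierarchy."""
    rows = conn.execute(
        """
        SELECT b.GRUPO_ID, SUM(f.MONTO)
        FROM FACT_VENTAS f
        JOIN DIMENSION_CLIENTE c ON c.CLIENTEID = f.CLIENTEID
        JOIN JERARQUIASCLIENTEZONA b ON b.CODIGOPOSTAL = c.CODIGOPOSTAL
        WHERE b.JERARQUIA_ID = ?
        GROUP BY b.GRUPO_ID
        ORDER BY b.GRUPO_ID
        """,
        (jerarquia_id,),
    ).fetchall()
    return dict(rows)


old_grouping = sales_by_group(1)
new_grouping = sales_by_group(2)
print(old_grouping)
print(new_grouping)
```

<p>In this sketch a regrouping only adds rows to DIM_JERARQUIA and to the bridge table, so the dimension and the detail fact are left alone. The query must always filter on a single JERARQUIA_ID; otherwise each fact row would join to one bridge row per hierarchy and the totals would be counted more than once.</p>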
<p>(1) I first heard this term from a BI consultant at a Peruvian bank.</p>
]]></content:encoded>
			<wfw:commentRss>http://blobgle.com/blog/?feed=rss2&amp;p=94</wfw:commentRss>
		</item>
	</channel>
</rss>
