Data   Model   Overview     
          Components and Outline Functionality  






A Data Model consists of the following components:


*  Data Sets

*  Data Structures

*  Parameters

*  Lists of Values

*  Flexfields

*  Bursting Definitions

*  Event Triggers

*  Data Properties

*  Sample XML


For an overview of the purposes they serve and how they are interrelated see Figure 1.

      Figure 1


Solid lines show run-time flows, while dashed lines show design-time references. The heavy solid lines denote the key flows during report generation.


This is a “woods-for-trees” article that attempts to provide a quick overview of the components so that you can gain an understanding of what’s possible and what’s not. Each individual data model component will be considered in more details in subsequent articles.



Data Sets


A Data Set consists of a query, or equivalent instructions, for extracting data from a single data source. While the input to the query will be data source dependent and will often be complex (such as multiple table joins), the Query Row Set produced always consists of a simple collection of rows, or records, that contain column values (think of rows queried from a single database table or records retrieved from a single file).



Data Structures


A Data Structure (or Output) consists of a set of instructions to consolidate and enhance the query row sets extracted from multiple data sources to produce a suitably formatted XML file. Query row sets can be merged, split into groups, filtered, and sorted. Aggregated calculations, such as sums and averages, can be applied to the column values within a particular group and used to create a new column within the parent group or as an overall report total. Calculations performed on the column values within the rows of a particular group can be used to create a new column within that group. Defaults may be substituted for null values, and customized aliases can be given to the query row set columns when they appear as elements within the XML output file. Display or business names can be given to query row set columns that act as defaults when the XML is subsequently used to format and layout reports. The formatted XML is then passed on to the formatting engines and used to produce an appropriate collection of report instances.


So, at its core, creating a data model consists of preparing one or more queries and then specifying any additional data restructuring that may be required. For reasons of efficiency, it is best to minimize the amount of restructuring performed by BI Publisher, and to do that restructuring within the data source where possible. In the case of a file data source or when data must be drawn from two or more data sources, most of the restructuring will, of necessity, have to be performed by BI Publisher. However, when sourcing data from a single relational database, such as the Oracle RDBMS, it should be possible to do almost all of the work within the data source: joining tables, filtering and sorting rows, computing aggregates and calculations, and assigning null value defaults, so the creation of customized display / business aliases may be the only data structure amendments that will need to be performed by BI Publisher (XML element naming restrictions mean that XML aliases are usually not best suited for use as display names).





Flexfields (or lexicals) are only relevant if data is being sourced from Oracle Applications. These ERP applications are used by businesses to implement non-core functionality, for which the cost of a bespoke development could not be justified – for example, an accounting package for use by any business other than a firm of accountants. However, a single package can never fit any one business perfectly, so the key issue with these ERP applications is how to allow them to be customized or tailored to implement a client’s business practices at a low cost. Oracle’s solution to this problem involves the concept of the flexfield, a field than appears to be a single field, but which actually consists of disparate segments that are stored separately.


When designing a data model that draws data from Oracle Applications a developer needs to inform BI Publisher that a particular query select column is actually a reference to a flexfield (by adding a “&” character as a prefix), and by specifying elsewhere the details of that flexfield, such as its Application Short Name, ID Flex Name, and ID Flex Number. These flexfield details are stored in the FlexFields (Lexicals) component (in Figure 1, the arrow pointing from Flexfields towards Data Sets indicates that the query has been modified to contain flexfield lexical references and that BI Publisher has to lookup each flexfield reference when preparing a runnable query).



Sample XML


Whereas all the other components of a Data Model consist of metadata - data about data - the Sample XML component is different in that it consists of actual XML data. Its purpose is twofold: to facilitate data model testing and report design.


As BI Publisher is a large and complex product it is important to ensure that the data extraction components are working correctly, independently of the report formatting components. The BI Publisher GUI provides a means for running a data model on a standalone basis (often with a very restrictive filter in place to ensure that only a modest number of rows are returned) and then viewing the resulting XML to ensure that the data model components have been correctly specified.


If data model testing were all that was required then there would be no need to save the Sample XML (as indicted in Figure 1). However, when it comes to designing a report it is much easier if the report fields do not appear as empty blocks on the screen, but instead contain representative data values. Doing so ensures that each field is correctly mapped to the relevant XML element. It also ensures that formatting issues can be resolved during design as the use of sample data ensures that fields can be repositioned on the screen based on representative field lengths. So, to facilitate report design, the Sample XML is attached to the data model and can be imported into the report design component when building any reports that are based upon it.



Parameters and Lists of Values


Many reports are dynamic in nature, in that the exact nature of the data required cannot be determined exclusively from the data held in data sources, but depends in addition on user input. Report parameters are variables that are used to dynamically customize a report’s output, so that different report instances can be produced on each run if required. In Figure 1, the dashed line between Parameters and Data Sets indicates that data set queries contain references to run-time parameter values (bind variables prefixed by colons). The Lists of Values (or Value Sets) component contains queries, each of which returns a list of values that is associated with a particular parameter. At run-time, the relevant queries are run and the associated parameter value lists are displayed on the user’s screen. Once the user has made a selection, the selected parameter values are passed back to extraction engine and the report is run. When the extraction engine encounters a bind variable reference while parsing the query it substitutes the context parameter values selected by the user.


So, as far a query parsing is concerned, parameters work in much the same way as flexfields: the parser encounters a reference to something external within the body of a query, and must look elsewhere in order to resolve this reference and build a runnable query. The difference is that whereas flexfield definitions are static and specified at design time, parameter values are dynamic and specified at runtime.



Bursting Definitions


Bursting is a mechanism for dealing with large numbers of reports – say account statements for clients based in different countries – that differ from one another based on a small number of factors: reports may have different layouts, may need to apply different date and numeric display conventions based on different locales, may need to be delivered in different output formats, and may need to use different delivery channels. Rather than writing each report individually, it makes sense to write one generic report that is parameterized to take the varying factors into account, and then add wrapper functionality around the reporting engine so that its runs the same generic report repeatedly, substituting in the appropriate parameter values for each report – in a sense this approach is just a variant on the dynamic behaviour afforded by the use of parameters and lists of values as described above, except that the focus here is on customizing report features other than data content. In principle, this wrapper functionality could work on the basis of generating one query per report, but it is more efficient to generate one master query that extracts all the data needed to prepare the complete set of reports, and then burst or split this data set into components based on some key (for example, client ID). This fragmenting of one large data set into smaller components is the key feature of bursting.


So, the Bursting Definitions component contains all the information needed to produce a disparate set of reports from a single query, information such as the key to be used to split the master data set into report sized chunks, and dependent on that key the corresponding layout template, locale specific conventions, output format, and delivery channel to be employed.



Event Triggers


Event Triggers only apply to Oracle databases. When certain events occur - for example, immediately before or after a data extraction - a call is made to the database to run a PL/SQL packaged function. What the function does will depend on the design requirements, but it might record status information in a logging table, or if report production is part of a workflow it might initiate a dependent workflow process, or it might retrieve some volatile data that needs to be incorporated into the report being run (for example, if a table containing stock market prices or currency exchange rates has stale data, a web service call might retrieve the latest values so that a calculation based on these statistics and included in the report would contain timely values – the next time the report is run a web service call might not be required, avoiding the inefficiencies associated with building the web service call directly into the report’s data model).



Data Properties


Data Properties describe a miscellany of properties associates with the data extraction process, such as the data source connection to use, the default database PL/SQL package to use, whether or not to make use of disk caching, and details of the XML element tag case formatting to be applied.