Введення в спеціальність / Додаткова література / data mining with ontologies
.pdf
Ontology-Based Data Warehousing and Mining Approaches in Petroleum Industries
schemas and or combination of schemas in complex petroleum business situations (Figure 14).
Oil-field data objects are reused and interconnected (interoperability) among multiple petroleum fields (Figure 14), so that knowledge built from data warehouse integration process, is reused and interpreted for drillable exploratory or development well plans. As demonstrated in an OLAP model in Figure 12. “H” is an appropriate place to drill a well for exploring and exploiting for oil and gas. This is a successful data implementation model in a petroleum industry. Exploration managers are responsible for providing the valued processed exploration information, so that critical decisions made in planning new exploratory or development wells in the frontier oil bearing sedimentary basins are accurate and precise. Figure14 exhibits the interoperability of petroleum data objects among basins, when data objects are conceptualized and integrated using petroleum ontology (Nimmagadda et al., 2005a;
Nimmagadda et al., 2006).
petroleum Ontology system design
Benefits
•Reusability: Same entities (or if objects interpreted in classes) and their relationships are used in different domain applications and or modeling languages. Entities described in data structures in a basin can be reused in other basins. Even structures used in one basin can be used in other basins elsewhere.
•Search: Searching or viewing a piece of basin data from the warehoused metadata through data mining.
•Reliability: Ontology approach makes the petroleum business data more reliable, because of interoperability and automation of consistency in data structures.
•Specification: The ontology can assist the process of identifying requirements and defining specifications for data warehousing.
•Maintenance: Documenting ontology of petroleum business data is of immense help to reduce the operation and maintenance costs.
Figure 14. Interoperability of data objects among petro-fields
Mapping of Petroleum Field Objects of Gulf Basins
|
Field2 |
|
Gulf Basins |
|
Field1 |
of |
|
|
Field Objects |
Field3 |
|
Field4 |
|
|
Mapping of Petroleum |
|
|
|
|
|
|
Field5 |
Field6 |
|
|
Petroleum bearing field objects of Gulf Basins
Petro Data
Warehouse
|
Data Mining |
||
Knowledge Mapping |
|||
|
|
|
Exploration |
January |
Facts |
|
PeriodPetroDataMining |
|
Facts |
||
February |
|
|
|
March |
|
|
|
April |
c ts |
ct s |
c ts |
|
F a |
F a |
F a |
May |
|
|
|
June |
|
|
|
|
Production |
||
Ontology-Based Data Warehousing and Mining Approaches in Petroleum Industries
•Knowledgeacquisition:Speedandreliability of ontology supportive data warehouse will facilitate the data-mining task much easier and faster in building intelligence from petroleum business data.
Nimmagadda et al. (2005a) address issues of evolving ontology application for effective design ofdatawarehousinganddataminingforpetroleum industries with a focus on ontology acquisition and understanding. Hierarchical structural data content(asisthenatureofpetroleumorganization structure) facilitates the evolutionary process of ontology application in petroleum data warehouse designanddevelopment.Afinerhierarchicaldata structure (described at its atomic level) that carries knowledgeandinformationexchangethroughontologicalstructuringprocesshasdefiniteedgetodata miningapplication.Ontologyknowledgebuilding process also assists in understanding the integra- tionofthefine-grainedrelationaldatastructureof petroleum industry. Data delivery agents (such as
XML and HTML) that respond to easy distribu- tionoffine-grainedmeaningfulstructuralcontent and semantics, appear to be essential knowledge base data distribution tools, also aiding software developers to distribute oil and gas business data in several geographic locations. Petroleum data ontologydiscussedinthispapercanfacilitatequery formulation with several merits such as:
•Sharpen user requests by query expansion while understanding the finer specialized hierarchical structures.
•Widen query scopes of petroleum data contextsinaccordancewithgeneralizedcoarser hierarchical structures.
•Minimizing the ambiguity of user requests by finding the contexts relevant to the requests.
•Reformulatequerieswithinthesamecontexts or by one chosen context to contexts of alternateontologiesthatmaybefittingforbetter actual queries of users’ requests.
Exploration and interpretation of queries or views are accessed from the ontology-based warehouse in a manner to develop and predict the future petroleum reserves in, for example, the Western Australian resources industry. This approach will facilitate establishing the unexplored and unexploited petroleum inventory in any world-wide basin.
The following significant issues will be resolved by the proposed research methodologies:
•Companieshandlingmultifacetedpetroleum data need data integration and information sharing among various operational units—ontology-based data warehousing and data mining technologies will satisfy these needs.
•A common ontological framework will assist to understand and perceive petroleum industry’s complex data entities and attri- butes—facilitatetostructurethesedatainto simpler conceptual models.
•Past experience (Nimmagadda & Rudra,
2004d) suggests that multidimensional logicaldatamodels,suchasstar,snowflake, and fact constellation schemas are ideally suited for designing data warehouse for oil and gas industries.
•Warehouse data models possess flexibil- ity—accommodating future changes in complex company situations.
•Petroleum data considered here are of both technicalandbusinessinnature;focusedon building knowledge on petroleum systems of different basins of Western Australia, with an intention to minimize the risk of well planning and also to make sound economic decision while drilling a prospect (for example data on formation pressures and reservoir qualities with depth will dramatically change economic situation of drilling an expensive well).
•Fasteroperationalanduserresponsetimes— minimizing operational exploration costs.
Ontology-Based Data Warehousing and Mining Approaches in Petroleum Industries
•Frequently, drilled wells ‘go sick,’ which undergo for capital work-over jobs. Present study will improve the health and economics of a drilled well.
•Fromoilandgasindustrypointofview,many moreoilfieldscanbediscoveredorexplored by exploiting the existing exploration and production data of petroleum systems in
WesternAustralia—supportedbyontology, data warehouse and data mining technologies.
•Petroleumdatahavethefollowingsignificant additional dimensions:
1.Longitudinal (time-variant historical data, say 60 years of exploration and production data).
2.Lateral (data from several sedimentary basins, each of around 5000 sq km in area).
The additional dimensions depicted in the data warehouse schemas will enhance the scope of data mining of petroleum data. So far, nobody has investigated these data volumes in terms of ontology, data warehousing and data mining, especially the exploration and production data associated with period dimension. Time series data, represented in 4D time-lapse domain can significantly minimize expensive well drilling plans. These technologies provide immense future scope in both private and public sectors of petroleum industries. There is an opportunity to extend and apply these proposed technologies in overseas basins.
Ontologies have become very common on the World Wide Web (WWW). Future work is aimed to develop a WWW consortium of petroleum resources description framework (PRDF), a language for encoding knowledge on Web pages to make it understandable to electronic agents searching for information. Ontologies of the petroleum systems range from large taxonomies categorizing petroleum systems of North West Shelf (NWS) to categorizations of basins, fields,
and their features and properties. Many areas nowdevelopstandardizedontologiesthatdomain experts can use to share and annotate information in their domain fields.
fUtUre direCtiOns Of researCh
With the growing demand and need to store volume of resources data more intelligently, ontology based warehousing approach handles applications ranging across resources industries, computing industries to geographic information systems, and especiallydatasetsthatcarrymultidimensionswith a key temporal dimension. The present chapter provides an immense future scope, particularly realizing the complexity and heterogeneity of resources business operations. Future research is warranted in building ontology-based metadata that can cater for multidisciplinary datasets from differentoperationalunitsofintegratedoperational companies (such as BP, Shell, ChevronTexaco,
ExxonMobil, ONGC, TATA Industries). These multinational companies manage several contractors and subcontractors, who handle hundreds of day-to-day business operations, costing millions of dollars.
Resources data are often heterogeneous and modeling such data is of enormous value to these industries. Exploration is one of the key (entity/ object) business operations of oil and gas producing industries. Similar subtype class-objects can be interpreted from other business entities, such as drilling, production and marketing operations.
Ontology based warehouse modeling application has already been demonstrated in resources indus- tryscenariosfacilitatingefficientdata-miningand knowledge-mappingapplications.Thesetechnolo- gieshavegreatfuturescopeinindustry-automation anddata-engineering.Ourfutureresearchstudies includemappingand modelingnaturaldatasets for earthquake-prediction analysis, weather-forecast and climate-change studies.
0
Ontology-Based Data Warehousing and Mining Approaches in Petroleum Industries
Mapping and modeling data warehouses in longitudinalandcross-sectionalresearchdomains (Neuman, 1999 pp. 30-31) are very effective for modeling knowledge in multidimensions. Multiple dimensions depicted in the data warehouse schemas, enhance the scope of data mining of heterogeneous data sources. So far, nobody has investigated these data volumes in terms of ontology, data warehousing and data mining, especially technical and commercial data (such as resources data, active and dynamic exploration and production super entities) associated with time dimension. Time series data represented in 4D time-lapse (Dodds and Fletcher 2004) domain
(spatio-temporal)cansignificantlyminimizethe riskinthemakingofcrucialtechnicalandfinancial decisions in resources industries. These technologies can return huge dividends to both private and public sectors of petroleum industries in Australia and other developing industrial countries.
Keeping in view the recent popularity of ontologies on WWW, future work is aimed to developaWWWconsortiumofpetroleumresources description framework (PRDF), a language for encoding knowledge on web pages, to make it understandable to electronic agents searching for information. For example, ontologies of the petroleum systems range, from large taxonomies categorizing petroleum systems of NWS (North
West Shelf, a group of Western Australian producing basins) to classification of basins, fields, besides grouping their geological and reservoir attribute properties. Multidisciplines now develop standardizedontologiesthatdomainexpertscan use to share and annotate information in their domain fields.
What these industries have in common is the needtoanalyzeextremelylargedatasets,produced and collected by a diverse range of methods, to understand particular phenomena, such as earthquake-predictionandclimate-change-global warming analysis. As in the scientific field, an effectivemeanswithwhichtoanalyzethesedata
istouse,visualizationtosummarizethedataand interpret data correlations, trends and patterns.
referenCes
Biswas, G., Weinberg, J., & Li, C. (1995). ITERATE: A conceptual clustering method for knowledgediscoveryindatabases.ArtificialIntelligence in the Petroleum Industry (pp. 111-139). Paris.
Cheung, D.W., Wang, L., Yiu, S.M., & Zhou,
B. (2000). Density based mining of quantitative association rules. PAKDD, LNAI, Vol. 1805 (pp. 257-268).
Dodds, K., & Fletcher, A. (2004). Interval probability process mapping as a tool for drilling decisions analysis: The R&D perspective. The Leading Edge, 23(6), 558-564.
Dunham, H.M. (2003).Datamining,introductory and advanced topics. Prentice Hall.
Erdmann,M.,&Rudi,S.(2001).Howtostructure andaccessXMLdocumentswithontology.IEEE Data & Knowledge Engineering, 36, 317-335.
Gilbert,R.,Liu,Y.,&Abriel,W.(2004).Reservoir modeling: integrating various data at appropriate scales. The Leading Edge, 23(8), 784-788.
Gornik, D. (2000). Datamodelingfordatawarehouses.A rational software white paper. Rational E-development Company.
Guan,S.,&Zhu,F.(2004).Ontologyacquisition and exchange of evolutionary product-brokering agents. Journal of Research and Practice in Information Technology, 36(1), 35-45.
Guha,S.,Rastogi,R.,&Shim,K.(1999b).CURE: An efficient algorithm for clustering large databases. In ProceedingsofACM-SIGMODInterna- tional Conference on Management of Data.
Hadzic,M.,&Chang,E.(2005).Ontology-based support for human disease study. In Proceedings
Ontology-Based Data Warehousing and Mining Approaches in Petroleum Industries
