
Rivero, L. Encyclopedia of Database Technologies and Applications. 2006.
Multiparticipant Decision Making and Balanced Scorecard Collaborative
is particularly important when the group includes members with different capacities, interests, or activities (Miyake, 2002).
6. It determines a collaborative learning system (Keen, 1991).
7. Group and individual learning are increased by the multiplicity of perspectives and approaches and their integration in the information treatment (Shih-Jen & McKay, 2002).
8. Speeds up the feedback for the organization and its members (Olve, Roy & Wetter, 1999).
9. Stimulates the “network” effect that benefits “open systems” (Thierauf, 1999).
10. Integrates the decision makers in a unified process with others who, without being part of the organization or company, have a bearing on the results of the strategy. Hence, external influences are considered systematically, and not only when their incidental bearing modifies the strategic results (Malina & Selto, 2001).
11. Defines a systemic model of strategic control (Ballvé, 2000).
12. Facilitates the disposition to share risks and rewards (Hamel, 2000).
13. Clarifies the purposes of the participants, since it expresses them in the reference indicators.
14. Improves the scope of the strategy, spreading knowledge and understanding of it among the members of the organization (Miyake, 2002).
15. Increases the specialization of workers, businesspeople, analysts, and executives (Moore, Rowe & Widener, 2001).
16. Generates greater knowledge and command of strengths, opportunities, weaknesses, and threats.
17. Generates a style of articulation in group work that goes beyond organizational structures.
18. Establishes quality standards, favoring the application of techniques such as benchmarking.
19. Admits the possibility of applying simulation models to generate and analyze alternatives.
20. Defines and upholds ethical conceptions in the management of the organization (Fainstein, 1997).
21. Benefits the integration of the established goals.
22. Allows rational integration of personal goals with the general goal.
23. Reduces internal contradictions in organizations (HR Focus, 2002).
24. Increases motivation through learning and knowledge.
25. Improves work morale and cooperation, strengthening bonds and the shared satisfaction of something done well.
FUTURE TRENDS
The development and application of BSC in decision-making groups opens interesting fields for studying its structures and functioning models. Some relevant topics arise (Barnes, 2002; Bohn, 1994; Frank, 2002; Kaplan & Norton, 2001; Neidorf, 2002; Probst, Raub & Romhardt, 2000):
a. Develop system architectures for MDM decisions
b. Analyze information structures with regard to comprehension levels, analysis and interpretation levels, and decision levels
c. Represent information in multi-layer systems
d. Develop dynamic indicators
e. Develop scalability of indicators
f. Integrate the organizational memory of the company
CONCLUSION
In this way, the modes of participation in decisions are enriched with tools that make it possible to negotiate and reach agreement on the basis of knowledge shared from diverse perspectives.
These kinds of work will create new dynamics and concepts for developing decision models.
Beyond the relevance of new computing technologies, some aspects are characteristic of how the methodology is applied and of the relations that develop as a consequence of its use.
Available information technologies strengthen management models and open possibilities for new developments in the administrative sciences.
The new management models show substantial improvements in both technical and ethical aspects.
Technically, there is a large increase in the available information and in the possibilities for selecting, storing, formally elaborating (quantitatively and qualitatively), and applying the generated knowledge.
Ethically, behaviors change as a consequence of shared knowledge, and these changes will be reflected in new mechanisms of consensus and responsibility derived from increased participation in organizational decisions.
REFERENCES
Abecker, A., Bernardi, A., Hinkelmann, K., & Sintek, M. (2002). Infraestructuras de información empresaria para la entrega activa de conocimiento sensible al contexto. In S. Barnes (Ed.), Sistemas de gestión de conocimiento: Teoría y práctica (pp. 175-219). Madrid, Spain: Thomson.
Alavi, M., & Leiner, D. (1999). Knowledge management systems: Emerging views and practices from the field. Proceedings of the Hawaii International Conference on Information Systems, Maui, Hawaii.
Andreu, R., Ricart, J., & Valor, J. (1996). Estrategia y sistemas de información. Madrid: McGraw-Interamericana de España.
Ballvé, A. (2000). Tablero de control. Buenos Aires: Ed Macchi.
Barnes, S. (2002). Knowledge management systems: Theory and practice. Thomson Learning.
Bohn, R.E. (1994). Measuring and managing technological knowledge. Sloan Management Review, 36(1), 61-73.
Byrnes, W., & Chesterton, B. (1978). Decisiones y estrategia. Buenos Aires: El Ateneo.
Chaffey, D. (1998). Groupware, workflow and intranets: Reengineering the enterprise with collaborative software. Digital Press.
Davenport, T., & Prusak, L. (1998). Working knowledge. Boston: Harvard Business School Press.
DeSanctis, G., & Gallupe, R. (1987). A foundation for the study of group decision support systems. Management Science, 33(5), 589-609.
Easterby-Smith, M. (1997). Disciplines of organizational learning: Contributions and critiques. Human Relations, 50(9), 1085.
Fainstein, H. (2000). La gestión de equipos eficaces. Buenos Aires: Ed Macchi.
Fleischer, C., & Mahaffy, D. (1997). A balanced scorecard approach to public relations management assessment. Public Relations Review, 23(2), 117-143.
Frank, U. (2002). Knowledge management systems: Theory and practice (pp. 115-131). Thomson Learning.
Hamel, G. (2000). Liderando la revolución. Barcelona: Harvard Business School-Gestión.
Holsapple, C., & Joshi, K. (2001). An investigation of factors that influence the management of knowledge in organizations. Journal of Strategic Information Systems.
HR Focus. (2002). How to blend learning and knowledge management. Institute of Management & Administration, 79(7), 5.
Kaplan, R., & Norton, D. (1997). The balanced scorecard: Translating strategy into action. Boston: Harvard Business School Press.
Kaplan, R., & Norton, D. (2001). Transforming the balanced scorecard from performance measurements to strategy management. Accounting Horizons, 15(1), 87.
Keen, P. (1991). Shaping the future: Business design through information technology. Boston: Harvard Business School Press.
Keen, P., & Morton, S. (1978). Decision support systems: An organizational perspective. New York: Addison-Wesley.
KPMG. (1998). Knowledge management. KPMG Research Report.
Landon, K., & Landon, J. (1996). Administración de los sistemas de información. Mexico: Prentice Hall-Hispanoamérica.
Malina, M., & Selto, F.H. (2001). Communicating and controlling strategy: An empirical study of the effectiveness of the balanced scorecard. Journal of Management Accounting Research, 47-90.
Marakas, G. (1999). Decision support systems. NJ: Prentice Hall.
Mintzberg, H., & Quinn, J.B. (1993). El proceso estratégico. Conceptos, contextos y casos (2nd ed.). Mexico: Prentice Hall.
Miyake, D. (2002). Beyond the numbers: After years of evolution, balanced scorecard applications now integrate strategy and management for competitive advantage. Intelligent Enterprise, 5(12), 24-30.
Moore, C., Rowe, B.J., & Widener, S. (2001, November). HCS: Designing a balanced scorecard in a knowledge-based firm. Issues in Accounting Education, 16(4), 569-601.
Neidorf, R. (2002, September-October). Knowledge management: Changing cultures changing attitudes. Online, 26(5), 60-62.
Nonaka, I., & Takeuchi, H. (1995). The knowledge-creating company. Oxford: Oxford University Press.
Olve, N.G., Roy, J., & Wetter, M. (1999). Performance drivers: A practical guide to using the balanced scorecard. New York: John Wiley & Sons.
Pigott, S. (2000, May). Knowledge management systems for business. Business Information Alert, 12(5), 11.
Probst, G., Raub, S., & Romhardt, K. (2000). Managing knowledge. Chichester, UK: John Wiley & Sons.
Serra, R., & Kastika, E. (1994). Re-estructurando empresas. Buenos Aires: Ed Macchi.
Shih-Jen, K.H. & McKay, R.B. (2002, March). Balanced scorecard: Two perspectives. The CPA Journal, 72(3), 21-25.
Simon, H.A. (1960). The new science of management decision. New York: Harper and Row.
Steiner, G. (1997). Strategic planning. Mexico: CECSA.
Thierauf, R.J. (1999). Knowledge management systems for business. Quorum Books.
Tissen, R., Andriessen, D., & Lakanne, F. (2000). The knowledge dividend. New York: Prentice Hall.
KEY TERMS
Balanced Scorecard Collaborative: A strategic management system that measures, by means of quantitative relations among selected variables, the behavior of the organization against the aims established in different perspectives (Growth, Internal Processes, Customers, Finances). The analysis is based on the cause-effect relations between the variables and the ratios that represent them.
Benchmarking: Procedure to compare and improve the quality of manufacturing and services, based on the comparison of operations, methods, procedures, and processes inside and outside the organization.
Cognitive Learning: Learning that results from seeing a situation in a new light, which enables the comprehension of logical relations or the perception of relations between means and aims.
Feedback Learning: Learning based on the input-process-output-feedback cycle, in which three laws shape the learning process: the law of exercise (reiteration strengthens the connection between response and stimulus); the law of effect (the succession of stimulus and response is not enough for learning; reinforcement is needed); and the law of disposition (achieving goals is a reinforcement particular to every action that has a clear aim).
GDSS: A collection of computer-based technologies that are specifically designed to support the activities and processes related to multiparticipant decision making.
Group Decisions: Decisions adopted by a group of people with complete interaction under majority or consensus conditions.
Groupware: Information technologies that allow a group of people to work on a common task by giving them a shared-environment interface (Chaffey, 1998). Its main features are interaction among users (video, text, and sound); centralized and shared information; and group-work awareness.
Knowledge: The application of a combination of instincts, ideas, rules, procedures, and information to guide the actions and decisions of a problem solver within a particular context.
Knowledge Management (KM): Set of activities that deal with knowledge acquisition, selection, internalization, and usage.
Social Learning: Learning based on attention, perception, and memory capacities, which is significantly influenced by the socialization and education context, and particularly by language, so as to create human knowledge.
Virtual Collaborative Environments: Computing applications that include groupware systems in order to assist work groups with a common goal, where participants work on their own computers but share data and information by means of a user interface.
Workflow: Systems meant to automate and control business processes. Their functions include task assignment, alerts, cooperation on common tasks, alignment of resources with the strategy, automation of business processes, and tracking and oversight.
Natural Language Front-End for a Database
Boris Galitsky
Birkbeck College University of London, UK
INTRODUCTION
Whatever knowledge a database contains, one of the essential questions in its design and usability is how its users will interact with it. If these users are human agents, the most ordinary way to query a database would be in natural language (Gazdar, 1999; Popescu, Etzioni, & Kautz, 2003; Sabourin, 1994). Natural language question answering (NL Q/A), wherein questions are posed in plain language, may be considered the most universal but not always the best (i.e., fastest) way to provide information access to a database. One should be aware that approaches to data access such as visualization, menus and multiple choice, FAQ lists, and so forth were successfully employed long before NL Q/A systems came into play. In the following, I discuss situations in which a particular information access approach is optimal. The five basic means to access (i.e., search) the data, classified by search methodology, are:
1. Looking through the data itself
2. Consulting the explicit enumeration of choices (e.g., menus, lists, combo boxes)
3. Structured language-based database querying
4. Keyword search
5. NL Q/A (as a front-end to a database or to unstructured data)
For many data access problems, 1 and 2 compete with 3 through 5 in terms of efficiency. Frequently, explicit browsing of the data or using intermediate steps is sufficiently convenient; however, 1 is good only for a limited amount of data, 2 requires data structuring and is not flexible, and 3 is used for fully structured knowledge with relational links. Note that database querying may involve the other (i.e., non-NL) means, 1, 2, and 4, to build a query that runs against the data source. The most powerful approach to data management seems to be 3, wherein retrieval is easily and naturally combined with update. Also, it does not require data modification to provide NL search, thereby transitioning to 5.
Nowadays, in a majority of applications, the number of queries that are run is quite limited, and they can be initiated via a form. However, this will not likely be the case in the future for semistructured knowledge representations. Next I enumerate the situations in which traditional form-based query specification is appropriate (i.e., rather than an NL front-end):
1. A homogeneous domain includes the description of objects, identified by their names (e.g., cars, flowers, people).
2. A domain is completely unstructured; semantic links between its entities and objects are nonsystematic.
3. A domain is oriented to professional users and includes specific terminology.
4. Domain structure is very clear, and a user is closely familiar with it.
5. The domain itself is almost unstructured, but the objects of search fall into clusters in accordance with their features. A Boolean combination of keywords is then well suited for the search of objects in such a domain.
BACKGROUND
An NL front-end for a database implements query translation from NL to SQL. The difficulty in this task is that, first, NL is ambiguous and, second, its understanding requires domain knowledge that is not represented in a relational database (i.e., meanings of involved objects as words). The NL processing components that are required to achieve the accuracy desired in database querying are as follows (Galitsky, 2003; Gayatri & Raman, 2001; Wallace, 1984):
1. Morphological and syntactic analyses that produce links between words (i.e., a parser). This kind of analysis uses linguistic but not domain knowledge.
2. Semantic analysis that establishes a mapping between some words or multiwords and table names, column names, and record values as well as SQL operators. Semantic analysis is based on the formal treatment of the meanings of involved entities (Allen, 1995).
3. Pragmatic analysis, which evaluates the consistency of the obtained query against the database and then filters the hypotheses of the syntactic and semantic analyses in case an input query is ambiguous.
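The three stages can be illustrated with a deliberately small sketch. It is not the implementation of any system cited here: the `stores` table, the `LEXICON` entries, and the function names `make_demo_db`, `lookup`, and `translate` are all assumptions made for illustration, the syntactic stage is reduced to tokenization with crude plural stripping, and the result column `name` is hard-coded.

```python
import re
import sqlite3

def make_demo_db():
    """Create an in-memory database with a hypothetical `stores` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stores (name TEXT, state TEXT, coffee_sales INTEGER)")
    conn.executemany("INSERT INTO stores VALUES (?, ?, ?)", [
        ("Bean There", "California", 120),
        ("Java Hut", "California", 300),
        ("Mug Life", "Oregon", 500),
    ])
    return conn

# Hypothetical semantic lexicon: words mapped to tables, columns,
# record values, or SQL operators.
LEXICON = {
    "store": ("table", "stores"),
    "coffee": ("column", "coffee_sales"),
    "california": ("value", ("state", "California")),
    "oregon": ("value", ("state", "Oregon")),
    "most": ("operator", "MAX"),
}

def lookup(token):
    """Crude morphology: try the token, then the token without a plural 's'."""
    if token in LEXICON:
        return LEXICON[token]
    if token.endswith("s") and token[:-1] in LEXICON:
        return LEXICON[token[:-1]]
    return None

def translate(question, conn):
    """Return (sql, params) for a question, or None if it cannot be mapped."""
    # 1. Morphological/syntactic stage (reduced here to tokenization).
    tokens = re.findall(r"[a-z]+", question.lower())
    # 2. Semantic stage: map words to schema elements, values, and operators.
    found = {}
    for token in tokens:
        entry = lookup(token)
        if entry is not None:
            kind, target = entry
            found.setdefault(kind, target)
    table, column = found.get("table"), found.get("column")
    if table is None or column is None:
        return None
    # 3. Pragmatic stage: reject hypotheses inconsistent with the database.
    columns = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column not in columns:
        return None
    sql, params = f"SELECT name FROM {table}", []
    if "value" in found:
        value_column, value = found["value"]
        if value_column not in columns:
            return None
        sql += f" WHERE {value_column} = ?"
        params.append(value)
    if found.get("operator") == "MAX":
        sql += f" ORDER BY {column} DESC LIMIT 1"
    return sql, params
```

With the demo data, `translate("Which store in California sold the most coffee?", make_demo_db())` yields a parameterized query that selects the California store with the highest `coffee_sales`. A realistic front-end would replace the tokenizer with a parser and keep several competing hypotheses until the pragmatic stage.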
Copyright © 2006, Idea Group Inc., distributing in print or electronic forms without written permission of IGI is prohibited.
In a traditional architecture of database front-end, the components include an index, a lexicon, and a parser (Adam & Gangopadhyay, 1997). The index is used to uniquely identify each form in the system through a conceptual representation of its purpose. The form fields specify database or nondatabase fields whose values are either entered by the user (i.e., user defined) or are derived by the form (i.e., system defined) in response to user input. A set of grammar rules is associated with each form. The lexicon consists of all words recognized by the system, their grammatical categories, roots, their associations (if any) with database objects and forms. The parser scans a natural language query to identify a form in a bottom-up fashion. The information requested in the user query is determined in a top-down manner by parsing through the grammar rules associated with the identified form.
The enumeration of the features for a database NL front-end (About.com; Beck, Mobini, & Kadambari; Gayatri & Raman, 2001) follows:
• Independent domain creation: Given a database, a system may be capable of automatically adjusting its syntactic and semantic units to it, processing the names of columns and tables. However, automatic creation of entities that correspond to lexical units, and of the relationships between them, is quite unreliable for a natural language front-end to a database. This is in contrast to a natural language interface to a collection of documents (automatic annotation), which is quite reliable.
• Follow-up questions: For example, if one initially asks, “Which store in California sold the most coffee in 1997?” she can then ask a follow-up question, such as, “Which one sold the most coffee?” and the NL front-end system will understand this to mean “Which store in California sold the most coffee in 1997?”
• Specifying additional phrasings: This is necessary so that syntactically different but semantically similar questions are converted into the same SQL query.
• Maximum complexity of queries: This can be measured as the number of entities in a query, assuming that they are interconnected. If a question can be split into a conjunction or disjunction of simpler questions, a user is expected to do it. For a database front-end, query complexity can be measured as the number of tables mentioned.
• Automatic superposition of syntactic and semantic templates: If an NL system knows two entities (or a respective pair of phrasings), it can combine them online to represent a query. This is a quite desirable feature for increasing the complexity of queries and also for merging various databases, each having its own front-end.
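Two of these features, additional phrasings and follow-up questions, can be sketched in a few lines. The regular expressions, the `sales` table in `SQL_TEMPLATE`, and the function names below are hypothetical and chosen only to make the idea concrete; a real front-end would derive phrasings from its semantic model instead of hand-written patterns.

```python
import re

# Hypothetical phrasings: syntactically different, semantically equivalent.
PHRASINGS = [
    re.compile(r"which store in (?P<state>\w+) sold the most "
               r"(?P<product>\w+) in (?P<year>\d{4})"),
    re.compile(r"in (?P<year>\d{4}),? what (?P<state>\w+) store had the "
               r"highest (?P<product>\w+) sales"),
]

# Hypothetical SQL template shared by all phrasings (named placeholders).
SQL_TEMPLATE = ("SELECT store FROM sales WHERE state = :state "
                "AND product = :product AND year = :year "
                "ORDER BY amount DESC LIMIT 1")

def parse(question):
    """Return (sql, slots) for the first matching phrasing, or None."""
    text = question.lower().strip(" ?")
    for pattern in PHRASINGS:
        match = pattern.fullmatch(text)
        if match:
            return SQL_TEMPLATE, match.groupdict()
    return None

def resolve_follow_up(previous_slots, follow_up_slots):
    """Inherit every slot the follow-up question leaves unspecified."""
    merged = dict(previous_slots)
    merged.update(follow_up_slots)
    return merged
```

Both “Which store in California sold the most coffee in 1997?” and “In 1997, what California store had the highest coffee sales?” map to the same template with the same slot values. A follow-up that only names a new product, represented here as `{"product": "tea"}`, inherits the state and year of the previous question through `resolve_follow_up`.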
TECHNIQUE OF SEMANTIC HEADERS
The technique of semantic headers (SH; Galitsky, 2003) is intended to convert a relational database of abstract textual documents into a form in which each document can be associated with a question and used to generate advice. There are two opposite common approaches to this problem. The first one assumes that a complete formal representation of any textual document is possible, and the second one assumes that textual information is too tightly linked to NL and cannot be satisfactorily represented without it. The former approach relies on matching a formalized query with the full knowledge representation for answers, and the latter is based on the syntactic match between the question and sentences from answers. The important role of machine-learning-based techniques for question answering is worth mentioning as well (Ng, Lai Pheng Kwan, & Xia, 2001).
The SH technique is intermediate with respect to the degree of knowledge formalization. Only the data that can be explicitly mentioned in a potential query occur in semantic headers. The rest of the information, which would be unlikely to occur in a question but can potentially form the relevant answer, does not have to be formalized.
The SH technique is based on logic programming, taking advantage of its convenient handling of semantic rules on the one hand and the explicit implementation of domain commonsense reasoning on the other. The declarative nature of coding semantic rules, domain knowledge, and generalized potential queries makes logic programming a reasonable tool. At the same time, the machinery of text annotation by a set of keywords has been proven to leverage machine learning techniques. Instead of using the keywords as the semantic means to represent the meaning of a short textual document (answer), we use a logical formula in which the keywords serve as atoms. Therefore, the SH technique is a way of merging the potential results of the statistical approach to Q/A with the logic programming way of matching the formal representation of a query with the formal representation of an answer (the semantic header of this answer). In legal domains, where the semantics of conversational language can only be ambiguously mapped into the semantics of legal language, using just statistical annotation by keywords does not lead to satisfactory results.
Consider the Internet auction domain, which includes the description of bidding rules and various types of auctions.
• Restricted-Access Auctions: This separate category makes it easy for you to find or avoid adult-only merchandise. To view and bid on adult-only items, buyers need to have a credit card on file with eBay. Your card will not be charged. Sellers must also have credit card verification. Items listed in the Adult-Only category are not included in the New Items page or the Hot Items section, and currently, are not available by any title search.
What is this paragraph about? It introduces the “restricted-access” auction as a specific class of auctions, explains how to search for or avoid a selected category of products, presents the credit card rules, and describes the relations between this class of auctions and the highlighted sections of the Internet auction site. Rather than changing the paragraph to adjust it to the potential questions answered within it, consider all the possible questions this paragraph can serve as an answer to. Building the semantic headers of a textual document is based on posing the query understanding problem as the recognition of the best pattern (e.g., document, answer). For example, if there is a question such that the stated paragraph is a more appropriate answer than any other paragraph from the whole domain, then the stated paragraph should serve as the answer, or at least part of the answer.
Evidently, knowledge of the semantic model of the whole domain is required to build the set of semantic headers for a given paragraph. This paragraph serves to answer the following kinds of questions:
• What is the restricted-access auction?
• What kind of auctions sells adult-only items?
• How do I avoid adult-rated products for my son?
• How does one sell adult items?
• When does a buyer need a credit card on file? Who needs to have a credit card on file? Why does a seller need credit card verification?
Below is the list of semantic headers for these answers.
auction(restricted_access,_):-restrictedAuction.
product(adult,_):-restrictedAuction.
seller(credit_card(verification,_),_):-restrictedAuction.
sell(credit_card(reject(_,_),_),_):-restrictedAuction.
seller(credit_card(_,_),_):-restrictedAuction.
what_is(auction(restricted_access,_),_):-restrictedAuction.
Then the call to restrictedAuction will add the stated paragraph to the current answer, which may consist of multiple pre-prepared ones.
Matching the query translation with the totality of SHs is shown in Figure 1.
Figure 1. The information flow of the technique of semantic headers. The input query is subject to natural language processing (NLP, on the left). The answers are subject to the procedure of SH assignment, performed while preparing the Q/A domain (on the right). The essence of finding an answer is the unification of the translation formula with all the SHs of the domain.
NLP of a query:
Input sentence {a', b', c', d', …}
Revealed atoms after substitution of multiwords and synonyms and ignoring insignificant words: {a, b, c}
Built translation formula: a(b(_),c)
Domain preparation:
Document, prepared as an answer: {…, a'', …, c', …}
Atoms representing the essential idea of an answer: {a, b, e}
Built semantic header: a(b(e),_):-iassert($…,a'', …,c',…$).
Matching the translation formula against the set of semantic headers:
Unification a(b(_),c) = a(b(e),_) gives the answer …,a'', …,c',….
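The matching step can be sketched outside a logic programming system as follows. This is only an illustration under simplifying assumptions: terms are encoded as nested Python tuples with the functor first, only the anonymous variable `_` is supported (so no variable bindings need to be tracked), and the names `ANY`, `SEMANTIC_HEADERS`, `unify`, and `answers_for` are hypothetical. The header terms themselves are copied from the list above.

```python
ANY = "_"  # anonymous variable: unifies with any term

# Each semantic header pairs a term with the identifier of the answer it
# introduces. A term is a nested tuple: (functor, arg1, arg2, ...).
SEMANTIC_HEADERS = [
    (("auction", "restricted_access", ANY), "restrictedAuction"),
    (("product", "adult", ANY), "restrictedAuction"),
    (("seller", ("credit_card", "verification", ANY), ANY), "restrictedAuction"),
    (("sell", ("credit_card", ("reject", ANY, ANY), ANY), ANY), "restrictedAuction"),
    (("seller", ("credit_card", ANY, ANY), ANY), "restrictedAuction"),
    (("what_is", ("auction", "restricted_access", ANY), ANY), "restrictedAuction"),
]

def unify(left, right):
    """Unify two terms in which the only variables are anonymous ones."""
    if left == ANY or right == ANY:
        return True
    if isinstance(left, tuple) and isinstance(right, tuple):
        # Same arity, and functor and all arguments unify pairwise.
        return (len(left) == len(right)
                and all(unify(l, r) for l, r in zip(left, right)))
    return left == right

def answers_for(translation_formula):
    """Collect, without duplicates, the answers whose header unifies."""
    matched = []
    for header, answer_id in SEMANTIC_HEADERS:
        if unify(translation_formula, header) and answer_id not in matched:
            matched.append(answer_id)
    return matched
```

Under this encoding, the example of Figure 1 is `unify(("a", ("b", ANY), "c"), ("a", ("b", "e"), ANY))`, which succeeds. Supporting named variables shared between arguments would require tracking substitutions, which a Prolog engine does natively.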
FUTURE TRENDS
As to a particular implementation of a database NL front-end, we consider Microsoft English Query (About.com; Microsoft.), an integrated tool for building, adjusting, and deploying NL query applications for SQL Server databases. Using English Query, one can turn a relational database into an application that gives end users the option to pose questions in NL instead of forming a query with an SQL statement. The English Query project wizard allows users to automatically create an English Query project and model (see Figure 2). After the basic model is created, a developer can refine, test, and compile it into an English Query application and then deploy it (for example, to the Web). For a database-type Q/A domain, the knowledge and semantic model development tools are quite important.
Figure 2. The database structure (table names, field names, keys and joins) is incorporated into a project and a model. [Diagram labels: Database; SQL Server; Database structure; Data; English Query; Model knowledge; Model Editor; SQL Project; Save; Test tool; Compiled English Query model; Load; English Query run-time engine; Question; SQL.]
Such a model contains all the information needed for an English Query application, including the database structure, or schema, of the underlying SQL database and the semantic objects (i.e., entities and relationships). A database front-end developer can also define properties for an application and add entries to the English Query dictionary as well as manually add and modify entities and relationships while testing questions and set other options to expand the model.
With the wizards, semantic objects are automatically created for the model. These include entities and relationships (with phrasings such as customers buy products or Customer_Names are the names of customers). Entities are usually represented by tables and fields (see Figures 3 and 4).
An entity is a real-world object, referred to by a noun (i.e., person, place, thing, or idea), for example: customers, cities, products, shipments, and so forth. In databases, entities are usually represented by tables and fields. Relationships describe what the entities have to do with one another, for example, customers purchase products. Command relationships are not represented in the database but refer to actions to be executed. For example, a command to a compact disc player can allow requests such as “Play the album with song X on it.”
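The notions of entity, relationship, and phrasing can be pictured with a small data model. The sketch below is not the English Query object model or API: the classes `Entity` and `Relationship`, the table and field names, and the helper `relationships_for` are hypothetical, and the matching is naive substring search.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Entity:
    noun: str        # the word users employ, e.g. "customers"
    table: str       # hypothetical table that represents the entity
    field: str = ""  # optional hypothetical field within that table

@dataclass(frozen=True)
class Relationship:
    subject: Entity
    verb: str
    obj: Entity

    def phrasing(self):
        """Render the relationship as a phrasing such as 'customers buy products'."""
        return f"{self.subject.noun} {self.verb} {self.obj.noun}"

customers = Entity("customers", "Customers")
products = Entity("products", "Products")
phone_numbers = Entity("phone numbers", "Customers", "Phone")

# A tiny semantic model: the relationships known to the front-end.
MODEL = [
    Relationship(customers, "buy", products),
    Relationship(customers, "have", phone_numbers),
]

def relationships_for(question):
    """Return the relationships whose two entities are both mentioned."""
    text = question.lower()
    return [r for r in MODEL
            if r.subject.noun in text and r.obj.noun in text]
```

For the question “Which customers buy the most products?” only the first relationship is selected, and its table names indicate which tables a generated query would have to join.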
Figure 3. Semantic objects of English Query. OLAP, online analytical processing, is accessible by natural language question answering as well.
Figure 4. Relationships between entities. [Diagram labels: entities Customers, Salesreps, Products, Phone numbers, YTD sales, Employee IDs; relationships “Salesreps sell products to customers,” “Customers have phone numbers,” “Salesreps have YTD sales.”]
CONCLUSION
The idea of natural language access to databases is neither a new idea nor a fundamentally different approach to information retrieval. An NL front-end has recently been attempted for a video database (Gayatri & Raman, 2001). With over 3 decades of research and development into natural language processing (American Association for Artificial Intelligence, 1999; Pasca, 2003), it is still a hard task to provide information access through simple dialogues. For example, it would not seem too advanced for a user to ask simple questions such as Who was the founder of Oracle? or What is the distance to the moon? and get a simple, direct answer. These questions, though neither very complex in structure nor ambiguous in phrasing, cannot be handled effectively by the current search techniques of the World Wide Web (Maybury, 2000, 2004). More intelligent search engines will break these queries into Boolean keyword searches to give back founder and Oracle or distance and moon, but others would require the user to formulate the queries in a formal language. Even so, keyword searches are known to be inaccurate and require users to look through many results, which point to articles that users must read rather than directly answering their questions.
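The limitation of Boolean keyword search mentioned here can be seen in a minimal sketch. The three documents and the function names `build_index` and `search_all` are made up for illustration; the index is a plain in-memory inverted index, not the mechanism of any particular search engine.

```python
import re

def build_index(docs):
    """Map each lowercase word to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for word in set(re.findall(r"\w+", text.lower())):
            index.setdefault(word, set()).add(doc_id)
    return index

def search_all(index, keywords):
    """Boolean AND query: ids of documents containing every keyword."""
    postings = [index.get(word.lower(), set()) for word in keywords]
    return set.intersection(*postings) if postings else set()

# Hypothetical mini-collection.
DOCS = {
    1: "Larry Ellison was a founder of Oracle",
    2: "The oracle at Delphi",
    3: "The founder of the city",
}
```

Here `search_all(build_index(DOCS), ["founder", "Oracle"])` identifies document 1, but the user still has to read that document to extract the name, which is the gap an NL Q/A system aims to close.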
REFERENCES
About.com. Microsoft English query. Retrieved June 15, 2004, from http://databases.about.com/library/weekly/aa032402a.htm
Adam, N. R., & Gangopadhyay, A. (1997). A form-based natural language front-end to a CIM Database. IEEE Transactions on Knowledge and Data Engineering, 9(2), 238-250.
Allen, J. F. (1995). Natural language understanding (2nd ed.). Redwood City, CA: Benjamin/Cummings.
American Association for Artificial Intelligence. (1999). AAAI fall symposium on question answering systems (Tech. Rep. FS-99-02). Menlo Park, CA: AAAI Press.
Beck, H. W., Mobini, A. M., & Kadambari, V. A word is worth 1000 pictures: Natural language access to digital libraries. Retrieved June 15, 2004, from http://archive.ncsa.uiuc.edu/SDG/IT94/Proceedings/Searching/beck/beckmain.html
Galitsky, B. (2003). Natural language question answering system: Technique of semantic headers. Advanced Knowledge International, Adelaide, Australia. Retrieved from http://www.dcs.bbk.ac.uk/~galitsky/NL/book/
Gayatri, T. R., & Raman, S. (2001). Natural language interface to video database. Natural Language Engineering, 7(1), 1-27.
Gazdar, G. (1999). Natural language interfaces to databases. Retrieved from http://www.cogs.susx.ac.uk/lab/nlp/gazdar/teach/nlp/nlpnode157.html
Maybury, M. T. (2000). Adaptive multimedia information access—Ask questions, get answers. First International Conference on Adaptive Hypertext (AH ’00), Trento, Italy.
Maybury, M. T. (Ed.) (2004). New directions in question answering. Cambridge, MA: MIT Press.
Microsoft. English query. Retrieved June 15, 2004, from http://www.microsoft.com/sql/evaluation/features/english.asp
Ng, H. T., Lai Pheng Kwan, J., & Xia, Y. (2001). Question answering using a large text database: A machine learning approach. Proceedings of the 2001 Conference on Empirical Methods in Natural Language Processing (EMNLP 2001), Pittsburgh, PA, June 3-4.
Pasca, M. (2003). Open-domain question answering from large text collections. Center for the Study of Language and Information. Lecture notes.
Popescu, A.-M., Etzioni, O., & Kautz, H. (2003). Towards a theory of natural language interfaces to databases. Intelligent User Interfaces.
Sabourin, C. F. (1994). Natural language interfaces: Bibliography. Interfaces to databases, to expert systems, to robots, to operating systems, and to question-answering systems. Infolingua, Montreal, Canada.
Wallace, M. (1984). Communicating with databases in natural language. Chichester, UK: Ellis Horwood.
KEY TERMS
Keyword Search: A search for documents containing one or more words that are specified by a user.
Natural Language Search: Search in which one can ask a question in natural English (such as, Where can I find information on William Shakespeare?) as opposed to formulating a search statement (such as, su:Shakespeare, William).
Natural Language Understanding: A problem of conversion of a natural language expression into its formal representation.
Phrasing: The manner in which something is expressed in words.
Pragmatics: The study of the contribution of contextual factors to the meaning of what language users say.
Semantics: The branch of linguistics that studies meaning in language. One can distinguish between the study of the meanings of words (lexical semantics) and the study of how the meanings of larger constituents come about (structural semantics). In the study of language, semantics is concerned with the meaning of words, expressions, and sentences, often in relation to reference and truth. Metasemantic theories study key semantic notions such as meaning and truth and how these notions are related.
Syntax: The grammatical arrangement of words in sentences.
Normalizing Multimedia Databases
Shi Kuo Chang
University of Pittsburgh, USA
Vincenzo Deufemia
Università di Salerno, Italy
Giuseppe Polese
Università di Salerno, Italy
INTRODUCTION
Multimedia databases have been used in many application fields. Unlike traditional alphanumeric databases, they need enhanced data models and DBMSs to model and manage complex data types. After an initial period of anarchy, multimedia DBMSs (MMDBMSs) have been classified based on standard issues, such as the supported data model, the indexing techniques to support content-based retrieval, the query language, the support for distributed multimedia information management, and the flexibility of their architecture (Narasimhalu, 1996).
A considerable number of MMDBMS products have been developed. Examples include CORE (Wu, Mehtre, Lam, & Gao, 1995), OVID (Oomoto & Tanaka, 1993), VODAK (Löhr & Rakow, 1995), QBIC (Flickner et al., 1995), and ATLAS (Sacks-Davis, Ramamohanarao, Thom, & Zobel, 1995), each providing enhanced support for one or more media domains among text, sound, image, and video. Some of these products support specific data models, whereas others support the object-oriented data model or even the canonical relational data model. Moreover, extensible relational DBMSs have been introduced to extend relational DBMSs with object-oriented features, such as the capability to manage complex data types, including multimedia data. In particular, they implement the concept of the object-relational universal server, providing means to construct user-defined data types (UDTs) and functions for manipulating them (UDFs). In addition, SQL3 has become the standard for relational DBMSs extended with object-oriented capabilities. The standard includes UDTs, UDFs, LOBs (a variant of BLOBs), and type checking on user-defined data types, which are accessed through SQL statements. Early examples of extensible RDBMSs include Postgres, IBM/DB2 version 5, Informix, and ORACLE 8.
As MMDBMS technology has become more mature, the research community has been seeking new methodologies for multimedia software engineering. Independently of the data model underlying the chosen MMDBMS, multimedia software engineering methodologies should include techniques for database design, embedding guidelines and normal forms to prevent anomalies that might arise while manipulating multimedia data.
In this paper, we describe a general-purpose framework to define normal forms in multimedia databases. The framework applies in a seamless way to images as well as to all the other different media types. The semantics of multimedia attributes is defined by means of generalized icons (Chang, 1996), previously used to model multimedia languages in a visual language fashion. In particular, generalized icons are used here to derive extended functional dependencies, which are parameterised upon the similarity measure used to compare multimedia data (Santini & Jain, 1999). Based on these new dependencies, we define three normal forms aiming to reach a suitable partitioning of multimedia data and to derive database schemes that prevent possible manipulation anomalies.
BACKGROUND
The normalization of multimedia databases needs to account for many new issues as opposed to alphanumeric databases. Many different types of complex data need to be analysed. However, in the literature we find many database design techniques focusing on image databases. In particular, a technique for normalizing image databases focuses on the partitioning of images so as to enhance image search and retrieval (Santini & Gupta, 2002). To this end, the technique aims to define dependencies among image features, which suggest to the designer how to efficiently map them into a database schema.
Database designers use their understanding of the semantics of attributes to specify functional dependencies among them (Elmasri & Navathe, 2003). In addition to alphanumeric data, multimedia databases can store complex data types, such as text, sound, image, and video, which were initially modelled as binary large objects (BLOBs) in early MMDBMSs.
Copyright © 2006, Idea Group Inc., distributing in print or electronic forms without written permission of IGI is prohibited.
In the following we introduce a framework to model the semantics of multimedia attributes aiming to derive functional dependencies between them. To this end, we have exploited the framework of generalized icons (Chang, 1996). Generalized icons are dual objects (xm, xi), with a logical part xm, and a physical part xi. They can be used to describe multimedia objects such as images, sounds, texts, motions, and videos. A generalized icon for modeling images is like a traditional icon, whereas those for modeling sounds, texts, motions, and videos are earcons, ticons, micons, and vicons, respectively. For all of them we denote the logical part with xm, whereas the physical parts will be denoted with xi for icons, xe for earcons, xt for ticons, xs for micons, and xv for vicons. The logical part xm always describes semantics, whereas xi represents an image, xe a sound, xt a text, xs a motion, and xv a video. Furthermore, a multicon is a generalized icon representing composite multimedia objects (Arndt, Cafiero, & Guercio, 1997). Generalized icons can be combined by means of special icon operators. The latter are dual objects themselves, wherein the logical part is used to combine the logical parts xm of the operand icons, whereas the physical part is used to combine their physical parts xi. For instance, by applying a temporal operator to several icons and an earcon, we might obtain a vicon, with the physical part representing a video, and the logical part describing the video semantics.
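The dual structure of generalized icons and the effect of an icon operator can be sketched in Python. This is only an illustrative sketch: the class `GeneralizedIcon`, the function `temporal_operator`, the frame keys, and the file names are all hypothetical and are not taken from Chang (1996). The logical part is represented as a plain dictionary and the physical part as a reference to a media file.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class GeneralizedIcon:
    """Dual object (xm, xi): a logical part and a physical part."""
    kind: str                # "icon", "earcon", "ticon", "micon", "vicon", or "multicon"
    logical: Dict[str, Any]  # xm: semantics, here a simple frame of attributes
    physical: Any            # media payload, here just a file reference


def temporal_operator(operands: List[GeneralizedIcon], title: str) -> GeneralizedIcon:
    """Hypothetical temporal operator: sequence the operands into a vicon.

    The logical parts of the operands are combined into one logical part,
    and their physical parts are combined into one physical part.
    """
    logical = {
        "title": title,
        "components": [op.logical for op in operands],
        "temporal_order": [op.kind for op in operands],
    }
    physical = [op.physical for op in operands]
    return GeneralizedIcon("vicon", logical, physical)


# Two icons and an earcon combined into a vicon (all values are made up).
frame1 = GeneralizedIcon("icon", {"subject": "stage"}, "frame1.png")
frame2 = GeneralizedIcon("icon", {"subject": "singer"}, "frame2.png")
audio = GeneralizedIcon("earcon", {"subject": "song"}, "song.wav")
clip = temporal_operator([frame1, frame2, audio], title="Live clip")
print(clip.kind, clip.physical)
```

A real operator would also have to synchronise the media streams; the sketch only shows that an operator acts on the logical and physical parts separately.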
In our framework we associate a generalized icon to each complex attribute, using the logical part to describe its semantics and the physical part to describe the physical appearance based on a given storage strategy.
The logical parts of generalized icons will have to be expressed through a semantic model. Conceptual graphs are an example of a semantic model that can be used to describe logical parts of generalized icons (Chang, 1996). Alternatively, the designer can use frames, semantic networks, or visual CD forms (Chang, Polese, Orefice, & Tucci, 1994). As an example, choosing a frame-based representation, an image icon representing the face of a person may be described by a frame with attributes describing the name of the person, the colors of the picture, objects appearing in it, including their spatial relationships. A vicon will contain semantic attributes describing the images of the video photograms, the title of the video, the topic, the duration, the temporal relationships, and so forth.
Based on the specific domain of the multimedia database being constructed, the designer will have to specify the semantics of simple and complex attributes according to the chosen semantic model. Once he or she has accomplished this task, the generalized icons for the multimedia database are completely specified, which provides a semantic specification of the tuples in the database.
As an example, to describe semantics in a database of singers we might use the attributes name, birth date, and genre as alphanumeric attributes; picture as an icon representing the singer’s picture; one or more earcons to represent some of the singer’s songs; and one or more vicons to represent the singer’s video clips. A tuple in this database might describe information about a specific singer, including his or her songs and video clips. This provides a complete semantic specification of the tuple.
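A possible shape for such a tuple is sketched below. The class name `Singer`, the `MediaValue` alias, the frame keys, and all sample values are assumptions made for illustration; each multimedia attribute is stored as a pair of a logical part (a small frame) and a physical part (a file reference).

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple

# (logical part, physical part): a frame of semantic attributes plus a media reference.
MediaValue = Tuple[Dict[str, Any], str]


@dataclass
class Singer:
    # Alphanumeric attributes
    name: str
    birth_date: str
    genre: str
    # Multimedia attributes
    picture: MediaValue                                           # icon
    songs: List[MediaValue] = field(default_factory=list)         # earcons
    video_clips: List[MediaValue] = field(default_factory=list)   # vicons


# A made-up tuple; none of these values come from the source.
singer = Singer(
    name="Example Singer",
    birth_date="1970-01-01",
    genre="jazz",
    picture=({"depicts": "Example Singer", "colors": ["black", "white"]}, "portrait.jpg"),
    songs=[({"title": "Example Song", "duration_s": 215}, "song.wav")],
    video_clips=[({"title": "Example Clip", "topic": "live concert"}, "clip.mpg")],
)
print(singer.name, len(singer.songs), len(singer.video_clips))
```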
NORMAL FORMS IN MULTIMEDIA DATABASES
In traditional relational databases, a functional dependency is defined as a constraint between two sets of attributes from the database. Given two sets of attributes X and Y, a functional dependency between them is denoted by X→Y. The constraint says that, for any two tuples t1 and t2 having t1[X] = t2[X], then t1[Y] = t2[Y]. This concept cannot be immediately applied to multimedia databases because there are no similar simple, efficient methods to compare multimedia attributes. In other words, we need a method for defining equalities between groups of attributes involving complex data types.
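For alphanumeric attributes, the classical definition can be checked directly on a relation instance. The following minimal sketch assumes that a relation is a list of dictionaries and that attribute values are hashable; the helper name `holds_fd` and the sample rows are hypothetical.

```python
from typing import Dict, Hashable, Sequence

Row = Dict[str, Hashable]


def holds_fd(rows: Sequence[Row], X: Sequence[str], Y: Sequence[str]) -> bool:
    """Return True if X -> Y holds in the given relation instance.

    For any two tuples t1, t2 with t1[X] == t2[X], we require t1[Y] == t2[Y].
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in X)
        val = tuple(row[a] for a in Y)
        # setdefault stores val for a new key and returns the stored value otherwise
        if seen.setdefault(key, val) != val:
            return False
    return True


rows = [
    {"name": "A", "genre": "jazz", "label": "L1"},
    {"name": "B", "genre": "jazz", "label": "L2"},
    {"name": "A", "genre": "jazz", "label": "L1"},
]
print(holds_fd(rows, ["name"], ["genre"]))   # True
print(holds_fd(rows, ["genre"], ["label"]))  # False
```

The check relies on exact equality of attribute values, which is precisely what is unavailable for images, sounds, or videos.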
Generally speaking, the matching of complex attributes needs to be based on an approximate match paradigm, such as those used in content-based retrieval from multimedia databases (Schaubl, 1997). In particular, we extend the definition of functional dependency by selecting a specific similarity function and thresholds to perform approximate comparisons of complex data types. Thus, the functional dependencies change if we use different similarity functions. As a consequence, we enrich the notation used for functional dependencies to include symbols representing the chosen similarity function. In what follows, we introduce some basic concepts of similarity theory (Santini & Jain, 1999).
Tuples of a relation can be compared by means of a set of relevant features Φ. For instance, images can be compared using attributes such as color, texture, and shape; audio data can be compared using loudness, pitch, brightness, bandwidth, and harmonicity. The values of each feature F ∈ Φ belong to a domain D = dom(F).
The similarity between two elements x and y in a tuple is based on distance measures in feature spaces (that are assumed to be metric spaces) or, equivalently, on similarity functions. In the following, we will always refer to distance functions, but it should be understood that the same considerations apply to similarity functions, given the symmetry between distance and similarity functions.
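The idea of replacing equality with a thresholded distance can be sketched as follows. This is an assumed form, not the definition given by the authors: two values are treated as matching when their distance does not exceed a threshold, and the dependency is violated when two tuples match on the determining attribute but not on the dependent one. The names `euclidean` and `holds_approx_fd`, the feature vectors, and the thresholds are hypothetical.

```python
import math
from itertools import combinations
from typing import Callable, Dict, Sequence, Tuple

Vector = Tuple[float, ...]
Distance = Callable[[Vector, Vector], float]


def euclidean(u: Vector, v: Vector) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def holds_approx_fd(
    rows: Sequence[Dict[str, Vector]],
    x_attr: str,
    y_attr: str,
    dist_x: Distance,
    dist_y: Distance,
    t_x: float,
    t_y: float,
) -> bool:
    """Assumed approximate dependency: tuples within t_x on x_attr
    must also be within t_y on y_attr."""
    for r1, r2 in combinations(rows, 2):
        similar_x = dist_x(r1[x_attr], r2[x_attr]) <= t_x
        similar_y = dist_y(r1[y_attr], r2[y_attr]) <= t_y
        if similar_x and not similar_y:
            return False
    return True


# Made-up feature vectors for three images.
rows = [
    {"color": (0.10, 0.20, 0.30), "shape": (1.0, 0.0)},
    {"color": (0.12, 0.21, 0.29), "shape": (0.9, 0.1)},
    {"color": (0.90, 0.80, 0.70), "shape": (0.0, 1.0)},
]
print(holds_approx_fd(rows, "color", "shape", euclidean, euclidean, 0.05, 0.2))  # True
print(holds_approx_fd(rows, "color", "shape", euclidean, euclidean, 0.05, 0.1))  # False
```

The two calls differ only in the threshold on the dependent attribute, which shows that whether a dependency holds depends on the chosen distance functions and thresholds.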