
Rivero, L. Encyclopedia of database technologies and applications. 2006
Service Mechanism Quality for Enhanced Mobile Multimedia Database Query Processing
Figure 2. Technologies for QoS-based query processing in MM-DBMSs
system to connect to the UMTS wireless network. The server uses a wireless LAN (Wi-Fi) to connect to the wireless network. There are two problems existing in the client site: (1) The display capacity is too low to present the captured image object (the original dimension is 490×269 pixels, while the screen can only accommodate 128×128 pixels); (2) Even if the client can solve the screen resolution problem, the response time is still a problem due to the “out of the upper bound” delay time.
Using extended QoS-enhanced query processing in MM-DBMSs under wireless mobile environments, both issues mentioned above can be solved. The basic idea is to compensate by reducing the final picture quality to 12% of the original, which shrinks the object from 101 KB to 12 KB. In this case, the data transmission time could possibly be kept within 60 ms (since tt = 64KB/ 512KB×50% = 0.5 sec). However, no such adaptive QoS mechanism exists yet.
BACKGROUND
Numerous studies have been carried out on using QoS for mobile database query processing (Bordbar, Derrick, & Waters, 2002; Cao, 2003; Ecklund, Goebel, Plagemann, & Ecklund, 2002; Kazantzidis, 2002; Miloucheva & Tartarelli, 2002; Watson, 2004). Note that existing studies typically assume that all the QoS requirements are specified by users in advance. However, in multimedia applications, it is difficult to predict the size of the targeted object to be retrieved. Existing approaches usually stop the query if the required QoS conditions cannot meet the related statistical or empirical resource utilizations (Braumandl & Kemper, 2003). Stopping the query under an adverse condition is an unreasonable restriction, and adaptive query processing should be supported instead. Consequently, a critical issue to be investigated is how to extend existing QoS principles to deal with wireless mobile environments for multimedia applications.
Technologies for QoS-based query processing in mobile, multimedia DBMS are summarized in Figure 2. Extending query processing in MM-DBMSs to the wireless mobile environment must consider the following: (1) multimedia applications, especially multimedia object retrieval, often need more resources than the deployed wireless/mobile networks and portable devices can provide; (2) precautions against these extra resource requirements cannot be taken in advance by the server, the client, or even the network infrastructure; and (3) we need to extend QoS management in a mobile environment to specify a range of acceptable QoS levels to allow for scaling of multimedia query processing, rather than trying to guarantee specific values or to stop the querying.
A Quantitative Approach
Based on these considerations, we have explored a new quantitative approach to achieve the trade-off between querying results and qualities according to application priorities and capacities, because QoS requirements are preferably described by quantitative figures. Instead of asking for a good network service, the user is asked to request specific measures such as connection speed or delay, each described by a numerical value, for example, 256 Kbps for the speed or 60 ms for the delay. Having a quality term such as good or bad described by a quantitative metric simplifies the process of allocation of that quality to a particular service by the provider and also prevents any possible ambiguity during the user request and service fulfillment process (Ganz, Ganz, & Wongthavarawat, 2004). We have explored an approach to tackle this issue by specifying a range of acceptable QoS requirements for multimedia query processing. We have proposed a QoS-based matrix to support adaptive query processing of object-relational multimedia databases in the context of wireless mobile environments. The proposed QoS-based query processing precision matrix (QQPPM) is based on real-time QoS conditions in wireless networks, the multimedia database’s object properties, and mobile client-site data processing
capability. In our research, we have focused on examining and establishing a server-site QoS conditions profile, wireless network QoS services profile, and client-site QoS requirements profile. Based on these QoS profiles, we have developed a QoS-based query processing precision matrix to approach this issue. The matrix representation is based on the following categorized characteristics.
•The wireless network QoS service profile is the real-time QoS for the particular application based on the conditions of on-site wireless networks;
•The server-site QoS profile is the properties of the multimedia object retrieved from the MM-DBMS server; and
•The client-site QoS profile is the mobile client’s QoS requirements and client-site multimedia data processing capabilities.
The objective of constructing the QQPPM is to create a smallest element as a criterion to reduce queried multimedia object quality between the MM-DBMS server’s original data quality and actual quality that client can access.
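As a rough illustration only, the idea of using the smallest element as the reduction criterion can be sketched as follows. The article does not give the actual contents of the QQPPM, so the ratio names, the way each ratio is defined, and the function `precision_factor` are all assumptions made for this sketch; only the image and screen dimensions and the 12% figure come from the example in the text.

```python
def precision_factor(ratios):
    """Return the smallest ratio, capped at 1.0 so quality is never scaled up.

    Each ratio is assumed to mean (what the client or network can handle)
    divided by (what the original object needs). This is a hypothetical
    stand-in for the matrix elements, not the article's definition.
    """
    return min(1.0, min(ratios.values()))


# Hypothetical ratios for the image example in the text:
# a 490x269 pixel image shown on a 128x128 pixel screen.
ratios = {
    "display_width": 128 / 490,
    "display_height": 128 / 269,
    "network_budget": 0.12,  # assumed to correspond to the 12% quality level
}

factor = precision_factor(ratios)
print(round(factor, 2))
```

Under these assumptions the network budget is the binding constraint, so the object would be scaled to the most restrictive of the three ratios rather than the query being stopped.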
Experiments
At the current time, there is neither a commercial MM-DBMS (Elmasri & Navathe, 2004) nor a real wireless network facility that allows us to implement the QQPPM approach. Therefore, in order to verify that the concept of QQPPM is reasonable and practicable, we have designed experiments that focus on response time, because the limited bandwidth provided by wireless networks conflicts with the extraordinary resources required to transmit a queried multimedia object. We have conducted simulations of transmitting different real-world multimedia data sizes with different infrastructure densities under wireless mobile environments. With the proposed network QoS services, we have investigated the relationships among a scalable number of users, multimedia data sizes, the corresponding systems’ response times, and network traffic.
CONCLUSION
We studied and extended the standard QoS mechanism in the context of wireless mobile environments. The client consideration elements such as hardware facilities, user preference, mobility coverage, and critical quality acceptance are included in the extended QoS framework. Rather than using statistical resource utilizations, we combined all client-site QoS considerations, server-site QoS requirements, and network QoS services into one simple matrix. Based on the balance of all the QoS elements in the matrix, we calculated the precision criterion factor and applied it to process the queried object transmission. We conducted real-world wireless, multimedia database query processing simulations with different queried object sizes and different client densities on a traditional wireless network and a state-of-the-art wireless network and evaluated the relationship among object size, response time, and client densities.
We have reached the following results through this research:
•Multimedia object retrieval needs resource consumption that often exceeds available resource capacities of the deployed wireless mobile networks and portable devices.
•Precautions against extra resource requirements cannot be taken in advance by the server, the client, or even the network infrastructure. We need QoS management in wireless mobile environments to specify a range of acceptable QoS levels to allow for scaling of multimedia query processing, rather than trying either to guarantee specific values or to stop the querying.
•The model of QQPPM is reasonable and practicable. The theoretical studies and experimental practices support its feasibility.
REFERENCES
Bordbar, B., Derrick, J., & Waters, G. (2002). Using UML to specify QoS constraints in ODP. Computer Networks, 40(2), 279-304.
Braumandl, R., & Kemper, A. (2003). Quality of service in an information economy. ACM Transactions on Internet Technology, 3(4), 291-333.
Cao, G. (2003). Integrating distributed channel allocation and adaptive handoff management for QoS-sensitive cellular networks. Wireless Networks, 9, 131-142.
Chalmers, D., & Sloman, M. (1999). A survey of quality of service in mobile computing environments. IEEE Communications Surveys, 2, 2.
Chang, C.-C., Hung, Y.-P., & Shih, T. K. (2002). Future multimedia database and research directions. Hershey, PA: Idea Group.
Ecklund, D. J., Goebel, V., Plagemann, T., & Ecklund, E. F., Jr. (2002). Dynamic end-to-end QoS management middleware for distributed multimedia systems. Multimedia Systems, 8, 431-442.
Elmasri, R., & Navathe, S. (2004). Fundamentals of database systems (4th ed.). Boston: Pearson.
Ganz, A., Ganz, Z., & Wongthavarawat, K. (2004). Multimedia wireless networks: Technologies, standards, and QoS. Upper Saddle River, NJ: Prentice Hall.
Hillborg, M. (2002). Wireless XML developer’s guide. Emeryville, CA: McGraw-Hill/Osborne.
Kazantzidis, M. I. (2002). Adaptive multimedia in wireless IP networks. Unpublished doctoral dissertation, University of California, Los Angeles.
Miloucheva, I., & Tartarelli, S. (2002). QoS roadmap. In Next Generation Networks Initiative Consortium. Retrieved from www.ist-mome.org/document/qos_roadmap_final.pdf
Ramakrishnan, R., & Gehrke, J. (2003). Database management systems (3rd ed.). Boston: McGraw-Hill.
Shih, T. K. (2001). An introduction to multimedia database. Hershey, PA: Idea Group.
Shih, T. K. (2002). Distributed multimedia databases: Techniques and applications. Hershey, PA: Idea Group.
Watson, R. (2004). Data management: Databases and organizations. New York: Wiley.
KEY TERMS
GPRS: Short for General Packet Radio Service, a standard for wireless communications that runs at speeds up to 115 kilobits per second, compared with current GSM’s (Global System for Mobile Communications) 9.6 kilobits. GPRS, which supports a wide range of bandwidths, is an efficient use of limited bandwidth and is particularly suited for sending and receiving small bursts of data, such as e-mail and Web browsing, as well as large volumes of data. Retrieved September 20, 2004, from http://www.webopedia.com/TERM/G/GPRS.html.
Mobile Client-Site Multimedia Data Processing Capabilities: To study and extend the standard QoS mechanism in the context of wireless mobile environments for multimedia applications, the client-site multimedia data processing capabilities can be considered. The related elements such as hardware facilities, user preference, mobility coverage, and critical quality acceptance should be included in the extended QoS framework.
Mobile Multimedia Database Management System (MM-DBMS): Since a traditional DBMS does not support QoS-based objects, it only concentrates on tables that contain a large number of tuples, each of which is of relatively small size. However, a MM-DBMS should support QoS-sensitive multimedia data types in addition to providing all the facilities for DBMS functions. Once multimedia objects such as images, sound clips, and videos are stored in a database, individual objects of very large size have to be handled efficiently. Furthermore, the crucial point is that mobile MM-DBMSs should have the QoS-based capabilities to efficiently and effectively process the multimedia data in wireless mobile environments.
Multimedia Database Object Properties: The properties of a multimedia object refer to the object’s QoS-sensitive characteristics. They can be categorized into several attributes. The nature of an object can be described by frame size, frame rate, color depth, compression, etc.; the quality of object presentation can be determined by delay variation and loss or error rate.
QoS-Based Multimedia Query Processing and Data Retrieval: The current multimedia query processing mechanism provides the three procedures in practice. They are (1) search, (2) browsing, and (3) query refinement.
There are four considerations regarding multimedia data retrieval.
•First, the queried objects must be presented successfully at the client site under the QoS support.
•Second, since the query results may contain long audio segments, large images, or long videos, essential information must be extracted and presented efficiently for clients to browse and select.
•Third, the response time, which is determined by both the network and the database search, should be sufficiently short.
•Fourth, it should be possible to iterate query refinements.
QoS-Based Query Processing Precision Matrix (QQPPM): The QQPPM solution covers the specification of client-site QoS preferences and their quantitative relationship to wireless network QoS conditions, and also the specification of the server-site QoS profile and their quantitative relationship to the real-time wireless network QoS conditions. QQPPM will be one of the
functions in MM-DBMSs. The objective of QQPPM is to create a smallest element as a criterion to reduce queried object quality between the server’s original data quality and the actual quality that the client can access. With the QQPPM, the multimedia query result can be displayed on the client site no matter what the queried object characteristics are and no matter what network resources are available.
Quality of Service (QoS) Management: QoS management refers to a set of specific requirements for a particular service provided by a network to users. In general, QoS requirements are in accordance with the perceived QoS based on data transmission and application type. These requirements are usually described by quantitative figures that are more or less related to the technology behind the network service, and thus a user will find limited flexibility in changing the profile after subscription to the service. In the context of the wireless mobile environment, these requirements can be categorized mainly into four attribute types: (1) bandwidth, (2) timeliness, (3) mobility, and (4) reliability.
UMTS: Short for Universal Mobile Telecommunications System, a 3G mobile technology that will deliver broadband information at speeds up to 2 Mbits/sec. Besides voice and data, UMTS will deliver audio and video to wireless devices anywhere in the world through fixed, wireless and satellite systems. Retrieved September 20, 2004, from http://www.webopedia.com/TERM/U/UMTS.Html.
Set Comparison in Relational Query Languages
Mohammad Dadashzadeh
Oakland University, USA
INTRODUCTION
Today’s de facto database standard, the relational database, was conceived in the late 1960s by Edgar F. Codd at IBM. The relational data model offered the user a logical view of the data that was shielded from consideration of how the data would, in fact, be physically organized in storage. This feat was accomplished in large part by the introduction of relational query languages that would specify the desired set of records in a non-procedural fashion. In contrast to the prevailing record-at-a-time, loop-oriented, procedural query languages of the hierarchical and network database management systems, relational query languages were set-oriented in that they would operate on sets of records (i.e., relations or tables) at a time in order to produce the desired set of output records. Codd introduced both a relational algebra and a relational calculus as a basis for dealing with data in relational form. Indeed, he defined the first relational language: Data Sublanguage Alpha (Codd, 1971).
The non-procedural nature of relational query languages made it possible to envision that end users could be expected to formulate ad hoc queries without resorting to a programmer. To that end, RDBMS adoption was thought to be facilitated by creation of an English-like query language. The language created for this purpose at IBM was called SEQUEL (Structured English Query Language), though it eventually grew in scope to handle other tasks including database modification, definition, authorization, and transaction processing (Chamberlin et al., 1976). At about the same time, another IBM research group produced Query by Example (Zloof, 1975), which because of its graphical interface proved to be easier to use for casual users. However, the wider applicability of SEQUEL led to its adoption and standardization as SQL (Structured Query Language) between 1982 and 1986.
As a relational query language, SEQUEL borrowed features from both relational algebra and relational calculus. However, in an effort to appeal to end users, the expressive power of relational calculus quantification (universal, for all, and existential, there exists) was somewhat sacrificed in favor of algebraic grouping (Group By and SET operations). Unfortunately, the balanced approach of SEQUEL to relational calculus and relational algebra was abandoned in SQL, resulting in undue complexity when formulating queries requiring universal quantification. This article examines the shortcomings of relational query languages in formulating such set comparison queries and proposes solutions to overcome them with minimal effort.
BACKGROUND
Consider the following relational database about suppliers, parts, and jobs. (The primary key of each relation is underlined.)
SUPPLIER( S#, SName, Status, City )
PART( P#, PName, Color, City )
JOB( J#, JName, City )
SHIPMENT( S#, P#, J#, QTY )
The relation SHIPMENT records the quantity of each part being shipped by each supplier to various jobs. An instance of this database is depicted below.
Table 1. Supplier.db
Table 2. Part.db
Copyright © 2006, Idea Group Inc., distributing in print or electronic forms without written permission of IGI is prohibited.
Table 3. Job.db
Table 4. Shipment.db
Now, consider the following queries:
•Q1: List the suppliers who ship every red part. (Answer: S5)
•Q2: List the suppliers who do not ship to any job located in London. (Answer: S1 and S3)
•Q3: List the jobs that are only receiving parts warehoused in London. (Answer: None)
•Q4: List the suppliers who are shipping to exactly the same jobs as supplier S1. (Answer: None)
•Q5: List the suppliers who are shipping exactly the same parts to jobs located in London as they are shipping to jobs located in Athens. (Answer: S2 and S4)
Each of the above queries involves comparison of sets of values in two tables. For example, in Q1, the set of parts (P# values) associated with each supplier (distinct S# value) in the SHIPMENT table must be examined to determine if it contains the set of parts (P# values) in the PART table sharing the value of “Red” for the COLOR attribute.
Despite their innocuous appearances, queries involving set comparison that tend to arise frequently in online analytical processing (OLAP) situations are especially difficult to formulate in relational query languages (Blanning, 1993; Celko, 1997; Matos & Grasser, 2002; Rao, Badia, & Van Gucht, 1996). This article summarizes the existing approaches and the proposed solutions to set comparison queries in relational algebra, Query by Example, and SQL.
SET COMPARISON IN RELATIONAL ALGEBRA
In relational algebra (Ramakrishnan & Gehrke, 2003), the DIVISION operator provides the mechanism by which a restricted form of set comparison may be directly formulated. To fix ideas, consider the following formulation of query Q1 in relational algebra.
Q1: List the suppliers who ship every red part.
Temp1:= SELECTION (PART) using Color = “Red”
Temp2:= PROJECTION (SHIPMENT) using S# and P#
Result:= DIVISION (Temp2, Temp1) using Temp2.P# and Temp1.P#
Here, Temp1 is produced by performing the SELECTION operation on the PART table using the selection condition Color = “Red.” Next, the Temp2 intermediate table is formed by projecting columns S# and P# from
the SHIPMENT table. The resulting Temp2 table is divided by Temp1, using P# as the dividing column in each table, to produce the Result table as follows. The rows of Temp2 are grouped according to S# (i.e., all the columns in the table excluding the dividing column). For each group, the set of values for P# (i.e., the dividing column) is compared to see if it contains every value in the set of P# (i.e., the dividing column) values in table Temp1. If and only if it does, then the value of S# for that group is placed in the Result table.
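The grouping and containment test performed by DIVISION can be sketched in a few lines of Python. The function name `division` and the sample rows are assumptions made for illustration; the rows are not the instance shown in the article's tables.

```python
from collections import defaultdict


def division(pairs, divisor):
    """Relational DIVISION on (group, value) pairs.

    Returns the groups whose set of values contains every value in
    `divisor`, which is the only set comparison DIVISION supports.
    """
    values_by_group = defaultdict(set)
    for group, value in pairs:
        values_by_group[group].add(value)
    wanted = set(divisor)
    return {group for group, values in values_by_group.items() if values >= wanted}


# Hypothetical (S#, P#) rows standing in for Temp2.
temp2 = [
    ("S1", "P1"),
    ("S2", "P1"), ("S2", "P3"),
    ("S3", "P1"), ("S3", "P2"), ("S3", "P3"),
]
# Hypothetical P# values of the red parts, standing in for Temp1.
temp1 = ["P1", "P3"]

result = division(temp2, temp1)
print(sorted(result))  # ['S2', 'S3']
```

S3 qualifies even though it also ships a part that is not red, because containment does not require the two sets to be equal.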
Unfortunately, the DIVISION operation only supports containment for set comparison. This means that, for example, to list the suppliers who are shipping to exactly the same jobs as supplier S1 (i.e., query Q4 from above), one must resort to additional operations and intermediate results:
Temp1:= SELECTION (SHIPMENT) using S# = “S1”
Temp2:= PROJECTION (Temp1) using J#
Temp3:= SELECTION (SHIPMENT) using S# <> “S1”
Temp4:= PROJECTION (Temp3) using S# and J#
Temp5:= DIVISION (Temp4, Temp2) using Temp4.J# and Temp2.J#
Temp6:= RESTRICTION (SHIPMENT, Temp2) using SHIPMENT.J# NOT IN Temp2.J#
Temp7:= PROJECTION (Temp6) using S#
Result:= DIFFERENCE (Temp5, Temp7)
Here, the RESTRICTION operation is used to keep only those rows of SHIPMENT for which the J# value does not appear in table Temp2. The S# values in such rows represent suppliers who are shipping to at least one job not currently shipped to by S1. Removing such S# values from the result of the DIVISION operation will give us the suppliers who are shipping to exactly the same jobs as supplier S1 does.
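The two-step construction (containment first, then removal of suppliers that ship to any extra job) can be sketched as follows. The function name `same_jobs_as` and the sample rows are assumptions for illustration only; the comments map each intermediate set to the corresponding step described above.

```python
def same_jobs_as(shipments, reference):
    """Suppliers whose set of jobs equals that of `reference`.

    Set equality is built from containment followed by a difference,
    since DIVISION alone only supports containment.
    """
    # Jobs shipped to by the reference supplier (Temp2).
    ref_jobs = {job for supplier, job in shipments if supplier == reference}
    # (S#, J#) pairs of all other suppliers (Temp4).
    others = {(s, j) for s, j in shipments if s != reference}
    # Suppliers whose jobs contain every reference job (the DIVISION step).
    contains_all = {
        s for s, _ in others
        if ref_jobs <= {j for t, j in others if t == s}
    }
    # Suppliers shipping to at least one job outside the reference set (Temp7).
    ships_elsewhere = {s for s, j in shipments if j not in ref_jobs}
    # The DIFFERENCE step.
    return contains_all - ships_elsewhere


# Hypothetical (S#, J#) rows, not the article's data.
shipments = [
    ("S1", "J1"), ("S1", "J2"),
    ("S2", "J1"), ("S2", "J2"), ("S2", "J3"),
    ("S3", "J1"), ("S3", "J2"),
    ("S4", "J1"),
]

result = same_jobs_as(shipments, "S1")
print(sorted(result))  # ['S3']
```

In this sample, S2 passes the containment test but is removed by the difference because it also ships to J3, while S4 never passes containment.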
A Proposed Solution For Relational Algebra
An improved relational operator called generalized division (GD) overcomes the shortcomings of the DIVISION operation (Dadashzadeh, 1989). The GD operator takes two relations as its inputs (e.g., SHIPMENT and PART) and produces an output table that has the same structure as the first input table (e.g., SHIPMENT). A column from the first (left) relation is designated as the grouping attribute (e.g., S# in SHIPMENT), and compatible columns from each relation are specified as the dividing columns (e.g., P# in SHIPMENT and P# in PART). A desired set comparison operation from the following list: EQUALS, IS NOT EQUAL TO, CONTAINS, DOES NOT CONTAIN, IS IN, and IS NOT IN completes the generalized division operation specification (e.g., EQUALS).
Using the GD operation, query Q1 is formulated as follows:
Q1: List the suppliers who ship every red part.
Temp1:= SELECTION (PART) using Color = “Red”
Temp2:= GD (SHIPMENT, Temp1)
    using (S#) SHIPMENT.P# CONTAINS Temp1.P#
Result:= PROJECTION (Temp2) using S#
Here, the GD operator would construct the Temp2 table as follows. First, the rows of SHIPMENT are grouped (sorted) on the basis of their value for the grouping attribute S#. Next, for each group of rows from SHIPMENT (groups are distinguished by the value of the grouping attribute), the set of SHIPMENT.P# values appearing in the rows of that group is determined and compared against the set of Temp1.P# values. If the former set contains the latter, then that group of rows from SHIPMENT is passed to the Temp2 table. Otherwise, they are filtered and would not appear in the output table produced by the GD operation.
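The behaviour just described can be approximated with a short Python sketch. The function `gd`, its parameter order, the `COMPARISONS` table, and the sample rows are all assumptions made for illustration; they are not the operator's published definition, and they read IS IN as "the group's set is a subset of the right-hand set".

```python
from collections import defaultdict

# The six set comparison operators named in the text, applied to
# (set of values in the group, set of values in the right-hand relation).
COMPARISONS = {
    "EQUALS": lambda a, b: a == b,
    "IS NOT EQUAL TO": lambda a, b: a != b,
    "CONTAINS": lambda a, b: a >= b,
    "DOES NOT CONTAIN": lambda a, b: not (a >= b),
    "IS IN": lambda a, b: a <= b,
    "IS NOT IN": lambda a, b: not (a <= b),
}


def gd(left, group_col, left_col, op, right, right_col):
    """Keep the rows of `left` whose group passes the set comparison `op`."""
    compare = COMPARISONS[op]
    right_values = {row[right_col] for row in right}
    groups = defaultdict(list)
    for row in left:
        groups[row[group_col]].append(row)
    kept = []
    for rows in groups.values():
        if compare({row[left_col] for row in rows}, right_values):
            kept.extend(rows)  # output keeps the structure of `left`
    return kept


# Hypothetical rows, not the article's data.
shipment = [
    {"S#": "S1", "P#": "P1"},
    {"S#": "S2", "P#": "P1"}, {"S#": "S2", "P#": "P3"},
    {"S#": "S3", "P#": "P1"}, {"S#": "S3", "P#": "P2"}, {"S#": "S3", "P#": "P3"},
]
red_parts = [{"P#": "P1"}, {"P#": "P3"}]


def suppliers(op):
    """Project the GD result on S#."""
    return sorted({row["S#"] for row in gd(shipment, "S#", "P#", op, red_parts, "P#")})


print(suppliers("CONTAINS"))  # ['S2', 'S3']
print(suppliers("EQUALS"))    # ['S2']
print(suppliers("IS IN"))     # ['S1', 'S2']
```

Only the comparison changes between the three calls, which is the point of the generalization: the grouping step is shared and the operator name selects the set test.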
The improvement in the expressive power of the GD operation is more clearly seen when the set comparison operation is other than CONTAINS. The following formulations of query Q3 and query Q4 demonstrate this.
Q3: List the jobs that are only receiving parts warehoused in London.
Temp1:= SELECTION (PART) using City = “London”
Temp2:= GD (SHIPMENT, Temp1)
    using (J#) SHIPMENT.P# IS IN Temp1.P#
Result:= PROJECTION (Temp2) using J#
Q4: List the suppliers who are shipping to exactly the same jobs as supplier S1.
Temp1:= SELECTION (SHIPMENT) using S# = “S1”
Temp2:= GD (SHIPMENT, Temp1)
    using (S#) SHIPMENT.J# EQUALS Temp1.J#
Result:= PROJECTION (Temp2) using S#
An extension of the GD operation called grouped generalized division (GGD) provides for comparison of sets of values associated with matching groups of tuples in two relations (Dadashzadeh, 1989). Such matching of groups of rows in two tables and the associated set comparison are at the heart of the solution to formulating query Q5.
Q5: List the suppliers who are shipping exactly the same parts to jobs located in London as they are shipping to jobs located in Athens.
Temp1:= JOIN (SHIPMENT, JOB) using SHIPMENT.J# = JOB.J#
Temp2:= SELECTION (Temp1) using City = “Athens”
Temp3:= SELECTION (Temp1) using City = “London”
Temp4:= GGD (Temp3, Temp2)
    using (Temp3.S#) Temp3.P# EQUALS Temp2.P# (Temp2.S#)
Result:= PROJECTION (Temp4) using S#
Here, the GGD operator would construct the Temp4 table as follows. First, the rows of Temp3 are grouped on the basis of their value for the grouping attribute Temp3.S#. Similarly, the rows of Temp2 are grouped on the basis of their value for the grouping attribute Temp2.S#. Next, for each group of rows from Temp3 (groups are distinguished by their common value for the grouping attribute), the set of Temp3.P# values appearing in the rows of that group is determined and compared against the set of Temp2.P# values appearing in the rows of the matching group (i.e., the group sharing the same value for S#) in Temp2. If the two sets of P# values are equal, then that group of rows from Temp3 is passed to the Temp4 table. Otherwise, they are filtered and would not appear in the output table produced by the GGD operation.
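A minimal Python sketch of this group-to-group comparison follows, restricted to the EQUALS case. The names `group_sets` and `ggd_equals` and the sample rows are assumptions for illustration. The sketch also assumes that a group with no matching group on the right-hand side is compared against the empty set, a detail the description above does not spell out.

```python
from collections import defaultdict


def group_sets(rows, group_col, value_col):
    """Map each grouping value to the set of values seen in its rows."""
    sets = defaultdict(set)
    for row in rows:
        sets[row[group_col]].add(row[value_col])
    return sets


def ggd_equals(left, right, group_col, value_col):
    """Keep rows of `left` whose group's value set equals that of the
    matching group (same grouping value) in `right`."""
    left_sets = group_sets(left, group_col, value_col)
    right_sets = group_sets(right, group_col, value_col)
    return [
        row for row in left
        if left_sets[row[group_col]] == right_sets.get(row[group_col], set())
    ]


# Hypothetical rows: shipments to London jobs (Temp3) and Athens jobs (Temp2).
london = [
    {"S#": "S2", "P#": "P1"}, {"S#": "S2", "P#": "P3"},
    {"S#": "S4", "P#": "P2"},
    {"S#": "S5", "P#": "P1"},
]
athens = [
    {"S#": "S2", "P#": "P3"}, {"S#": "S2", "P#": "P1"},
    {"S#": "S4", "P#": "P1"},
    {"S#": "S6", "P#": "P1"},
]

temp4 = ggd_equals(london, athens, "S#", "P#")
result = sorted({row["S#"] for row in temp4})
print(result)  # ['S2']
```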
The generality and conciseness offered by the GD and GGD operators present a strong case for their inclusion in a relational algebra interface. The prospect that they can be provided with minimal effort, even in an existing algebraic language, should make their provision difficult to ignore. That prospect is in fact a reality since the GD and GGD operators can be expressed in terms of other relational operations (Dadashzadeh, 1989).
Figure 1. Suppliers that ship every red part
Figure 2. Suppliers that do not ship to any job located in London
Figure 3. Jobs that only receive parts warehoused in London
Figure 4. Suppliers shipping to exactly the same jobs as supplier S1
SET COMPARISON IN QUERY BY EXAMPLE
QBE, like SQL, was developed at IBM, but a number of other DBMSs, such as Paradox, have adopted QBE-like interfaces. Some systems, such as Microsoft Access, offer partial support for graphical queries, which reflects the influence of QBE. In its original definition, QBE provides little support for set comparison queries. In fact, Date (1992) points out that QBE is not relationally complete because it does not manage DIVISION (the algebraic counterpart to universal quantification) appropriately. However, Ramakrishnan and Gehrke (2003) demonstrate that the DIVISION operation can be expressed in IBM’s QBE either by using the aggregate operator COUNT (see also Dadashzadeh, 2003; Matos & Grasser, 2002) or by using the data definition commands to create an auxiliary relation or view.
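The counting idea is not specific to QBE. As a hedged illustration, the sketch below expresses the same test in SQL and runs it through Python's built-in sqlite3 module: a supplier ships every red part when the number of distinct red parts it ships equals the number of red parts. This is an SQL analogue, not QBE syntax. The column names `sno`, `pno`, and `jno` replace S#, P#, and J# for convenience, and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE part (pno TEXT PRIMARY KEY, color TEXT);
    CREATE TABLE shipment (sno TEXT, pno TEXT, jno TEXT);
    INSERT INTO part VALUES ('P1', 'Red'), ('P2', 'Blue'), ('P3', 'Red');
    INSERT INTO shipment VALUES
        ('S1', 'P1', 'J1'),
        ('S2', 'P1', 'J1'), ('S2', 'P3', 'J2'),
        ('S3', 'P1', 'J1'), ('S3', 'P2', 'J1'), ('S3', 'P3', 'J2');
""")

# Count the distinct red parts per supplier and compare with the total
# number of red parts.
rows = conn.execute("""
    SELECT s.sno
    FROM shipment AS s JOIN part AS p ON p.pno = s.pno
    WHERE p.color = 'Red'
    GROUP BY s.sno
    HAVING COUNT(DISTINCT s.pno) = (SELECT COUNT(*) FROM part WHERE color = 'Red')
    ORDER BY s.sno
""").fetchall()

suppliers = [row[0] for row in rows]
print(suppliers)  # ['S2', 'S3']
```

The counting formulation handles containment only. An equality test would need a second count over all parts the supplier ships, not just the red ones.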
In contrast, Paradox’s QBE provides special set operators (SET, EVERY, NO, ONLY, and EXACTLY) that directly support the formulation of such queries, as illustrated in Figures 1-4 (Dadashzadeh, 2002):
In this QBE formulation shown in Figure 1, Paradox’s SET operator is used to define a set named XYZ as consisting of the P# of all red parts in the PART table. Then, Paradox’s set comparison operator EVERY is used to indicate that from the SHIPMENT table only those S#
values should be printed out that appear with EVERY value in the set XYZ.
The clarity afforded by the use of set operators in Paradox’s QBE unfortunately falls short when the query calls for comparison of sets of values associated with matching groups of tuples in two relations. Since Paradox considers the use of example elements in its set specification for the purpose of identifying a matching group of rows to be ambiguous, queries such as Q5 can only be formulated using a programming-like series of update commands to create and utilize auxiliary tables.
SET COMPARISON IN SQL
SQL does not provide direct support for comparing two sets. In fact, SQL does not provide operators to perform set intersection or set difference operations where it is required to compare two union-compatible tables for rows that are common to both or that are in one and not in the other. In order to formulate set intersection or set difference operations, the SQL user is expected to construct a query using two of the more difficult concepts in SQL: correlated subquery and the EXISTS function.
To demonstrate, consider the SQL query to list the supplier-part number pairs that reflect a shipment to J1 but that are not involved in shipments to J2. In Oracle’s implementation of SQL that supports a set difference operation, this query is formulated as:
(SELECT S#, P#
 FROM SHIPMENT
 WHERE J# = “J1”)
MINUS
(SELECT S#, P#
 FROM SHIPMENT
 WHERE J# = “J2”)
While, in standard SQL, the formulation becomes:
SELECT S#, P#
FROM SHIPMENT X
WHERE J# = “J1”
  AND NOT EXISTS
      (SELECT *
       FROM SHIPMENT
       WHERE J# = “J2”
         AND S# = X.S# AND
         P# = X.P#)
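The equivalence of the two formulations can be checked on a small example. The sketch below uses Python's built-in sqlite3 module only because it is readily available. SQLite spells the difference operator EXCEPT where Oracle uses MINUS. The column names `sno`, `pno`, and `jno` stand in for S#, P#, and J#, and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shipment (sno TEXT, pno TEXT, jno TEXT);
    INSERT INTO shipment VALUES
        ('S1', 'P1', 'J1'), ('S1', 'P1', 'J2'),
        ('S2', 'P3', 'J1'),
        ('S3', 'P2', 'J2');
""")

# Formulation with an explicit set difference operator.
with_except = conn.execute("""
    SELECT sno, pno FROM shipment WHERE jno = 'J1'
    EXCEPT
    SELECT sno, pno FROM shipment WHERE jno = 'J2'
""").fetchall()

# Formulation with a correlated subquery and NOT EXISTS.
with_not_exists = conn.execute("""
    SELECT DISTINCT x.sno, x.pno
    FROM shipment AS x
    WHERE x.jno = 'J1'
      AND NOT EXISTS (SELECT *
                      FROM shipment AS y
                      WHERE y.jno = 'J2'
                        AND y.sno = x.sno
                        AND y.pno = x.pno)
""").fetchall()

print(sorted(with_except))      # [('S2', 'P3')]
print(sorted(with_not_exists))  # [('S2', 'P3')]
```

DISTINCT is added to the second query in this sketch because EXCEPT removes duplicate rows, whereas a plain SELECT does not.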
The complexity in formulating set difference and set intersection operations in SQL becomes much more pronounced when dealing with queries involving set comparison for matching groups of rows in tables, rather than entire tables, and especially when considering set comparison operations such as equality or containment (Dadashzadeh, 2001). Witness the following formulations of the example queries Q1 and Q5.
Q1: List the suppliers who ship every red part.
SELECT DISTINCT S#
FROM SHIPMENT X
WHERE NOT EXISTS
      (SELECT *
       FROM PART
       WHERE Color = “Red”
         AND P# NOT IN
             (SELECT P#
              FROM SHIPMENT
              WHERE S# = X.S#) )
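The double negation reads as "there is no red part that this supplier does not ship". A small sqlite3 run, given here only as a sketch, shows the pattern at work. The column names `sno`, `pno`, and `jno` stand in for S#, P#, and J#, and the rows are hypothetical, not the article's instance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE part (pno TEXT PRIMARY KEY, color TEXT);
    CREATE TABLE shipment (sno TEXT, pno TEXT, jno TEXT);
    INSERT INTO part VALUES ('P1', 'Red'), ('P2', 'Blue'), ('P3', 'Red');
    INSERT INTO shipment VALUES
        ('S1', 'P1', 'J1'),
        ('S2', 'P1', 'J1'), ('S2', 'P3', 'J2'),
        ('S3', 'P1', 'J1'), ('S3', 'P2', 'J1'), ('S3', 'P3', 'J2');
""")

# Keep a supplier when no red part is missing from its shipped parts.
rows = conn.execute("""
    SELECT DISTINCT x.sno
    FROM shipment AS x
    WHERE NOT EXISTS
          (SELECT *
           FROM part AS p
           WHERE p.color = 'Red'
             AND p.pno NOT IN (SELECT s.pno
                               FROM shipment AS s
                               WHERE s.sno = x.sno))
    ORDER BY x.sno
""").fetchall()

suppliers = [row[0] for row in rows]
print(suppliers)  # ['S2', 'S3']
```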
Q5: List the suppliers who are shipping exactly the same parts to jobs located in London as they are shipping to jobs located in Athens.
SELECT S#
FROM SUPPLIER X
WHERE NOT EXISTS
      (SELECT *
       FROM SHIPMENT, JOB
       WHERE S# = X.S#
         AND SHIPMENT.J# = JOB.J#
         AND City = “Athens”
         AND P# NOT IN
             (SELECT P#
              FROM SHIPMENT, JOB
              WHERE S# = X.S#
                AND SHIPMENT.J# = JOB.J#
                AND City = “London”) )
  AND NOT EXISTS
      (SELECT *
       FROM SHIPMENT, JOB
       WHERE S# = X.S#
         AND SHIPMENT.J# = JOB.J#
         AND City = “London”
         AND P# NOT IN
             (SELECT P#
              FROM SHIPMENT, JOB
              WHERE S# = X.S#
                AND SHIPMENT.J# = JOB.J#
                AND City = “Athens”) )
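Set equality is expressed here as containment in both directions, each direction being a double negation. The sqlite3 sketch below builds the two halves from one template to make the symmetry visible. The helper name `ONE_WAY`, the column names `sno`, `pno`, `jno`, and `city`, and the rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supplier (sno TEXT PRIMARY KEY);
    CREATE TABLE job (jno TEXT PRIMARY KEY, city TEXT);
    CREATE TABLE shipment (sno TEXT, pno TEXT, jno TEXT);
    INSERT INTO supplier VALUES ('S1'), ('S2'), ('S3'), ('S4');
    INSERT INTO job VALUES ('J1', 'London'), ('J2', 'Athens'), ('J3', 'Paris');
    INSERT INTO shipment VALUES
        ('S1', 'P1', 'J1'), ('S1', 'P1', 'J2'),
        ('S2', 'P1', 'J1'), ('S2', 'P2', 'J2'),
        ('S3', 'P1', 'J1'),
        ('S4', 'P1', 'J3');
""")

# "No part shipped to {c1} is missing from the parts shipped to {c2}",
# i.e. the {c1} set is contained in the {c2} set for supplier x.
ONE_WAY = """
    NOT EXISTS
        (SELECT *
         FROM shipment AS a JOIN job AS ja ON ja.jno = a.jno
         WHERE a.sno = x.sno AND ja.city = '{c1}'
           AND a.pno NOT IN
               (SELECT b.pno
                FROM shipment AS b JOIN job AS jb ON jb.jno = b.jno
                WHERE b.sno = x.sno AND jb.city = '{c2}'))
"""

query = (
    "SELECT x.sno FROM supplier AS x WHERE "
    + ONE_WAY.format(c1="Athens", c2="London")
    + " AND "
    + ONE_WAY.format(c1="London", c2="Athens")
    + " ORDER BY x.sno"
)

suppliers = [row[0] for row in conn.execute(query)]
print(suppliers)  # ['S1', 'S4']
```

In this sample, S4 ships to neither city and still qualifies, because two empty sets are equal. A formulation that starts from SUPPLIER behaves this way unless an extra condition excludes such suppliers.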
A PROPOSED SOLUTION FOR SQL
The undue complexity in formulating queries involving set comparison was avoided, to a large extent, in SEQUEL2 (Chamberlin et al., 1976), the forerunner of SQL. In SEQUEL2, the EXISTS function is nonexistent. Instead, SEQUEL2 provides explicit support for set comparison in two ways. First, the built-in function SET in SEQUEL2 can be used in conjunction with the GROUP BY and HAVING operators to compare a set of values associated with a group of rows with the set of values derived from another table. The set comparison operators supported consist of: IS EQUAL TO; IS NOT EQUAL TO; CONTAINS; DOES NOT CONTAIN; IS IN; and IS NOT IN. The example queries Q1 and Q5 could be formulated in SEQUEL2 as follows.
Q1: List the suppliers who ship every red part. (SEQUEL2 using SET)
SELECT DISTINCT S#
FROM SHIPMENT
GROUP BY S#
HAVING SET(P#) CONTAINS
       (SELECT DISTINCT P#
        FROM PART
        WHERE Color = “Red”)
Q5: List the suppliers who are shipping exactly the same parts to jobs located in London as they are shipping to jobs located in Athens. (SEQUEL2 using SET)
SELECT DISTINCT S#
FROM SHIPMENT X, JOB
WHERE SHIPMENT.J# = JOB.J#
  AND City = “London”
GROUP BY S#
HAVING SET(P#) IS EQUAL TO
       (SELECT DISTINCT P#
        FROM SHIPMENT, JOB
        WHERE S# = X.S#
          AND SHIPMENT.J# = JOB.J#
          AND City = “Athens”)
The second way in which set comparison can be performed in SEQUEL2 is by direct comparison of compatible sets in the WHERE clause. The example queries Q1 and Q5 could be formulated in SEQUEL2 in the following, decidedly less complex, fashion.
Q1: List the suppliers who ship every red part. (SEQUEL2 without using SET)
SELECT S#
FROM SUPPLIER X
WHERE (SELECT DISTINCT P#
       FROM SHIPMENT
       WHERE S# = X.S#)
      CONTAINS
      (SELECT P#
       FROM PART
       WHERE Color = “Red”)
Q5: List the suppliers who are shipping exactly the same parts to jobs located in London as they are shipping to jobs located in Athens. (SEQUEL2 without using SET)
SELECT S#
FROM SUPPLIER X
WHERE (SELECT DISTINCT P#
       FROM SHIPMENT, JOB
       WHERE S# = X.S#
         AND SHIPMENT.J# = JOB.J#
         AND City = “London”)
      IS EQUAL TO
      (SELECT DISTINCT P#
       FROM SHIPMENT, JOB
       WHERE S# = X.S#
         AND SHIPMENT.J# = JOB.J#
         AND City = “Athens”)
Clearly, the only requirement to support this second approach to formulating set comparisons in SQL is to directly accommodate set comparison operators such as IS EQUAL TO. The other construct employed in the SEQUEL2 formulations above, correlated subqueries, is already in place in SQL. In an unfortunate affront to human factor engineering, the current SQL standard expects the user to reinvent the set comparison opera-