Rivero L.Encyclopedia of database technologies and applications.2006


Discovering Association Rules in Temporal Databases

possible to discover pairs of rules that differ in just one item and are complementary with respect to the lifespans of those differing items. In this case, we say that such items are substitutes for each other.


KEY TERMS

Association Rule: A statement A => B, which states that if A is true then we can expect B to be true with a certain degree of confidence. A and B are sets of items, and the => operator is interpreted as “implies.”

Association Rule Mining: The data mining task of finding all association rules existing in a database, having support and confidence greater than a minimum support value and a minimum confidence value.

Confidence of a Rule: Percentage of the rows that contain the antecedent that also contain the consequent of the rule. The confidence of a rule gives us an idea of the strength of the influence that the antecedent has on the presence of the consequent of the rule.

Data Mining: The nontrivial extraction of implicit, previously unknown, and potentially useful information from data.

Lifespan: The time over which a database object is defined.

Lift: A measure used to determine the value of an association rule that tells us how much the presence of the antecedent influences the appearance of the consequent.

Market Basket Analysis: The process of looking at the transaction or market basket data to determine product affinities for each item in the basket.

Support of a Rule: The fraction of the rows of the database that contain both the antecedent and the consequent of the rule. The support of a rule tells us in how many instances (rows) the rule can be observed.

Temporal Database: A database that supports some aspect of time, not counting user-defined time.
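The definitions of support, confidence, and lift above can be made concrete with a short Python sketch. The function names and the toy rows are assumptions made for illustration. The lift formula used here (confidence divided by the support of the consequent) is the usual textbook definition and is not spelled out in the key terms themselves.

```python
from typing import FrozenSet, Iterable, List

Row = FrozenSet[str]

def support(rows: List[Row], items: Iterable[str]) -> float:
    """Fraction of rows that contain every item in `items`."""
    wanted = frozenset(items)
    return sum(1 for row in rows if wanted <= row) / len(rows)

def confidence(rows: List[Row], antecedent: Iterable[str], consequent: Iterable[str]) -> float:
    """Share of the rows containing the antecedent that also contain the consequent."""
    both = frozenset(antecedent) | frozenset(consequent)
    # Assumes the antecedent occurs in at least one row (otherwise division by zero).
    return support(rows, both) / support(rows, antecedent)

def lift(rows: List[Row], antecedent: Iterable[str], consequent: Iterable[str]) -> float:
    """Confidence relative to how often the consequent appears anyway (assumed definition)."""
    return confidence(rows, antecedent, consequent) / support(rows, consequent)

# Hypothetical market basket data: four rows (transactions).
ROWS: List[Row] = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

rule_support = support(ROWS, {"bread", "milk"})          # rows with both items
rule_confidence = confidence(ROWS, {"bread"}, {"milk"})  # bread => milk
rule_lift = lift(ROWS, {"bread"}, {"milk"})
```

With these rows, a lift below 1 for bread => milk would indicate that bread makes milk slightly less likely than its overall frequency suggests.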


Document Versioning in Digital Libraries

Mercedes Martínez-González

Universidad de Valladolid, Spain

INTRODUCTION

Digital libraries are systems that contain organized collections of objects, serving in their most basic functions as a mirror of the traditional library that contains paper documents. Most of the information contained in the collections of a digital library consists of documents, which can evolve with time. That is, a document can be modified to obtain a new document, and digital library users may want access to any of those versions. This introduces in digital libraries the problem of versioning, a problem that has also been considered in a related community, the hypertext community (hypermedia in its broadest sense). Some domains in which document evolution is very important are the legislative domain (Arnold-Moore, 2000; Martínez-González, 2001; Vitali, 1999), the management of errata made to scientific articles (Poworotznek, 2003), and software construction (Conradi & Westfechtel, 1998).

In the legislative domain, rules undergo amendments that result in new versions of the amended rules. Access to all versions of a document is an important facility for their users; for example, to understand a court ruling it is necessary to have access to the text of the rules involved, as they were in force at the moment the ruling was made. In legislative documents, modifications are embedded inside other documents, so that the document to be modified is cited and how it should be modified is expressed afterwards. For each modification, its author cites the document fragment he or she wants to change and indicates how that fragment should be modified (e.g., eliminating it, substituting it). The new version obtained by applying these changes is virtual, in the sense that the library users¹ know it exists but there is no physical copy of it available.

Figure 1 shows an example. The text that appears in the figure is a fragment of an EU normative document. It is possible to recognise here the 20th article of the document, which modifies an article of a different document (the ‘1968 Convention’). The reference in the figure indicates precisely the article (number 41) affected by the modification (a substitution) that follows. The legislators leave to the readers the cut-and-paste work needed to obtain an updated version of the modified convention.
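The cut-and-paste work that is left to the reader can be sketched in a few lines of Python. This is only an illustration under assumed names: the dictionary layout, the `compose_version` function, and the placeholder texts for the original articles are hypothetical and are not taken from the convention or from any of the cited systems.

```python
from typing import Dict, List

def compose_version(base: Dict[str, str], modifications: List[dict]) -> Dict[str, str]:
    """Return a new version of a document given as {article label: text}.

    Each modification cites the fragment it targets and says how to change it.
    The base version is left untouched, so both versions remain available.
    """
    version = dict(base)  # shallow copy of the article map
    for mod in modifications:
        target = mod["target"]
        if target not in version:
            raise KeyError(f"modification cites unknown fragment: {target}")
        if mod["kind"] == "substitution":
            version[target] = mod["text"]
        elif mod["kind"] == "elimination":
            del version[target]
        else:
            raise ValueError(f"unsupported modification kind: {mod['kind']}")
    return version

# Placeholder content: the real article texts are not reproduced here.
convention_1968 = {
    "Article 40": "(placeholder for the text of Article 40)",
    "Article 41": "(placeholder for the original text of Article 41)",
}

# Article 20 of the modifier document, as in Figure 1 (new text abbreviated).
article_20 = {
    "target": "Article 41",
    "kind": "substitution",
    "text": "A judgement given on an appeal provided for in Article 40 "
            "may be contested only: (list of national appeals as in Figure 1)",
}

updated = compose_version(convention_1968, [article_20])
```

Returning a copy rather than editing the base document in place keeps the earlier version accessible, which matters when a reader needs the text as it stood at a given moment.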

Errata to scientific articles are somewhat similar. The errata are published after the original article, together with a reference to the part of the article to be changed. How and where the errata are inserted varies among publishers and among publications from the same publisher. One way to insert errata is to list them at the beginning or at the end of the corrected article.

Software construction is somewhat different, because it consists of composing a piece of software from several program files. The different versions of the program files are available at the same time, and the composition has to assemble the adequate versions in order to obtain the correct version of the software. There is no need to be precise about the internal parts of a document or file affected by changes; the program files used to build the software are not fragmented during the composition process.

Next, the issues related to document versioning are reviewed, the main approaches are presented, and the issues that each approach emphasizes are identified. Some of the approaches are more recent than others, but promising. Versioning a document affects not only the document itself but also other items, such as references from and to the versioned document, or the indexes created for information retrieval operations.

Figure 1. A fragment of a modifier EU normative document²

Article 20

The following shall be substituted for Article 41 of the 1968 Convention:

“Article 41

A judgement given on an appeal provided for in Article 40 may be contested only:

- in Belgium, France, Italy, Luxembourg and the Netherlands, by an appeal in cassation,

- in Denmark, by an appeal to the hojesteret, with the leave of the Minister of Justice,

- in the Federal Republic of Germany, by a Rechtsbeschwerde,

- in Ireland, by an appeal on a point of law to the Supreme Court,

- in the United Kingdom, by a single further appeal on a point of law.”

Copyright © 2006, Idea Group Inc., distributing in print or electronic forms without written permission of IGI is prohibited.


BACKGROUND

The issues of interest related to document versions fall into seven categories:

1. What can be versioned: This question can be considered from two perspectives. The first perspective considers the objects stored in the system. This is the typical situation in the Web and hypertext environments. Hypertext nodes (e.g., documents, files) can change, but the hypertext structure can also change (e.g., objects may change location, some may disappear, others may change their references to other objects). The second perspective considers the evolution of the documents used by digital library users; these documents may or may not match unidirectionally any of the objects stored in the digital library (Arms, Blanchi, & Overly, 1997). Changes in this case affect the content of documents: the text, the internal structure (in the case of structured documents), or references (i.e., citations within documents that are part of document content). How the evolution of documents at the user level impacts the system depends on the technical solutions used in the digital library (a digital library can use hypertext models and software, textual document databases, or other types of databases).

2. Detecting changes: Sometimes it is necessary to recognise two documents as versions of the same work or to find the changes that have to be applied to a document version to obtain another one. There are two possible ways to do this: extracting references from document content (Martínez-González, de la Fuente, Derniame, & Pedrero, 2003; Thistlewaite, 1997), or comparing versions (Chawathe, Rajaraman, García-Molina, & Widom, 1996; Cobena, Abiteboul, & Marian, 2002; Lim & Ng, 2001).

3. Representing changes: The information about versions and their differences has to be stored somehow in the system. There are three ways to accomplish this:

• To store the versions caused by a change or the corresponding differences. This corresponds to the first of the version management approaches described below.

• To represent changes as annotations (attributes) to the affected nodes. This corresponds to the attribute-based approach to version management described below. The whole history of a node is described in these annotations.

• To model modification relationships as links between the modifiers and the target of the modification (Martínez-González et al., 2003). This solution considers the semantic relationship that is behind a change.

4. Querying changes or querying the history of a document: This consists of answering questions about a document's evolution, such as “What are the documents that modify this one?” or “What are the modifications of this document over the last month?” The way to operate here depends on the choice made for representing changes.

• If versions are first-class objects, the only way to query changes is to search for differences between versions. This can be a file comparison or, in the case of structured documents, a tree comparison.

• If changes are represented as annotations, to query them is to query annotations. In the case of structured documents, that means querying node attributes.

• If changes are represented as independent links, to query changes is to query links, which is the same as querying any document.

5. Accessing any version: Access to any version can be provided either by storing all versions or by composing them on demand, depending on the approach chosen for version management.

6. Dealing with the propagation (effects) of versioning some items on related items: For example, changes to a document affect references or links that reach the document. This is the well-known problem of dangling links common in the Web or, in a more general definition, the referential integrity issue (Ashman, 2000). Other possible impacts are on indexes used for information retrieval operations.

7. Inferring the composition rules of new versions: In certain situations, such as the example in Figure 1, the new versions are virtual (no copy or direct way to obtain them is provided), and the structure of the new version has to be inferred from the information provided about modifications. Humans commonly perform this task, but automatic inference can also be attempted (Martínez-González, de la Fuente, & Derniame, 2003).
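To make items 3, 4, and 6 more concrete, the following Python sketch stores changes as links between a modifier and its target, answers the two example questions by filtering those links, and reports links whose target no longer exists. The `ModificationLink` class, its fields, the `document#fragment` identifier convention, and all identifiers and dates are assumptions made for illustration; they are not taken from the cited systems.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional, Set

@dataclass(frozen=True)
class ModificationLink:
    modifier: str   # fragment that states the change (hypothetical id scheme)
    target: str     # fragment that is changed, written as "document#fragment"
    kind: str       # e.g. "substitution" or "elimination"
    made_on: date

# Hypothetical modifications database.
LINKS = [
    ModificationLink("doc-B#art-20", "doc-A#art-41", "substitution", date(2004, 8, 20)),
    ModificationLink("doc-C#art-3", "doc-A#art-12", "elimination", date(2004, 6, 1)),
    ModificationLink("doc-C#art-5", "doc-D#art-2", "substitution", date(2004, 8, 25)),
]

def modifiers_of(links: Iterable[ModificationLink], document: str,
                 since: Optional[date] = None) -> List[str]:
    """Which fragments modify `document`, optionally only on or after `since`?"""
    return [
        link.modifier
        for link in links
        if link.target.split("#")[0] == document
        and (since is None or link.made_on >= since)
    ]

def dangling_targets(links: Iterable[ModificationLink], existing: Set[str]) -> List[str]:
    """Targets that can no longer be resolved (a referential integrity check)."""
    return [link.target for link in links if link.target not in existing]

all_modifiers = modifiers_of(LINKS, "doc-A")
recent_modifiers = modifiers_of(LINKS, "doc-A", since=date(2004, 8, 1))

# After the elimination is applied, doc-A#art-12 is gone from the stored fragments.
stored_fragments = {"doc-A#art-41", "doc-D#art-2"}
broken = dangling_targets(LINKS, stored_fragments)
```

Because the links are ordinary records, querying the history of a document reduces to the same filtering used for any other stored data, which is the point made in item 4 for the link-based representation.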

APPROACHES TO THE MANAGEMENT OF VERSIONS

Different approaches can be distinguished. They are listed here in chronological order: those with a longer tradition appear first and the more recent ones are at the end of the list.

1. Maintaining simultaneously all versions (or deltas) in digital library collections and linking related versions: The main problem with this approach is link maintenance (Choquette, Poulin, & Bratley, 1995). This approach facilitates access to any version but does not consider queries about the evolution of versioned items. The propagation of versioning effects is considered, but there are no general solutions, and this is still a difficult issue to deal with. This approach and variations of it have received a good amount of attention in the version control area.

2. Considering different stamps of the database and comparing them to detect changes reflecting that an object has been versioned (Cellary & Jomier, 1992): This solution is used with object databases and therefore can be considered when modelling documents as objects (Abiteboul et al., 1997). In this approach, changes are represented indirectly as the difference between two database states.

3. Modelling modifications as attributes and storing this information with documents: This approach comes from the area of semistructured data, which includes structured documents. Changes are represented as annotations (attributes) to the affected nodes, facilitating queries about the history of a node. The detection of versions is done by tree comparisons (Chawathe et al., 1996; Cobena et al., 2002). In contrast to the previous solutions, this is the first in which document structure is considered, thereby associating changes with document fragments instead of whole documents. It is possible to know that the document has been changed and where to find the changes; however, it is up to the user to obtain the versions if he or she so wishes. For the same reason, this approach does not facilitate the automatic composition of versions.

4. Automatically composing versions of documents: This can be done by keeping the rules that allow the generation of versions of documents. This option is named intensional versioning (Conradi & Westfechtel, 1998) and has been used for document management (Arnold-Moore, 1997). It is also possible to compose versions by querying metadata (i.e., attributes, links) stored in the system databases (Hemrich, 2002; Martínez-González et al., 2003). These solutions deal well with access to versions and they are in a good position to treat queries about version evolution. Their main weakness is dealing with the propagation of versioning effects on information retrieval operations.
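The third approach in the list above (changes recorded as attributes on the affected nodes) can be illustrated with Python's standard `xml.etree.ElementTree` module. The sketch makes a strong simplifying assumption that is not part of the cited change detection algorithms: every element carries a stable `id` attribute, so nodes are matched by identifier instead of by a genuine tree comparison. The `change` attribute name and the sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

def annotate_changes(old_xml: str, new_xml: str) -> Tuple[ET.Element, List[str]]:
    """Annotate the new tree with a 'change' attribute on inserted or modified nodes.

    Returns the annotated root and the ids of nodes that were deleted
    (deleted nodes no longer exist in the new tree, so they cannot be annotated).
    """
    # Map node id -> text for the old version (assumes ids are unique).
    old_text = {
        elem.get("id"): (elem.text or "").strip()
        for elem in ET.fromstring(old_xml).iter()
        if elem.get("id") is not None
    }
    new_root = ET.fromstring(new_xml)
    seen = set()
    for elem in new_root.iter():
        node_id = elem.get("id")
        if node_id is None:
            continue
        seen.add(node_id)
        if node_id not in old_text:
            elem.set("change", "inserted")
        elif old_text[node_id] != (elem.text or "").strip():
            elem.set("change", "modified")
    deleted = sorted(set(old_text) - seen)
    return new_root, deleted

OLD = ('<doc id="d"><article id="a39">Text 39</article>'
       '<article id="a40">Text 40</article>'
       '<article id="a41">Old text 41</article></doc>')
NEW = ('<doc id="d"><article id="a40">Text 40</article>'
       '<article id="a41">New text 41</article>'
       '<article id="a42">Text 42</article></doc>')

annotated, deleted_ids = annotate_changes(OLD, NEW)

# Querying changes is then a matter of querying node attributes.
changes = {e.get("id"): e.get("change") for e in annotated.iter() if e.get("change")}
```

As the text notes for this approach, the annotations say that a fragment changed and where, but they do not by themselves rebuild an earlier version.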

MANNERS OF IMPLEMENTING VERSIONING SOLUTIONS

The ways of implementing solutions to manipulate versions have evolved with time. First, there are the version control servers, which provide general mechanisms for version control issues (Hicks, Leggett, Nürnberg, & Schnase, 1998; Whitehead, 2001). As for the manner of representing and storing the versioning information, some solutions using HTML elements (Vitali, 1999) were proposed. This is a possible implementation of modelling changes in attributes, which has been superseded with the arrival of XML. A more recent option consists of automatically composing versions. This type of solution appeared with the arrival of XML. There are several ways of doing this, such as splitting documents into pieces and storing the composition rules of each version (Arnold-Moore, 2000). Instead of storing composition rules, they can be obtained by querying attributes describing the temporal validity of pieces of content (Hemrich, 2002). Another option is storing the information about modification relationships between documents and inferring the structure (composition rules) and content of a new version from the information stored in the modifications database (Martínez-González, 2001).
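Composing a version by querying temporal validity attributes might look like the following Python sketch. The record layout (`position`, `valid_from`, `valid_to`), the half-open validity interval, and all dates and texts are assumptions chosen for illustration only; they do not describe how any of the cited systems is actually implemented.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

@dataclass(frozen=True)
class Piece:
    position: int                    # place of the piece within the document
    text: str
    valid_from: date                 # first day on which the piece is valid
    valid_to: Optional[date] = None  # first day on which it is no longer valid; None = open-ended

def version_at(pieces: List[Piece], when: date) -> List[str]:
    """Compose the version valid on `when` from the pieces valid on that day."""
    valid = [
        p for p in pieces
        if p.valid_from <= when and (p.valid_to is None or when < p.valid_to)
    ]
    return [p.text for p in sorted(valid, key=lambda p: p.position)]

# Hypothetical pieces: the second article is substituted on 1 January 2003.
PIECES = [
    Piece(1, "Article 40 (original)", date(2000, 1, 1)),
    Piece(2, "Article 41 (original)", date(2000, 1, 1), date(2003, 1, 1)),
    Piece(2, "Article 41 (substituted)", date(2003, 1, 1)),
]

before = version_at(PIECES, date(2002, 6, 1))
after = version_at(PIECES, date(2004, 1, 1))
```

No composition rule is stored per version here: the version is derived on demand from the validity intervals, which is the trade-off the paragraph above describes.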

FUTURE TRENDS

Some issues related to document versioning, such as querying document changes or dealing with the propagation of versioning, have received the attention of the scientific community only recently compared with other aspects, such as access to any version. Therefore, these are open questions, as the solutions proposed until now are not yet satisfactory or scalable enough (see Ashman, 2000; Cobena et al., 2002 for details about some solutions). With the massive use of the Web, where these problems also appear, new proposals can be expected to emerge. Digital libraries will benefit from this, as it would improve the quality of the services offered to their users.

CONCLUSION

Document evolution requires digital libraries to provide solutions to manipulate versions and to satisfy user requests. Several issues emerge related to versioning, including accessing any version of a document, querying its history, managing the impact of versioning an item on related items, and so forth. The background for dealing with these problems is varied. The version control community has long studied issues such as access to any version. It also knows well the problems that versioning hypertext elements may cause for other hypertext items, such as links. However, this study area does not consider other issues, such as querying the history of a document or the impact of versioning on information retrieval operations such as indexing.

More recent are the approaches that compose versions automatically and infer the composition rules (structure) of versions from semantic information. These solutions, which introduce dynamism and knowledge extraction in version management applications, are promising for digital libraries. They can help to solve automatically some issues that otherwise could not be treated in many of these systems, because a manual introduction of composition rules or versioning information is, in many cases, unaffordable.

REFERENCES

Abiteboul, S., Cluet, S., Christophides, V., Milo, T., Moerkotte, G., & Simeon, J. (1997). Querying documents in object databases. International Journal on Digital Libraries, 1(1), 5-19.

Arms, W. Y., Blanchi, C., & Overly, E. A. (1997, February). An architecture for information in digital libraries. D-Lib Magazine.

Arnold-Moore, T. (1997). Automatic generation of amendment legislation. Sixth International Conference on Artificial Intelligence and Law, ICAIL’97, Melbourne, Victoria, Australia.

Arnold-Moore, T. (2000). Connected to the law: Tasmanian legislation using EnAct. Journal of Information, Law and Technology, 1. Retrieved September 7, 2004, from http://elj.warwick.ac.uk/jilt/01-1/

Ashman, H. (2000). Electronic document addressing: Dealing with change. ACM Computing Surveys, 32(3), 201-212.

Cellary, W., & Jomier, G. (1992). Consistency of versions in object-oriented databases. In F. Bancilhon et al. (Eds.), Building an object-oriented database system. The story of O2: Vol. 19. The Morgan Kaufmann Series in Data Management Systems (pp. 447-462). Morgan Kaufmann.

Chawathe, S., Rajaraman, A., Garcia-Molina, H., & Widom, J. (1996). Change detection in hierarchically structured information. SIGMOD Record (ACM Special Interest Group on Management of Data), 25(2), 493-504.

Choquette, M., Poulin, D., & Bratley, P. (1995). Compiling legal hypertexts. In N. Revell & A. M. Tjoa (Eds.), Database and expert systems applications, 6th International Conference, DEXA’95, Lecture Notes in Computer Science, 978 (pp. 449-58).


Cobena, G., Abiteboul, S., & Marian, A. (2002). Detecting changes in XML documents. Data Engineering 2002 (ICDE 2002), 41-52.

Conradi, R., & Westfechtel, B. (1998). Version models for software configuration management. ACM Computing Surveys (CSUR), 30(2), 232-282.

Hemrich, M. (2002). A new face for each show: Make up your content by effective variants engineering. XML Europe 2002. Retrieved September 7, 2004, from http://www.idealliance.org/papers/xmle02/

Hicks, D. L., Leggett, J. J., Nürnberg, P. J., & Schnase, J. L. (1998). A hypermedia version control framework. ACM Transactions on Information Systems, 16(2), 127-160.

Lim, S. J., & Ng, Y. K. (2001). An automated change-detection algorithm for HTML documents based on semantic hierarchies. The 17th International Conference on Data Engineering (ICDE 2001), Heidelberg, Germany.

Martínez-González, M. (2001). Dynamic exploitation of relationships between documents in digital libraries: Application to legal documents. Doctoral dissertation, Universidad de Valladolid, España; Institut National Polytechnique de Lorraine, France.

Martínez-González, M., de la Fuente, P., & Derniame, J.-C. (2003). XML as a means to support information extraction from legal documents. International Journal of Computer Systems Science and Engineering, 18(5), 263-277.

Martínez-González, M., de la Fuente, P., Derniame, J., & Pedrero, A. (2003). Relationship-based dynamic versioning of evolving legal documents. Web-knowledge Management and Decision Support, Lecture Notes on Artificial Intelligence, 2543, 298-314.

Poworotznek, E. (2003). Linking of errata: Current practices in online physical sciences journals. Journal of the American Society for Information Science and Technology, 54(12), 1153-1159.

Thistlewaite, P. (1997). Automatic construction and management of large open webs. Information Processing and Management, 33(2), 161-173.

Vitali, F. (1999). Versioning hypermedia. ACM Computing Surveys, 31(4), 24.

Whitehead, E. J. (2001). Design spaces for link and structure versioning. Proceedings of Hypertext’01, 12th ACM Conference on Hypertext and Hypermedia, Aarhus, Denmark.


KEY TERMS

Digital Library: A set of electronic documents organized in collections, plus the system that provides access to them. Digital libraries are the digital counterpart of traditional libraries.

Hypertext: The organization of information units as a network of associations, which a user can choose to resolve. Hypertext links are the instances of such associations.

Intensional Versioning: Automatic construction of versions based on configuration rules.

Referential Integrity: In hypertext, a measure of the reliability of a reference to its endpoints. A reference has the property of referential integrity if it is always possible to resolve it. When references are represented as links it is called link integrity.

Structured Documents: Documents made by composing well-delimited pieces of content that can present an inclusion hierarchy between them.

Version Control: Set of mechanisms that support object evolution in computer applications.

Versions: Variations of an object with a high degree of similarity. Document versions are never completely equal, but they are similar enough so as to be recognisable as the same document.

Virtual Document: A document (intellectual entity) that exists in the minds of individuals but of which there is no physical copy available.

XML: Extensible markup language. Markup language for structured documents. Structure is represented with textual markup that intermixes with document content. XML is a recommendation from the World Wide Web Consortium (W3C).

ENDNOTES

1. Library users are people who access the digital library searching for documents (intellectual entities) that match their requirements. These users may be specialists in a domain (e.g., jurists), with no special knowledge about computers, who just use the library as another tool for their work.

2. Adapted from Convention of Accession of 9 October 1978 of the Kingdom of Denmark, of Ireland and of the United Kingdom of Great Britain and Northern Ireland to the Convention on jurisdiction and enforcement of judgments in civil and commercial matters and to the Protocol on its interpretation by the Court of Justice.


E-Government Databases

Catherine Horiuchi

Seattle University, USA

INTRODUCTION

The new face of government is electronic. Prior to the development of e-government, adoption of small-scale computing and networking in homes and businesses created the world of e-business, where computer technologies mediate transactions once performed face-to-face. Through the use of computers large and small, companies reduce costs, standardize performance, extend hours of service, and increase the range of products available to consumers. These same technological advances create opportunities for governments to improve their capacity to meet growing public service mandates. Tasks that formerly required a trip to city hall can be accomplished remotely. Government employees can post answers to frequently asked questions online, and citizens can submit complex questions through the same electronic mail (e-mail) systems already used at home and in businesses. This developing e-government increases the number and complexity of electronic databases that must be managed according to the roles information plays in government operations.

BACKGROUND

E-government has been defined as the application of e-business technologies and strategies to government organizations, the delivery of local government service through electronic means, and a method to enhance the access to and delivery of its services to benefit citizens. E-government is envisioned to be “a tool that facilitates creation of public value” (United Nations, 2003). In its survey of member states, the UN found 173 of 191 members have government Web sites, offering at minimum government information and some measure of service. Common capabilities of governmental Web sites include ability to download government forms and access information on a region or local services (Swartz, 2004). The United States has the largest number of citizens with Internet-connected computers at home and the most developed network infrastructure, resulting in the greatest amount of information and number of services and products available online. Even so, information technology process adaptations for government services are rudimentary, and many citizens do not have the ability to access their government remotely.

Less visibly, e-government adoption is steadily increasing connections made through the Internet between public agencies and private networks. This connectivity creates computer-to-computer interfaces between multiple databases. Some connections include Web-enabled access to decades-old data structures stored on traditional mainframe systems. Other systems use new databases built or bought from proprietary third parties. A third type of major database project ports data from older systems into new Web-enabled models. Functional requirements determine the database management properties needed by each data system.

E-GOVERNMENT DATA SYSTEMS

Technology utilization and diffusion of technological processes are increasingly important in managing the “hollow state” (Milward & Provan, 2000). This model of governance is desirable because it allows for greater flexibility in government budgeting and operations. A systematic contracting process creates price competition and encourages fresh approaches to social services, but also requires coordination of information and applications to exchange data securely between the service providers and the government agency. To facilitate this, e-government has adopted structuring concepts first developed in commercial sectors as business-to-business (B2B) and business-to-consumer (B2C) applications. Process re-engineering in government is transforming these concepts into government-to-business, government-to-citizen, and government-to-government constructs with parallel acronyms G2B, G2C, and G2G, respectively. Each of these models creates its own service and reliability profile. Diffusion of data is an essential element of these business processes adopted by government (Bajaj & Sudha, 2003), resulting in large data sets shared among technology project partners.

Sometimes, these diffusions go awry because government is not like business in fundamental ways (Allison, 1980) that make government databases more comprehensive and at the same time more vulnerable to data dissemination contrary to established procedures. In one high-profile case, over five million traveler records were provided by an airline to a government contractor working on a Pentagon data mining and profiling project (Carey & Power, 2003), contrary to the airline’s privacy policy. The data were cross-referenced to other demographic data available from another firm. The expanded data set was then used in a paper that was presented at a technology conference and posted briefly on the conference Web site. In a similar case of unauthorized data exchange, an airline transferred 1.2 million records (names, addresses, phone numbers, and credit-card information) to multiple private firms competing for government contracts. This was not reported for nearly two years (McCartney & Schatz, 2004). The U.S. government’s Computer-Assisted Passenger Prescreening System (CAPPS II), intended solely as a screening tool, was cancelled in response to the public outcry over multiple incidents of unauthorized release of data (Alonso-Zaldivar, 2004). The misuse occurred despite assurances by the government that travel data would be protected from precisely this type of diffusion and dissemination, that “information will only be kept for a short period after completion of the travel itinerary, and then it will be permanently destroyed” (Department of Homeland Security, 2003a, p. 1) and stored “at a TSA secure facility” (Department of Homeland Security, 2003b, p. 16).

Models developed for commercial enterprises weakly map to specific requirements of governmental systems. The myriad roles of government result in unique system requirements. Table 1 lists some of the e-government applications already in operation in at least some jurisdictions.

The range in requirements can be highlighted by a brief comparison of two very different applications and properties of the databases they require. In the adoption of e-voting, governments are replacing paper ballots with touch-screen terminals as the voter interface, using machine technology for tabulating the votes and aggregating election results. Once an election is certified, the database can be reinitialized for the next election. For parcel and tax records, information that has been kept for decades or even centuries on paper is now stored on computers that are subject to periodic replacement cycles. Humankind’s earliest writings and data collections are composed of this very same information, so persistence is a fundamental attribute.

Table 1. E-government public services applications

Informational Web site (online brochure)
Permitting
Webcasting of meetings
License renewals
Records requests
E-voting

The public manager must develop information strategy and management skills better suited to the adoption of advanced technologies (Dawes, 2004). These new skills involve system security models, measuring system efficacy, utilizing tools for archival of public records, and managing information across system life cycles that vary widely, put into a public service context. Public managers have not developed these skills at the same rate that they have implemented new technologies or undertaken large projects, contributing to the high failure rate of public sector technology initiatives. These failures range from high-profile project abandonment to an assessment of low value for cost, such as the 1994 assessment that $200 billion in federal expenditures over 12 years created little discernible benefit (Brown & Brudney, 1998).

ELECTRONIC VOTING

Beginning in the 1990s with the development of easily understood Internet browser applications and the widespread penetration of personal computers into households, governments have sought to facilitate access and services through this familiar interface. The introduction of modern technology to one fundamental government activity – voting – has drawn special attention both for its potential to accurately and rapidly count votes and for the specific problems associated with the data collected.

Voting machine technology has gradually been modernized. The original paper ballots have been replaced in varying districts. First developed were mechanical lever machines, then punch cards either supplied with hole punchers or pre-perforated. Optical scanners read cards with ovals or squares that have been filled in by pencil or pen. The most recent introduction incorporates personal computer technology and networks to create computer-based electronic voting (e-voting) systems. All but the latest technologies have been in active use for decades. In the 1996 U.S. presidential election, approximately 2% of the votes were cast on paper ballots, 21% on mechanical lever systems, 37% on punched cards, 25% on optical cards, and nearly 8% on e-voting machines (Hunter, 2001). None of these counts votes perfectly; in 1984, the state of Ohio invalidated 137,000 punch card ballots (Whitman, 2000). Despite ongoing efforts to modernize voting technologies to simplify voting and reduce errors, an election dispute in 2000 began a public debate on computers, databases, networks, and voting.

The contested U.S. presidential election involved the unlikely scenario where the margin of votes for victory in several states appeared smaller than the margin of error for the semi-automated tools that tabulated votes. This resulted in the infamous pictures of election officials from the state of Florida holding up punch cards rejected by their tabulating machine to determine whether or not a voter had at least partially registered a vote. The popular vocabulary expanded to include the technical terms “hanging chad” for a partially detached square and “dimpled chad” where the card was merely dented. Voting officials involved in the recount argued over the “voter intent” for each type of incomplete perforation.

The recount process, ultimately settled by the U.S. Supreme Court, resulted in heightened interest in replacing older technology, punch cards in particular (Niman, 2004). With electronic touch-screen voting, there would be no paper ballots to process through a tabulating machine. Instead, the new machine would count the electronic votes and then send the totals electronically through a computer link. Despite initial claims by vendors that their systems would have higher reliability, serious problems have developed. For example, “Diebold’s voting system...inexplicably gave thousands of Democratic votes in the Oct. 7 recall election to a Southern California socialist” (Hoffman, 2004). While Diebold and other companies alter their machinery to comply with new mandates for paper voter receipts, questions remain about the security and reliability of the computer systems and the database of votes.

TAX AND PARCEL RECORDS

Governments have long had responsibility for essential data related to ownership of property and collection of taxes. The automation of tax rolls and parcel ownership records creates databases with special requirements for persistence and public disclosure. Under government rules, parcel data and aggregate tax data are generally available for public inspection; advances in software interfaces could result in meeting this public inspection requirement through the Internet. However, parcel records are often linked in legacy mainframe applications to individual tax records that are not available for public inspection, so most local governments do not allow online access to the parcel records, lacking a sure, simple, and affordable method to segment the data structures.
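The segmentation problem described above can be illustrated with a minimal sketch, assuming a parcel record that carries both publicly inspectable fields and a link to a confidential individual tax record. The record layout, the field names, and the `public_view` helper are hypothetical and are not drawn from any actual assessor system; the sketch only shows an allow-list projection, so that any field not explicitly marked public is withheld.

```python
from dataclasses import dataclass, asdict


# Hypothetical record layout: these field names are illustrative
# assumptions, not taken from any real parcel or tax system.
@dataclass(frozen=True)
class ParcelRecord:
    parcel_id: str
    situs_address: str
    assessed_value: int
    owner_taxpayer_id: str   # confidential link to an individual tax record
    delinquent_balance: int  # confidential


# Allow-list of fields assumed to be open to public inspection.
PUBLIC_FIELDS = ("parcel_id", "situs_address", "assessed_value")


def public_view(record: ParcelRecord) -> dict:
    """Return only the allow-listed fields of a parcel record."""
    full = asdict(record)
    return {name: full[name] for name in PUBLIC_FIELDS}


record = ParcelRecord("001-234", "12 Main St", 250000, "TP-9", 1200)
print(public_view(record))
```

An allow-list is used rather than a deny-list so that a field added later to the legacy record stays private until someone deliberately publishes it.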

Local governments have gradually adopted supplemental databases to manage land use. Many planning departments have automated simple permitting. Geographic information systems (GIS) are implemented across multiple local government boundaries (Haque, 2001). Shared systems in metropolitan zones allow for better cost management of new technologies and create regional models of growth, but these new systems do not have the persistence of the paper maps they replace. An original book of parcel maps may last well over a hundred years, while at best the life cycle of a computer database is measured in decades. Many hardware and software systems are replaced or migrated to newer technologies every few years, so there is a tendency for governments to replicate data in new systems rather than create a single multi-function database.

FUTURE TRENDS

Governments face challenges in their adoption of modern technology systems and supporting databases (Thomas & Streib, 2003). Some factors encourage persistent use of outdated systems in government operations. These include the long duration of governments, civil service protection for government workers, and noncompetitive provision of fundamental public safety and other services. Other factors support developing new systems: elections often result in changes in political leadership, altering short-term interests and priorities, while efficiency initiatives for public/private partnerships require new linkage and partitioning of governmental data structures. Adding to these functional challenges are security and privacy constraints (Bond & Whiteley, 1998). Table 2 summarizes security-related threats to electronic databases that public managers must address.

Security concerns have increased as databases grow and networks result in data proliferation. These concerns extend to the capacity of government to ensure privacy of information (Strickland, 2003). Uncertain budgets constrain initial implementation and threaten the maintenance of a system under affordability mandates. Securing systems increases cost while providing no service-related benefit, and many observers question the efficacy of security and privacy efforts by government agencies.

Table 2. Security challenges of electronic public databases

Data can be “stolen” through copying, while leaving the source data intact.

Databases can be encrypted, but this function is often enacted in the absence of unified management of encryption keys.

Custom systems can result in platform and database software version dependency, limiting application of security patches.

Back-ups and test restorations limit potential corruption or loss from system failures but proliferate copies of sensitive databases.

Linking data from legacy systems with newer applications that offer more information may limit inherent referential integrity.
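The last row of Table 2 can be made concrete with a small sketch, assuming that records from a newer application refer to legacy records by a key that no database constraint enforces. The function name, the row layout, and the sample data are hypothetical; the check simply lists the referring rows whose key has no match in the legacy set.

```python
def orphaned_references(child_rows, parent_rows, foreign_key, primary_key):
    """Return the child rows whose foreign key matches no parent row."""
    known_keys = {row[primary_key] for row in parent_rows}
    return [row for row in child_rows if row[foreign_key] not in known_keys]


# Hypothetical sample data: legacy parcel rows and newer permit rows.
legacy_parcels = [{"parcel_id": "A1"}, {"parcel_id": "A2"}]
permits = [
    {"permit_id": 1, "parcel_id": "A1"},
    {"permit_id": 2, "parcel_id": "B7"},  # refers to a parcel that is missing
]

for row in orphaned_references(permits, legacy_parcels, "parcel_id", "parcel_id"):
    print("orphaned permit:", row["permit_id"])
```

A periodic check of this kind does not restore referential integrity, but it shows where the link between the two systems has already broken.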


E-Government Databases

CONCLUSION

While the mechanics of information management are similar in the public and private sectors, constraints and special requirements in public service modify the role of hardware and software systems, particularly the databases which increasingly contain personal information. These differences require information technology savvy and management skills redefined for the governmental context.

Governments use data structures to manage information on individuals, properties, and boundaries, as well as to perform basic operational functions such as accounting. Databases that store government information have widely divergent operating requirements. In e-voting, validity is essential, but the data does not need to persist once the voting results have been certified. Tax and parcel records must persist for decades, if not centuries. Security and privacy are matters of interest to government agencies, but adequate funds may not be available to address concerns that develop as a corollary to database development and network connectivity; limited access and slow development of Web-enabled government are among the security and privacy strategies in use. Addressing technical constraints on information technology, meeting security challenges, and satisfying citizen expectations will be keys to the success of the next generation of public managers.

REFERENCES

Allison, G. (1980). In J. Shafritz (Ed.), Classics of public administration (pp. 396-413). Wadsworth.

Alonso-Zaldivar, R. (2004). U.S. rethinks air travel screening: Facing questions about privacy issues, the government will try to redesign a computer system to identify suspected terrorists. Los Angeles Times, July 16, A20.

Bajaj, A., & Sudha, R. (2003, October-December). IAIS: A methodology to enable inter-agency information sharing in eGovernment. Journal of Database Management, 14(4), 59-80.

Bond, R., & Whiteley, C. (1998, July). Untangling the Web: A review of certain secure e-commerce legal issues. International Review of Law, Computers & Technology, 12(2), 349-370.

Brown, M.M., & Brudney, J.L. (1998). Public sector information technology initiatives: Implications for programs of public administration. Administration and Society, 30(4), 421-442.

Carey, S., & Power, S. (2003). Responding to privacy concerns, JetBlue e-mails an explanation. Wall Street Journal, September 22, B3.

Dawes, S.S. (2004, January). Training the IT-savvy public manager: Priorities and strategies for public management education. Journal of Public Affairs Education, 10(1), 5-17.

Department of Homeland Security. (2003a, February 13). CAPPS II: Myths and facts.

Department of Homeland Security. (2003b, July 22). CAPPS II Privacy Act notice. DHS/TSA-2003-1.

Haque, A. (2001). GIS, public service, and the issue of democratic governance. Public Administration Review, 61(3), 259-265.

Hoffman, I. (2004). Diebold vows to fix e-vote problems. Oakland Tribune, March 25.

Hunter, G.E. (2001). The role of technology in the exercise of voting rights. Law Technology, 34(4), 1-14.

McCartney, S., & Schatz, A. (2004). American released passenger data. Wall Street Journal, April 12, A2.

Milward, H.B., & Provan, K.G. (2000). Governing the hollow state. Journal of Public Administration Research and Theory, 10(2), 359-379.

Niman, M.I. (2004). A brave new world of voting. The Humanist, 64(1), 10-13.

Strickland, L.S. (2003, January/July). Records and information management perspectives, Part I: Legislative and legal developments. Bulletin of the American Society for Information Science and Technology, 29(5), 11-15.

Swartz, N. (2004, January/February). E-government around the world. Information Management Journal, 31(1), 12.

Thomas, J.C., & Streib, G. (2003). The new face of government: Citizen-initiated contacts in the era of e-government. Journal of Public Administration Research and Theory, 13(1), 83-102.

United Nations. (2003). World public sector report 2003. E-government at the crossroads. New York: UN Press.

Whitman, D. (2000). Chadology 101: Divining a dimple. Who knew a simple ballot could be so tricky? U.S. News & World Report, 129(21), 34.
