Introduction: Establishing
a business case for introducing and developing a data quality
management program is often predicated on the extent to which data
quality issues impact the organization and the return on the investment
in data quality improvement. Today, most organizations use data in two
ways: transactional/operational use (“running the business”), and
analytic use (“improving the business”). When the results of analysis
permeate the operational use, the organization can exploit discovered
knowledge to optimize along a number of value driver dimensions. Both
usage scenarios rely on high quality information, suggesting the need
for processes to ensure that data is of sufficient quality to meet all
the business needs. Therefore, it is of great value to any enterprise
risk management program to incorporate a program that includes processes
for assessing, measuring, reporting, reacting to, and controlling
different aspects of risks associated with poor data quality.

Summary: While
we often resort to specific examples where flawed data has led to
business problems, there is frequently real evidence of hard impacts
directly associated with poor quality data. Anecdotes help to motivate
and raise awareness of data quality as an issue.  However, developing a
performance management framework that helps to identify, isolate,
measure, and improve the value of data within the business contexts
requires correlating business impacts with data failures and then
characterizing the loss of value that is attributable to poor data
quality. This requires some exploration into assembling the business case, namely:
• Reviewing the types of risks and costs relating to the use of information,
• Considering ways to specify data quality expectations,
• Developing processes and tools for clarifying what data quality means,
• Defining data validity constraints,
• Measuring data quality, and
• Reporting and tracking data issues.
Given these aspects of measurement, one can materialize a data quality scorecard that measures data quality performance.

Conclusion: The
objective of designing a business impact hierarchy for data quality
issues is two-fold. First, incrementally classifying impacts into small
pieces for analysis makes determining how poor data quality impacts our
business processes a much more manageable task. Second, the categorical
hierarchy of impact areas will naturally map to future performance
reporting structure for gauging improvement. As one identifies where
poor data quality impacts the business, one also identifies where
data quality improvement will improve the
business, and this provides a solid framework for quantifying measurable
performance metrics that will eventually be used to craft key data
quality performance indicators.

When a new discipline
emerges, it usually takes some time and a great deal of academic
discussion before concepts and terms become standardized. Text mining is
one such new discipline. In a groundbreaking article, Untangling text
data mining, Hearst (1999) tackled the problem of clarifying text-mining
concepts and terminology. This article is aimed at building on Hearst’s
ideas by pointing out some inconsistencies and inaccuracies and
suggesting an improved and extended categorization of data-mining and
text-mining approaches.

Until recently, computer
scientists and information system specialists concentrated on the
discovery of knowledge from structured, numerical databases and data
warehouses. However, much, if not the majority, of available business
data are captured in text files that are not overtly structured, for
example memoranda and journal articles that are available
electronically. Bibliographic databases may contain overtly structured
fields, such as author, title, date
and publisher, as well as free text, such as an abstract or even full
text. The discovery of knowledge from database sources containing free
text is called ‘text mining’.

openUP (July 2007)

Web mining is a wider
field than text mining because the World-Wide Web also contains other
elements, such as multimedia and e-commerce data. As the Web continues
to expand rapidly, Web mining becomes more and more important (and
increasingly difficult). Although text mining and Web mining are two
different fields, it must be borne in mind that a great deal of the
content on the Web is text-based. ‘It is estimated that 80% of the
world’s online content is based on the text’ (Chen 2001:18). Therefore,
text mining should also form an important part of Web mining.

References:

Albrecht, R. and Merkl, D. 1998. Knowledge discovery in literature data bases. Library and Information Services in Astronomy III. (ASP conference series, vol. 153.) Online. Available WWW: http://www.stsci.edu/stsci/meetings/lisa3/albrechtr1.html (Accessed 20 August 2002).

Berson, A. and Smith, S.J. 1997. Data warehousing, data mining, and OLAP. New York: McGraw-Hill.

Biggs, M. 2000. Resurgent text-mining technology can greatly increase your firm’s ‘intelligence’ factor. InfoWorld 11(2):52.

Chen, H. 2001. Knowledge management systems: a text mining perspective. Tucson, Arizona: University of Arizona (Knowledge Computing Corporation).

Cornford, T. and Smithson, S. 1996. Project research in information systems: a student’s guide. Houndmills: Macmillan. (Information system series.)

Halliman, C. 2001. Business intelligence using smart techniques: environmental scanning using text mining and competitor analysis using scenarios and manual simulation. Houston, TX: Information Uncover.

Han, J. and Kamber, M. 2001. Data mining: concepts and techniques. San Francisco, CA: Morgan Kaufmann.

Hearst, M.A. 1999. Untangling text data mining. In Proceedings of ACL’99: the 37th Annual Meeting of the Association for Computational Linguistics, University of Maryland, June 20–26 (invited paper). Online. Available WWW: http://www.ai.mit.edu/people/jimmylin/papers/Hearst99a.pdf (Accessed 20 August 2002).

Hovy, E. and Lin, C.Y. 1999. Automated text summarization in SUMMARIST. In Mani, I. and Maybury, M.T. (eds.) Advances in automated text summarization. MIT Press, MA:81–94. Online. Available WWW: http://www.isi.edu/~cyl/ (Accessed 24 June 2003).

Kontos, J., Malagardi, I., Alexandris, C. and Bouligaraki, M. 2000. Greek verb semantic processing for stock market text mining. In Proceedings of Natural