GEOINFORMATICS: A DEFINING OPPORTUNITY
FOR EARTH SCIENCE RESEARCH
White Paper Submitted to NSF
The future research direction and
opportunities in earth sciences will be significantly affected by both
the availability and the utilization of Information Technology. Researchers
in earth sciences are deeply involved in discovering the relationships
between the observed geologic record and the complex processes that have
shaped it, and recognize the uniqueness of the geologic evolution of
the Earth within the solar system. The Earth has a complex record of the
dynamic interaction of plates, earth materials and life that provides clues
to the physical and chemical evolution of continents, oceans and the atmosphere.
The rock record, which preserves nearly 4.5 billion years of history, has
been meticulously gathered through observations over the centuries, and
highlights the scientific problems associated with studies of biodiversity
and climate change, planetary processes, and the 4-D architecture and evolution
of continents. As the complexities of these processes are only recently
being recognized through the application of new technologies, it is evident
that an enormous gain in understanding can be realized only if multidisciplinary
data are evaluated numerically, and integrated geospatially through the
utilization of Information Technology.
The Need for an Earth Science
Data System
The ever-growing understanding and acceptance
that the Earth functions as a complex system composed of myriad interrelated
mechanisms has made Earth scientists realize that existing information
systems and techniques are inadequate. Currently, the uncoordinated
distribution of available data sets, a lack of documentation about them,
and the lack of easy-to-use access tools and computer codes are major obstacles
for scientists and educators alike. These obstacles have hindered scientists
and educators in the access and full use of available data and information,
and hence have limited scientific productivity and the quality of education.
Recent technological advances, however, provide practical means to overcome
such problems. Advances in computer design, software, and disk storage systems,
as well as the growth of the World Wide Web (WWW), now permit for the first
time the management of gigabytes to terabytes of data for distribution to
scientists, educators, students, and the general public.
Earth Science is a discipline that
is strongly data driven, and large data sets are often developed by researchers
and government agencies. The complexity of the fundamental scientific questions
being addressed requires integrative and innovative approaches employing
these data sets if we are to find solutions. Although a number of databases
exist, the ultimate goal of the Earth Science community is to create a
fully integrated data system populated with high quality, freely available
data, as well as a robust set of software to analyze and interpret the
data. This system would feature rich and comprehensive databases and convenient
access. These capabilities are needed to attack a variety of basic and
applied Earth Science problems.
The development of the capability
to construct, organize, and verify an Earth Science data system is a natural,
and indeed essential, step for the Earth Sciences to move forward so that
we can understand the Earth as a system, as well as meet societal needs.
Most Earth Science problems are inherently 4-D (x, y, z, t) in nature, involving
the subsurface and variation with time. Thus, their solution requires data
analysis that is far more complex than that provided by traditional Geographic
Information Systems (GIS). The extent, complexity, and sometimes primitive
form of existing data sets and databases, as well as the need for the
optimization of the collection of new data, dictate that only a large,
cooperative, well-coordinated, and sustained effort will allow the community
to attain its scientific goals. With a strong emphasis on ease of access
and use, the resulting data system would be a very powerful scientific
tool to reveal new relationships in space and time, and would be an important
resource for students, teachers, the public at large, governmental agencies
and industry.
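To make the 4-D (x, y, z, t) requirement concrete, the following minimal sketch shows a query that selects observations by position, depth, and age at once, which a map-only (x, y) lookup cannot express. The record layout, field names, units, and sample values are illustrative assumptions and do not describe any existing database.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

Range = Tuple[float, float]


@dataclass(frozen=True)
class Sample:
    """One observation located in space (x, y, z) and time (t).

    All field names and units are assumptions made for this sketch.
    """
    x: float      # longitude, decimal degrees
    y: float      # latitude, decimal degrees
    z: float      # depth below the surface, km
    t: float      # age, millions of years before present
    value: float  # measured quantity (hypothetical)


def _inside(v: float, bounds: Range) -> bool:
    """Return True when v lies within the closed interval bounds."""
    lo, hi = bounds
    return lo <= v <= hi


def query_4d(samples: Iterable[Sample], x: Range, y: Range,
             z: Range, t: Range) -> List[Sample]:
    """Select samples that fall inside a box in space and a window in time."""
    return [s for s in samples
            if _inside(s.x, x) and _inside(s.y, y)
            and _inside(s.z, z) and _inside(s.t, t)]


# Hypothetical records: same map area, differing in depth and age.
samples = [
    Sample(-110.0, 35.0, 5.0, 20.0, 0.704),
    Sample(-110.5, 35.2, 30.0, 20.0, 0.706),    # too deep for the query
    Sample(-110.2, 35.1, 5.0, 1500.0, 0.710),   # too old for the query
]

hits = query_4d(samples, x=(-111.0, -109.0), y=(34.0, 36.0),
                z=(0.0, 10.0), t=(0.0, 100.0))
```

All three records share the same map footprint, so only the depth and age constraints separate them; a production system would presumably replace the linear scan with a spatial and temporal index.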
Fundamental new discoveries will
require the availability of databases that encompass a variety of temporal
and spatial scales. Because of the need to integrate heterogeneous data
sets and tools to analyze them, the Geoinformatics program provides the
focus for community participation in a national experiment to enhance and
retain the pre-eminent role in the world for the United States in Earth
Sciences research. It will also be the catalyst for the creation of a global
database (e.g., Digital Earth).
The Interim Steering Committee (ISC)
(see Appendix A for list of members) has both identified the procedural
details for community participation and recommended the most exciting
research frontiers for the near future that require construction and utilization
of databases. However, the most important Earth Science problems to be
attacked using this data system and software are probably not yet known
because the creative energies of people getting together to explore relationships
among the data and test ideas will lead to unanticipated insights. The
two recommendations are described separately, and the benefits to the entire
Earth Sciences community are presented in the summary section.
Creation of a National Consortium
of Academic Institutions
Although it has never been tried
before, having all information and knowledge along with access,
modeling, and visualization tools at the fingertips of a user has great
potential for advancing science, accelerating the discovery process,
enhancing the quality of Earth Science education, and serving as a valuable tool
for earth science industries involved in resource discovery or protection.
One of our goals is to bring this power to all scientists and interested
parties by forming a center consisting of a number of working groups and
nodes that develop and maintain elements of the data system. Broad input
and participation from the Earth Science community would be sought, and
the ultimate goal would be to form a consortium modeled after IRIS and
UNAVCO. The membership would consist of all interested academic organizations
in the U.S. and could easily exceed several hundred eventually. Each member
institution would appoint a representative to the governing body that would
in turn populate a series of committees to address key issues such as standards,
data management, software arrangements, publication strategies, personnel,
and system architecture. Only a small staff need be hired initially.
Initial Organization Structure
[Organization chart: COMMITTEES; STAFF at central node]
The ISC recommends the establishment
of a consortium of academic institutions through:
1. invitations mailed to all Earth Science institutions to participate
2. announcements in national journals and news magazines
3. notification of all earth science societies
In order to take the first step
in this process, an initial group would be formed and would propose to
design and develop selected nodes, and the core of the first comprehensive
Earth information system for research and education, covering scales from
global to local, spatial to temporal. This system will ultimately contain
not only multidisciplinary data sets, but also data manipulation, analysis,
visualization, plotting tools and modeling codes to exploit the digital
data, all accessible on-line in real time via the World Wide Web. It will
be built to handle not only 3D spatial but also temporal changes. There
are countless data sets that could be developed into nodes on this system,
but the anticipated funding levels and prudence dictate that the initial
implementation be modest. The details of this plan will be discussed at
a workshop scheduled for the fall of 2000, but the initial implementation
will be limited to about
six working groups (based on NSF Earth Science Programs), and a central
working group that provides coordination, develops storage and access mechanisms
for already collected data sets, and provides technical support to other
groups. Within each working group sub-nodes could be developed based on
type of data, topic, or region.
[Diagram: WORKING GROUP 1, WORKING GROUP 2, WORKING GROUP 3, WORKING GROUP 4, WORKING GROUP 5]
Frontiers of Earth Science Research
and Geoinformatics
Although it is difficult at best
to predict future research opportunities, the ISC at its second workshop
(May 22, 2000) identified three major research categories
that are likely to bring opportunities for new discoveries in the immediate
future through the creation of multidisciplinary geospatially referenced
databases.
EXPERT WORKING GROUPS: PETROLOGY/GEOCHEMISTRY, STRATIGRAPHY, GEOBIOLOGY, HYDROLOGY, TECTONICS/GEODYNAMICS, GEOPHYSICS, FACILITIES
DATA SOURCES: ACADEMIC, STATE SURVEYS, FEDERAL AGENCIES, INDUSTRY
CREATION OF DATABASE: DATABASE TO BE CREATED BY ACADEMIC COMMUNITY THROUGH RESEARCH PROPOSALS
TOOLBOX: EXISTING AND NEW SOFTWARE TO DEVELOP DYNAMIC MODELS
EXISTING DATABASE: UNIVERSITY, USGS, NASA, NOAA, USDA, DOE, EPA, DoD, OTHERS
Two centuries of observational and
analytical data are available to construct databases. As it is unlikely
that all the data can be verified and digitally cataloged, the ISC recommends
the creation of databases utilizing a progressive growth model based on
near term research needs. Our vision, shown above and described
in the following sections, provides for full community participation
and identifies data sources and expert working
groups responsible for formulating quality control methods and
creating attributes for all disciplinary data. The structure of the database will be constructed
by experts in Earth Sciences who have significant expertise in both GIS
and database management techniques. Additional help will be requested as
needed from the computer science community.
Creating the Earth Science
Information System
A. Expert working groups:
These groups represent expertise in research categories as defined by the programs within
the EAR of the National Science Foundation (see Appendix A for current
working group). Responsibilities of the expert working groups include:
1. defining criteria for quality control within subdisciplines,
2. locating databases available in the various subdisciplines,
3. cataloging available software for data reduction or modeling,
4. providing the attributes of data to be entered into the databases,
5. promoting the utilization of geospatial data.
B. Data sources:
1. published data, which will form the main component of the databases
2. all unpublished (non-proprietary) data meeting standards of quality
(i.e., data that would be published if submitted to a national journal)
as defined by the expert working groups
3. data available from other agencies and programs
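The three data sources listed above amount to a simple acceptance rule, which can be sketched as follows. This is only an illustration under stated assumptions: the `Contribution` record and its field names are hypothetical, and the quality flag stands in for whatever criteria the expert working groups eventually define.

```python
from dataclasses import dataclass


@dataclass
class Contribution:
    """A candidate data set; every field name here is an assumption."""
    name: str
    published: bool = False
    proprietary: bool = False
    meets_quality_standard: bool = False  # as judged by an expert working group
    from_other_agency: bool = False       # supplied by another agency or program


def is_eligible(c: Contribution) -> bool:
    """Apply the three data-source categories as an acceptance rule."""
    # Categories 1 and 3: published data, and data from other agencies and programs.
    if c.published or c.from_other_agency:
        return True
    # Category 2: unpublished data must be non-proprietary and of publishable quality.
    return (not c.proprietary) and c.meets_quality_standard


# Hypothetical candidates used only to exercise the rule.
candidates = [
    Contribution("published isotope ages", published=True),
    Contribution("field notebook transcription", meets_quality_standard=True),
    Contribution("company well logs", proprietary=True, meets_quality_standard=True),
    Contribution("unreviewed spreadsheet"),
]

accepted = [c.name for c in candidates if is_eligible(c)]
```

Here the first two candidates are accepted and the last two are not; the real decision would rest on the quality criteria set by each working group, not on a single boolean flag.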
C. Creating the database
and information system: Competitive proposals funded by NSF will
provide the initial stages for construction of databases. The ISC recommends
funding multiple requests in as many disciplines as possible to
create the nascent interdisciplinary database. As oversight and management
of the growing database are required, the ISC recommends a progressive growth
model, whereby the senior principal investigator will be responsible for
nodal data management until a more centralized clearinghouse is established.
The individual PI will then turn data over to the clearinghouse facility
for permanent storage and distribution to the entire community.
Well crafted initial projects are
critical to the success of the data system and ultimately to the formation
of the consortium. A fundamental objective of the initiative must be the
implementation of a small but highly visible change within the community
by adding a geospatial component to the geologic culture. This initiative
must be perceived as a significant contribution to the community at large.
If the proposed data and information system is not regarded as an exciting
and useful tool, members of the community will not expend the resources
(monetary and time) required to access and ultimately contribute to the
data system. The initiative requires several exciting, well integrated,
and easily accessible examples of data system construction to establish
the infrastructure as an indispensable community utility. To achieve this
goal, the initial projects must address fundamental earth processes and
make possible significant contributions to scientific understanding. It
is not necessary to collect new data for this to be successful; rather,
the emphasis should be on mining existing data resources for the development
and integration of data sets in a spatially and temporally referenced framework.
In development of initial projects, this initiative must be sensitive to
existing data infrastructure (IRIS, UNAVCO, NASA, USGS, NOAA/NGDC) and
the anticipated needs of EarthScope.
SOME SUGGESTED INITIAL RESEARCH
DATABASES FOR GEOINFORMATICS
DYNAMICS OF DIVERSITY: PHYLOGENY DATABASE
STRATIGRAPHY AND CLIMATE CHANGE DATABASE
MAGMATISM AND TECTONIC PROCESSES: GEOCHEMICAL DATABASE
RATES OF CRUSTAL GROWTH AND RADIOMETRIC AGE DATABASE
4-D ARCHITECTURE OF SW UNITED STATES (ANTICIPATED TARGET FOR EARTHSCOPE)
4-D ARCHITECTURE OF REGION X
COOLING AND UPLIFT HISTORY OF CONTINENTS
SPATIAL AND TEMPORAL VARIATIONS IN FAULT SLIP RATES DATABASE
SEISMIC WAVE PROPAGATION: ROCK PHYSICS DATABASE
CRUSTAL THICKNESS: Pn TOMOGRAPHY DATABASE
GPS VECTOR DATABASE
GROUNDWATER CHEMISTRY DATABASE
D. Available and needed Toolbox
of software: The various expert working groups will be responsible
for identifying all available software (academic and commercial) for data
reduction, manipulation and modeling. The expert working groups will also
be responsible for recommending the development of new software to enhance
the utilization of the databases.
This requires development and maintenance
of a well designed front-end for a variety of programs needed to extract,
interface, and model data available from the data system (e.g., the GPS community
has a good model to review: scripts and data structures for access to raw information, and
UNAVCO working groups that make velocities available to the non-GPS community).
The development of a toolbox is a vital consideration: using existing data
sets to construct and verify databases is a major task and, without
the needed software, virtually impossible. Thus these tools are an absolute
necessity for the success of the data system. In an environment characterized
by access to rapidly evolving data sets developed to address specific problems
(curiosity driven research), modification, addition of information, and
reorientation of the structure to address a new motive for data set development
will require an evolving system of software applications.
E. Dynamic models: The creation
of numerical models with graphic and visualization capabilities will be
significant for the growth of the Geoinformatics program. The ISC recognizes
the increased opportunities for fundamental breakthroughs in EAR research
if the right software is available to integrate multidisciplinary data.
F. Linkages to available databases:
Many federal and state agencies, as well as academic institutions and industry,
maintain national, regional, or thematic databases. Fusion of these
databases with those created through the Geoinformatics program will require
development of new software and new protocols, as well as interagency agreements.
Coordination with the educational project, Digital Library for Earth
System Education (DLESE), will strengthen both programs.
Summary and Long Term Vision
Our approach is central to the vitality
and longevity of the data system. Simplicity and flexibility are crucial
in developing a system that can respond to changing technologies and user
needs. At the early stages of development, regional data system nodes will
be crucial to gathering and maintaining regional contributions, and as
an interface with the local community. The evolving data system would require
the establishment of an interim facility where fundamental data sets are
housed (e.g., available tools and programs), and linked via high-bandwidth
Internet connections needed to handle data access and transfer. The data
system must be flexible and have minimal infrastructure requirements (PC
versus UNIX based systems; various data management protocols; peripheral
hardware requirements) and a minimum of mandated data structure requirements.
Metadata are needed and development efforts must be fully supported. For
the data system to be successful, there must be an incentive for users
to contribute data to the community system. NSF
and the community could develop a system of rewards via some formal system
of citation. A mechanism for publication of data sets (with or without
interpretation) may interface with emerging digital publication systems
and it is conceivable that the data system initiative may be able to enlist
different societies (AGU, GSA, AAPG, etc.) to support electronic publication
of data sets, depending upon the contents.
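As one way to picture how metadata and a formal citation could work together as an incentive, the sketch below defines a small metadata record that reports which required fields are still missing and renders a citation string. The field names, the required-field list, and the citation format are all assumptions made for illustration; they do not follow any particular metadata standard or society style.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DatasetMetadata:
    """Minimal descriptive record for a contributed data set (hypothetical fields)."""
    title: str = ""
    contributors: List[str] = field(default_factory=list)
    year: int = 0
    discipline: str = ""
    node: str = ""  # node or facility that holds the data

    def missing_fields(self) -> List[str]:
        """List required fields that are empty, in declaration order."""
        required = ["title", "contributors", "year", "discipline", "node"]
        return [name for name in required if not getattr(self, name)]

    def citation(self) -> str:
        """Render a citation so that contributors receive credit for the data."""
        if self.missing_fields():
            raise ValueError(f"incomplete metadata: {self.missing_fields()}")
        authors = ", ".join(self.contributors)
        return f"{authors} ({self.year}). {self.title} [data set]. {self.node}."


# Hypothetical record; the names below are placeholders, not real contributors.
record = DatasetMetadata(
    title="Example groundwater chemistry data set",
    contributors=["A. Contributor", "B. Contributor"],
    year=2000,
    discipline="Hydrology",
    node="Example regional node",
)
text = record.citation()
```

Refusing to produce a citation until the metadata are complete is one possible design choice: it ties the reward (credit) directly to the documentation the system needs.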
In the long term, the goal of this
effort is to build the initial organization into a consortium overseeing a
comprehensive, effective national program in support of Geoinformatics.
This effort will take a number of years to mature, and will require considerable
thought and deliberation. The funding required is substantial and will
probably require interagency cooperation.
Overall, the benefits of this program
would include the continued scientific leadership of the United States
in Earth Sciences, as well as the opportunity to construct a global
database that would uniquely characterize our planet.
ACKNOWLEDGEMENTS:
The committee is appreciative of the funding provided by the Earth Sciences
Division of the National Science Foundation for the workshops on Geoinformatics.
We also thank the staff of the American Geophysical Union, Washington,
D.C. for their support in organizing the meetings.
APPENDIX A
INTERIM STEERING COMMITTEE
A. Krishna Sinha, coordinator
Petrology-Geochemistry-Isotope
1. Frank Spear, Rensselaer Polytechnic Institute, 518-276-6103, spear@rpi.edu
2. Lang Farmer, University of Colorado, 303-492-8141, farmer@terra.colorado.edu
3. A. Krishna Sinha, Virginia Tech, 540-231-5580, pitlab@vt.edu
Tectonics
4. Roy Dokka, Louisiana State University, 225-388-2975, rkdokka@geol.lsu.edu
5. John Oldow, University of Idaho, 208-885-7327, oldow@uidaho.edu
Stratigraphy-Geobiology
6. Walter Snyder, Boise State University, 208-426-3645, wsnyder@boisestate.edu
7. Charles Marshall, Harvard University, 617-495-2572, marshall@eps.harvard.edu
8. Carl Flessa, University of Arizona, 520-621-7336, kflessa@geo.arizona.edu
Geophysics
9. William Holt, SUNY Stony Brook, 516-632-8215, wholt@horizon.ess.sunysb.edu
10. Randy Keller, University of Texas, El Paso, 915-747-5850, keller@geo.utep.edu
11. Nick Christensen, University of Wisconsin, 608-265-4469, chris@geology.wisc.edu
Hydrology-Surficial Processes
12. Ramon Arrowsmith, Arizona State, 480-965-3541, ramon.arrowsmith@asu.edu
13. Herb Wang, University of Wisconsin, 608-265-0693, wang@ls.admin.wisc.edu
14. Everett Springer, Los Alamos National Lab, 505-667-0569, everetts@lanl.gov
GIS/Database/Facility
15. Dogan Seber, Cornell, 607-255-1159, ds51@cornell.edu