Lamont-Doherty Earth Observatory

Overview of DLESE

The Digital Library for Earth System Education (DLESE; http://www.dlese.org) is envisioned as a facility that provides: (a) ready access to high-quality educational materials about the Earth and environment for use by educators and learners at all levels, (b) information, tools and services to maximize the usefulness of the materials provided; and (c) a community center that fosters interaction, collaboration and sharing among Earth science educators and learners. Development of the library is underway as a nation-wide distributed effort, overseen by the DLESE Steering Committee and coordinated through the DLESE Program Center at Boulder (DPC). Our project represents one component of this distributed effort.

Construction of DLESE began approximately 2 1/2 years ago. Since then, the DLESE vision has been defined and articulated, an effective governance system has been put in place, policies have been developed concerning collections, privacy, intellectual property, etc., the geoscience education and research communities have been engaged through workshops, listservers, and working groups, and the technical infrastructure of a working library has been built from scratch. DLESE Version 1.0 opened to the public in August 2001. Version 1.0 features a Discovery System that enables users to search for resources by grade level, educational resource type, and keyword, a web-based Resource Cataloger, and community services such as discussion forums and postings of educational opportunities. The further development of DLESE over the next 5 years, through versions 2 and 3, has been planned out in a Strategic Plan (DLESE 2001), a versioning document, and a 5 year NSF/GEO proposal for funding of DLESE core functions.

Among the digital library efforts funded by NSF, DLESE is notable for its emphasis on the importance of community (Marlino et al, 1991); indeed, its founding document (Manduca & Mogk, 2000) is subtitled "A Community Plan." In the DLESE vision, the community of library users is also the community of library builders, creating, gathering, and reviewing the resources, shaping the governing policies, and providing constant feedback into the development of the technical infrastructure. This community focus is in part a response to the research finding that systemic educational change requires large-scale, long-term community participation in and ownership of the process of change (AAAS, 2001; Rogers, 1995), and partly a result of the way in which DLESE has grown through a series of grass-roots workshops and planning sessions (NSF, 1996; NSF, 1997; Manduca & Mogk, 2002; DLESE, 2000; DLESE, 2001c).

Results from Prior NSF Support

The four PI's of the present proposal are currently funded by the NSDL Collections Track for work entitled "Collaborative Project: To Gather, Document, Filter and Assess the Broad and Deep Collection of the Digital Library for Earth System Education (DLESE)." This work comprises four tasks in support of the larger DLESE vision: (1) locating on-line educational resources relevant to Earth and environmental education, (2) cataloging those resources, i.e. associating appropriate metadata with each resource; (3) developing and implementing procedures to find the best resources within the DLESE Broad Collection for inclusion in a "Reviewed Collection" of highest quality resources; and (4) assessing the DLESE collection, i.e. systematically comparing the scope and balance of the existing collection with that desired by the user community. The performance period for our existing grant is 9/01/00 to 8/31/02. Our accomplishments to date:

Task 1: Gathering Resources. Responsible PI: Christopher DiLeonardo, Foothill-DeAnza Community College District. Award Number: 0085831 Amount: $106,067.

The collection effort of this project reviewed several hundred websites and thousands of pages of web-based resources. Following the principle of gathering a "broad and deep" collection for the library, the gathering team under the direction of DiLeonardo pursued a thematic approach, evaluating and referring resources for cataloging across a spectrum of topics. A major accomplishment in the initial phase of this project was the development of principles regarding granularity and the "level" of resource collected as an object for the library. Approximately four hundred individual objects will have been added to the DLESE collection by the end of the project from this gathering effort.

As a member of the DLESE Steering Committee, DiLeonardo has helped set the direction for the library's development and evolution. He co-wrote the DLESE Articles of Federation (DLESE 2001b), which sets the framework by which institutions can interact in both building and operating this community-based, distributed library. DiLeonardo has also worked on subcommittees investigating business models to support DLESE, and is most recently charged with developing strategies for an evolving governance system to match the library's changing focus from development to implementation.

Task 2: Cataloging Resources. Responsible PI: Sharon Tahirkheli, American Geological Institute. Award Number: 0085787 Amount: $98,711.

The findability of a resource in a digital library depends on the quality of its metadata (Gilliland-Swetland, 2000). The AGI group's job has been to associate the appropriate metadata with resources identified for inclusion in the DLESE Broad Collection, using the IMS standard with DLESE-specific extensions (DLESE Metadata Working Group, 2002). We cataloged resources referred by DiLeonardo, by the other collaborators on this project, and by Dave Mogk's collections group at Montana State University, as well as those suggested by the community though the DLESE website, and some we found ourselves in the process of cataloging. The current metadata fields that are being applied include: Title, URL, Mirror URL, Description, Audience (Potential Users), Resource Type, Subject, Technical Information (Software requirements for use), Copyright, Cost, Coverage (Geographic information), Geography Standards, Science Standards, Keywords, Relationships to Other Resources, Resource Creator and Resource Cataloger. At this writing, we have completed cataloging of 800 records. With expected staffing through the end of the current grant, we expect completion of close to 2000 resources.

Working closely with the DLESE Program Center, we have provided feedback on the functionality of the DLESE Cataloging tool (http://catalog.dlese.org/catalog/launch.html) through beta testing and subsequent updates. While examining resources and developing metadata, we have provided suggestions for the improvement of DLESE best practices, commentary on user-friendliness of the cataloging tool, and input into the development of taxonomies within the best practices. We have identified and examined issues such as granularity, and referred these issues to the DLESE Program Center, the Collections Committee, and to our collaborators for discussion and consideration. Tahirkheli also served on the NSDL metadata committee.

Task 3: Developing and Implementing a Review System. Responsible PI: Kim Kastens, Lamont-Doherty Earth Observatory of Columbia University. Award Number: 0085827 Amount: $337,333.

DLESE's Collections Policy (DLESE 2001d) calls for a Broad Collection, containing a wide range of resources on all relevant topics, plus a Reviewed Collection of resources which are considered to be of highest quality. The rationale for establishing a Reviewed Collection is to help library users find quality teaching and learning materials, and to help resource creators achieve academic career recognition. The rationale for maintaining an unreviewed or Open Collection to is provide the broadest possible range of resources, and to provide a forum in which resource users can provide feedback to creators to iteratively improve the quality of individual resources.

As one of several possible pathways into the Reviewed Collection, we have implemented and are currently testing the DLESE Community Review System (CRS; http://dlese.ldeo.columbia.edu/). This system solicits web-based feedback from educators in the DLESE Community who have used the resource with their learners, and combines it with specialist reviews to evaluate resources against seven selection criteria set by the DLESE Collection Committee: scientific accuracy, pedagogical effectiveness, ease of use, quality of documentation, robustness as a digital resource, ability to inspire or motivate, and significance. The premises behind the design of the review system, as well as the information and decision flow pathways, are detailed in Kastens & Butler (2001). This hybrid review system taps into the power of the Web and the strength in numbers of the DLESE Community, and is consistent with DLESE's guiding philosophy of library user as library creator (Manduca and Mogk, 2000). The process is overseen by an Editorial Review Board of scientists and educators (http://www.ldeo.columbia.edu/DLESE/collections/ERB_members.html), and has been refined through feedback from the community at an AGU special session (Kastens, 2000), the 2000 and 2001 DLESE Summer Leadership Workshops, a joint meeting of the Collections Committee/Editorial Review Board, and a meeting of the DLESE Steering Committee.

Our web-based recommendation engine gathers separate feedback from three categories of commenters: (a) non-educators, (b) educators who examined the resource but did not actually use it in their own classroom, and (c) educators who actually tried the resource with real learners in their classroom or other learning context. All of the feedback is emailed automatically to the resource creator, but only the feedback from educators who have used the resource with real learners is used in the Editorial Review Board decision about whether to admit a resource to the Reviewed Collection. The latter category of reviewers fill out scoring rubrics (figure 1a) pertaining to each of the Reviewed Collection selection criteria, and may enter free text comments if they chose. In addition, educators are encouraged to share "Teaching Tips," which will be available to potential users of the resource. Finally, the recommendation engine asks educators whether they used the resource in a challenging teaching or learning situation, such as with visually-impaired students, urban dwellers, limited English proficiency students, etc., and, if so, would they recommend the resource for use in such situations (fig. 1b).

The answers coming back from the recommendation engine are all stored in a relational database, and used to generate several types of automatic reports and information flows. Whenever an educator submits a review, an email summarizing the review content is sent to the resource creator and the Community Review System editor, with a blind cc to the reviewer. Information from educators who chose not to use the resource can be compiled into summary tables aimed at spotting resources that have been miscataloged, for example specified for too low or two high a target audience (fig 1c). Information from educators who used the resource with real learners is compiled into summaries which display the quantitative data as bar graphs and the free text comments chronologically (fig. 1d). These summaries are distributed to the responsible Editorial Review Board member, and to the resource creator.

The recommendation engine is implemented as a web-based front end, with the functionality implemented in php (4.1) code executed by an Apache release 1.3 server. The metadata associated with the CRS is stored in a PostgreSQL release 7.2 relational database and accessed via php. Uniform page style is maintained by using cascading style sheets. All code, including the web pages are versioned using RCS. Initially the Community Review System was running on a Sun Ultra 10/440 under Solaris 5.8 but was ported to a Dell OptiPlex GX240 running under a Linux 2.4 kernel from the Red Hat 7.2 distribution. The new server is about four times faster.

In March 2002, the first test group of three resources entered the Community Review System. From these earliest tests, we have verified that teachers will in fact fill out our scoring rubrics and submit useful commentary, that the review process doesn't take an unreasonable amount of the educators' time (less than ten minutes on average), and that resource creators find the feedback valuable (e.g.: "The feedback has been very interesting. I was particularly interested in the comments that several teachers wrote, very helpful. Also, the questions that you use in the review are poignant and will be very helpful to us as we review our own projects." Email communication, Joshua Koen, Center for the Improvement of Engineering and Science Education, project director of the Global Sun/Temperature Project). By the end of the current grant, we anticipate that the

Community Review System will be well-tested and running smoothly, and that a modest number of resources will have entered the DLESE Reviewed Collection via this pathway.

In addition to the technical and educational efforts described above, the L-DEO team have contributed substantially to the governance, policy and oversight functions required to make DLESE and NSDL function as integrated libraries and as learning communities. Project engineer Dale Chayes serves on the DLESE Technical Committee. PI Kastens chairs the DLESE Collections Standing Committee and is a participant on the NSDL Evaluation Committee. As Chair of the Collections Committee, Kastens is overseeing the development of DLESE policies on Collections accession, Deaccessioning of resources and collections, Pathways into the reviewed collection, Criteria for entrance to the Broad Collection, and Career recognition for resource creators. As a standing committee chair, Kastens also participates in DLESE Steering Committee meetings, served on the program committee for the 2000 and 2001 summer leadership conference, and serves on the nominating committee for the Steering Committee membership. She organized two sessions at each of the 2000 and 2001 summer leadership workshops.

Publications:

Kastens, K. A., 2000, A ‘Community-Review" Mechanism to Build the Collection of the Digital Library for Earth System Education, EOS Transactions AGU, vol 81, no 19.

Kastens, K. A., and John C. Butler, 2001, How to identify the "best" resources for the reviewed collection of the Digital Library for Earth System Education, Computers and the Geosciences, v. 27 (3), 375-378. (on Web at: http://www.ldeo.columbia.edu/DLESE/collections/CGms.html)

Task 4: Collections Assessment. Responsible PI: Barbara DeFelice, Dartmouth College. Award Number: 0085839 Amount: $57,969.

Collections assessment (Mosher, 1990; Hall, 1985; OCLC, 2002) is a thorough and systematic comparison between the actual collection of materials in a given library, and the ideal collection of materials that would be desired by the users, staff, funders, and/or overseers of the same library. The concept of collections assessment comes from efforts to evaluate print collections, and although not all of the techniques from the print collection environment apply to the digital environment, the concept remains useful (Bond, 1998).

Collections assessment is a necessary prerequisite for building any library's collection in a balanced and efficient manner. Collections assessment may be particularly important in a library built through community contributions, in order to avoid a collection dominated by the individual research and educational concerns of a small number of very active community members. The PI's of this project have demonstrated the need to build balanced collections for DLESE and have promulgated this awareness throughout the distributed DLESE collections efforts. A particular concern has been the addition of ecosystem materials, as well as materials in other underrepresented topics, so that the collection reflects a wide definition of earth system science.

Much of our work for the past 18 months has been spent on the prerequisites for collection assessment. In order to be able to gather the data needed for collection assessment, DeFelice worked closely with the DPC metadata group to add needed metadata elements. These range from details such as whether the resource is free or fee-based, to major issues surrounding the specific controlled vocabulary keyword terms. From this later concern, the current list of 30 topic terms was chosen as an initial measure until a vocabulary for earth system science is developed.

At the beginning of our project, the DLESE community and the geoscience community in general were not familiar with the concept of collections assessment. DeFelice presented the information and type of statistics needed for collections assessment to the DLESE Collections Committee, at a DLESE metadata workshop, at the 2000 DLESE Summer Leadership meeting, and at numerous meetings with the DPC staff.

DeFelice developed an understanding of the data structure of the DLESE catalog, and made recommendations about which fields should be indexed so that there is access to this data for the purposes of collections metrics. The DPC technical staff has now agreed to do this where there is metadata, the information is important, and where the data structure allows for this.

Rubrics to measure collection depth for DLESE v.1.0 were developed and are being refined. Specifications and requirements for the DLESE Discovery and Management systems for gathering use and collections metrics were developed, drawing on work throughout the digital library community on how to report and interpret collection and use metrics (Association for Research Libraries 2001 and 2002, Digital Library Federation 2000, International Coalition of Library Consortia 1998, Troll 2001). These requirements cover data from the actual collection, plus data on requests made from both the "Browse" and "Search" capabilities of the Discovery System.

Finally, within the last month, we have begun to see tangible results from all of this prerequisite work. We can now document the scope and balance of the actual collection with respect to the three critical parameters of learning context (early elementary through graduate/professional, plus informal and general public), topic (from a list of 30 topics), and resources type (e.g. photograph, map, classroom activity). In addition, we can quantify the distribution of requests to the Discovery System "Browse" facility according to these same three parameters. For requests to the Discovery System "Search" facility, we can now quantify the distribution of requests according to learning context, and identify the most heavily requested search terms, and examine search terms that result in null-returns.

Figure 2 shows an example of the kind of analysis we can do with our newly-obtained data. This graph compares how many times each topic term was clicked using the Browse feature in the Discovery system during the month of March 2002, versus how many items with that topic in the metadata were in the collection as of early April. At this point, this tantalizing data raises more questions than it answers. Are agricultural sciences and cryology really so important to the DLESE community? We will watch closely as additional months of data accumulate.

In collaboration with the DPC technical staff, we have identified further work that is needed to pull together content data, browse data and search data. Browse and search data are obtained from different data streams but both are needed to understand what users are looking for in the collection. A preliminary report on Version 1.0 using the newly available data from January to May will be provided at the DLESE Annual Meeting in June 2002. This report should help guide the collection building efforts from DLESE v.1. 0 to v. 2.0.

In addition to her work on collections assessment, DeFelice contributed substantially to the governance of DLESE and NSDL during this award period. As the only professional librarian on the DLESE Steering Committee, she played a central role in the development of the DLESE Collections Policy, the Scope Statement, the Collections Accession Policy, and the deaccessioning policy.

Publications:

DeFelice, Barbara J., 2000, Building a Community Centered Digital Library for Earth System Education, Proceedings of the Geoscience Information Society, vol. 31, p. 91-95, 2000. (Paper presented at the Geological Society of America Annual Conference, November 2000).

DeFelice, Barbara J., 2001, "Another Node on the interNet", Computers and Geosciences, vol. 27, no. 5, p.611-613, June 2001.

Figure 2: Example of the kind of data we can obtain by comparing the resources in the actual collection with requests for resources made by users through the DLESE Discovery System. The Discovery System data shown here represent requests during March 2002. (Above) A comparison of browse results versus library content, with respect to resource topic.
(Below) A comparison of search results versus library content, with respect to learning context (grade level). Note that the histogram of resources in the actual collection peaks in the undergraduate grade range, but the requests from users are most numerous in the high school range.

Needs of the DLESE Collections

During the period of our existing grant, the DLESE Collection has grown from zero resources to 1345 individual learning resources, through our efforts and those of other collecting/cataloging groups and individuals. The DLESE Program Center has built a working Discovery System, which is now visited by 250 people per day (J. Weatherly, DPC, personal communication, April 2002). However, the DLESE Collection does not yet provide a choice of excellent resources, ideally-suited for the target audience, focused on the appropriate topic, for the majority of visitors. We consider that the most urgent needs of the DLESE Collections include:

Many more resources in the Broad Collection.
More resources in the Reviewed Collection.
An accurate understanding of what kinds of resources DLESE holds, along dimensions of interest to potential resource users (e.g. topic, grade level, learning resource type).
Methodologies for routinely and methodically tracking changes in the scope and balance of the DLESE Collections as they grow, and for identifying gaps and thin spots.
Methodologies for capturing feedback on what kinds of resources users of the library would like to have in the Collection. Both analytical (e.g. tabulating keyword requests in the Discovery System) and qualitative (e.g. focus groups) methods should be used.
Communications pathways which inform resource creators and collectors of important gaps and thin spots in the Collections, which need to be filled either by finding suitable existing resources or by creating new resources.
Resources which are cataloged with accurate and complete metadata, to ensure that a manageable number of the most relevant resources are returned from a search, even as the collection size multiplies.
Resources which are searchable according to additional criteria, including: whether or not the resource is in the Reviewed Collection, alignment with educational standards, association with a geographic location or with a geological timeslice.
A respected, well-documented, smooth-functioning pathway for resources to be accepted into the Reviewed Collection, a pathway that is open to all relevant resources, regardless of audience type, topic, resource type, etc.
Mechanisms to contribute to the career recognition of creators of excellent quality educational resources.

Proposed Work

Working in close collaboration with the DLESE Program Center, the Community Collaboratories that DLESE will be setting up within the next year, the NSDL Core Integration Office, and the DLESE and NSDL Standing Committees, we proposed to accomplish four tasks: (1) Periodic assessment of the scope and balance of the DLESE Collection; (2) gathering of additional resources for the DLESE Broad Collection to fill gaps and thin spots in the Collection; (3) application of accurate and complete metadata to those resources; and (4) development of the Community Review System from a testbed into a respected, well-documented, smoothly-operational pathway into the DLESE Reviewed Collection. The division of labor between the four collaborative institutions of this proposal, the DLESE Program Center, and the DLESE Community Collaboratories is detailed in the DPC Subcontract Budget Justification.

Task (1) Collections Assessment.

We will methodically and systematically compare the scope and balance of the actual DLESE Collection, with the collection desired by the library users and by the DLESE community as expressed through searches and browses on the DLESE Discovery system, in community meetings and through the DLESE Scope statement. The results of periodic collection assessments will indicate areas where the collection needs to be developed, and will guide collecting efforts, collection building efforts and the development of new educational resources.

(1a) Assess scope and balance of actual collection: At present, we are assessing the scope and balance of the actual collection for learning context (aka grade level), topic, and resources type. We access this information through a web-based Collections Management System, built by the DLESE Program Center. This system enables us to ascertain how many resources exist in the collection with a certain attribute or combination of attributes, and to examine lists or descriptions of resources possessing these attributes.

We have only had access to the Collections Management System for one month, so that an immediate pressing task is to accumulate some time series history of the growth of the collection. Over time, is the ratio of high school to undergraduate resources in the collection growing to be more in line with the ratio of requested resources? Over time, is the percentage of "active learning resources" (e.g. field activity, lab activity) growing, as instructors become more familiar with using and creating such activities? At our request, the DPC has begun to "snapshot" the metadata collection once a month, and they will provide an Assessment version of the Collections Management System which will work with these snapshots, so that we will be able to generate data on the current and historical status of the library collection.

The Collection Management System does not currently provide access to all of the parameters recorded in the metadata framework. Some missing fields such as cost/no cost, and alignment with science and geography standards, are of interest for collections assessment. To make it possible to assess with respect to these fields, we will work with DPC to determine which such fields can and should be indexed to make them available for searching in the Management System.

At present, the Collections Management System provides data only as plots on the screen, similar to those visible in the browse facility of the publicly-accessible Discovery System, so it is tedious to compile data on multiple combinations of attributes. Under their subcontract from this grant, DPC will develop a capability to download digital data in a common format (e.g. Excel-compatible tab-delimited) from the Assessment version of the Collections Management System to allow reconnaissance data exploration. Following an initial stage of data exploration, DPC and our Collections Assessment PI will agree on a set of most valuable metrics for documenting the scope and balance of the collection. DPC will then develop an automated reporting capability to generate these metrics automatically on a monthly basis.

It is only possible to assess a digital library collection according to those parameters which are recorded in the metadata framework. For this reason, our collections assessment group will continue to work closely with the DPC metadata group to ensure that the evolving DLESE metadata standard includes those parameters which are of interest for collections assessment (as well as those which are important for resource discovery.) As the collection develops and the metadata is enhanced, it should be possible increase the number of parameters on which we assess the collection. Examples of important parameters are: geographic distribution of resources focusing on a particular locality, temporal distribution of resources focusing on a particular slice of geologic time, language of the resource, and earth system science controlled vocabulary.

(1b) Document scope and balance of collection desired by users: To determine the desired collection as expressed by users of DLESE, we will compile data on the requests for resources made through the DLESE Discovery System, through both the browse and search capability. We will pay particular attention to null-returns, seeking to identify important gaps in the collection.

The DPC is currently collecting "click-stream data" from the Web server, which tracks in detail every button selected and hyperlink clicked on the DLESE site, including searches and browses in the Discovery System. The data collected includes date and time, a count for each click on a metadata parameter, whether or not the user looked at more than the first page of results, and the IP address from which the request came. For browse requests, the data include which learning context or topic or resource type the user clicked. For search requests, the data include which (if any) learning context the user clicked, what s/he entered in the search field, and how many resources were returned by the search.

Under their subcontract to this project, the DPC technical staff will develop filters for the click-stream data based on the specifications developed by our collections assessment PI. Before use in assessment, non-user hits from web-crawlers and from internal requests by DPC staff and collections-groups must be removed. Concerns about the user-request data that need to be dealt with in the filter construction include misspellings, upper/lower-case equivalence, singular/plural equivalence, terms that are not commonly used in the titles, descriptors and subject terms, and search interface issues.

Initially, DPC will deliver the filtered data for our project in a common format (e.g. Excel-compatible tab-delimited) to facilitate reconnaissance data exploration. Some of this background work has taken place, so results will be available early in the first year of the grant. Following an initial stage of data exploration, DPC and our Collections Assessment PI will agree on a set of most valuable metrics for documenting the search and browse requests coming from the Discovery System. DPC will then develop an automated reporting capability to generate these metrics automatically on a monthly basis. Results will be analyzed and any changes needed in the filters and reports will be made during the second year.

Some analysis will have to be done manually by our collections assessment team, such as determining whether a low-return search should have retrieved more results because the content does exist. It will also be necessary to manually map keywords used in searches to the controlled vocabulary terms used in browses in order to get a better sense of the topics that are requested by DLESE users.

Just as we document changes over time in the actual collection, we will also document changes over time in the balance of requests submitted through the search and browse functions of the Discovery System. Is the fraction of requests that result in null-returns decreasing as the collection grows? Are new areas of interest emerging, which might require a redirection of some collection effort? Have requests for elementary and informal education resources increased, in response to DLESE's outreach efforts towards those communities?

In proposing to use users search and browse requests as a proxy for user desires, we recognize that there may be confounding factors in this data, introduced, for example, by clumsy search technique. It would be desirable to validate the user-request data through observation of user behavior in the Discovery System. Although such a study is not in the purview of this project, we will keep in close contact with the ongoing usability studies of the Discovery System, and with the efforts of the DLESE Evaluation collaboratory, and perhaps with the NSDL Evaluation Working Group, who may undertake this kind of study.

(1c) Compare actual and desired collection, and disseminate results and recommendations: The audiences for the DLESE Collections Assessment include the resource gathering and cataloging team of this collaborative project, current and potential builders of themed collections within DLESE, developers of instructional materials, funders of collection building and instructional materials development projects, and the builders of other digital libraries who might learn from our methodology. Results will be disseminated via annual reports to the NSDL program at NSF, through presentation at professional meetings, at the NSDL PI's meeting, at the DLESE annual meeting, via the NSDL "Whiteboard" newsletter, and via the NSDL and DLESE websites.

Our first comprehensive assessment, reporting on DLESE v. 1.0, will be disseminated in the fall of 2002, at the end of the current grant period. Subsequently, we will issue annual assessment reports, reporting on DLESE v.2.0 in December 2003, followed by a final assessment at the end of the grant period in December 2004.

We anticipate that each report will contain (a) graphs/tables/matrices, (b) rubrics and (c) recommendations. The graphs/tables/matrices will compare the proportion of resources in the actual collection possessing an attribute or combination of attributes versus the proportion of requests for that attribute in Discovery System searches and browses. Matrices along two collection axes (e.g. learning context and topic) will be used to discover thin spots and "hot spots" in the collection. Time series plots will be used to explore the growth of the library through time.

Our proposed collections assessment scoring rubrics are analogous to those used by educators in the K-12 community, and are designed to document the comparative depth and breadth of the collection by topic or other major parameter such as learning resource type. The collections rubric score is determined primarily by the collection and user-request metrics. However, the relative availability of materials has weight in the score as well, since a subject area that has few items in DLESE but is well represented in other libraries or collections should have a lower score than a subject for which there are few items in DLESE but also few items in other collections. For knowledge of the relative availability of materials, we will draw on DeFelice's contacts and experience for physical Earth systems (lithosphere, hydrosphere and atmosphere), and on the experience and contacts of C. Rinaldo for ecology and the living environment. (C. Rinaldo, an ecology/biology librarian at Harvard, is a consultant on this proposal.) The scoring rubrics need to be customized for each release version of DLESE to accommodate changes in the metadata framework (Tammy Sumner, personal communication, 2002).

The recommendation section of each report will spotlight areas within the DLESE scope in which the collection needs to be strengthened, based on user requests, numbers of items available, and zero result hit analysis. Based on knowledge of other related resources, we will try to distinguish between areas in which existing resources need to be identified and cataloged, and areas where new resources need to be created from scratch. Finally, we will recommend a baseline set of collections metrics and collections-related use metrics that can be use with DLESE thematic collections and with other NSDL collections projects.

Task (2) Gathering Resources:

Under our present grant, our resource-gathering effort has spanned the breadth of Earth and environmental science, seeking to build a starter-collection with at least a few resources of interest to a wide range of potential DLESE users. In parallel with our efforts, other groups and individuals have been gathering resources in their own areas of interest, and submitting those resources through the DLESE Resource Cataloger; this includes several groups funded by the NSDL Collections track. For the next 2.5 years, our collections gathering effort will be guided by the Collections Assessment. We will focus on filling gaps or thin spots in the collection, seeking to avoid a "lumpy" collection rich only in those areas collected by specific interest groups. This collection method differs, not only from our past efforts, but also from ongoing collection efforts in most other Digital Libraries. This change in collection philosophy is in response to the evolution of the DLESE collection and mirrors the best practices in collection development in traditional libraries.

We will leverage our existing experience to grow the library collection more rapidly over the next 2.5 years. An anticipated 1,500 to 2,000 objects representing over 10,000 web pages will be gathered as the result of this project. By utilizing the collection assessment approach, we will maintain the guiding philosophy of gathering a collection that is both "broad and deep."

Task (3) Cataloging Resources.

The findability of a resource in the DLESE or NSDL Discovery system depends on the quality of its metadata. DLESE has the technical infrastructure to allow anyone from the DLESE community to catalog a resource which they like (the Community Cataloger http://catalog.dlese.org/catalog/launch.html). However, because metadata is such a new concept to the geoscience education community, this capability has resulted in only a trickle of resources from rank-and-file members of the community, and some of the metadata contributed by these volunteer catalogers has been incomplete or otherwise flawed. DLESE is working on organizational and technical structures that should help to recruit and train individuals and small groups from the community to systematically locate and catalog sub-collections of resources in their area of expertise and interest (e.g. the Collections Accession Policy, the Collections Management System).

However, the DLESE Broad Collection needs to grow now, and it needs to grow swiftly, in order to attract users to the Library. We think that for at least the next two years, it will be cost-effective and efficient to have geoscience information professionals delivering a steady stream of accurate, complete metadata records into the DLESE metadata database. Our metadata team is drawn from the group that oversees GeoRef (http://www.georef.org), a bibliographic database of 2.4 million items, including articles, books, maps, conference papers, reports, and theses covering the geosciences. Members of the GeoRef group have worked with the DLESE cataloging tool during its development and are experienced with its application to identified resources. Given past experience with the application of the tool to catalog resources, we would reasonably expect to be able to catalog approximately 2400 resources with eighteen months of cataloger time.

Tahirkheli's team will also continue to work with the DLESE Program Center metadata staff to refine the DLESE metadata standard, which is expected to grow to include an Earth-System specific controlled vocabulary, additional education standards, and georeferencing.

(4) Developing and Implementing the Community Review System.

Over the period of the next grant, we will continue to increase the scope and breadth of the Reviewed Collection. Our aim is that between 5% and 10% of the best resources in the Broad Collection will be selected for inclusion in the Reviewed Collection. Only resources which the creator has proactively submitted for consideration will begin the Community Review process.

Over the period of the next grant, we will:

(4a) Continue to test and refine the recommendation engine and editorial procedures, guided by community feedback: The feedback from our current test group of resource creators has been extremely valuable for improving the nature of the questions we are asking and the navigation pathways through the recommendation engine. We have identified approximately a dozen additional resource creators who are willing to work with us collaboratively to improve the review process from the resource creators' perspective (see, for example, attached letter of support from Tamara Ledley, TERC). We gather resource creators' feedback via email, conference call, and face-to-face meetings.

So far, we have less feedback from resource reviewers, only the occasional comment typed on the feedback page of the recommendation engine. To add to our understanding of how reviewers are using and responding to the CRS recommendation engine, we will work with the DPC usability laboratory equipment and staff, through their subcontract to this grant, to conduct usability sessions with educators representative of the potential reviewer community.

The CRS Editorial Review Board (ERB) met once in its entirety in February 2001, and the pedagogical subgroup met again in May 2001. In between, the ERB has conducted its work by email and phone. We plan another face-to-face meeting early in the proposed award period, when the review system is moving from its test phase to an operational phase, to agree on efficient and effective procedures for the editor-mediated steps of the review system.

Finally, we will continue to use the DLESE annual summer leadership workshops, Collections-, User- and Technical-Committee listservers and meetings, and appropriate interest group meetings and listservers, as venues to gather feedback from resource creators, resource reviewers, and library users about the directions in which they would like to see the Reviewed Collection develop.

(4b) Develop New Indices and Reports for Resource Users: We have already implemented several forms of reportback from the Community Review System to the Editorial Review Board, the resource creator, and the cataloging staff (see fig. 1 and "Results from Prior Support" section). Feedback from potential Reviewed Collection users at the 2001 DLESE summer leadership workshop told us clearly that users wish to learn more about the nature of a resource which they are considering using than just the single fact that it has been admitted to the Reviewed Collection. Therefore, one of our development efforts in the coming performance period will be to develop indices or reports that reveal informative aspects of the insights collected in the review process, while still protecting the confidentiality of individual reviews. Our ideas:

We will experiment with reporting average scores on the recommendation-engine's quantitative questions for a given selection criterion, aggregating across all reviewers and all rubrics pertaining to that selection criterion.
We will have the Editorial Review Board member(s) who oversaw the review of that resource write a brief "Editor's Summary," highlighting the strengths and weaknesses of the resource as revealed by the review process in its entirety.
We will develop a display methodology for the information encoded into the "Teaching Tips" field of the recommendation engine (see http://dlese.ldeo.columbia.edu/review/used_yes2.html). This is the only element of the review that will be made available verbatim (although anonymously) to potential users.
We will develop a display methodology for the information encoded in the "challenging teaching & learning situation" section of the recommendation engine (see figure 1 and http://dlese.ldeo.columbia.edu/review/used_yes2.html). Our preliminary plan is a dynamically-generated web page for each of the categories of special learners. We will recruit a seventh member of the Editorial Review Board, with expertise in challenging learning situations, to digest the information flow coming in from this part of the recommendation engine, and to help develop ways to report this data in an informative way.

Each of these ideas will be prototyped, and then tested by the Editorial Review Board, the Collections Committee, and other invited members of the DLESE Advisory Structure. The most effective of the ideas will be incorporated into the mainstream of the Community Review System.

(4c) Develop an NSDL Annotation Service to Disseminate Review Information: NSDL policy (Hillman, 2001, 2002) is that review information should not be conveyed via the object-level metadata associated with a given resource. Rather, it should be disseminated via an "annotation service," which serves information about resources. We propose to develop an annotation service following the NSDL protocols, which will provide access to the following types of annotations:

a review status flag for each resource that is currently under review, has been under review in the past, or reached the Reviewed Collection via the Community Review System.
teaching tips for each reviewed resource.
the Editor's Summary for each reviewed resource.
average scores, or other informative aggregated indices developed from the quantitative questions on the recommendation engine.
information on suitability of the resource for challenging teaching and learning situations.

(4d) Develop mechanisms to showcase Reviewed Collection items within the DLESE Discovery System: The DLESE Program Center will harvest the review status flag from the Lamont OAI server. DPC and Lamont will co-develop multiple mechanisms to showcase Reviewed Collection items to users of the Discovery System and other DLESE programs and facilities. These mechanisms will tend to steer users to quality resources, while at the same time providing recognition to the creators of the most effective resources (Note that these strategies should apply to resources that reach the Reviewed Collection via any of several potential pathways, not just via the Community Review System.) The mechanisms which we have planned are:

a "new arrivals bookshelf" containing brief descriptions and links to resources that are newly admitted to the Reviewed Collection.
in the Discovery System, when the return comes back for a resource in the Reviewed Collection, there is a visually-prominent indication, such as an icon or a different-colored background, indicating that this is a Reviewed Collection resource.
The capability to limit searches in the Discovery System to only resources in the Reviewed Collection. Feedback from DLESE Discovery System testers, especially K-12 teachers, is that they would prefer the system to return a few hits on excellent resources rather than many hits.
The capability to use review status as one of the criteria to sort the returns from the Discovery System, such that Reviewed items have a higher likelihood of emerging higher on the hit-list.

(4e) Develop mechanisms to encourage community members to submit reviews: Working with DPC, the Community-Building Collaboratory, and the DLESE Users Standing Committee, we will develop mechanisms to recruit, encourage and reward DLESE members for submitting reviews. Our plans:

The most important step will be the development of peer pressure within the DLESE Community, the expectation that you owe it to the community to pay back with your insights and experience in return for gaining access to this excellent educational resource.
We will investigate "Excellence in Reviewing" awards, like those given by American Geophysical Union journals.
In the Discovery System, when the return comes back for a resource which is currently undergoing the community review process, there will be a visually-prominent indication, such as an icon or a different-colored background, and a link which will take the user to the review page for that resource.
Registered DLESE members would be offered an opportunity to request to be automatically notified when new resources in their area of expertise enter the review system. When a resource matching their interest profile (for example, a certain topic and a certain grade level) comes in, they will receive an email invitation to review that resource, with direct links to the resource and the review page for the resource.

(4f) Develop mechanisms to identify potential reviewers as educators or non-educators: At present, our recommendation engine merely asks politely "Are you an Educator?", and on the basis of the clicked answer directs the commenter down one or the other fork of the recommendation engine. The credibility of the Community Review System would be enhanced if this were a stronger test. DPC expects to be implementing a registration service sometime within the next year (Mike Wright, DPC, personal communication, Feb 2002). This may build on the work of Kate Wittenberg, who is a member of the DLESE Steering Committee and funded by NSDL Core Integration track to work on authentication systems. As the technical, data management, and policy aspects of registration and authentication move forward within DLESE, our project will tap into this infrastructure to identify whether the email address of a person coming to the Community Review System with the intent to submit a full review is indeed that of an educator recognized by the DLESE member database.

Contributions to the NSDL Community and to Digital Library Scholarship

In addition to building the DLESE Collection, our project will contribute to the broader NSDL community in several areas of collection development practice and research.

Collections Assessment: At present, ours is the only NSDL project undertaking a systematic collections assessment (Susan Jesuroga, NSDL Technical Project Liaison, personal communication, Feb 2002). Our techniques for gathering, reporting and analyzing digital collection metrics and user-request metrics will be applicable to many NSDL collections. Based on our experience, we will be able to share methodologies for comparing the actual scope of a digital library with what is desired by the user community, for documenting how that balance is changing over time, and for incorporating collection assessment as an element of evaluation. The strategy of using collections assessment to steer collections development and resource creation is one which we think could be productively incorporated into other elements of NSDL.

Review Methodology: Although several NSDL groups are creating systems for reviewing educational resources, the DLESE Community Review System involves a unique hybrid combining feedback from users via a web-enabled recommendation system with expert reviews mediated by an editor. The quantitative data from the web-based recommendation engine can be aggregated into indices that protect the anonymity of the reviews, but still provide valuable information to potential resource users. Based on feedback at the December 2001 NSDL meeting we anticipate that some NSDL collections which have not yet developed a review system may wish to adapt elements of DLESE's Community Review System.

Enhancing the Diversity of the User Community: The Community Review System incorporates a series of questions designed to reveal resources that have been used successfully in non-traditional and challenging teaching and learning situations, for example, with adult learners, visually-impaired students, place-bound students,urban dwellers with limited contact with Nature, etc. (see figure 1 and http://dlese.ldeo.columbia.edu/review/used_yes2.html). This data collection technique could be used by other NSDL collections, regardless of whether they use any other elements of the Community Review System.

Annotation Services: The DLESE Community Review System will be one of the first providers of information to the NSDL community via an annotation service (Diane Hillman, personal communication, Feb 2002). We expect to share insights about both what kind of information is of interest to library users, and technical insights about how to implement an annotation service in the NSDL context.

Scholarly Contributions: We anticipate that this project will result in at least two scholarly publications in peer-reviewed journals. The first, on collections assessment, will describe how we leveraged the digital nature of our library to capture information about the desired collection and the actual collection. This paper will also reflect on the role of collections assessment in a contributor-based digital library in which the user-community is not pre-defined by the library founders, but rather self-aggregates as it is drawn towards a collection which it finds interesting and useful. The second paper will explore the data from the Community Review System, comparing the editor-mediated pedagogy reviews with the data from the web-mediated recommendation engine, and identifying which of the resource attributes seem to be the most powerful discriminants for resource quality.