With participation and support from the international scholarly community, JSTOR has created a high-quality, interdisciplinary archive of scholarship, is actively preserving over one thousand academic journals in both digital and print formats, and continues to greatly expand access to scholarly works and other materials needed for research and teaching globally. We are investing in new initiatives to increase the productivity of researchers and to facilitate new forms of scholarship.

We are pleased to confirm JSTOR's readiness to participate in the 'Digging into Data Initiative'. JSTOR is a scholarly archive of the full runs of approximately 1000 leading academic journals and covers approximately fifty disciplines, with a strong presence in the humanities, social and field sciences, business and economics. A full list of the included journals is available at http://www.jstor.org/action/showJournals?browseType=titleInfoPage .

JSTOR is prepared to provide access at two levels.

1.1 Potential participants in Digging into Data can apply to JSTOR for an account on our “Data for Research” service. This allows users to create datasets of word frequencies per article for any subset of the articles in the JSTOR archive. An open but size-limited service will be generally available from mid-January 2009 at http://dfr.jstor.org. The Digging into Data accounts will remove the limits on the size of the datasets.

1.2 Any accepted participant in the Digging into Data Program can gain access to the full text of the JSTOR collections as XML data in a (slightly extended) NLM format. The dataset will include OAI-ORE resource maps. The full text includes the OCR’d text of the articles and bibliographic metadata.
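
As a rough illustration of the kind of processing involved at both levels, the sketch below reads one article record and derives a word-frequency table from it. It is only a sketch under stated assumptions: the sample record is invented, the element names (article, front, article-meta, title-group, article-title, body, p) are taken from the public NLM journal tag set, and the extended schema actually delivered may differ. The function name word_frequencies is hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Invented sample record. Element names follow the public NLM journal
# tag set; the extended schema of the real corpus is an unknown here.
SAMPLE = """<article>
  <front>
    <article-meta>
      <title-group>
        <article-title>On Sample Data</article-title>
      </title-group>
    </article-meta>
  </front>
  <body>
    <p>The archive holds the text. The text is OCR output.</p>
  </body>
</article>"""


def word_frequencies(xml_text):
    """Return (article title, Counter of lowercased words in the body)."""
    root = ET.fromstring(xml_text)
    # Bibliographic metadata: take the first article title found.
    title = root.findtext(".//article-title", default="")
    # Full text: concatenate every text node under <body>, if present.
    body = root.find("body")
    text = " ".join(body.itertext()) if body is not None else ""
    # Crude tokenizer: runs of ASCII letters only. OCR text will usually
    # need more careful handling (hyphenation, diacritics, noise).
    words = re.findall(r"[a-z]+", text.lower())
    return title, Counter(words)


title, freqs = word_frequencies(SAMPLE)
print(title)
print(freqs.most_common(3))
```

Running the same function over every record in a chosen subset, keyed by article identifier, would give a table of word frequencies per article comparable in spirit to the datasets described in (1.1).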


1. The data referred to in (1.2) above will be a standard corpus and will be distributed “as-is”. JSTOR will not filter, sort or in any way process it to individual participants' requirements. Participants are expected to be competent in the processing of XML data, and JSTOR offers no technical support for filtering or processing the XML. The agreement will be for a limited time, after which the data should be destroyed or returned.

2. Samples of the data in (1.2) and the XML schema will be available on request to potential participants for the purpose of assessing the suitability of the full collection for their proposal.  Final participants should expect to provide media such as USB disk drives for the delivery of the data.

3. All participants will be required to sign a standard license and non-disclosure agreement for the use of the data referred to in (1.2) above.

4. Any potential participant must accept a “click-through” limited use agreement for the service and data mentioned in (1.1) above, and must provide contact details and a bona fide email address at the participating institution.

5. Participants should allow sufficient time for the processing of the agreement covering the data in (1.2). We have found that 6-8 weeks is typical for legal departments of academic institutions to process such agreements.