We offer a public platform to help researchers organize, share, and explore their research data.
Instructions for creating and exploring research datasets are available here.
The problem we are trying to address is described in a colorful way here.
Mission
Too often, the results of years of engineering and scientific inquiry are stored on media that become outdated, broken, or misplaced, or are simply not accessible. Even when these data are accessible, they can be difficult to interpret. DataCenterHub seeks to alleviate this problem by providing a simple, standardized, yet flexible platform to preserve and share data. In the future, this platform will offer data visualization tools and the ability to directly compare data from different sources.
We plan to archive each uploaded dataset on Purdue’s institutional repositories (FORTRESS and PURR) to ensure the longevity of the data contributed to DataCenterHub. While funding is available, we shall also maintain a more readily accessible copy of all data on DataCenterHub.
Data Structure
All data on DataCenterHub are uploaded as “datasets”. A dataset is a collection of 1) information about experiments/cases in which objects or sites (physical or virtual) are subjected to excitation (stimuli) and their responses are recorded in 2) “data” and “media” files. The objects and their responses are described through a series of 3) “parameters” and drawings/sketches. Information, files, and parameters from all datasets are organized in a table in which each row corresponds to a single experiment/case. The table can be searched using keywords, author information, etc.
General Information refers to bibliographic information that one can use to find datasets and experiments/cases in the repository.
- Experiment or Case ID is the ID assigned by the source(s) to the experiment/case.
- Source refers to the name(s) of the person(s) who recorded the data.
- Keywords are words describing the dataset and experiment/case. Users can search for specific datasets or experiments/cases in the repository with these keywords.
- Latitude and Longitude are the coordinates or ranges of coordinates of the location where data were recorded. These are not needed for simulations.
- Compiled By refers to the names of the people who compiled the data. Compiling refers to organizing the data into a dataset, as opposed to recording the data during the experiment/case.
- Compiled On is the date when the dataset was compiled (format: YYYY-MM-DD).
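The General Information fields listed above can be pictured as one record per table row. The following is only a minimal sketch in Python, not DataCenterHub's actual schema: the class name, the field names, and the helper functions are hypothetical, and the location is simplified to a single point even though ranges of coordinates are also allowed.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import List, Optional


@dataclass
class GeneralInformation:
    """Hypothetical record for one table row (one experiment/case)."""
    experiment_id: str                 # ID assigned by the source(s)
    source: List[str]                  # person(s) who recorded the data
    keywords: List[str]                # words describing the dataset and case
    compiled_by: List[str]             # people who organized the dataset
    compiled_on: date                  # written as YYYY-MM-DD
    latitude: Optional[float] = None   # not needed for simulations
    longitude: Optional[float] = None  # not needed for simulations


def parse_compiled_on(text: str) -> date:
    """Parse a Compiled On value written as YYYY-MM-DD."""
    return datetime.strptime(text, "%Y-%m-%d").date()


def search_by_keyword(rows: List[GeneralInformation], keyword: str) -> List[GeneralInformation]:
    """Return the rows that carry the given keyword (case-insensitive match)."""
    needle = keyword.strip().lower()
    return [row for row in rows if needle in (k.lower() for k in row.keywords)]


# Made-up example rows, for illustration only.
rows = [
    GeneralInformation(
        experiment_id="C-01",
        source=["A. Researcher"],
        keywords=["column", "cyclic loading"],
        compiled_by=["B. Compiler"],
        compiled_on=parse_compiled_on("2017-09-01"),
        latitude=40.43,
        longitude=-86.91,
    ),
    GeneralInformation(
        experiment_id="SIM-07",
        source=["A. Researcher"],
        keywords=["simulation", "column"],
        compiled_by=["B. Compiler"],
        compiled_on=parse_compiled_on("2018-02-15"),
    ),
]

matches = search_by_keyword(rows, "Simulation")
print([row.experiment_id for row in matches])
```

Leaving the coordinates optional mirrors the note that they are not needed for simulations.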
Files of different types (reports, drawings, measurements, photos, videos, audio, etc.) are generated for each experiment/case. DataCenterHub has been designed to help you organize and preserve these files. They are grouped as follows:
- Report(s) are the documentation related to the experiment/case.
- Drawings/Diagrams are the image files that are needed to interpret the data, including drawings illustrating the test set-up, sensor layout and the specimen or site.
- Data include files (preferably in text format, e.g. *.txt, *.csv) with measurements and observations. We recommend organizing data files in columns, each with a descriptive header (e.g., sensor ID and units). Material sample tests may be stored here.
- Photos, Videos, etc. are the media files, including photos, videos, and audio, generated for the experiment/case.
Parameters consist of quantities and variables that the researcher, compiler, professional or scientific organization chose to describe the experiment/case in quantitative terms. Examples may include dimensions, material properties, temperature, key test results and indices.
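As a rough illustration of the column layout recommended for data files above, the sketch below reads a small text file in which each column has a header naming the sensor and its units. The sensor names, units, and values are made up for the example, and `read_columns` is a hypothetical helper, not part of DataCenterHub.

```python
import csv
import io

# Hypothetical contents of a *.csv data file; all names and values are made up.
SAMPLE = """time (s),LVDT-1 displacement (mm),LC-1 force (kN)
0.0,0.00,0.0
0.1,0.12,1.5
0.2,0.25,3.1
"""


def read_columns(text):
    """Return {header: [values as floats]} for column-organized CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    columns = {name: [] for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            columns[name].append(float(value))
    return columns


columns = read_columns(SAMPLE)
print(list(columns))                   # the descriptive column headers
print(max(columns["LC-1 force (kN)"]))  # peak value in one column
```

Because every header carries both the sensor ID and the units, a reader can interpret each column without consulting a separate key.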
Support
Funding for our research is provided by the National Science Foundation under Grant No. 1724728, CIF21 DIBBs: EI: Creating a Digital Environment for Enabling Data-Driven Science (DEEDS), awarded by the Office of Advanced Cyberinfrastructure (OAC), Directorate for Computer & Information Science & Engineering.
Previous funding for this work was provided by the National Science Foundation under Grant No. 1443027 and by the Center of Earthquake Engineering and Disaster Data (CREEDD) at Purdue University.
Contact Us
Principal Investigator:
Ann Christine Catlin firstname.lastname@example.org
Co-investigators & Senior Personnel
Ashraf Alam email@example.com
Marisol Sepúlveda firstname.lastname@example.org
Joseph Francisco email@example.com
Connie Weaver firstname.lastname@example.org
Chandima Hewa Nadungodage email@example.com
Santiago Pujol firstname.lastname@example.org
Lucas Laughery email@example.com
NSF awards $3.5 million 4-year grant to build powerful web platform for data-driven science.
The DEEDS platform at DataHub will provide research groups with support for:
- collecting, organizing and preserving their data,
- launching and tracking their computations,
- exploring and sharing their workflows and results.