Tips and Guide for Data Users
Documentation Files and Format
A Research Connections data collection may include several types of documentation files:
Codebook and Documentation Files
- Codebook: Information on the structure, contents, and layout of a data file. The codebook may also contain information on study design and methodology.
- Dictionary file: Information on the column locations and labels of variables.
- Data map: Similar to a dictionary file.
- Errata file: Errors noted for a particular collection, usually supplied by the principal investigator.
- Frequency file: Frequency of response or descriptive statistics for selected variables in a collection.
- Cross-tabulation file: Cross-tabulations for some or all variables in a collection.
- User Guide: More detailed information about a particular collection, often provided by the principal investigator.
- Manual: Instructions prepared by the principal investigator on some aspect of the data collection.
- Appendices: Additional documentation.
- Reports: Description of findings or results based on analysis of a dataset, prepared by the principal investigator.
- Record layout file: Similar to a dictionary file.
- Tables/Crosstables: Similar to frequency files but presented in tabular format.
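As a rough illustration of how the column locations in a dictionary file, data map, or record layout file are typically used, the sketch below slices one fixed-width record into named variables. The variable names, column positions, and sample record are hypothetical and are not taken from any Research Connections collection; the layout in an actual dictionary file may differ.

```python
# Hypothetical column locations, as a dictionary file might list them.
# Each entry maps a variable name to (start column, end column),
# counted from 1 and inclusive on both ends.
COLUMNS = {
    "CASEID": (1, 4),
    "AGE": (5, 6),
    "REGION": (7, 7),
}


def parse_record(line, columns=COLUMNS):
    """Slice one fixed-width record into a dict of raw string values."""
    return {
        name: line[start - 1:end].strip()
        for name, (start, end) in columns.items()
    }


# Made-up sample record: case 0001, age 34, region 2.
record = parse_record("0001342")
print(record)
```

Values are kept as strings here, since a codebook usually defines how codes such as missing-value markers should be interpreted before any conversion to numbers.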
The standard format for documentation is Portable Document Format (PDF), and we are moving toward compliance with the PDF/A standard. The PDF file format was developed by Adobe Systems Incorporated and can be opened with PDF reader software, such as Adobe Acrobat Reader.