Subset: Collection

The questions in this section are designed to elicit information that may help researchers and practitioners to create alternative datasets with similar characteristics.

URI: Collection

Identifier and Mapping Information

Schema Source

  • from schema: https://w3id.org/bridge2ai/data-sheets-schema

Classes in subset

Class Description
CollectionConsent Did the individuals in question consent to the collection and use of their da...
CollectionMechanism What mechanisms or procedures were used to collect the data (e.g., ...
CollectionNotification Were the individuals in question notified about the data collection? If so, p...
CollectionTimeframe Over what timeframe was the data collected? Does this timeframe match the cre...
ConsentRevocation If consent was obtained, were the consenting individuals provided with a mech...
DataCollector Who was involved in the data collection process (e.g., ...
DataProtectionImpact Has an analysis of the potential impact of the dataset and its use on data su...
DirectCollection Did you collect the data from the individuals in question directly, or obtain...
EthicalReview Were any ethical review processes conducted (e.g., ...
InstanceAcquisition How was the data associated with each instance acquired? Was the data directl...
SamplingStrategy Does the dataset contain all possible instances or is it a sample (not necess...
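
As an illustration of how the classes in this subset might serve as a checklist, the following Python sketch records free-text answers keyed by class name and reports which questions are still unanswered. The class names come from the table above; the `answers` dictionary layout and the `missing_classes` helper are assumptions made for this example and are not defined by the schema.

```python
# Class names in the Collection subset, as listed in the table above.
COLLECTION_CLASSES = [
    "CollectionConsent",
    "CollectionMechanism",
    "CollectionNotification",
    "CollectionTimeframe",
    "ConsentRevocation",
    "DataCollector",
    "DataProtectionImpact",
    "DirectCollection",
    "EthicalReview",
    "InstanceAcquisition",
    "SamplingStrategy",
]


def missing_classes(answers: dict[str, str]) -> list[str]:
    """Return the subset classes that have no non-empty answer.

    Hypothetical helper: `answers` maps a class name to a free-text answer.
    """
    return [name for name in COLLECTION_CLASSES if not answers.get(name, "").strip()]


# Hypothetical, partially completed datasheet.
answers = {
    "DirectCollection": "Obtained via third-party websites.",
    "CollectionTimeframe": "",  # an empty answer counts as missing
}
print(missing_classes(answers))
```

Treating blank strings as missing is a design choice of this sketch; a real datasheet might instead accept an explicit "not applicable" answer.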

CollectionConsent

Did the individuals in question consent to the collection and use of their data? If so, please describe (or show with screenshots or other information) how consent was requested and provided, and provide a link or other access point to, or otherwise reproduce, the exact language to which the individuals consented.

CollectionMechanism

What mechanisms or procedures were used to collect the data (e.g., hardware apparatuses or sensors, manual human curation, software programs, software APIs)? How were these mechanisms or procedures validated?

CollectionNotification

Were the individuals in question notified about the data collection? If so, please describe (or show with screenshots or other information) how notice was provided, and provide a link or other access point to, or otherwise reproduce, the exact language of the notification itself.

CollectionTimeframe

Over what timeframe was the data collected? Does this timeframe match the creation timeframe of the data associated with the instances (e.g., recent crawl of old news articles)? If not, please describe the timeframe in which the data associated with the instances was created.
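
One way to make the comparison in this question concrete is to record both timeframes as date ranges and check whether the creation period falls inside the collection period. The sketch below is only an assumption about how this could be represented: the `Timeframe` type, the `creation_within_collection` function, and the dates are hypothetical, and "match" is interpreted here as containment.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Timeframe:
    """A closed date range (hypothetical type, not part of the schema)."""
    start: date
    end: date


def creation_within_collection(collection: Timeframe, creation: Timeframe) -> bool:
    """True if the creation period lies entirely inside the collection period."""
    return collection.start <= creation.start and creation.end <= collection.end


# Hypothetical example: a recent crawl of old news articles.
crawl = Timeframe(date(2023, 1, 1), date(2023, 3, 31))
articles = Timeframe(date(1995, 1, 1), date(2005, 12, 31))
print(creation_within_collection(crawl, articles))  # False
```

When the check returns False, as in the crawl example, the question asks for the creation timeframe to be described separately.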

ConsentRevocation

If consent was obtained, were the consenting individuals provided with a mechanism to revoke their consent in the future or for certain uses? If so, please provide a description, as well as a link or other access point to the mechanism (if appropriate).

DataCollector

Who was involved in the data collection process (e.g., students, crowdworkers, contractors) and how were they compensated (e.g., how much were crowdworkers paid)?

DataProtectionImpact

Has an analysis of the potential impact of the dataset and its use on data subjects (e.g., a data protection impact analysis) been conducted? If so, please provide a description of this analysis, including the outcomes, as well as a link or other access point to any supporting documentation.

DirectCollection

Did you collect the data from the individuals in question directly, or obtain it via third parties or other sources (e.g., websites)?

EthicalReview

Were any ethical review processes conducted (e.g., by an institutional review board)? If so, please provide a description of these review processes, including the outcomes, as well as a link or other access point to any supporting documentation.

InstanceAcquisition

How was the data associated with each instance acquired? Was the data directly observable (e.g., raw text, movie ratings), reported by subjects (e.g., survey responses), or indirectly inferred/derived from other data (e.g., part-of-speech tags, model-based guesses for age or language)? If the data was reported by subjects or indirectly inferred/derived from other data, was the data validated/verified?

SamplingStrategy

Does the dataset contain all possible instances or is it a sample (not necessarily random) of instances from a larger set? If the dataset is a sample, then what is the larger set? Is the sample representative of the larger set (e.g., geographic coverage)? If so, please describe how this representativeness was validated/verified. If it is not representative of the larger set, please describe why not (e.g., to cover a more diverse range of instances, because instances were withheld or unavailable).
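
A rough way to examine representativeness along one dimension, such as geographic coverage, is to compare category shares in the sample with those in the larger set. The sketch below is a simple illustration under stated assumptions, not a statistical test: the function names, region labels, and counts are all hypothetical, and both inputs are assumed to be non-empty.

```python
from collections import Counter


def shares(labels):
    """Map each category to its fraction of the labels (labels must be non-empty)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


def max_share_gap(sample, population):
    """Largest absolute difference in category share between sample and larger set."""
    s, p = shares(sample), shares(population)
    return max(abs(s.get(k, 0.0) - p.get(k, 0.0)) for k in set(s) | set(p))


# Hypothetical region labels for the larger set and for a sample drawn from it.
population = ["EU"] * 50 + ["NA"] * 30 + ["AS"] * 20
sample = ["EU"] * 8 + ["NA"] * 2
print(round(max_share_gap(sample, population), 2))  # 0.3
```

A large gap, as here, where one region is overrepresented and another is absent, is the kind of finding the answer to this question would be expected to explain.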