Class: SamplingStrategy
Does the dataset contain all possible instances or is it a sample (not necessarily random) of instances from a larger set? If the dataset is a sample, then what is the larger set? Is the sample representative of the larger set (e.g., geographic coverage)? If so, please describe how this representativeness was validated/verified. If it is not representative of the larger set, please describe why not (e.g., to cover a more diverse range of instances, because instances were withheld or unavailable).
URI: data_sheets_schema:SamplingStrategy
erDiagram
SamplingStrategy {
stringList is_sample
stringList is_random
stringList source_data
stringList is_representative
stringList representative_verification
stringList why_not_representative
stringList strategies
string id
string name
string description
}
Software {
string version
string license
string url
string id
string name
string description
}
SamplingStrategy ||--}o Software : "used_software"
Inheritance
- NamedThing
- DatasetProperty
- SamplingStrategy
- DatasetProperty
Slots
Name | Cardinality and Range | Description | Inheritance |
---|---|---|---|
is_sample | * String |
direct | |
is_random | * String |
direct | |
source_data | * String |
direct | |
is_representative | * String |
direct | |
representative_verification | * String |
direct | |
why_not_representative | * String |
direct | |
strategies | * String |
If the dataset is a sample from a larger set, what was the sampling strategy ... | direct |
used_software | * Software |
What software was used as part of this dataset property? | DatasetProperty |
id | 1 String |
the unique name of the dataset | NamedThing |
name | 0..1 String |
NamedThing | |
description | 0..1 String |
human readable description of the information | NamedThing |
Usages
used by | used in | type | used |
---|---|---|---|
Dataset | sampling_strategies | range | SamplingStrategy |
DataSubset | sampling_strategies | range | SamplingStrategy |
Instance | sampling_strategies | range | SamplingStrategy |
Identifier and Mapping Information
Schema Source
- from schema: https://w3id.org/bridge2ai/data-sheets-schema
Mappings
Mapping Type | Mapped Value |
---|---|
self | data_sheets_schema:SamplingStrategy |
native | data_sheets_schema:SamplingStrategy |
LinkML Source
Direct
name: SamplingStrategy
description: Does the dataset contain all possible instances or is it a sample (not
necessarily random) of instances from a larger set? If the dataset is a sample,
then what is the larger set? Is the sample representative of the larger set (e.g.,
geographic coverage)? If so, please describe how this representativeness was validated/verified.
If it is not representative of the larger set, please describe why not (e.g., to
cover a more diverse range of instances, because instances were withheld or unavailable).
in_subset:
- Composition
- Collection
from_schema: https://w3id.org/bridge2ai/data-sheets-schema
is_a: DatasetProperty
attributes:
is_sample:
name: is_sample
from_schema: https://w3id.org/bridge2ai/data-sheets-schema
rank: 1000
multivalued: true
domain_of:
- SamplingStrategy
range: string
is_random:
name: is_random
from_schema: https://w3id.org/bridge2ai/data-sheets-schema
rank: 1000
multivalued: true
domain_of:
- SamplingStrategy
range: string
source_data:
name: source_data
from_schema: https://w3id.org/bridge2ai/data-sheets-schema
rank: 1000
multivalued: true
domain_of:
- SamplingStrategy
range: string
is_representative:
name: is_representative
from_schema: https://w3id.org/bridge2ai/data-sheets-schema
rank: 1000
multivalued: true
domain_of:
- SamplingStrategy
range: string
representative_verification:
name: representative_verification
from_schema: https://w3id.org/bridge2ai/data-sheets-schema
rank: 1000
multivalued: true
domain_of:
- SamplingStrategy
range: string
why_not_representative:
name: why_not_representative
from_schema: https://w3id.org/bridge2ai/data-sheets-schema
rank: 1000
multivalued: true
domain_of:
- SamplingStrategy
range: string
strategies:
name: strategies
description: If the dataset is a sample from a larger set, what was the sampling
strategy (e.g., deterministic, probabilistic with specific sampling probabilities)?
from_schema: https://w3id.org/bridge2ai/data-sheets-schema
rank: 1000
multivalued: true
domain_of:
- SamplingStrategy
range: string
Induced
name: SamplingStrategy
description: Does the dataset contain all possible instances or is it a sample (not
necessarily random) of instances from a larger set? If the dataset is a sample,
then what is the larger set? Is the sample representative of the larger set (e.g.,
geographic coverage)? If so, please describe how this representativeness was validated/verified.
If it is not representative of the larger set, please describe why not (e.g., to
cover a more diverse range of instances, because instances were withheld or unavailable).
in_subset:
- Composition
- Collection
from_schema: https://w3id.org/bridge2ai/data-sheets-schema
is_a: DatasetProperty
attributes:
is_sample:
name: is_sample
from_schema: https://w3id.org/bridge2ai/data-sheets-schema
rank: 1000
multivalued: true
alias: is_sample
owner: SamplingStrategy
domain_of:
- SamplingStrategy
range: string
is_random:
name: is_random
from_schema: https://w3id.org/bridge2ai/data-sheets-schema
rank: 1000
multivalued: true
alias: is_random
owner: SamplingStrategy
domain_of:
- SamplingStrategy
range: string
source_data:
name: source_data
from_schema: https://w3id.org/bridge2ai/data-sheets-schema
rank: 1000
multivalued: true
alias: source_data
owner: SamplingStrategy
domain_of:
- SamplingStrategy
range: string
is_representative:
name: is_representative
from_schema: https://w3id.org/bridge2ai/data-sheets-schema
rank: 1000
multivalued: true
alias: is_representative
owner: SamplingStrategy
domain_of:
- SamplingStrategy
range: string
representative_verification:
name: representative_verification
from_schema: https://w3id.org/bridge2ai/data-sheets-schema
rank: 1000
multivalued: true
alias: representative_verification
owner: SamplingStrategy
domain_of:
- SamplingStrategy
range: string
why_not_representative:
name: why_not_representative
from_schema: https://w3id.org/bridge2ai/data-sheets-schema
rank: 1000
multivalued: true
alias: why_not_representative
owner: SamplingStrategy
domain_of:
- SamplingStrategy
range: string
strategies:
name: strategies
description: If the dataset is a sample from a larger set, what was the sampling
strategy (e.g., deterministic, probabilistic with specific sampling probabilities)?
from_schema: https://w3id.org/bridge2ai/data-sheets-schema
rank: 1000
multivalued: true
alias: strategies
owner: SamplingStrategy
domain_of:
- SamplingStrategy
range: string
used_software:
name: used_software
description: What software was used as part of this dataset property?
from_schema: https://w3id.org/bridge2ai/data-sheets-schema
rank: 1000
multivalued: true
alias: used_software
owner: SamplingStrategy
domain_of:
- DatasetProperty
range: Software
id:
name: id
description: the unique name of the dataset
from_schema: https://w3id.org/bridge2ai/data-sheets-schema
exact_mappings:
- schema:name
rank: 1000
slot_uri: dcterms:identifier
identifier: true
alias: id
owner: SamplingStrategy
domain_of:
- NamedThing
- Information
range: string
required: true
name:
name: name
from_schema: https://w3id.org/bridge2ai/data-sheets-schema
rank: 1000
slot_uri: schema:name
alias: name
owner: SamplingStrategy
domain_of:
- NamedThing
range: string
description:
name: description
description: human readable description of the information
from_schema: https://w3id.org/bridge2ai/data-sheets-schema
rank: 1000
slot_uri: dcterms:description
alias: description
owner: SamplingStrategy
domain_of:
- NamedThing
- Information
- Relationships
- Splits
- DataAnomaly
- Confidentiality
- Deidentification
- SensitiveElement
- InstanceAcquisition
- CollectionMechanism
- DataCollector
- CollectionTimeframe
- EthicalReview
- DirectCollection
- CollectionNotification
- CollectionConsent
- ConsentRevocation
- DataProtectionImpact
- PreprocessingStrategy
- CleaningStrategy
- LabelingStrategy
- RawData
- ExistingUse
- UseRepository
- OtherTask
- FutureUseImpact
- DiscouragedUse
- ThirdPartySharing
- DistributionFormat
- DistributionDate
- LicenseAndUseTerms
- IPRestrictions
- ExportControlRegulatoryRestrictions
- Maintainer
- Erratum
- UpdatePlan
- RetentionLimits
- VersionAccess
- ExtensionMechanism
range: string