babelon package
Submodules
babelon.babelon_io module
babelon.io.
- babelon.babelon_io.convert_file(input_path, output, drop_unknown_columns=True, output_format=None)
Convert a file from one format to another.
- Parameters:
input_path (str) – The path to the input babelon tsv file.
output (TextIO) – The path to the output file. If none is given, will default to using stdout.
drop_unknown_columns (bool) – If true, columns unknown to Babelon format are dropped prior to processing.
output_format (Optional[str]) – The format to which the SSSOM TSV should be converted.
- Return type:
None
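The effect of drop_unknown_columns can be illustrated with a minimal standalone sketch. It does not call babelon: the KNOWN_COLUMNS set and the read_tsv_dropping_unknown helper are hypothetical names, and the column list is assumed from the Translation fields documented on this page, so the library's real list may differ.

```python
import csv
import io

# Assumption: the "known" columns are the Translation fields listed in
# babelon.dataclasses; the real library may use a different list.
KNOWN_COLUMNS = {
    "subject_id", "predicate_id", "source_value", "source_language",
    "translation_value", "translation_language", "source_version",
    "translation_type", "translator", "translator_expertise",
    "translation_date", "translation_confidence", "translation_precision",
    "translation_status", "source", "comment",
}


def read_tsv_dropping_unknown(handle, drop_unknown_columns=True):
    """Read TSV rows as dicts, optionally dropping unrecognised columns."""
    reader = csv.DictReader(handle, delimiter="\t")
    rows = []
    for row in reader:
        if drop_unknown_columns:
            row = {k: v for k, v in row.items() if k in KNOWN_COLUMNS}
        rows.append(dict(row))
    return rows


# Made-up sample data with one column that is not part of the format.
sample = io.StringIO(
    "subject_id\tsource_value\ttranslation_value\tinternal_note\n"
    "EX:0001\theart\tHerz\tcheck later\n"
)
rows = read_tsv_dropping_unknown(sample)
print(rows)
```

With drop_unknown_columns=False the same helper keeps internal_note in every row.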
- babelon.babelon_io.parse_file(input_path, output_path)
Parse a Babelon metadata file and write to a table.
- Return type:
None
- Args:
input_path (str): The path to the input file in one of the legal formats, e.g. obographs, alignmentapi-xml.
output_path (TextIO): The path to the output file.
- Raises:
ValueError: [description]
- babelon.babelon_io.to_babelon_linkml_document(bdf)
Load a LinkML YAML representation from a BabelonDataFrame.
- babelon.babelon_io.to_json(bdf)
Convert a mapping set dataframe to a JSON object.
- Return type:
JsonObj
- babelon.babelon_io.to_owl_graph(bdf)
Convert a mapping set dataframe to OWL in an RDF graph.
- Return type:
Graph
- babelon.babelon_io.to_rdf_graph(bdf)
Convert a mapping set dataframe to an RDF graph.
- Return type:
Graph
- babelon.babelon_io.write_json(bdf, output, serialisation='json')
Write a mapping set dataframe to the file as JSON.
- Return type:
None
- Args:
bdf (BabelonDataFrame): The mapping set dataframe to write.
output (TextIO): The path or stream of the output.
serialisation (str): The target serialisation (must be ‘json’).
- Raises:
ValueError: [description]
- babelon.babelon_io.write_owl(bdf, output, serialisation='owl')
Write a mapping set dataframe to the file as OWL.
- Return type:
None
- Args:
bdf (BabelonDataFrame): The mapping set dataframe to write.
output (TextIO): The path or stream of the output.
serialisation (str): The target serialisation (defaults to ‘owl’).
- Raises:
ValueError: [description]
babelon.cli module
Command line interface for Babelon.
babelon.constants module
Constants for babelon toolkit.
babelon.dataclasses module
- class babelon.dataclasses.EntityReference(v)
Bases:
Uriorcurie
A reference to a mapped entity. This is represented internally as a string, and as a resource in RDF.
- type_class_curie = 'rdfs:Resource'
- type_class_uri = rdflib.term.URIRef('http://www.w3.org/2000/01/rdf-schema#Resource')
- type_model_uri = rdflib.term.URIRef('https://w3id.org/babelon/EntityReference')
- type_name = 'EntityReference'
- class babelon.dataclasses.Profile(translations=<factory>, translation_provider=None, profile_id=None, profile_version=None, comment=None, **_kwargs)
Bases:
YAMLRoot
Represents a set of translations that together compose a language profile.
- class_class_curie: ClassVar[str] = 'babelon:Profile'
- class_class_uri: ClassVar[URIRef] = rdflib.term.URIRef('https://w3id.org/babelon/Profile')
- class_model_uri: ClassVar[URIRef] = rdflib.term.URIRef('https://w3id.org/babelon/Profile')
- class_name: ClassVar[str] = 'profile'
- comment: Optional[str] = None
- profile_id: Optional[str] = None
- profile_version: Optional[str] = None
- translation_provider: Optional[str] = None
- translations: Union[dict, Translation, List[Union[dict, Translation]], None]
- class babelon.dataclasses.Translation(subject_id=None, predicate_id=None, source_value=None, source_language=None, translation_value=None, translation_language=None, source_version=None, translation_type=None, translator=None, translator_expertise=None, translation_date=None, translation_confidence=None, translation_precision=None, translation_status=None, source=None, comment=None, **_kwargs)
Bases:
YAMLRoot
Represents an individual translation.
- class_class_curie: ClassVar[str] = 'owl:Axiom'
- class_class_uri: ClassVar[URIRef] = rdflib.term.URIRef('http://www.w3.org/2002/07/owl#Axiom')
- class_model_uri: ClassVar[URIRef] = rdflib.term.URIRef('https://w3id.org/babelon/Translation')
- class_name: ClassVar[str] = 'translation'
- comment: Optional[str] = None
- predicate_id: Union[str, EntityReference] = None
- source: Optional[str] = None
- source_language: str = None
- source_value: str = None
- source_version: Optional[str] = None
- subject_id: Union[str, EntityReference] = None
- translation_confidence: Optional[float] = None
- translation_date: Optional[str] = None
- translation_language: Optional[str] = None
- translation_precision: Union[str, TranslationPrecisionEnum, None] = None
- translation_status: Union[str, TranslationStatusEnum, None] = None
- translation_type: Union[str, TranslationTypeEnum, None] = None
- translation_value: Optional[str] = None
- translator: Optional[str] = None
- translator_expertise: Union[str, TranslatorExpertiseEnum, None] = None
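To make the shape of a translation record concrete, here is a minimal sketch using a plain standard-library dataclass. TranslationRecord is a hypothetical stand-in that mirrors a subset of the documented fields; it is not the babelon class, and the sample values are made up.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class TranslationRecord:
    """Hypothetical stand-in mirroring a few documented Translation fields."""

    subject_id: str
    predicate_id: str
    source_value: str
    source_language: str
    translation_value: Optional[str] = None
    translation_language: Optional[str] = None
    translation_status: Optional[str] = None
    translation_confidence: Optional[float] = None


# Made-up example: the label of a term translated from English to German.
record = TranslationRecord(
    subject_id="EX:0001",
    predicate_id="rdfs:label",
    source_value="heart",
    source_language="en",
    translation_value="Herz",
    translation_language="de",
    translation_status="CANDIDATE",
)

# asdict() gives one flat row, similar in spirit to one line of a TSV table.
row = asdict(record)
print(row["translation_value"])
```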
- class babelon.dataclasses.TranslationPrecisionEnum(code)
Bases:
EnumDefinitionImpl
- BROADER = PermissibleValue({ 'text': 'BROADER', 'description': 'The translation value has a somewhat broader meaning than the source value.' })
- CLOSE = PermissibleValue({ 'text': 'CLOSE', 'description': 'The translation value is close in meaning to the source value, but not exact.' })
- EXACT = PermissibleValue({'text': 'EXACT', 'description': 'The translation is exact.'})
- NARROWER = PermissibleValue({ 'text': 'NARROWER', 'description': 'The translation value has a somewhat narrower meaning than the source value.' })
- class babelon.dataclasses.TranslationStatusEnum(code)
Bases:
EnumDefinitionImpl
- CANDIDATE = PermissibleValue({ 'text': 'CANDIDATE', 'description': ('The translation has been suggested from an entity (algorithm, person) ' 'outside the core team managing the translation.') })
- NOT_TRANSLATED = PermissibleValue({'text': 'NOT_TRANSLATED', 'description': 'This translation is incomplete.'})
- OFFICIAL = PermissibleValue({ 'text': 'OFFICIAL', 'description': ('The translation has been accepted by the core team managing the language ' 'profile.') })
- UNDER_REVIEW = PermissibleValue({ 'text': 'UNDER_REVIEW', 'description': ('The translation has been suggested from an entity (algorithm, person) inside ' 'the core team managing the translation, but not yet officially ratified.') })
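As a rough illustration of how the permissible status values constrain input, the following sketch validates a status string with a standard-library Enum. TranslationStatus and parse_status are hypothetical names; the real TranslationStatusEnum is a LinkML EnumDefinitionImpl and may behave differently.

```python
from enum import Enum


class TranslationStatus(Enum):
    """Hypothetical plain-Enum mirror of the documented status values."""

    CANDIDATE = "CANDIDATE"
    NOT_TRANSLATED = "NOT_TRANSLATED"
    OFFICIAL = "OFFICIAL"
    UNDER_REVIEW = "UNDER_REVIEW"


def parse_status(text):
    """Return the matching status, or raise ValueError for unknown text."""
    try:
        return TranslationStatus(text)
    except ValueError:
        raise ValueError(f"Unknown translation_status: {text!r}") from None


print(parse_status("OFFICIAL"))
```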
- class babelon.dataclasses.TranslationTypeEnum(code)
Bases:
EnumDefinitionImpl
- AUGMENTATION = PermissibleValue({ 'text': 'AUGMENTATION', 'description': ('The record corresponds to an additional language specific terminological ' 'element without a corresponding element in the source language.') })
- CORRECTION = PermissibleValue({ 'text': 'CORRECTION', 'description': ('The record corresponds to a translation of a source value into a translation ' 'value, but rather than being an exact translation, it suggests a change to ' 'the original source value.') })
- TRANSLATION = PermissibleValue({ 'text': 'TRANSLATION', 'description': ('The record corresponds to an actual translation of a source value into a ' 'translation value.') })
- class babelon.dataclasses.TranslatorExpertiseEnum(code)
Bases:
EnumDefinitionImpl
- ALGORITHM = PermissibleValue({'text': 'ALGORITHM', 'description': 'The translator is a machine, not a person.'})
- DOMAIN_EXPERT = PermissibleValue({ 'text': 'DOMAIN_EXPERT', 'description': ('The translator is an expert of the domain of the ontology, for example an ' 'expert in anatomy when translating terms from an anatomy ontology such as ' 'Uberon.') })
- DOMAIN_STUDENT = PermissibleValue({ 'text': 'DOMAIN_STUDENT', 'description': ('The translator is a student of the domain of the ontology, for example a ' 'student of anatomy, when translating terms from an anatomy ontology such as ' 'Uberon.') })
- LAYPERSON = PermissibleValue({ 'text': 'LAYPERSON', 'description': ('The translator is an interested lay person with no specific knowledge of the ' 'domain.') })
- PROFESSIONAL_TRANSLATOR = PermissibleValue({ 'text': 'PROFESSIONAL_TRANSLATOR', 'description': 'The translator is a professional translator by trade.' })
- TECHNICAL_SPECIALIST = PermissibleValue({ 'text': 'TECHNICAL_SPECIALIST', 'description': ('The translator is a technical specialist, such as a software engineer, a ' 'bioinformatician or a data scientist.') })
- class babelon.dataclasses.slots
Bases:
object
- comment = Slot(uri=rdflib.term.URIRef('https://w3id.org/babelon/comment'), name='comment', curie='babelon:comment', model_uri=rdflib.term.URIRef('https://w3id.org/babelon/comment'), domain=None, range=typing.Optional[str], mappings=None, pattern=None)
- predicate_id = Slot(uri=rdflib.term.URIRef('http://www.w3.org/2002/07/owl#annotatedProperty'), name='predicate_id', curie='owl:annotatedProperty', model_uri=rdflib.term.URIRef('https://w3id.org/babelon/predicate_id'), domain=None, range=typing.Union[str, babelon.dataclasses.EntityReference, NoneType], mappings=None, pattern=None)
- profile_id = Slot(uri=rdflib.term.URIRef('https://w3id.org/babelon/profile_id'), name='profile_id', curie='babelon:profile_id', model_uri=rdflib.term.URIRef('https://w3id.org/babelon/profile_id'), domain=None, range=typing.Optional[str], mappings=None, pattern=None)
- profile_version = Slot(uri=rdflib.term.URIRef('https://w3id.org/babelon/profile_version'), name='profile_version', curie='babelon:profile_version', model_uri=rdflib.term.URIRef('https://w3id.org/babelon/profile_version'), domain=None, range=typing.Optional[str], mappings=None, pattern=None)
- source = Slot(uri=rdflib.term.URIRef('https://w3id.org/babelon/source'), name='source', curie='babelon:source', model_uri=rdflib.term.URIRef('https://w3id.org/babelon/source'), domain=None, range=typing.Optional[str], mappings=None, pattern=None)
- source_language = Slot(uri=rdflib.term.URIRef('https://w3id.org/babelon/source_language'), name='source_language', curie='babelon:source_language', model_uri=rdflib.term.URIRef('https://w3id.org/babelon/source_language'), domain=None, range=typing.Optional[str], mappings=None, pattern=None)
- source_value = Slot(uri=rdflib.term.URIRef('https://w3id.org/babelon/source_value'), name='source_value', curie='babelon:source_value', model_uri=rdflib.term.URIRef('https://w3id.org/babelon/source_value'), domain=None, range=typing.Optional[str], mappings=None, pattern=None)
- source_version = Slot(uri=rdflib.term.URIRef('https://w3id.org/babelon/source_version'), name='source_version', curie='babelon:source_version', model_uri=rdflib.term.URIRef('https://w3id.org/babelon/source_version'), domain=None, range=typing.Optional[str], mappings=None, pattern=None)
- subject_id = Slot(uri=rdflib.term.URIRef('http://www.w3.org/2002/07/owl#annotatedSource'), name='subject_id', curie='owl:annotatedSource', model_uri=rdflib.term.URIRef('https://w3id.org/babelon/subject_id'), domain=None, range=typing.Union[str, babelon.dataclasses.EntityReference, NoneType], mappings=None, pattern=None)
- translation_confidence = Slot(uri=rdflib.term.URIRef('https://w3id.org/babelon/translation_confidence'), name='translation_confidence', curie='babelon:translation_confidence', model_uri=rdflib.term.URIRef('https://w3id.org/babelon/translation_confidence'), domain=None, range=typing.Optional[float], mappings=None, pattern=None)
- translation_date = Slot(uri=rdflib.term.URIRef('https://w3id.org/babelon/translation_date'), name='translation_date', curie='babelon:translation_date', model_uri=rdflib.term.URIRef('https://w3id.org/babelon/translation_date'), domain=None, range=typing.Optional[str], mappings=None, pattern=None)
- translation_language = Slot(uri=rdflib.term.URIRef('https://w3id.org/babelon/translation_language'), name='translation_language', curie='babelon:translation_language', model_uri=rdflib.term.URIRef('https://w3id.org/babelon/translation_language'), domain=None, range=typing.Optional[str], mappings=None, pattern=None)
- translation_precision = Slot(uri=rdflib.term.URIRef('https://w3id.org/babelon/translation_precision'), name='translation_precision', curie='babelon:translation_precision', model_uri=rdflib.term.URIRef('https://w3id.org/babelon/translation_precision'), domain=None, range=typing.Union[str, ForwardRef('TranslationPrecisionEnum'), NoneType], mappings=None, pattern=None)
- translation_predicate_id = Slot(uri=rdflib.term.URIRef('http://www.w3.org/2002/07/owl#annotatedProperty'), name='translation_predicate_id', curie='owl:annotatedProperty', model_uri=rdflib.term.URIRef('https://w3id.org/babelon/translation_predicate_id'), domain=<class 'babelon.dataclasses.Translation'>, range=typing.Union[str, babelon.dataclasses.EntityReference], mappings=None, pattern=None)
- translation_provider = Slot(uri=rdflib.term.URIRef('https://w3id.org/babelon/translation_provider'), name='translation_provider', curie='babelon:translation_provider', model_uri=rdflib.term.URIRef('https://w3id.org/babelon/translation_provider'), domain=None, range=typing.Optional[str], mappings=None, pattern=None)
- translation_source_language = Slot(uri=rdflib.term.URIRef('https://w3id.org/babelon/source_language'), name='translation_source_language', curie='babelon:source_language', model_uri=rdflib.term.URIRef('https://w3id.org/babelon/translation_source_language'), domain=<class 'babelon.dataclasses.Translation'>, range=<class 'str'>, mappings=None, pattern=None)
- translation_source_value = Slot(uri=rdflib.term.URIRef('https://w3id.org/babelon/source_value'), name='translation_source_value', curie='babelon:source_value', model_uri=rdflib.term.URIRef('https://w3id.org/babelon/translation_source_value'), domain=<class 'babelon.dataclasses.Translation'>, range=<class 'str'>, mappings=None, pattern=None)
- translation_status = Slot(uri=rdflib.term.URIRef('https://w3id.org/babelon/translation_status'), name='translation_status', curie='babelon:translation_status', model_uri=rdflib.term.URIRef('https://w3id.org/babelon/translation_status'), domain=None, range=typing.Union[str, ForwardRef('TranslationStatusEnum'), NoneType], mappings=None, pattern=None)
- translation_subject_id = Slot(uri=rdflib.term.URIRef('http://www.w3.org/2002/07/owl#annotatedSource'), name='translation_subject_id', curie='owl:annotatedSource', model_uri=rdflib.term.URIRef('https://w3id.org/babelon/translation_subject_id'), domain=<class 'babelon.dataclasses.Translation'>, range=typing.Union[str, babelon.dataclasses.EntityReference], mappings=None, pattern=None)
- translation_type = Slot(uri=rdflib.term.URIRef('https://w3id.org/babelon/translation_type'), name='translation_type', curie='babelon:translation_type', model_uri=rdflib.term.URIRef('https://w3id.org/babelon/translation_type'), domain=None, range=typing.Union[str, ForwardRef('TranslationTypeEnum'), NoneType], mappings=None, pattern=None)
- translation_value = Slot(uri=rdflib.term.URIRef('http://www.w3.org/2002/07/owl#annotatedTarget'), name='translation_value', curie='owl:annotatedTarget', model_uri=rdflib.term.URIRef('https://w3id.org/babelon/translation_value'), domain=None, range=typing.Optional[str], mappings=None, pattern=None)
- translations = Slot(uri=rdflib.term.URIRef('https://w3id.org/babelon/translations'), name='translations', curie='babelon:translations', model_uri=rdflib.term.URIRef('https://w3id.org/babelon/translations'), domain=None, range=typing.Union[dict, babelon.dataclasses.Translation, typing.List[typing.Union[dict, babelon.dataclasses.Translation]], NoneType], mappings=None, pattern=None)
- translator = Slot(uri=rdflib.term.URIRef('https://w3id.org/babelon/translator'), name='translator', curie='babelon:translator', model_uri=rdflib.term.URIRef('https://w3id.org/babelon/translator'), domain=None, range=typing.Optional[str], mappings=None, pattern=None)
- translator_expertise = Slot(uri=rdflib.term.URIRef('https://w3id.org/babelon/translator_expertise'), name='translator_expertise', curie='babelon:translator_expertise', model_uri=rdflib.term.URIRef('https://w3id.org/babelon/translator_expertise'), domain=None, range=typing.Union[str, ForwardRef('TranslatorExpertiseEnum'), NoneType], mappings=None, pattern=None)
babelon.translate module
Translate Babelon profiles.
- class babelon.translate.DeepLTranslator
Bases:
Translator
A specific translator class that uses DeepL API for translation.
- model_name()
Return the unique name of the translation model.
- translate(text_to_translate, language_code)
Translate text using DeepL API.
- Args:
text_to_translate (str): The text to be translated.
language_code (str): The target language code (e.g., ‘DE’ for German).
- Returns:
str: The translated text, or an empty string if translation fails.
- class babelon.translate.OpenAITranslator(model='gpt-4o')
Bases:
Translator
A specific translator class that uses GPT-4 for translation.
- model_name()
Return the unique name of the model.
- translate(text_to_translate, language_code)
Translate text using OpenAI’s GPT-4 API (hypothetical).
- Args:
text_to_translate (str): The text to be translated.
language_code (str): The target language code (e.g., ‘de’ for German).
- Returns:
str: The translated text.
- class babelon.translate.Translator
Bases:
object
A generic translator class.
- model_name()
Return the unique name of the model.
- Raises:
NotImplementedError: If the method is not implemented in the subclass
- translate(text, target_language)
Translate the provided text into the target language.
- Args:
text (str): The text to be translated.
target_language (str): The language to translate the text into.
- Raises:
NotImplementedError: If the method is not implemented in the subclass.
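The base-class contract above (both methods raise NotImplementedError until overridden) can be sketched as follows. The Translator class here is a local stand-in that mirrors the documented interface rather than an import from babelon, and GlossaryTranslator is a hypothetical subclass invented for illustration.

```python
class Translator:
    """Local stand-in that mirrors the documented base-class interface."""

    def model_name(self):
        raise NotImplementedError

    def translate(self, text, target_language):
        raise NotImplementedError


class GlossaryTranslator(Translator):
    """Hypothetical subclass that looks translations up in a fixed glossary."""

    def __init__(self, glossary):
        # glossary maps (text, target_language) -> translated text
        self._glossary = glossary

    def model_name(self):
        return "glossary"

    def translate(self, text, target_language):
        # Return "" when no entry exists, similar to the empty-string
        # fallback documented for DeepLTranslator.translate.
        return self._glossary.get((text, target_language), "")


translator = GlossaryTranslator({("heart", "de"): "Herz"})
print(translator.model_name(), translator.translate("heart", "de"))
```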
- babelon.translate.get_translator_model(model='gpt-4')
Instantiate translator model based on string.
- Args:
model (str): The model to be instantiated.
- Raises:
ValueError: If the model does not exist.
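One common way to instantiate an object from a model string is a small registry that raises ValueError for unknown names. The sketch below shows that pattern only; UpperCaseTranslator, _TRANSLATORS and get_translator are hypothetical names, and babelon's own lookup may be implemented differently.

```python
class UpperCaseTranslator:
    """Hypothetical translator used only to make this sketch runnable."""

    def model_name(self):
        return "upper"

    def translate(self, text, target_language):
        return text.upper()


# Assumption: model names map to zero-argument factories.
_TRANSLATORS = {"upper": UpperCaseTranslator}


def get_translator(model):
    """Return a translator instance for the given model name."""
    try:
        factory = _TRANSLATORS[model]
    except KeyError:
        raise ValueError(f"Unknown translator model: {model!r}") from None
    return factory()


print(get_translator("upper").translate("heart", "de"))
```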
- babelon.translate.prepare_translation_for_ontology(ontology, language_code, df_babelon, terms, fields, include_not_translated=False, update_translation_status=True)
Prepare a babelon translation table for an ontology.
- babelon.translate.translate_profile(babelon_df, language_code='en', update_existing=False, model='gpt-4')
Iterate through DataFrame rows and translate values.
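A minimal sketch of the row-by-row idea, assuming that update_existing controls whether rows with an existing translation_value are overwritten. It works on plain dicts instead of a DataFrame, and translate_rows and fake_translate are hypothetical names, so this is not the library's implementation.

```python
def translate_rows(rows, translate, language_code="en", update_existing=False):
    """Fill translation_value for each row dict using the translate callable."""
    for row in rows:
        already_translated = bool(row.get("translation_value"))
        if already_translated and not update_existing:
            continue  # keep the existing translation untouched
        row["translation_value"] = translate(row["source_value"], language_code)
        row["translation_language"] = language_code
    return rows


def fake_translate(text, language_code):
    """Toy translator; a real one would call a translation service."""
    return f"[{language_code}] {text}"


rows = [
    {"source_value": "heart", "translation_value": ""},
    {"source_value": "lung", "translation_value": "Lunge"},
]
translate_rows(rows, fake_translate, language_code="de")
print(rows)
```

Passing update_existing=True would also overwrite the second row.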
babelon.translation_profile module
Translation Profile.
- babelon.translation_profile.statistics_translation_profile(translation_profile)
Takes a babelon profile (TSV) as input and returns some basic stats:
number of translations by source_language, target_language
number of translations by source_language, target_language, predicate_id
number of translations by source_language, target_language, translation_status
- Return type:
None
- Args:
translation_profile (Path): translation profile
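The grouped counts listed above can be sketched with collections.Counter on plain row dicts. This assumes the target language lives in the translation_language column documented for Translation; count_by is a hypothetical helper, and the rows are made-up data.

```python
from collections import Counter


def count_by(rows, *columns):
    """Count rows per unique combination of the given column values."""
    return Counter(tuple(row.get(col, "") for col in columns) for row in rows)


rows = [
    {"source_language": "en", "translation_language": "de", "translation_status": "OFFICIAL"},
    {"source_language": "en", "translation_language": "de", "translation_status": "CANDIDATE"},
    {"source_language": "en", "translation_language": "fr", "translation_status": "OFFICIAL"},
]

by_language = count_by(rows, "source_language", "translation_language")
by_status = count_by(
    rows, "source_language", "translation_language", "translation_status"
)
for key, count in sorted(by_language.items()):
    print(key, count)
```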
- babelon.translation_profile.table_print(title, data)
Print grouped translation data.
- Args:
title (str): Table title.
data (pd.DataFrame): Grouped translation data.
babelon.utils module
Utility methods for babelon processing.
- class babelon.utils.BabelonDataFrame(df, converter=<factory>)
Bases:
object
A collection of mappings represented as a DataFrame, together with additional metadata.
- converter: Converter
- df: DataFrame
- property prefix_map
Get a simple, bijective prefix map.
- classmethod with_converter(converter, df)
Instantiate with a converter instead of a vanilla prefix map.
- babelon.utils.assemble_xliff_file(translation_units)
Assemble an XLIFF file from translation units.
- babelon.utils.assemble_xliff_translation_unit(identifier, id_normalised, label, element, value)
Assemble an XLIFF translation unit.
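For orientation, a translation unit in XLIFF 1.2 is a trans-unit element holding a source and a target. The sketch below builds one with the standard library; make_trans_unit is a hypothetical helper, and the element layout is an assumption about XLIFF 1.2 in general, not a description of what babelon emits.

```python
import xml.etree.ElementTree as ET


def make_trans_unit(unit_id, source_text, target_text=""):
    """Build a minimal XLIFF 1.2 style trans-unit element."""
    unit = ET.Element("trans-unit", {"id": unit_id})
    ET.SubElement(unit, "source").text = source_text
    ET.SubElement(unit, "target").text = target_text
    return unit


# Made-up identifier and texts, for illustration only.
unit = make_trans_unit("EX_0001.label", "heart", "Herz")
xml_text = ET.tostring(unit, encoding="unicode")
print(xml_text)
```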
- babelon.utils.drop_unknown_columns_babelon(df)
Drop columns unknown to the Babelon format from a babelon DataFrame.
- babelon.utils.generate_translation_units(identifier, label, definition, synonyms)
Generate translation units from a Babelon record.
- babelon.utils.get_converter()
Get default SSSOM converter.
- babelon.utils.parse_babelon(input_path, drop_unknown_columns=False)
Parse a babelon TSV file into a BabelonDataFrame.
- babelon.utils.raise_for_bad_path(file_path)
Throw exception if file path is invalid.
- Return type:
None
- Args:
file_path: The file path or URL to be validated.
- Raises:
FileNotFoundError: If the provided file path is not a valid file or URL.
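A check of this kind could look roughly like the sketch below, which accepts http(s) URLs by their shape and otherwise requires an existing file. check_path_or_url is a hypothetical name, and the exact rules babelon applies (for example, which URL schemes count as valid) are assumptions.

```python
from pathlib import Path
from urllib.parse import urlparse


def check_path_or_url(file_path):
    """Raise FileNotFoundError unless file_path is a file or an http(s) URL."""
    parsed = urlparse(str(file_path))
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return  # looks like a URL; this sketch does not check that it resolves
    if not Path(file_path).is_file():
        raise FileNotFoundError(f"{file_path} is not a valid file path or URL.")


check_path_or_url("https://example.org/profile.tsv")  # passes silently

try:
    check_path_or_url("definitely/missing/profile.tsv")
except FileNotFoundError as error:
    print(error)
```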
- babelon.utils.sort_babelon(df)
Sort a babelon Dataframe according to key columns.
Module contents
babelon package.