format-version: 1.2 subsetdef: Alliance_of_Genome_Resources "Alliance of Genome Resources Gene Biotype Slim" subsetdef: biosapiens "biosapiens protein feature ontology" subsetdef: DBVAR "database of genomic structural variation" subsetdef: SOFA "SO feature annotation" synonymtypedef: aa1 "amino acid 1 letter code" synonymtypedef: aa3 "amino acid 3 letter code" synonymtypedef: AAMOD "amino acid modification" synonymtypedef: AGR "Alliance of Genome Resources" synonymtypedef: BS "biosapiens" synonymtypedef: dbsnp "dbsnp variant terms" synonymtypedef: dbvar "DBVAR" synonymtypedef: ebi_variants "ensembl variant terms" synonymtypedef: RNAMOD "RNA modification" EXACT synonymtypedef: VAR "variant annotation term" ontology: so/subsets/SOFA [Term] id: SO:0000000 name: Sequence_Ontology namespace: sequence subset: SOFA is_obsolete: true [Term] id: SO:0000001 name: region namespace: sequence def: "A sequence_feature with an extent greater than zero. A nucleotide region is composed of bases and a polypeptide region is composed of amino acids." [SO:ke] subset: SOFA synonym: "sequence" EXACT [] is_a: SO:0000110 ! sequence_feature [Term] id: SO:0000004 name: interior_coding_exon namespace: sequence def: "A coding exon that is not the most 3-prime or the most 5-prime in a given transcript." [] subset: SOFA synonym: "interior coding exon" EXACT [] is_a: SO:0000195 ! coding_exon [Term] id: SO:0000005 name: satellite_DNA namespace: sequence def: "The many tandem repeats (identical or related) of a short basic repeating unit; many have a base composition or other property different from the genome average that allows them to be separated from the bulk (main band) genomic DNA." [http://www.insdc.org/files/feature_table.html] subset: SOFA synonym: "INSDC_feature:repeat_region" BROAD [] synonym: "INSDC_qualifier:satellite" EXACT [] synonym: "satellite DNA" EXACT [] xref: http://en.wikipedia.org/wiki/Satellite_DNA "wiki" is_a: SO:0000705 ! 
tandem_repeat [Term] id: SO:0000006 name: PCR_product namespace: sequence def: "A region amplified by a PCR reaction." [SO:ke] comment: This term is mapped to MGED. This term is now located in OBI, with the following ID OBI_0000406. subset: SOFA synonym: "amplicon" RELATED [] synonym: "PCR product" EXACT [] xref: http://en.wikipedia.org/wiki/RAPD "wiki" is_a: SO:0000695 ! reagent [Term] id: SO:0000007 name: read_pair namespace: sequence def: "One of a pair of sequencing reads in which the two members of the pair are related by originating at either end of a clone insert." [SO:ls] subset: SOFA synonym: "mate pair" EXACT [] synonym: "read-pair" EXACT [] is_a: SO:0000150 ! read relationship: part_of SO:0000149 ! contig relationship: part_of SO:0001790 ! paired_end_fragment [Term] id: SO:0000013 name: scRNA namespace: sequence def: "A small non coding RNA sequence, present in the cytoplasm." [SO:ke] subset: SOFA synonym: "INSDC_feature:ncRNA" BROAD [] synonym: "INSDC_qualifier:scRNA" EXACT [] synonym: "small cytoplasmic RNA" EXACT [] is_a: SO:0000655 ! ncRNA relationship: derives_from SO:0000483 ! nc_primary_transcript [Term] id: SO:0000038 name: match_set namespace: sequence def: "A collection of match parts." [SO:ke] subset: SOFA is_obsolete: true [Term] id: SO:0000039 name: match_part namespace: sequence def: "A part of a match, for example an hsp from blast is a match_part." [SO:ke] subset: SOFA synonym: "match part" EXACT [] is_a: SO:0001410 ! experimental_feature relationship: part_of SO:0000343 ! match [Term] id: SO:0000050 name: gene_part namespace: sequence def: "A part of a gene, that has no other route in the ontology back to region. This concept is necessary for logical inference as these parts must have the properties of region. It also allows us to associate all the parts of genes with a gene." 
[SO:ke] subset: SOFA is_obsolete: true [Term] id: SO:0000057 name: operator namespace: sequence def: "A regulatory element of an operon to which activators or repressors bind thereby effecting translation of genes in that operon." [SO:ma] comment: Moved to transcriptional_cis_regulatory_region (SO:0001055) from gene_group_regulatory_region (SO:0000752) on 11 Feb 2021 when SO:0000752 was merged into SO:0001055. See GitHub Issue #529. subset: SOFA synonym: "operator segment" EXACT [] xref: http://en.wikipedia.org/wiki/Operator_(biology)#Operator "wiki" is_a: SO:0001055 ! transcriptional_cis_regulatory_region [Term] id: SO:0000059 name: nuclease_binding_site namespace: sequence def: "A binding site, of a nucleotide molecule, that interacts selectively and non-covalently with polypeptide residues of a nuclease." [SO:cb] subset: SOFA synonym: "nuclease binding site" EXACT [] is_a: SO:0001654 ! nucleotide_to_protein_binding_site [Term] id: SO:0000101 name: transposable_element namespace: sequence def: "A transposon or insertion sequence. An element that can insert in a variety of DNA sequences." [http://www.sci.sdsu.edu/~smaloy/Glossary/T.html] subset: SOFA synonym: "transposable element" EXACT [] synonym: "transposon" EXACT [] xref: http://en.wikipedia.org/wiki/Transposable_element "wiki" is_a: SO:0001039 ! integrated_mobile_genetic_element [Term] id: SO:0000102 name: expressed_sequence_match namespace: sequence def: "A match to an EST or cDNA sequence." [SO:ke] subset: SOFA synonym: "expressed sequence match" EXACT [] is_a: SO:0000347 ! nucleotide_match [Term] id: SO:0000103 name: clone_insert_end namespace: sequence def: "The end of the clone insert." [SO:ke] subset: SOFA synonym: "clone insert end" EXACT [] is_a: SO:0000699 ! junction relationship: part_of SO:0000753 ! 
clone_insert [Term] id: SO:0000104 name: polypeptide namespace: sequence alt_id: SO:0000358 def: "A sequence of amino acids linked by peptide bonds which may lack appreciable tertiary structure and may not be liable to irreversible denaturation." [SO:ma] comment: This term is mapped to MGED. Do not obsolete without consulting MGED ontology. The term 'protein' was merged with 'polypeptide'. Although 'protein' was a sequence_attribute and therefore meant to describe the quality rather than an actual feature, it was being used erroneously. It is replaced by 'peptidyl' as the polymer attribute. subset: SOFA synonym: "protein" EXACT [] xref: http://en.wikipedia.org/wiki/Polypeptide "wiki" is_a: SO:0001411 ! biological_region relationship: derives_from SO:0000316 ! CDS [Term] id: SO:0000109 name: sequence_variant_obs namespace: sequence def: "A sequence_variant is a non exact copy of a sequence_feature or genome exhibiting one or more sequence_alteration." [SO:ke] subset: SOFA synonym: "mutation" RELATED [] is_obsolete: true [Term] id: SO:0000110 name: sequence_feature namespace: sequence def: "Any extent of continuous biological sequence." [LAMHDI:mb, SO:ke] subset: SOFA synonym: "INSDC_feature:misc_feature" EXACT [] synonym: "INSDC_note:other" EXACT [] synonym: "INSDC_note:sequence_feature" EXACT [] synonym: "located sequence feature" RELATED [] synonym: "located_sequence_feature" EXACT [] synonym: "sequence feature" EXACT [] [Term] id: SO:0000112 name: primer namespace: sequence def: "An oligo to which new deoxyribonucleotides can be added by DNA polymerase." [SO:ke] subset: SOFA synonym: "DNA primer" EXACT [] synonym: "primer oligonucleotide" EXACT [] synonym: "primer polynucleotide" EXACT [] synonym: "primer sequence" EXACT [] xref: http://en.wikipedia.org/wiki/Primer_(molecular_biology) "wiki" is_a: SO:0000441 ! ss_oligo [Term] id: SO:0000113 name: proviral_region namespace: sequence def: "A viral sequence which has integrated into a host genome." 
[SO:ke] subset: SOFA synonym: "proviral region" EXACT [] synonym: "proviral sequence" RELATED [] is_a: SO:0001039 ! integrated_mobile_genetic_element [Term] id: SO:0000114 name: methylated_cytosine namespace: sequence def: "A methylated deoxy-cytosine." [SO:ke] subset: SOFA synonym: "methylated C" EXACT [] synonym: "methylated cytosine" EXACT [] synonym: "methylated cytosine base" EXACT [] synonym: "methylated cytosine residue" EXACT [] synonym: "methylated_C" EXACT [] is_a: SO:0000306 ! methylated_DNA_base_feature [Term] id: SO:0000120 name: protein_coding_primary_transcript namespace: sequence def: "A primary transcript that, at least in part, encodes one or more proteins." [SO:ke] comment: May contain introns. subset: SOFA synonym: "pre mRNA" RELATED [] synonym: "protein coding primary transcript" EXACT [] is_a: SO:0000185 ! primary_transcript [Term] id: SO:0000139 name: ribosome_entry_site namespace: sequence def: "Region in mRNA where ribosome assembles." [SO:ke] subset: SOFA synonym: "INSDC_feature:regulatory" BROAD [] synonym: "INSDC_qualifier:ribosome_binding_site" EXACT [] synonym: "ribosome entry site" EXACT [] is_a: SO:0000836 ! mRNA_region relationship: part_of SO:0000204 ! five_prime_UTR [Term] id: SO:0000140 name: attenuator namespace: sequence def: "A sequence segment located within the five prime end of an mRNA that causes premature termination of translation." [SO:as] subset: SOFA synonym: "attenuator sequence" EXACT [] synonym: "INSDC_feature:regulatory" BROAD [] synonym: "INSDC_qualifier:attenuator" EXACT [] xref: http://en.wikipedia.org/wiki/Attenuator "wiki" is_a: SO:0005836 ! regulatory_region relationship: part_of SO:0000234 ! mRNA [Term] id: SO:0000141 name: terminator namespace: sequence def: "The sequence of DNA located either at the end of the transcript that causes RNA polymerase to terminate transcription." 
[http://www.insdc.org/files/feature_table.html] comment: Moved from transcription_regulatory_region (SO:0001679) to transcriptional_cis_regulatory_region (SO:0001055) by Dave Sant on Feb 11, 2021 when transcription_regulatory_region was merged into transcriptional_cis_regulatory_region to be consistent with GO and reduce redundancy as part of the GREEKC consortium. See GitHub Issue #527. subset: SOFA synonym: "INSDC_feature:regulatory" BROAD [] synonym: "INSDC_qualifier:terminator" EXACT [] synonym: "terminator sequence" EXACT [] xref: http://en.wikipedia.org/wiki/Terminator_(genetics) "wiki" is_a: SO:0001055 ! transcriptional_cis_regulatory_region relationship: part_of SO:0000673 ! transcript [Term] id: SO:0000143 name: assembly_component namespace: sequence def: "A region of known length which may be used to manufacture a longer region." [SO:ke] subset: SOFA synonym: "assembly component" EXACT [] is_a: SO:0001410 ! experimental_feature [Term] id: SO:0000147 name: exon namespace: sequence def: "A region of the transcript sequence within a gene which is not removed from the primary RNA transcript by RNA splicing." [SO:ke] comment: This term is mapped to MGED. Do not obsolete without consulting MGED ontology. subset: SOFA synonym: "INSDC_feature:exon" EXACT [] xref: http://en.wikipedia.org/wiki/Exon "wiki" is_a: SO:0000833 ! transcript_region [Term] id: SO:0000148 name: supercontig namespace: sequence def: "One or more contigs that have been ordered and oriented using end-read information. Contains gaps that are filled with N's." [SO:ls] subset: SOFA synonym: "scaffold" RELATED [] is_a: SO:0000353 ! sequence_assembly relationship: part_of SO:0000719 ! ultracontig [Term] id: SO:0000149 name: contig namespace: sequence def: "A contiguous sequence derived from sequence assembly. Has no gaps, but may contain N's from unavailable bases." [SO:ls] subset: SOFA xref: http://en.wikipedia.org/wiki/Contig "wiki" is_a: SO:0000143 ! assembly_component is_a: SO:0000353 ! 
sequence_assembly relationship: part_of SO:0000148 ! supercontig [Term] id: SO:0000150 name: read namespace: sequence def: "A sequence obtained from a single sequencing experiment. Typically a read is produced when a base calling program interprets information from a chromatogram trace file produced from a sequencing machine." [SO:rd] subset: SOFA is_a: SO:0000143 ! assembly_component relationship: part_of SO:0000149 ! contig [Term] id: SO:0000151 name: clone namespace: sequence def: "A piece of DNA that has been inserted in a vector so that it can be propagated in a host bacterium or some other organism." [SO:ke] subset: SOFA xref: http://en.wikipedia.org/wiki/Clone_(genetics) "wiki" is_a: SO:0000695 ! reagent [Term] id: SO:0000159 name: deletion namespace: sequence alt_id: SO:1000033 def: "The point at which one or more contiguous nucleotides were excised." [SO:ke] subset: SOFA synonym: "deleted_sequence" EXACT [] synonym: "nucleotide deletion" EXACT [] synonym: "nucleotide_deletion" EXACT [] xref: http://en.wikipedia.org/wiki/Nucleotide_deletion "wiki" xref: loinc:LA6692-3 "Deletion" is_a: SO:0001059 ! sequence_alteration is_a: SO:0001411 ! biological_region [Term] id: SO:0000161 name: methylated_adenine namespace: sequence def: "A modified base in which adenine has been methylated." [SO:ke] subset: SOFA synonym: "methylated A" EXACT [] synonym: "methylated adenine" EXACT [] synonym: "methylated adenine base" EXACT [] synonym: "methylated adenine residue" EXACT [] synonym: "methylated_A" EXACT [] is_a: SO:0000306 ! methylated_DNA_base_feature [Term] id: SO:0000162 name: splice_site namespace: sequence def: "Consensus region of primary transcript bordering junction of splicing. A region that overlaps exactly 2 bases and is adjacent_to splice_junction." [SO:cjm, SO:ke] comment: With spliceosomal introns, the splice sites bind the spliceosomal machinery. 
subset: SOFA synonym: "splice site" EXACT [] xref: http://en.wikipedia.org/wiki/Splice_site "wiki" is_a: SO:0000835 ! primary_transcript_region [Term] id: SO:0000163 name: five_prime_cis_splice_site namespace: sequence def: "Intronic 2 bp region bordering the exon, at the 5' edge of the intron. A splice_site that is downstream_adjacent_to exon and starts intron." [http://www.ucl.ac.uk/~ucbhjow/b241/glossary.html, SO:cjm, SO:ke] subset: SOFA synonym: "5' splice site" EXACT [] synonym: "donor" RELATED [] synonym: "donor splice site" EXACT [] synonym: "five prime splice site" EXACT [] synonym: "splice donor site" EXACT [] is_a: SO:0001419 ! cis_splice_site [Term] id: SO:0000164 name: three_prime_cis_splice_site namespace: sequence def: "Intronic 2 bp region bordering the exon, at the 3' edge of the intron. A splice_site that is upstream_adjacent_to exon and finishes intron." [http://www.ucl.ac.uk/~ucbhjow/b241/glossary.html, SO:cjm, SO:ke] subset: SOFA synonym: "3' splice site" RELATED [] synonym: "acceptor" RELATED [] synonym: "acceptor splice site" EXACT [] synonym: "splice acceptor site" EXACT [] synonym: "three prime splice site" EXACT [] is_a: SO:0001419 ! cis_splice_site [Term] id: SO:0000165 name: enhancer namespace: sequence def: "A cis-acting sequence that increases the utilization of (some) eukaryotic promoters, and can function in either orientation and in any location (upstream or downstream) relative to the promoter." [http://www.insdc.org/files/feature_table.html] comment: An enhancer may participate in an enhanceosome GO:0034206. A protein-DNA complex formed by the association of a distinct set of general and specific transcription factors with a region of enhancer DNA. The cooperative assembly of an enhanceosome confers specificity of transcriptional regulation. This comment is a place holder should we start to make cross products with GO. 
subset: SOFA synonym: "INSDC_feature:regulatory" BROAD [] synonym: "INSDC_qualifier:enhancer" EXACT [] xref: http://en.wikipedia.org/wiki/Enhancer_(genetics) "wiki" is_a: SO:0000727 ! cis_regulatory_module [Term] id: SO:0000167 name: promoter namespace: sequence def: "A regulatory_region composed of the TSS(s) and binding sites for TF_complexes of the core transcription machinery. A region (DNA) to which RNA polymerase binds, to begin transcription." [SO:regcreative] comment: This term is mapped to MGED. Do not obsolete without consulting MGED ontology. The region on a DNA molecule involved in RNA polymerase binding to initiate transcription. Moved from is_a: SO:0001055 transcriptional_cis_regulatory_region as per request from GREEKC initiative in August 2020. Merged with RNA_polymerase_promoter (SO:0001203) Aug 2020. Moved up one level from is_a CRM (SO:0000727) to is_a transcriptional_cis_regulatory_region (SO:0001055) as part of the GREEKC work January 2021. Pascale Gaudet from Gene Ontology pointed out that CRM can be located upstream of the promoter and therefore cannot include the promoter. subset: SOFA synonym: "INSDC_feature:regulatory" BROAD [] synonym: "INSDC_qualifier:promoter" EXACT [] synonym: "promoter sequence" EXACT [] xref: http://en.wikipedia.org/wiki/Promoter "wiki" is_a: SO:0000842 ! gene_component_region is_a: SO:0001055 ! transcriptional_cis_regulatory_region [Term] id: SO:0000177 name: cross_genome_match namespace: sequence def: "A nucleotide match against a sequence from another organism." [SO:ma] subset: SOFA synonym: "cross genome match" EXACT [] is_a: SO:0000347 ! nucleotide_match [Term] id: SO:0000178 name: operon namespace: sequence def: "The DNA region of a group of adjacent genes whose transcription is coordinated on one or several mutually overlapping transcription units transcribed in the same direction and sharing at least one gene." [SO:ma] comment: This term is mapped to MGED. Do not obsolete without consulting MGED ontology. 
Definition updated with per Mejia-Almonte et.al Redefining fundamental concepts of transcription initiation in prokaryotes Aug 5 2020. subset: SOFA synonym: "INSDC_feature:operon" EXACT [] xref: http://en.wikipedia.org/wiki/Operon "wiki" is_a: SO:0005855 ! gene_group [Term] id: SO:0000179 name: clone_insert_start namespace: sequence def: "The start of the clone insert." [SO:ke] subset: SOFA synonym: "clone insert start" EXACT [] is_a: SO:0000699 ! junction relationship: part_of SO:0000753 ! clone_insert [Term] id: SO:0000181 name: translated_nucleotide_match namespace: sequence def: "A match against a translated sequence." [SO:ke] subset: SOFA synonym: "translated nucleotide match" EXACT [] is_a: SO:0000347 ! nucleotide_match [Term] id: SO:0000183 name: non_transcribed_region namespace: sequence def: "A region of the gene which is not transcribed." [SO:ke] subset: SOFA synonym: "non transcribed region" EXACT [] synonym: "non-transcribed sequence" EXACT [] synonym: "nontranscribed region" EXACT [] synonym: "nontranscribed sequence" EXACT [] is_a: SO:0000842 ! gene_component_region [Term] id: SO:0000185 name: primary_transcript namespace: sequence def: "A transcript that in its initial state requires modification to be functional." [SO:ma] subset: SOFA synonym: "INSDC_feature:precursor_RNA" EXACT [] synonym: "INSDC_feature:prim_transcript" EXACT [] synonym: "precursor RNA" EXACT [] synonym: "primary transcript" EXACT [] xref: http://en.wikipedia.org/wiki/Primary_transcript "wiki" is_a: SO:0000673 ! transcript [Term] id: SO:0000187 name: repeat_family namespace: sequence def: "A group of characterized repeat sequences." [SO:ke] subset: SOFA is_obsolete: true [Term] id: SO:0000188 name: intron namespace: sequence def: "A region of a primary transcript that is transcribed, but removed from within the transcript by splicing together the sequences (exons) on either side of it." [http://www.insdc.org/files/feature_table.html] comment: This term is mapped to MGED. 
Do not obsolete without consulting MGED ontology. subset: SOFA synonym: "INSDC_feature:intron" EXACT [] xref: http://en.wikipedia.org/wiki/Intron "wiki" is_a: SO:0000835 ! primary_transcript_region [Term] id: SO:0000193 name: RFLP_fragment namespace: sequence def: "A DNA fragment used as a reagent to detect the polymorphic genomic loci by hybridizing against the genomic DNA digested with a given restriction enzyme." [GOC:pj] subset: SOFA synonym: "restriction fragment length polymorphism" EXACT [] synonym: "RFLP" EXACT [] synonym: "RFLP fragment" EXACT [] xref: http://en.wikipedia.org/wiki/Restriction_fragment_length_polymorphism "wiki" is_a: SO:0000412 ! restriction_fragment [Term] id: SO:0000195 name: coding_exon namespace: sequence def: "An exon whereby at least one base is part of a codon (here, 'codon' is inclusive of the stop_codon)." [SO:ke] subset: SOFA synonym: "coding exon" EXACT [] is_a: SO:0000147 ! exon [Term] id: SO:0000196 name: five_prime_coding_exon_coding_region namespace: sequence def: "The sequence of the five_prime_coding_exon that codes for protein." [SO:cjm] subset: SOFA synonym: "five prime exon coding region" EXACT [] is_a: SO:0001215 ! coding_region_of_exon relationship: part_of SO:0000200 ! five_prime_coding_exon [Term] id: SO:0000197 name: three_prime_coding_exon_coding_region namespace: sequence def: "The sequence of the three_prime_coding_exon that codes for protein." [SO:cjm] subset: SOFA synonym: "three prime exon coding region" EXACT [] is_a: SO:0001215 ! coding_region_of_exon relationship: part_of SO:0000195 ! coding_exon [Term] id: SO:0000198 name: noncoding_exon namespace: sequence def: "An exon that does not contain any codons." [SO:ke] subset: SOFA synonym: "noncoding exon" EXACT [] is_a: SO:0000147 ! exon [Term] id: SO:0000200 name: five_prime_coding_exon namespace: sequence def: "The 5' most coding exon." [SO:ke] subset: SOFA synonym: "5' coding exon" EXACT [] synonym: "five prime coding exon" EXACT [] is_a: SO:0000195 ! 
coding_exon [Term] id: SO:0000203 name: UTR namespace: sequence def: "Messenger RNA sequences that are untranslated and lie five prime or three prime to sequences which are translated." [SO:ke] subset: SOFA synonym: "untranslated region" EXACT [] is_a: SO:0000836 ! mRNA_region [Term] id: SO:0000204 name: five_prime_UTR namespace: sequence def: "A region at the 5' end of a mature transcript (preceding the initiation codon) that is not translated into a protein." [http://www.insdc.org/files/feature_table.html] subset: SOFA synonym: "5' UTR" EXACT [] synonym: "five prime UTR" EXACT [] synonym: "five_prime_untranslated_region" EXACT [] synonym: "INSDC_feature:5'UTR" EXACT [] xref: http://en.wikipedia.org/wiki/5'_UTR "wiki" is_a: SO:0000203 ! UTR [Term] id: SO:0000205 name: three_prime_UTR namespace: sequence def: "A region at the 3' end of a mature transcript (following the stop codon) that is not translated into a protein." [http://www.insdc.org/files/feature_table.html] subset: SOFA synonym: "INSDC_feature:3'UTR" EXACT [] synonym: "three prime untranslated region" EXACT [] synonym: "three prime UTR" EXACT [] xref: http://en.wikipedia.org/wiki/Three_prime_untranslated_region "wiki" is_a: SO:0000203 ! UTR [Term] id: SO:0000209 name: rRNA_primary_transcript namespace: sequence def: "A primary transcript encoding a ribosomal RNA." [SO:ke] subset: SOFA synonym: "ribosomal RNA primary transcript" EXACT [] synonym: "rRNA primary transcript" EXACT [] is_a: SO:0000483 ! nc_primary_transcript [Term] id: SO:0000233 name: mature_transcript namespace: sequence def: "A transcript which has undergone the necessary modifications, if any, for its function. In eukaryotes this includes, for example, processing of introns, cleavage, base modification, and modifications to the 5' and/or the 3' ends, other than addition of bases. In bacteria functional mRNAs are usually not modified." [SO:ke] comment: A processed transcript cannot contain introns. 
subset: SOFA synonym: "mature transcript" EXACT [] xref: http://en.wikipedia.org/wiki/Mature_transcript "wiki" is_a: SO:0000673 ! transcript relationship: derives_from SO:0000185 ! primary_transcript [Term] id: SO:0000234 name: mRNA namespace: sequence def: "Messenger RNA is the intermediate molecule between DNA and protein. It includes UTR and coding sequences. It does not contain introns." [SO:ma] comment: An mRNA does not contain introns as it is a processed_transcript. The equivalent kind of primary_transcript is protein_coding_primary_transcript (SO:0000120) which may contain introns. This term is mapped to MGED. Do not obsolete without consulting MGED ontology. subset: SOFA synonym: "INSDC_feature:mRNA" EXACT [] synonym: "messenger RNA" EXACT [] synonym: "protein_coding_transcript" EXACT [] xref: http://en.wikipedia.org/wiki/MRNA "wiki" xref: http://www.gencodegenes.org/gencode_biotypes.html "GENCODE" is_a: SO:0000233 ! mature_transcript [Term] id: SO:0000235 name: TF_binding_site namespace: sequence def: "A DNA site where a transcription factor binds." [SO:ke] comment: Definition updated along with definitions in Mejia-Almonte et.al PMID:32665585. Added relationship part_of SO:0000727 CRM in place of previous CRM relationship has_part TF_binding_site August 2020 in response to requests from GREEKC initiative. Moved from transcription_regulatory_region (SO:0001679) to transcriptional_cis_regulatory_region (SO:0001055) by Dave Sant on Feb 11, 2021 when transcription_regulatory_region was merged into transcriptional_cis_regulatory_region to be consistent with GO and reduce redundancy as part of the GREEKC consortium. See GitHub Issue #527. subset: SOFA synonym: "TF binding site" EXACT [] synonym: "transcription factor binding site" EXACT [] is_a: SO:0001055 ! transcriptional_cis_regulatory_region is_a: SO:0001654 ! nucleotide_to_protein_binding_site relationship: part_of SO:0000727 ! 
cis_regulatory_module [Term] id: SO:0000236 name: ORF namespace: sequence def: "The in-frame interval between the stop codons of a reading frame which when read as sequential triplets, has the potential of encoding a sequential string of amino acids. TER(NNN)nTER." [SGD:rb, SO:ma] comment: The definition was modified by Rama. ORF is defined by the sequence, whereas the CDS is defined according to whether a polypeptide is made. This term is mapped to MGED. Do not obsolete without consulting MGED ontology. subset: SOFA synonym: "open reading frame" EXACT [] is_a: SO:0000717 ! reading_frame [Term] id: SO:0000239 name: flanking_region namespace: sequence def: "The sequences extending on either side of a specific region." [SO:ke] subset: SOFA synonym: "flanking region" EXACT [] is_a: SO:0001412 ! topologically_defined_region [Term] id: SO:0000252 name: rRNA namespace: sequence def: "rRNA is an RNA component of a ribosome that can provide both structural scaffolding and catalytic activity." [http://www.insdc.org/files/feature_table.html, ISBN:0198506732] comment: Definition updated 10 June 2021 as part of restructuring rRNA terms and reforming definitions to have similar structures. Request from EBI. See GitHub Issue #493 subset: SOFA synonym: "INSDC_feature:rRNA" EXACT [] synonym: "INSDC_qualifier:unknown" BROAD [] synonym: "ribosomal ribonucleic acid" EXACT [] synonym: "ribosomal RNA" EXACT [] xref: http://en.wikipedia.org/wiki/RRNA "wiki" is_a: SO:0000655 ! ncRNA relationship: derives_from SO:0000209 ! rRNA_primary_transcript [Term] id: SO:0000253 name: tRNA namespace: sequence def: "Transfer RNA (tRNA) molecules are approximately 80 nucleotides in length. Their secondary structure includes four short double-helical elements and three loops (D, anti-codon, and T loops). Further hydrogen bonds mediate the characteristic L-shaped molecular structure. 
Transfer RNAs have two regions of fundamental functional importance: the anti-codon, which is responsible for specific mRNA codon recognition, and the 3' end, to which the tRNA's corresponding amino acid is attached (by aminoacyl-tRNA synthetases). Transfer RNAs cope with the degeneracy of the genetic code in two manners: having more than one tRNA (with a specific anti-codon) for a particular amino acid; and 'wobble' base-pairing, i.e. permitting non-standard base-pairing at the 3rd anti-codon position." [http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00005, ISBN:0198506732] comment: This term is mapped to MGED. Do not obsolete without consulting MGED ontology. subset: SOFA synonym: "INSDC_feature:tRNA" EXACT [] synonym: "INSDC_qualifier:unknown" BROAD [] synonym: "transfer ribonucleic acid" RELATED [] synonym: "transfer RNA" RELATED [] xref: http://en.wikipedia.org/wiki/TRNA "wiki" is_a: SO:0000655 ! ncRNA relationship: derives_from SO:0000483 ! nc_primary_transcript [Term] id: SO:0000274 name: snRNA namespace: sequence def: "A small nuclear RNA molecule involved in pre-mRNA splicing and processing." [http://www.insdc.org/files/feature_table.html, PMID:11733745, WB:ems] comment: This term is mapped to MGED. Do not obsolete without consulting MGED ontology. subset: SOFA synonym: "INSDC_feature:ncRNA" BROAD [] synonym: "INSDC_qualifier:snRNA" EXACT [] synonym: "small nuclear RNA" EXACT [] xref: http://en.wikipedia.org/wiki/SnRNA "wiki" is_a: SO:0000655 ! ncRNA relationship: derives_from SO:0000483 ! nc_primary_transcript [Term] id: SO:0000275 name: snoRNA namespace: sequence def: "Small nucleolar RNAs (snoRNAs) are short non-coding RNAs enriched in the nucleolus as components of small nucleolar ribonucleoproteins. They guide ribose methylation and pseudouridylation of rRNAs and snRNAs, and a subgroup regulate excision of rRNAs from rRNA precursor transcripts. 
snoRNAs may also guide rRNA acetylation and tRNA methylation, and regulate mRNA abundance and alternative splicing." [GOC:kgc, PMID:31828325] comment: Updated the definition of snoRNA (SO:0000275) from "A snoRNA (small nucleolar RNA) is any one of a class of small RNAs that are associated with the eukaryotic nucleus as components of small nucleolar ribonucleoproteins. They participate in the processing or modifications of many RNAs, mostly ribosomal RNAs (rRNAs) though snoRNAs are also known to target other classes of RNA, including spliceosomal RNAs, tRNAs, and mRNAs via a stretch of sequence that is complementary to a sequence in the targeted RNA." to "Small nucleolar RNAs (snoRNAs) are short non-coding RNAs enriched in the nucleolus as components of small nucleolar ribonucleoproteins. They guide ribose methylation and pseudouridylation of rRNAs and snRNAs, and a subgroup regulate excision of rRNAs from rRNA precursor transcripts. snoRNAs may also guide rRNA acetylation and tRNA methylation, and regulate mRNA abundance and alternative splicing." to acknowledge that some snoRNAs functionally localize to other compartments (cytoplasm or even secreted). See GitHub Issue #578. subset: SOFA synonym: "INSDC_feature:ncRNA" BROAD [] synonym: "INSDC_qualifier:snoRNA" EXACT [] synonym: "small nucleolar RNA" EXACT [] is_a: SO:0000655 ! ncRNA relationship: derives_from SO:0000483 ! nc_primary_transcript [Term] id: SO:0000276 name: miRNA namespace: sequence alt_id: SO:0000649 def: "Small, ~22-nt, RNA molecule that is the endogenous transcript of a miRNA gene (or the product of other non coding RNA genes). Micro RNAs are produced from precursor molecules (SO:0001244) that can form local hairpin structures, which ordinarily are processed (usually via the Dicer pathway) such that a single miRNA molecule accumulates from one arm of a hairpin precursor molecule. Micro RNAs may trigger the cleavage of their target molecules or act as translational repressors." 
[PMID:11081512, PMID:12592000] subset: SOFA synonym: "INSDC_feature:ncRNA" BROAD [] synonym: "INSDC_qualifier:miRNA" EXACT [] synonym: "micro RNA" EXACT [] synonym: "microRNA" EXACT [] synonym: "small temporal RNA" EXACT [] synonym: "stRNA" EXACT [] xref: http://en.wikipedia.org/wiki/MiRNA "wiki" xref: http://en.wikipedia.org/wiki/StRNA "wiki" is_a: SO:0000370 ! small_regulatory_ncRNA relationship: derives_from SO:0000835 ! primary_transcript_region [Term] id: SO:0000289 name: microsatellite namespace: sequence def: "A repeat_region containing repeat_units of 2 to 10 bp repeated in tandem." [http://www.informatics.jax.org/silver/glossary.shtml, NCBI:th] subset: SOFA synonym: "INSDC_feature:repeat_region" BROAD [] synonym: "INSDC_qualifier:microsatellite" EXACT [] synonym: "microsatellite locus" EXACT [] synonym: "microsatellite marker" EXACT [] synonym: "short tandem repeat" EXACT [] synonym: "STR" EXACT [http://www.ncbi.nlm.nih.gov/books/NBK21126/def-item/A9651/] xref: http://en.wikipedia.org/wiki/Microsatellite "wiki" is_a: SO:0000005 ! satellite_DNA [Term] id: SO:0000294 name: inverted_repeat namespace: sequence def: "The sequence is complementarily repeated on the opposite strand. It is a palindrome, and it may, or may not be hyphenated. Examples: GCTGATCAGC, or GCTGA-----TCAGC." [SO:ke] subset: SOFA synonym: "INSDC_feature:repeat_region" BROAD [] synonym: "INSDC_qualifier:inverted" EXACT [] synonym: "inverted repeat" EXACT [] synonym: "inverted repeat sequence" EXACT [] xref: http://en.wikipedia.org/wiki/Inverted_repeat "wiki" is_a: SO:0000657 ! repeat_region [Term] id: SO:0000296 name: origin_of_replication namespace: sequence def: "A region of nucleic acid from which replication initiates; includes sequences that are recognized by replication proteins, the site from which the first separation of complementary strands occurs, and specific replication start sites." 
[http://www.insdc.org/files/feature_table.html, NCBI:cf] subset: SOFA synonym: "INSDC_feature:rep_origin" EXACT [] synonym: "ori" EXACT [] synonym: "origin of replication" EXACT [] xref: http://en.wikipedia.org/wiki/Origin_of_replication "wiki" is_a: SO:0001411 ! biological_region relationship: part_of SO:0001235 ! replicon [Term] id: SO:0000303 name: clip namespace: sequence def: "Part of the primary transcript that is clipped off during processing." [SO:ke] subset: SOFA is_a: SO:0000835 ! primary_transcript_region [Term] id: SO:0000305 name: modified_DNA_base namespace: sequence def: "A modified nucleotide, i.e. a nucleotide other than A, T, C, or G." [http://www.insdc.org/files/feature_table.html] comment: Modified base:. subset: SOFA synonym: "INSDC_feature:modified_base" EXACT [] synonym: "modified base site" EXACT [] is_a: SO:0001236 ! base is_a: SO:0001720 ! epigenetically_modified_region [Term] id: SO:0000306 name: methylated_DNA_base_feature namespace: sequence def: "A nucleotide modified by methylation." [SO:ke] subset: SOFA synonym: "methylated base feature" EXACT [] is_a: SO:0000305 ! modified_DNA_base [Term] id: SO:0000307 name: CpG_island namespace: sequence def: "Regions of a few hundred to a few thousand bases in vertebrate genomes that are relatively GC and CpG rich; they are typically unmethylated and often found near the 5' ends of genes." [SO:rd] subset: SOFA synonym: "CG island" EXACT [] synonym: "CpG island" EXACT [] xref: http://en.wikipedia.org/wiki/CpG_island "wiki" is_a: SO:0001411 ! biological_region [Term] id: SO:0000314 name: direct_repeat namespace: sequence def: "A repeat where the same sequence is repeated in the same direction. Example: GCTGA-followed by-GCTGA." [SO:ke] subset: SOFA synonym: "direct repeat" EXACT [] synonym: "INSDC_feature:repeat_region" BROAD [] synonym: "INSDC_qualifier:direct" EXACT [] xref: http://en.wikipedia.org/wiki/Direct_repeat "wiki" is_a: SO:0000657 ! 
repeat_region [Term] id: SO:0000315 name: TSS namespace: sequence def: "The first base where RNA polymerase begins to synthesize the RNA transcript." [SO:ke] comment: Added relationship is_a SO:0002309 core_promoter_element with the creation of core_promoter_element as part of GREEKC initiative August 2020 - Dave Sant. subset: SOFA synonym: "INSDC_feature:misc_feature" BROAD [] synonym: "INSDC_note:transcription_start_site" EXACT [] synonym: "transcription start site" EXACT [] synonym: "transcription_start_site" EXACT [] is_a: SO:0000714 ! nucleotide_motif is_a: SO:0000835 ! primary_transcript_region relationship: overlaps SO:0000235 ! TF_binding_site relationship: part_of SO:0000167 ! promoter [Term] id: SO:0000316 name: CDS namespace: sequence def: "A contiguous sequence which begins with, and includes, a start codon and ends with, and includes, a stop codon." [SO:ma] subset: SOFA synonym: "coding sequence" EXACT [] synonym: "coding_sequence" EXACT [] synonym: "INSDC_feature:CDS" EXACT [] is_a: SO:0000836 ! mRNA_region [Term] id: SO:0000318 name: start_codon namespace: sequence def: "First codon to be translated by a ribosome." [SO:ke] subset: SOFA synonym: "initiation codon" EXACT [] synonym: "start codon" EXACT [] xref: http://en.wikipedia.org/wiki/Start_codon "wiki" is_a: SO:0000360 ! codon [Term] id: SO:0000319 name: stop_codon namespace: sequence def: "In mRNA, a set of three nucleotides that indicates the end of information for protein synthesis." [SO:ke] subset: SOFA synonym: "stop codon" EXACT [] xref: http://en.wikipedia.org/wiki/Stop_codon "wiki" is_a: SO:0000360 ! codon [Term] id: SO:0000324 name: tag namespace: sequence def: "A nucleotide sequence that may be used to identify a larger sequence." [SO:ke] subset: SOFA is_a: SO:0000696 ! oligo [Term] id: SO:0000325 name: rRNA_large_subunit_primary_transcript namespace: sequence def: "A primary transcript encoding a large ribosomal subunit RNA." 
[SO:ke] subset: SOFA synonym: "35S rRNA primary transcript" EXACT [] synonym: "rRNA large subunit primary transcript" EXACT [] is_a: SO:0000209 ! rRNA_primary_transcript [Term] id: SO:0000326 name: SAGE_tag namespace: sequence def: "A short diagnostic sequence tag, serial analysis of gene expression (SAGE), that allows the quantitative and simultaneous analysis of a large number of transcripts." [http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=7570003&dopt=Abstract] subset: SOFA synonym: "SAGE tag" EXACT [] is_a: SO:0000324 ! tag [Term] id: SO:0000330 name: conserved_region namespace: sequence def: "Region of sequence similarity by descent from a common ancestor." [SO:ke] subset: SOFA synonym: "conserved region" EXACT [] synonym: "INSDC_feature:misc_feature" BROAD [] synonym: "INSDC_note:conserved_region" EXACT [] xref: http://en.wikipedia.org/wiki/Conserved_region "wiki" is_a: SO:0001410 ! experimental_feature [Term] id: SO:0000331 name: STS namespace: sequence def: "Short (typically a few hundred base pairs) DNA sequence that has a single occurrence in a genome and whose location and base sequence are known." [http://www.biospace.com] subset: SOFA synonym: "INSDC_feature:STS" EXACT [] synonym: "sequence tag site" EXACT [] is_a: SO:0000324 ! tag [Term] id: SO:0000332 name: coding_conserved_region namespace: sequence def: "Coding region of sequence similarity by descent from a common ancestor." [SO:ke] subset: SOFA synonym: "coding conserved region" EXACT [] is_a: SO:0000330 ! conserved_region [Term] id: SO:0000333 name: exon_junction namespace: sequence def: "The boundary between two exons in a processed transcript." [SO:ke] subset: SOFA synonym: "exon junction" EXACT [] is_a: SO:0000699 ! junction relationship: part_of SO:0000233 ! mature_transcript [Term] id: SO:0000334 name: nc_conserved_region namespace: sequence def: "Non-coding region of sequence similarity by descent from a common ancestor." 
[SO:ke] subset: SOFA synonym: "conserved non-coding element" EXACT [] synonym: "conserved non-coding sequence" EXACT [] synonym: "nc conserved region" EXACT [] synonym: "noncoding conserved region" EXACT [] is_a: SO:0000330 ! conserved_region [Term] id: SO:0000336 name: pseudogene namespace: sequence def: "A sequence that closely resembles a known functional gene, at another locus within a genome, that is non-functional as a consequence of (usually several) mutations that prevent either its transcription or translation (or both). In general, pseudogenes result from either reverse transcription of a transcript of their \"normal\" paralog (SO:0000043) (in which case the pseudogene typically lacks introns and includes a poly(A) tail) or from recombination (SO:0000044) (in which case the pseudogene is typically a tandem duplication of its \"normal\" paralog)." [http://www.ucl.ac.uk/~ucbhjow/b241/glossary.html] subset: Alliance_of_Genome_Resources subset: SOFA synonym: "INSDC_feature:gene" BROAD [] synonym: "INSDC_qualifier:pseudo" EXACT [] synonym: "INSDC_qualifier:unknown" EXACT [] xref: http://en.wikipedia.org/wiki/Pseudogene "wiki" is_a: SO:0001411 ! biological_region relationship: non_functional_homolog_of SO:0000704 ! gene [Term] id: SO:0000337 name: RNAi_reagent namespace: sequence def: "A double stranded RNA duplex, at least 20bp long, used experimentally to inhibit gene function by RNA interference." [SO:rd] subset: SOFA synonym: "RNAi reagent" EXACT [] is_a: SO:0000442 ! ds_oligo [Term] id: SO:0000340 name: chromosome namespace: sequence def: "Structural unit composed of a nucleic acid molecule which controls its own replication through the interaction of specific proteins at one or more origins of replication." [SO:ma] comment: This term is mapped to MGED. Do not obsolete without consulting MGED ontology. subset: SOFA xref: http://en.wikipedia.org/wiki/Chromosome "wiki" is_a: SO:0001235 ! 
replicon [Term] id: SO:0000341 name: chromosome_band namespace: sequence def: "A cytologically distinguishable feature of a chromosome, often made visible by staining, and usually alternating light and dark." [SO:ma] subset: SOFA synonym: "chromosome band" EXACT [] synonym: "cytoband" EXACT [] synonym: "cytological band" EXACT [] xref: http://en.wikipedia.org/wiki/Cytological_band "wiki" is_a: SO:0000830 ! chromosome_part [Term] id: SO:0000343 name: match namespace: sequence def: "A region of sequence, aligned to another sequence with some statistical significance, using an algorithm such as BLAST or SIM4." [SO:ke] subset: SOFA is_a: SO:0001410 ! experimental_feature [Term] id: SO:0000344 name: splice_enhancer namespace: sequence def: "Region of a transcript that regulates splicing." [SO:ke] subset: SOFA synonym: "splice enhancer" EXACT [] is_a: SO:0001056 ! splicing_regulatory_region [Term] id: SO:0000345 name: EST namespace: sequence def: "A tag produced from a single sequencing read from a cDNA clone or PCR product; typically a few hundred base pairs long." [SO:ke] comment: This term is mapped to MGED. Do not obsolete without consulting MGED ontology. subset: SOFA synonym: "expressed sequence tag" EXACT [] is_a: SO:0000324 ! tag relationship: derives_from SO:0000234 ! mRNA [Term] id: SO:0000347 name: nucleotide_match namespace: sequence def: "A match against a nucleotide sequence." [SO:ke] subset: SOFA synonym: "nucleotide match" EXACT [] is_a: SO:0000343 ! match [Term] id: SO:0000349 name: protein_match namespace: sequence def: "A match against a protein sequence." [SO:ke] subset: SOFA synonym: "protein match" EXACT [] is_a: SO:0000343 ! match [Term] id: SO:0000353 name: sequence_assembly namespace: sequence def: "A sequence of nucleotides that has been algorithmically derived from an alignment of two or more different sequences." 
[SO:ma] subset: SOFA synonym: "sequence assembly" EXACT [] xref: http://en.wikipedia.org/wiki/Sequence_assembly "wiki" is_a: SO:0001248 ! assembly [Term] id: SO:0000360 name: codon namespace: sequence def: "A set of (usually) three nucleotide bases in a DNA or RNA sequence, which together code for a unique amino acid or the termination of translation and are contained within the CDS." [SO:ke] subset: SOFA xref: http://en.wikipedia.org/wiki/Codon "wiki" is_a: SO:0000851 ! CDS_region [Term] id: SO:0000366 name: insertion_site namespace: sequence def: "The junction where an insertion occurred." [SO:ke] subset: SOFA synonym: "insertion site" EXACT [] is_a: SO:0000699 ! junction [Term] id: SO:0000368 name: transposable_element_insertion_site namespace: sequence def: "The junction in a genome where a transposable_element has inserted." [SO:ke] subset: SOFA synonym: "transposable element insertion site" EXACT [] is_a: SO:0000366 ! insertion_site [Term] id: SO:0000370 name: small_regulatory_ncRNA namespace: sequence def: "A non-coding RNA less than 200 nucleotides long, usually with a specific secondary structure, that acts to regulate gene expression. These include short ncRNAs such as piRNA, miRNA and siRNAs (among others)." [PMID:28541282, PomBase:al, SO:ma] subset: SOFA synonym: "small regulatory ncRNA" EXACT [] is_a: SO:0000655 ! ncRNA [Term] id: SO:0000372 name: enzymatic_RNA namespace: sequence def: "An RNA sequence that has catalytic activity with or without an associated ribonucleoprotein." [RSC:cb] comment: This was moved to be a child of transcript (SO:0000673) because some enzymatic RNA regions are part of primary transcripts and some are part of processed transcripts. Moved under ncRNA on 18 Nov 2021. See GitHub Issue #533. subset: SOFA synonym: "enzymatic RNA" EXACT [] is_a: SO:0000655 ! ncRNA [Term] id: SO:0000374 name: ribozyme namespace: sequence def: "An RNA with catalytic activity." 
[SO:ma] subset: SOFA synonym: "INSDC_feature:ncRNA" BROAD [] synonym: "INSDC_qualifier:ribozyme" EXACT [] xref: http://en.wikipedia.org/wiki/Ribozyme "wiki" is_a: SO:0000372 ! enzymatic_RNA [Term] id: SO:0000375 name: cytosolic_5_8S_rRNA namespace: sequence def: "Cytosolic 5.8S rRNA is an RNA component of the large subunit of cytosolic ribosomes in eukaryotes." [https://rfam.xfam.org/family/RF00002] comment: Dave Sant removed '5_8S rRNA is also found in archaea.' from definition due to lack of references mentioning this on 1 Feb 2021. See GitHub Issue #505. Renamed from rRNA_5_8S to cytosolic_5_8S_rRNA on 10 June 2021 with the restructuring of rRNA child terms. Updated definition to be consistent with format of other rRNA definitions. Requested by EBI. See GitHub Issue #493. subset: SOFA synonym: "cytosolic 5.8S LSU rRNA" EXACT [] synonym: "cytosolic 5.8S ribosomal RNA" EXACT [] synonym: "cytosolic 5.8S rRNA" EXACT [] synonym: "cytosolic rRNA 5 8S" EXACT [] xref: http://en.wikipedia.org/wiki/5.8S_ribosomal_RNA "wiki" is_a: SO:0000651 ! cytosolic_LSU_rRNA relationship: derives_from SO:0000704 ! gene [Term] id: SO:0000380 name: hammerhead_ribozyme namespace: sequence def: "A small catalytic RNA motif that catalyzes self-cleavage reaction. Its name comes from its secondary structure which resembles a carpenter's hammer. The hammerhead ribozyme is involved in the replication of some viroid and some satellite RNAs." [PMID:2436805] subset: SOFA synonym: "hammerhead ribozyme" EXACT [] synonym: "INSDC_feature:ncRNA" BROAD [] synonym: "INSDC_qualifier:hammerhead_ribozyme" EXACT [] xref: http://en.wikipedia.org/wiki/Hammerhead_ribozyme "wiki" is_a: SO:0000715 ! RNA_motif [Term] id: SO:0000385 name: RNase_MRP_RNA namespace: sequence def: "The RNA molecule essential for the catalytic activity of RNase MRP, an enzymatically active ribonucleoprotein with two distinct roles in eukaryotes. In mitochondria it plays a direct role in the initiation of mitochondrial DNA replication. 
In the nucleus it is involved in precursor rRNA processing, where it cleaves the internal transcribed spacer 1 between 18S and 5.8S rRNAs." [http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00030] comment: Moved under enzymatic_RNA on 18 Nov 2021. See GitHub Issue #533. subset: SOFA synonym: "INSDC_feature:ncRNA" BROAD [] synonym: "INSDC_qualifier:RNase_MRP_RNA" EXACT [] synonym: "RNase MRP RNA" EXACT [] is_a: SO:0000372 ! enzymatic_RNA [Term] id: SO:0000386 name: RNase_P_RNA namespace: sequence def: "The RNA component of Ribonuclease P (RNase P), a ubiquitous endoribonuclease, found in archaea, bacteria and eukarya as well as chloroplasts and mitochondria. Its best characterized activity is the generation of mature 5 prime ends of tRNAs by cleaving the 5 prime leader elements of precursor-tRNAs. Cellular RNase Ps are ribonucleoproteins. RNA from bacterial RNase Ps retains its catalytic activity in the absence of the protein subunit, i.e. it is a ribozyme. Isolated eukaryotic and archaeal RNase P RNA has not been shown to retain its catalytic function, but is still essential for the catalytic activity of the holoenzyme. Although the archaeal and eukaryotic holoenzymes have a much greater protein content than the bacterial ones, the RNA cores from all the three lineages are homologous. Helices corresponding to P1, P2, P3, P4, and P10/11 are common to all cellular RNase P RNAs. Yet, there is considerable sequence variation, particularly among the eukaryotic RNAs." [http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00010] comment: Moved under enzymatic_RNA on 18 Nov 2021. See GitHub Issue #533. subset: SOFA synonym: "INSDC_feature:ncRNA" BROAD [] synonym: "INSDC_qualifier:RNase_P_RNA" EXACT [] synonym: "RNase P RNA" EXACT [] is_a: SO:0000372 ! enzymatic_RNA [Term] id: SO:0000390 name: telomerase_RNA namespace: sequence def: "The RNA component of telomerase, a reverse transcriptase that synthesizes telomeric DNA." 
[http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00025] subset: SOFA synonym: "INSDC_feature:ncRNA" BROAD [] synonym: "INSDC_qualifier:telomerase_RNA" EXACT [] synonym: "telomerase RNA" EXACT [] xref: http://en.wikipedia.org/wiki/Telomerase_RNA "wiki" is_a: SO:0000655 ! ncRNA [Term] id: SO:0000391 name: U1_snRNA namespace: sequence def: "U1 is a small nuclear RNA (snRNA) component of the spliceosome (involved in pre-mRNA splicing). Its 5' end forms complementary base pairs with the 5' splice junction, thus defining the 5' donor site of an intron. There are significant differences in sequence and secondary structure between metazoan and yeast U1 snRNAs, the latter being much longer (568 nucleotides as compared to 164 nucleotides in human). Nevertheless, secondary structure predictions suggest that all U1 snRNAs share a 'common core' consisting of helices I, II, the proximal region of III, and IV." [http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00003] subset: SOFA synonym: "small nuclear RNA U1" EXACT [RSC:cb] synonym: "snRNA U1" EXACT [RSC:cb] synonym: "U1 small nuclear RNA" EXACT [RSC:cb] synonym: "U1 snRNA" EXACT [] xref: http://en.wikipedia.org/wiki/U1_snRNA "wiki" is_a: SO:0000274 ! snRNA [Term] id: SO:0000392 name: U2_snRNA namespace: sequence def: "U2 is a small nuclear RNA (snRNA) component of the spliceosome (involved in pre-mRNA splicing). Complementary binding between U2 snRNA (in an area lying towards the 5' end but 3' to hairpin I) and the branchpoint sequence (BPS) of the intron results in the bulging out of an unpaired adenine, on the BPS, which initiates a nucleophilic attack at the intronic 5' splice site, thus starting the first of two transesterification reactions that mediate splicing." 
[http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00004] subset: SOFA synonym: "small nuclear RNA U2" EXACT [RSC:CB] synonym: "snRNA U2" EXACT [RSC:CB] synonym: "U2 small nuclear RNA" EXACT [RSC:CB] synonym: "U2 snRNA" EXACT [] xref: http://en.wikipedia.org/wiki/U2_snRNA "wiki" is_a: SO:0000274 ! snRNA [Term] id: SO:0000393 name: U4_snRNA namespace: sequence def: "U4 small nuclear RNA (U4 snRNA) is a component of the major U2-dependent spliceosome. It forms a duplex with U6, and with each splicing round, it is displaced from U6 (and the spliceosome) in an ATP-dependent manner, allowing U6 to refold and create the active site for splicing catalysis. A recycling process involving protein Prp24 re-anneals U4 and U6." [http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00015] subset: SOFA synonym: "small nuclear RNA U4" EXACT [RSC:cb] synonym: "snRNA U4" EXACT [RSC:cb] synonym: "U4 small nuclear RNA" EXACT [RSC:cb] synonym: "U4 snRNA" EXACT [] xref: http://en.wikipedia.org/wiki/U4_snRNA "wiki" is_a: SO:0000274 ! snRNA [Term] id: SO:0000394 name: U4atac_snRNA namespace: sequence def: "An snRNA required for the splicing of the minor U12-dependent class of eukaryotic nuclear introns. It forms a base paired complex with U6atac_snRNA (SO:0000397)." [PMID:12409455] subset: SOFA synonym: "small nuclear RNA U4atac" EXACT [RSC:cb] synonym: "snRNA U4atac" EXACT [RSC:cb] synonym: "U4atac small nuclear RNA" EXACT [RSC:cb] synonym: "U4atac snRNA" EXACT [] is_a: SO:0000274 ! snRNA [Term] id: SO:0000395 name: U5_snRNA namespace: sequence def: "U5 RNA is a component of both types of known spliceosome. The precise function of this molecule is unknown, though it is known that the 5' loop is required for splice site selection and p220 binding, and that both the 3' stem-loop and the Sm site are important for Sm protein binding and cap methylation." 
[http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00020] subset: SOFA synonym: "small nuclear RNA U5" EXACT [RSC:cb] synonym: "snRNA U5" EXACT [RSC:cb] synonym: "U5 small nuclear RNA" EXACT [RSC:cb] synonym: "U5 snRNA" EXACT [] xref: http://en.wikipedia.org/wiki/U5_snRNA "wiki" is_a: SO:0000274 ! snRNA [Term] id: SO:0000396 name: U6_snRNA namespace: sequence def: "U6 snRNA is a component of the spliceosome which is involved in splicing pre-mRNA. The putative secondary structure consensus base pairing is confined to a short 5' stem loop, but U6 snRNA is thought to form extensive base-pair interactions with U4 snRNA." [http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00015] subset: SOFA synonym: "small nuclear RNA U6" EXACT [RSC:cb] synonym: "snRNA U6" EXACT [RSC:cb] synonym: "U6 small nuclear RNA" EXACT [RSC:cb] synonym: "U6 snRNA" EXACT [] xref: http://en.wikipedia.org/wiki/U6_snRNA "wiki" is_a: SO:0000274 ! snRNA [Term] id: SO:0000397 name: U6atac_snRNA namespace: sequence def: "U6atac_snRNA is an snRNA required for the splicing of the minor U12-dependent class of eukaryotic nuclear introns. It forms a base paired complex with U4atac_snRNA (SO:0000394)." [http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=retrieve&db=pubmed&list_uids=12409455&dopt=Abstract] subset: SOFA synonym: "snRNA U6atac" EXACT [RSC:cb] synonym: "U6atac small nuclear RNA" EXACT [RSC:cb] synonym: "U6atac snRNA" EXACT [RSC:cb] is_a: SO:0000274 ! snRNA [Term] id: SO:0000398 name: U11_snRNA namespace: sequence def: "U11 snRNA plays a role in splicing of the minor U12-dependent class of eukaryotic nuclear introns. Similar to U1 snRNA in the major class spliceosome, it base pairs to the conserved 5' splice site sequence." [PMID:9622129] subset: SOFA synonym: "small nuclear RNA U11" EXACT [RSC:cb] synonym: "snRNA U11" EXACT [RSC:cb] synonym: "U11 small nuclear RNA" EXACT [RSC:cb] synonym: "U11 snRNA" EXACT [] xref: http://en.wikipedia.org/wiki/U11_snRNA "wiki" is_a: SO:0000274 ! 
snRNA [Term] id: SO:0000399 name: U12_snRNA namespace: sequence def: "The U12 small nuclear RNA (snRNA), together with U4atac/U6atac, U5, and U11 snRNAs and associated proteins, forms a spliceosome that cleaves a divergent class of low-abundance pre-mRNA introns." [http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00007] subset: SOFA synonym: "small nuclear RNA U12" EXACT [RSC:cb] synonym: "snRNA U12" EXACT [RSC:cb] synonym: "U12 small nuclear RNA" EXACT [RSC:cb] synonym: "U12 snRNA" EXACT [] xref: http://en.wikipedia.org/wiki/U12_snRNA "wiki" is_a: SO:0000274 ! snRNA [Term] id: SO:0000403 name: U14_snoRNA namespace: sequence alt_id: SO:0005839 def: "U14 small nucleolar RNA (U14 snoRNA) is required for early cleavages of eukaryotic precursor rRNAs. In yeasts, this molecule possesses a stem-loop region (known as the Y-domain) which is essential for function. A similar structure, but with a different consensus sequence, is found in plants, but is absent in vertebrates." [http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00016, PMID:2551119] comment: An evolutionarily conserved eukaryotic low molecular weight RNA capable of intermolecular hybridization with both homologous and heterologous 18S rRNA. subset: SOFA synonym: "small nucleolar RNA U14" EXACT [] synonym: "snoRNA U14" EXACT [] synonym: "U14 small nucleolar RNA" EXACT [] synonym: "U14 snoRNA" EXACT [] is_a: SO:0000593 ! C_D_box_snoRNA [Term] id: SO:0000404 name: vault_RNA namespace: sequence def: "A family of RNAs that are found as part of the enigmatic vault ribonucleoprotein complex. The complex consists of a major vault protein (MVP), two minor vault proteins (VPARP and TEP1), and several small untranslated RNA molecules. It has been suggested that the vault complex is involved in drug resistance."
[http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00006] subset: SOFA synonym: "INSDC_feature:ncRNA" BROAD [] synonym: "INSDC_qualifier:vault_RNA" EXACT [] synonym: "vault RNA" EXACT [] xref: http://en.wikipedia.org/wiki/Vault_RNA "wiki" is_a: SO:0000655 ! ncRNA [Term] id: SO:0000405 name: Y_RNA namespace: sequence def: "Y RNAs are components of the Ro ribonucleoprotein particle (Ro RNP), in association with Ro60 and La proteins. The Y RNAs and Ro60 and La proteins are well conserved, but the function of the Ro RNP is not known. In humans the RNA component can be one of four small RNAs: hY1, hY3, hY4 and hY5. These small RNAs are predicted to fold into a conserved secondary structure containing three stem structures. The largest of the four, hY1, contains an additional hairpin." [http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00019] subset: SOFA synonym: "INSDC_feature:ncRNA" BROAD [] synonym: "INSDC_qualifier:Y_RNA" EXACT [] synonym: "Y RNA" EXACT [] xref: http://en.wikipedia.org/wiki/Y_RNA "wiki" is_a: SO:0000655 ! ncRNA [Term] id: SO:0000407 name: cytosolic_18S_rRNA namespace: sequence def: "Cytosolic 18S rRNA is an RNA component of the small subunit of cytosolic ribosomes in eukaryotes." [SO:ke] comment: Renamed to cytosolic_18S_rRNA from rRNA_18S on 10 June 2021 as per restructuring of rRNA child terms. Updated definition to be consistent with format of other rRNA definitions. Request from EBI. See GitHub Issue #493. subset: SOFA synonym: "cytosolic 18S ribosomal RNA" EXACT [] synonym: "cytosolic 18S rRNA" EXACT [] synonym: "cytosolic rRNA 18S" EXACT [] xref: http://en.wikipedia.org/wiki/18S_ribosomal_RNA "wiki" is_a: SO:0000650 ! cytosolic_SSU_rRNA relationship: derives_from SO:0000704 ! gene [Term] id: SO:0000409 name: binding_site namespace: sequence alt_id: BS:00033 def: "A biological_region of sequence that, in the molecule, interacts selectively and non-covalently with other molecules. 
A region on the surface of a molecule that may interact with another molecule. When applied to polypeptides: Amino acids involved in binding or interactions. It can also apply to an amino acid bond which is represented by the positions of the two flanking amino acids." [EBIBS:GAR, SO:ke] comment: See GO:0005488 : binding. subset: biosapiens subset: SOFA synonym: "binding site" EXACT [] synonym: "binding_or_interaction_site" EXACT [] synonym: "INSDC_feature:misc_binding" EXACT [] synonym: "site" RELATED [] xref: http://en.wikipedia.org/wiki/Binding_site "wiki" is_a: SO:0001411 ! biological_region [Term] id: SO:0000410 name: protein_binding_site namespace: sequence def: "A binding site that, in the molecule, interacts selectively and non-covalently with polypeptide molecules." [SO:ke] comment: See GO:0042277 : peptide binding. subset: SOFA synonym: "INSDC_feature:protein_bind" EXACT [] synonym: "protein binding site" EXACT [] is_a: SO:0000409 ! binding_site [Term] id: SO:0000412 name: restriction_fragment namespace: sequence def: "A region of polynucleotide sequence produced by digestion with a restriction endonuclease." [SO:ke] subset: SOFA synonym: "restriction fragment" EXACT [] xref: http://en.wikipedia.org/wiki/Restriction_fragment "wiki" is_a: SO:0000143 ! assembly_component [Term] id: SO:0000413 name: sequence_difference namespace: sequence def: "A region where the sequence differs from that of a specified sequence." [SO:ke] subset: SOFA synonym: "INSDC_feature:misc_difference" EXACT [] synonym: "sequence difference" EXACT [] is_a: SO:0000700 ! remark [Term] id: SO:0000418 name: signal_peptide namespace: sequence alt_id: BS:00159 def: "The signal_peptide is a short region of the peptide located at the N-terminus that directs the protein to be secreted or part of membrane components." 
[http://www.insdc.org/files/feature_table.html] comment: Old def before biosapiens:The sequence for an N-terminal domain of a secreted protein; this domain is involved in attaching nascent polypeptide to the membrane leader sequence. subset: biosapiens subset: SOFA synonym: "INSDC_feature:sig_peptide" EXACT [] synonym: "signal" RELATED [uniprot:feature_type] synonym: "signal peptide" EXACT [] synonym: "signal peptide coding sequence" EXACT [] xref: http://en.wikipedia.org/wiki/Signal_peptide "wiki" is_a: SO:0001527 ! peptide_localization_signal relationship: derives_from SO:0000851 ! CDS_region relationship: part_of SO:0100011 ! cleaved_peptide_region [Term] id: SO:0000419 name: mature_protein_region namespace: sequence alt_id: BS:00149 def: "The polypeptide sequence that remains when the cleaved peptide regions have been cleaved from the immature peptide." [EBIBS:GAR, http://www.insdc.org/files/feature_table.html, SO:cb] comment: This term mature peptide, merged with the biosapiens term mature protein region and took that to be the new name. Old def: The coding sequence for the mature or final peptide or protein product following post-translational modification. subset: biosapiens subset: SOFA synonym: "chain" RELATED [uniprot:feature_type] synonym: "INSDC_feature:mat_peptide" EXACT [] synonym: "mature peptide" RELATED [] synonym: "mature protein region" EXACT [] is_a: SO:0000839 ! polypeptide_region relationship: derives_from SO:0000851 ! CDS_region relationship: part_of SO:0001063 ! immature_peptide_region [Term] id: SO:0000436 name: ARS namespace: sequence def: "A sequence that can autonomously replicate, as a plasmid, when transformed into a bacterial host." [SO:ma] subset: SOFA synonym: "autonomously replicating sequence" EXACT [] is_a: SO:0000296 ! origin_of_replication [Term] id: SO:0000441 name: ss_oligo namespace: sequence def: "A single stranded oligonucleotide." [SO:ke] comment: This term is mapped to MGED. 
Do not obsolete without consulting MGED ontology. subset: SOFA synonym: "single strand oligo" EXACT [] synonym: "single strand oligonucleotide" EXACT [] synonym: "single stranded oligonucleotide" EXACT [] synonym: "ss oligo" EXACT [] synonym: "ss oligonucleotide" EXACT [] is_a: SO:0000696 ! oligo [Term] id: SO:0000442 name: ds_oligo namespace: sequence def: "A double stranded oligonucleotide." [SO:ke] comment: This term is mapped to MGED. Do not obsolete without consulting MGED ontology. subset: SOFA synonym: "double stranded oligonucleotide" EXACT [] synonym: "ds oligo" EXACT [] synonym: "ds-oligonucleotide" EXACT [] is_a: SO:0000696 ! oligo [Term] id: SO:0000454 name: rasiRNA namespace: sequence def: "A 17-28-nt, small interfering RNA derived from transcripts of repetitive elements." [http://www.developmentalcell.com/content/article/abstract?uid=PIIS1534580703002284, PMID:18032451] comment: Changed parent term from ncRNA (SO:0000655) to piRNA (SO:0001035). See GitHub Issue #573. subset: SOFA synonym: "INSDC_feature:ncRNA" BROAD [] synonym: "INSDC_qualifier:rasiRNA" EXACT [] synonym: "repeat associated small interfering RNA" EXACT [] is_a: SO:0000370 ! small_regulatory_ncRNA [Term] id: SO:0000462 name: pseudogenic_region namespace: sequence def: "A non-functional descendant of a functional entity." [SO:cjm] subset: SOFA synonym: "pseudogenic region" EXACT [] is_a: SO:0001411 ! biological_region [Term] id: SO:0000464 name: decayed_exon namespace: sequence def: "A non-functional descendant of an exon." [SO:ke] comment: Does not have to be part of a pseudogene. subset: SOFA synonym: "decayed exon" EXACT [] is_a: SO:0000462 ! pseudogenic_region relationship: non_functional_homolog_of SO:0000147 ! exon [Term] id: SO:0000468 name: golden_path_fragment namespace: sequence def: "One of the pieces of sequence that make up a golden path." [SO:rd] subset: SOFA synonym: "golden path fragment" EXACT [] is_a: SO:0000143 ! assembly_component relationship: part_of SO:0000688 ! 
golden_path [Term] id: SO:0000472 name: tiling_path namespace: sequence def: "A set of regions which overlap with minimal polymorphism to form a linear sequence." [SO:cjm] subset: SOFA synonym: "tiling path" EXACT [] is_a: SO:0000353 ! sequence_assembly [Term] id: SO:0000474 name: tiling_path_fragment namespace: sequence def: "A piece of sequence that makes up a tiling_path (SO:0000472)." [SO:ke] subset: SOFA synonym: "tiling path fragment" EXACT [] is_a: SO:0000143 ! assembly_component relationship: part_of SO:0000472 ! tiling_path [Term] id: SO:0000483 name: nc_primary_transcript namespace: sequence def: "A primary transcript that is never translated into a protein." [SO:ke] subset: SOFA synonym: "nc primary transcript" EXACT [] synonym: "noncoding primary transcript" EXACT [] is_a: SO:0000185 ! primary_transcript [Term] id: SO:0000484 name: three_prime_coding_exon_noncoding_region namespace: sequence def: "The sequence of the 3' exon that is not coding." [SO:ke] subset: SOFA synonym: "three prime coding exon noncoding region" EXACT [] synonym: "three_prime_exon_noncoding_region" EXACT [] is_a: SO:0001214 ! noncoding_region_of_exon relationship: part_of SO:0000195 ! coding_exon [Term] id: SO:0000486 name: five_prime_coding_exon_noncoding_region namespace: sequence def: "The sequence of the 5' exon preceding the start codon." [SO:ke] subset: SOFA synonym: "five prime coding exon noncoding region" EXACT [] synonym: "five_prime_exon_noncoding_region" EXACT [] is_a: SO:0001214 ! noncoding_region_of_exon relationship: part_of SO:0000200 ! five_prime_coding_exon [Term] id: SO:0000499 name: virtual_sequence namespace: sequence def: "A continuous piece of sequence similar to the 'virtual contig' concept of the Ensembl database." [SO:ke] subset: SOFA synonym: "virtual sequence" EXACT [] is_a: SO:0000353 ! sequence_assembly [Term] id: SO:0000502 name: transcribed_region namespace: sequence def: "A region of sequence that is transcribed. 
This region may cover the transcript of a gene, it may encompass the sequence covered by all of the transcripts of an alternatively spliced gene, or it may cover the region transcribed by a polycistronic transcript. A gene may have 1 or more transcribed regions and a transcribed_region may belong to one or more genes." [SO:ke] comment: This concept came about as a direct result of the SO meeting August 2004. The exact nature of the relationship between transcribed_region and gene is still up for discussion. We are going with 'associated_with' for the time being. subset: SOFA is_obsolete: true [Term] id: SO:0000551 name: polyA_signal_sequence namespace: sequence def: "The recognition sequence necessary for endonuclease cleavage of an RNA transcript that is followed by polyadenylation; consensus=AATAAA." [http://www.insdc.org/files/feature_table.html] comment: Moved from transcription_regulatory_region (SO:0001679) to transcriptional_cis_regulatory_region (SO:0001055) by Dave Sant on Feb 11, 2021 when transcription_regulatory_region was merged into transcriptional_cis_regulatory_region to be consistent with GO and reduce redundancy as part of the GREEKC consortium. See GitHub Issue #527. subset: SOFA synonym: "INSDC_feature:regulatory" BROAD [] synonym: "INSDC_qualifier:polyA_signal_sequence" EXACT [] synonym: "poly(A) signal" EXACT [] synonym: "polyA signal sequence" EXACT [] synonym: "polyadenylation termination signal" EXACT [] is_a: SO:0001055 ! transcriptional_cis_regulatory_region [Term] id: SO:0000553 name: polyA_site namespace: sequence alt_id: SO:0001430 def: "The site on an RNA transcript to which will be added adenine residues by post-transcriptional polyadenylation. The boundary between the UTR and the polyA sequence."
[http://www.insdc.org/files/feature_table.html] subset: SOFA synonym: "INSDC_feature:polyA_site" EXACT [] synonym: "polyA cleavage site" EXACT [] synonym: "polyA junction" EXACT [] synonym: "polyA site" EXACT [] synonym: "polyA_junction" EXACT [] synonym: "polyadenylation site" RELATED [] is_a: SO:0000699 ! junction relationship: part_of SO:0000205 ! three_prime_UTR relationship: part_of SO:0000233 ! mature_transcript [Term] id: SO:0000577 name: centromere namespace: sequence def: "A region of chromosome where the spindle fibers attach during mitosis and meiosis." [SO:ke] subset: SOFA synonym: "INSDC_feature:centromere" EXACT [] xref: http://en.wikipedia.org/wiki/Centromere "wiki" is_a: SO:0000628 ! chromosomal_structural_element [Term] id: SO:0000581 name: cap namespace: sequence def: "A structure consisting of a 7-methylguanosine in 5'-5' triphosphate linkage with the first nucleotide of an mRNA. It is added post-transcriptionally, and is not encoded in the DNA." [http://seqcore.brcf.med.umich.edu/doc/educ/dnapr/mbglossary/mbgloss.html] subset: SOFA xref: http://en.wikipedia.org/wiki/5%27_cap "wiki" is_a: SO:0001411 ! biological_region [Term] id: SO:0000587 name: group_I_intron namespace: sequence def: "Group I catalytic introns are large self-splicing ribozymes. They catalyze their own excision from mRNA, tRNA and rRNA precursors in a wide range of organisms. The core secondary structure consists of 9 paired regions (P1-P9). These fold to essentially two domains, the P4-P6 domain (formed from the stacking of P5, P4, P6 and P6a helices) and the P3-P9 domain (formed from the P8, P3, P7 and P9 helices). Group I catalytic introns often have long ORFs inserted in loop regions." [http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00028] comment: GO:0000372. subset: SOFA synonym: "group I intron" EXACT [] xref: http://en.wikipedia.org/wiki/Group_I_intron "wiki" is_a: SO:0000588 ! 
autocatalytically_spliced_intron [Term] id: SO:0000588 name: autocatalytically_spliced_intron namespace: sequence def: "A self spliced intron." [SO:ke] subset: SOFA synonym: "autocatalytically spliced intron" EXACT [] synonym: "INSDC_feature:ncRNA" BROAD [] synonym: "INSDC_qualifier:autocatalytically_spliced_intron" EXACT [] is_a: SO:0000188 ! intron [Term] id: SO:0000590 name: SRP_RNA namespace: sequence def: "The signal recognition particle (SRP) is a universally conserved ribonucleoprotein. It is involved in the co-translational targeting of proteins to membranes. The eukaryotic SRP consists of a 300-nucleotide 7S RNA and six proteins: SRPs 72, 68, 54, 19, 14, and 9. Archaeal SRP consists of a 7S RNA and homologues of the eukaryotic SRP19 and SRP54 proteins. In most eubacteria, the SRP consists of a 4.5S RNA and the Ffh protein (a homologue of the eukaryotic SRP54 protein). Eukaryotic and archaeal 7S RNAs have very similar secondary structures, with eight helical elements. These fold into the Alu and S domains, separated by a long linker region. Eubacterial SRP is generally a simpler structure, with the M domain of Ffh bound to a region of the 4.5S RNA that corresponds to helix 8 of the eukaryotic and archaeal SRP S domain. Some Gram-positive bacteria (e.g. Bacillus subtilis), however, have a larger SRP RNA that also has an Alu domain. The Alu domain is thought to mediate the peptide chain elongation retardation function of the SRP. The universally conserved helix which interacts with the SRP54/Ffh M domain mediates signal sequence recognition. In eukaryotes and archaea, the SRP19-helix 6 complex is thought to be involved in SRP assembly and stabilizes helix 8 for SRP54 binding." [http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00017] subset: SOFA synonym: "7S RNA" RELATED [] synonym: "INSDC_feature:ncRNA" BROAD [] synonym: "INSDC_qualifier:SRP_RNA" EXACT [] synonym: "signal recognition particle RNA" RELATED [] synonym: "SRP RNA" EXACT [] is_a: SO:0000655 ! 
ncRNA relationship: derives_from SO:0000483 ! nc_primary_transcript [Term] id: SO:0000593 name: C_D_box_snoRNA namespace: sequence def: "Most box C/D snoRNAs also contain long (>10 nt) sequences complementary to rRNA. Boxes C and D, as well as boxes C' and D', are usually located in close proximity, and form a structure known as the box C/D motif. This motif is important for snoRNA stability, processing, nucleolar targeting and function. A small number of box C/D snoRNAs are involved in rRNA processing; most, however, are known or predicted to serve as guide RNAs in ribose methylation of rRNA. Targeting involves direct base pairing of the snoRNA at the rRNA site to be modified and selection of a rRNA nucleotide a fixed distance from box D or D'." [http://www.bio.umass.edu/biochem/rna-sequence/Yeast_snoRNA_Database/snoRNA_DataBase.html] comment: Added 'SNORD' as a synonym of C_D_box_snoRNA (SO:0000593) and 'SNORA' as a synonym of H_ACA_box_snoRNA (SO:0000594). See GitHub Issue #577. subset: SOFA synonym: "box C/D snoRNA" EXACT [] synonym: "C D box snoRNA" EXACT [] synonym: "C/D box snoRNA" EXACT [] synonym: "SNORD" EXACT [PMID:31828325] is_a: SO:0000275 ! snoRNA [Term] id: SO:0000602 name: guide_RNA namespace: sequence def: "A short 3'-uridylated RNA that can form a duplex (except for its post-transcriptionally added oligo_U tail (SO:0000609)) with a stretch of mature edited mRNA." [http://www.rna.ucla.edu/index.html] subset: SOFA synonym: "gRNA" EXACT [] synonym: "guide RNA" EXACT [] synonym: "INSDC_feature:ncRNA" BROAD [] synonym: "INSDC_qualifier:guide_RNA" EXACT [] xref: http://en.wikipedia.org/wiki/Guide_RNA "wiki" is_a: SO:0000655 ! ncRNA [Term] id: SO:0000603 name: group_II_intron namespace: sequence def: "Group II introns are found in rRNA, tRNA and mRNA of organelles in fungi, plants and protists, and also in mRNA in bacteria. They are large self-splicing ribozymes and have 6 structural domains (usually designated dI to dVI). 
A subset of group II introns also encode essential splicing proteins in intronic ORFs. The length of these introns can therefore be up to 3kb. Splicing occurs in almost identical fashion to nuclear pre-mRNA splicing with two transesterification steps. The 2' hydroxyl of a bulged adenosine in domain VI attacks the 5' splice site, followed by nucleophilic attack on the 3' splice site by the 3' OH of the upstream exon. Protein machinery is required for splicing in vivo, and long range intron to intron and intron-exon interactions are important for splice site positioning. Group II introns are further sub-classified into groups IIA and IIB which differ in splice site consensus, distance of bulged A from 3' splice site, some tertiary interactions, and intronic ORF phylogeny." [http://www.sanger.ac.uk/Software/Rfam/browse/index.shtml] comment: GO:0000373. subset: SOFA synonym: "group II intron" EXACT [] xref: http://en.wikipedia.org/wiki/Group_II_intron "wiki" is_a: SO:0000588 ! autocatalytically_spliced_intron [Term] id: SO:0000605 name: intergenic_region namespace: sequence def: "A region containing or overlapping no genes that is bounded on either side by a gene, or bounded by a gene and the end of the chromosome." [SO:cjm] comment: This term is mapped to MGED. Do not obsolete without consulting MGED ontology. subset: SOFA synonym: "intergenic region" EXACT [] xref: http://en.wikipedia.org/wiki/Intergenic_region "wiki" is_a: SO:0001411 ! biological_region [Term] id: SO:0000610 name: polyA_sequence namespace: sequence def: "Sequence of about 100 nucleotides of A added to the 3' end of most eukaryotic mRNAs." [SO:ke] subset: SOFA synonym: "polyA sequence" EXACT [] is_a: SO:0001411 ! biological_region relationship: adjacent_to SO:0000234 ! mRNA [Term] id: SO:0000611 name: branch_site namespace: sequence def: "A pyrimidine rich sequence near the 3' end of an intron to which the 5'end becomes covalently bound during nuclear splicing. 
The resulting structure resembles a lariat." [SO:ke] subset: SOFA synonym: "branch point" EXACT [] synonym: "branch site" EXACT [] synonym: "branch_point" EXACT [] is_a: SO:0000841 ! spliceosomal_intron_region [Term] id: SO:0000612 name: polypyrimidine_tract namespace: sequence def: "The polypyrimidine tract is one of the cis-acting sequence elements directing intron removal in pre-mRNA splicing." [http://nar.oupjournals.org/cgi/content/full/25/4/888] subset: SOFA synonym: "polypyrimidine tract" EXACT [] xref: http://en.wikipedia.org/wiki/Polypyrimidine_tract "wiki" is_a: SO:0000841 ! spliceosomal_intron_region [Term] id: SO:0000616 name: transcription_end_site namespace: sequence def: "The base where transcription ends." [SO:ke] subset: SOFA synonym: "transcription end site" EXACT [] is_a: SO:0000835 ! primary_transcript_region [Term] id: SO:0000624 name: telomere namespace: sequence def: "A specific structure at the end of a linear chromosome, required for the integrity and maintenance of the end." [SO:ma] subset: SOFA synonym: "INSDC_feature:telomere" EXACT [] synonym: "telomeric DNA" EXACT [] synonym: "telomeric sequence" EXACT [] xref: http://en.wikipedia.org/wiki/Telomere "wiki" is_a: SO:0000628 ! chromosomal_structural_element [Term] id: SO:0000625 name: silencer namespace: sequence def: "A regulatory region which, upon binding of transcription factors, suppresses the transcription of the gene or genes it controls." [SO:ke] subset: SOFA synonym: "INSDC_feature:regulatory" BROAD [] synonym: "INSDC_qualifier:silencer" EXACT [] xref: http://en.wikipedia.org/wiki/Silencer_(DNA) "wiki" is_a: SO:0000727 ! cis_regulatory_module [Term] id: SO:0000627 name: insulator namespace: sequence def: "A regulatory region that 1) when located between a CRM and a gene's promoter prevents the CRM from modulating that gene's expression and 2) acts as a chromatin boundary element or barrier that can block the encroachment of condensed chromatin from an adjacent region."
[NCBI:cf, PMID:12154228, SO:regcreative] comment: moved from is_a: SO:0001055 transcriptional_cis_regulatory_region as per request from GREEKC initiative in August 2020. subset: SOFA synonym: "INSDC_feature:regulatory" BROAD [] synonym: "INSDC_qualifier:insulator" EXACT [] synonym: "insulator element" EXACT [] xref: http://en.wikipedia.org/wiki/Insulator_(genetics) "wiki" is_a: SO:0000727 ! cis_regulatory_module [Term] id: SO:0000628 name: chromosomal_structural_element namespace: sequence def: "Regions of the chromosome that are important for structural elements." [] subset: SOFA synonym: "chromosomal structural element" EXACT [] is_a: SO:0000830 ! chromosome_part [Term] id: SO:0000643 name: minisatellite namespace: sequence def: "A repeat region containing tandemly repeated sequences having a unit length of 10 to 40 bp." [http://www.informatics.jax.org/silver/glossary.shtml] subset: SOFA synonym: "INSDC_feature:repeat_region" BROAD [] synonym: "INSDC_qualifier:minisatellite" EXACT [] synonym: "VNTR" EXACT [http://www.ncbi.nlm.nih.gov/books/NBK21126/def-item/A9655/] xref: http://en.wikipedia.org/wiki/Minisatellite "wiki" is_a: SO:0000005 ! satellite_DNA [Term] id: SO:0000644 name: antisense_RNA namespace: sequence def: "Antisense RNA is RNA that is transcribed from the coding, rather than the template, strand of DNA. It is therefore complementary to mRNA." [SO:ke] subset: SOFA synonym: "antisense RNA" EXACT [] synonym: "INSDC_feature:ncRNA" BROAD [] synonym: "INSDC_qualifier:antisense_RNA" EXACT [] xref: http://en.wikipedia.org/wiki/Antisense_RNA "wiki" is_a: SO:0000655 ! ncRNA relationship: derives_from SO:0000645 ! antisense_primary_transcript [Term] id: SO:0000645 name: antisense_primary_transcript namespace: sequence def: "The reverse complement of the primary transcript." [SO:ke] subset: SOFA synonym: "antisense primary transcript" EXACT [] is_a: SO:0000185 ! 
primary_transcript [Term] id: SO:0000646 name: siRNA namespace: sequence def: "A small RNA molecule that is the product of a longer exogenous or endogenous dsRNA, which is either a bimolecular duplex or very long hairpin, processed (via the Dicer pathway) such that numerous siRNAs accumulate from both strands of the dsRNA. siRNAs trigger the cleavage of their target molecules." [PMID:12592000] subset: SOFA synonym: "INSDC_feature:ncRNA" BROAD [] synonym: "INSDC_qualifier:siRNA" EXACT [] synonym: "small interfering RNA" EXACT [] xref: http://en.wikipedia.org/wiki/SiRNA "wiki" is_a: SO:0000370 ! small_regulatory_ncRNA [Term] id: SO:0000650 name: cytosolic_SSU_rRNA namespace: sequence def: "Cytosolic SSU rRNA is an RNA component of the small subunit of cytosolic ribosomes." [SO:ke] comment: Renamed to cytosolic_SSU_rRNA from small_subunit_rRNA on 10 June 2021 as per restructuring of rRNA child terms. Updated definition to be consistent with format of other rRNA definitions. Request from EBI. See GitHub Issue #493. subset: SOFA synonym: "cytosolic small subunit rRNA" EXACT [] synonym: "cytosolic SSU ribosomal RNA" EXACT [] synonym: "cytosolic SSU rRNA" EXACT [] is_a: SO:0000252 ! rRNA [Term] id: SO:0000651 name: cytosolic_LSU_rRNA namespace: sequence def: "Cytosolic LSU rRNA is an RNA component of the large subunit of cytosolic ribosomes." [SO:ke] comment: Renamed to cytosolic_LSU_rRNA from large_subunit_rRNA on 10 June 2021 as per restructuring of rRNA child terms. Updated definition to be consistent with format of other rRNA definitions. Request from EBI. See GitHub Issue #493. subset: SOFA synonym: "cytosolic large subunit rRNA" EXACT [] synonym: "cytosolic LSU RNA" EXACT [] synonym: "cytosolic LSU rRNA" EXACT [] is_a: SO:0000252 ! rRNA relationship: derives_from SO:0000325 ! 
rRNA_large_subunit_primary_transcript [Term] id: SO:0000652 name: cytosolic_5S_rRNA namespace: sequence def: "Cytosolic 5S rRNA is an RNA component of the large subunit of cytosolic ribosomes in both prokaryotes and eukaryotes." [http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00001] comment: Renamed from rRNA_5S to cytosolic_5S_rRNA on 27 May 2021 with the restructuring of rRNA child terms. Updated definition to be consistent with format of other rRNA definitions. Requested by EBI. See GitHub Issue #493. subset: SOFA synonym: "cytosolic 5S LSU rRNA" EXACT [] synonym: "cytosolic 5S ribosomal RNA" EXACT [] synonym: "cytosolic 5S rRNA" EXACT [] synonym: "cytosolic rRNA 5S" EXACT [] xref: http://en.wikipedia.org/wiki/5S_ribosomal_RNA "wiki" is_a: SO:0000651 ! cytosolic_LSU_rRNA relationship: derives_from SO:0000704 ! gene [Term] id: SO:0000653 name: cytosolic_28S_rRNA namespace: sequence def: "Cytosolic 28S rRNA is an RNA component of the large subunit of cytosolic ribosomes in metazoan eukaryotes." [SO:ke] comment: Renamed from rRNA_28S to cytosolic_28S_rRNA on 27 May 2021 with the restructuring of rRNA child terms. Updated definition to be consistent with format of other rRNA definitions. Requested by EBI. See GitHub Issue #493. subset: SOFA synonym: "cytosolic 28S LSU rRNA" EXACT [] synonym: "cytosolic 28S ribosomal RNA" EXACT [] synonym: "cytosolic 28S rRNA" EXACT [] synonym: "cytosolic rRNA 28S" EXACT [] xref: http://en.wikipedia.org/wiki/28S_ribosomal_RNA "wiki" is_a: SO:0000651 ! cytosolic_LSU_rRNA relationship: derives_from SO:0000704 ! gene [Term] id: SO:0000655 name: ncRNA namespace: sequence def: "An RNA transcript that does not encode a protein; rather, the RNA molecule is the gene product." [SO:ke] comment: A ncRNA is a processed_transcript, so it may not contain parts such as transcribed_spacer_regions that are removed in the act of processing. For the corresponding primary_transcripts, please see term SO:0000483 nc_primary_transcript.
subset: SOFA synonym: "INSDC_qualifier:other" BROAD [] synonym: "known_ncrna" EXACT [] synonym: "noncoding RNA" EXACT [] xref: http://en.wikipedia.org/wiki/NcRNA "wiki" xref: http://www.gencodegenes.org/gencode_biotypes.html "GENCODE" is_a: SO:0000233 ! mature_transcript [Term] id: SO:0000657 name: repeat_region namespace: sequence def: "A region of sequence containing one or more repeat units." [SO:ke] subset: SOFA synonym: "INSDC_feature:repeat_region" BROAD [] synonym: "INSDC_qualifier:other" EXACT [] synonym: "repeat region" EXACT [] is_a: SO:0001411 ! biological_region relationship: has_part SO:0001411 ! biological_region [Term] id: SO:0000658 name: dispersed_repeat namespace: sequence def: "A repeat that is located at dispersed sites in the genome." [SO:ke] subset: SOFA synonym: "dispersed repeat" EXACT [] synonym: "INSDC_feature:repeat_region" BROAD [] synonym: "INSDC_qualifier:dispersed" EXACT [] synonym: "interspersed repeat" EXACT [] xref: http://en.wikipedia.org/wiki/Interspersed_repeat "wiki" is_a: SO:0000657 ! repeat_region [Term] id: SO:0000662 name: spliceosomal_intron namespace: sequence def: "An intron which is spliced by the spliceosome." [SO:ke] comment: GO:0000398. subset: SOFA synonym: "spliceosomal intron" EXACT [] is_a: SO:0000188 ! intron [Term] id: SO:0000667 name: insertion namespace: sequence alt_id: SO:1000034 def: "The sequence of one or more nucleotides added between two adjacent nucleotides in the sequence." [SO:ke] subset: DBVAR subset: SOFA synonym: "insertion" EXACT dbvar [http://www.ncbi.nlm.nih.gov/dbvar/] synonym: "nucleotide insertion" EXACT [] synonym: "nucleotide_insertion" EXACT [] xref: loinc:LA6687-3 "Insertion" is_a: SO:0001059 ! sequence_alteration is_a: SO:0001411 ! biological_region [Term] id: SO:0000668 name: EST_match namespace: sequence def: "A match against an EST sequence." [SO:ke] subset: SOFA synonym: "EST match" EXACT [] is_a: SO:0000102 ! 
expressed_sequence_match [Term] id: SO:0000673 name: transcript namespace: sequence def: "An RNA synthesized on a DNA or RNA template by an RNA polymerase." [SO:ma] comment: Added relationship overlaps SO:0002300 unit_of_gene_expression with Mejia-Almonte et al. PMID:32665585 Aug 5, 2020. subset: SOFA synonym: "INSDC_feature:misc_RNA" BROAD [] xref: http://en.wikipedia.org/wiki/RNA "wiki" is_a: SO:0000831 ! gene_member_region relationship: overlaps SO:0001411 ! biological_region [Term] id: SO:0000684 name: nuclease_sensitive_site namespace: sequence def: "A region of nucleotide sequence targeted by a nuclease enzyme." [SO:ma] subset: SOFA synonym: "nuclease sensitive site" EXACT [] is_a: SO:0000059 ! nuclease_binding_site [Term] id: SO:0000687 name: deletion_junction namespace: sequence def: "The space between two bases in a sequence which marks the position where a deletion has occurred." [SO:ke] subset: SOFA synonym: "deletion junction" EXACT [] is_a: SO:0000699 ! junction [Term] id: SO:0000688 name: golden_path namespace: sequence def: "A set of subregions selected from sequence contigs which when concatenated form a nonredundant linear sequence." [SO:ls] subset: SOFA synonym: "golden path" EXACT [] is_a: SO:0000353 ! sequence_assembly [Term] id: SO:0000689 name: cDNA_match namespace: sequence def: "A match against cDNA sequence." [SO:ke] subset: SOFA synonym: "cDNA match" EXACT [] is_a: SO:0000102 ! expressed_sequence_match [Term] id: SO:0000694 name: SNP namespace: sequence def: "SNPs are single base pair positions in genomic DNA at which different sequence alternatives exist in normal individuals in some population(s), wherein the least frequent variant has an abundance of 1% or greater." [SO:cb] subset: SOFA synonym: "single nucleotide polymorphism" EXACT [] is_a: SO:0001483 ! SNV [Term] id: SO:0000695 name: reagent namespace: sequence def: "A sequence used in an experiment." [SO:ke] comment: Requested by Lynn Crosby, Jan 2006. subset: SOFA is_a: SO:0001409 !
biomaterial_region [Term] id: SO:0000696 name: oligo namespace: sequence def: "A short oligonucleotide sequence, of length on the order of 10's of bases; either single or double stranded." [SO:ma] subset: SOFA synonym: "oligonucleotide" EXACT [] xref: http://en.wikipedia.org/wiki/Oligonucleotide "wiki" is_a: SO:0000695 ! reagent [Term] id: SO:0000699 name: junction namespace: sequence def: "A sequence_feature with an extent of zero." [SO:ke] comment: A junction is a boundary between regions. A boundary has an extent of zero. subset: SOFA synonym: "boundary" EXACT [] synonym: "breakpoint" EXACT [] is_a: SO:0000110 ! sequence_feature [Term] id: SO:0000700 name: remark namespace: sequence def: "A comment about the sequence." [SO:ke] subset: SOFA is_a: SO:0001410 ! experimental_feature [Term] id: SO:0000701 name: possible_base_call_error namespace: sequence def: "A region of sequence where the validity of the base calling is questionable." [SO:ke] subset: SOFA synonym: "possible base call error" EXACT [] is_a: SO:0000413 ! sequence_difference [Term] id: SO:0000702 name: possible_assembly_error namespace: sequence def: "A region of sequence where there may have been an error in the assembly." [SO:ke] subset: SOFA synonym: "possible assembly error" EXACT [] is_a: SO:0000413 ! sequence_difference [Term] id: SO:0000703 name: experimental_result_region namespace: sequence def: "A region of sequence implicated in an experimental result." [SO:ke] subset: SOFA synonym: "experimental result region" EXACT [] is_a: SO:0000700 ! remark [Term] id: SO:0000704 name: gene namespace: sequence def: "A region (or regions) that includes all of the sequence elements necessary to encode a functional transcript. A gene may include regulatory regions, transcribed regions and/or other functional sequence regions." [SO:immuno_workshop] comment: This term is mapped to MGED. Do not obsolete without consulting MGED ontology. A gene may be considered as a unit of inheritance. 
subset: SOFA synonym: "INSDC_feature:gene" EXACT [] xref: http://en.wikipedia.org/wiki/Gene "wiki" is_a: SO:0001411 ! biological_region relationship: member_of SO:0005855 ! gene_group [Term] id: SO:0000705 name: tandem_repeat namespace: sequence def: "Two or more adjacent copies of a region (of length greater than 1)." [SO:ke] subset: SOFA synonym: "INSDC_feature:repeat_region" BROAD [] synonym: "INSDC_qualifier:tandem" EXACT [] synonym: "tandem repeat" EXACT [] xref: http://en.wikipedia.org/wiki/Tandem_repeat "wiki" xref: http://www.sci.sdsu.edu/~smaloy/Glossary/T.html is_a: SO:0000657 ! repeat_region [Term] id: SO:0000706 name: trans_splice_acceptor_site namespace: sequence def: "The 3' splice site of the acceptor primary transcript." [SO:ke] comment: This region contains a polypyrimidine tract and AG dinucleotide in some organisms and is UUUCAG in C. elegans. subset: SOFA synonym: "3' trans splice site" RELATED [] synonym: "trans splice acceptor site" EXACT [] is_a: SO:0001420 ! trans_splice_site [Term] id: SO:0000714 name: nucleotide_motif namespace: sequence def: "A region of nucleotide sequence corresponding to a known motif." [SO:ke] subset: SOFA synonym: "INSDC_feature:misc_feature" BROAD [] synonym: "INSDC_note:nucleotide_motif" EXACT [] synonym: "nucleotide motif" EXACT [] is_a: SO:0001683 ! sequence_motif [Term] id: SO:0000715 name: RNA_motif namespace: sequence def: "A motif that is active in RNA sequence." [SO:ke] subset: SOFA synonym: "RNA motif" EXACT [] is_a: SO:0000714 ! nucleotide_motif [Term] id: SO:0000717 name: reading_frame namespace: sequence def: "A nucleic acid sequence that when read as sequential triplets, has the potential of encoding a sequential string of amino acids. It need not contain the start or stop codon." [SGD:rb] comment: This term was added after a request by SGD. August 2004. Modified after SO meeting in Cambridge to not include start or stop.
subset: SOFA synonym: "reading frame" EXACT [] xref: http://en.wikipedia.org/wiki/Reading_frame "wiki" is_a: SO:0001410 ! experimental_feature [Term] id: SO:0000719 name: ultracontig namespace: sequence def: "An ordered and oriented set of scaffolds based on somewhat weaker sets of inferential evidence such as one set of mate pair reads together with supporting evidence from ESTs or location of markers from SNP or microsatellite maps, or cytogenetic localization of contained markers." [FB:WG] subset: SOFA synonym: "pseudochromosome" EXACT [] synonym: "superscaffold" RELATED [] is_a: SO:0000353 ! sequence_assembly [Term] id: SO:0000724 name: oriT namespace: sequence def: "A region of a DNA molecule where transfer is initiated during the process of conjugation or mobilization." [http://www.insdc.org/files/feature_table.html] subset: SOFA synonym: "INSDC_feature:oriT" EXACT [] synonym: "origin of transfer" EXACT [] xref: http://en.wikipedia.org/wiki/Origin_of_transfer "wiki" is_a: SO:0000296 ! origin_of_replication [Term] id: SO:0000725 name: transit_peptide namespace: sequence alt_id: BS:00055 def: "The transit_peptide is a short region at the N-terminus of the peptide that directs the protein to an organelle (chloroplast, mitochondrion, microbody or cyanelle)." [http://www.insdc.org/files/feature_table.html] comment: Added to bring SO inline with the EMBL, DDBJ, GenBank feature table. Old definition before biosapiens: The coding sequence for an N-terminal domain of a nuclear-encoded organellar protein. This domain is involved in post translational import of the protein into the organelle. subset: biosapiens subset: SOFA synonym: "INSDC_feature:transit_peptide" EXACT [] synonym: "signal" RELATED [] synonym: "transit" RELATED [uniprot:feature_type] synonym: "transit peptide" EXACT [] is_a: SO:0001527 ! peptide_localization_signal relationship: derives_from SO:0000851 ! 
CDS_region [Term] id: SO:0000727 name: cis_regulatory_module namespace: sequence def: "A regulatory region where transcription factor binding sites are clustered to regulate various aspects of transcription activities. (CRMs can be located a few kb to hundreds of kb upstream of the core promoter, in the coding sequence, within introns, or in the untranslated regions (UTR) sequences, and even on a different chromosome). A single gene can be regulated by multiple CRMs to give precise control of its spatial and temporal expression. CRMs function as nodes in a large, intertwined regulatory network. CRM DNA accessibility is subject to regulation by dbTFs and transcription co-TFs." [PMID:19660565, SO:SG] comment: Requested by Stephen Grossmann Dec 2004. Changed relationship from has_part SO:0000235 TF_binding site to TF_binding_site is part_of SO:0000727 CRM in response to requests from GREEKC initiative in Aug 2020. Removed 3' from definition because 5' UTRs are included as well, notified by Colin Logie of GREEKC. Nov 9 2020. DS Updated name from 'CRM' to 'cis_regulatory_module' on 08 Feb 2021. See GitHub Issue #526. DS Added final sentence to definition as part of GREEKC Feb 16, 2021. See GitHub Issue #534. subset: SOFA synonym: "cis regulatory module" EXACT [] synonym: "CRM" EXACT [] synonym: "TF module" EXACT [] synonym: "transcription factor module" EXACT [] is_a: SO:0001055 ! transcriptional_cis_regulatory_region [Term] id: SO:0000730 name: gap namespace: sequence def: "A gap in the sequence of known length. The unknown bases are filled in with N's." [SO:ke] subset: SOFA synonym: "INSDC_feature:assembly_gap" NARROW [] synonym: "INSDC_feature:gap" EXACT [] is_a: SO:0000143 ! assembly_component relationship: part_of SO:0000353 ! sequence_assembly [Term] id: SO:0000752 name: gene_group_regulatory_region namespace: sequence def: "A region that is involved in the regulation of transcription of a group of regulated genes."
[] comment: Merged into transcriptional_cis_regulatory_region (SO:0001055) on 11 Feb 2021 as part of GREEKC reducing redundancy as we prepare to submit several terms to Ensembl. See GitHub Issue #529. subset: SOFA synonym: "gene group regulatory region" EXACT [] is_obsolete: true replaced_by: SO:0001055 [Term] id: SO:0000753 name: clone_insert namespace: sequence def: "The region of sequence that has been inserted and is being propagated by the clone." [SO:ke] subset: SOFA synonym: "clone insert" EXACT [] is_a: SO:0000695 ! reagent relationship: part_of SO:0000151 ! clone [Term] id: SO:0000777 name: pseudogenic_rRNA namespace: sequence def: "A non functional descendant of an rRNA." [SO:ke] comment: Added Jan 2006 to allow the annotation of the pseudogenic rRNA by flybase. Non-functional is defined as its transcription is prevented due to one or more mutations. subset: SOFA synonym: "INSDC_feature:rRNA" BROAD [] synonym: "INSDC_qualifier:pseudo" EXACT [] synonym: "pseudogenic rRNA" EXACT [] is_a: SO:0000462 ! pseudogenic_region relationship: non_functional_homolog_of SO:0000673 ! transcript relationship: part_of SO:0000336 ! pseudogene [Term] id: SO:0000778 name: pseudogenic_tRNA namespace: sequence def: "A non functional descendant of a tRNA." [SO:ke] comment: Added Jan 2006 to allow the annotation of the pseudogenic tRNA by flybase. Non-functional is defined as its transcription is prevented due to one or more mutations. subset: SOFA synonym: "INSDC_feature:tRNA" BROAD [] synonym: "INSDC_qualifier:pseudo" EXACT [] synonym: "pseudogenic tRNA" EXACT [] is_a: SO:0000462 ! pseudogenic_region relationship: non_functional_homolog_of SO:0000673 ! transcript relationship: part_of SO:0000336 ! pseudogene [Term] id: SO:0000830 name: chromosome_part namespace: sequence def: "A region of a chromosome." [SO:ke] comment: This is a manufactured term that serves the purpose of allowing the parts of a chromosome to have an is_a path to the root.
subset: SOFA synonym: "chromosomal region" EXACT [] synonym: "chromosomal_region" EXACT [] synonym: "chromosome part" EXACT [] is_a: SO:0001411 ! biological_region relationship: part_of SO:0000340 ! chromosome [Term] id: SO:0000831 name: gene_member_region namespace: sequence def: "A region of a gene." [SO:ke] comment: A manufactured term used to allow the parts of a gene to have an is_a path to the root. subset: SOFA synonym: "gene member region" EXACT [] is_a: SO:0001411 ! biological_region relationship: member_of SO:0000704 ! gene [Term] id: SO:0000833 name: transcript_region namespace: sequence def: "A region of a transcript." [SO:ke] comment: This term was added to provide a grouping term for the region parts of transcript, thus giving them an is_a path back to the root. subset: SOFA synonym: "transcript region" EXACT [] is_a: SO:0001411 ! biological_region relationship: part_of SO:0000673 ! transcript [Term] id: SO:0000834 name: mature_transcript_region namespace: sequence def: "A region of a mature transcript." [SO:ke] comment: A manufactured term to collect together the parts of a mature transcript and give them an is_a path to the root. subset: SOFA synonym: "mature transcript region" EXACT [] is_a: SO:0000833 ! transcript_region [Term] id: SO:0000835 name: primary_transcript_region namespace: sequence def: "A part of a primary transcript." [SO:ke] comment: This term was added to provide a grouping term for the region parts of primary_transcript, thus giving them an is_a path back to the root. subset: SOFA synonym: "primary transcript region" EXACT [] is_a: SO:0000833 ! transcript_region relationship: part_of SO:0000185 ! primary_transcript [Term] id: SO:0000836 name: mRNA_region namespace: sequence def: "A region of an mRNA." [SO:cb] comment: This term was added to provide a grouping term for the region parts of mRNA, thus giving them an is_a path back to the root. subset: SOFA synonym: "mRNA region" EXACT [] is_a: SO:0000834 ! 
mature_transcript_region relationship: part_of SO:0000234 ! mRNA [Term] id: SO:0000837 name: UTR_region namespace: sequence def: "A region of UTR." [SO:ke] comment: A region of UTR. This term is a grouping term to allow the parts of UTR to have an is_a path to the root. subset: SOFA synonym: "UTR region" EXACT [] is_a: SO:0000836 ! mRNA_region [Term] id: SO:0000839 name: polypeptide_region namespace: sequence alt_id: BS:00124 alt_id: BS:00331 def: "Biological sequence region that can be assigned to a specific subsequence of a polypeptide." [SO:GAR, SO:ke] comment: Added to allow the polypeptide regions to have is_a paths back to the root. subset: biosapiens subset: SOFA synonym: "positional" RELATED [] synonym: "positional polypeptide feature" RELATED [] synonym: "region" NARROW [uniprot:feature_type] synonym: "region or site annotation" RELATED [] synonym: "site" NARROW [uniprot:feature_type] is_a: SO:0001411 ! biological_region relationship: part_of SO:0000104 ! polypeptide [Term] id: SO:0000841 name: spliceosomal_intron_region namespace: sequence def: "A region within an intron." [SO:ke] comment: A term added to allow the parts of introns to have is_a paths to the root. subset: SOFA synonym: "spliceosomal intron region" EXACT [] is_a: SO:0000835 ! primary_transcript_region relationship: part_of SO:0000662 ! spliceosomal_intron [Term] id: SO:0000842 name: gene_component_region namespace: sequence def: "A region of a gene that has a specific function." [] subset: SOFA synonym: "gene component region" EXACT [] is_a: SO:0001411 ! biological_region relationship: part_of SO:0000704 ! gene [Term] id: SO:0000851 name: CDS_region namespace: sequence def: "A region of a CDS." [SO:cb] subset: SOFA synonym: "CDS region" EXACT [] is_a: SO:0000836 ! mRNA_region relationship: part_of SO:0000316 ! CDS [Term] id: SO:0000852 name: exon_region namespace: sequence def: "A region of an exon." [RSC:cb] subset: SOFA synonym: "exon region" EXACT [] is_a: SO:0000833 !
transcript_region relationship: part_of SO:0000147 ! exon [Term] id: SO:0001000 name: cytosolic_16S_rRNA namespace: sequence def: "Cytosolic 16S rRNA is an RNA component of the small subunit of cytosolic ribosomes in prokaryotes." [SO:ke] comment: Renamed to cytosolic_16S_rRNA from rRNA_16S on 10 June 2021 as per restructuring of rRNA child terms. Updated definition to be consistent with format of other rRNA definitions. Request from EBI. See GitHub Issue #493. subset: SOFA synonym: "cytosolic 16S ribosomal RNA" EXACT [] synonym: "cytosolic 16S rRNA" RELATED [] synonym: "cytosolic 16S SSU RNA" EXACT [] synonym: "cytosolic rRNA 16S" EXACT [] xref: http://en.wikipedia.org/wiki/16S_ribosomal_RNA "wiki" is_a: SO:0000650 ! cytosolic_SSU_rRNA relationship: derives_from SO:0000704 ! gene [Term] id: SO:0001001 name: cytosolic_23S_rRNA namespace: sequence def: "Cytosolic 23S rRNA is an RNA component of the large subunit of cytosolic ribosomes in prokaryotes." [SO:ke] comment: Renamed from rRNA_23S to cytosolic_23S_rRNA on 27 May 2021 with the restructuring of rRNA child terms. Updated definition to be consistent with format of other rRNA definitions. Requested by EBI. See GitHub Issue #493. subset: SOFA synonym: "cytosolic 23S LSU rRNA" EXACT [] synonym: "cytosolic 23S ribosomal RNA" RELATED [] synonym: "cytosolic 23S rRNA" EXACT [] synonym: "cytosolic rRNA 23S" EXACT [] is_a: SO:0000651 ! cytosolic_LSU_rRNA relationship: derives_from SO:0000704 ! gene [Term] id: SO:0001002 name: cytosolic_25S_rRNA namespace: sequence def: "Cytosolic 25S rRNA is an RNA component of the large subunit of cytosolic ribosomes in most eukaryotes." [PMID:15493135, PMID:2100998, RSC:cb] comment: Renamed from rRNA_25S to cytosolic_25S_rRNA on 27 May 2021 with the restructuring of rRNA child terms. Updated definition to be consistent with format of other rRNA definitions. Requested by EBI. See GitHub Issue #493.
subset: SOFA synonym: "cytosolic 25S LSU rRNA" EXACT [] synonym: "cytosolic 25S ribosomal RNA" EXACT [] synonym: "cytosolic 25S rRNA" EXACT [] synonym: "cytosolic rRNA 25S" EXACT [] is_a: SO:0000651 ! cytosolic_LSU_rRNA relationship: derives_from SO:0000704 ! gene [Term] id: SO:0001019 name: copy_number_variation namespace: sequence def: "A variation that increases or decreases the copy number of a given region." [SO:ke] subset: SOFA synonym: "CNP" EXACT [] synonym: "CNV" EXACT [] synonym: "copy number polymorphism" EXACT [] synonym: "copy number variation" EXACT [] xref: http://en.wikipedia.org/wiki/Copy_number_variation "wiki" is_a: SO:0001059 ! sequence_alteration [Term] id: SO:0001037 name: mobile_genetic_element namespace: sequence def: "A nucleotide region with either intra-genome or intracellular mobility, of varying length, which often carry the information necessary for transfer and recombination with the host genome." [PMID:14681355] subset: SOFA synonym: "INSDC_feature:mobile_element" EXACT [] synonym: "MGE" EXACT [] synonym: "mobile genetic element" EXACT [] xref: http://en.wikipedia.org/wiki/Mobile_genetic_element "wiki" is_a: SO:0001411 ! biological_region [Term] id: SO:0001039 name: integrated_mobile_genetic_element namespace: sequence def: "An MGE that is integrated into the host chromosome." [SO:ke] subset: SOFA synonym: "integrated mobile genetic element" EXACT [] is_a: SO:0001037 ! mobile_genetic_element [Term] id: SO:0001055 name: transcriptional_cis_regulatory_region namespace: sequence def: "A regulatory_region that modulates the transcription of a gene or genes." [PMID:9679020, SO:regcreative] comment: Previous parent term transcription_regulatory_region (SO:0001067) has been merged with this term on 11 Feb 2021 as part of the GREEKC consortium. See GitHub Issue #527. 
subset: SOFA synonym: "INSDC_feature:regulatory" BROAD [] synonym: "INSDC_qualifier:transcriptional_cis_regulatory_region" EXACT [] synonym: "transcription-control region" EXACT [] synonym: "transcriptional cis regulatory region" EXACT [] is_a: SO:0005836 ! regulatory_region [Term] id: SO:0001056 name: splicing_regulatory_region namespace: sequence def: "A regulatory_region that modulates splicing." [SO:ke] comment: Moved from transcription_regulatory_region (SO:0001679) to transcriptional_cis_regulatory_region (SO:0001055) by Dave Sant on Feb 11, 2021 when transcription_regulatory_region was merged into transcriptional_cis_regulatory_region to be consistent with GO and reduce redundancy as part of the GREEKC consortium. See GitHub Issue #527. subset: SOFA synonym: "splicing regulatory region" EXACT [] is_a: SO:0001055 ! transcriptional_cis_regulatory_region [Term] id: SO:0001059 name: sequence_alteration namespace: sequence alt_id: SO:1000004 alt_id: SO:1000007 def: "A sequence_alteration is a sequence_feature whose extent is the deviation from another sequence." [SO:ke] comment: Merged with partially characterized change in nucleotide sequence. subset: SOFA synonym: "INSDC_feature:misc_feature" BROAD [] synonym: "INSDC_feature:variation" EXACT [] synonym: "INSDC_note:sequence_alteration" EXACT [] synonym: "partially characterised change in DNA sequence" NARROW [] synonym: "partially_characterised_change_in_DNA_sequence" NARROW [] synonym: "sequence alteration" EXACT [] synonym: "sequence variation" RELATED [] synonym: "uncharacterised_change_in_nucleotide_sequence" NARROW [] is_a: SO:0000110 ! sequence_feature [Term] id: SO:0001063 name: immature_peptide_region namespace: sequence alt_id: BS:00129 def: "An immature_peptide_region is the extent of the peptide after it has been translated and before any processing occurs." [EBIBS:GAR] comment: Range. subset: biosapiens subset: SOFA synonym: "immature peptide region" EXACT [] is_a: SO:0000839 ! 
polypeptide_region [Term] id: SO:0001214 name: noncoding_region_of_exon namespace: sequence def: "The maximal intersection of exon and UTR." [SO:ke] comment: An exon either containing but not starting with a start codon or containing but not ending with a stop codon will be partially coding and partially non coding. subset: SOFA synonym: "noncoding region of exon" EXACT [] is_a: SO:0000852 ! exon_region [Term] id: SO:0001215 name: coding_region_of_exon namespace: sequence def: "The region of an exon that encodes for protein sequence." [SO:ke] comment: An exon containing either a start or stop codon will be partially coding and partially non coding. subset: SOFA synonym: "coding region of exon" EXACT [] is_a: SO:0000852 ! exon_region [Term] id: SO:0001235 name: replicon namespace: sequence def: "A region containing at least one unique origin of replication and a unique termination site." [ISBN:0716719207] subset: SOFA xref: http://en.wikipedia.org/wiki/Replicon_(genetics) "wiki" is_a: SO:0001411 ! biological_region [Term] id: SO:0001236 name: base namespace: sequence def: "A base is a sequence feature that corresponds to a single unit of a nucleotide polymer." [SO:ke] subset: SOFA xref: http://en.wikipedia.org/wiki/Nucleobase "wiki" is_a: SO:0001411 ! biological_region [Term] id: SO:0001248 name: assembly namespace: sequence def: "A region of the genome of known length that is composed by ordering and aligning two or more different regions." [SO:ke] subset: SOFA xref: http://en.wikipedia.org/wiki/Genome_assembly#Genome_assembly "wiki" is_a: SO:0001410 ! experimental_feature [Term] id: SO:0001409 name: biomaterial_region namespace: sequence def: "A region which is intended for use in an experiment." [SO:cb] subset: SOFA synonym: "biomaterial region" EXACT [] is_a: SO:0000001 ! region [Term] id: SO:0001410 name: experimental_feature namespace: sequence def: "A region which is the result of some arbitrary experimental procedure. 
The procedure may be carried out with biological material or inside a computer." [SO:cb] subset: SOFA synonym: "analysis feature" RELATED [] synonym: "experimental output artefact" EXACT [] synonym: "experimental_output_artefact" EXACT [] is_a: SO:0000001 ! region [Term] id: SO:0001411 name: biological_region namespace: sequence def: "A region defined by its disposition to be involved in a biological process." [SO:cb] subset: SOFA synonym: "biological region" EXACT [] synonym: "INSDC_misc_feature" BROAD [] synonym: "INSDC_note:biological_region" EXACT [] is_a: SO:0000001 ! region [Term] id: SO:0001412 name: topologically_defined_region namespace: sequence def: "A DNA region within which self-interaction occurs more often than expected by chance because of DNA-looping." [PMID:32782014, SO:cb] subset: SOFA synonym: "topologically defined region" EXACT [] is_a: SO:0000001 ! region [Term] id: SO:0001419 name: cis_splice_site namespace: sequence def: "Intronic 2 bp region bordering exon. A splice_site that adjacent_to exon and overlaps intron." [SO:cjm, SO:ke] subset: SOFA synonym: "cis splice site" EXACT [] is_a: SO:0000162 ! splice_site [Term] id: SO:0001420 name: trans_splice_site namespace: sequence def: "Primary transcript region bordering trans-splice junction." [SO:ke] subset: SOFA synonym: "trans splice site" EXACT [] is_a: SO:0000162 ! splice_site [Term] id: SO:0001483 name: SNV namespace: sequence def: "SNVs are single nucleotide positions in genomic DNA at which different sequence alternatives exist." [SO:bm] subset: SOFA synonym: "single nucleotide variant" EXACT [] is_a: SO:1000002 ! substitution created_by: kareneilbeck creation_date: 2009-10-08T11:37:49Z [Term] id: SO:0001527 name: peptide_localization_signal namespace: sequence def: "A region of peptide sequence used to target the polypeptide molecule to a specific organelle." [SO:ke] subset: SOFA synonym: "localization signal" RELATED [] synonym: "peptide localization signal" EXACT [] is_a: SO:0000839 ! 
polypeptide_region created_by: kareneilbeck creation_date: 2010-03-11T02:15:05Z [Term] id: SO:0001647 name: kozak_sequence namespace: sequence def: "A kind of ribosome entry site, specific to Eukaryotic organisms that overlaps part of both 5' UTR and CDS sequence." [SO:ke] subset: SOFA synonym: "kozak consensus" EXACT [] synonym: "kozak consensus sequence" EXACT [] synonym: "kozak sequence" EXACT [] xref: http://en.wikipedia.org/wiki/Kozak_consensus_sequence "wikipedia" is_a: SO:0000139 ! ribosome_entry_site created_by: kareneilbeck creation_date: 2010-06-07T03:12:20Z [Term] id: SO:0001654 name: nucleotide_to_protein_binding_site namespace: sequence def: "A binding site that, in the nucleotide molecule, interacts selectively and non-covalently with polypeptide residues." [SO:ke] subset: SOFA synonym: "nucleotide to protein binding site" RELATED [] is_a: SO:0000410 ! protein_binding_site created_by: kareneilbeck creation_date: 2010-08-03T12:26:05Z [Term] id: SO:0001679 name: transcription_regulatory_region namespace: sequence def: "A regulatory region that is involved in the control of the process of transcription." [SO:ke] comment: Obsoleted by David Sant on 11 Feb 2021 when it was merged with transcriptional_cis_regulatory_region (SO:0001055) to reduce redundancy and be consistent with Gene Ontology. See GitHub Issue #527. subset: SOFA synonym: "transcription regulatory region" EXACT [] is_obsolete: true created_by: kareneilbeck creation_date: 2010-10-12T03:49:35Z [Term] id: SO:0001683 name: sequence_motif namespace: sequence def: "A sequence motif is a nucleotide or amino-acid sequence pattern that may have biological significance." [http://en.wikipedia.org/wiki/Sequence_motif] subset: SOFA synonym: "sequence motif" RELATED [] xref: http://en.wikipedia.org/wiki/Sequence_motif "wikipedia" is_a: SO:0001411 ! 
biological_region created_by: kareneilbeck creation_date: 2010-10-14T04:13:22Z [Term] id: SO:0001720 name: epigenetically_modified_region namespace: sequence def: "A biological DNA region implicated in epigenomic changes caused by mechanisms other than changes in the underlying DNA sequence. This includes, nucleosomal histone post-translational modifications, nucleosome depletion to render DNA accessible and post-replicational base modifications such as cytosine modification." [http://en.wikipedia.org/wiki/Epigenetics, SO:ke] comment: Moved from is_a biological_region (SO:0001411) to is_a regulatory_region (SO:0005836) on 11 Feb 2021. GREEKC members pointed out that this would be a more appropriate location. See GitHub Issue #530. 11 Feb 2021 updated definition along with addition of epigenomically_modified_region (SO:0002332). Epigenetically modified region is now not inherited while epigenomically modified region is not annotated as inherited. See GitHub Issue #532 and issue #534. subset: SOFA synonym: "epigenetically modified region" RELATED [] is_a: SO:0005836 ! regulatory_region created_by: kareneilbeck creation_date: 2010-03-27T12:02:29Z [Term] id: SO:0001790 name: paired_end_fragment namespace: sequence def: "An assembly region that has been sequenced from both ends resulting in a read_pair (mate_pair)." [SO:ke] subset: SOFA synonym: "paired end fragment" EXACT [] is_a: SO:0000143 ! assembly_component created_by: kareneilbeck creation_date: 2011-04-14T01:48:20Z [Term] id: SO:0005836 name: regulatory_region namespace: sequence def: "A region of sequence that is involved in the control of a biological process." [SO:ke] subset: SOFA synonym: "INSDC_feature:regulatory" BROAD [] synonym: "INSDC_qualifier:other" EXACT [] synonym: "regulatory region" EXACT [] xref: http://en.wikipedia.org/wiki/Regulatory_region "wiki" is_a: SO:0000831 ! gene_member_region [Term] id: SO:0005855 name: gene_group namespace: sequence def: "A collection of related genes." 
[SO:ma] subset: SOFA synonym: "gene group" EXACT [] is_a: SO:0001411 ! biological_region [Term] id: SO:0100011 name: cleaved_peptide_region namespace: sequence def: "The cleaved_peptide_region is the region of a peptide sequence that is cleaved during maturation." [EBIBS:GAR] comment: Range. subset: biosapiens subset: SOFA synonym: "cleaved peptide region" EXACT [] is_a: SO:0000839 ! polypeptide_region relationship: part_of SO:0001063 ! immature_peptide_region [Term] id: SO:1000002 name: substitution namespace: sequence def: "A sequence alteration where the length of the change in the variant is the same as that of the reference." [SO:ke] subset: SOFA xref: loinc:LA6690-7 "Substitution" is_a: SO:0001059 ! sequence_alteration is_a: SO:0001411 ! biological_region [Term] id: SO:1000005 name: complex_substitution namespace: sequence def: "When no simple or well defined DNA mutation event describes the observed DNA change, the keyword \"complex\" should be used. Usually there are multiple equally plausible explanations for the change." [EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html] subset: SOFA synonym: "complex substitution" EXACT [] is_a: SO:1000002 ! substitution [Term] id: SO:1000008 name: point_mutation namespace: sequence def: "A single nucleotide change which has occurred at the same position of a corresponding nucleotide in a reference sequence." [SO:immuno_workshop] subset: SOFA synonym: "point mutation" EXACT [] xref: http://en.wikipedia.org/wiki/Point_mutation "wiki" is_a: SO:0001483 ! SNV [Term] id: SO:1000036 name: inversion namespace: sequence def: "A continuous nucleotide sequence is inverted in the same position." [EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html] subset: DBVAR subset: SOFA synonym: "inversion" EXACT dbvar [http://www.ncbi.nlm.nih.gov/dbvar/] xref: loinc:LA6689-9 "Inversion" is_a: SO:0001059 ! sequence_alteration is_a: SO:0001411 ! 
biological_region [Term] id: SO:1001284 name: regulon namespace: sequence def: "A set of units of gene expression directly regulated by a common set of one or more common regulatory gene products." [ISBN:0198506732, PMID:32665585] comment: Definition updated with Mejia-Almonte et al. PMID:32665585 on Aug 5, 2020. Added relationship has_part SO:0002300 subset: SOFA xref: http://en.wikipedia.org/wiki/Regulon "wiki" is_a: SO:0005855 ! gene_group relationship: has_part SO:0001411 ! biological_region [Term] id: SO:2000061 name: databank_entry namespace: sequence def: "The sequence referred to by an entry in a databank such as GenBank or SwissProt." [SO:ke] subset: SOFA synonym: "accession" RELATED [] synonym: "databank entry" EXACT [] is_a: SO:0000695 ! reagent [Typedef] id: adjacent_to name: adjacent_to namespace: sequence def: "A geometric operator, specified in Egenhofer 1989. Two features meet if they share a junction on the sequence. X adjacent_to Y iff X and Y share a boundary but do not overlap." [PMID:20226267, SO:ke] subset: SOFA [Typedef] id: associated_with name: associated_with namespace: sequence comment: This relationship is vague and up for discussion. [Typedef] id: complete_evidence_for_feature name: complete_evidence_for_feature namespace: sequence def: "B is complete_evidence_for_feature A if the extent (5' and 3' boundaries) and internal boundaries of B fully support the extent and internal boundaries of A." [SO:ke] comment: If A is a feature with multiple regions such as a multi exon transcript, the supporting EST evidence is complete if each of the regions is supported by an equivalent region in B. Also there must be no extra regions in B that are not represented in A. This relationship was requested by jeltje on the SO term tracker. The thread for the discussion can be accessed via tracker ID:1917222. is_transitive: true is_a: evidence_for_feature !
evidence_for_feature [Typedef] id: connects_on name: connects_on namespace: sequence def: "X connects_on Y, Z, R iff whenever Z is on a R, X is adjacent to a Y and adjacent to a Z." [PMID:20226267] comment: Example: A splice_junction connects_on exon, exon, mature_transcript. created_by: kareneilbeck creation_date: 2010-10-14T01:38:51Z [Typedef] id: contained_by name: contained_by namespace: sequence def: "X contained_by Y iff X starts after start of Y and X ends before end of Y." [PMID:20226267] comment: The inverse is contains. Example: intein contained_by immature_peptide_region. is_transitive: true created_by: kareneilbeck creation_date: 2010-10-14T01:26:16Z [Typedef] id: contains name: contains namespace: sequence def: "The inverse of contained_by." [PMID:20226267] comment: Example: pre_miRNA contains miRNA_loop. is_transitive: true created_by: kareneilbeck creation_date: 2010-10-14T01:32:15Z [Typedef] id: derives_from name: derives_from namespace: sequence subset: SOFA is_transitive: true [Typedef] id: disconnected_from name: disconnected_from namespace: sequence def: "X is disconnected_from Y iff it is not the case that X overlaps Y." [PMID:20226267] created_by: kareneilbeck creation_date: 2010-10-14T01:42:10Z [Typedef] id: edited_from name: edited_from namespace: sequence created_by: kareneilbeck creation_date: 2009-08-19T02:19:45Z [Typedef] id: edited_to name: edited_to namespace: sequence created_by: kareneilbeck creation_date: 2009-08-19T02:19:11Z [Typedef] id: evidence_for_feature name: evidence_for_feature namespace: sequence def: "B is evidence_for_feature A, if an instance of B supports the existence of A." [SO:ke] comment: This relationship was requested by nlw on the SO term tracker. The thread for the discussion can be accessed via tracker ID:1917222. is_transitive: true [Typedef] id: exemplar_of name: exemplar_of namespace: sequence def: "X is exemplar of Y if X is the best evidence for Y." [SO:ke] comment: Tracker id: 2594157.
[Typedef] id: finished_by name: finished_by namespace: sequence def: "X is finished_by Y if Y is part_of X, and X and Y share a 3' boundary." [PMID:20226267] comment: Example: CDS finished_by stop_codon. created_by: kareneilbeck creation_date: 2010-10-14T01:45:45Z [Typedef] id: finishes name: finishes namespace: sequence def: "X finishes Y if X is part_of Y and X and Y share a 3' or C terminal boundary." [PMID:20226267] comment: Example: stop_codon finishes CDS. created_by: kareneilbeck creation_date: 2010-10-14T02:17:53Z [Typedef] id: gained name: gained namespace: sequence def: "X gained Y if X is a variant_of X' and Y part of X but not X'." [SO:ke] comment: A relation with which to annotate the changes in a variant sequence with respect to a reference.\nFor example a variant transcript may gain a stop codon not present in the reference sequence. created_by: kareneilbeck creation_date: 2011-06-28T12:51:10Z [Typedef] id: genome_of name: genome_of namespace: sequence [Typedef] id: guided_by name: guided_by namespace: sequence created_by: kareneilbeck creation_date: 2009-08-19T02:27:04Z [Typedef] id: guides name: guides namespace: sequence created_by: kareneilbeck creation_date: 2009-08-19T02:27:24Z [Typedef] id: has_integral_part name: has_integral_part namespace: sequence def: "X has_integral_part Y if and only if: X has_part Y and Y part_of X." [http://precedings.nature.com/documents/3495/version/1] comment: Example: mRNA has_integral_part CDS. is_a: has_part ! has_part created_by: kareneilbeck creation_date: 2009-08-19T12:01:46Z [Typedef] id: has_origin name: has_origin namespace: sequence [Typedef] id: has_part name: has_part namespace: sequence def: "Inverse of part_of." [http://precedings.nature.com/documents/3495/version/1] comment: Example: operon has_part gene. [Typedef] id: has_quality name: has_quality namespace: sequence comment: The relationship between a feature and an attribute.
[Typedef] id: homologous_to name: homologous_to namespace: sequence subset: SOFA is_symmetric: true is_a: similar_to ! similar_to [Typedef] id: integral_part_of name: integral_part_of namespace: sequence def: "X integral_part_of Y if and only if: X part_of Y and Y has_part X." [http://precedings.nature.com/documents/3495/version/1] comment: Example: exon integral_part_of transcript. is_a: part_of ! part_of created_by: kareneilbeck creation_date: 2009-08-19T12:03:28Z [Typedef] id: is_consecutive_sequence_of name: is_consecutive_sequence_of namespace: sequence def: "R is_consecutive_sequence_of U iff every instance of R is equivalent to a collection of instances of U:u1, u2, un, such that no pair of ux uy is overlapping and for all ux, it is adjacent to ux-1 and ux+1, with the exception of the initial and terminal u1 and un (which may be identical)." [PMID:20226267] comment: Example: region is_consecutive_sequence_of base. created_by: kareneilbeck creation_date: 2010-10-14T02:19:48Z [Typedef] id: lost name: lost namespace: sequence def: "X lost Y if X is a variant_of X' and Y part of X' but not X." [SO:ke] comment: A relation with which to annotate the changes in a variant sequence with respect to a reference.\nFor example a variant transcript may have lost a stop codon present in the reference sequence. created_by: kareneilbeck creation_date: 2011-06-28T12:53:16Z [Typedef] id: maximally_overlaps name: maximally_overlaps namespace: sequence def: "A maximally_overlaps X iff all parts of A (including A itself) overlap both X and Y." [PMID:20226267] comment: Example: noncoding_region_of_exon maximally_overlaps the intersections of exon and UTR. created_by: kareneilbeck creation_date: 2010-10-14T01:34:48Z [Typedef] id: member_of name: member_of namespace: sequence comment: A subtype of part_of. Inverse is collection_of. Winston, M, Chaffin, R, Herrmann: A taxonomy of part-whole relations. Cognitive Science 1987, 11:417-444.
subset: SOFA is_transitive: true is_a: part_of ! part_of [Typedef] id: non_functional_homolog_of name: non_functional_homolog_of namespace: sequence def: "A relationship between a pseudogenic feature and its functional ancestor." [SO:ke] subset: SOFA is_a: homologous_to ! homologous_to [Typedef] id: orthologous_to name: orthologous_to namespace: sequence subset: SOFA is_symmetric: true is_a: homologous_to ! homologous_to [Typedef] id: overlaps name: overlaps namespace: sequence def: "X overlaps Y iff there exists some Z such that Z contained_by X and Z contained_by Y." [PMID:20226267] comment: Example: coding_exon overlaps CDS. created_by: kareneilbeck creation_date: 2010-10-14T01:33:15Z [Typedef] id: paralogous_to name: paralogous_to namespace: sequence subset: SOFA is_symmetric: true is_a: homologous_to ! homologous_to [Typedef] id: part_of name: part_of namespace: sequence def: "X part_of Y if X is a subregion of Y." [http://precedings.nature.com/documents/3495/version/1] comment: Example: amino_acid part_of polypeptide. subset: SOFA is_transitive: true [Typedef] id: partial_evidence_for_feature name: partial_evidence_for_feature namespace: sequence def: "B is partial_evidence_for_feature A if the extent of B supports part_of but not all of A." [SO:ke] is_a: evidence_for_feature ! evidence_for_feature [Typedef] id: position_of name: position_of namespace: sequence [Typedef] id: processed_from name: processed_from namespace: sequence def: "Inverse of processed_into." [http://precedings.nature.com/documents/3495/version/1] comment: Example: miRNA processed_from miRNA_primary_transcript. created_by: kareneilbeck creation_date: 2009-08-19T12:14:00Z [Typedef] id: processed_into name: processed_into namespace: sequence def: "X is processed_into Y if a region X is modified to create Y." [http://precedings.nature.com/documents/3495/version/1] comment: Example: miRNA_primary_transcript processed into miRNA. 
created_by: kareneilbeck creation_date: 2009-08-19T12:15:02Z [Typedef] id: recombined_from name: recombined_from namespace: sequence created_by: kareneilbeck creation_date: 2009-08-19T02:21:03Z [Typedef] id: recombined_to name: recombined_to namespace: sequence created_by: kareneilbeck creation_date: 2009-08-19T02:20:07Z [Typedef] id: sequence_of name: sequence_of namespace: sequence [Typedef] id: similar_to name: similar_to namespace: sequence subset: SOFA is_symmetric: true [Typedef] id: started_by name: started_by namespace: sequence def: "X is started_by Y if Y is part_of X and X and Y share a 5' boundary." [PMID:20226267] comment: Example: CDS started_by start_codon. created_by: kareneilbeck creation_date: 2010-10-14T01:43:55Z [Typedef] id: starts name: starts namespace: sequence def: "X starts Y if X is part_of Y, and X and Y share a 5' or N-terminal boundary." [PMID:20226267] comment: Example: start_codon starts CDS. created_by: kareneilbeck creation_date: 2010-10-14T01:47:53Z [Typedef] id: trans_spliced_from name: trans_spliced_from namespace: sequence created_by: kareneilbeck creation_date: 2009-08-19T02:22:14Z [Typedef] id: trans_spliced_to name: trans_spliced_to namespace: sequence created_by: kareneilbeck creation_date: 2009-08-19T02:22:00Z [Typedef] id: transcribed_from name: transcribed_from namespace: sequence def: "X is transcribed_from Y if X is synthesized from template Y." [http://precedings.nature.com/documents/3495/version/1] comment: Example: primary_transcript transcribed_from gene. created_by: kareneilbeck creation_date: 2009-08-19T12:05:39Z [Typedef] id: transcribed_to name: transcribed_to namespace: sequence def: "Inverse of transcribed_from." [http://precedings.nature.com/documents/3495/version/1] comment: Example: gene transcribed_to primary_transcript. created_by: kareneilbeck creation_date: 2009-08-19T12:08:24Z [Typedef] id: translates_to name: translates_to namespace: sequence def: "Inverse of translation_of."
[http://precedings.nature.com/documents/3495/version/1] comment: Example: codon translates_to amino_acid. created_by: kareneilbeck creation_date: 2009-08-19T12:11:53Z [Typedef] id: translation_of name: translation_of namespace: sequence def: "X is translation of Y if Y is translated by ribosome to create X." [http://precedings.nature.com/documents/3495/version/1] comment: Example: Polypeptide translation_of CDS. created_by: kareneilbeck creation_date: 2009-08-19T12:09:59Z [Typedef] id: variant_of name: variant_of namespace: sequence def: "A' is a variant (mutation) of A = definition every instance of A' is either an immediate mutation of some instance of A, or there is a chain of immediate mutation processes linking A' to some instance of A." [SO:immuno_workshop] comment: Added to SO during the immunology workshop, June 2007. This relationship was approved by Barry Smith.