2023-08-19 The Genomic Epidemiology Ontology aims to provide a comprehensive controlled vocabulary for infectious disease surveillance and outbreak investigations. It is an application ontology that draws on many other ontologies including anatomy, taxonomy, disease, symptoms, environment and food types for foodborne pathogen metadata. Genomic Epidemiology Ontology GENEPIO metadata section term as in existing standard term source The user interface label is the label that should be placed on a datum when presented on a form or report Damion Dooley user interface label user interface definition A user interface help annotation is a textual phrase to display with an entity on a form or report that provides some detail about what the entity is or how it is being used. Damion Dooley user interface help true Damion Dooley user interface hidden true A user interface feature is a pre-set list of features and their acceptable values that a user interface rendering system should use to display an entity on a form or report Damion Dooley user interface feature This annotation can be used to select the value specification that a referenced entity or specification component has for data entry. Can be scalar value specification / categorical tree specification etc. Damion Dooley obsolete: user interface value specification true UI regex normalize Damion Dooley user interface regular expression normalize UI regex validate Damion Dooley user interface regular expression validate UI regex format Damion Dooley user interface regular expression display format label see also part of Damion comments: In GenEpiO, if an entity has more than one value specification (in a conjunction), then it means the entity is a complex entity that can have a value for each specification. Minimum data requirements can be spelled out using some, min 1, max 10 etc. restrictions.
realized in obsolete has string specification of true T1 after t2 iff:= t2 before_or_simultaneous_with t1 and not (t1 simultaneous_with t2) Damion Dooley Damion Dooley's note: Intention is to have it be the inverse of "before". This allows axioms like "B after A" to be placed in class B, rather than having to phrase them as "A before B" in class A. I believe time dependency relations (conditions) don't imply existence by themselves. If B after A, and B exists, this doesn't necessarily mean that A exists, only that if A does exist, it must be before B. To add that extra existence implication probably requires the "causally downstream of" and "immediately causally downstream of" RO relations, in a closed system where B cannot exist any other way. after obsolete: has string value specification true obsolete: has categorical specification of true obsolete: has categorical value specification true has unit is about specifies inheres in has participant location of located in aligned with T1 before t2 iff:= t1 before_or_simultaneous_with t2 and not (t1 simultaneous_with t2) has component adjacent to has input has input has output has output member of has member output of link two concepts, indicating a high degree of confidence that the concepts can be used interchangeably across a wide range of information retrieval applications. skos:exactMatch is a transitive property, and is a sub-property of skos:closeMatch. exactMatch An object relation between a LinkML SlotDefinition and a LocalName. An abstract class is a high level class or slot that is typically used to group common slots together and cannot be directly instantiated. Notes and comments about an element intended for external consumption. Date/time at which the element was created. Not Applicable; Missing; Not Collected; Not Provided; Restricted Access A value (string or URI) indicating the data collection state an instance of a slot has.
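The "after" and "before" definitions given earlier in this section can be sketched in code. This is only an illustration: it assumes time points are plain comparable numbers, and the function names are hypothetical, not identifiers from GenEpiO or RO.

```python
# Minimal sketch of the "before" / "after" definitions, assuming that
# time points are comparable numbers (an assumption for illustration).

def before_or_simultaneous_with(t1, t2):
    """True when t1 is no later than t2."""
    return t1 <= t2

def simultaneous_with(t1, t2):
    """True when t1 and t2 denote the same time point."""
    return t1 == t2

def before(t1, t2):
    # t1 before t2 iff t1 before_or_simultaneous_with t2
    # and not (t1 simultaneous_with t2)
    return before_or_simultaneous_with(t1, t2) and not simultaneous_with(t1, t2)

def after(t1, t2):
    # t1 after t2 iff t2 before_or_simultaneous_with t1
    # and not (t1 simultaneous_with t2)
    return before_or_simultaneous_with(t2, t1) and not simultaneous_with(t1, t2)
```

Under these assumptions after(b, a) holds exactly when before(a, b) holds, which matches the stated intention that "after" be the inverse of "before". Neither function says anything about whether the other event exists.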
A description of the element's purpose and use A list of terms from different schemas or terminology systems that have identical meaning. Id of the schema that defined the element. The imports entry that this element was derived from. Empty means primary source The ncname of the source of the name. 2021-04-26T22:15:07Z local_name_source A name assigned to an element in a given ontology. For slots with ranges of type number, the value must be equal to or lower than this For slots with ranges of type number, the value must be equal to or higher than this true means that slot can have more than one value The unique name of the element within the context of the schema. Name is combined with the default prefix to form the globally unique subject of the target class. The string value of the slot must conform to this regular expression A list of possible values for a slot range. true if this slot is not required, but is recommended. true means that the slot must be present in the loaded definition. A reference. IAO:0000122 ('ready for release') status of the element. The uri that defines the possible values for the type definition URI is typically drawn from the set of URIs defined in OWL (https://www.w3.org/TR/2012/REC-owl2-syntax-20121211/#Datatype_Maps) Description of what the value is doing. The identifier of a "value set" -- a set of identifiers that form the possible values for the range of a slot A persistent, unique identifier of a molecular sequence database entry. Damion Dooley EDAM sequence accession The name of a field in a database.
database field name 1 1 Damion Dooley CLSI NCCLS National Committee on Clinical Laboratory Standards (NCCLS) On January 1, 2005 the National Committee on Clinical Laboratory Standards (NCCLS) changed its name to CLSI depth:1 order: ARO:3004303 # nonsensitive ARO:3004300 # intermediate ARO:3004302 # sensitive ARO:3004304 # sensitive - dose dependant ARO:3004301 # resistant http://purl.obolibrary.org/obo/PATO_0000001 http://purl.obolibrary.org/obo/CHEBI_50906 Note that this can refer to a material entity like an organ or muscle, or an immaterial entity (a site or fiat boundary) like skin, stomach cavity or lung surface. Trd deoxyribonucleic acid STR Sm Stm http://purl.obolibrary.org/obo/CHEBI_33281 obsolete: antibiotic true Am Amk Amx PAS Rif Pto CAP Cm Clr Cfz CYC Cs pyrazinamide Pza E ETB Emb ETH Eto Currently "K antigen" is a CHEBI_73772 chemical entity rather than a role. K antigen Inh isoniazid Kan Lfx Lzd MXF Mfx Mfx https://www.ncbi.nlm.nih.gov/books/NBK247415/bin/part3-m17.pdf Bdq Damion Dooley's note: 'K antigen' can't be a subclass of antigen as chemical entity is disjoint from role. Snomed: Concept 260824002 Ofx specimen extraction matrix "This observable is important where process of capturing sample can affect dna extraction." http://purl.obolibrary.org/obo/OGMS_0000031 state / province / territory / region lookup:http://purl.obolibrary.org/obo/GAZ_00000448 Transmission of a disease agent (infectious pathogen, toxic chemical, etc.) from a source that is common to those who acquire the disease. Common vehicles include air, water, food, injected substances. Legionellosis is an example of common vehicle spread in air that has passed through air conditioning equipment contaminated by the causal organism. HIV disease and hepatitis B and C can be spread among illicit drug users by the common vehicle of contaminated needles and syringes. Cholera and many other waterborne diseases are spread by the common vehicle of contaminated water. 
common vehicle transmission A role that inheres in a material that is input as the subject of interest in a scientific technique - either as an entity about which data is generated in an assay or study, or an entity that is transformed or modified in a material processing technique (e.g. the source from which a biological sample is taken) experimental subject role depth:1 https://en.wikipedia.org/wiki/Medical_state condition medical condition medical state Damion Dooley note: GenEpiO includes various health status descriptors here; many are annotated as included in the American AHA HIPPA or United Kingdom NHS terminologies. The challenge is to map them or order them in such a way that merged data can be analyzed. http://semanticscience.org/resource/SIO_010057 Condition Condition The process of drawing in by breathing. http://purl.obolibrary.org/obo/ExO_0000057 inhalation Redox potential, measured relative to a hydrogen cell, indicating oxidation or reduction potential MIxS A symptom onset date is a date-time entity that marks the start of one or more reported symptoms pertaining to an episode of human or animal illness Damion Dooley symptom onset date A substance, usually composed primarily of carbohydrates, fats, water and/or proteins, that can be eaten or drunk by an animal or human being for nutrition or pleasure. Damion Dooley foodon product type lookup lookup lookup food packing medium [Oceans and Seas] Census Regions and Divisions are groupings of States that subdivide the United States. 
Damion Dooley United States Census Bureau http://www.census.gov/econ/census/help/geography/regions_and_divisions.html A census region in the United States of America Damion Dooley A census region in the United States of America Damion Dooley A census region in the United States of America Damion Dooley A census region in the United States of America Damion Dooley directly governed city (North Korea) GSCID-BRC standard The GSCID-BRC Project and Sample Application Standard captures standardized human pathogen and vector sequencing metadata to support epidemiologic and genotype-phenotype association studies for human infectious diseases. The NIAID page for specifications has a broken link for Core Project. This is the existing page: https://www.niaid.nih.gov/research/dmid-metadata-standards-core-project Link page is: https://www.niaid.nih.gov/research/human-pathogen-and-vector-sequencing-metadata-standards Damion Dooley This standard was developed by representatives of the Genome Sequencing Centers for Infectious Diseases (GSCIDs), the Bioinformatics Resource Centers (BRCs), and NIAID and informed by discussions and input with numerous collaborating scientists. GSCID-BRC Project and Sample Application Standard MIxS standard MixS is a unified standard for describing sequence data provided by the Genomic Standards Consortium (GSC) Damion Dooley Minimum Information about any (x) Sequence (MIxS) MIMARKS standard The "minimum information about a marker gene sequence" (MIMARKS) is a standard developed by the Genomic Standards Consortium (GSC) for reporting marker gene sequences. http://www.nature.com/nbt/journal/v29/n5/full/nbt.1823.html Minimum information about a marker gene sequence "other (metadata choice)" indicates that for the given categorical variable, a respondent has chosen an other or additional response not listed. This class cannot be referred to directly as a subclass of more than one class without involving inference from the respective classes. 
Instances of it may exist as a kind of metadata. Usually a reference to a parent categorical class, but not a subordinate, carries the same information. Damion Dooley other (metadata value) other relevance other Other: specify the sample scope that was used. Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ other specimen scope other material Other: specify the material that was used. Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ other material derived from specimen "other capture type" is a free text description provided to indicate a target capture specification not covered by the other types. Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ other capture type Specify the project method Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ other project method This is an organizational category for grouping parameters relevant to a biomedical sample. Damion Dooley obsolete: draft GenEpiO BioSample standard true An NCBI antibiogram data item is an antimicrobial susceptibility and resistance datum related to drug resistant pathogens. This information is provided within an antibiogram table on BioSample records. Damion Dooley obsolete: NCBI antibiogram data item true HIPPA PHI guideline The American HIPPA Protected Health Information (PHI) guidelines cover information, including demographic information, which relates to: - the individual’s past, present, or future physical or mental health or condition, - the provision of health care to the individual, or - the past, present, or future payment for the provision of health care to the individual, and that identifies the individual or for which there is a reasonable basis to believe can be used to identify the individual. Protected health information includes many common identifiers (e.g., name, address, birth date, Social Security Number) when they can be associated with the health information listed above." 
https://en.wikipedia.org/wiki/Health_Insurance_Portability_and_Accountability_Act American HIPPA Protected Health Information (PHI) guidelines The natural (as opposed to laboratory) full scientific taxonomic name of a subject related to a given investigation, study and/or specimen. Damion Dooley http://purl.obolibrary.org/obo/GENEPIO_0001567 obsolete: subject species true http://purl.obolibrary.org/obo/NCIT_C45908 NCBI BioSample: intersex obsolete: intersex true Isolation date inferred from BCCDC interviews date the strain was isolated Damion Dooley http://purl.obolibrary.org/obo/GENEPIO_0000021 obsolete: isolation date true Collection Culture isolation date suggested in metadata feedback year/month/day A culture isolation date is a date-time entity marking the end of a process in which a sample is isolated as a single colony or non-mixture culture Damion Dooley GROUP: IRIDA Ontology (Morag) GENEPIO culture isolation date Collection Frozen date NML Labware LIMS year/month/day A culture frozen date is a date-time entity marking the beginning of a process in which a sample culture is frozen for preservation. Damion Dooley GROUP: IRIDA Ontology (Emma) GENEPIO culture frozen date Collection Received date NML Labware LIMS year/month/day date lab received isolate NML LIMS GENEPIO isolate received date Collection Upload date NML Labware LIMS month/day/year date the isolate was entered into the database NML LIMS PulseNet: UploadDate GENEPIO isolate upload date Pathogen Isolate inferred from BCCDC interviews gut, spine, tongue, lung Name of body site where the specimen was obtained from, such as a specific organ or tissue. Damion Dooley's note: Although term mentions "site" in label, this cannot be placed under "site" as it is intended to refer to organism material, and leads to unsatisfiable terms otherwise. 
GROUP: MIxS GENEPIO subject anatomical site http://gensc.org/ns/mixs/host_body_site depth:1 order: NCIT:C115935 # healthy NCIT:C25610 # pathologic NCIT:C28554 # deceased diseased Health or disease status of a given subject at time of specimen collection. Currently this does not differ from the 'subject health status (GSCID-BRC)' item. It may be revised to have more options in the future. Damion Dooley subject health status at time of specimen collection 1 1 Environmental Isolate sediment, chicken feces, cheese Describes the physical, environmental and/or local geographical source of the biological sample from which the sample was derived. Damion Dooley NCBI Biosample specimen source context Pathogen Isolate feces, cerebral spinal fluid (CSF) Substance produced by the body, e.g. stool, mucus, where the specimen was obtained from. Damion Dooley subject body product Environmental Isolate Geographic location depth GROUP: MIxS value (5m) Please refer to the definitions of depth in the environmental packages GROUP: MIxS GENEPIO collection depth datum The environmental material level refers to the material that was displaced by the sample, or material in which a sample was embedded, prior to the sampling event. Environmental material terms are generally mass nouns. Examples include: air, soil, or water. EnvO (v 2013-06-14) terms can be found via the link: www.environmentontology.org/Browse-EnvO http://purl.obolibrary.org/obo/ENVO_00010483 obsolete: specimen source environmental material true A subject sex is the phenotypic sex of given subject (human or animal) related to a given investigation, study, and/or specimen. Damion Dooley Gender is not used because it is a term referencing human social and cultural convention. 
subject sex specification Environmental Isolate Land use where sample was taken Environment Canada Metadata descriptive; type of human use (park, farm, urban) Human use of land involving the management and modification of natural environment or wilderness into built environment. URI: http://en.wikipedia.org/wiki/Land_use GENEPIO specimen collection site land use Environmental Isolate Fecal Indicator Bacteria Number Environment Canada Metadata 12,456 cfu Number of indicator micro-organisms (colony forming units) present in a sample that have been used to suggest the presence of pathogens. URI: http://www.who.int/water_sanitation_health/dwq/iwachap13.pdf GENEPIO fecal indicator bacteria count Environmental Isolate Conductivity GROUP: MIxS milliSiemens per centimeter electrical conductivity of water GROUP: MIxS enviro package GENEPIO conductivity measurement datum Environmental Isolate Dissolved oxygen GROUP: MIxS micromole per kilogram concentration of dissolved oxygen GROUP: MIxS enviro package GENEPIO dissolved oxygen concentration GMI MDM standard The Global Microbial Identifier "Minimum data for Matching" (MDM) standard is a standard that defines essential contextual data fields to be included in genomic sequence repository records. Damion Dooley the Global Microbial Identifier Minimum Data for Matching (MDM) Standard http://www.globalmicrobialidentifier.org/-/media/Sites/gmi/News-and-events/2013/6th-meeting-2013-report.ashx?la=da Environmental Isolate Soluble inorganic material GROUP: MIxS soluble organic material name;measurement value concentration of substances such as ammonia, road-salt, sea-salt, cyanide, hydrogen sulfide, thiocyanates, thiosulfates, etc. 
GROUP: MIxS enviro package GENEPIO soluble inorganic material Environmental Isolate Soluble inorganic material GROUP: MIxS GENEPIO soluble inorganic material concentration Environmental Isolate Nitrite GROUP: MIxS micromole per liter concentration of nitrite GROUP: MIxS enviro package GENEPIO nitrite concentration Environmental Isolate Nitrate GROUP: MIxS micromole per liter concentration of nitrate GROUP: MIxS enviro package GENEPIO nitrate concentration Environmental Isolate Total phosphorous GROUP: MIxS micromole per liter total phosphorus concentration, calculated by: total phosphorus = total dissolved phosphorus + particulate phosphorus. Can also be measured without filtering, reported as phosphorus GROUP: MIxS enviro package GENEPIO total phosphorous concentration 12 Environmental Isolate Stream order Environment Canada Metadata value, 1-12 Stream order is a measure of the relative size of streams (The smallest tributaries are referred to as first-order streams, while the largest river in the world, the Amazon, is a twelfth-order waterway). 
GROUP: Environment Canada Metadata GENEPIO stream order categorical measurement datum Environmental Isolate Density of bacteria in sample Environment Canada Metadata OD reading or cfu's or cells per millilitre number of bacteria in sample GROUP: Environment Canada Metadata GENEPIO bacteria density Lab Analytic Host Primary Enzyme PulseNet Data Capture pick list (XbaI, BlnI) Restriction enzyme for first characterization GROUP: IRIDA Ontology (Emma) primary enzyme (LMAAI) 1 1 GENEPIO PFGE test specification Lab Analytic Host Primary PFGE Pattern NML LIMS First PFGE pattern from given diagnostic restriction enzyme GROUP: IRIDA Ontology (Emma) GENEPIO PFGE primary test Lab Analytic Host Secondary Enzyme PulseNet Data Capture pick list (XbaI, BlnI) Restriction enzyme for second characterization GROUP: IRIDA Ontology (Emma) GENEPIO secondary enzyme (LMACI) Lab Analytic Host Secondary PFGE Pattern NML LIMS Second PFGE pattern from given diagnostic restriction enzyme GROUP: IRIDA Ontology (Emma) GENEPIO PFGE secondary test Lab Analytic MLST Clonal Complex inferred from BCCDC interviews and Bionumerics pick list? Multilocus sequence typing (MLST) is a technique in molecular biology for the typing of multiple loci. The procedure characterizes isolates of microbial species using the DNA sequences of internal fragments of multiple housekeeping genes. Sequence types are grouped into clonal complexes by their similarity to a central allelic profile (genotype). As such, clonal complexes represent sequence types that share a number of identical alleles e.g. 5/7 URI: http://en.wikipedia.org/wiki/Multilocus_sequence_typing ; http://eburst.mlst.net/3.asp GENEPIO MLST clonal complex Lab Analytic MLST Sequence Type inferred from BCCDC interviews and Bionumerics pick list? Multilocus sequence typing (MLST) is a technique in molecular biology for the typing of multiple loci. 
The procedure characterizes isolates of microbial species using the DNA sequences of internal fragments of multiple housekeeping genes. For each housekeeping gene, the different sequences present within a bacterial species are assigned as distinct alleles and, for each isolate, the alleles at each of the loci define the allelic profile or sequence type (ST). GENEPIO MLST sequence typing The process of fingerprinting the core genome of a bacterium. Add under upcoming OBI "DNA fingerprinting assay". Damion Dooley Some bacteria like E. coli have a highly variable genome, and so to type them methods must focus on typing the core genome that is common to all strains. GENEPIO core genome fingerprinting assay Lab Analytic CGF type inferred from BCCDC interviews pick list output of CGF GROUP: IRIDA Ontology (Emma) GENEPIO CGF type Lab Analytic 16S rRNA sequencing inferred from BCCDC interviews 16S ribosomal RNA sequencing is a sequencing method used to identify and compare bacteria present within a given sample. 16S rRNA gene sequencing is a well-established method for studying phylogeny and taxonomy of samples from complex microbiomes or environments that are difficult or impossible to study. URI: http://www.illumina.com/applications/microbiology/microbial-sequencing-methods/16S-rrna-sequencing.html GENEPIO 16S rRNA sequencing Lab Analytic Stx toxin type metadata feedback Stx1 or Stx2 pathogenic shiga toxin produced by STEC E. coli GROUP: IRIDA Ontology (Emma) GENEPIO Stx toxin type datum Lab Analytic Stx toxin subtype metadata feedback stx1a, stx1c shiga toxin variant GROUP: IRIDA Ontology (Emma) GENEPIO Stx toxin subtype datum Lab Analytic Stx1 Toxin PCR result NML Labware LIMS CT value?
qPCR result of Stx1 gene amplification GROUP: IRIDA Ontology (Emma) GENEPIO Stx1 Toxin PCR result Lab Analytic Stx1 Cell Culture NML Labware LIMS level of Stx1 toxin derived tissue culture toxicity GROUP: IRIDA Ontology (Emma) GENEPIO Stx1 cell culture level Lab Analytic Stx2 Toxin PCR result NML Labware LIMS CT value? qPCR result of Stx2 gene amplification GROUP: IRIDA Ontology (Emma) GENEPIO Stx2 Toxin PCR result Lab Analytic Stx2 Cell Culture NML Labware LIMS value? level of Stx2 toxin derived tissue culture toxicity GROUP: IRIDA Ontology (Emma) GENEPIO Stx2 cell culture level An AccuProbe assay that uses a luminometer and DNA probe designed to identify a specific bacterial or fungal species within a culture. Damion Dooley Nucleic acid hybridization tests are based on the ability of complementary nucleic acid strands to specifically align and associate to form stable double-stranded complexes (6). The ACCUPROBE method uses a single-stranded DNA probe with a chemiluminescent label that is complementary to the ribosomal RNA of the target organism. After the ribosomal RNA is released from the target organism, the labeled DNA probe combines with the target organism’s ribosomal RNA to form a stable DNA:RNA hybrid. The Selection Reagent allows for the differentiation of non-hybridized and hybridized probes. The light signal emitted by the DNA:RNA hybrids is measured by a GEN-PROBE luminometer. A positive result is a luminometer reading equal to or greater than the cut-off. A value below this cut-off is the negative result. GENEPIO AccuProbe culture identification assay An AccuProbe reagent kit for identifying Listeria monocytogenes. Damion Dooley URI: http://www.hologic.com/sites/default/files/package%20inserts/103051F-EN-RevC.pdf GENEPIO AccuProbe Listeria monocytogenes culture identification reagent kit Lab Analytic NML Labware LIMS positive, negative, not done A categorical diagnostic result of an AccuProbe test.
NML LIMS GENEPIO AccuProbe test result Ribotyping involves the fingerprinting of genomic DNA restriction fragments that contain all or part of the genes coding for the 16S and 23S rRNA. By digesting the genes with a specific restriction enzyme, fragments of different lengths are generated. By performing a Gel electrophoresis with the digested samples, the fragments can be visualised as lines on the gel, where larger fragments are close to the start of the gel, and smaller fragments further down. After blotting onto a matrix and probing, these lines form a unique pattern for each species and can be used to identify the origin of the DNA, almost like a barcode can identify a product. GENEPIO ribotyping This method is based on restriction endonuclease digestion of bacterial chromosomal DNA, followed by Southern hybridization to probes for sequences in the regions of bacterial DNA coding for the 5S-16S-23S (the Escherichia coli rrnB rRNA operon) rRNA operon. The probes have been developed that are directed to highly conserved regions of the rRNA operon present in all eubacteria and can therefore be used for ribotyping most bacteria GENEPIO riboprinting an automated system that takes a purified bacterial suspension, lyses the cells, extracts the DNA, restriction endonuclease digests the DNA, separates the digest on a gel, transfers the DNA bands to a membrane, probes the bands with non-radioisotope-labeled, 5S-16S-23S rRNA-specific probes (Southern hybridization), photographs the membrane, and finally compares the bar code-like pattern to databases in order to identify the genus and species. 
GENEPIO Qualicon (DuPont) RiboPrinter Microbial Characterization System a library of RiboPrinter recognised patterns GENEPIO DuPont identification pattern library Lab Analytic Riboprinter DUP Number NML Labware LIMS DUP-PST1-1211 DuPont identification number from the DuPont identification library NML LIMS GENEPIO RiboPrinter DUP Number Lab Analytic Riboprinter DUP similarity NML Labware LIMS The identification of an isolate was determined when the corresponding patterns matched one of the patterns of the DuPont Identification library of the RiboPrinter with a similarity of >0.85. The similarity threshold for an isolate joining a ribogroup is an adaptive value between 0.90 and 0.96, depending on the size of the ribogroup. Similarity of pattern to one of the patterns of the DuPont Identification library of the RiboPrinter GENEPIO RiboPrinter DUP similarity Sequencing Sequencing Run Date NML NGS Archive yyyy/mm/dd Date the sequencing run was performed GROUP: IRIDA Ontology (Emma) GENEPIO sequencing run date Sequencing Sequencing Location inferred from BCCDC interviews location where the sequencing run was performed GROUP: IRIDA Ontology (Emma) GENEPIO sequencing location Sequencing Platform https://www.ncbi.nlm.nih.gov/books/NBK54984/table/SRA_Glossary_BK.T._platform_descriptor_t/ Illumina (MiSeq, HiSeq), PacBio, Ion Torrent, (Roche 454), SOLiD ABI A sequencing platform (brand) is a name of a company that produces sequencer equipment. GROUP: IRIDA Ontology (Emma) GENEPIO sequencing platform (brand) Sequencing Workflow Derived from Sample Sheet Resequencing, FastQ only the workflow is the pre-defined sequence of steps to run the automated sequencing pipeline.
GROUP: IRIDA Ontology (Emma) GENEPIO sequencing workflow Sequencing Application Derived from Illumina Sample Sheet amplicon sequencing, WGS the sequencing application is the use or purpose of the sequencing GROUP: IRIDA Ontology (Emma) GENEPIO sequencing application Derived from Illumina Sample Sheet the sequencing conditions and the analyte being measured GROUP: IRIDA Ontology (Emma) obsolete: sequencing assay true Sequencing Chemistry Derived from Illumina Sample Sheet amplicon, resequencing (WGS) lab method to determine the order of nucleotides in a DNA molecule GROUP: IRIDA Ontology (Emma) GENEPIO sequencing chemistry Sequencing Read Length Derived from Illumina Sample Sheet 151, 251 number of base pairs per read GROUP: IRIDA Ontology (Emma) GENEPIO read length A GSCID/BRC data item is a field in one of the GSCID/BRC Project and Sample Application Standard subsets. Damion Dooley obsolete: GSCID-BRC data item true Sequencing Multiplex identifiers Derived from Illumina Sample Sheet S017 Molecular barcodes, called Multiplex Identifiers (MIDs), that are used to specifically tag unique samples in a sequencing run. Sequence should be reported in uppercase letters GROUP: MIxS GENEPIO sequencing run multiplex identifiers Sequencing Sample Name Derived from Illumina Sample Sheet specification: http://support.illumina.com/content/dam/illumina-support/documents/documentation/system_documentation/miseq/miseq-sample-sheet-quick-ref-guide-15028392-j.pdf 1,2,3, AX234, etc. Control1, Tank1, Froglet1 A sequencing run sample identifier is an alphanumeric identifier for a sample. In bioinformatics processing this identifier is assigned to a sample in order to track it through the process of sequencing and analysis. 
GROUP: IRIDA Ontology (Emma) Illumina SampleSheet: Sample_ID PulseNet: SubmittedNumber sample name sample title GENEPIO sequencing run sample identifier Sequencing Number of cycles Derived from Illumina Sample Sheet 500 Number of sequencing chemistry cycles GROUP: IRIDA Ontology (Emma) GENEPIO sequencing count of chemistry cycles Sequencing Sequencing Kit Derived from Illumina Sample Sheet MiSeq reagent kit Pre-filled, ready-to-use reagent cartridges. Used to produce improved chemistry, cluster density and read length as well as improve quality (Q) scores. Reagent components are encoded to interact with the sequencing system to validate compatibility with user-defined applications. GROUP: IRIDA Ontology (Emma) GENEPIO sequencing kit Sequencing Sequencing Kit version Derived from Illumina Sample Sheet v2, v3 A string datum which is the version/configuration of the reagent cartridge. GROUP: IRIDA Ontology (Emma) GENEPIO sequencing kit version Sequencing Adapters Derived from Illumina Sample Sheet adapter A and B sequence Adapters provide priming sequences for both amplification and sequencing of the sample-library fragments. Both adapters should be reported, in uppercase letters GROUP: MIxS GENEPIO adapter sequence Sequencing Adapter Trimming Derived from Illumina Sample Sheet The removal of adapter sequences from demultiplexed data reads GROUP: IRIDA Ontology (Emma) GENEPIO read adapter trimming Sequencing inferred from BCCDC interviews Illumina metagenomic protocol (part 15044223), with Nextera Packaged kits (containing adapters, indexes, enzymes, buffers etc), tailored for specific sequencing workflows, which allow the simplified preparation of sequencing-ready libraries for small genomes, amplicons, and plasmids.
GROUP: IRIDA Ontology (Emma) and http://applications.illumina.com/applications/sequencing/ngs-library-prep.html library preparation kit Sequencing Number of base pairs inferred from BCCDC interviews 300 000 The estimated size of the genome prior to sequencing. Of particular importance in the sequencing of (eukaryotic) genomes which could remain in draft form for a long or unspecified period. GROUP: MIxS GENEPIO number of expected base pairs per genome Run QC Number of Reads Passing Filter Output from MiSeq 0.92 Raw data are filtered to remove any reads that do not meet the overall quality as measured by a chastity filter. The chastity of a base call is calculated as the ratio of the brightest intensity divided by the sum of the brightest and second brightest intensities. Clusters passing filter are represented by PF in analysis reports. Clusters pass filter if no more than one base call in the first 25 cycles has a chastity of < 0.6. URI: http://support.illumina.com/sequencing/sequencing_instruments/miseq/questions.html GENEPIO number of reads passing filter Run QC Cluster density Output from MiSeq 800 k/mm^2 Cluster Generation is a process by which libraries are amplified into clonal clusters on a flow cell. The density of those clusters in the lane is the cluster density. URI: https://www.broadinstitute.org/files/shared/illuminavids/clusterGenSlides.pdf sequencer flow cell cluster density Run QC above 30 Qscore Output from MiSeq >75% percentage of reads with a Phred quality score over 30, which indicates less than a 1/1000 chance that the base was called incorrectly URI: http://en.wikipedia.org/wiki/Phred_quality_score GENEPIO phred quality score Assembly QC Assembly Standard Term the GROUP: MIxS term encompasses assembly method; estimated error rate; method of calculation Assembly method refers to how the reads were assembled into contigs for which either a de novo or mapping (reference based) strategy is used.
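The chastity filter and Phred score described in the run QC entries above can be sketched as follows. This is a hedged illustration, not Illumina's implementation: the function names and the list-of-intensities input shape are assumptions, and intensities are assumed to be positive. The Phred relation used is the standard Q = -10 log10(P).

```python
# Illustrative sketch only; names and data layout are hypothetical.

def chastity(intensities):
    """Brightest intensity divided by the sum of the brightest and
    second brightest intensities for one base call."""
    brightest, second = sorted(intensities, reverse=True)[:2]
    return brightest / (brightest + second)

def passes_filter(cycles, threshold=0.6, window=25, max_failures=1):
    """A cluster passes if no more than one base call in the first
    25 cycles has a chastity below 0.6."""
    failures = sum(1 for c in cycles[:window] if chastity(c) < threshold)
    return failures <= max_failures

def phred_error_probability(q):
    """Probability that a base call is wrong for Phred score q,
    using Q = -10 * log10(P)."""
    return 10 ** (-q / 10)
```

For example, a cluster whose four channel intensities at a cycle are 100, 20, 5 and 5 has a chastity of 100 / 120 at that cycle, and a Phred score of 30 corresponds to an error probability of about 1 in 1000.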
GROUP: MIxS GENEPIO assembly method Assembly QC Assembly status from within "Finishing strategy" in GROUP: MIxS standard draft or finished "Attempting to convey the relative integrity and reliability of the data, six levels or terms have been proposed and applied to describe genome sequences." See subclass choices. Damion Dooley CITATION: Chain, P.S., Grafham, D.V., Fulton, R.S., Fitzgerald, M.G., Hostetler, J., Muzny, D., Ali, J., Birren, B., Bruce, D.C., Buhay, C., et al. (2009). Genomics. Genome project standards in a new era of sequencing. Science 326, 236-237. GENEPIO assembly status Assembly QC Coverage from within "Finishing strategy" in GROUP: MIxS 30 Coverage (read depth or depth) is the average number of reads representing a given nucleotide in the reconstructed sequence. URI: http://en.wikipedia.org/wiki/Shotgun_sequencing GENEPIO Damion Dooley's note: Wikipedia defines calculation with: "the length of the original genome (G), the number of reads(N), and the average read length(L) as N * L/G" read coverage Assembly QC Number of Contigs 20 http://purl.obolibrary.org/obo/IAO_0000428 A contig count is the count of contigs that belong to a sequence assembly. 
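The read coverage note above gives the calculation as N * L / G. A minimal Python sketch of that formula follows; the function name, parameter names and the sample figures are hypothetical and chosen only to make the arithmetic easy to check.

```python
def read_coverage(num_reads, avg_read_length, genome_length):
    """Average read depth, computed as N * L / G."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return num_reads * avg_read_length / genome_length


# Hypothetical run: 100,000 reads of average length 150 over a 500,000 bp genome.
print(read_coverage(100_000, 150, 500_000))  # prints 30.0
```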
Damion Dooley GENEPIO contig count Assembly QC Assembly name Standard Term name and version of assembly Name/version of the assembly provided by the submitter that is used in the genome browsers and in the community GROUP: MIxS GENEPIO assembly name (identifier) Assembly QC Annotation Algorithm Standard Term Prokka Program used for annotation GROUP: IRIDA Ontology (Emma) GENEPIO genome annotation algorithm Assembly QC Annotation Source Standard Term annotation source description For cases where annotation was provided by a community jamboree or model organism database rather than by a specific submitter GROUP: MIxS GENEPIO genome annotation source Damion Dooley obsolete: NCBI BioSample data item true CDC ID Damion Dooley Damion Dooley An NCBI BioProject data item is a datum within the NCBI BioProject standard collection Damion Dooley obsolete: NCBI BioProject data item true The INDISC consortium "Minimum data for Matching" standard aims to define a format to capture reads and minimum metadata. It is developed by the "Repository and storage of sequence and meta-data" workgroup 2. It is used in the Global Microbial Identifier platform. Damion Dooley http://purl.obolibrary.org/obo/GENEPIO_0000036 obsolete: INSDC MDM standard true A MIxS data item is a term or field defined in the Genomic Standards Consortium MIxS standard. Damion Dooley obsolete: MIxS data item true The 'draft data standard' class contains draft representations of formal or de facto third party standards and related components. These representations may have been crafted without any involvement of related standards issuers and so no guarantee can be provided about their currency, accuracy or veracity. Damion Dooley draft data standard obsolete: HIPPA PHI data item true obsolete: PulseNet data item true The I2B2 Workbench Data Protection role standard is a standard which names and lists data protection roles between a user and a given dataset that define the level of detail a user has access to.
Damion Dooley Derived from the Data Protection Track of the I2B2 Workbench software specification: https://www.i2b2.org/software/files/PDF/current/IM_Architecture.pdf I2B2 Data Protection role standard depth:1 order: NCIT:C115935 # healthy NCIT:C25610 # pathologic / diseased NCIT:C28554 # deceased A description of whether a given subject organism appeared healthy, sick or deceased at the time of specimen extraction. If sick or deceased, additional details should be provided in project-specific fields. Damion Dooley CS8 subject health status (GSCID-BRC) The GSCID-BRC Core Sample Standard is an extension of the GMI MDM data standard Damion Dooley https://www.niaid.nih.gov/research/dmid-metadata-standards-core-sample GSCID-BRC Core Sample standard A Core Project item is a field in the GSCID/BRC Project and Sample Application Core Project Standard. Damion Dooley https://www.niaid.nih.gov/research/dmid-metadata-standards-core-project GSCID-BRC Core Project standard Specimen category is a categorical variable that broadly indicates the source of a specimen in order for data processing systems like NCBI's biosample submission portal to anticipate other fields related to the source. Currently this distinguishes between "clinical or host associated" specimens and "environmental/food or other pathogen" specimens. Damion Dooley specimen category day, week, month, year Age unit of the age measurement of a subject (individual organism). Damion Dooley Feb 20, 2017: GSCID-BRC calls for an explicit field here whereas NCBI Biosample and MIxS don't. Damion Dooley NIAID GSCID-BRC metadata working group subject age - unit The National Center for Biotechnology Information offers standards for submitting genomic project data to their BioProject and BioSample databases. Damion Dooley National Center for Biotechnology Information standard NCBI standard A datum containing the city or region or more precise geographical location identifier for the site of a specimen collection event. 
Damion Dooley Adapted from NIAID GSCID-BRC metadata working group Note that this is expressed in a particular format in some standards. specimen collection location - city or region lookup The country of the site of a given specimen collection event. Damion Dooley NIAID GSCID-BRC metadata working group specimen collection location - country The Darwin Core is a body of standards. It includes a glossary of terms (in other contexts these might be called properties, elements, fields, columns, attributes, or concepts) intended to facilitate the sharing of information about biological diversity by providing reference definitions, examples, and commentaries. Damion Dooley http://tdwg.github.io/dwc/terms/index.htm Darwin Core Terms A datetime measurement related to a person's travel with respect to potential epidemiological implications. Damion Dooley travel history item The date of a person's departure from their primary residence (at that time) on a journey to one or more other locations. Damion Dooley travel start date The date of a person's return to some residence from a journey originating at that residence. Damion Dooley travel end date Infection acquired during travel is a boolean datum indicating a diagnosis of a traveller's infection during a given trip. Damion Dooley infection aquired during travel This is a model of the datums involved in an epidemiology investigation. Damion Dooley draft pathogenic epidemiology case investigation record The draft epi case exposure model describes datums related to the possible transmission of a pathogen between a case person and other people, places, animals etc. Damion Dooley draft pathogenic epi case exposure record A description of a food item that a case patient has come in contact with.
Damion Dooley draft pathogenic epi case food record food brought back from travel An epidemiology contact network model is a definition of the information necessary to connect pathogen transmission patterns between individuals or animals in connection with an outbreak investigation. Damion Dooley epidemiology contact network model An epi contact network - human is contact information about a person who may be involved in transmitting a pathogen. Damion Dooley epi contact network model - human This is a model that provides location and transportation information about an animal that may be involved in transmitting a pathogen to a human or other animal. Damion Dooley epi contact network model - animal A note is text that provides human-readable information on one or more subjects. Damion Dooley note An epi case general info model contains datum specifications about an epidemiology investigation case including the person involved, contact information and their demographics. Damion Dooley draft pathogenic epi case general information record A record of the details around how, when, and from what an isolate was extracted. Damion Dooley draft pathogenic epi case isolate detail record A specimen source substance is an organism substance or food product or environmental substance from which the specimen was extracted. Damion Dooley specimen source substance An isolate sequence filename is the file name (http://edamontology.org/data_1050) of the contig assembly file created by a bioinformatics assembly process. Damion Dooley isolate sequence filename A cluster identifier is an identifier that locates a genomic variant of an organism within a community of past or present genetically related organisms. Damion Dooley cluster identifier A laboratory test performed on a specimen from a patient with regard to potential pathogenic disease. 
Damion Dooley draft pathogenic epi case test record A lab test requestor is the individual or agency requesting that one or more tests be performed on a specimen. Damion Dooley contact specification - lab test requester A food cultural origin datum indicates the cultural origin of a food product. Damion Dooley food cultural origin A record of a patient's recent travel - departure and return, mode of travel and locations visited - as it may pertain to an outbreak pathogen transmission event. Damion Dooley draft pathogenic epi case travel record Region (England) (entry subject to GAZ replacement) An epi network contact is a person who may have received or transmitted an infectious disease related to an outbreak investigation. Damion Dooley contact specification - epi network contact This model contains data type specifications for whole genome sequencing and epidemiology investigation contextual data. Damion Dooley public health lab epidemiology contextual data standard This is a set of datums involved in laboratory processing of whole-genome-sequenced isolates. Damion Dooley draft whole genome sequencing lab test record Damion Dooley draft WGS lab bioinformatics process record Damion Dooley draft WGS lab sequencing record An NCBI SRA meta information item is a field specification for a NCBI Sequence Read Archive record. Damion Dooley NCBI SRA meta information standard https://www.ncbi.nlm.nih.gov/books/NBK47529/#_SRA_Quick_Sub_BK_Experiment_ An isolate raw read filename is the file name (http://edamontology.org/data_1050) of the raw read file created by a genomic sequencing assay. isolate raw read filename Sequencing The version identifier of a packaged kit tailored for specific sequencing workflows. Damion Dooley library preparation kit version The name and version of software used in a bioinformatics workflow to improve the quality of sequencing reads.
Damion Dooley read trimming and filtering software Damion Dooley bioinformatics pipeline name Damion Dooley bioinformatics pipeline version Damion Dooley bioinformatics pipeline protocol Damion Dooley draft WGS lab sequencing quality metrics record Damion Dooley draft WGS lab assembly quality metrics record A general label indicating the primary study goal. These are only relevant for Primary submission projects (not Umbrella projects). Damion Dooley Damion Dooley's note: According to http://trace.ddbj.nig.ac.jp/news/2014-11-12_e.html, "A BioProject record can have multiple project data types" project data type Genome assembly project utilizing already existing sequence data including data that was submitted by a different group Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ assembly project A sequencing project involving clone-ends Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ clone ends project A sequencing project involving DNA methylation, histone modification, and/or chromatin accessibility datasets Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ epigenomics project Exome resequencing project Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ exome resequencing project A whole, or partial, genome sequencing project (with or without a genome assembly) Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ genome sequencing project The Relational Sequencing TB Data Platform (ReSeqTB) catalogs a vast amount of genotypic, phenotypic and related metadata from Mycobacterium tuberculosis (Mtb) strains to enable the development of clinically useful, WHO-endorsed in vitro diagnostic assays for rapid drug susceptibility testing of Mtb. Damion Dooley https://platform.reseqtb.org/ draft ReSeq Tuberculosis data platform standard In expressions like: "<= 8mg/L" The measurement comparator is a sign indicating that a measurement is above, equal to, or below a given threshold. 
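An expression such as "<= 8mg/L" can be split into a comparator, a numeric threshold and a unit. The Python sketch below shows one way to do that for the five comparator symbols this vocabulary defines (<, <=, ==, >=, >). The function names, the regular expression and the notion of checking a value against the parsed threshold are assumptions made for illustration; they are not part of GenEpiO or the NCBI antibiogram specification.

```python
import operator
import re

# Comparator symbols mapped to Python comparison functions.
COMPARATORS = {
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
    ">=": operator.ge,
    ">": operator.gt,
}

# Two-character symbols are listed first so "<=" is not read as "<".
_EXPRESSION = re.compile(r"^\s*(<=|>=|==|<|>)\s*(\d+(?:\.\d+)?)\s*(.*?)\s*$")


def parse_threshold(expression):
    """Split e.g. '<= 8mg/L' into (comparator, threshold, unit)."""
    match = _EXPRESSION.match(expression)
    if match is None:
        raise ValueError(f"not a comparator expression: {expression!r}")
    comparator, threshold, unit = match.groups()
    return comparator, float(threshold), unit


def is_consistent(value, comparator, threshold):
    """True if `value` satisfies `comparator` relative to `threshold`."""
    return COMPARATORS[comparator](value, threshold)


comparator, threshold, unit = parse_threshold("<= 8mg/L")
print(comparator, threshold, unit)
print(is_consistent(4.0, comparator, threshold))
```

Keeping the comparator as a separate field, rather than folding it into the number, preserves the fact that the measurement was reported as a bound and not as an exact value.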
Damion Dooley test threshold measurement comparator < The "less than" comparator indicates that a given substance was present at less than a given quantity or concentration. Damion Dooley < less than <= The "less than or equal to" comparator indicates that a given substance was present at less than or equal to a given quantity or concentration. Damion Dooley <= less than or equal to == The "equal to" comparator indicates that a given substance was present at the given quantity or concentration. Damion Dooley == equal to >= The "greater than or equal to" comparator indicates that a given substance was present at greater than or equal to a given quantity or concentration. Damion Dooley >= greater than or equal to > The "greater than" comparator indicates that a given substance was present at greater than a given quantity or concentration. Damion Dooley > greater than order: OBI:0001616 # specimen identifier OBI:0001614 # GenBank ID GENEPIO:0001100 # antibiogram drug test model GENEPIO:0002062 # amr testing reference standard GENEPIO:0002045 # amr resistance testing method GENEPIO:0002047 # ... version or reagent GENEPIO:0002049 # amr testing platform GENEPIO:0002056 # ... vendor The NCBI antibiogram standard details the reporting specifications for antimicrobial susceptibility and resistance data derived from drug resistant pathogens. This information is submitted as an antibiogram table on BioSample records. Damion Dooley NCBI Antibiogram standard https://www.ncbi.nlm.nih.gov/biosample/docs/antibiogram/ http://purl.obolibrary.org/obo/GENEPIO_0001007 obsolete: NCBI isolate AMR testing model true A restaurant or an eatery, is a business which prepares and serves food and drinks to customers in exchange for money. 
Damion Dooley https://en.wikipedia.org/wiki/Restaurant http://purl.obolibrary.org/obo/ENVO_01000934 obsolete: restaurant true iPHIS standard The Canadian Integrated Public Health Information System (iPHIS) provides a standard for holding reportable disease cases from participating jurisdictions. Damion Dooley Canadian Integrated Public Health Information System (iPHIS) Standard 254532233433235152213423 The MIRU24 - international standard is a mycobacterial interspersed repetitive units (MIRU) typing method that classifies Mycobacterium tuberculosis complex (MBTC) bacteria according to a multiple locus VNTR [variable number of tandem repeats] analysis (MLVA) typing scheme of counts of repeats at 24 sequence loci. The ordering of loci is as follows: MIRU 04, MIRU 26, MIRU 40, MIRU 10, MIRU 16, MIRU 31, 424, 577, 2165, 2401, 3690, 4156, 2163, 1955, 4052, MIRU 02, MIRU 23, MIRU 39, MIRU 20, MIRU 24, MIRU 27, 2347, 2461, 3171 MIRU24 - international standard The MIRU24 - Canadian standard is a Mycobacterium tuberculosis typing method exactly like MIRU24 - international standard except that the report of the ordering of the matching loci is different. Damion Dooley The ordering of loci is as follows: MIRU 02, MIRU 04, MIRU 10, MIRU 16, MIRU 20, MIRU 23, MIRU 24, MIRU 26, MIRU 27, MIRU 31, MIRU 39, MIRU 40, 424, 577, 1955, 2163, 2165, 2347, 2401, 2461, 3171, 3690, 4052, 4156 MIRU24 - Canadian standard spray-wading water pool water obsolete: hot tub true A cafeteria is a type of food service location in which there is little or no waiting staff table service, whether a restaurant or within an institution such as a large office building or school. Damion Dooley https://en.wikipedia.org/wiki/Cafeteria http://purl.obolibrary.org/obo/ENVO_01000969 obsolete: cafeteria true http://purl.obolibrary.org/obo/ENVO_00003864 obsolete: bakery true A delicatessen or deli is a retail establishment that sells a selection of unusual or foreign prepared foods.
Damion Dooley https://en.wikipedia.org/wiki/Delicatessen http://purl.obolibrary.org/obo/ENVO_01000970 obsolete: delicatessen true A food kiosk is a kiosk (a booth with an open window on one side) that sells food. Damion Dooley http://purl.obolibrary.org/obo/ENVO_01000974 obsolete: food kiosk true A grocery store is a retail store that primarily sells food. Damion Dooley http://purl.obolibrary.org/obo/ENVO_01000984 obsolete: grocery store true A specialty/ethnic food store is a store specializing in a particular variety of food or food of cultural / regional origin. Damion Dooley http://purl.obolibrary.org/obo/ENVO_01000988 obsolete: specialty/ethnic store true A market, or marketplace, is a location where people regularly gather for the purchase and sale of provisions, livestock, and other goods Damion Dooley http://purl.obolibrary.org/obo/ENVO_01000987 obsolete: market true A time measurement datum that pertains to a patient's medical treatment. Damion Dooley treatment history datum A restaurant providing prepared meals or other food items that the purchaser intends to eat elsewhere Damion Dooley https://en.wikipedia.org/wiki/Take-out http://purl.obolibrary.org/obo/ENVO_01000972 obsolete: take out restaurant true A treatment start date is a date/time datum which indicates the start of a particular medical treatment for a patient. Damion Dooley treatment start date canoeing / kayaking / boating hiking camping Damion Dooley http://purl.obolibrary.org/obo/PCO_0000033 obsolete: social gathering true http://purl.obolibrary.org/obo/PCO_0000035 obsolete: party true http://purl.obolibrary.org/obo/PCO_0000038 obsolete: wedding true http://purl.obolibrary.org/obo/PCO_0000039 obsolete: baby shower true http://purl.obolibrary.org/obo/PCO_0000037 obsolete: potluck true http://purl.obolibrary.org/obo/PCO_0000034 obsolete: community event true A long term care facility provides a type of residential care. 
It is a place of residence for people who require continual nursing care and have significant difficulty coping with the required activities of daily living. Damion Dooley https://en.wikipedia.org/wiki/Nursing_home_care http://purl.obolibrary.org/obo/ENVO_01000932 obsolete: long term care facility true A treatment end date is a date/time datum which indicates the end of a particular medical treatment for a patient. Damion Dooley treatment end date Damion Dooley food related exposure event Some bacteria are associated with particular foods. Need relationship Damion Dooley infection-specific food detail A tuberculosis treatment antibiotic is an antibiotic used in the treatment of tuberculosis. Damion Dooley tuberculosis treatment drug “Treatment of Tuberculosis Guidelines”, WHO, Geneva, 2010, 30 www.who.int/tb/ - See more at: http://www.tbfacts.org/tb-drugs/#sthash.nibuzCLV.dpuf vegetarian A time measurement datum that pertains to an organism's physical location. Damion Dooley organism location history datum This will take on a sub-hierarchy of food types. food avoidance special diet water related exposure event municipal water A time measurement datum indicating the year a person immigrated to a given country. Damion Dooley immigration year of arrival bottled water An event (occurring at some location and for some duration of time) where Damion Dooley infectious disease exposure event Damion Dooley human related exposure event Damion Dooley animal related exposure event Identification or description of the specific individual from which this sample was obtained Damion Dooley https://www.ncbi.nlm.nih.gov/biosample/docs/attributes/ NCBI BioSample isolate (human name or description) Damion Dooley http://purl.obolibrary.org/obo/ENVO_01000923 obsolete: petting zoo true Damion Dooley agricultural fair An isolate passage history is a record of the cyclic growth process that led to it, including the number of times it was put through the process, and the method used.
Damion Dooley http://purl.obolibrary.org/obo/GENEPIO_0001835 isolate passage history datum For Escherichia coli: STEC, UPEC A pathotype is a common name for a group of organisms (of the same species) that have the same pathogenicity on a specified host https://en.wiktionary.org/wiki/pathotype NCBI pathotype Damion Dooley pet treat Damion Dooley pet food (raw) An epidemiology travel datum is a datum regarding a particular human's travel trip destination, mode, and motivation, with respect to an epidemiology investigation. Damion Dooley human exposure event - travel datum boat / cruise travel train travel bus travel car travel airline travel A travel mode is a type of transportation which a given human has used while travelling on a particular trip. Damion Dooley travel mode A travel reason is a motivation for traveling on a particular trip expressed by the traveller. Damion Dooley travel reason travel for pleasure business activity religious missionairy activity rural backpacking activity visiting relative/friends travel tour unsure of travel reason The tuberculosis draft standard is a GenEpiO set of draft genomic and clinical data items related to Mycobacterium tuberculosis that various agencies are using in their own reporting. draft tuberculosis - specimen contextual data standard Damion Dooley behavioural risk factor http://purl.obolibrary.org/obo/GENEPIO_0001686 obsolete: E. coli serotype part true Snomed: Concept 260827009 O antigen Snomed: Concept 260823008 H antigen Damion Dooley homeless status A serotype or serovar is a distinct variation within a species of bacteria or virus or among immune cells of different individuals.
These microorganisms, viruses, or cells are classified together based on their cell surface antigens, allowing the epidemiologic classification of organisms to the sub-species level Damion Dooley https://en.wikipedia.org/wiki/Serotype serovar/serotype common name Salmonella serovar Salmonella serovar name Salmonella antigenic formula Damion Dooley drug use status Damion Dooley incarceration status Damion Dooley HIV status Damion Dooley HIV risk status Damion Dooley A.I.D.S status Damion Dooley drug abuse status Damion Dooley injection drug use status Damion Dooley recreational drug use status Damion Dooley methadone use status Damion Dooley substantial alcohol use/abuse status Damion Dooley smoking status A country of birth is the country that a given person (or animal) was born in. Damion Dooley country of birth Tb_1 (with respect to an Illumina SampleSheet.csv spreadsheet) 13 (with respect to a ReSeqTB SNP pipeline sample report) A sequence number is a numeric identifier of a sequence with respect to a particular data table or dataset. sequence number Damion Dooley The 'date last seen alive' is the date that a subject (patient) was last known (seen, heard, messaged) to be alive. date last seen alive second line or reserve tuberculosis drug tuberculosis infection anatomical site http://purl.obolibrary.org/obo/GENEPIO_0001007 obsolete: NCBI Antibiogram model true order: GENEPIO:0001187 # antimicrobial resistance test drug GENEPIO:0002112 # drug MIC GENEPIO:0002080 # drug minimum inhibitory concentration unit GENEPIO:0001001 # measurement comparator NCIT:C85539 # antimicrobial resistance phenotype GENEPIO:0002181 # antimicrobial resistance phenotype - ECOFF This is the minimal set of parameters for drug test results for a particular antimicrobial agent. 
Damion Dooley draft antibiogram drug test model http://the-vet.net/DVMWiz/Vetlibrary/Lab-%20Microbiology%20Guide%20to%20Interpreting%20MIC.htm https://www.ncbi.nlm.nih.gov/biosample/docs/antibiogram/ A date-time datum that marks the predicted start of possible human or animal exposure to a pathogen agent, directly or indirectly, based on a suspected pathogen and first symptom date-time. Damion Dooley predicted exposure start A date-time datum that marks the predicted end of possible human or animal exposure to a pathogen agent, directly or indirectly, based on a suspected pathogen and first symptom date-time. Damion Dooley predicted exposure end Map: project that results in non-sequence map data such as genetic map, radiation hybrid map, cytogenetic map, optical map, etc. Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ map project Metagenome: sequence analysis of environmental samples Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ metagenome project Metagenome assembly: a genome assembly generated from sequenced environmental samples Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ metagenome assembly project Other: a free text description is provided to indicate Other data type Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ other project data type Phenotype or Genotype: project correlating phenotype and genotype Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ phenotype or genotype project Proteome: large scale proteomics experiment including mass spec. analysis Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ proteome project Random Survey: sequence generated from a random sampling of the collected sample; not intended to be comprehensive sampling of the material.
Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ random survey project Targeted locus (loci): project to sequence specific loci, such as a 16S rRNA sequencing Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ targeted locus (loci) project Transcriptome or Gene expression: large scale RNA sequencing or expression analysis. Includes cDNA, EST, RNA_seq, and microarray. Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ transcriptome or gene expression A sequencing project with a primary goal of identifying large or small sequence variation across populations. Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ variation project (Afghanistan|Albania|Algeria|American Samoa|Andorra|Angola|Anguilla|Antarctica|Antigua and Barbuda|Arctic Ocean|Argentina|Armenia|Aruba|Ashmore and Cartier Islands|Atlantic Ocean|Australia|Austria|Azerbaijan|Bahamas|Bahrain|Baltic Sea|Baker Island|Bangladesh|Barbados|Bassas da India|Belarus|Belgium|Belize|Benin|Bermuda|Bhutan|Bolivia|Borneo|Bosnia and Herzegovina|Botswana|Bouvet Island|Brazil|British Virgin Islands|Brunei|Bulgaria|Burkina Faso|Burundi|Cambodia|Cameroon|Canada|Cape Verde|Cayman Islands|Central African Republic|Chad|Chile|China|Christmas Island|Clipperton Island|Cocos Islands|Colombia|Comoros|Cook Islands|Coral Sea Islands|Costa Rica|Cote d'Ivoire|Croatia|Cuba|Curacao|Cyprus|Czech Republic|Democratic Republic of the Congo|Denmark|Djibouti|Dominica|Dominican Republic|East Timor|Ecuador|Egypt|El Salvador|Equatorial Guinea|Eritrea|Estonia|Ethiopia|Europa Island|Falkland Islands (Islas Malvinas)|Faroe Islands|Fiji|Finland|France|French Guiana|French Polynesia|French Southern and Antarctic Lands|Gabon|Gambia|Gaza Strip|Georgia|Germany|Ghana|Gibraltar|Glorioso Islands|Greece|Greenland|Grenada|Guadeloupe|Guam|Guatemala|Guernsey|Guinea|Guinea-Bissau|Guyana|Haiti|Heard Island and McDonald Islands|Honduras|Hong Kong|Howland Island|Hungary|Iceland|India|Indian Ocean|Indonesia|Iran|Iraq|Ireland|Isle of 
Man|Israel|Italy|Jamaica|Jan Mayen|Japan|Jarvis Island|Jersey|Johnston Atoll|Jordan|Juan de Nova Island|Kazakhstan|Kenya|Kerguelen Archipelago|Kingman Reef|Kiribati|Kosovo|Kuwait|Kyrgyzstan|Laos|Latvia|Lebanon|Lesotho|Liberia|Libya|Liechtenstein|Line Islands|Lithuania|Luxembourg|Macau|Macedonia|Madagascar|Malawi|Malaysia|Maldives|Mali|Malta|Marshall Islands|Martinique|Mauritania|Mauritius|Mayotte|Mediterranean Sea|Mexico|Micronesia|Midway Islands|Moldova|Monaco|Mongolia|Montenegro|Montserrat|Morocco|Mozambique|Myanmar|Namibia|Nauru|Navassa Island|Nepal|Netherlands|New Caledonia|New Zealand|Nicaragua|Niger|Nigeria|Niue|Norfolk Island|North Korea|North Sea|Northern Mariana Islands|Norway|Oman|Pacific Ocean|Pakistan|Palau|Palmyra Atoll|Panama|Papua New Guinea|Paracel Islands|Paraguay|Peru|Philippines|Pitcairn Islands|Poland|Portugal|Puerto Rico|Qatar|Republic of the Congo|Reunion|Romania|Ross Sea|Russia|Rwanda|Saint Helena|Saint Kitts and Nevis|Saint Lucia|Saint Pierre and Miquelon|Saint Vincent and the Grenadines|Samoa|San Marino|Sao Tome and Principe|Saudi Arabia|Senegal|Serbia|Seychelles|Sierra Leone|Singapore|Sint Maarten|Slovakia|Slovenia|Solomon Islands|Somalia|South Africa|South Georgia and the South Sandwich Islands|South Korea|South Sudan|Southern Ocean|Spain|Spratly Islands|Sri Lanka|State of Palestine|Sudan|Suriname|Svalbard|Swaziland|Sweden|Switzerland|Syria|Taiwan|Tajikistan|Tanzania|Tasman Sea|Thailand|Togo|Tokelau|Tonga|Trinidad and Tobago|Tromelin Island|Tunisia|Turkey|Turkmenistan|Turks and Caicos Islands|Tuvalu|USA|Uganda|Ukraine|United Arab Emirates|United Kingdom|Uruguay|Uzbekistan|Vanuatu|Venezuela|Viet Nam|Virgin Islands|Wake Island|Wallis and Futuna|West Bank|Western Sahara|Yemen|Zambia|Zimbabwe|Historical Country Names|Belgian Congo|British Guiana|Burma|Czechoslovakia|Former Yugoslav Republic of Macedonia|Korea|Netherlands Antilles|Serbia and Montenegro|Siam|USSR|Yugoslavia|Zaire)(:[a-zA-Z0-9 ]+)? 
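The validation pattern above accepts a country name from the controlled list, optionally followed by a colon and a sub-region made of letters, digits and spaces. The Python sketch below shows how such a value could be checked and split. It uses a deliberately short, hypothetical subset of the country list, and the function name and the use of a full-string match are assumptions for illustration rather than a description of how any GenEpiO tool applies the pattern.

```python
import re

# Hypothetical subset; the pattern above enumerates every accepted name.
COUNTRIES = ["Canada", "Viet Nam", "USA", "Atlantic Ocean"]

PATTERN = re.compile(
    "(" + "|".join(re.escape(name) for name in COUNTRIES) + ")(:[a-zA-Z0-9 ]+)?"
)


def parse_country_qualifier(value):
    """Return (country, sub_region) for a valid qualifier, else None.

    sub_region is None when the optional ':sub_region' part is absent.
    """
    match = PATTERN.fullmatch(value)
    if match is None:
        return None
    country, sub_region = match.groups()
    # Drop the leading colon captured by the optional group.
    return country, (sub_region[1:] if sub_region else None)


print(parse_country_qualifier("Canada:Vancouver"))
print(parse_country_qualifier("Atlantis"))
```

Note that the sub-region character class admits only ASCII letters, digits and spaces, so sub-region names containing hyphens or accented characters would be rejected by the pattern as written.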
help: Consult the http://www.insdc.org/country.html country list for valid options. The INSDC country qualifier is a textual controlled vocabulary used to indicate the country of origin of a DNA sample. It can also have the following format: country:sub_region, such as: "Canada:Vancouver". Damion Dooley INSDC country qualifier phenotype Phenotype: phenotypic measurements for submission to dbGaP Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ phenotype project objective genome material Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ genome material from specimen TB lineage 7 - Ethiopia An object aggregate which has as members whole organs or parts of organs, possibly from different organisms. 2021-04-19T23:36:25Z organs or organ parts A travel destination which a given human has travelled to on a particular trip and is outside Canada. 2021-05-27T04:35:17Z travel outside Canada A travel destination which a given human has travelled to on a particular trip and is outside a given province/territory of interest. 2021-05-27T04:36:13Z travel outside province/territory An instance of a "data field category" that contains the "bioproject accession field", "biosample accession field", and "GISAID accession field". A data field category which receives a range of one or more data fields. 2022-11-23T00:33:27.896Z data field category A lifespan history item is a date-associated event pertaining to the life of a subject.
Damion Dooley lifespan history item untreated water treated water - ambient temperature epidemiology investigation data item Unclear semantics and origin obsolete distribution event true obsolete supply point / event true well water - treated Well water - untreated organism datum http://purl.obolibrary.org/obo/GENEPIO_0001171 http://purl.obolibrary.org/obo/OBI_0100026 http://purl.obolibrary.org/obo/PATO_0000047 a role which inheres in an organism and is realized by the process of being alive organism role isolate organism role animal role plant role Damion Dooley taxonomic datum A course of treatment designed to resolve an infectious disease Damion Dooley infectious disease treatment This is a collection of drugs used to test antimicrobial resistance. This is a class of chemicals by use. They are organized more indirectly in CHEBI via 'has role' some 'antimicrobial agent'. Note: NCBI provides special requirements for antibiotic test-selection for BETA-LACTAMASE, see https://www.ncbi.nlm.nih.gov/biosample/docs/beta-lactamase/ Damion Dooley note: Used http://cts.fiehnlab.ucdavis.edu/conversion/batchConvert to convert antibiotic name to CHEBI entries. 60% match, some matches off, had to curate hits manually. Note that this is a longer list of antibiotics that we will need to incorporate, perhaps through ARO: https://www.ncbi.nlm.nih.gov/biosample/docs/antibiogram/?format=txt Damion Dooley antimicrobial resistance test drug Damion Dooley Judgements here are marked down after an agent's (doctor/veterinarian) analysis of symptoms. analytic datum Damion Dooley http://purl.obolibrary.org/obo/DOID_4 obsolete: type of illness true A specimen which has been confirmed by laboratory test(s) to contain a disease-causing agent such as a pathogen Damion Dooley specimen with confirmed presence of disease agent dietary restriction A person who has a patient care role.
Damion Dooley patient type obsolete: mother true NCBI BioSample depth:1 lookup A subject disease outcome is an assessment of the persistence of a given disease course in a subject (patient) Damion Dooley Note that this is closely related to OGMS_0000063 "disease course" that lists chronic, progressive, transient and acute processes. subject disease outcome treated water - warm/hot jacuzzi water spa water whirlpool water A host datum is a datum that pertains to infection-related information about an organism (animal or human) who is likely bearing a pathogen Damion Dooley patient as host datum pregnancy history item date of conception Damion Dooley lifespan history record http://purl.obolibrary.org/obo/FOODON_03450002 obsolete: food cooking method true http://purl.obolibrary.org/obo/FOODON_03470107 obsolete: food preservation method true infectious disease case datum laboratory data Damion Dooley specimen record The type of material from which the specimen was obtained. Specimens are usually categorized as food, body products or tissues, or environmental material. Damion Dooley specimen type (host or environmental context) specimen source material category Organism has a symbiotic relationship with host symbiotic role http://purl.obolibrary.org/obo/GENEPIO_0000113 obsolete: NCBI BioSample attribute package true Damion Dooley http://purl.obolibrary.org/obo/GENEPIO_0000025 http://purl.obolibrary.org/obo/GENEPIO_0000028 obsolete: host sample material true Damion Dooley food specimen detail patient specimen environmental NCBI Biosample: environmental (swab or sampling) environmental specimen A specimen extracted from an animal Damion Dooley animal specimen built environment specimen food specimen natural environment specimen The type of person a patient had epidemiological (i.e. close proximity) contact with. 
Damion Dooley exposure event person Damion Dooley An exposure group statistics record is a summary of disease outbreak statistics on a certain date of a group of humans or animals potentially exposed to a pathogen. exposure group statistics record symptom history item Damion Dooley health state record http://purl.obolibrary.org/obo/NCIT_C17627 obsolete: swab true [WHICH DEFINITION?] See http://medical-dictionary.thefreedictionary.com/aspirate Damion Dooley obsolete: fluid - aspirate true Plant rinse is a rinse derived from plant material from one or more plants. Damion Dooley plant rinse Damion Dooley outbreak animal exposure event location Damion Dooley animal exposure process Damion Dooley animal exposure event by material Damion Dooley exposure by human contact Damion Dooley unnecessary distinction obsolete: human exposure activity type true Damion Dooley outbreak human exposure event location A process in which one or more humans may be exposed to one or more contagious human infectious disease carriers. Damion Dooley human to human exposure process water exposure by source human water activity This list would be customized to include organizations local to an application's software installation. It is an example of a set of ontology terms that would best be organized by a central authority if federated installations need to share investigator contact info for example. health authority order: HP:0000118 # symptom FLU:0000976 # onset GENEPIO:0001789 # cessation GENEPIO:0001784 # duration Damion Dooley symptom record A visual pattern caused by DNA fragments concentrated in stratified bands in a Pulsed-field gel electrophoresis (PFGE) gel plate. PFGE pattern phage type salmonella (DT) phage type e-coli 0157 (PT) Damion Dooley sequence record A type of built structure that is suspected to be related to an outbreak investigation. 
Damion Dooley outbreak exposure event location http://purl.obolibrary.org/obo/ENVO_00000070 Damion Dooley lab test history record obsolete: earth surface true A data specification is a specification of a data structure that can hold a categorical, textual or numeric variable, or that has such variables or other data structures as component parts. A data standard, may be defined as a data specification, but a data specification isn't necessarily a formal or de-facto data standard - it would have to be sanctioned by a community of users and developers to take on that status. Damion Dooley data specification A specimen history item is a time-related datum about a particular specimen Damion Dooley specimen history item Damion Dooley diagnostic test Damion Dooley obsolete: sample datum true A laboratory sequencing datum is a datum related to the sequencing assay process applied to an isolate. Damion Dooley http://purl.obolibrary.org/obo/GENEPIO_0001681 obsolete: laboratory sequencing datum true genomic sequencing annotation datum Damion Dooley ISSUE: lab test results are appropriate to lab test types. Logic around that? lab test record sequencing assembly datum A container id is an identifier that refers to a container object about a patient specimen. Container ids are referenced in clinical patient records reported to be suspect, or lab test results. Damion Dooley container record Damion Dooley identifier of a person with the context of a health care system health care personal identifier A strain identifier is the unique microbial or eukaryotic strain name from a reference database that a sample has been matched to. Damion Dooley strain identifier A line list is a collection of suspected or known disease outbreak cases. It may also include details about healthy patients associated with an exposure event (picnic, concert etc.) who have been interviewed in order to compare food intake, assess risk factors, etc. 
Note that the health status of people associated with an exposure event can change from healthy or unknown status, to ill patient. Damion Dooley line list object This class is a temporary holding bin for items that need a class hierarchy home, definition, and/or subclass items. Some classes have subclasses that are currently disjoint in other ontologies like ENVO, so awaiting resolution on such issues. Damion Dooley awaiting ontology review The IRIDA user ID is a user account identifier provided by the IRIDA.ca system. Damion Dooley The count of healthy individuals or animals on a certain date/time from a group that was likely exposed to a pathogen Damion Dooley exposure group healthy count The count of ill individuals or animals on a certain date/time from a group that was likely exposed to a pathogen Damion Dooley exposure group ill count human general activity Damion Dooley An exposure can be narrowed down to a single date, or can be reasoned to exist with a lower and/or upper bound date. A single date exposure has lower and upper bound dates set as the same value. exposure history record Damion Dooley isolate history record Damion Dooley An outbreak date may have an official start, but because it might be ongoing, it may not have an end (yet). outbreak history record Damion Dooley Needs to cover the various pregnancy events pregnancy history record Damion Dooley Issue: a diagnosis is a set of diseases, each of which can be marked primary or secondary "Illness causality" with respect to a particular case? Requires a relation annotation. diagnosis record Damion Dooley exposure event record Damion Dooley hospitalization event record An exposure can be narrowed down to a single date, or can be reasoned to exist with a lower and/or upper bound date. A single date exposure has lower and upper bound dates set as the same value. 
exposure history item hospital admission date A date-time datum that marks the end of possible human or animal exposure to a pathogen agent, directly or indirectly. Damion Dooley exposure event end A date-time datum that marks the start of a human or animal exposure event to a suspected pathogen agent, directly or indirectly. Damion Dooley This datum is NOT predicted as a function of pathogen and first symptom date. exposure event start hospitalization history item isolate history item An outbreak date may have an official start, but because it might be ongoing, it may not have an end (yet). outbreak history item hospital discharge date A notification date is the date that an outbreak is suspected by a laboratory or other agency and is reported per outbreak detection guidelines. Damion Dooley notification date An outbreak date is the date on which disease onset appears to have occurred Damion Dooley outbreak date An outbreak over date is the date on which a disease outbreak appears to have abated Damion Dooley outbreak over test history item lab test result date A data item that is a set of entities, referenced by mention, that don't necessarily have something in common. 0000-0002-8844-9165 0000-0002-9578-0788 2022-01-31T03:45:04Z While "enumeration" is a "data item" not everything referenced within an enumeration qualifies as a "data item". Not to be confused with "data set" which describes "data item"s of the same type that have something in common. data enumeration An analytic datum which describes the observed or detectable signs, and experienced symptoms of an illness, injury, or condition. 0000-0002-9578-0788 https://en.wikipedia.org/wiki/Signs_and_symptoms 2022-01-31T23:08:50Z signs & symptoms signs and symptoms A particular attribute (feature, quality or process history) of a food sample. Damion Dooley The underlying facets will be supplied from the FoodOn food ontology. 
food specimen datum order: FOODON:03411564 # food source FOODON:00002381 # food product by organism FOODON:03311737 # processed food product NCIT:C71898 # brand name A food specimen type is a categorization of a food specimen, by the organism it is or came from, or by a class of food products. Damion Dooley food specimen specification An agency that has reported a disease cluster. Damion Dooley disease cluster notification agency This is a subset of OBI investigation identifier. It pertains only to NCBI Biosample projects Damion Dooley http://purl.obolibrary.org/obo/OBI_0001628 obsolete: NCBI BioProject ID true The count of individuals or animals for which no health status is available on a certain date/time from a group that was likely exposed to a pathogen. Damion Dooley exposure group unknown health status count A role which inheres in a person and is realized by the process of being in the context of civilization Damion Dooley person role An exposed person role is the role a person takes on who has been exposed to a pathogen. In serious situations this may call for special treatment, e.g. quarantine. Damion Dooley exposed person role parent-guardian role Damion Dooley illness causality pre-existing condition To restore to the normal state after some pathologic process. Damion Dooley Dorland's Illustrated Medical Dictionary, 30th ed. improving recovering resolving health trend Damion Dooley close interpersonal relationship exposure Damion Dooley Found to be redundant with another term, deprecated in favor of said term. obsolete: household exposure true Damion Dooley exposure to individual with diarrhea Damion Dooley other human exposure Damion Dooley sexual contact exposure Damion Dooley exposure via diaper changing (human fecal) container identifier PulseNet: Outbreak outbreak identifier GENEPIO reference genome size The assembly genome size is the sum of all lengths of a sequence assembly's contigs. 
Damion Dooley GENEPIO As noted in QUAST documentation: " The total assembly size may increase (and in some cases exceeds the genome size) due to contaminants (see Chitsaz et al. (2011)), misassembled contigs, repeats, and hubs that contribute to multiple contigs. " assembly genome size Identifier for a reference genome (used to build or compare an assembly) that is used in the genome browsers and in the community Damion Dooley GENEPIO reference genome identifier Assembly QC The 'genome size delta ratio' is a ratio of the difference between a sample's genome assembly size and the size of a given reference genome, over the reference genome size. The formula: |reference_genome_size - assembly_genome_size| / reference_genome_size. This yields a decimal / percentage ranging from 0 = exact length, to say 0.1, which is equivalent to a 10% variance. If the ratio is too far from a quality control ratio that depends on the species in question, this indicates a bad assembly or a mis-identified sample. Damion Dooley GENEPIO genome size delta ratio Assembly QC This is the upper acceptable limit of the 'genome size delta ratio' with respect to a particular species. For Salmonella a 0.10 variance in length between an assembly size and a reference genome size may be acceptable. For E-coli a more generous 0.30 or 0.40 variance should be allowed, given the pathogenic subspecies’ highly dynamic genome. GENEPIO genome size delta ratio QC threshold Assembly QC http://purl.obolibrary.org/obo/IAO_0000428 The contig N50 length QC threshold is a minimum length threshold that a contig N50 length datum should be above for good QC. A lower length is indicative of assembly problems (insufficient read depth) or a mismatched reference genome. Damion Dooley GENEPIO contig N50 length QC threshold Assembly QC 2000 The contig N99 length QC threshold is the minimum length that a contig N99 or NG99 length datum can be for satisfactory QC. 
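The 'genome size delta ratio' and its QC threshold described above can be sketched in a few lines of Python. This is a minimal illustration, not part of GenEpiO: the function names and the `EXAMPLE_THRESHOLDS` table are hypothetical, and the threshold values are simply the illustrative figures quoted in the definition (0.10 for Salmonella, 0.30 as the lower of the two values suggested for E. coli).

```python
def genome_size_delta_ratio(reference_genome_size: int,
                            assembly_genome_size: int) -> float:
    """Absolute size difference divided by the reference genome size."""
    if reference_genome_size <= 0:
        raise ValueError("reference_genome_size must be positive")
    return abs(reference_genome_size - assembly_genome_size) / reference_genome_size


# Hypothetical lookup table; values are the examples given in the definition text.
EXAMPLE_THRESHOLDS = {"Salmonella": 0.10, "E. coli": 0.30}


def passes_delta_ratio_qc(ratio: float, threshold: float) -> bool:
    """The threshold is an upper acceptable limit, so the ratio must not exceed it."""
    return ratio <= threshold


if __name__ == "__main__":
    ratio = genome_size_delta_ratio(5_000_000, 4_400_000)
    print(round(ratio, 3), passes_delta_ratio_qc(ratio, EXAMPLE_THRESHOLDS["Salmonella"]))
```

A ratio above the species threshold would, per the definition, suggest a bad assembly or a mis-identified sample; the sketch only reports the comparison and leaves that interpretation to the caller.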
This threshold avoids having too many gene coding regions prematurely clipped inside short contigs. An assembly having 1% or more of its content in less than 2Kbp chunks is cause for concern. Bacterial genes are generally about 1Kbp in length. Damion Dooley GENEPIO contig N99 length QC threshold lookup Homo sapiens The species of the host organism from which a specimen (pathogen) organism was obtained. Use the full taxonomic name, eg, "Homo sapiens". Damion Dooley subject organism (host) taxonomic species lookup Assembly QC A single datum having to do with a metric for genomic sequence assembly quality control. A "quality control measurement" can be defined as a measurement that has some norm within a definable context, e.g. for a certain subspecies. It may be a calculated quantity. A "quality control metric" is an upper or lower bound threshold that a measurement is compared to. Surpassing the threshold may indicate an error situation. A "measure" can indicate an amount of activity, consumption, transformation, etc. but doesn't itself convey an over-stepping of bounds in the way that a metric does. Damion Dooley GENEPIO assembly quality control datum Damion Dooley GSCID-BRC:Clinical/host-associated pathogen NCBI:Pathogen.cl clinical or host-associated pathogen Clinical/host-associated pathogen Pathogen.cl Assembly QC http://purl.obolibrary.org/obo/IAO_0000428 The N99 length definition is identical to the N50 definition except that the threshold for determining the contigs is 1% of the nucleotides. This contig basepair length can be compared to a minimum threshold to determine if gene annotation (where contigs greater than say 2000bp are necessary) would generally succeed. Damion Dooley GENEPIO contig N99 Assembly QC http://purl.obolibrary.org/obo/IAO_0000428 The upper threshold of contig counts that are acceptable in a genome assembly. This may vary based on the overall genome length and depth of coverage. 
A large contig count may indicate insufficient read coverage. Damion Dooley contig count QC threshold Indicates how the knowledge derived from the project can be applied, and to what field(s). Damion Dooley's note: NIAID GSCID-BRC has Project Relevance map to OBO Foundry Study Design (http://purl.obolibrary.org/obo/OBI_0500000) but this is a bit of a stretch. Damion Dooley https://www.niaid.nih.gov/research/dmid-metadata-standards-core-project project relevance agricultural project relevance medical knowledge relevance industrial processing relevance evolution modelling relevance environmental project relevance model organism relevance The NCBI BioProject model includes all datums associated with the current NCBI BioProject specification. Damion Dooley obsolete: NCBI BioProject model true http://purl.obolibrary.org/obo/GENEPIO_0000147 obsolete: NCBI SRA model true http://www.ncbi.nlm.nih.gov/biosample/docs/attributes/ http://www.ncbi.nlm.nih.gov/biosample/docs/packages/Pathogen.cl.1.0/ Damion Dooley notes: NCBI doesn't quite make the distinction but it may be helpful to distinguish sample submission facilitator vs submitter/repository organization contact? So should we have a BioSample_submission_facilitator field and a BioSample_contact_information field? Damion Dooley Damion's note: There are a few different kinds of biosample, and required fields depend on the type selected. draft NCBI BioSample model - mixed clinical and environmental/food/other monoisolate A monoisolate sample scope involves specimens from a single animal, cultured cell-line, inbred population, or possibly a heterogeneous population when a single genome assembly is generated from a pooled sample because multiple individuals are needed to collect enough material and an inbred line is not available; however, this situation is not preferred. 
Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ monoisolate monoisolate sample scope multiisolate A multiisolate sample scope contains multiple individuals that represent distinct specimen collections, a population (representative of a species). Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ This is often used for variation or phenotype and genotype studies. This should not be used when multiple genomes will be annotated. Eventually, multiple locus_tag prefixes will be able to be assigned to a single multiisolate genome sequencing project, but currently only a single prefix can be registered per project. Therefore, individual monoisolate projects need to be registered when more than one genome will be annotated. multiisolate sample scope multispecies A multi-species sample scope involves a sample representing multiple species. Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ multispecies sample scope environment An environment sample scope indicates that the species content of the sample is not known. Generally, nucleic acid is directly isolated from an environmental sample for analysis. This is used for metagenome studies. Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ environment sample scope A synthetic sample scope involves a sample synthesized in a laboratory. Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ synthetic sample scope Damion Dooley single cell other Other project objective: specify an objective not listed above Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ other project objective Damion Dooley GSCID-BRC:environmental/food/other NCBI:Pathogen.env environmental, food or other pathogen environmental/food/other Pathogen.env whole Whole: the project makes use of the whole sample material (most common case). 
Use this for whole genome sequencing studies, transcriptome studies that are not targeting specific loci, epigenetic studies of a genome, and metagenomes or unbiased transcriptome studies of metagenomes. Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ whole genomic data capture raw sequence reads A raw sequence reads project objective has the goal of submission of raw reads to SRA or Trace repositories Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ raw sequence read objective sequence A sequence project objective has a goal of submission of sequence data to standard archival sequence databases (yielding accession.version identifiers; e.g., whole genome shotgun, cDNA sequences, transcript shotgun assemblies) Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ sequence project objective analysis An "analysis project objective" indicates there was another analysis not otherwise indicated, including submission of BAM files. This definition could be a negation of the "not otherwise indicated". 
Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ analysis objective assembly Submission of genome assembly (AGP data) Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ assembly objective annotation Sequence annotation data Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ sequence annotation objective sequence variation Variation: identification of sequence variation data for submission to dbSNP or dbVAR Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ sequence variation project objective epigenetic markers DNA methylation, histone modification, chromatin accessibility datasets Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ epigenetic markers objective expression Expression: assays of transcript or protein existence or abundance Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ transcript or protein expression objective maps Maps: non-sequence based map data; e.g., genetic, radiation hybrid, cytogenetic, etc. Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ mapping objective Phenotypic descriptive data. Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ phenotypic observation data from specimen select sequence if any sequence data is generated from this project. 
sequencing Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ sequencing project method array Select Array if that is the primary method and no sequence data is submitted Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ array project method mass spectrometry Select Mass Spectrometry if that is the primary method Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ mass spectrometry project method order: NCIT:C40974 # first name NCIT:C40975 # last name NCIT:C42775 # email address GENEPIO:0001756 # phone contact specification - person A categorical investigation datum is a datum pertaining to some entity examined for scientific research or diagnostic purposes and which has a scope that is broader than one of the clinical, environmental or epidemiological domains. Damion Dooley investigation datum An organism and its host share physical space but no evidence exists to determine if one benefits from this arrangement. commensal role draft isolate details model A subject age is the age since birth of a given organism that is involved in an investigation or study at a given time. Damion Dooley subject age draft NCBI component model draft IRIDA BioSample model draft GenEpiO epidemiology case record A subject description is a field containing additional information about an organism (related to an investigation, study and/or specimen) that is not included in other defined vocabulary fields. Damion Dooley http://purl.obolibrary.org/obo/GENEPIO_0001835 subject description host description NCBI BioSample The host profile or identifiers text is the "identification or description of the specific individual from which this sample was obtained". Damion Dooley NCBI BioSample: isolate host profile or identifers A subject identifier is an identifier of a subject organism within the context of a particular investigation, study, or specimen extraction event. 
Damion Dooley This varies in its type (numeric or string) and its central repository depending on its context (use in a particular application/instance). Note: GSCID-BRC specification states " 'Host' is not the preferred term since some specimens may lack any detectable pathogen". In other words, a person would technically (logically) only have a host identifier AFTER they have been diagnosed as having a pathogen. A health care personal identifier, or a case identifier would be better than a more ephemeral "host id" subject identifier Name of disease in a subject that is related to a given investigation, study and/or specimen. Damion Dooley NIAID GSCID-BRC metadata working group Controlled vocabulary. Human specimen source: https://bioportal.bioontology.org/ontologies/DOID or https://www.ncbi.nlm.nih.gov/mesh/1000067 subject disease INSDC term (top level) A categorical choice recorded when a measurement value was known to be recorded in the past but the observed value cannot be located or retrieved for some reason. NCBI Biosample: missing missing INSDC term (top level) 'not applicable' is an appropriate value for the measure of femur length for a patient with a missing limb A categorical choice recorded when a measurable datum does not apply to a given context. Damion Dooley not applicable INSDC term (lower level) A categorical choice recorded when a datum was not measured with respect to some entity during some process. NCBI Biosample: not collected not collected draft isolate identifier model draft GenEpiO isolate testing model draft NCBI isolate source location model draft GenEpiO isolate environment model A subject disease stage is a measure of the acuteness of a subject organism's diagnosed disease, if any, at a given time in an investigation, study or specimen extraction event. 
Damion Dooley http://purl.obolibrary.org/obo/GENEPIO_0001835 subject disease stage environmental datum temperature of sample de-novo assembly http://purl.obolibrary.org/obo/GENEPIO_0000025 obsolete: sample body source true contig pre-analysis minimum length QC threshold A 'contig length QC threshold' is an integer setting which can be applied by an algorithm to an array of assembly and annotation contig lengths to remove (filter out) all items less than the threshold value. The reduced array is used for calculation of statistics. Damion Dooley http://quast.bioinf.spbau.ru/manual.html QUAST specifies which of its report stats are affected by the "--min-contig" minimum contig length parameter. contig length QC threshold The NG50 length is the same as the N50 length except that the length of the reference genome is used in the calculation rather than the assembly genome size. Damion Dooley "Note that N50 is calculated in the context of the assembly size rather than the genome size. Therefore, comparisons of N50 values derived from assemblies of significantly different lengths are usually not informative, even if for the same genome. To address this, the authors of the Assemblathon competition derived a new measure called NG50. The NG50 statistic is the same as N50 except that it is 50% of the known or estimated genome size that must be of the NG50 length or longer. This allows for meaningful comparisons between different assemblies. In the typical case that the assembly size is not more than the genome size, the NG50 statistic will not be more than the N50 statistic." - wikipedia contig NG50 A contig length is the count of base pairs in a given sequence assembly contig. Damion Dooley contig length Assembly QC http://purl.obolibrary.org/obo/IAO_0000428 Contig L50 is the number of contigs equal to or longer than contig N50. The length of the assembly itself is used in the calculation. 
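The contig statistics defined above (the contig length QC threshold filter, N50 and L50, and the NG50 variant that uses the reference genome length) can be illustrated with a short Python sketch. The function names are hypothetical and this is not QUAST's implementation; it follows the common convention of walking contigs from longest to shortest until the running total reaches the chosen fraction of the assembly size (or of a supplied genome size). Where several contigs share the N50 length, the count returned here can be smaller than a literal count of all contigs "equal to or longer than" N50.

```python
from typing import List, Optional, Tuple


def filter_contigs(lengths: List[int], min_contig: int) -> List[int]:
    """Drop contigs shorter than the threshold before computing statistics."""
    return [n for n in lengths if n >= min_contig]


def nx_stats(lengths: List[int],
             fraction: float = 0.5,
             genome_size: Optional[int] = None) -> Optional[Tuple[int, int]]:
    """Return (Nx length, Lx count) for the given fraction.

    With genome_size=None the assembly size (sum of contig lengths) is the
    basis, giving N50/L50 for fraction=0.5. Passing a reference genome size
    gives the NG/LG variants. Returns None when the contigs never reach the
    target, which can happen for NG statistics on a short assembly.
    """
    basis = genome_size if genome_size is not None else sum(lengths)
    target = basis * fraction
    running = 0
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if running >= target:
            return length, count
    return None


if __name__ == "__main__":
    contigs = [80, 70, 50, 40, 30, 20, 10]       # toy lengths, total 300
    print(nx_stats(contigs))                      # N50, L50
    print(nx_stats(contigs, genome_size=400))     # NG50, LG50
    print(nx_stats(contigs, fraction=0.99))       # N99, L99
```

In the toy data the assembly (300) is smaller than the assumed genome size (400), so the NG50 length comes out no larger than the N50 length, consistent with the quoted note.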
Damion Dooley QUAST manual contig L50 Assembly QC http://purl.obolibrary.org/obo/IAO_0000428 The LG99 count is like the L99 count except that the length of the reference genome is used in the calculation rather than the sum of assembly contig lengths. Damion Dooley contig LG99 An organism isolate datum is any datum pertaining to an isolate organism Damion Dooley obsolete: specimen from subject datum true A product of an animal, as in blood, feces, sputum, etc. that has no particular anatomical site Damion Dooley animal body product This is the scientific role or category that the subject organism or material has with respect to an investigation. Damion Dooley http://purl.obolibrary.org/obo/GENEPIO_0001237 obsolete: specimen source material category true draft GenEpiO general information record An antigenic formula is a string composed of codes representing categorized results of tests performed on various viral, bacterial or immune cell surface antigens. Damion Dooley antigenic formula http://medical-dictionary.thefreedictionary.com/antigen lookup This is the taxonomic species descriptor of a specimen isolate (an organism found within a specimen.) Damion Dooley specimen organism taxonomy (species) An identifier model indicates an identifier, the organization it has been issued by, and the status of the identifier - whether it is a primary one, active or archaic. Damion Dooley identifier model help:This should be populated with a selection from INSDC.org controlled vocabulary for providers: ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/coll_dump.txt The curator organization name is the name of an organization that manages a given repository of entities (samples, isolates, sequences ...) Damion Dooley curator organization name INSDC:institution_code:BRS IRIDA:ACK Agriculture and Agri-Food Canada An isolate identifier is an identifier assigned to a given isolate by a particular agency that is handling or storing it. 
Damion Dooley isolate identifier An identifier status describes the principal role of an identifier assigned to an entity by its curator or handler. An entity may have a number of identifiers assigned to it over time. An identifier may be phased-out as can happen when its content is transferred to another curator. Damion Dooley identifier status GenEpiO A data protection role is a role (or user permission) that a user has with respect to particular data. A data server may be able to provide a protected (modified) dataset with metadata indicating what parts have a protection role with respect to the user requesting this information. A data protection role would be inherited from a user's relation to an entity to the entity's components or subclasses. Damion Dooley data protection role An identifier used in the initial submission of content to a repository. original submission This was created to help define a contributing organization to a repository. However this still needs organizing. The "collected_by" field is free text but contact name AND organization could be associated with this in controlled vocabulary. NCBI Contributing organization / project food - liquid draft NCBI BioSample identifier model An INSDC institution code is an identifier of an organization from a list of sequence repository organizations managed by INSDC. Damion Dooley http://www.insdc.org/controlled-vocabulary-culturecollection-qualifier INSDC institution code draft IRIDA epidemiological case model An identifier that had been used previously but is no longer promoted. Damion Dooley previous (archaic) identifier A geographical location datum is a datum that refers to a real, hypothesized or fanciful spatial location Damion Dooley geographic location datum A categorical tree specification datum is a categorical value specification which takes its value from the ontology URI identifier of any one of its subclass items; these may be organized in a hierarchy. 
Damion Dooley http://purl.obolibrary.org/obo/OBI_0001930 A categorical tree specification datum may have its own URI identifier as its value. This could be interpreted to mean that the datum was recorded but that no particular categorical subclass distinction was observed. This in effect makes the datum act as a boolean variable. obsolete: categorical tree specification true Damion Dooley A geographic coordinate entity composed of both latitude and longitude components. latitude and longitude coordinate (ISO 6709) order: NCIT:C25464 # country ENVO:00000005 # state / province etc. Geographical origin of the sample; use the appropriate name from this list http://www.insdc.org/documents/country-qualifier-vocabulary. Use a colon to separate the country or ocean from more detailed information about the location, eg "Canada: Vancouver" or "Germany: halfway down Zugspitze, Alps" Damion Dooley draft NCBI BioSample geo_loc_name model draft GenEpiO isolate sequencing model A case ID is a unique identifier associated with a particular episode of care for an individual or animal. Damion Dooley case identifier An animal that has had a substantial part of its reproductive organs removed surgically. Damion Dooley NCBI BioSample: neuter Not a PATO phenotypic sex categorical value. neuter NCBI BioSample 'pooled male and female' is a categorical value used when the male and female categories or levels of a factor are combined. If used to describe a single organism, then the organism will be one or the other. Not a PATO phenotypic sex categorical value. Damion Dooley NCBI BioSample: 'pooled male and female' pooled male and female A categorical choice recorded while the data for a given measurable datum has not finished being collected, is awaiting a conclusion, or has not yet been communicated. Damion Dooley undetermined in process swimming / wading Damion Dooley NECESSARY? : What details are categorical and what are freehand? 
This item was originally too broad; moved sub-items to Treatment. infection detail datum This is problematic - it is not clear if this is reporting that antibiotics were administered, or whether they are recommended as a result of diagnosis. Damion Dooley antibiotics required The International Nucleotide Sequence Database Collaboration (INSDC) is a long-standing foundational initiative that operates between DDBJ, EMBL-EBI and NCBI. INSDC covers the spectrum of data raw reads, through alignments and assemblies to functional annotation, enriched with contextual information relating to samples and experimental configurations. Damion Dooley International Nucleotide Database Collaboration standard INSDC INSDC standard Damion Dooley outcome of pregnancy associated with illness INSDC term (lower level) a categorical choice indicating that information of an expected format was not given; a value may be given at a later stage. not provided Damion Dooley stable health status http://purl.obolibrary.org/obo/GENEPIO_0002026 induced abortion live birth Damion Dooley http://purl.obolibrary.org/obo/VT_0002292 obsolete: gestation duration (weeks) true date of pregnancy outcome contact specification - physician obsolete: matrix - solid true contact specification - parent/guardian contact specification - patient draft GenEpiO case epidemiology record order: NCIT:C25464 # Country ENVO:00000005 # major administrative subdivision aka state/province/territory/region (via GenEpiO -> GAZ ) NCIT:C80234 # Municipality (via GenEpiO -> GAZ cities) NCIT:C25690 # Street Address NCIT:C25621 # postal code The address component of a contact specification. contact specification - address GENEPIO_0001829 obsolete: draft IRIDA model true A datum related to the sequencing assay process applied to an isolate.
Damion Dooley laboratory sequencing datum http://purl.obolibrary.org/obo/OBI_0001901 http://purl.obolibrary.org/obo/OBI_0400103 Damion Dooley host illness associated with draft subject demographic specification K(1|2a|2ac|3|4|5|6|7|8|9|10|11|12|13|14|15|16|18a|18ab|19|20|22|23|24|26|27|28|29|30|31|34|37|39|40|41|42|43|44|45|46|47|49|50|51|52|53|54|56|96|55|74|82|84|85ab|85ac|87|92|93|95|97|98|100|101|102|103|X104|X105|X106) E. coli K antigen specification 1 6 other antigen specification E. coli serotype specification Damion's note: This will be converted to a categorical tree specification list of serovars by name or id asap. Salmonella serovar specification A lab test specification indicates the inputs and outputs (results / conclusions) of a test. Damion Dooley test specification (1/2|1/2a|1/2b|1/2c|3a|3b|3c|4a|4ab|4b|4c|4d|4e|1|7) listeria antigen obsolete: matrix - fluid true listeria serotype 1/2a listeria serotype 1/2b listeria serotype 1/2c listeria serotype 1/2 listeria serotype 3a listeria serotype 3b listeria serotype 4a listeria serotype 4b listeria serotype 4c listeria serotype 4d listeria serotype 4e non-monocytogenes listeria antigen listeria serotype 3c textual test result This is an organizational category for grouping parameters relevant to biomedical research projects. Damion Dooley obsolete: draft GenEpiO BioProject standard true Damion Dooley NCBI BioSample: collection_code NCBI BioSample collection code draft NCBI BioSample host model food - solid lab test datum Xbal Ascl Smal Blnl Apal Kpnl Taxonomy below subspecies; sometimes used in viruses to denote subgroups taken from a single isolate. Damion Dooley NCBI Biosample subgroup help:This is an optional additional subtype classification as per organism convention A viral subtype is a classification name for a virus according to some protocol. 
Damion Dooley NCBI BioSample viral subtype 1 1 serotype specification IRIDA An integer and character referring to a pattern of opaque or confluent lysis caused by a panel of bacteriophage infections e.g. 14a. Damion Dooley SNOMED: http://www.snomedbrowser.com/Codes/Details/272405002 phage type specification obsolete: swab - food contact surface true ATCC:26370 An NCBI culture collection is a composite identifier containing “institution code:collection code”. Annotation with a culture_collection qualifier implies that the sequence was obtained from a sample retrieved (by the submitter or a collaborator) from the indicated culture collection, or that the sequence was obtained from a sample that was deposited (by the submitter or a collaborator) in the indicated culture collection. See the description for the proper format and list of allowed institutes, http://www.insdc.org/controlled-vocabulary-culturecollection-qualifier. NCBI culture collection UAM:Mamm:52179 The NCBI specimen voucher is an identifier composed of “institution code:collection code:specimen id” parts. In addition to a specimen identifier, it includes an institution-code (and optional collection-code) taken from a controlled vocabulary maintained by the INSDC that denotes the museum or herbarium collection where the specimen resides. INSDC:specimen_voucher NCBI specimen voucher http://www.insdc.org/controlled-vocabulary-specimenvoucher-qualifier. culture identifier human; dog; horse The natural language (non-taxonomic) name of the type of organism (human or animal) that is the subject of a given investigation, study and/or specimen. Damion Dooley's note: as of 2017 this is a problematic field for NCBI Biosample because it is slotted into "host" field, yet field is listed as "host common name" which does not fit with requirement that scientific name be provided. 
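The NCBI specimen voucher format described above ("institution code:collection code:specimen id", with the collection code optional) can be illustrated with a small parser. This is only a sketch under stated assumptions: the names `SpecimenVoucher` and `parse_specimen_voucher` are hypothetical, and the code checks only the colon-separated shape; it does not validate institution or collection codes against the INSDC controlled vocabulary.

```python
from typing import NamedTuple, Optional


class SpecimenVoucher(NamedTuple):
    """Hypothetical container for the parts of a specimen voucher string."""
    institution_code: str
    collection_code: Optional[str]  # optional per the definition above
    specimen_id: str


def parse_specimen_voucher(value: str) -> SpecimenVoucher:
    """Split a voucher string into its colon-separated parts.

    Assumes two parts mean institution code + specimen id, and three
    parts mean institution code + collection code + specimen id.
    """
    parts = [part.strip() for part in value.split(":")]
    if len(parts) not in (2, 3) or not all(parts):
        raise ValueError(f"not a recognizable specimen voucher: {value!r}")
    if len(parts) == 2:
        return SpecimenVoucher(parts[0], None, parts[1])
    return SpecimenVoucher(parts[0], parts[1], parts[2])


voucher = parse_specimen_voucher("UAM:Mamm:52179")
print(voucher.institution_code, voucher.collection_code, voucher.specimen_id)
```

A real submission pipeline would additionally look up the institution and collection codes in the INSDC lists referenced in the definitions.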
Person: Damion Dooley subject organism common name temperature of air - average daily obsolete: swab - inanimate object surface true obsolete: swab - meat true obsolete: swab - rectal swab true obsolete: swab - tissue true unknown source patient sample human clinical specimen "environment (swab or sampling)" is a specific response for NCBI BioSample that could be mapped or unified with "environment sample" used elsewhere in GenEpiO Damion Dooley http://purl.obolibrary.org/obo/GENEPIO_0001246 obsolete: environmental (swab or sampling) true avian - wild avian - domesticated human clinical specimen - invasive In chemistry, pH is a numeric scale used to specify the acidity or basicity(alkalinity) of an aqueous solution Damion Dooley Wikipedia Normally falls within the bounds of 0 - 14 pH measurement precipitation The assembly and annotation objective is the set of objectives that assembly and annotation techniques (including instruments and software platforms) applied to a sample are meant to satisfy. This impacts on the quality control thresholds applied to the sequencing project. Damion Dooley assembly and annotation project objective Damion Dooley line list row record This class contains components under development for fulfilling GenEpiO and other standard data specifications. Components are relevant to epidemiological, laboratory, clinical or environmental pathogen sample data collection and analysis. Damion Dooley draft GenEpiO component record region lookup:http://purl.obolibrary.org/obo/GAZ_00000448 A subnational region is a type of subnational entity similar to a nation's state, province, or territory. Damion Dooley subnational region A data standard related to public health research and service delivery. Damion Dooley public health data standard Damion Dooley Reagent: material studied was obtained by chemical reaction, precipitation. 
Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ reagent derived material from specimen transcriptome Transcriptome: transcript and/or expression data. Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ transcriptome data from specimen proteome Proteome material is protein or peptide sequences. Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ proteome data derived from specimen purified chromosome Purified chromosome: one or more chromosomes or replicons were experimentally purified. Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ purified chromosome from specimen targeted locus/loci TargetedLocusLoci: capturing specific loci (gene, genomic region, bar code standard). Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ Targeted Locus Loci data capture clone ends CloneEnds: capturing clone end data. Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ clone end data capture exome Exome: capturing exon-specific data. Damion Dooley NCBI BioSample https://www.ncbi.nlm.nih.gov/books/NBK54364/ exome specific data capture random survey Sequences generated from a random sampling of the collected sample; not intended to be comprehensive sampling of the material Is this with respect to an isolate or a specimen derived culture? Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54364/ random survey data capture This is a collection of various human (and animal contact location) contact information specifications used in data standards. Damion Dooley contact specification order: NCIT:C40978 # phone GENEPIO:0001896 # phone type contact specification - phone Damion Dooley millimetres per day obsolete: millisiemens true Damion Dooley millisiemens per centimetre Damion Dooley cubic metre per second umol/kg Damion Dooley micromole per kilogram Damion Dooley micromole per litre Damion Dooley quality control record Standard draft: describing the minimum information needed for submission to a public database. 
Damion Dooley 1 - standard draft High quality draft: describing sequences with little to no manual review. Damion Dooley 2 - high quality draft Improved high quality draft: in which data is either reviewed by people or machines to some extent to indicate that most of the genetic data is assembled correctly, but some errors may still be present. Damion Dooley 3 - improved high quality draft Annotation-directed improvement: in which genetic information in various gene regions is represented as accurately as possible. Damion Dooley 4 - annotation-directed improvement Non-contiguous finished: which includes sequences that have been reviewed by both people and machines and would be considered complete except for “recalcitrant regions” that are proving problematic for genome closure. Damion Dooley 5 - non-contiguous finished Finished: which describes un-gapped complete sequences that have minimal errors, if any. The biology of the microbe will determine whether this finished genome consists of more than one chromosome. Damion Dooley 6 - finished Assembly QC http://purl.obolibrary.org/obo/IAO_0000428 The NG99 length is the same as the N99 length except that the length of the reference genome is used in the calculation rather than the sum of assembly contig lengths. Damion Dooley GENEPIO contig NG99 Assembly QC http://purl.obolibrary.org/obo/IAO_0000428 Contig L50 is the number of contigs equal to or longer than contig N50. The length of the assembly itself is used in the calculation. Damion Dooley contig LG50 Assembly QC http://purl.obolibrary.org/obo/IAO_0000428 Contig L99 is the number of contigs equal to or longer than contig N99. The length of the assembly itself is used in the calculation. Damion Dooley contig L99 A subject age at time of specimen extraction is the age (since birth) of the organism at the time a given specimen was extracted. 
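The contig N/NG/L definitions above can be made concrete with a short calculation. The following is a minimal sketch, assuming the usual reading of Nx as the length of the shortest contig in the smallest set of longest contigs that together cover a fraction x of the total length; the function names `nx_length` and `lx_count` are hypothetical. Passing a reference genome length as `total_length` gives the NG variant, and `lx_count` follows the definition given here (the number of contigs equal to or longer than the Nx length).

```python
from typing import Optional, Sequence


def nx_length(contig_lengths: Sequence[int], fraction: float,
              total_length: Optional[int] = None) -> Optional[int]:
    """Return the Nx contig length (e.g. fraction=0.5 for N50).

    If total_length is None the sum of contig lengths is used (N50, N99);
    supplying a reference genome length instead yields NG50, NG99.
    Returns None when the contigs never reach the threshold.
    """
    if total_length is None:
        total_length = sum(contig_lengths)
    threshold = fraction * total_length
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= threshold:
            return length
    return None


def lx_count(contig_lengths: Sequence[int], fraction: float,
             total_length: Optional[int] = None) -> Optional[int]:
    """Return the number of contigs equal to or longer than the Nx length."""
    nx = nx_length(contig_lengths, fraction, total_length)
    if nx is None:
        return None
    return sum(1 for length in contig_lengths if length >= nx)


lengths = [80, 70, 50, 40, 30, 20, 10]  # illustrative values only
print(nx_length(lengths, 0.5), lx_count(lengths, 0.5))
print(nx_length(lengths, 0.5, total_length=400))  # NG50 against a 400 bp reference
```

Note that when several contigs share the Nx length, counting all contigs at or above that length can give a slightly larger number than counting only the contigs needed to reach the threshold; the sketch follows the wording of the definitions here.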
Damion Dooley subject age at time of specimen collection A public health related data standard involving a particular species of pathogen. Damion Dooley pathogen specific data standard The draft tuberculosis contextual data standard is a data standard for the collection of contextual information of a given tuberculosis case that pertains to typing, clinical treatment, and potential research questions. Damion Dooley draft tuberculosis contextual data standard 1 PulseNet:Outbreak The PulseNet Canada Salmonella submission standard details the fields required in a PulseNet Canada pathogen submission report for Salmonella in spreadsheet format Damion Dooley draft PulseNet Canada Salmonella submission standard http://www.pulsenetinternational.org/assets/PulseNet/uploads/wgs/PND18-NCBI-Biosample-Submission.pdf A PFGE pattern resulting from DNA fragmentation induced by the Xbal enzyme. PFGE Xbal pattern A PFGE pattern resulting from DNA fragmentation induced by the Blnl enzyme. PFGE Blnl pattern NLEP ID An NLEP ID is a unique identifier code for a sample registered with the Canadian "National Laboratory for Enteric Pathogens" (NLEP) Damion Dooley Pulsenet Canada spreadsheet. National Laboratory for Enteric Pathogens Identifier PulseNet spreadsheet Date an uploaded sample was re-uploaded or had its metadata modified. Damion Dooley PulseNet: UploadModifiedDate upload modified date A travel destination is a destination city (or populated place of some scale) which a given human has travelled to on a particular trip. Damion Dooley travel destination A symptom duration is a date-time entity which is the duration (in hours or days or weeks etc.) that one or more symptoms persist for in a particular human or animal Damion Dooley This may be a calculated field if symptoms are reported to cease on a particular date. Often however symptom duration is reported more generally as a date or time range since cessation date is less certain.
symptom duration The city where a specimen was collected (taken) from a subject or environment Damion Dooley specimen collection location - city lookup Damion Dooley obsolete: sample city true This is the genus-level taxonomic name of an organism playing a pathogen role Damion Dooley genus of pathogen This is the species-level taxonomic name of an organism playing a pathogen role Damion Dooley species of pathogen A symptom cessation date is a date-time entity that marks the end of one or more reported symptoms pertaining to an episode of human or animal illness Damion Dooley symptom cessation date This is a catch-all category for listing specimen related terms Damion Dooley specimen datum http://purl.obolibrary.org/obo/GENEPIO_0002105 http://purl.obolibrary.org/obo/OBI_0001616 isolate identifier specification The 'data obfuscated' role between a user and given data entails that the user is authorized to see particular fields that have been transformed consistently such that statistics performed on those fields would yield the same results as statistics applied to the un-obfuscated data. The purpose of obfuscation is to prevent related individuals or organizations from being identified. Damion Dooley data obfuscated https://www.hhs.gov/hipaa/for-professionals/privacy/special-topics/de-identification/ The 'data aggregated' role between a user and given data entails that the user is authorized to see aggregated results (like the patient count) via preset reports, queries or views. The results are not obfuscated. Damion Dooley data aggregated A 'limited data set' role between a user and given data entails that a user is authorized to see some of the fields but the remainder are withheld (or encrypted). Damion Dooley limited data set help:In this dataset, selected data have been de-identified (anonymized) for access by dataset users.
A 'de-identified data' role between a user and given data entails that the user is authorized to see only those fields (including those that are encrypted) which do not contain information that could be used to identify particular individuals (whether patients or medical practitioners) or potentially organizations in the case where an investigation may implicate them. Damion Dooley's note: There is a case in the "draft tuberculosis contextual data standard" that has this be a boolean selection. If a categorical field is linked directly, and it has no underlings, this indicates that it is being treated as a boolean feature. Damion Dooley Generally de-identified data should include HIPAA patient privacy fields as indicated in https://www.hipaa.com/hipaa-protected-health-information-what-does-phi-include/ de-identified data A 'data protected' role between a user and given data entails that a user must be authorized in order to view that data. Damion Dooley data protected 1 1 A datum containing the name of persons or institute who collected a given specimen. Damion Dooley https://www.ncbi.nlm.nih.gov/biosample/docs/attributes/ specimen collector Damion Dooley NCBI BioSample culture identifier A user identifier is an identifier provided as part of an authentication process of a client (user) to a service (server). User identifiers normally persist over time, in contrast to session identifiers which are temporary. Damion Dooley user identifier sequence identifier metagenomic species identification Note: REPLACE WITH Citation: SIO_000174 ???? document citation identifier organization identifier collection identifier variant calling / pathogen identification and clustering isolate (pathogen) identification http://purl.obolibrary.org/obo/OBI_0001616 obsolete: NCBI BioSample ID - true 1) Escherichia coli O104:H4 str.
C227-11 clinical isolate 2010_333_NC-6 2) CD8+ T cells from female TSG6-knockout BALB/c mouse 3) Human metagenome isolated from urine of healthy female A specimen (or sample) title should be short and informative. Each specimen title must be unique in a submission. NCBI BioSample record documentation refers to a "BioSample title", stating that it is "auto-generated if one is not supplied by the submitter." This is distinct from any assigned BioSession accession, or other "external sample identifier" that may have been issued by the source database or repository. Damion Dooley specimen title draft GenEpiO isolate source context INSDC term (lower level) A categorical choice recorded when the data for a given measurable datum is available but not shared publicly because of information privacy concerns. restricted access State (United States of America) A health status trend is a short-term prognosis about whether health is improving, worsening or unchanging Damion Dooley health status trend datum http://www.itis.gov The Integrated Taxonomic Information System (ITIS) contains authoritative taxonomic information on plants, animals, fungi, and microbes of North America and the world. Damion Dooley Integrated Taxonomic Information System ITIS ITIS https://www.nlm.nih.gov/mesh/meshhome.html MeSH is the National Library of Medicine's controlled vocabulary thesaurus. It consists of sets of terms naming descriptors in a hierarchical structure that permits searching at various levels of specificity. Damion Dooley Medical Subject Headings MSH MESH https://ncit.nci.nih.gov/ncitbrowser/ The National Cancer Institute Thesaurus (NCIt) provides reference terminology for many NCI and other systems. It covers vocabulary for clinical care, translational and basic research, and public information and administrative activities. 
Damion Dooley National Cancer Institute Thesaurus NCI_Thesaurus NCIt NCIT The NIFSTD ontology of the Neuroscience Information Framework is composed of a collection of OWL modules with separate modules covering major domains of neuroscience: anatomy, cell, subcellular, molecule, organism, function and dysfunction. Damion Dooley https://wiki.nci.nih.gov/display/VKC/NIFSTD+Ontology NIFSTD SNOMED CT is a large clinical health terminology product owned and distributed by SNOMED International. Damion Dooley http://www.snomed.org/snomed-ct SNOMEDCT SNOMEDCT http://www.fao.org/fishery/collection/asfis/en ASFIS is a database curated by the United Nations Food And Agriculture Organization (FAO) Fisheries and Aquaculture Statistics and Information Branch (FIPS). It contains over 12,700 species items selected according to their interest or relation to fisheries and aquaculture. Damion Dooley UN FAO Fisheries and Aquaculture Statistics and Information Branch database FAO ASFIS ASFIS FTT TGN Wikipedia is a free online encyclopedia that aims to allow anyone to edit articles. Damion Dooley https://en.wikipedia.org/wiki/Wikipedia Wikipedia SWEETRealm The GeoNames geographical database covers all countries and contains over eleven million placenames that are available for download free of charge. Damion Dooley http://www.geonames.org/ Geonames http://ecolexicon.ugr.es/en/index.htm EcoLexicon is a terminological resource developed by the LexiCon Research Group at the University of Granada. Damion Dooley EcoLexicon A standard established for use by the Integrated Rapid Infectious Disease Analysis (IRIDA) project, which is a Canadian-led initiative to build an open source, end-to-end platform for infectious disease genomic epidemiology.
Damion Dooley http://www.irida.ca/ draft IRIDA standard ISO 3166-1 is part of the ISO 3166 standard published by the International Organization for Standardization (ISO), and defines codes for the names of countries, dependent territories, and special areas of geographical interest. Damion Dooley https://en.wikipedia.org/wiki/ISO_3166-2 ISO3166-1 ISO 3166-2 is part of the ISO 3166 standard published by the International Organization for Standardization (ISO), and defines codes for identifying the principal subdivisions (e.g., provinces or states) of all countries coded in ISO 3166-1. Damion Dooley https://en.wikipedia.org/wiki/ISO_3166-1 ISO3166-2 Information Technology – Codes For The Identification Of The States And Equivalent Areas Within The United States, Puerto Rico, And The Insular Areas Damion Dooley American National Standards Institute ANSI INCITS 38 Damion Dooley The Geographic Names Information System (GNIS), developed by the U.S. Geological Survey in cooperation with the U.S. Board on Geographic Names, contains information about physical and cultural geographic features in the United States and associated areas, both current and historical. 
https://nhd.usgs.gov/gnis.html GNISID LTER order: OBI:0001628 # investigation identifier OBI:0001616 # specimen identifier GENEPIO:0001808 # specimen title GENEPIO:0000113 # specimen category sep:00196 # specimen description OBI:0001479 # specimen from organism GENEPIO:0001640 # specimen organism taxonomy (species) GENEPIO:0000027 # specimen source context GENEPIO:0001722 # NCBI specimen voucher GENEPIO:0001429 # strain identifier GENEPIO:0001798 # NCBI BioSample culture identifier GENEPIO:0001717 # viral subtype OMP:0000207 # serotype phenotype SO:0001027 # genotype GENEPIO:0001079 # serovar GENEPIO:0001716 # subgroup GENEPIO:0001055 # NCBI pathotype GENEPIO:0001054 # isolate passage history GENEPIO:0001656 # latitude and longitude coordinate GENEPIO:0001657 # NCBI BioSample geo_loc_name model GENEPIO:0001051 # NCBI BioSample isolate (human name or description) GENEPIO:0001648 # NCBI Contributing organization / project GENEPIO:0001797 # specimen collector OBI:0001619 # specimen collection date OBI:0001888 # sequencing facility contact person GENEPIO:0001651 # INSDC institution code GENEPIO:0001721 # NCBI culture collection GENEPIO:0001567 # subject (host) taxonomic species GENEPIO:0001724 # subject organism common name GENEPIO:0001775 # subject age at time of specimen collection GENEPIO:0000026 # subject health status at time of specimen extraction GENEPIO:0000031 # subject sex GENEPIO:0001614 # subject description GENEPIO:0001617 # subject disease GENEPIO:0001625 # subject disease stage GENEPIO:0001212 # subject disease outcome A National Center for Biotechnology Information BioSample specification defines fields and terms that provide "descriptions of biological source materials used in experimental assays". 
Damion Dooley http://www.ncbi.nlm.nih.gov/biosample/ NCBI BioSample Pathogen.cl standard https://www.ncbi.nlm.nih.gov/biosample/docs/packages/Pathogen.cl.1.0/ 1 1 The National Center for Biotechnology Information BioProject standard is a collection of datums that describe biological data related to a single initiative, originating from a single organization or from a consortium. A BioProject record provides users a single place to find links to the diverse data types generated for that project. Damion Dooley http://www.ncbi.nlm.nih.gov/bioproject/ NCBI lists project data type and other controlled vocabulary fields at : https://www.ncbi.nlm.nih.gov/books/NBK54364/ NCBI BioProject standard 1 1 1 Objective The format of the BioProject Accession is five alpha-letters followed by one to six numbers. For example PRJNA43021 Project Accession Methodology http://www.aha.org/content/00-10/overview0302.pdf http://www.webcitation.org/5VBWPbFgY The American Hospital Association has issued HIPAA Updated Guidelines for Releasing Information on the Condition of Patients Damion Dooley American Hospital Association HIPAA Standard AHA HIPPA AHA HIPPA standard http://www.ncbi.nlm.nih.gov/pubmed/ PubMed comprises more than 26 million citations for biomedical literature from MEDLINE, life science journals, and online books. Citations may include links to full-text content from PubMed Central and publisher web sites. NCBI PubMed PMID PubMed PulseNet, a national and international surveillance system used to identify and respond to foodborne disease outbreaks, has a standardized laboratory method and data submission standard for pathogen isolate typing test results and related isolate contextual data. From this information clusters of disease can be identified that might represent unrecognized outbreaks. 
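The BioProject accession format stated above (five letters followed by one to six digits, e.g. PRJNA43021) translates directly into a pattern check. This is a sketch of that stated rule only; the name `is_bioproject_accession` is hypothetical, uppercase letters are assumed from the example, and accessions issued by NCBI may not all fit the one-to-six digit limit, so the bound should be verified against current NCBI documentation before use.

```python
import re

# Five uppercase letters followed by one to six digits, as worded above.
# The uppercase restriction and the digit limit are assumptions taken
# from this text, not from NCBI's own validation rules.
BIOPROJECT_ACCESSION = re.compile(r"[A-Z]{5}[0-9]{1,6}")


def is_bioproject_accession(value: str) -> bool:
    """Return True if value matches the accession shape described above."""
    return BIOPROJECT_ACCESSION.fullmatch(value) is not None


print(is_bioproject_accession("PRJNA43021"))
print(is_bioproject_accession("PRJ43021"))
```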
Pulse Net International BioSample Metadata PulseNet PulseNet submission standard AHA HIPPA: http://www.aha.org/content/00-10/overview0302.pdf Good - Vital signs are stable and within normal limits. Patient is conscious and comfortable. Indicators are excellent. Damion Dooley AHA Hippa: Good good health status Good satisfactory Fair - Vital signs are stable and within normal limits. Patient is conscious, but may be uncomfortable. Indicators are favorable. Damion Dooley fair health status fair Stable Serious - Vital signs may be unstable and not within normal limits. Patient is acutely ill. Indicators are questionable. Damion Dooley serious health status Serious A health status history datum is a record of the longer-term pattern or presence of a given patient's health status. Damion Dooley health status history datum (0,0) is the origin of the cartesian coordinate system. "In mathematics, the origin of a Euclidean space is a special point, usually denoted by the letter O, used as a fixed point of reference for the geometry of the surrounding space. In physical problems, the choice of origin is often arbitrary, meaning any choice of origin will ultimately give the same answer. This allows one to pick an origin point that makes the mathematics as simple as possible, often by taking advantage of some kind of geometric symmetry." Damion Dooley https://en.wikipedia.org/wiki/Origin_(mathematics) origin https://en.wikipedia.org/wiki/Medical_state United Kingdom National Health Service trusts, boards, etc. have some terminology in common for public disclosure of patient health conditions. Damion Dooley National Health Service terminology NHS standard Damion Dooley Damion Dooley's note: This is a contentious term. AHA calls for it never to be used, since the common phrase "critical but stable" is misleading by the very nature of "critical". The NHS however has it as one of its central terms. The use of "stable" also points to a health trend, rather than a state. 
A separate "health status trend datum" has been set up to capture this. critical but stable health status critical but stable NHS Terms: https://en.wikipedia.org/wiki/Medical_state Damion Dooley comfortable health status Comfortable Damion Dooley progressing well health status progressing well Damion Dooley Damion Dooley note: This term (weakly) implies that the patient is recovered enough to be released from hospital. Technically this isn't a health status though, it is a patient medical service state. discharged from hospital discharged A data item which is about the collection state of a datum at some point in time. Damion Dooley note: We'll need to incorporate NIAID GSCID-BRC Sample data standards mention: Unknown/Not Applicable/Censored Allowed Unknown/not applicable/not available/ available upon request Damion Dooley A given datum's datum status value is metadata, i.e. a statement about a particular datum, often when no actual data entry or categorical choice for that datum has been made. datum status INSDC missing value reporting terms: http://www.ebi.ac.uk/ena/about/missing-values-reporting A categorical choice indicating that a datum is available for retrieval. Damion Dooley In theory any datum that has a recorded value has 'recorded' for its datum status. recorded Patient is awaiting physician and/or assessment. Very similar to 'datum status' 'in process' status. Damion Dooley undetermined health status - awaiting assessment Damion Dooley deteriorating worsening health trend This covers the description of data and abstract structure in n-dimensional space. Within space there are coordinate systems, coordinates, and entities made up of coordinates. 
Damion Dooley /^[N+](:<lat>\d(\.\d+)?)$/, $lat /^S(:<lat>\d(\.\d+)?)$/, '-' + $lat /^(:<lat>[-S+N])?(:<degree>[1-8]?[0-9](:<decimal>\.\d{1,6})?)$/, $lat$degree.$decimal /^(:<lat>[-S+N])?(:<degree>90(:<decimal>\.0{1,6})?))$/, $lat$degree.$decimal /^(\d(\.\d+)?)$/, '+\1' A decimal latitude measurement in degrees in conformance with the ISO 6709 standard. This is failing Hermit reasoner since it rejects totalDigits, and fractionDigits: 'has primitive data type' exactly 1 xsd:decimal[>= -90.0 , <= 90.0 , totalDigits 8 , fractionDigits 6]' Damion Dooley /^(:<long>[-W+E])?(:<degree>(1[0-7]|[1-9])?[0-9])(:<decimal>\.\d{1,6})?)$/, $long$degree.$decimal /^(:<long>[-W+E])?(:<degree>180(:<decimal>\.0{1,6})?)$,$long$degree.$decimal A decimal longitude measurement in degrees in conformance with the ISO 6709 standard. Damion Dooley Damion Dooley A 2D coordinate system is a coordinate system in 2 dimensional space. Each coordinate is composed of an offset from the origin of each dimension Damion Dooley Damion Dooley Damion Dooley A datetime range intermediate datum is a datetime item grouped under a process, and occurring after some process start and before some process end. Damion Dooley datetime range intermediate Damion Dooley datetime range item The ending date of a process datetime range end Damion Dooley The starting date of a process datetime range start Damion Dooley Damion Dooley A designated area on earth is one or more areas defined by one or more boundaries. A boundary can be defined by a polygon perimeter, a lat/long and radius, or a fiat-boundary geographic featureset like a named river channel or mountain ridge; it may have one or more names associated with it. Note that areas associated with a name may change their boundary definitions over time. Damion Dooley A Canadian postal code is a postal code that pertains to a geographic region of Canada. Note that a Canadian postal code can cover multiple communities.
Secondly, postal codes can be retired, and then reintroduced elsewhere. Historical data associations may need to take the date of postal code mapping into account. A "USA zip code" is a five digit (Zone Improvement Plan) or 9 digit postal code pertaining to a region of the United States of America. The optional trailing 4 digits of a 9 digit zip code provide greater granularity of delivery target. Damion Dooley http://purl.obolibrary.org/obo/NCIT_C25720 obsolete: US postal code true Damion Dooley entity history record Damion Dooley generic range history record Damion Dooley A set does not have a direct distance metric between its members. Damion Dooley Damion Dooley Damion Dooley Damion Dooley A product name and model number of a manufacturer's genomic (dna) sequencer. Damion Dooley sequencing instrument model 1 1 A circle is a simple closed shape in Euclidean geometry. It is the set of all points in a plane that are at a given distance from a given point, the centre; equivalently it is the curve traced out by a point that moves so that its distance from a given point is constant. Damion Dooley https://en.wikipedia.org/wiki/Circle circle NCBI SRA: ILLUMINA Illumina platform NCBI SRA: COMPLETE_GENOMICS Damion Dooley's note: This platform is in-house and not a product per se. Complete Genomics states: "Complete Genomics’ sequencing platform employs high-density DNA nanoarrays that are populated with DNA nanoballs (DNBs™). Base identification is performed using an unchained ligation-based read technology known as combinatorial probe-anchor ligation (cPAL™). The sequencing instrumentation is custom-developed to support this process. Details are described in our Science publication (Drmanac et al., 2010)." 
Complete Genomics platform 0 1 Damion Dooley earth area boundary census NCBI SRA: LS454 Roche LS454 platform NCBI SRA: PACBIO_SMRT PacBio platform NCBI SRA: AB SOLiD 4hq SOLiD 4hq System NCBI SRA: AB SOLiD 5500 SOLiD 5500 NCBI SRA: AB SOLiD 5500xl SOLiD 5500xl NCBI SRA: AB SOLiD PI System SOLiD PI System NCBI SRA: AB SOLiD System 2.0 SOLiD System 2.0 NCBI SRA: AB SOLiD System 3.0 SOLiD System 3.0 A two dimensional entity is an entity composed of one or more points on a two dimensional surface. Damion Dooley 2D entity NCBI SRA: Ion Torrent PGM NCBI SRA: 454 GS FLX Titanium 454 Genome Sequencer FLX Titanium NCBI SRA: 454 GS 454 Genome Sequencer NCBI SRA: 454 GS Junior 454 Genome Sequencer Junior https://www.ncbi.nlm.nih.gov/books/NBK54984/table/SRA_Glossary_BK.T._library_descriptor_te/?report=objectonly Whether any method was used to select for or against, enrich, or screen the material being sequenced. Damion Dooley NCBI SRA: LIBRARY_SELECTION library selection Selection of methylated DNA fragments using an antibody raised against 5-methylcytosine or 5-methylcytidine (m5C). NCBI SRA: 5-methylcytidine antibody 5-methylcytidine antibody method Cap-analysis gene expression. NCBI SRA: CAGE CAGE method Cot-filtered highly repetitive genomic DNA NCBI SRA: CF-H CF-H method Cot-filtered moderately repetitive genomic DNA NCBI SRA: CF-M CF-M method Cot-filtered single/low-copy genomic DNA NCBI SRA: CF-S CF-S method Cot-filtered theoretical single-copy genomic DNA NCBI SRA: CF-T CF-T method Chromatin immunoprecipitation NCBI SRA: ChIP ChIP method Deoxyribonuclease (DNase) digestion NCBI SRA: DNAse DNAse method Hypo-methylated partial restriction digest NCBI SRA: HMPR HMPR method Selection by hybridization in array or solution. NCBI SRA: Hybrid Selection Hybrid Selection method Enrichment by methyl-CpG binding domain. 
NCBI SRA: MBD2 protein methyl-CpG binding domain MBD2 protein methyl-CpG binding domain method Methyl Filtrated NCBI SRA: MF MF method Micrococcal Nuclease (MNase) digestion NCBI SRA: MNase MNase method Methylation Spanning Linking Library NCBI SRA: MSLL MSLL method Source material was selected by designed primers. NCBI SRA: PCR PCR method Rapid Amplification of cDNA Ends. NCBI SRA: RACE RACE method Source material was selected by randomly generated primers. NCBI SRA: RANDOM PCR RANDOM PCR method Random selection by shearing or other method. NCBI SRA: RANDOM RANDOM method Source material was selected by reverse transcription PCR NCBI SRA: RT-PCR RT-PCR method Reproducible genomic subsets, often generated by restriction fragment size selection, containing a manageable number of loci to facilitate re-sampling. NCBI SRA: Reduced Representation Reduced Representation method DNA fractionation using restriction enzymes. NCBI SRA: Restriction Digest Restriction Digest method PolyA selection or enrichment for messenger RNA (mRNA). complementary DNA. NCBI SRA: cDNA cDNA method Physical selection of size appropriate targets. NCBI SRA: size fractionation size fractionation method Other library enrichment, screening, or selection process. NCBI SRA: other other library method https://www.ncbi.nlm.nih.gov/books/NBK54984/table/SRA_Glossary_BK.T._library_descriptor_te/?report=objectonly The library source specifies the type of source material that is being sequenced. Damion Dooley library source Genomic DNA (includes PCR products from genomic DNA). Damion Dooley NCBI SRA: GENOMIC genomic source Mixed material from metagenome. Damion Dooley NCBI SRA: METAGENOMIC metagenomic source Transcription products from community targets Damion Dooley NCBI SRA: METATRANSCRIPTOMIC metatranscriptomic source Other, unspecified, or unknown library source material. Damion Dooley NCBI SRA: OTHER other library source Synthetic DNA. 
Damion Dooley NCBI SRA: SYNTHETIC synthetic source Transcription products or non genomic DNA (EST, cDNA, RT-PCR, screened libraries). Damion Dooley NCBI SRA: TRANSCRIPTOMIC transcriptomic source Viral RNA. Damion Dooley NCBI SRA: VIRAL RNA viral RNA source Sequencing technique intended for this library. Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54984/table/SRA_Glossary_BK.T._library_descriptor_te/?report=objectonly library strategy Sequencing of overlapping or distinct PCR or RT-PCR products. For example, metagenomic community profiling using SSU rRNA NCBI SRA: AMPLICON amplicon strategy MethylC-seq. Sequencing following treatment of DNA with bisulfite to convert cytosine residues to uracil depending on methylation status. NCBI SRA: Bisulfite-Seq Bisulfite-Seq strategy Clone end (5', 3', or both) sequencing. NCBI SRA: CLONEEND clone end strategy Genomic clone based (hierarchical) sequencing. NCBI SRA: CLONE clone strategy Concatenated Tag Sequencing NCBI SRA: CTS CTS strategy Direct sequencing of chromatin immunoprecipitates. NCBI SRA: ChIP-Seq ChIP-Seq strategy Sequencing of hypersensitive sites, or segments of open chromatin that are more readily cleaved by DNaseI. NCBI SRA: DNase-Hypersensitivity DNase-Hypersensitivity strategy Single pass sequencing of cDNA templates NCBI SRA: EST EST strategy Sequencing intended to finish (close) gaps in existing coverage. NCBI SRA: FINISHING finishing strategy Full-length sequencing of cDNA templates NCBI SRA: FL-cDNA FL-cDNA strategy Direct sequencing of methylated fractions sequencing strategy. NCBI SRA: MBD-Seq MBD-Seq strategy Direct sequencing following MNase digestion. NCBI SRA: MNase-Seq MNase-Seq strategy Methylation-Sensitive Restriction Enzyme Sequencing strategy. NCBI SRA: MRE-Seq MRE-Seq strategy Methylated DNA Immunoprecipitation Sequencing strategy. NCBI SRA: MeDIP-Seq MeDIP-Seq strategy Library strategy not listed. 
NCBI SRA: OTHER other library strategy Shotgun of pooled clones (usually BACs and Fosmids). NCBI SRA: POOLCLONE pool clone strategy Random sequencing of whole transcriptome. NCBI SRA: RNA-Seq RNA-Seq strategy Random sequencing of a whole chromosome or other replicon isolated from a genome. NCBI SRA: WCS WCS strategy Random sequencing of the whole genome. NCBI SRA: WGS WGS strategy Random sequencing of exonic regions selected from the genome. NCBI SRA: WXS WXS strategy Free form text describing the protocol by which the sequencing library was constructed. Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54984/table/SRA_Glossary_BK.T._library_descriptor_te/?report=objectonly library construction protocol The submitter's name for this library. Damion Dooley https://www.ncbi.nlm.nih.gov/books/NBK54984/table/SRA_Glossary_BK.T._library_descriptor_te/?report=objectonly NCBI SRA: LIBRARY_NAME library name A two dimensional boundary is a continuous path (loop) defined on a two dimensional surface. Damion Dooley 2D boundary cfu/100mL A colony forming unit count which is a count of viable bacterial numbers in 100 milliliters of liquid. Should this be deprecated? Need to find working examples that actually use / 100mL. Damion Dooley colony forming unit per 100 milliliter A multiplier in the decimal number system that has been applied to an accompanying number or set of numbers, often to simplify their appearance. Numbers described by multipliers may have lost significant digits through rounding or trimming. 
Damion Dooley decimal quantity unit draft tuberculosis - dataset features and provenance draft tuberculosis - case specimen characteristics draft tuberculosis - case subject characteristics draft tuberculosis - case risk factors draft tuberculosis - case clinical presentation draft tuberculosis - case laboratory testing draft tuberculosis - case medication draft isolate whole genome sequencing model draft tuberculosis - case model A Mycobacterium tuberculosis lineage is a named family of strains of tuberculosis, named usually with reference to their shared geographic origin. Damion Dooley Mycobacterium tuberculosis lineage A TB-Lineage identification tool using spoligotypes and MIRU data: http://tbinsight.cs.rpi.edu/run_tb_lineage.html Out-of-Africa migration and Neolithic coexpansion of Mycobacterium tuberculosis with modern humans http://www.nature.com/ng/journal/v45/n10/full/ng.2744.html TB lineage 2 - East-Asian, includes Beijing TB lineage 3 - East-African Indian, includes CAS/Delhi TB lineage 4 - Euro-American, includes Haarlem, LAM3, X, T, F11, H37Rv TB lineage 1 - Indo-Oceanic TB lineage 5 - West African 1 TB lineage 6 - West African 2 Damion Dooley An isolate preparation facility is a type of organization that provides the service of isolating organisms of interest from provided specimens. isolate preparation facility plant product geospatial name A personal health datum is a datum that pertains to the health situation or history of an individual. 
Personal health datum covers the following information 'has member' some 'analytic datum' 'has member' some 'behavioural risk factor' 'has member' some 'country of birth' 'has member' some 'dietary restrictions' 'has member' some 'Disease Attributes' 'has member' some 'health status history datum' 'has member' some 'health status trend datum' 'has member' some 'host hospital service status' 'has member' some 'subject disease stage' 'has member' some 'subject health status' 'has member' some 'subject health status (AHA)' 'has member' some 'subject health status (GSCID-BRC)' 'has member' some 'subject health status (NCIT)' 'has member' some 'subject health status (UK NHS)' 'has member' some 'subject health status at time of specimen extraction' 'has member' some 'tuberculosis disease anatomical site' 'has member' some comorbidity 'has member' some disease 'has member' some symptom Damion Dooley personal health datum A host hospital service status indicates the state of a host's interaction with hospital medical services. Damion Dooley host hospital service status Damion Dooley admitted to hospital Received treatment but not admitted. treated and released (not admitted to hospital) Treated and Released Received treatment. Transferred to a different facility. treated and transferred Treated and Transferred Damion Dooley satisfactory health status satisfactory depth:1 subject health status (UK NHS) depth:1 subject health status (AHA) http://purl.obolibrary.org/obo/GENEPIO_0002019 stable health trend http://purl.obolibrary.org/obo/GENEPIO_0001669 An ordinal variable is a categorical variable whose values have sequential order, but no distance metric between them. 
Damion Dooley http://purl.obolibrary.org/obo/STATO_0000228 obsolete: ordinal variable true mapping (reference based) data protection item subject provided consent Damion Dooley An ordinal tree specification is a value specification entity whose subclasses each represent a possible value selection, and which have a linear order with respect to their siblings. ordinal value specification consent datum agency provided consent 2019-11-17T00:43:00Z draft ISO sequence repository field Centre of circle, ellipse or parabola; centre of mass. A centre point is a point used in the description of an n-dimensional entity from which other (given) aspects of the entity are most efficiently described. A point of symmetry. The entity must have only one such point. Probably a better formal math defn? Damion Dooley centre point This covers the following entities: 'has member' some 'antimicrobial phenotype' 'has member' some 'antimicrobial resistance test drug' 'has member' some 'antimicrobial resistance test platform' 'has member' some 'antimicrobial resistance testing method' 'has member' some 'antimicrobial resistance testing method version or reagent' 'has member' some 'antimicrobial resistance testing reference standard' 'has member' some 'antimicrobial resistance testing reference standard version' 'has member' some 'antimicrobial resistance tissue specificity' 'has member' some 'antimicrobial resistant plasmid type' 'has member' some 'automated testing platform vendor' 'has member' some 'draft antibiogram drug test model' 'has member' some 'in-vitro microbial susceptibility test' 'has member' some 'measurement comparator' 'has member' some 'MIC unit' 'has member' some 'MIC value' 'has member' some 'Minimum Inhibitory Concentration Test' 'has member' some 'test drug maximum concentration' 'has member' some 'test drug minimum concentration' 'has member' some 'tuberculosis treatment drug' Damion Dooley This is a collection of antibiotic resistance process parameters and outputs 
phenotypic antimicrobial drug susceptibility test datum The resistance phenotype of an isolate represents the interpretation of an MIC value with regard to some breakpoint threshold e.g. resistant (R), sensitive (S), intermediate (I), wild type (WT), or non-wild type (NWT) Damion Dooley http://purl.obolibrary.org/obo/NCIT_C85539 obsolete: antimicrobial resistance phenotype true Intermediate resistance indicates that an isolate is observed to offer an intermediate level of resistance to a given antibiotic. Damion Dooley http://purl.obolibrary.org/obo/NCIT_C85560 obsolete: antibiotic intermediate resistance (I) true Non-susceptible indicates that an isolate is observed to offer complete resistance to a given antibiotic. Damion Dooley http://purl.obolibrary.org/obo/NCIT_C85561 obsolete: antibiotic nonsensitive (high resistance) true A not defined resistance indicates that an isolate was not tested against a given antibiotic, or the result of the test was inconclusive. Damion Dooley note: Is this the same as "indeterminate"? Damion Dooley N ND antibiotic resistance not defined Resistant indicates that an isolate is observed to offer a greater than intermediate level of resistance to a given antibiotic. Damion Dooley http://purl.obolibrary.org/obo/NCIT_C85562 obsolete: antibiotic resistant (R) true Susceptible indicates that an isolate is observed to offer a detectable but less than intermediate level of resistance to a given antibiotic. Damion Dooley http://purl.obolibrary.org/obo/NCIT_C85563 obsolete: antibiotic sensitive (S) true Susceptible - dose dependent indicates that an isolate is observed to offer resistance to a given antibiotic when a sufficient dose is given. Damion Dooley http://purl.obolibrary.org/obo/NCIT_C85559 obsolete: sensitive - dose dependent true A MIC test measurement resulting from a laboratory dilution method. Damion Dooley's note: is this still useful? 
Currently "MIC dilution measurement specification' refers directly via 'is about' to "broth dilution method" but that may be too specific. Damion Dooley MIC dilution measurement datum The type of assay used to determine the minimum inhibitory concentration. Damion Dooley http://purl.obolibrary.org/obo/NCIT_C85540 obsolete: antimicrobial resistance testing method true http://purl.obolibrary.org/obo/NCIT_C85555 obsolete: agar dilution evidence true The commercial kit or product used to determine the MIC e.g. E-Test. If a commercial product was not used, include the type of media used. Damion Dooley antimicrobial resistance testing method device or reagent antimicrobial resistant plasmid type The instrumentation used to determine minimum inhibitory concentration values. Damion Dooley http://purl.obolibrary.org/obo/ARO_3004390 obsolete: antimicrobial resistance testing platform true http://purl.obolibrary.org/obo/ARO_3004400 obsolete: Microscan true http://purl.obolibrary.org/obo/ARO_3004401 obsolete: Phoenix true http://purl.obolibrary.org/obo/ARO_3004402 obsolete: Sensititre true http://purl.obolibrary.org/obo/ARO_3004403 obsolete: Vitek true http://purl.obolibrary.org/obo/ARO_3004409 obsolete: Trek true obsolete: manual - noncommercial true The manufacturer of an antibiotic resistance lab typing platform Damion Dooley http://purl.obolibrary.org/obo/ARO_3004404 obsolete: antimicrobial resistance testing platform vendor true http://purl.obolibrary.org/obo/ARO_3004407 obsolete: Siemens true http://purl.obolibrary.org/obo/ARO_3004406 obsolete: Biomérieux true http://purl.obolibrary.org/obo/ARO_3004405 obsolete: Becton Dickinson true A manual in vitro diagnostic device used by laboratories to determine the MIC (Minimum Inhibitory Concentration) and whether or not a specific strain of bacterium or fungus is susceptible to the action of a specific antimicrobial. 
Damion Dooley E-Test Epsilometer test Etest (device) GM-NEG The clinical and laboratory guidelines or standards that prescribe the threshold values for determining resistance phenotypes. Damion Dooley http://purl.obolibrary.org/obo/ARO_3004360 obsolete: antimicrobial resistance testing reference standard true BSAC is a British inter-professional organisation involved in antibiotic education, research and leadership Damion Dooley BSAC obsolete: British Society for Antimicrobial Chemotherapy (BSAC) true The Clinical and Laboratory Standards Institute develops and implements clinical laboratory testing standards Damion Dooley NCCLS http://clsi.org CLSI On January 1, 2005 the National Committee on Clinical Laboratory Standards (NCCLS) changed its name to CLSI obsolete: Clinical and Laboratory Standards Institute (CLSI) true DIN is recognized by the Federal Government of Germany as the competent standards organization for Germany and as the national standards body representing Germany in non-governmental international standards organizations. Damion Dooley http://www.din.de/en/din-and-our-partners/din-e-v DIN Deutsches Institut für Normung obsolete: German Institute for Standardization (DIN) true EUCAST is a standing committee jointly organized by ESCMID, ECDC and European national breakpoint committees; it deals with breakpoints and technical aspects of phenotypic in vitro antimicrobial susceptibility testing. Damion Dooley http://www.eucast.org/ EUCAST obsolete: European Committe on Antimicrobial Susceptibility Testing (EUCAST) true Damion Dooley NCCLS obsolete: National Committee on Clinical Laboratory Standards (NCCLS) true La SFM a vocation à rassembler les microbiologistes de France et des pays francophones, travaillant dans les différents domaines de la microbiologie médicale, industrielle, et environnementale, en physiologie, génétique, taxonomie, hygiène, agents antimicrobiens, ...concernant les bactéries, virus, champignons et parasites. 
The French Society of Microbiology (SFM) is a non-profit association which aims to bring together microbiologists from French-speaking countries, working in the domains of bacteria, viruses, fungi and parasites, and related medical, industrial and environmental microbiology, physiology, genetics, taxonomy, hygiene, and antimicrobial agents. This isn't at the moment considered a valid EBI antibiogram choice? Damion Dooley http://www.sfm-microbiologie.org/ SFM obsolete: Société Française de Microbiologie (SFM) true Replaced by domain appropriate term created in the Comprehensive Antibiotic Resistance Database (CARD). obsolete: SIR true Replaced by domain appropriate term created in the Comprehensive Antibiotic Resistance Database (CARD). obsolete: WRG true first line tuberculosis drug last resort antibiotic NIAID standard NIAID provides The GSCID/BRC Project and Sample Application Standard. It is designed to capture standardized human pathogen and vector sequencing metadata to support epidemiologic and genotype-phenotype association studies for human infectious diseases. Damion Dooley https://www.niaid.nih.gov/research/human-pathogen-and-vector-sequencing-metadata-standards National Institute of Allergy and Infectious Disease standard A RefSeq accession identifier is an accession identifier for selected NCBI-curated genomic, transcript and protein sequences. These identifiers are distinct from DDBJ/EMBL/GenBank identifiers. Damion Dooley RefSeq accession identifier http://www.ncbi.nlm.nih.gov/refseq/about/ A Genomic Standards Consortium standard is an international, community-driven standard to facilitate genomic data integration, discovery and comparison. Damion Dooley Genomic Standards Consortium (GSC) standard http://gensc.org/ A landform, biome or built environment at the site that a given specimen was collected from. Damion Dooley specimen collection site - geographic feature A particular human patient, or a given animal or plant. 
An investigative subject which is an organism. Damion Dooley subject organism help=The number of passages should be expressed as a numerical value. The number of serial iterations that an isolate is grown in one environment. Damion Dooley number of passages help:The passage protocol should include, when applicable, inoculum size, media type, temperature and duration of incubation. A passage protocol should detail the number of serial iterations that an isolate is grown in one environment, and the conditions of that environment. Damion Dooley passage protocol The unit provided for an antibiotic drug test dosage. Damion Dooley http://purl.obolibrary.org/obo/ARO_3004372 obsolete: drug minimum inhibitory concentration unit true order: GENEPIO:0001237 # specimen type (host or environmental context) OBI:0001619 # specimen collection date OBI:0000659 # specimen collection process GENEPIO:0002094 # specimen collection device GENEPIO:00001600 # draft sequence repository data - geographic location ENVO:00010483 # environmental material ENVO:00002297 # environmental feature GENEPIO:0001567 # subject (host) taxonomic species GENEPIO:0000110 # subject health status GENEPIO:0001617 # subject disease GENEPIO:0000028 # subject body product GENEPIO:0000025 # specimen source anatomical site FOODON:03411564 # draft sequence repository data - food processing draft sequence repository data - specimen collection order: GENEPIO:0002105 # Specimen processing organization role NCIT:C93874 # organization name NCIT:C40974 # first name NCIT:C40975 # last name NCIT:C93582 # job title NCIT:C42775 # email address GENEPIO:0001756 # phone NCIT:C25464 # Country ENVO:00000005 # major administrative subdivision -> GAZ state/province/territory/region) NCIT:C80234 # Municipality” (via GenEpiO -> GAZ cities) NCIT:C25690 # Street Address Contact information for an organization related to specimen collection, processing or storage. 
Damion Dooley contact specification - specimen related organization dateFormat:ISO 8601 order: GENEPIO:0002082 # specimen related organization GENEPIO:0002081 # specimen collection GENEPIO:0002086 # geographic location GENEPIO:0002084 # isolate GENEPIO:0002087 # isolate history GENEPIO:0002106 # food specimen GENEPIO:0002088 # antibiogram GENEPIO:0002085 # sequencing GENEPIO:0002090 # sequencing QC GENEPIO:0002089 # virulence GENEPIO:0002092 # submission to EBI GENEPIO:0002091 # submission to NCBI/DDBJ This draft specification provides a collection of fields related to the contextual data of a specimen, its genomic sequencing, and its pathogenic epidemiology. Damion Dooley draft sequence repository contextual data standard order: GENEPIO:0001640 # specimen taxonomy GENEPIO:0001429 # strain identifier GENEPIO:0001644 # isolate identifier OBI:0000079 # culture medium GENEPIO:0002107 # isolation medium draft sequence repository data - isolate order: GENEPIO:0000069 # sequencing date OBI:0000079 # culture medium” (via MicrO, yields a list of formulas). 
GENEPIO:0002093 # sequencing dna extraction method EFO:0000683 # replicate GENEPIO:0000085 # library preparation kit GENEPIO:0000149 # library preparation kit version GENEPIO:0000075 # sequencing chemistry GENEPIO:0001921 # sequencing instrument model GENEPIO:0000150 # read trimming and filtering software GENEPIO:0000084 # read adapter trimming software GENEPIO:0002095 # read paired-end merging software GENEPIO:0000090 # assembly method GENEPIO:0000151 # bioinformatics pipeline name GENEPIO:0000153 # bioinformatics pipeline protocol GENEPIO:0000152 # bioinformatics pipeline version GENEPIO:0000095 # genome annotation algorithm draft sequence repository data - sequencing order: GENEPIO_0001656 # latitude and longitude ISO 6709 GENEPIO:0000118 # specimen collection country GENEPIO:0002097 # specimen collection state/prov/ter/region GENEPIO:0001785 # specimen collection city The geographical origin of the sample by city, region, country or latitude and longitude. Damion Dooley draft sequence repository data - geographic location order: GENEPIO:0002078 # number of passages GENEPIO:0002079 # passage protocol draft sequence repository data - isolate passage history order: GENEPIO:0001100 # antimicrobial drug tests GENEPIO:0002101 # test drug minimum concentration GENEPIO:0002100 # test drug maximum concentration GENEPIO:0002045 # antimicrobial resistance testing method GENEPIO:0002047 # antimicrobial testing method version or reagent GENEPIO:0002062 # antimicrobial resistance testing reference standard GENEPIO:0002049 # antimicrobial resistance testing platform A dataset of the minimal inhibitory concentrations (value, unit, sign (<,>, =)) and resistance phenotypes (resistant, sensitive or undetermined) of the sequenced isolate against different antibiotics tested. 
Damion Dooley draft sequence repository data - antibiogram order: GENEPIO:0002130 # Virulence factor name GENEPIO:0002131 # Virulence testing protocol GENEPIO:0002132 # detection limit The virulence factors determined to be present in the sequenced isolate by phenotypic or target amplification methods e.g. Shiga toxins, hemolysins. Damion Dooley draft sequence repository data - virulence order: OBI:0001941 # N50 GENEPIO_0000092 # sequencing depth / read coverage Measurements or calculated quantities used to assess the extent and success of the sequence assembly process. Metric thresholds are species-specific. Damion Dooley draft sequence repository data - sequencing quality control order: OBI:0001616 # specimen identifier GENEPIO:0001644 # isolate identifier GENEPIO:0001429 # strain identifier GENEPIO:0001640 # specimen organism taxonomy GENEPIO:0000113 # specimen category OBI:0001619 # specimen collection date GENEPIO:0001797 # specimen collector GENEPIO:0001657 # geo_loc_name GENEPIO:0001656 # latitude and longitude coord GENEPIO:0000027 # specimen source context GENEPIO:0001567 # subject organism taxonomy GENEPIO:0001724 # subject organism common name GENEPIO:0001617 # subject disease Damion Dooley draft sequence repository data - submission to NCBI/DDBJ order: GENEPIO:0001644 # isolate identifier GENEPIO:0001429 # strain identifier GENEPIO:0001640 # specimen organism taxonomy GENEPIO:0001079 # serovar GENEPIO:0002135 # EBI environmental feature GENEPIO:0000113 # specimen category OBI:0001619 # specimen collection date GENEPIO:0001797 # specimen collector GENEPIO:0002086 # geographic location GENEPIO:0000027 # specimen source context GENEPIO:0002076 # geographic feature GENEPIO:0001567 # subject taxonomy GENEPIO:0000110 # subject health status Damion Dooley draft sequence repository data - submission to EBI https://www.ebi.ac.uk/ena/submit/pathogen-data The procedure used to obtain genomic DNA from a sample through chemical, physical or mechanical means. 
Damion Dooley sequencing dna extraction method datum Moved term to OBI, deprecated in favour of the OBI version. obsolete: specimen collection device true The name and version of the software used to merge paired-end reads before assembly. read paired-end merging software Sequencing The name and version of the software product used for removal of adapter sequences from demultiplexed data reads GROUP: IRIDA Ontology (Emma) GENEPIO read adapter trimming software An ontology identifier or textual name of the most precise geographical location available for the site of a specimen collection event. Damion Dooley Adapted from NIAID GSCID-BRC metadata working group Note that this is expressed in a particular format in some standards. specimen collection location - state/province/territory/region lookup A wild type (WT) resistance phenotype indicates a bacterial pathogen belongs to a "Naïve", susceptible wild-type population with respect to a given antimicrobial. No acquired or mutational resistance mechanisms are present to the antimicrobial in question. Damion Dooley https://sisu.ut.ee/sites/default/files/amr/files/pab_new_legislation_on_amr.pdf http://purl.obolibrary.org/obo/ARO_3004432 obsolete: wild type (WT) true A non-wild type (NWT) resistance phenotype indicates a bacterial pathogen has an acquired or mutational resistance mechanism present with respect to the antimicrobial in question. The bacterium has a reduced susceptibility to this agent. Damion Dooley https://sisu.ut.ee/sites/default/files/amr/files/pab_new_legislation_on_amr.pdf http://purl.obolibrary.org/obo/ARO_3004433 obsolete: non-wild type (NWT) true This is the maximum concentration of a drug applied during an assay. Damion Dooley test drug maximum concentration This is the minimum concentration of a drug applied during an assay. Damion Dooley test drug minimum concentration The tissue type used to select breakpoints from a particular standard, for the interpretation of MIC results. 
Damion Dooley http://purl.obolibrary.org/obo/ARO_3004430 obsolete: antimicrobial resistance tissue specificity true depth:1 order: NCIT:C115935 # healthy NCIT:C126054 # non-pathalogical NCIT:C25610 # pathologic NCIT:C82508 # life threatening NCIT:C28554 # deceased subject health status (NCIT) order: NCIT:C40974 # first name NCIT:C40975 # last name NCIT:C93582 # job title NCIT:C42775 # email address GENEPIO:0001756 # phone contact specification - professional role An organization related to the collection, isolate production, sequencing or storage of specimen material or data. Damion Dooley specimen processing organization type order: GENEPIO:0001533 # food specimen type FOODON:03450002 # food cooking method FOODON:03440011 # extent of heat treatment FOODON:03430113 # physical state, shape or form FOODON:03470107 # food preservation method FOODON:03490100 # container or wrapping FOODON:03480020 # food packing medium FOODON:03500010 # food contact surface GENEPIO:0000139 # food cultural origin FOODON:03460111 # treatment applied to food This specification offers food source, product type, packaging and processing attributes relevant to describing a food specimen. draft sequence repository data - food specimen An isolation medium is a culture medium which has the disposition to encourage growth of particular bacteria to the exclusion of others in the same growth environment. Chris has proposed this term for OBI as "enrichment culture medium" Damion Dooley isolation medium order: GENEPIO:0000121 # start date GENEPIO:0000122 # end date GENEPIO:0001783 # travel destination GENEPIO:0001064 # travel mode GENEPIO:0001065 # travel reason This record details a person's travel movement and duration for epidemiological analysis. 
Damion Dooley draft travel log specification order: OBI:0001616 # BioSample_ID GENEPIO:0001640 # Species GENEPIO:0001187 # Antibiotic_Name GENEPIO:0002062 # AST_Standard GENEPIO:0002111 # Breakpoint_version NCIT:C85540 # Laboratory_typing_method GENEPIO:0002112 # Measurement GENEPIO:0002080 # Measurement_units GENEPIO:0001001 # Measurement_sign NCIT:C85539 # Resistance_phenotype GENEPIO:0002181 # Resistance_phenotype - ECOFF GENEPIO:0002049 # Platform A standard for European Bioinformatics Institute (EBI) antibiogram data submission Damion Dooley draft EBI antimicrobial susceptibility test V. 2017-08-30 https://github.com/EBI-COMMUNITY/compare-amr/blob/master/Antimicrobial-susceptibility-testing-data-submission_V_2017-08-30.pdf A MIC test measurement resulting from a laboratory diffusion method. Damion Dooley's note: is this still useful? Currently "MIC diffusion measurement specification" refers directly via 'is about' to "Disk Diffusion Method". Damion Dooley MIC diffusion measurement datum The version of the antimicrobial resistance testing reference standard protocol used in assessing an isolate. Damion Dooley antimicrobial resistance testing reference standard version This is a component of an antibiogram that enables a choice of MIC measurements based on either the diffusion or dilution method. These methods call for different numeric input ranges and units. draft sequence repository data - antibiogram MIC measurement 1 time value specification The process of collecting a portion of feces from an organism. Emma Griffiths stool collection feces collection The process of collecting a portion of urine from an organism. Emma Griffiths sputum collection The process of removal and collection of specimen material from the surface of an entity by washing, or a similar application of fluids. Emma Griffiths rinsing for specimen collection The process of collecting specimen material using a swab collection device. 
Emma Griffiths swabbing for specimen collection The process of collecting bodily fluids that have been discharged from blood vessels usually arising from inflammation or injury. Emma Griffiths exudate collection The removal or collection of specimen material through the use of suction. Emma Griffiths vacuuming for specimen collection Emma Griffiths NOTE: there are different types of tubes with coloured caps indicating different preservatives etc. tube containing media or preservative Emma Griffiths NOTE: there are different types of tubes with coloured caps indicating different preservatives etc. tube containing antimicrobial A sterilized sampling bag that has puncture-proof tabs for protection from damage due to wire-end protrusion, and leak-proof closures. Emma Griffiths Whirlpak sampling bag Emma Griffiths spoon Emma Griffiths fork A shovel-like utensil that has a deep curved dish and a short handle and is used for digging into a soft substance for lifting out a portion. Emma Griffiths 2023-06-21T13:54:32Z hand scoop Emma Griffiths trier A device which generates a vacuum to provide suction of material. Emma Griffiths vacuum device investigation identifier specification order: GENEPIO:0001640 # taxonomy GENEPIO:0002086 # geographic location GENEPIO:0002076 # geographic feature GENEPIO:0002106 # food FOODON:03510136 # food consumer GENEPIO:0002094 # specimen collection device GENEPIO:0000025 # anatomical site GENEPIO:0000028 # body product GENEPIO:0001724 # host organism common name GENEPIO:0001567 # host taxonomy UBERON:0000105 # life cycle GENEPIO:0001639 # antigenic formula GENEPIO:0001718 # serotype specification GENEPIO:0000045 # PFGE test specification This is a draft (under development) collection of fields for capturing the contextual data for Enterobase (https://enterobase.warwick.ac.uk/) records.
Damion Dooley draft Enterobase contextual data standard The name of the virulence factor molecule produced by a pathogen that specifically causes disease, or that influences the host's function to allow the pathogen to thrive. Emma Griffiths virulence factor name help:The protocol for determining virulence should include, when applicable, inoculum preparation, platforms and instrumentation, conditions, cell lines and animal models. The laboratory protocol used to determine virulence phenotypes and markers. Emma Griffiths virulence testing protocol name help:Detection limits should include the numerical cut-off (threshold) value and units for determining positive results e.g. qPCR value, CFUs. The detection limit denotes the smallest measure that can be detected with reasonable certainty for a given analytical procedure. Emma Griffiths detection limit A disease cluster is an unusually high incidence of a particular disease or disorder occurring in close proximity in terms of both time and geography. Damion Dooley https://en.wikipedia.org/wiki/Disease_cluster disease cluster A disease cluster in which two or more cases have been linked by an infectious disease transmission process. Damion Dooley infectious disease cluster help:used only with the source feature key; source feature keys containing the /environmental_sample qualifier should also contain the /isolation_source qualifier. entries including /environmental_sample must not include the /strain qualifier Identifies sequences derived by direct molecular isolation from a bulk environmental DNA sample (by PCR with or without subsequent cloning of the product, DGGE, or other anonymous methods) with no reliable identification of the source organism. Environmental samples include clinical samples, gut contents, and other sequences from anonymous organisms that may be associated with a particular host. 
They do not include endosymbionts that can be reliably recovered from a particular host, organisms from a readily identifiable but uncultured field sample (e.g., many cyanobacteria), or phytoplasmas that can be reliably recovered from diseased plants (even though these cannot be grown in axenic culture). ENA Webin:environmental_feature EBI ENA Webin environmental feature The European Nucleotide Archive (ENA) provides a comprehensive record of the world's nucleotide sequencing information, covering raw sequencing data, sequence assembly information and functional annotation. Webin is an ENA interface and standard for sequence submissions. https://www.ebi.ac.uk/ena EBI European Nucleotide Archive Webin https://www.ebi.ac.uk/ena/WebFeat/ help:Breadth of coverage should be reported as a percentage value (e.g. 95%) to a fold of coverage (e.g. 10X). Eighty percent of the reference genome was covered by sequence fragments with a coverage depth of 4X; therefore, the breadth of coverage was 80% (4:5). A data item which is the amount of a reference sequence covered by a sequence of interest. Emma Griffiths https://orcid.org/0000-0002-1107-9135 https://orcid.org/0000-0002-9578-0788 OBI:0002879 breadth of coverage The mean contig length is the count of base pairs in the average size contig of the sequence assembly. Emma Griffiths mean contig length order: GENEPIO:0000084 # read adapter trimming (y/n) GENEPIO:0002096 # read adapter trimming software GENEPIO:0002095 # read paired-end merging software GENEPIO:0000150 # read trimming and filtering software The procedures used to remove adapters from raw sequence reads, trim low quality bases and where applicable, merge paired-end reads. Emma Griffiths raw sequence data processing datum Name of strain from which sample was obtained OR name of isolate from which sample was obtained. 
Damion Dooley specimen strain or isolate identifier \s*(\d(\.\d{1,4})?|[1-8]\d(\.\d{1,4})?|90(\.0{1,4})?)\s*[NSns]\s*(\d(\.\d{1,4})?|[1-9]\d(\.\d{1,4})?|1[0-7]\d(\.\d{1,4})?|180(\.0{1,4})?)\s*[WEwe]\s* The geographical coordinates of the location where the sample was collected. Specify as degrees latitude and longitude in format "d[d.dddd] N|S d[dd.dddd] W|E", e.g., 38.98 N 77.11 W Damion Dooley https://www.ncbi.nlm.nih.gov/biosample/docs/attributes/ latitude and longitude coordinate (NCBI BioSample format) decimal value specification floating point value specification unsigned integer A 32 bit number having a range 0 - 4294967295. unsigned integer value specification 1 1 date value specification 1 A specification for the size of an assembled genome as measured in base pairs. This is a decimal because the megabasepair unit may involve a fractional component. assembly genome size specification sequencing facility name specification date (ISO 8601) A date should be recorded as YYYY-MM-DD according to ISO 8601. HOWEVER ... OWL 2.0 does not include xsd:date as a permitted data type, so this value specification only accepts xsd:dateTime date value specification (ISO 8601) strain identifier specification 1 5 100 A MIC test measurement resulting from a laboratory diffusion method. Damion Dooley MIC diffusion measurement specification unsigned short A 16 bit integer ranging from 0 - 65535 unsigned short integer value specification 1 genomic sequence length specification 1 0.01 2048.0 A MIC test measurement resulting from a laboratory dilution method. Damion Dooley MIC dilution measurement specification The date on which a specimen was collected, in ISO 8601 format. specimen collection date (ISO 8601) date of diagnosis (ISO 8601) 1 0 A subject age is the age since birth of a given organism that is involved in an investigation or study at a given time.
Damion Dooley subject age specification subject age at time of specimen collection specification 1 boolean A boolean value specification indicates a presence (true value) or absence (false value) of a given feature. Damion Dooley boolean value specification specimen identifier specification [A-Za-z0-9]+([_.\-][A-Za-z0-9]+)*\@[A-Za-z0-9]+([.\-][A-Za-z0-9]+){1,3} email address specification region, province, state or territory specification A specification for a geographic location that a specimen was collected at. specimen collection site specification subject organism common name specification A subject identifier specification is a specification including datatype pertaining to a given standard's subject identifier field. Damion Dooley subject identifier specification Name of disease in a subject that is related to a given investigation, study and/or specimen. Damion Dooley NIAID GSCID-BRC metadata working group Controlled vocabulary. Human specimen source: https://bioportal.bioontology.org/ontologies/DOID or https://www.ncbi.nlm.nih.gov/mesh/1000067 subject disease specification unsigned long integer A 64 bit integer ranging from 0 - 18446744073709551615 unsigned long integer value specification The physical location that a specimen was collected at. The location may be described by geographic coordinates, city or other geopolitical entity, biome, etc. Damion Dooley specimen collection site The unit provided for an antibiotic drug test dosage. Damion Dooley drug MIC unit specification Damion Dooley A datum specification for the maximum concentration of a drug applied during an assay. test drug maximum concentration specification This is the minimum concentration of a drug applied during an assay.
Damion Dooley test drug minimum concentration specification (1/2|1/2a|1/2b|1/2c|3a|3b|3c|4a|4ab|4b|4c|4d|4e|1|7) listeria antigen specification 1 url URL specification A travel destination is a destination city (or populated place of some scale) which a given human has travelled to on a particular trip. Damion Dooley travel destination specification A specimen source substance is an organism substance or food product or environmental substance from which the specimen was extracted. Damion Dooley specimen source substance specification Damion Dooley note: OBI has a variety of subclasses like "blood specimen" and "colon specimen" etc. which we have not added here. They are mapped to UBERON terms indirectly through the logic "derives from some [uberon term]". GenEpiO takes a different approach in describing this class with lists for "subject anatomical site", "subject body product", and "plant anatomy". specimen from organism specification 1 0 duration specification A field specification is the specification of a data field used in a data standard or database table. This includes at least a database field name, and optionally display name, description, and data type (binary blob, numeric, categorical, textual or cross-reference). Damion Dooley 2019-04-17T23:04:53Z obsolete: field specification true Damion Dooley http://purl.obolibrary.org/obo/ARO_3004431 obsolete: antimicrobial resistance phenotype - epidemiological cut-off values true Damion Dooley 2019-04-17T23:41:35Z PulseNet Canada field Damion Dooley 2019-04-17T23:41:56Z PulseNet Canada Salmonella field Damion Dooley 2019-05-03T02:44:38Z specimen collection service Damion Dooley 2019-05-03T02:49:24Z bioinformatics analysis service An isolate preparation service prepares isolates for molecular sequencing.
Damion Dooley 2019-05-03T02:51:51Z isolate preparation service order: GENEPIO:0002222 # specimen collection service GENEPIO:0002226 # specimen repository/biobank service GENEPIO:0002224 # isolate preparation service OBI:0001904 # sequencing service OBI:0000992 # DNA sequencing service GENEPIO:0002223 # bioinformatics analysis service Damion Dooley 2019-05-03T02:52:19Z specimen related service Damion Dooley 2019-05-03T03:24:33Z specimen repository/biobank service Vital signs are unstable and not within normal limits. Patient may be unconscious. Indicators are unfavorable. 2019-11-19T22:17:43Z grave critical health status critical critical A datum indicating one of six defined subspecies of Salmonella. 2019-11-25T08:45:41Z Salmonella subspecies A PFGE pattern resulting from DNA fragmentation induced by the SpeI enzyme. 2019-11-25T09:07:26Z PFGE SpeI pattern A collection of organisms that is indicative of faecal contamination. 2020-02-08T20:15:43Z Fecal indicator organism collections include bacterial groups such as thermotolerant coliforms or E. coli. Hence, they only infer that pathogens may be present fecal indicator An isolate derived from a food sample. Damion Dooley 2019-06-28T23:41:26Z food isolate A method for typing the RNA component of the small (30S) subunit of prokaryotic ribosomes. 2020-04-24T19:06:05Z microbial 16S RNA typing h(1|2|3|4|5|6|7|8|9|10|11|12|14|15|16|17|18|19|21|23|24|26|27|28|29|30|31|32|33|34|35|36|37|38|39|40|41|42|43|44|45|46|47|48|49|51|52|53|54|55|56) E. coli H antigen specification O([1-9][0-9][0-9]) E. coli O antigen specification 1 string string value specification A sequencing platform brand provided by Helicos corporation (now defunct). 2020-04-24T22:50:49Z Helicos platform A sequencing platform provided by the Ion Torrent company.
2020-04-24T22:54:06Z Ion Torrent platform A sequencing platform provided by Life Technologies Corporation Sequencing by Oligonucleotide Ligation and Detection 2020-04-24T22:55:31Z SOLiD sequencing platform obsolete: drag swab true Rinse is the liquid output of rinsing plant, animal or other material with liquid with the objective of extracting specimen material from that liquid. Damion Dooley 2019-06-28T20:07:25Z rinse Animal rinse is a rinse derived from animal material of one or more animals. Damion Dooley 2019-06-28T20:11:45Z animal rinse obsolete: surface wipe true A low flat-bottomed laboratory container for growing a layer of organisms such as bacteria, molds, and cells on a thin layer of nutrient medium. https://en.wikipedia.org/wiki/Culture_plate 2022-09-23T10:59:09Z culture plate A fluid that is obtained from rinsing meat for the purpose of collecting materials and organisms that are on its surface. 2022-09-23T11:24:53Z fluid from meat rinse A sequencing platform provided by the BGI Genomics company. https://en.wikipedia.org/wiki/BGI_Group 2022-09-23T11:41:23Z BGI Genomics platform A sequencing platform provided by the MGI company. https://en.wikipedia.org/wiki/MGI_(company) 2022-09-23T11:43:21Z MGI platform A fecal specimen collection method in which feces is obtained by inserting the collection device into the anus of the host, or the feces is captured as it is excreted. Emma J Griffiths 2022-09-23T11:47:33Z fecal grab A DNA sequencer which is manufactured by the Illumina corporation using sequence-by-synthesis chemistry that fits on a benchtop and uses P1 and P2 flow cells. Planned Obsolescence: this term is a placeholder for a term requested in another ontology. Once the appropriate ontology term is available, this term’s identifier will be obsoleted with a “term replaced by” id of the other term.
2023-06-21T12:33:56Z Illumina NextSeq 1000 An adapter for MinION or GridION DNA sequencers manufactured by the Oxford Nanopore corporation that enables sequencing on smaller, single-use flow cells. Planned Obsolescence: this term is a placeholder for a term requested in another ontology. Once the appropriate ontology term is available, this term’s identifier will be obsoleted with a “term replaced by” id of the other term. https://store.nanoporetech.com/flongle.html 2023-06-21T13:02:13Z Oxford Nanopore Flongle A personal health datum which specifies host health complications at time of specimen collection. 0000-0002-9578-0788 2022-01-31T23:52:14Z host health complications at time of specimen collection A personal health datum which specifies host pre-existing conditions and risk factors at time of specimen collection. 0000-0002-9578-0788 2022-01-31T23:52:41Z pre-existing conditions and risk factors at time of specimen collection A personal health datum which specifies host signs and symptoms at time of specimen collection. 0000-0002-9578-0788 2022-01-31T23:53:13Z signs and symptoms at time of specimen collection A sampling strategy in which samples are taken during real-life experiments which test directly whether proposed interventions actually work. environmental testing sampling strategy A sampling strategy in which individuals are sampled in the context of experiments or observations performed as part of clinical research. 2022-09-23T11:08:21Z clinical trial sampling strategy A sampling strategy in which samples are taken during real-life experiments which test directly whether proposed interventions actually work. https://en.wikipedia.org/wiki/Field_experiment 2022-09-23T11:12:34Z field experiment sampling strategy A collection of contextual data attributes pertaining to a pathogen in an environmental sample, as specified by the International Nucleotide Sequence Database Collaboration (INSDC). 
2022-09-23T10:54:21Z Environmental Pathogen Attribute Package (Pathogen.env) A sampling strategy in which individuals and/or materials are sampled for surveillance performed for research purposes. 2022-09-23T11:04:13Z survey study sampling strategy The "specimen collector sample ID" is where one can store sample identifiers given by the specimen collector. A data item which is a location for a predetermined type of data. https://analystanswers.com/what-is-a-data-field-definition-types-examples/ 2022-11-23T00:33:27.896Z data slot data field This "or" class is just to represent an equivalency with a commonly used classification, it should not have subclasses under it. (2022-12-22) 2022-12-22T22:56:03.687Z State or Province or Territory 1 symptom http://purl.obolibrary.org/obo/SYMP_0000462 position of phenotype abnormality http://www.ebi.ac.uk/efo/EFO_0001444 Damion Dooley's note: See https://www.w3.org/TR/NOTE-datetime for ISO 8601 time standards Damion notes: Subclasses can actually have two components - the identifier, and the agency it is associated with. Under the general concept of "postal code" one can place particular national postal code systems like the US Postal code. 2020-04-24T22:38:17Z pregnant woman http://www.ontobee.org/ontology/MI data source Database that originally provided the interaction record for exchange purposes. PMID:14755292 Damion Dooley's note: GenEpiO is using members of this class for the first part of the hasDbXref relation. This is where we are keeping the mapping between a code and its source database. Depending on its context, hasDbXref annotations may point directly to metadata field names in the target database, and may be annotated with additional information that guides any additional specifications that impact on data conversion of these fields. 
source database tissue collection (necropsy) A postmortem examination of the body of an animal to determine the cause of death or the character and extent of changes produced by disease. [database_cross_reference: Merriam-Webster:Merriam-Websters_Online_Dictionary--11th_Ed] necropsy The Archaea constitute a domain and kingdom of single-celled microorganisms Damion Dooley https://en.wikipedia.org/wiki/Archaea subspecies Damion Dooley's note: this term was mistakenly placed in a list of GenEpiO diagnostic tests; in fact it references a material entity healthy NCIT offers this alternate definition which GenEpiO intends: "A measurement that specifies the minimum concentration of the agent at which organism growth was inhibited." MIC Clinical and Laboratory Standards Institute MIC antimicrobial testing standards at http://em100.edaptivedocs.info/GetDoc.aspx?doc=CLSI%20M100%20S27:2017&scope=user A portion of an absorbent material attached to one end of a small stick for the purpose of applying or collecting material. Emma Griffiths http://purl.obolibrary.org/obo/GEO_000000396 diseased sick lookup:http://purl.obolibrary.org/obo/GAZ_00000448 [A-Za-z0-9]+([_.\-][A-Za-z0-9]+)*\@[A-Za-z0-9]+([.\-][A-Za-z0-9]+){1,3} /^(:<zip>\d{5}(-\d{4})?)/, $zip deceased deceased phone A hollow cylinder, especially one for holding or conveying liquids. Damion Dooley Damion Dooley's note: It can also be used as an authorization token in some identification processes. Damion Dooley's note: here "sample" means more than one specimen merged together. pooled Damion Dooley Damion Dooley brand name An entity, either biologic or otherwise, of interest in an investigation. 
[def-source: NCI] Damion Dooley NCIT Thesaurus investigative subject lookup:http://purl.obolibrary.org/obo/GAZ_00000448 http://semanticscience.org/resource/SIO_000665 antimicrobial resistance testing method E-test (Epsilometry) lookup:http://purl.obolibrary.org/obo/GAZ_00000448 lookup:http://purl.obolibrary.org/obo/GAZ_00000448 http://purl.obolibrary.org/obo/ncit.owl organization name Damion Dooley NCI Thesaurus antibiotic treatment fetal death (miscarriage/still birth) NCBI SRA: 454 GS 20 NCBI SRA: AB SOLiD System NCBI SRA: 454 GS FLX NCBI SRA: Illumina Genome Analyzer II NCBI SRA: Helicos HeliScope pathogen role Note: testing use of 'transmitted by' relation; issue to be resolved is whether appropriate domain and range are involved. This use would place 'transmitted by' as a subclass of 'participates in' Damion Dooley note: OBI has a variety of subclasses like "blood specimen" and "colon specimen" etc. which we have not added here. They are mapped to UBERON terms indirectly through the logic "derives from some [uberon term]". GenEpiO takes a different approach in describing this class with lists for "subject anatomical site", "subject body product", and "plant anatomy". ^SAM[NED](\w)?\d+$ http://www.ebi.ac.uk/miriam/main/datatypes/MIR:00000350 http://www.ncbi.nlm.nih.gov/Sequin/acc.html https://www.ncbi.nlm.nih.gov/books/NBK21091/table/ch18.T.refseq_accession_numbers_and_mole/ NCBI BioProject Project Description Description* sample identifier specimen collection date help:Latitude should not be abstracted to the centre of a city, province/state or country as this may falsely implicate an existing location. Damion Dooley's note: IAO lat/long can't be made a subclass of "angular coordinate" or disjoint contradiction arises.
See http://www.csgnetwork.com/gpscoordconv.html See http://www.geomidpoint.com/latlon.html See http://www.ontobee.org/browser/rdf.php?o=OBI&iri=http://purl.obolibrary.org/obo/OBI_0001621 See http://www.w3.org/2003/01/geo/wgs84_pos NCBI BioProject NIAID GSCID-BRC Damion Dooley's note: A 'sovereign state' entity should have attribute or relationships to measurables - population, size, name, etc. So e.g. " 'country name' inheres in some 'sovereign state/country'". Also note historical name issue (Myanmar vs Burma) and diplomatic name issue (Taiwan vs "Republic of China"). Both point to needing a single entity with multiple names. http://purl.obolibrary.org/obo/OMIABIS_0000006 target material NCBI BioProject: Material Used in NCBI BioProject specification specimen sample scope sequencing facility project objective NCBI BioProject: Objective project method specimen provider PI email 1 categorical scalar Damion Dooley's note: Often a value specification may need to allow for a "data item state" that indicates that a given datum doesn't have a recorded or observed or retrievable value, and the general reason for this state. Assembly QC Damion Dooley http://www.nature.com/nature/journal.../409860a0.html GENEPIO NCBI SRA: Illumina MiSeq NCBI SRA: PacBio RS NCBI SRA: Illumina NextSeq sample 2019-09-08T23:41:09Z diagnosis An object aggregate consisting of an organism and all material entities located within the organism, overlapping the organism, or occupying sites formed in part by the organism. extended organism Damion Dooley NIAID: Comorbidity complications Comorbidity http://purl.obolibrary.org/obo/omit Vital signs are unstable and not within normal limits. Patient may be unconscious. Indicators are unfavorable. Damion Dooley grave A type of Shiga toxin found in E. coli Damion Dooley Stx1 Shiga toxin 1 A type of Shiga toxin found in E.
coli which is antigenically distinct from Shiga toxin 1 Damion Dooley Stx2 Shiga toxin 2 MLST typing A health care role borne by a human being and realized by promoting, maintaining or restoring human health through the study, diagnosis, and treatment of disease, injury and other physical and mental impairments. physician role Damion's Note: Future use might prefer: CMO_0000015 body temperature. NCBI BioSample: female NCBI BioSample: male Phenotypic sex refers to an individual's sex as determined by their internal and external genitalia, expression of secondary sex characteristics, and behavior. Note that NCBI Biosample has these values allowed: male | female | pooled male and female | neuter | hermaphrodite | intersex | not determined | missing | not applicable | not collected NCBI: http://www.ncbi.nlm.nih.gov/books/NBK10943/ Damion Dooley's note: We can't say radius is a subclass of "linear coordinate" because length is considered a quality/specifically dependent continuant, which is disjoint with immaterial entity/independent continuant. I.e. "length" as quality unfortunately can't pertain to abstract entities, only "real" ones. social gathering Damion Dooley 1 sequence assembly http://purl.obolibrary.org/obo/OBI_0000973 - 'sequence data' http://edamontology.org/data_0925 Damion Dooley http://purl.obolibrary.org/obo/ERO_0002173 Damion Dooley's note: Unfortunately an IAO "count" cannot be a subclass of immaterial entity "linear coordinate" as this leads to disjoint conflict. Damion Dooley's note: A count datum often implicitly carries a date range for which the count is ascertained, or estimated, or simulated for. Damion Dooley's note: The result of the act of counting is a scalar number whose unit is a description of the thing being counted. One can count 10 oranges and compare that to a count of 5 oranges. One can generalize that a count of 5 oranges is a count of pieces of fruit. One can then compare 5 oranges and 5 apples, fruitfully!
variable type A process that is the means during which the pathogen is transmitted directly or indirectly from its natural reservoir, a susceptible host or source to a new host. https://www.ebi.ac.uk/ols/ontologies/trans TRANS follows the CDC Epidemiology model of transmission between two hosts, a reservoir and host or other non-host, non-reservoir entities (e.g., a needle and a host). transmission of infection Direct and essentially immediate transfer of infectious agents to a receptive portal of entry through which human or animal infection may take place. This may be by direct contact such as touching, kissing, biting, or sexual intercourse or by the direct projection (droplet spread) of droplet spray onto the conjunctiva or the mucous membranes of the eyes, nose, or mouth. It may also be by direct exposure of susceptible tissue to an agent in soil, compost, or decaying vegetable matter or by the bite of a rabid animal. Transplacental transmission is another form of direct transmission. https://www.ebi.ac.uk/ols/ontologies/trans direct transmission Indirect transmission is a transmission process during which the pathogen is indirectly transferred from a reservoir, source or host to another host by intermediary vehicles, vectors or as airborne dust particles. https://www.ebi.ac.uk/ols/ontologies/trans indirect transmission Congenital transmission is a direct transmission process during which the pathogen is transmitted directly from mother to child at or around the time of birth. https://www.ebi.ac.uk/ols/ontologies/trans congenital contact transmission Contact transmission is a direct transmission process during which the pathogen is transmitted from a reservoir, source or host to another host by kissing, skin-to-skin contact, sexual intercourse, or by contact with soil or vegetation containing the pathogen.
https://www.ebi.ac.uk/ols/ontologies/trans contact Droplet spread transmission is a direct transmission process during which the pathogen is transmitted from a reservoir, source or host to another host by spray of aerosols over a short distance, spray from sneezing, coughing or talking. https://www.ebi.ac.uk/ols/ontologies/trans droplet spread Vehicle-borne ingestion transmission is an indirect vehicle-borne transmission process during which the pathogen is indirectly transferred from a reservoir, source or host to another host by ingestion of fluids or foods or food products including: food, water, milk, or meat products. https://www.ebi.ac.uk/ols/ontologies/trans vehicle-borne ingestion organism body product poultry fluff Damion Dooley's note: Poultry fluff is both an 'anatomical structure' and a body product when it is shed. Damion Dooley's note: Need to build this out further with reference to http://unitsofmeasure.org/trac and other unit ontologies: [1] R. Hodgson and P. J. Keller, "QUDT-quantities, units, dimensions and data types in OWL and XML." Online (September 2011) http://www.qudt.org [2] M. van Assem, H. Rijgersberg, and J. Top, "Ontology of units of measure and related concepts." Semantic Web 4, no. 1 (2013): 3-13. [3] G. V. Gkoutos, P. N. Schofield, and R. Hoehndorf, "The Units Ontology: a tool for integrating units of measurement in science." Database 2012 (2012): bas033. mm µg/ml ongoing pregnancy ancestral group Damion Dooley A personal health datum which specifies host role at time of specimen collection. 0000-0002-9578-0788 2022-01-31T23:53:59Z host role at time of specimen collection MLVA typing Damion Dooley's comment: On spheroids a polygon can be minimally defined by latitude and longitude coordinates, however technically this is ambiguous whether the larger or smaller area is being referred to. Unambiguously, such a polygon could be defined by an epicenter, and a latitude width and longitude height.
In elementary geometry, a polygon is a plane figure that is bounded by a finite chain of straight line segments closing in a loop to form a closed chain or circuit. Damion Dooley https://en.wikipedia.org/wiki/Polygon PulseNet: Key Damion Dooley medical health record (history / timeline) depth:1 lookup Bacterial bacteriophage typing http://phinvads.cdc.gov/vads/ViewCodeSystemConcept.action?oid=2.16.840.1.113883.6.96&code=22642004 http://purl.obolibrary.org/obo/hancestro_0004 obsolete: ancestral group true A social group characterized by a distinctive social and cultural tradition maintained from generation to generation, a common history and origin and a sense of identification with the group; members of the group have distinctive features in their way of life, shared experiences and often a common genetic heritage; these features may be reflected in their experience of health and disease. (NCI); Ethnicity - an arbitrary classification of the social group a person belongs to, and either identifies with or is identified with by others, as a result of a complex of cultural, biological, geographical and other factors such as linguistic, dietary and religion traditions; ancestry, background, allegiance, or association; and physical characteristics traditionally associated with race. Increasingly the concept is used synonymously with race but this use trend has a pragmatic basis rather than scientific. (NCI); The concept of ethnic origin is an attempt to classify people, not according to their current ethnicity, but according to where their ancestors came from. Ethnic origin has become a popular classification in statistics, where the concept of race has been largely discarded. (from Wikipedia) (NCI); a group of people with a common cultural heritage that sets them apart from others in a variety of social relationships. (CSP) Welcome to the GenEpiO ontology! The "information content entity" class has many GenEpiO terms.
"data item > measurement data item" has many datum terms; see "data representational model" for collections of terms set up for compatibility with 3rd party sources. Note that this file contains examples of cities, provinces, states, territories, and sovereign nations from around the world; customization of this geographical content would be expected for particular regional use. Base class for definitions. A named element in the model. Usage example and description. An attributed label. A permissible value, accompanied by intended text and an optional mapping to a concept URI. A collection of subset, type, slot and class definitions. The definition of a property or a slot. Damion Dooley Damion Dooley's note: Here we are exploring the use of individuals/instances to describe local picklists for GenEpiO driven software. United Nations MSH NCI_Thesaurus NIFSTD home _test_individual LabID assigned by NLEP Damion Dooley 2019-04-17T23:31:44Z PulseNet:LabID Country where isolate was collected Damion Dooley 2019-04-17T23:32:45Z SourceCountry Country where isolate was collected Currently not used for PulseNet Canada Damion Dooley 2019-04-17T23:37:19Z PulseNet:SourceCountry (Province for PNC) where isolate was collected. Damion Dooley 2019-04-17T23:38:09Z PulseNet:SourceState Damion Dooley 2019-04-17T23:40:57Z PulseNet:SourceCity Site of the source type where isolate was collected, e.g. blood, CSF, etc. Damion Dooley 2019-04-17T23:47:10Z PulseNet:SourceSite The source type of the isolate, e.g. human, environmental, etc. Damion Dooley 2019-04-17T23:48:49Z PulseNet:SourceType Other countries where the patient traveled Damion Dooley 2019-04-17T23:49:47Z PulseNet:Traveled_To Further description of the source type, e.g. bovine, chicken Damion Dooley 2019-04-17T23:51:02Z PulseNet:TypeDetails Submitted isolate # (if diff. from Key #), NLEP will enter Prov.
Submitted # if sample was sent to NLEP for PFGE Damion Dooley 2019-04-17T23:52:01Z PulseNet:SubmittedNumber (Province for PNC) additional number used to identify the isolate Damion Dooley 2019-04-17T23:53:47Z PulseNet:OtherStateIsolate Age of patient when isolate was collected Damion Dooley 2019-04-17T23:54:24Z PulseNet:PatientAge Sex of patient when isolate was collected Damion Dooley 2019-04-17T23:55:04Z PulseNet:PatientSex Date of collection Damion Dooley 2019-04-17T23:56:33Z PulseNet:IsolatDate Date the isolate is uploaded to the national database after initial upload Damion Dooley 2019-04-18T00:21:46Z PulseNet:UploadModifiedDate Antigenic Formula Damion Dooley 2019-04-18T00:23:18Z PulseNet:AntigenForm Subspecies, I - VI (roman numerals) Damion Dooley 2019-04-18T00:24:11Z PulseNet:Subspecies O polysaccharide (LPS) Group Damion Dooley 2019-04-18T00:24:53Z PulseNet:OGroup Serotype of isolate Damion Dooley 2019-04-18T00:25:22Z PulseNet:Serotype Code issued by CDC; see: 'Naming of outbreaks…' on WebBoard. (format YYMMLabIDLITScode-#) Currently not used for PulseNet Canada Damion Dooley 2019-04-18T00:27:08Z PulseNet:Outbreak Number issued by CDC if isolate is run at CDC (Key will be # issued by NLEP if isolate run at NLEP) Damion Dooley 2019-04-18T00:28:01Z PulseNet:cdc_id Enzyme name present if part of that national list. Damion Dooley 2019-04-18T00:28:35Z PulseNet:ListMember Phage type Damion Dooley 2019-04-18T00:29:20Z PulseNet:Phagetype Provides LabID, gel and lane information Damion Dooley 2019-04-18T00:29:58Z PulseNet:TEMP Status of the isolate; this is only confirmed once CDC runs it. Currently not used for PulseNet Canada Damion Dooley 2019-04-18T00:30:44Z PulseNet:Status Currently not used for PulseNet Canada checked if the isolate is part of NARMS. Damion Dooley 2019-04-18T00:35:55Z PulseNet:NARMS-EB Checked if the isolate is part of FoodNet. 
Currently not used for PulseNet Canada Damion Dooley 2019-04-18T00:36:39Z PulseNet:FoodNet XbaI gel where the isolate is located, lane of gel Damion Dooley 2019-04-18T00:38:57Z PulseNet:PFGE-XbaI-file XbaI pattern assigned by CDC (format XXXXXX.####) Damion Dooley 2019-04-18T00:39:23Z PulseNet:PFGE-XbaI-pattern The date the XbaI gel is run Damion Dooley 2019-04-18T00:40:05Z PulseNet:PFGE-XbaI-rundate Once the XbaI pattern is named, the status is confirmed by CDC Damion Dooley 2019-04-18T00:41:39Z PulseNet:PFGE-XbaI-status BlnI gel where the isolate is located, lane of gel Damion Dooley 2019-04-18T00:42:11Z PulseNet:PFGE-BlnI-file BlnI pattern assigned by CDC (format XXXXXX.####). Damion Dooley 2019-04-18T00:42:42Z PulseNet:PFGE-BlnI-pattern The date the BlnI gel is run Damion Dooley 2019-04-18T00:43:50Z PulseNet:PFGE-BlnI-rundate Once the BlnI pattern is named, the status is confirmed by CDC Damion Dooley 2019-04-18T00:44:44Z PulseNet:PFGE-BlnI-status SpeI gel where the isolate is located, lane of gel. Damion Dooley 2019-04-18T00:45:15Z PulseNet:PFGE-SpeI-file SpeI pattern assigned by CDC (format XXXXXX.####). Damion Dooley 2019-04-18T00:45:41Z PulseNet:PFGE-SpeI-pattern The date the SpeI gel is run. Damion Dooley 2019-04-18T00:46:21Z PulseNet:PFGE-SpeI-rundate Once the SpeI pattern is named, the status is confirmed by CDC. Damion Dooley 2019-04-18T00:47:00Z PulseNet:PFGE-SpeI-status Date WHICH (PH?) lab received isolate. Damion Dooley 2019-04-18T00:57:28Z PulseNet:ReceivedDate 2019-11-17T00:54:44Z ISO_23418.organization_contact.service Date the isolate uploaded to the national database. Damion Dooley 2019-04-17T22:47:51Z PulseNet:UploadDate A Canadian public research university in the province of Manitoba. 2023-06-21T14:22:42Z University of Manitoba (UM) The Public Health Agency of Canada is an agency of the Government of Canada that is responsible for public health, emergency preparedness and response, and infectious and chronic disease control and prevention. 
https://en.wikipedia.org/wiki/Public_Health_Agency_of_Canada 2022-09-23T10:10:13Z PHAC Public Health Agency of Canada (PHAC) The Canadian Food Inspection Agency is a regulatory agency that is dedicated to the safeguarding of food, plants, and animals in Canada. https://en.wikipedia.org/wiki/Canadian_Food_Inspection_Agency 2022-09-23T10:23:42Z CFIA Canadian Food Inspection Agency (CFIA) Agriculture and Agri-Food Canada is the department of the Government of Canada responsible for the federal regulation of agriculture, including policies governing the production, processing, and marketing of all farm, food, and agri-based products. https://en.wikipedia.org/wiki/Agriculture_and_Agri-Food_Canada 2022-09-23T10:29:32Z AAFC Agriculture and Agri-Food Canada (AAFC) Health Canada is the department of the Government of Canada responsible for national health policy. https://en.wikipedia.org/wiki/Health_Canada 2022-09-23T10:33:24Z HC Health Canada Environment and Climate Change Canada is the department of the Government of Canada that is responsible for coordinating environmental policies and programs, as well as preserving and enhancing the natural environment and renewable resources. It is also colloquially known by its former name, Environment Canada. https://en.wikipedia.org/wiki/Environment_and_Climate_Change_Canada 2022-09-23T10:38:21Z ECCC Environment Canada Environment and Climate Change Canada (ECCC) Fisheries and Oceans Canada is a department of the Government of Canada that is responsible for developing and implementing policies and programs in support of Canada's economic, ecological and scientific interests in oceans and inland waters.
https://en.wikipedia.org/wiki/Fisheries_and_Oceans_Canada 2022-09-23T10:42:12Z DFO Fisheries and Oceans Canada (DFO) 2 2 intermediate I resistant R susceptible S sensitive nonsusceptible NS nonsensitive susceptible - dose dependent SSD sensitive - dose dependent BSAC DIN EUCAST SFM Damion Dooley's note: this BFO class is incompatible with "immaterial entity" "n-dimensional concept". food role constructed feature http://purl.obolibrary.org/obo/NCIT_C25464 irrigation pond http://purl.obolibrary.org/obo/NCIT_C80234 stool straw bedding http://opendata.inra.fr/EOL/EOL_0001601#bedding true chicken cage NCBI SRA: ION_TORRENT asian tiger shrimp black tiger shrimp gathered wild Damion Dooley This is a catch-all category for Gazetteer items that have "located in" relations but not an imported GAZETTEER class parent in GenEpiO. Damion Dooley Damion Dooley obsolete: has primitive data type true Damion Dooley Damion Dooley Damion Dooley Damion Dooley Damion Dooley Damion Dooley Damion Dooley Damion Dooley's note: IAO lat/long can't be made a subclass of "angular coordinate" or disjoint contradiction arises with "immaterial entity" Damion Dooley Damion Dooley Damion Dooley Damion Dooley Damion Dooley Damion Dooley This seems to be covered by 'datetime range intermediate' Damion Dooley Damion Dooley Damion Dooley A datetime set item is a datetime that has been grouped semantically under a given type of process Damion Dooley Damion Dooley Damion Dooley Damion Dooley Damion Dooley Damion Dooley Damion Dooley mobile work Damion Dooley An unordered set is a set of elements which have no intrinsic (semantic) order with respect to each other. The set of all living people is not intrinsically ordered; an ordering can only be provided by looking at some particular feature of the given set elements.
Damion Dooley Damion Dooley Damion Dooley Damion Dooley Damion Dooley https://en.wikipedia.org/wiki/Plane_(geometry) Damion Dooley clam, squid, octopus NCBI SRA: HELICOS NCBI SRA: ABI_SOLID A measurement datum representing the primary structure of a macromolecule (its sequence), sometimes associated with an indicator of confidence of that measurement. specimen provider PI organization NCBI SRA: Illumina Genome Analyzer IIx NCBI SRA: AB SOLiD 3 Plus System NCBI SRA: AB SOLiD 4 System NCBI SRA: Illumina Genome Analyzer IIe disease (OGMS) NCBI BioSample: hermaphrodite This value is not present in NCBI BioSample host sex. derives into starts during happens during starts with ends with input of n-dimensional coordinate system Damion Dooley's note: Outstanding issue: How to attach units to each dimension? This depends on the instance of the problem involving a dimension? Linear offset can be decimal or integer or complex, or a count. - Can have a unit: meter, kilometer, foot, light-year, oranges, apples, fruit Angular offset can be radians or degrees. rectal http://purl.obolibrary.org/obo/BTO_0000818 cerebrospinal fluid (CSF) cfu/mL Damion Dooley Damion Dooley http://purl.obolibrary.org/obo/PATO_0000011