FAQ



  1. Why is there a Collection object? What is the reasoning for including it?

  2. Collection (formerly Library) is defined in the SBOL RFC Draft as an organizational container object that helps users and developers conceptualize a set of DnaComponents as a group. Any combination of related DnaComponents can be added to a Collection object, annotated with a displayID, name, and description, and then shared on the web or exchanged directly. For example, a Collection could contain a set of restriction enzyme recognition sites, such as the features commonly used for BBF RFC 10 BioBricks. A Collection could also contain all the DNA components used in a specific project or lab, or any custom grouping specified by the user. Collections SHOULD NOT be created and named for arbitrary, undefined groupings; the contents should be related in some context.
    Context definition: Nominal definition: Set, Library, Bag of Parts, Feature Set
    The name change was adopted when there was a rough consensus that Collection is a more accurate description of the aggregation object than Library, a term that is also used in molecular biology and would otherwise be overloaded. One consequence is that implementations in languages such as Java will have to use the fully qualified name when java.util.Collection is also used.
    When should SBOL undertake terminology changes like this? See the versioning question below.
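
    The container idea described above can be sketched as follows. This is only an illustration, not the libSBOL API: the class layouts, the snake_case field names, and the example identifiers are assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DnaComponent:
    # Hypothetical minimal stand-in; a real DnaComponent carries more fields.
    display_id: str
    name: str = ""


@dataclass
class Collection:
    # A named, described grouping of related DnaComponents.
    display_id: str
    name: str = ""
    description: str = ""
    components: set = field(default_factory=set)

    def add(self, component: DnaComponent) -> None:
        # A set keeps each component at most once.
        self.components.add(component)


# Example grouping: restriction sites used by BBF RFC 10 BioBricks.
rfc10_sites = Collection(
    display_id="rfc10_sites",
    name="RFC 10 restriction sites",
    description="Recognition sites used by BBF RFC 10 BioBricks",
)
for site in ("EcoRI", "XbaI", "SpeI", "PstI"):
    rfc10_sites.add(DnaComponent(display_id=site, name=site))

# Adding an equal component again does not create a duplicate entry.
rfc10_sites.add(DnaComponent(display_id="EcoRI", name="EcoRI"))
```

    The grouping here is meaningful because every member plays the same role in one assembly standard, which is the kind of shared context the answer asks for.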

  3. Why does the SequenceAnnotation relate to the DnaComponent by the “subComponents” property as 1..n as opposed to 1?

    Someone may want to annotate two different subComponents at exactly the same position and strand, particularly when they describe different kinds of sequence aspects. For example, a specific primer sequence and the core promoter sequence could map directly to the same DNA sequence of a DnaComponent. Multiple annotations of the same sequence are especially common when merging two different data sources.
    The location and strand information are tied to the DnaComponent and are relative to its total DnaSequence. As another example, the same TetR binding site found in two different DnaComponents could have different start and stop positions. The same TetR binding site DnaComponent object should be used to annotate both DnaComponents; because the position lives on the annotation rather than on the binding site, the site can be reused independently in each.
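
    A minimal sketch of the two cases above, assuming a simplified model in which each SequenceAnnotation holds a position, a strand, and a list of subComponents. The class layout, identifiers, and coordinates are invented for illustration and are not taken from the SBOL specification.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DnaComponent:
    display_id: str
    annotations: List["SequenceAnnotation"] = field(default_factory=list)


@dataclass
class SequenceAnnotation:
    start: int    # position relative to the annotated (parent) component
    end: int
    strand: str   # "+" or "-"
    sub_components: List[DnaComponent] = field(default_factory=list)


# Case 1: one reusable binding-site object, placed at different
# coordinates in two different parent components (made-up positions).
tetr_site = DnaComponent("tetR_binding_site")
construct_a = DnaComponent("construct_a")
construct_b = DnaComponent("construct_b")
construct_a.annotations.append(SequenceAnnotation(101, 119, "+", [tetr_site]))
construct_b.annotations.append(SequenceAnnotation(42, 60, "-", [tetr_site]))

# Case 2: two different subComponents mapped to exactly the same
# position and strand of one parent.
primer = DnaComponent("primer_1")
core_promoter = DnaComponent("core_promoter")
shared = SequenceAnnotation(1, 20, "+", [primer, core_promoter])
construct_a.annotations.append(shared)
```

    In this sketch the binding site itself stores no coordinates, so neither parent component constrains how the other uses it.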


  4. What is the reason for keeping the SBOL class model fairly independent of the serialization technology? For example, why are we considering JSON, RDF/XML, RDF/Turtle and a native SBOL XML?

    Note: A fact-based answer to this question cannot be formulated at this time. We have not yet defined the use cases for data exchange using SBOL clearly enough to choose a single optimal technology. Nor should we try to determine theoretically which would be optimal without actually trying to exchange data. The following discussion therefore lays out the hypotheses we are considering for the initial attempts. Hopefully it will help us collect the information needed to formulate an answer as we deploy solutions in the wild.
    At the heart of SBOL is a class model (aka the core data model) that will eventually integrate design, performance, and other information needed to reproduce and re-use a synthetic biological system. We can serialize the prototype SBOL versions to JSON and RDF/XML, and we can de-serialize from RDF/XML. Recently, however, there has been interest in an XML serialization, as this is a familiar form of data exchange to experienced software developers.
    In order to simplify the debate about which technology to choose for the transmission of the SBOL core data model, we strive to make its specification independent of the serialization technology. As we work towards that goal, we note which SBOL requirements are impacted by the choice of serialization technology. No single serialization technology answers every use case, and we should keep the flexibility of using more than one format for as long as it takes to form an opinion based on actual SBOL use. At the same time, we can’t attempt to use every available serialization technology. For example, JSON serialization and a new native SBOL XML are likely to be similar in their benefits, as they share many common features. David Megginson argues that they are “Turing” equivalent [1].
    Therefore, SBOL should take a different development path from SBML in that the SBOL library can serialize into multiple formats, including an XML dump of the object model, an RDF model, JSON, and GenBank. This means the libSBOL libraries will have a common interface that third-party developers can use to write additional serializations, for example a serializer for Eugene scripts or any other human-readable form.
    RDF is an abstraction level above XML. XML and JSON are text formats, usually stored as files, whereas RDF is a set of relationships (known as triples) that can be represented using different formats (XML, N3, etc.). At first blush, RDF/XML implies that XML is the serialization method. However, RDF/XML is often not generated as the classical nested XML tree based on the relationships encoded by the RDF triples. When an RDF/XML file is examined in a text viewer, it does not adhere to the tree structure that would be expected of a human-designed XML document. RDF’s main advantage for the SBOL development process is the ability to handle changes from one version to the next automatically: whereas the addition of new classes or relationships between classes might break an XML tree structure, RDF can circumvent this problem because it is not so closely tied to the actual file format.
    For now, we support and endorse RDF/XML (http://www.rdfabout.com/) and XML (http://www.w3.org/XML/). We offer the ability to read existing data in the GenBank flat file format (http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html), and we continue to develop an open framework for the exchange of data useful to synthetic biologists.
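
    As a rough illustration of what a common serialization interface could look like, here is a minimal sketch. The Serializer class, its dumps method, and the example record are hypothetical names chosen for this example; they are not the libSBOL interface, and the XML layout is not a proposed SBOL schema.

```python
import json
import xml.etree.ElementTree as ET
from abc import ABC, abstractmethod


class Serializer(ABC):
    """Hypothetical common interface that each output format implements."""

    @abstractmethod
    def dumps(self, component: dict) -> str:
        ...


class JsonSerializer(Serializer):
    def dumps(self, component: dict) -> str:
        return json.dumps(component, sort_keys=True)


class XmlSerializer(Serializer):
    def dumps(self, component: dict) -> str:
        # Flat XML dump: one child element per field of the record.
        root = ET.Element("DnaComponent")
        for key, value in sorted(component.items()):
            ET.SubElement(root, key).text = str(value)
        return ET.tostring(root, encoding="unicode")


# The same in-memory record, independent of any output format.
component = {"displayId": "part_1", "name": "example promoter"}

as_json = JsonSerializer().dumps(component)
as_xml = XmlSerializer().dumps(component)
```

    A third-party developer would add a new output format by writing one more Serializer subclass, without touching the data model or the existing serializers.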
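
    The point about triples can be made concrete with a small sketch that uses plain Python tuples rather than an RDF library. The namespace, the property names, and the simplified N-Triples-like output are assumptions for illustration only; they are not the SBOL vocabulary.

```python
import json

SBOL = "http://example.org/sbol#"   # made-up namespace, not the real one
PART = "http://example.org/part_1"  # made-up resource URI

# A graph is just a set of (subject, predicate, object) triples.
triples = {
    (PART, SBOL + "displayId", "part_1"),
    (PART, SBOL + "name", "example promoter"),
}


def objects(graph, subject, predicate):
    """Return every object stored for one subject/predicate pair."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def to_ntriples_like(graph):
    """Write one triple per line; URIs in angle brackets, literals quoted."""
    lines = []
    for s, p, o in sorted(graph):
        obj = f"<{o}>" if o.startswith("http://") else json.dumps(o)
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


names_before = objects(triples, PART, SBOL + "name")

# A later model version adds a new relationship. Existing lookups are
# unaffected because nothing depends on a nesting structure.
triples.add((PART, SBOL + "newProperty", "http://example.org/part_2"))
names_after = objects(triples, PART, SBOL + "name")
```

    A consumer written against the earlier set of properties keeps working after the new triple is added, which is the version-tolerance argument made above.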

  5. What is the current version number? What is the versioning strategy for SBOL?

    The current rough consensus is to "implement changes to the standard sooner rather than later". The reasoning is clear: right now, there are very few implementations using SBOL, and if we wait until the next release to make a change, then we'll have backwards compatibility issues that may be costly to support.  Obviously, as more applications support SBOL, we’ll need to formalize our release management strategy.

    SBOL work has been proceeding on revisions to the core data model in order to achieve ratification of the specification outlined in a draft BBF RFC. The core data model diagram reflects changes made since the June 8, 2011, San Diego meeting. This diagram is the basis for the SBOL RFC draft document for "SBOL version 1". This version is being implemented in libSBOL.
    The previous version (which at the time was not versioned separately from libSBOL) is implemented by libSBOLj 0.4. Since we do not plan to make version 1.0 backward compatible with any prior work, it is prudent to push forward and roll any changes into a stable version now. As more people enter the discussion, a strict but efficient and speedy versioning policy will greatly help communicate proposed changes.

  6. What are some SBOL use case examples?



  7. Is there a written list of capabilities this spec is meeting? (temp answer)

    See Use Cases.
  8. Have you considered the implication of overlapping but not nested annotations? And of course, it's not clear how introns would be handled at all? (temp answer)

    These are important use cases that have not been considered in detail.  JBEI-ICE XML is handling introns?  We will consider this in the future.

  9. What is the difference between a URI and a URL?




Editors: Michal Galdzicki and Mandy Wilson