CommonSubs |
MOBY::CommonSubs.pm - a set of exportable subroutines that are useful in clients and services to deal with the input/output from MOBY Services
not written yet
The following is a generalized architecture for *all* BioMOBY services showing how to parse incoming messages using the subroutines provided in CommonSubs
    sub myServiceName {
        my ($caller, $data) = @_;
        my $MOBY_RESPONSE;    # holds the response raw XML

        # genericServiceInputParser
        # unpacks incoming message into an array of arrayrefs.
        # Each element of the array is a queryInput block, or a mobyData block
        # the arrayref has the following structure:
        #     [SIMPLE, $queryID, $simple]
        # the first element is an exported constant SIMPLE, COLLECTION, SECONDARY
        # the second element is the queryID (required for enumerating the responses)
        # the third element is the XML::DOM for the Simple, Collection, or Parameter block
        my (@inputs) = genericServiceInputParser($data);
        # or fail properly with an empty response
        return SOAP::Data->type('base64' => responseHeader("my.authURI.com") . responseFooter())
            unless (scalar(@inputs));

        # you only need to do this if you are intending to be namespace aware
        # some services might not care what namespace the data is in, so long
        # as there is data...
        my @validNS_LSID = validateNamespaces("NCBI_gi");    # returns LSID's for each human-readable

        foreach (@inputs) {
            my ($articleType, $qID, $input) = @{$_};
            unless (($articleType == SIMPLE) && ($input)) {
                # in this example, we are only accepting SIMPLE types as input
                # so write back an empty response block and move on to the next
                $MOBY_RESPONSE .= simpleResponse("", "", $qID);
                next;
            } else {
                # now take the namespace and ID from our input article
                # (see pod docs for other possibilities)
                my $namespace = getSimpleArticleNamespaceURI($input);    # get namespace
                my ($identifier) = getSimpleArticleIDs($input);    # get ID (note array output! see pod)

                # here is where you do whatever manipulation you need to do
                # for your particular service.
                # you will be building an XML document into $MOBY_RESPONSE
            }
        }
        return SOAP::Data->type('base64' => (responseHeader("illuminae.com") . $MOBY_RESPONSE . responseFooter));
    }
A COMPLETE EXAMPLE OF AN EASY MOBY SERVICE
This is a service that:
    CONSUMES: base Object in the GO namespace
    EXECUTES: Retrieval
    PRODUCES: GO_Term (in the GO namespace)
    # this subroutine is called from your dispatch_with line
    # in your SOAP daemon

    sub getGoTerm {
        my ($caller, $message) = @_;
        my $MOBY_RESPONSE;
        my (@inputs) = genericServiceInputParser($message);    # ([SIMPLE, $queryID, $simple],...)
        return SOAP::Data->type('base64' => responseHeader('my.authURI.com') . responseFooter())
            unless (scalar(@inputs));

        my @validNS = validateNamespaces("GO");    # ONLY do this if you are intending to be namespace aware!

        my $dbh = _connectToGoDatabase();
        return SOAP::Data->type('base64' => responseHeader('my.authURI.com') . responseFooter())
            unless $dbh;
        my $sth = $dbh->prepare(q{
            select name, term_definition
            from term, term_definition
            where term.id = term_definition.term_id
            and acc=?});

        foreach (@inputs) {
            my ($articleType, $ID, $input) = @{$_};
            unless ($articleType == SIMPLE) {
                $MOBY_RESPONSE .= simpleResponse("", "", $ID);
                next;
            } else {
                my $ns = getSimpleArticleNamespaceURI($input);
                # only do this if you are truly validating namespaces
                (($MOBY_RESPONSE .= simpleResponse("", "", $ID)) && (next))
                    unless validateThisNamespace($ns, @validNS);
                my ($accession) = defined(getSimpleArticleIDs($ns, [$input]))
                    ? getSimpleArticleIDs($ns, [$input])
                    : undef;
                unless (defined($accession)) {
                    $MOBY_RESPONSE .= simpleResponse("", "", $ID);
                    next;
                }
                unless ($accession =~ /^GO:/) {
                    $accession = "GO:$accession";    # we still haven't decided on whether id's should include the prefix...
                }
                $sth->execute($accession);
                my ($term, $def) = $sth->fetchrow_array;
                if ($term) {
                    $MOBY_RESPONSE .= simpleResponse("
                        <moby:GO_Term namespace='GO' id='$accession'>
                            <moby:String namespace='' id='' articleName='Term'>$term</moby:String>
                            <moby:String namespace='' id='' articleName='Definition'>$def</moby:String>
                        </moby:GO_Term>", "GO_Term_From_ID", $ID)
                } else {
                    $MOBY_RESPONSE .= simpleResponse("", "", $ID)
                }
            }
        }

        return SOAP::Data->type('base64' => (responseHeader("my.authURI.com") . $MOBY_RESPONSE . responseFooter));
    }
CommonSubs provides routines for manipulating MOBY Messages. It is useful on both the Client and the Service side for constructing and parsing MOBY Messages, and for ensuring that the message structure is valid as per the API.
It DOES NOT connect to MOBY Central for any of its functions, though it does contact the ontology server, so it will require a network connection.
Mark Wilkinson (markw at illuminae dot com)
BioMOBY Project: http://www.biomoby.org
 name     : genericServiceInputParser
 function : For the MOST SIMPLE SERVICES that take single Simple or
            Collection inputs and no Secondaries/Parameters, this routine
            takes the MOBY message and breaks the objects out of it in a
            useful way
 usage    : my @inputs = genericServiceInputParser($MOBY_message);
 args     : $message - this is the SOAP payload; i.e. the XML document
            containing the MOBY message
 returns  : @inputs - the structure of @inputs is a list of listrefs.
            Each listref has three components:
              1. COLLECTION|SIMPLE (i.e. constants 1, 2)
              2. queryID
              3. $data - the data takes several forms
                 a. $article XML::DOM node for Simples
                    <mobyData...>...</mobyData>
                 b. \@article XML::DOM nodes for Collections
for example, the input message:
    <mobyData queryID = '1'>
        <Simple>
            <Object namespace=blah id=blah/>
        </Simple>
    </mobyData>
    <mobyData queryID = '2'>
        <Simple>
            <Object namespace=blah id=blah/>
        </Simple>
    </mobyData>
will become: (note that SIMPLE, COLLECTION, and SECONDARY are exported constants from this module)
@inputs = ([SIMPLE, 1, $DOM], [SIMPLE, 2, $DOM]) # the <Simple> block
for example, the input message:
    <mobyData queryID = '1'>
        <Collection>
            <Simple>
                <Object namespace=blah id=blah/>
            </Simple>
            <Simple>
                <Object namespace=blah id=blah/>
            </Simple>
        </Collection>
    </mobyData>
will become:
@inputs = ( [COLLECTION, 1, [$DOM, $DOM]] ) # the <Simple> block
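The list-of-listrefs shape described above can be walked as in the following minimal sketch. It does not load MOBY::CommonSubs: the constants are redefined locally with assumed values, plain strings stand in for the XML::DOM nodes, and describe_inputs is a hypothetical helper written only for this illustration.

```perl
use strict;
use warnings;

# Local stand-ins for the constants the module exports (values assumed).
use constant { SIMPLE => 1, COLLECTION => 2, SECONDARY => 3 };

# Placeholder strings stand in for the XML::DOM nodes a real parse returns.
my @inputs = (
    [ SIMPLE,     1, 'simple-node-A' ],
    [ COLLECTION, 2, [ 'simple-node-B', 'simple-node-C' ] ],
);

# Hypothetical helper: one summary line per [type, queryID, data] triple.
sub describe_inputs {
    my @triples = @_;
    my @summary;
    foreach my $triple (@triples) {
        my ( $type, $query_id, $data ) = @{$triple};
        if ( $type == SIMPLE ) {
            push @summary, "query $query_id: one Simple";
        }
        elsif ( $type == COLLECTION ) {
            # For Collections, $data is a listref of Simple nodes.
            my $count = scalar @{$data};
            push @summary, "query $query_id: Collection of $count Simples";
        }
        else {
            push @summary, "query $query_id: unsupported article type";
        }
    }
    return @summary;
}

my @summary = describe_inputs(@inputs);
print "$_\n" for @summary;
```

The point of the sketch is that the third element must be dereferenced differently depending on the first: a single node for SIMPLE, a listref of nodes for COLLECTION.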
 name     : DO NOT USE!!
 function : to take a MOBY message and break the objects out of it.
            This is identical to the genericServiceInputParser method
            above, except that it returns the data as Objects rather
            than XML::DOM nodes. This is an improvement!
 usage    : my @inputs = serviceInputParser($MOBY_message);
 args     : $message - this is the SOAP payload; i.e. the XML document
            containing the MOBY message
 returns  : @inputs - the structure of @inputs is a list of listrefs.
            Each listref has three components:
              1. COLLECTION|SIMPLE|SECONDARY (i.e. constants 1, 2, 3)
              2. queryID (undef for Secondary parameters)
              3. $data - either MOBY::Client::SimpleArticle,
                 CollectionArticle, or SecondaryArticle
 name     : complexServiceInputParser
 function : For more complex services that have multiple articles for
            each input and/or accept parameters, this routine will take
            a MOBY message and extract the Simple/Collection/Parameter
            objects out of it in a useful way.
 usage    : my $inputs = complexServiceInputParser($MOBY_message);
 args     : $message - this is the SOAP payload; i.e. the XML document
            containing the MOBY message
 returns  : $inputs is a hashref with the following structure:

    $inputs->{$queryID} = [ [TYPE, $DOM], [TYPE, $DOM], [TYPE, $DOM] ]
Simples ------------------------
for example, the input message:
    <mobyData queryID = '1'>
        <Simple articleName='name1'>
            <Object namespace=blah id=blah/>
        </Simple>
        <Parameter articleName='cutoff'>
            <Value>10</Value>
        </Parameter>
    </mobyData>
will become: (note that SIMPLE, COLLECTION, and SECONDARY are exported constants from this module)
    $inputs->{1} = [ [SIMPLE, $DOM_name1],       # the <Simple> block
                     [SECONDARY, $DOM_cutoff]    # $DOM_cutoff = <Parameter> block
                   ]
Please see the XML::DOM pod documentation for information about how to parse XML DOM objects.
Collections --------------------
With inputs that have collections these are presented as a listref of Simple article DOM's. So for the following message:
    <mobyData queryID = '1'>
        <Collection articleName='name1'>
            <Simple>
                <Object namespace=blah id=blah/>
            </Simple>
            <Simple>
                <Object namespace=blah id=blah/>
            </Simple>
        </Collection>
        <Parameter articleName='cutoff'>
            <Value>10</Value>
        </Parameter>
    </mobyData>
will become
    $inputs->{1} = [ [COLLECTION, [$DOM, $DOM] ],    # $DOM is the <Simple> Block!
                     [SECONDARY, $DOM_cutoff]        # $DOM_cutoff = <Parameter> Block
                   ]
Please see the XML::DOM pod documentation for information about how to parse XML DOM objects.
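The hashref layout above, keyed by queryID with one [TYPE, $DOM] pair per article, can be traversed as in this minimal sketch. It is self-contained rather than a call into the module: the constants are local stand-ins with assumed values, strings replace the DOM nodes, and count_articles is a hypothetical helper.

```perl
use strict;
use warnings;

# Local stand-ins for the exported constants (values assumed).
use constant { SIMPLE => 1, COLLECTION => 2, SECONDARY => 3 };

# Same shape as the documented return value; strings replace DOM nodes.
my $inputs = {
    1 => [
        [ SIMPLE,    'simple-node-name1' ],
        [ SECONDARY, 'parameter-node-cutoff' ],
    ],
    2 => [
        [ COLLECTION, [ 'simple-node-a', 'simple-node-b' ] ],
        [ SECONDARY,  'parameter-node-cutoff' ],
    ],
};

# Hypothetical helper: count primary and secondary articles per queryID.
sub count_articles {
    my ($by_query) = @_;
    my %counts;
    foreach my $query_id ( keys %{$by_query} ) {
        my ( $primary, $secondary ) = ( 0, 0 );
        foreach my $article ( @{ $by_query->{$query_id} } ) {
            my ( $type, $data ) = @{$article};
            if    ( $type == SECONDARY )  { $secondary++ }
            elsif ( $type == COLLECTION ) { $primary += scalar @{$data} }
            else                          { $primary++ }
        }
        $counts{$query_id} = { primary => $primary, secondary => $secondary };
    }
    return \%counts;
}

my $counts = count_articles($inputs);
foreach my $query_id ( sort keys %{$counts} ) {
    printf "query %s: %d primary, %d secondary\n",
        $query_id, @{ $counts->{$query_id} }{qw(primary secondary)};
}
```

Note that a Collection contributes a listref, so its members are counted by dereferencing, whereas Simples and Parameters are counted one node at a time.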
 name     : getArticles
 function : get the Simple/Collection/Parameter articles for a single
            mobyData
 usage    : @articles = getArticles($XML)
 args     : raw XML or XML::DOM of a queryInput, mobyData, or
            queryResponse block (e.g. from getInputs)
 returns  : a list of listrefs; each listref is one component of the
            queryInput or mobyData block. A single block may consist of
            one or more named or unnamed simple, collection, or
            parameter articles. The listref structure is thus
            [name, $ARTICLE_DOM]:

    e.g.: @articles = (['name1', $SIMPLE_DOM])

generated from the following sample XML:

    <mobyData>
        <Simple articleName='name1'>
            <Object namespace=blah id=blah/>
        </Simple>
    </mobyData>

    or  : @articles = (['name1', $COLL_DOM], ['paramname1', $PARAM_DOM])

generated from the following sample XML:

    <mobyData>
        <Collection articleName='name1'>
            <Simple>
                <Object namespace=blah id=blah/>
            </Simple>
            <Simple>
                <Object namespace=blah id=blah/>
            </Simple>
        </Collection>
        <Parameter articleName='e value cutoff'>
            <default>10</default>
        </Parameter>
    </mobyData>
 name     : getSimpleArticleIDs
 function : to get the IDs of simple articles that are in the given
            namespace
 usage    : my @ids = getSimpleArticleIDs("NCBI_gi", \@SimpleArticles);
            my @ids = getSimpleArticleIDs(\@SimpleArticles);
 args     : $Namespace - (optional) a namespace string from the MOBY
                         namespace ontology, or undef if you don't care
            \@Simples  - (required) a listref of Simple XML::DOM nodes
                         i.e. the XML::DOM representing an XML
                         structure like this:
                             <Simple>
                                 <Object namespace="NCBI_gi" id="163483"/>
                             </Simple>
 note     : If you provide a namespace, it will return *only* the ids
            that are in the given namespace, but will return 'undef'
            for any articles in the WRONG namespace so that you get an
            equivalent number of outputs to inputs.

            Note that if you call this with a single argument, this is
            assumed to be \@Articles, so you will get ALL id's
            regardless of namespace!
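The two calling conventions and the undef padding described in the note can be illustrated with the following minimal sketch. It is not the module's implementation: ids_in_namespace is a hypothetical stand-in, and each article is a plain hashref rather than an XML::DOM node.

```perl
use strict;
use warnings;

# Hypothetical stand-in: articles are hashrefs here, not XML::DOM nodes.
sub ids_in_namespace {
    my @args = @_;

    # A single argument is taken to be the article listref (no filtering).
    my ( $namespace, $articles ) = @args == 1 ? ( undef, $args[0] ) : @args;

    my @ids;
    foreach my $article ( @{$articles} ) {
        if ( !defined $namespace || $article->{namespace} eq $namespace ) {
            push @ids, $article->{id};
        }
        else {
            push @ids, undef;    # keep one output slot per input article
        }
    }
    return @ids;
}

my @articles = (
    { namespace => 'NCBI_gi', id => '163483' },
    { namespace => 'GO',      id => '0003700' },
    { namespace => 'NCBI_gi', id => '200001' },
);

my @gi_ids  = ids_in_namespace( 'NCBI_gi', \@articles );
my @all_ids = ids_in_namespace( \@articles );

print join( ', ', map { defined $_ ? $_ : 'undef' } @gi_ids ), "\n";
print join( ', ', @all_ids ), "\n";
```

Because the filtered call pads with undef, the position of each id still lines up with the position of its input article, which is why the example service above tests the result with defined before using it.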
 name     : getSimpleArticleNamespaceURI
 function : to get the namespace of a simple article
 usage    : my $ns = getSimpleArticleNamespaceURI($SimpleArticle);
 args     : $Simple - (required) a single XML::DOM node representing a
                      Simple Article, i.e. the XML::DOM representing an
                      XML structure like this:
                          <Simple>
                              <Object namespace="NCBI_gi" id="163483"/>
                          </Simple>
 name     : simpleResponse
 function : wraps a simple article in the appropriate (mobyData)
            structure
 usage    : $resp .= &simpleResponse($object, 'MyArticleName', $queryID);
 args     : (in order)
            $object  - (optional) a MOBY Object as raw XML
            $article - (optional) an articleName for this article
            $query   - (optional, but strongly recommended) the queryID
                       value for the mobyData block to which you are
                       responding
 notes    : as required by the API you must return a response for every
            input. If one of the inputs was invalid, you return a valid
            (empty) MOBY response by calling
            &simpleResponse(undef, undef, $queryID), i.e. with no object
            and no article name.
 name     : collectionResponse
 function : wraps a set of articles in the appropriate mobyData
            structure
 usage    : return responseHeader . &collectionResponse(\@objects, 'MyArticleName', $queryID) . responseFooter;
 args     : (in order)
            \@objects - (optional) a listref of MOBY Objects as raw XML
            $article  - (optional) an articleName for this article
            $queryID  - (optional, but strongly recommended) the
                        mobyData ID to which you are responding
 notes    : as required by the API you must return a response for every
            input. If one of the inputs was invalid, you return a valid
            (empty) MOBY response by calling
            &collectionResponse(undef, undef, $queryID).
 name     : responseHeader
 function : returns the XML string of a MOBY response header +/-
            serviceNotes
 usage    : responseHeader('illuminae.com')
            responseHeader(
                -authority => 'illuminae.com',
                -note      => 'here is some data from the service provider')
 args     : a string representing the service provider's authority URI,
            OR a set of named arguments with the authority and the
            service provision notes.
 caveat   :
 notes    : returns everything required up to the response articles
            themselves. i.e. something like:
                <?xml version='1.0' encoding='UTF-8'?>
                <moby:MOBY xmlns:moby='http://www.biomoby.org/moby'>
                <moby:Response moby:authority='http://www.illuminae.com'>
 name     : responseFooter
 function : returns the XML string of a MOBY response footer
 usage    : return responseHeader('illuminae.com') . $DATA . responseFooter;
 notes    : returns everything required after the response articles
            themselves i.e. something like:

                </moby:Response>
                </moby:MOBY>
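The header, per-query blocks, and footer are concatenated into a single string, with one block per incoming queryID even when a query produced nothing. The following minimal sketch shows that assembly pattern only. All three subs are hypothetical stand-ins written for this illustration: the header and footer copy the "something like" text above, and the mobyData wrapper markup is an assumption, not the module's actual output.

```perl
use strict;
use warnings;

# Hypothetical stand-ins; they mimic the documented header and footer shape.
sub header_sketch {
    my ($authority) = @_;
    return "<?xml version='1.0' encoding='UTF-8'?>"
        . "<moby:MOBY xmlns:moby='http://www.biomoby.org/moby'>"
        . "<moby:Response moby:authority='http://www.$authority'>";
}

sub footer_sketch {
    return "</moby:Response></moby:MOBY>";
}

# ASSUMPTION: the wrapper markup below is illustrative only.
sub data_block_sketch {
    my ( $object_xml, $article_name, $query_id ) = @_;

    # An invalid or empty input still gets a (empty) block for its queryID.
    return "<moby:mobyData queryID='$query_id'/>"
        unless defined $object_xml && length $object_xml;

    return "<moby:mobyData queryID='$query_id'>"
        . "<moby:Simple articleName='$article_name'>$object_xml</moby:Simple>"
        . "</moby:mobyData>";
}

# Query 2 yielded no result, but it must still be answered.
my %results = (
    1 => "<moby:Object namespace='GO' id='0003700'/>",
    2 => undef,
);

my $body = '';
foreach my $query_id ( sort { $a <=> $b } keys %results ) {
    $body .= data_block_sketch( $results{$query_id}, 'hit', $query_id );
}

my $response = header_sketch('illuminae.com') . $body . footer_sketch();
my $blocks   = () = $response =~ /<moby:mobyData /g;
print "$blocks mobyData blocks in response\n";
```

In a real service the three stand-ins would be replaced by responseHeader, simpleResponse (or collectionResponse), and responseFooter.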
 name     : getInputs
 function : get the mobyData block(s) as XML::DOM nodes
 usage    : @queryInputs = getInputs($XML)
 args     : the raw XML of a <MOBY> query, or an XML::DOM document
 returns  : a list of XML::DOM::Node's, each is a queryInput or
            mobyData block.
 Note     : Remember that these blocks are enumerated! The queryID of
            each block is what you pass as the third argument to the
            simpleResponse or collectionResponse subroutine to
            associate the numbered input to the numbered response
 name     : getInputID
 function : get the value of the queryID attribute
 usage    : $queryID = getInputID($XML)
 args     : the raw XML or XML::DOM of a queryInput or mobyData block
            (e.g. from getInputs)
 returns  : integer, or ''
 Note     : Inputs and Responses are coordinately enumerated! The
            integer you get here is what you pass as the third argument
            to the simpleResponse or collectionResponse subroutine to
            associate the numbered input to the numbered response
 name     : DO NOT USE!!
 function : get the Simple/Collection articles for a single mobyData or
            queryResponse node, returning them as SimpleArticle,
            SecondaryArticle, or ServiceInstance objects
 usage    : @articles = getArticles($XML)
 args     : raw XML or XML::DOM of a moby:mobyData block
 returns  :
 name     : getCollectedSimples
 function : get the Simple articles collected in a moby:Collection block
 usage    : @Simples = getCollectedSimples($XML)
 args     : raw XML or XML::DOM of a moby:Collection block
 returns  : a list of XML::DOM nodes, each of which is a moby:Simple
            block
 name     : getInputArticles
 function : get the Simple/Collection articles for each input query, in
            order
 usage    : @queries = getInputArticles($XML)
 args     : the raw XML of a moby:MOBY query
 returns  : a list of listrefs, each listref is the input to a single
            query. Remember that the input to a single query may be one
            or more Simple and/or Collection articles. These are
            provided as XML::DOM nodes.

    i.e.: @queries = ([$SIMPLE_DOM_NODE], [$SIMPLE_DOM_NODE2])
    or  : @queries = ([$COLLECTION_DOM_NODE], [$COLLECTION_DOM_NODE2])

the former is generated from the following XML:

    ...
    <moby:mobyContent>
        <moby:mobyData>
            <Simple>
                <Object namespace=blah id=blah/>
            </Simple>
        </moby:mobyData>
        <moby:mobyData>
            <Simple>
                <Object namespace=blah id=blah/>
            </Simple>
        </moby:mobyData>
    </moby:mobyContent>
    ...
 name     : isSimpleArticle
 function : tests XML (text) or an XML DOM node to see if it represents
            a Simple article
 usage    : if (isSimpleArticle($node)){do something to it}
 input    : an XML::DOM node, an XML::DOM::Document or straight XML
 returns  : boolean

 name     : isCollectionArticle
 function : tests XML (text) or an XML DOM node to see if it represents
            a Collection article
 usage    : if (isCollectionArticle($node)){do something to it}
 input    : an XML::DOM node, an XML::DOM::Document or straight XML
 returns  : boolean

 name     : isSecondaryArticle
 function : tests XML (text) or an XML DOM node to see if it represents
            a Secondary article
 usage    : if (isSecondaryArticle($node)){do something to it}
 input    : an XML::DOM node, an XML::DOM::Document or straight XML
 returns  : boolean
 name     : extractRawContent
 function : pass me an article (Simple, or Collection) and I'll give
            you the content AS A STRING - i.e. the raw XML of the
            contained MOBY Object(s)
 usage    : extractRawContent($simple)
 input    : the one element of the output from getArticles
 returns  : string
 name     : getNodeContentWithArticle
 function : a very flexible way to get the stringified content of a
            node that has the correct element and article name, or to
            get the value of a Parameter element.
 usage    : @strings = getNodeContentWithArticle($node, $tagname, $articleName)
 args     : (in order)
            $node        - an XML::DOM node, or straight XML. It may
                           even be the entire mobyData block.
            $tagname     - the tagname (effectively from the Object
                           type ontology), or "Parameter" if you are
                           trying to get secondaries
            $articleName - the articleName that we are searching for
 returns  : an array of the stringified text content for each node that
            matched the tagname/articleName specified. Note that each
            line of content is an element of the array.
 notes    : This was written for the purpose of getting the values of
            String, Integer, Float, Date_Time, and other such
            primitives. For example, in the following XML:

    ...
    <moby:mobyContent>
        <moby:mobyData>
            <Simple>
                <Sequence namespace=blah id=blah>
                    <Integer namespace='' id='' articleName="Length">3</Integer>
                    <String namespace='' id='' articleName="SequenceString">ATG</String>
                </Sequence>
            </Simple>
        </moby:mobyData>
    </moby:mobyContent>
    ...

would be analysed as follows:

    # get $input - e.g. from genericServiceInputParser or complexServiceInputParser
    @sequences = getNodeContentWithArticle($input, "String", "SequenceString");

For Parameters, such as the following

    ...
    <moby:mobyContent>
        <moby:mobyData>
            <Simple>
                <Sequence namespace=blah id=blah>
                    <Integer namespace='' id='' articleName="Length">3</Integer>
                    <String namespace='' id='' articleName="SequenceString">ATG</String>
                </Sequence>
            </Simple>
            <Parameter articleName='cutoff'>
                <Value>24</Value>
            </Parameter>
        </moby:mobyData>
    </moby:mobyContent>
    ...

You would parse it as follows:

    # get $input - e.g. from genericServiceInputParser or complexServiceInputParser
    @sequences = getNodeContentWithArticle($input, "String", "SequenceString");
    @cutoffs = getNodeContentWithArticle($input, "Parameter", "cutoff");

 EXAMPLE  :
    my $inputs = complexServiceInputParser($MOBY_message);
    # $inputs->{$queryID} = [ [TYPE, $DOM], [TYPE, $DOM], [TYPE, $DOM] ]
    my (@enumerated) = keys %{$inputs};
    foreach my $no (@enumerated) {
        my @articles = @{$inputs->{$no}};
        foreach my $article (@articles) {
            my ($type, $DOM) = @{$article};
            if ($type == SECONDARY) {
                # list assignment: the function returns an array
                ($cutoff) = getNodeContentWithArticle($DOM, "Parameter", "cutoff");
            } else {
                @sequences = getNodeContentWithArticle($DOM, "String", "SequenceString");
            }
        }
    }
 name     : validateNamespaces
 function : checks the namespace ontology for the namespace lsid
 usage    : @LSIDs = validateNamespaces(@namespaces)
 args     : ordered list of either human-readable or lsid presumptive
            namespaces
 returns  : ordered list of the LSID's corresponding to those
            presumptive namespaces; undef for each namespace that was
            invalid

 name     : validateThisNamespace
 function : checks a given namespace against a list of valid namespaces
 usage    : $valid = validateThisNamespace($ns, @validNS);
 args     : ordered list of the namespace of interest and the list of
            valid NS's
 returns  : boolean
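The membership test that validateThisNamespace performs can be pictured with the following minimal sketch. It is an assumption about the behaviour, not the module's code: namespace_is_valid is a hypothetical stand-in, the strings are placeholders rather than real LSIDs, and undef entries (which validateNamespaces returns for invalid namespaces) are simply skipped.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the membership check described above.
sub namespace_is_valid {
    my ( $ns, @valid ) = @_;
    return 0 unless defined $ns;

    # Skip undef entries, which mark namespaces that failed validation.
    return ( grep { defined $_ && $_ eq $ns } @valid ) ? 1 : 0;
}

# Placeholder strings; a real list would come from validateNamespaces.
my @valid_ns = ( 'placeholder-lsid-for-GO', undef, 'placeholder-lsid-for-NCBI_gi' );

my $accepted = namespace_is_valid( 'placeholder-lsid-for-GO',   @valid_ns );
my $rejected = namespace_is_valid( 'placeholder-lsid-for-EMBL', @valid_ns );

print "accepted=$accepted rejected=$rejected\n";
```

A service that is namespace aware would answer a rejected input with an empty response block for that queryID, as the complete example near the top of this document does.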
 name     : getResponseArticles
 function : get the DOM nodes corresponding to individual Simple or
            Collection outputs from a MOBY Response
 usage    : ($collections, $simples) = getResponseArticles($node)
 args     : $node - either raw XML or an XML::DOM::Document to be
            searched
 returns  : an array-ref of Collection article XML::DOM::Node's
            an array-ref of Simple article XML::DOM::Node's

 name     : getServiceNotes
 function : to get the content of the Service Notes block of the MOBY
            message
 usage    : getServiceNotes($message)
 args     : $message is either the XML::DOM of the MOBY message, or
            plain XML
 returns  : String content of the ServiceNotes block of the MOBY
            Message
 name     : getCrossReferences
 function : to get the cross-references for a Simple article
 usage    : @xrefs = getCrossReferences($XML)
 args     : $XML is either a SIMPLE article (<Simple>...</Simple>) or
            an object (the payload of a Simple article), and may be
            either raw XML or an XML::DOM node.
 returns  : an array of MOBY::CrossReference objects
 example  :

    my ($colls, $simps) = getResponseArticles($query);    # returns DOM nodes
    foreach (@{$simps}) {
        my @xrefs = getCrossReferences($_);
        foreach my $xref (@xrefs) {
            print "Cross-ref type: ", $xref->type, "\n";
            print "namespace: ", $xref->namespace, "\n";
            print "id: ", $xref->id, "\n";
            if ($xref->type eq "Xref") {
                print "Cross-ref relationship: ", $xref->xref_type, "\n";
            }
        }
    }
 name     : whichDeepestParentObject
 function : select the parent node from nodeList that is closest to the
            querynode
 usage    : ($term, $lsid) = whichDeepestParentObject($CENTRAL, $queryTerm, \@termList)
 args     : $CENTRAL   - your MOBY::Client::Central object
            $queryTerm - the object type I am interested in
            \@termList - the list of object types that I know about
 returns  : an ontology term and its LSID (each a scalar), or undef if
            there is no parent of this node in the nodelist.
            (note that it will only return the term if you give it term
            names in the @termList. If you give it LSID's in the
            termList, then both the parameters returned will be LSID's
            - it doesn't back-translate...)
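The idea of picking the closest known ancestor can be sketched locally as follows. This is only an illustration of the selection logic: the real routine consults MOBY Central through the $CENTRAL object, whereas here a small invented child-to-parent table stands in for the ontology, nearest_known_parent is a hypothetical name, and the sketch assumes the search starts at the parent of the query term rather than at the term itself.

```perl
use strict;
use warnings;

# Toy ISA table (child => parent), invented for illustration only.
my %parent_of = (
    DNASequence        => 'NucleotideSequence',
    NucleotideSequence => 'GenericSequence',
    GenericSequence    => 'VirtualSequence',
    VirtualSequence    => 'Object',
);

# Hypothetical stand-in: walk up from the query term and return the first
# ancestor that appears in the list of known terms.
sub nearest_known_parent {
    my ( $query, $known ) = @_;
    my %is_known = map { $_ => 1 } @{$known};

    my $term = $parent_of{$query};
    while ( defined $term ) {
        return $term if $is_known{$term};
        $term = $parent_of{$term};
    }
    return;    # no ancestor of $query is in the known list
}

my $closest = nearest_known_parent( 'DNASequence', [ 'Object', 'GenericSequence' ] );
my $none    = nearest_known_parent( 'Object',      ['GenericSequence'] );

print "closest known parent: $closest\n";
print "no parent found\n" unless defined $none;
```

Both 'Object' and 'GenericSequence' are ancestors of the query term in the toy table, and the walk returns 'GenericSequence' because it is reached first.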
 Usage     : $object->_rearrange( array_ref, list_of_arguments)
 Purpose   : Rearranges named parameters to requested order.
 Example   : $self->_rearrange([qw(SEQUENCE ID DESC)],@param);
           : Where @param = (-sequence => $s,
           :                 -desc     => $d,
           :                 -id       => $i);
 Returns   : @params - an array of parameters in the requested order.
           : The above example would return ($s, $i, $d).
           : Unspecified parameters will return undef. For example, if
           :        @param = (-sequence => $s);
           : the above _rearrange call would return ($s, undef, undef)
 Argument  : $order : a reference to an array which describes the desired
           :          order of the named parameters.
           : @param : an array of parameters, either as a list (in
           :          which case the function simply returns the list),
           :          or as an associative array with hyphenated tags
           :          (in which case the function sorts the values
           :          according to @{$order} and returns that new array.)
           :          The tags can be upper, lower, or mixed case
           :          but they must start with a hyphen (at least the
           :          first one should be hyphenated.)
 Source    : This function was taken from CGI.pm, written by Dr. Lincoln
           : Stein, and adapted for use in Bio::Seq by Richard Resnick and
           : then adapted for use in Bio::Root::Object.pm by Steve Chervitz,
           : then migrated into Bio::Root::RootI.pm by Ewan Birney.
 Comments  :
           : Uppercase tags are the norm,
           : (SAC)
           : This method may not be appropriate for method calls that are
           : within an inner loop if efficiency is a concern.
           :
           : Parameters can be specified using any of these formats:
           :  @param = (-name=>'me', -color=>'blue');
           :  @param = (-NAME=>'me', -COLOR=>'blue');
           :  @param = (-Name=>'me', -Color=>'blue');
           :  @param = ('me', 'blue');
           : A leading hyphenated argument is used by this function to
           : indicate that named parameters are being used.
           : Therefore, the ('me', 'blue') list will be returned as-is.
           :
           : Note that Perl will confuse unquoted, hyphenated tags as
           : function calls if there is a function of the same name
           : in the current namespace:
           :    -name => 'foo' is interpreted as -&name => 'foo'
           :
           : For ultimate safety, put single quotes around the tag:
           : ('-name'=>'me', '-color' =>'blue');
           : This can be a bit cumbersome and I find not as readable
           : as using all uppercase, which is also fairly safe:
           : (-NAME=>'me', -COLOR =>'blue');
           :
           : Personal note (SAC): I have found all uppercase tags to
           : be more manageable: it involves less single-quoting,
           : the key names stand out better, and there are no method naming
           : conflicts.
           : The drawbacks are that it's not as easy to type as lowercase,
           : and lots of uppercase can be hard to read.
           :
           : Regardless of the style, it greatly helps to line
           : the parameters up vertically for long/complex lists.
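The rearranging behaviour described above can be reproduced in a few lines of core Perl. The following is a minimal sketch written from that description, not the code inherited from CGI.pm or Bio::Root; rearrange_sketch is a hypothetical name and is a plain function rather than a method. Tags are single-quoted here, following the "ultimate safety" advice above.

```perl
use strict;
use warnings;

# Hypothetical stand-in written from the description above.
sub rearrange_sketch {
    my ( $order, @param ) = @_;

    # Positional call: no leading hyphenated tag, so return the list as-is.
    return @param
        unless @param && defined $param[0] && $param[0] =~ /^-/;

    my %value_for;
    while (@param) {
        my $tag   = shift @param;
        my $value = shift @param;
        $tag =~ s/^-//;                    # drop the leading hyphen
        $value_for{ uc $tag } = $value;    # tags are case-insensitive
    }

    # Unspecified parameters come back as undef, in the requested order.
    return map { $value_for{ uc $_ } } @{$order};
}

my @ordered = rearrange_sketch(
    [qw(SEQUENCE ID DESC)],
    '-sequence' => 'ATG',
    '-desc'     => 'start codon',
    '-id'       => 'seq1',
);
my @partial    = rearrange_sketch( [qw(SEQUENCE ID DESC)], '-sequence' => 'ATG' );
my @positional = rearrange_sketch( [qw(NAME COLOR)], 'me', 'blue' );

print join( ' | ', @ordered ), "\n";
```

The three calls correspond to the three cases in the description: all named parameters supplied, some omitted, and a purely positional list.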