[gt-users] get feature by ID

Sascha Steinbiss steinbiss at zbh.uni-hamburg.de
Wed Feb 11 15:16:59 CET 2009


Brent Pedersen wrote:
> hi again,

Hi Brent,

> one thing that i don't know how to do in genometools via the bindings
> is to get at a FeatureNode
> from its ID. i've been doing this by using a python dictionary, but
> then instead of simply using FeatureIndex.add_gff3file(), i have to
> use a stream and iterate over it in python to fill the dictionary. is
> there something like this in the C code that's just not exposed in the
> API?

With the current implementation, this is exactly the way to do it, as
we do not have an ID-based index for an annotation file (yet). So if
you want to write your own index like this, all root nodes must be read
into memory (this is quite easy, and the index can be kept in memory for
subsequent accesses).
Note also that if annotations from multiple files are processed like
this, IDs are no longer guaranteed to be unique, because an ID need only
be unique within the scope of a single annotation file. As the IDs are
only changed when they are output via a GFF3OutStream, they may still be
ambiguous at the point where you build your index. Keep that in mind
when indexing like this.
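
A minimal sketch of such an index, keyed by (source file, ID) so that
equal IDs from different files do not collide. It only assumes that the
stream has a next_tree() method that returns something false at the end,
as in your loop. StubNode, StubStream, build_id_index and get_id are
hypothetical names for illustration, not part of the bindings; how the
ID is read from a real node is not shown in this thread, so it is passed
in as a function. Only the root nodes returned by next_tree() are
indexed here.

```python
class StubNode:
    """Hypothetical stand-in for a feature node (not the real binding class)."""
    def __init__(self, node_id):
        self.node_id = node_id


class StubStream:
    """Hypothetical stand-in that exposes only next_tree()."""
    def __init__(self, nodes):
        self._nodes = iter(nodes)

    def next_tree(self):
        # Return the next root node, or None once the stream is exhausted.
        return next(self._nodes, None)


def build_id_index(stream, get_id, source):
    """Drain `stream` and map (source, ID) to each root node.

    `get_id` is a caller-supplied function that returns a node's ID
    (or None if the node has no ID); it is an assumption of this sketch.
    """
    index = {}
    node = stream.next_tree()
    while node:
        node_id = get_id(node)
        if node_id is not None:
            key = (source, node_id)
            if key in index:
                raise ValueError("duplicate ID %r in %r" % (node_id, source))
            index[key] = node
        node = stream.next_tree()
    return index


# Two files that both contain an ID "gene1" stay distinguishable.
stream_a = StubStream([StubNode("gene1"), StubNode("gene2")])
stream_b = StubStream([StubNode("gene1")])
index = build_id_index(stream_a, lambda n: n.node_id, source="a.gff3")
index.update(build_id_index(stream_b, lambda n: n.node_id, source="b.gff3"))
```

If only a single file is ever indexed, the plain ID would do as the key.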

> on a sort of related note, one feature that would be nice is something
> to remove this boilerplate:
> 
> genome_stream = FeatureStream(genome_stream, feature_index)
> feature = genome_stream.next_tree()
> while feature:
>     feature = genome_stream.next_tree()
> 
> maybe with something like:
> 
> FeatureStream(genome_stream, feature_index).iterate_all()
> 
> that doesn't apply to the problem of making a name => feature hash, but
> is nice for filling a feature_index and doing an intron_stream.

This is a nice idea and should be implementable without many problems as 
a general method in the GenomeStream class.
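
Until then, something similar can be done on the Python side. The
following is only a sketch, not the eventual GenomeStream method: it
assumes nothing beyond a next_tree() method that returns something false
at the end of the stream, as in your loop. iterate_all, pull_all and
StubStream are hypothetical names, and the stub uses plain strings in
place of real nodes so that the example runs on its own.

```python
class StubStream:
    """Hypothetical stand-in that exposes only next_tree()."""
    def __init__(self, nodes):
        self._nodes = iter(nodes)

    def next_tree(self):
        # Return the next node, or None once the stream is exhausted.
        return next(self._nodes, None)


def iterate_all(stream):
    """Yield every node from `stream` until next_tree() returns a false value."""
    node = stream.next_tree()
    while node:
        yield node
        node = stream.next_tree()


def pull_all(stream):
    """Drain `stream` for its side effects and return the number of nodes seen."""
    return sum(1 for _ in iterate_all(stream))


# The generator replaces the explicit while loop ...
seen = [node for node in iterate_all(StubStream(["gene1", "gene2"]))]

# ... and pull_all covers the case where only the side effect matters,
# such as filling a feature index through a wrapping stream.
count = pull_all(StubStream(["gene1", "gene2", "gene3"]))
```

The generator form also covers the ID dictionary case, since each node
is handed to the caller as it is read.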

> thanks,
> -brent

Sascha

-- 
Sascha Steinbiss
Center for Bioinformatics
University of Hamburg
Bundesstr. 43
20146 Hamburg
Germany

Email:  steinbiss at zbh.uni-hamburg.de
URL:    http://www.zbh.uni-hamburg.de/steinbiss
Phone:  +49 (40) 42838 7322
FAX:    +49 (40) 42838 7312


