[gt-users] FASTA stream input implemented?

Gordon Gremme gremme at gmail.com
Thu Feb 19 11:54:19 CET 2009


>>> Alternatively, you can try to implement a sequence stream the way the
>>> GtNodeStream for annotations is implemented (look at the *_stream.[ch]
>>> files in extended). Such a stream could pass around GtSeq objects
>>> instead of GtGenomeNodes. The GtSeqIterator interface looks already
>>> quite like that, just without the stream connection capability.
>>>
>>
>> Yes, that would be the best solution (but the stream returns
>> GtSequenceNode objects and not GtSeq objects).
>> You can easily implement it yourself (let's call it FastaInStream).
>> The FastaInStream would implement the NodeStream interface (you can
>> copy the BedInStream for example).
>> The constructor gets the Fasta file name(s) and creates a
>> GtSeqIterator internally (to process the Fasta file(s)).
>> In the _next method, you would just call the GtSeqIterator to retrieve
>> the next Fasta sequence, create a GtSequenceNode and return it.
>>
>
> I have copied extended/bed_in_stream.c and changed it to fasta_in_stream.c.
> But I am not sure when to call the _next method to retrieve the next Fasta
> sequence. Do you mean in the fasta_in_stream_next() method?
> Or do I have to write my own GtFASTAParser struct as well as functions
> like gt_fasta_parser_parse() in a new file called extended/fasta_parser.c?
> (Sorry, but I am just a little confused how the stream actually works.)

Sorry if I didn't explain this well enough. You have to do the former,
i.e., get rid of the process_file() method (and don't implement a
GtFastaParser).

In fasta_in_stream_next() you would call gt_seqiterator_next(). If it
successfully returns another sequence, you construct a GtSequenceNode
from it and return it. Otherwise, you return NULL.
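
To make the control flow concrete, here is a rough, self-contained sketch
of that pull pattern. Note that it does not use the real GenomeTools API:
all Toy* types and toy_* functions are hypothetical stand-ins (for
GtSeqIterator, GtSequenceNode and the stream), and the "iterator" just
walks an in-memory array instead of parsing Fasta files. The real
gt_seqiterator_next() has a different signature and error handling, so
treat this only as an illustration of what the _next method does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one Fasta entry (description plus sequence). */
typedef struct {
  const char *description;
  const char *sequence;
} ToyRecord;

/* Hypothetical stand-in for GtSeqIterator: walks an in-memory array
   of records instead of reading Fasta files. */
typedef struct {
  const ToyRecord *records;
  size_t num_records;
  size_t pos;
} ToySeqIterator;

/* Hypothetical stand-in for GtSequenceNode: owns copies of its strings. */
typedef struct {
  char *description;
  char *sequence;
} ToySequenceNode;

/* Hypothetical stand-in for the stream: it only wraps the iterator. */
typedef struct {
  ToySeqIterator it;
} ToyFastaInStream;

/* Returns 1 and sets *rec if another record is available, 0 at the end. */
static int toy_seqiterator_next(ToySeqIterator *it, const ToyRecord **rec)
{
  if (it->pos >= it->num_records)
    return 0;
  *rec = &it->records[it->pos++];
  return 1;
}

/* Small helper: heap-allocated copy of a C string (NULL if out of memory). */
static char *toy_strdup(const char *s)
{
  size_t n = strlen(s) + 1;
  char *copy = malloc(n);
  if (copy)
    memcpy(copy, s, n);
  return copy;
}

/* The "_next" method: pull one record from the iterator, wrap it in a
   node and return it; return NULL once the iterator is exhausted
   (or if the node itself cannot be allocated). */
static ToySequenceNode *toy_fasta_in_stream_next(ToyFastaInStream *stream)
{
  const ToyRecord *rec;
  ToySequenceNode *node;
  assert(stream != NULL);
  if (!toy_seqiterator_next(&stream->it, &rec))
    return NULL;
  node = malloc(sizeof *node);
  if (!node)
    return NULL;
  node->description = toy_strdup(rec->description);
  node->sequence = toy_strdup(rec->sequence);
  return node;
}

/* Frees a node returned by toy_fasta_in_stream_next(). */
static void toy_sequence_node_delete(ToySequenceNode *node)
{
  if (!node)
    return;
  free(node->description);
  free(node->sequence);
  free(node);
}
```

A caller (e.g. the next stream in the chain) would simply call
toy_fasta_in_stream_next() in a loop until it returns NULL, deleting each
node when done with it. The stream itself never loops over the input;
it only hands out one node per call.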

Hope that helps,

Gordon

