[gt-users] FASTA stream input implemented?

David Ellinghaus d.ellinghaus at ikmb.uni-kiel.de
Thu Feb 19 14:32:34 CET 2009


Hi Sascha and Gordon, it's me again ;-)
>>>> Alternatively, you can try to implement a sequence stream the way the
>>>> GtNodeStream for annotations is implemented (look at the *_stream.[ch]
>>>> files in extended). Such a stream could pass around GtSeq objects
>>>> instead of GtGenomeNodes. The GtSeqIterator interface looks already
>>>> quite like that, just without the stream connection capability.
>>>>
>>>>         
>>> Yes, that would be the best solution (but the stream returns
>>> GtSequenceNode objects and not GtSeq objects).
>>> You can easily implement it yourself (let's call it FastaInStream).
>>> The FastaInStream would implement the NodeStream interface (you can
>>> copy the BedInStream for example).
>>> The constructor gets the Fasta file name(s) and creates a
>>> GtSeqIterator internally (to process the Fasta file(s)).
>>> In the _next method, you would just call the GtSeqIterator to retrieve
>>> the next Fasta sequence, create a GtSequenceNode and return it.
>>>
>>>       
>> I have copied extended/bed_in_stream.c and changed it to fasta_in_stream.c.
>> But I am not sure when to call the _next method to retrieve the next Fasta
>> sequence,
>> do you mean in the fasta_in_stream_next() method?
>> Or do I have to write my own GtFASTAParser struct as well as functions
>> like gt_fasta_parser_parse() in a new file called extended/fasta_parser.c?
>> (Sorry, but I am just a little confused how the stream actually works.)
>>     
>
> Sorry if I wasn't explaining this well enough. You have to do the former.
> I.e., get rid of the process_file() method (and don't implement a
> GtFastaParser).
>
> In fasta_in_stream_next you would call gt_seqiterator_next(). If it
> successfully returns another sequence, you construct a GtSequenceRegion
> from it and return it. Otherwise, you return NULL.
>
> Hope that helps,
>   
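
The pattern described in the quoted replies (one call to the stream's next
method pulls one record from the iterator, wraps it in a node, and returns
NULL at the end) can be sketched in plain C as follows. This is only a
minimal, self-contained illustration: SeqIter, SeqNode, seq_iter_next(),
seq_node_new(), and seq_node_delete() are hypothetical stand-ins and not the
GenomeTools API, the records live in memory instead of a FASTA file, and the
real NodeStream interface may use a different signature for its next method.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* All names below are hypothetical stand-ins, not GenomeTools API. */

/* Stand-in for the sequence iterator: walks over in-memory records. */
typedef struct {
  const char **descs;
  const char **seqs;
  size_t count;
  size_t pos;
} SeqIter;

/* Stand-in for the node that the stream hands downstream. */
typedef struct {
  char *desc;
  char *seq;
  size_t len;
} SeqNode;

/* The stream only holds the iterator; no node buffer is kept. */
typedef struct {
  SeqIter *iter;
} FastaInStream;

/* Returns 1 and fills the out-parameters, or 0 when no record is left. */
int seq_iter_next(SeqIter *it, const char **seq, size_t *len,
                  const char **desc)
{
  if (it->pos >= it->count)
    return 0;
  *seq = it->seqs[it->pos];
  *len = strlen(*seq);
  *desc = it->descs[it->pos];
  it->pos++;
  return 1;
}

/* Copies n bytes of s into a fresh NUL-terminated buffer. */
char *copy_bytes(const char *s, size_t n)
{
  char *c = malloc(n + 1);
  if (c != NULL) {
    memcpy(c, s, n);
    c[n] = '\0';
  }
  return c;
}

/* Builds a node that owns copies of the description and the sequence. */
SeqNode *seq_node_new(const char *desc, const char *seq, size_t len)
{
  SeqNode *node = malloc(sizeof *node);
  if (node == NULL)
    return NULL;
  node->desc = copy_bytes(desc, strlen(desc));
  node->seq = copy_bytes(seq, len);
  node->len = len;
  if (node->desc == NULL || node->seq == NULL) {
    free(node->desc);
    free(node->seq);
    free(node);
    return NULL;
  }
  return node;
}

/* Frees a node and the strings it owns; NULL is accepted. */
void seq_node_delete(SeqNode *node)
{
  if (node == NULL)
    return;
  free(node->desc);
  free(node->seq);
  free(node);
}

/* One call, one record: wrap it in a node, or return NULL at the end
   (in this sketch NULL is also returned if an allocation fails). */
SeqNode *fasta_in_stream_next(FastaInStream *stream)
{
  const char *seq, *desc;
  size_t len;
  assert(stream != NULL && stream->iter != NULL);
  if (!seq_iter_next(stream->iter, &seq, &len, &desc))
    return NULL;
  return seq_node_new(desc, seq, len);
}
```

A caller would invoke fasta_in_stream_next() in a loop until it returns NULL
and release each node with seq_node_delete(). Because every call produces
exactly one node, the stream struct in this sketch needs no queue of pending
nodes.
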
I am very sorry, but how do I construct a GtSequenceRegion? I get a sequence,
len, and desc in fasta_in_stream_next(),
but I don't know where to put them ...
I found "SimpleGtSequenceRegion", but I think this isn't right.
I don't need the "GtQueue genome_node_buffer" any more in my struct
GtFASTAInStream, right?

Thanks a lot,
David

-- 
David Ellinghaus
Institute for Clinical Molecular Biology
Christian-Albrechts-University Campus Kiel
House 6, Arnold-Heller-Str.3
D-24105 Kiel, Germany

Email:  d.ellinghaus at ikmb.uni-kiel.de
Phone:  +49-(0)431-597-1963
FAX  :  +49-(0)431-597-1842


