[gt-users] FASTA stream input implemented?

Sascha Steinbiss steinbiss at zbh.uni-hamburg.de
Tue Feb 17 11:53:34 CET 2009


David Ellinghaus wrote:
> Dear gt-developers,

Hey David,

> it's been a long time since I've written some code using the genometools.
> And I am really impressed by what is now available in the libraries for
> implementing efficient and clean C programs!! So, I tried to implement
> something with the genometools ... :-)

Always glad to hear that. :)

> @Sascha and Gordon:
> Are there any functions in the genometools library that I can use for
> reading FASTA files as a stream?
> I tried using the "core/bioseq_iterator.c" functions to read FASTA files:
[snip]
> Is something like a FASTA stream already implemented, and if not, is
> there a stream implementation which I can use to guide my own FASTA
> stream implementation?

Using multiple BioSeq objects (one per file) with a different
FastaReader implementation (via gt_bioseq_new_with_fasta_reader()) may
help you.

My guess would be the one based on the GtSeqIterator
(GtFastaReaderGtSeqIt), as it employs GtFastaBuffer, which uses regular
file operations instead of mapping the whole file into memory. I have
never tried that myself though, so let's wait and see what Gordon has to
say about this.
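
To illustrate what "regular file operations instead of mapping the whole
file into memory" can look like, here is a minimal standalone sketch in
plain C. It does not use GenomeTools at all: FastaRecord, Buf,
fasta_next() and fasta_record_free() are hypothetical names made up for
this example, and the real GtFastaBuffer may well work differently. The
idea is only that one record at a time is read with getc(), so memory use
depends on the longest record rather than on the file size.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical record type; NOT part of the GenomeTools API. */
typedef struct {
  char *desc; /* header line without the leading '>' */
  char *seq;  /* sequence with line breaks removed */
} FastaRecord;

/* Growable, NUL-terminated character buffer. */
typedef struct {
  char *data;
  size_t len, cap;
} Buf;

static int buf_init(Buf *b)
{
  b->len = 0;
  b->cap = 64;
  b->data = malloc(b->cap);
  if (!b->data) return -1;
  b->data[0] = '\0';
  return 0;
}

static int buf_push(Buf *b, int c)
{
  if (b->len + 2 > b->cap) { /* room for c and the terminating NUL */
    char *tmp = realloc(b->data, b->cap * 2);
    if (!tmp) return -1;
    b->data = tmp;
    b->cap *= 2;
  }
  b->data[b->len++] = (char) c;
  b->data[b->len] = '\0';
  return 0;
}

static void fasta_record_free(FastaRecord *rec)
{
  free(rec->desc);
  free(rec->seq);
  rec->desc = rec->seq = NULL;
}

/* Reads the next record from fp using plain stdio calls.
   Returns 1 if *rec was filled, 0 at end of input, -1 if out of memory. */
static int fasta_next(FILE *fp, FastaRecord *rec)
{
  Buf desc, seq;
  int c, at_line_start = 1;
  assert(fp != NULL && rec != NULL);
  rec->desc = rec->seq = NULL;
  while ((c = getc(fp)) != EOF && c != '>')
    ; /* skip anything before the next header */
  if (c == EOF) return 0;
  if (buf_init(&desc) != 0) return -1;
  if (buf_init(&seq) != 0) { free(desc.data); return -1; }
  /* header: everything up to the end of the line */
  while ((c = getc(fp)) != EOF && c != '\n')
    if (c != '\r' && buf_push(&desc, c) != 0) goto fail;
  /* sequence: everything up to a '>' at the start of a line */
  while ((c = getc(fp)) != EOF) {
    if (c == '>' && at_line_start) { ungetc(c, fp); break; }
    at_line_start = (c == '\n');
    if (c != '\n' && c != '\r' && buf_push(&seq, c) != 0) goto fail;
  }
  rec->desc = desc.data;
  rec->seq = seq.data;
  return 1;
fail:
  free(desc.data);
  free(seq.data);
  return -1;
}
```

A caller would open each file with fopen(), call fasta_next() in a loop
until it returns 0, and release each record with fasta_record_free().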

Alternatively, you can try to implement a sequence stream the way the
GtNodeStream for annotations is implemented (look at the *_stream.[ch]
files in extended). Such a stream could pass around GtSeq objects
instead of GtGenomeNodes. The GtSeqIterator interface already looks
quite like that, just without the stream connection capability.
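
The "stream connection capability" could be sketched roughly as below.
This is a standalone illustration of the general pull-stream pattern, not
the actual GtNodeStream code: SeqStream, ArrayStream and MinLenStream are
hypothetical names, and plain C strings stand in for GtSeq objects. Each
stream exposes a next() function, and a downstream stream holds a pointer
to its input stream and pulls from it on demand.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stream interface; NOT part of the GenomeTools API. */
typedef struct SeqStream SeqStream;
struct SeqStream {
  /* Returns the next sequence, or NULL when the stream is exhausted. */
  const char *(*next)(SeqStream *self);
};

/* Source stream: hands out the sequences of an in-memory array. */
typedef struct {
  SeqStream base; /* first member, so a SeqStream* can be cast back */
  const char **seqs;
  size_t n, pos;
} ArrayStream;

static const char *array_next(SeqStream *self)
{
  ArrayStream *as = (ArrayStream *) self;
  return as->pos < as->n ? as->seqs[as->pos++] : NULL;
}

static void array_stream_init(ArrayStream *as, const char **seqs, size_t n)
{
  as->base.next = array_next;
  as->seqs = seqs;
  as->n = n;
  as->pos = 0;
}

/* Filter stream: connected to an input stream, passes on only those
   sequences that have at least min_len characters. */
typedef struct {
  SeqStream base;
  SeqStream *in;
  size_t min_len;
} MinLenStream;

static const char *minlen_next(SeqStream *self)
{
  MinLenStream *ms = (MinLenStream *) self;
  const char *s;
  while ((s = ms->in->next(ms->in)) != NULL)
    if (strlen(s) >= ms->min_len)
      return s;
  return NULL;
}

static void minlen_stream_init(MinLenStream *ms, SeqStream *in,
                               size_t min_len)
{
  assert(in != NULL);
  ms->base.next = minlen_next;
  ms->in = in;
  ms->min_len = min_len;
}
```

The consumer only ever calls next() on the last stream in the chain, so
further stages can be added without changing the reading code.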

> Many thanks in advance
> Best regards

You're welcome!

Sascha

-- 
Sascha Steinbiss
Center for Bioinformatics
University of Hamburg
Bundesstr. 43
20146 Hamburg
Germany

Email:  steinbiss at zbh.uni-hamburg.de
URL:    http://www.zbh.uni-hamburg.de/steinbiss
Phone:  +49 (40) 42838 7322
FAX:    +49 (40) 42838 7312


