[gt-users] how to access GtGenomeNode content from stream?
Sascha Steinbiss
steinbiss at zbh.uni-hamburg.de
Tue Apr 21 11:13:19 CEST 2009
David Ellinghaus wrote:
> Dear genometools developers,
Dear David,
> I try to get familiar with the genometools and would like to implement a
> stream in genometools
> simply reading huge tables from input files line by line.
> I wrote something like that for my new "tools/gt_fam.c":
>
> GtNodeStream *fam_in_stream;
> GtGenomeNode *gn;
>
> /* create a fam input stream */
> fam_in_stream =
> gt_fam_in_stream_new(gt_str_get(arguments->str_file_selection));
>
> /* pull the FAM files through the stream and free them afterwards */
> while (!(had_err = gt_node_stream_next(fam_in_stream, &gn, err)) && gn) {
>
> (1) How to get the information from gn ?
That depends on which GenomeNode subclass/implementation you have
created in the input stream. The SequenceNode, FeatureNode and
CommentNode classes offer methods to extract specific data (see the
respective header files), while the GenomeNode (genome_node_api.h)
methods offer general accessors to ranges etc.
You can check if a GenomeNode is of a specific subclass/implements a
specific interface by calling a try_cast function. For example, to check
whether a GenomeNode supports SequenceNode functionality, use
  GtSequenceNode *sn;
  /* gt_sequence_node_try_cast() returns NULL if gn is not a sequence node */
  if ((sn = gt_sequence_node_try_cast(gn)) == NULL) {
    printf("Could not cast genome node -- is not a sequence node!\n");
  }
> gt_genome_node_delete(gn);
> }
>
> /* free */
> gt_node_stream_delete(fam_in_stream);
>
>
> I adapted my own source code in "extended/fam_in_stream.c" from your
> file "extended/bed_in_stream.c."
> Unfortunately, I don't know how to access my parsed value (fam_id) from
> each line which must be in GtGenomeNode gn.
> Is there any possibility to cast the gn to access my string fam_id in
> each line?
IMHO the cleanest solution would be to subclass GenomeNode, thus
creating your own node type, and define appropriate accessors there
(e.g. gt_fam_node_get_fam_id()). Gordon, do you agree?
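To make the idea concrete, here is a minimal standalone sketch of the
pattern (a node "subclass" with its own try_cast function and accessor).
It deliberately does not use the GenomeTools headers: Node, NodeClass,
FamNode, fam_node_new(), fam_node_try_cast(), fam_node_get_fam_id() and
node_delete() are all hypothetical names made up for illustration. A real
GtGenomeNode subclass would have to follow the conventions of the existing
node implementations in the GenomeTools sources instead.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for illustration only, NOT the GenomeTools API. */
typedef struct Node Node;

/* Per-class data shared by all instances of one node type. */
typedef struct {
  const char *name;
  void (*free_members)(Node *node); /* may be NULL */
} NodeClass;

/* Base "class": every node starts with a pointer to its class. */
struct Node {
  const NodeClass *c_class;
};

/* "Subclass": the base must be the first member so that a Node* pointing
   at a FamNode can be cast back to FamNode*. */
typedef struct {
  Node parent;
  char *fam_id;
} FamNode;

static void fam_node_free_members(Node *node)
{
  free(((FamNode*) node)->fam_id);
}

static const NodeClass fam_node_class = { "FamNode", fam_node_free_members };

/* Create a new FAM node holding a copy of fam_id; returns NULL on
   allocation failure. */
Node* fam_node_new(const char *fam_id)
{
  FamNode *fn;
  assert(fam_id);
  fn = malloc(sizeof *fn);
  if (!fn)
    return NULL;
  fn->parent.c_class = &fam_node_class;
  fn->fam_id = malloc(strlen(fam_id) + 1);
  if (!fn->fam_id) {
    free(fn);
    return NULL;
  }
  strcpy(fn->fam_id, fam_id);
  return &fn->parent;
}

/* Return the node as a FamNode if it is one, NULL otherwise. */
FamNode* fam_node_try_cast(Node *node)
{
  if (node && node->c_class == &fam_node_class)
    return (FamNode*) node;
  return NULL;
}

/* Accessor for the value parsed from one input line. */
const char* fam_node_get_fam_id(const FamNode *fn)
{
  assert(fn);
  return fn->fam_id;
}

/* Free a node of any class, including class-specific members. */
void node_delete(Node *node)
{
  if (!node)
    return;
  if (node->c_class->free_members)
    node->c_class->free_members(node);
  free(node);
}
```

Inside your while loop you would then try the cast on each node pulled
from the stream and, if it succeeds, read the ID through the accessor
before deleting the node. Keeping the cast check in one function means
callers never have to look at the struct layout themselves.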
As a side note: for efficient sequential (unmapped) sequence access, the
GtSeqIterator and GtSeqIteratorQual classes in the latest GenomeTools
versions may help you. I do not know the FAM format you are using here,
but since I remember an earlier question of yours, I thought I would
mention it ;)
> Thanks a lot in advance
> Best regards
> David
No problem!
Sascha
--
Sascha Steinbiss
Center for Bioinformatics
University of Hamburg
Bundesstr. 43
20146 Hamburg
Germany
Email: steinbiss at zbh.uni-hamburg.de
URL: http://www.zbh.uni-hamburg.de/steinbiss
Phone: +49 (40) 42838 7322
FAX: +49 (40) 42838 7312