[gt-users] how to access GtGenomeNode content from stream?

David Ellinghaus d.ellinghaus at ikmb.uni-kiel.de
Tue Apr 21 14:17:48 CEST 2009


Hi Sascha,

thanks for your reply!
I think I will then try to implement my own FamNode class, which must be 
a GenomeNode subclass.
(A FAMily file is used in case/control studies for pedigree information.)

Nice to hear that you implemented a stream class for sequence information.
A few weeks ago I implemented this on my own, but I think I will use your 
implementation next time :-)

Best regards to all
David
> David Ellinghaus wrote:
>   
>> Dear genometools developers,
>>     
>
> Dear David,
>
>   
>> I am trying to get familiar with genometools and would like to implement a 
>> stream in genometools
>> that simply reads huge tables from input files line by line.
>> I wrote something like this for my new "tools/gt_fam.c":
>>
>>   GtNodeStream *fam_in_stream;
>>   GtGenomeNode *gn;
>>
>>   /* create a fam input stream */
>>   fam_in_stream = gt_fam_in_stream_new(gt_str_get(arguments->str_file_selection));
>>
>>   /* pull the FAM files through the stream and free them afterwards */
>>   while (!(had_err = gt_node_stream_next(fam_in_stream, &gn, err)) && gn) {
>>    
>>     (1) How to get the information from gn ?
>>     
>
> That depends on which GenomeNode subclass/implementation you have 
> created in the input stream. The SequenceNode, FeatureNode and 
> CommentNode classes offer methods to extract specific data (see the 
> respective header files), while the GenomeNode (genome_node_api.h) 
> methods offer general accessors to ranges etc.
> You can check if a GenomeNode is of a specific subclass/implements a 
> specific interface by calling a try_cast function. For example, to check 
> whether a GenomeNode supports SequenceNode functionality, use
>
> GtSequenceNode *sn;
> if ((sn = gt_sequence_node_try_cast(gn)) == NULL) {
>    printf("Could not cast genome node -- is not a sequence node!\n");
> }
>
>   
>>     gt_genome_node_delete(gn);
>>   }
>>
>>   /* free */
>>   gt_node_stream_delete(fam_in_stream);
>>
>>
>> I adapted my own source code in "extended/fam_in_stream.c" from your 
>> file "extended/bed_in_stream.c".
>> Unfortunately, I don't know how to access my parsed value (fam_id) from 
>> each line, which must be in the GtGenomeNode gn.
>> Is there any way to cast gn so that I can access my string fam_id for 
>> each line?
>>     
>
> IMHO the cleanest solution would be to subclass GenomeNode, thus 
> creating your own node type, and define appropriate accessors there 
> (e.g. gt_fam_node_get_fam_id()). Gordon, do you agree?
>
> As a side note: for efficient sequential (unmapped) sequence access, the 
> GtSeqIterator and GtSeqIteratorQual classes in the latest GenomeTools 
> versions may help you. I do not know the FAM format you are using here, 
> but since I remember an earlier question of yours, I thought I would 
> mention it ;)
>
>   
>> Thanks a lot in advance
>> Best regards
>> David
>>     
>
> No problem!
> Sascha
>
>   
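
For illustration, the approach suggested above (subclass the node type, check
the subclass with a try_cast function, then read the value through an
accessor) can be sketched in plain C. This is a standalone sketch, not
GenomeTools code: Node, NodeClass, FamNode, fam_node_new(),
fam_node_try_cast(), fam_node_get_fam_id() and node_delete() are hypothetical
stand-ins that only mimic the pattern, and the real GenomeTools class
machinery may well differ in its details.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a node class descriptor (not the real GT type). */
typedef struct NodeClass {
  const char *name;
  void (*free_members)(void *node); /* frees subclass-specific data */
} NodeClass;

/* Hypothetical stand-in for the GenomeNode base type. */
typedef struct Node {
  const NodeClass *c_class;
} Node;

/* The subclass embeds the base type as its first member, so a pointer to
   a FamNode can also be used as a pointer to its Node part. */
typedef struct FamNode {
  Node parent;
  char *fam_id; /* the value parsed from one line of the FAM file */
} FamNode;

static void fam_node_free_members(void *node)
{
  free(((FamNode*) node)->fam_id);
}

static const NodeClass fam_node_class = { "FamNode", fam_node_free_members };
static const NodeClass other_node_class = { "OtherNode", NULL };

/* Create a FamNode and return it as a generic Node (NULL if out of memory). */
Node* fam_node_new(const char *fam_id)
{
  FamNode *fn;
  size_t len;
  assert(fam_id != NULL);
  fn = malloc(sizeof *fn);
  if (!fn)
    return NULL;
  len = strlen(fam_id) + 1;
  fn->fam_id = malloc(len);
  if (!fn->fam_id) {
    free(fn);
    return NULL;
  }
  memcpy(fn->fam_id, fam_id, len);
  fn->parent.c_class = &fam_node_class;
  return &fn->parent;
}

/* Create a node of some other class, used to show a failing cast. */
Node* other_node_new(void)
{
  Node *n = malloc(sizeof *n);
  if (n)
    n->c_class = &other_node_class;
  return n;
}

/* Return the node as a FamNode, or NULL if it belongs to another class. */
FamNode* fam_node_try_cast(Node *n)
{
  if (n && n->c_class == &fam_node_class)
    return (FamNode*) n;
  return NULL;
}

/* Accessor for the parsed value; the string stays owned by the node. */
const char* fam_node_get_fam_id(const FamNode *fn)
{
  assert(fn != NULL);
  return fn->fam_id;
}

/* Delete any node, letting its class free the subclass-specific members. */
void node_delete(Node *n)
{
  if (!n)
    return;
  if (n->c_class->free_members)
    n->c_class->free_members(n);
  free(n);
}
```

Inside a loop that pulls nodes from a stream, one would call
fam_node_try_cast() on each node and read fam_id through the accessor only
when the cast returns a non-NULL pointer, before deleting the node.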


-- 
David Ellinghaus
Institute for Clinical Molecular Biology
Christian-Albrechts-University Campus Kiel
House 6, Arnold-Heller-Str.3
D-24105 Kiel, Germany

Email:  d.ellinghaus at ikmb.uni-kiel.de
Phone:  +49-(0)431-597-1963
FAX  :  +49-(0)431-597-1842


