[gt-users] gff3 parser
Gordon Gremme
gremme at gmail.com
Tue Feb 17 17:07:10 CET 2009
>> bah. i was trying to avoid becoming a c programmer.
It doesn't hurt to know some C; all major scripting languages are
written in it ;-)
>>>> should i just change the tests or do you want to keep the original behavior when
>>>> -retainids is not used?
>>> Please keep the original behaviour.
>> ok. all tests pass with this patch:
>> http://gist.github.com/62770/
Cool!
> We should maybe generally discuss whether ID renaming may be
> counterproductive in the case of multi-line features, e.g.
>
> $ bin/gt gff3 -retainids testdata/multi_feature_simple.gff3
> ##gff-version 3
> ##sequence-region ctg123 1 1497228
> warning: feature ID "CDS1" not unique: changing to CDS1.1
> ctg123 . gene 1000 9000 . + . ID=gene1
> ctg123 . CDS 1201 1500 . + 0 ID=CDS1;Parent=gene1
> ctg123 . CDS 3000 3902 . + 0 ID=CDS1.1;Parent=gene1
>
> breaks the "same-ID" rule for the multi-line features. I am not sure
> whether this may lead to problems or not...
> Any comments?
Good point Sascha, that is a bug in the new retainids functionality.
Multi-features have to be handled explicitly (similar to the
non-retainids case) to avoid this problem.
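
To make the idea concrete, here is a rough Python sketch (not the actual
GenomeTools C code). The function name assign_ids and the dict layout are
hypothetical, and the rule used to recognise a multi-feature (same seqid,
type and Parent as the line that first used the ID) is an assumption made
only for illustration:

```python
def assign_ids(features):
    """Return the ID to emit for each feature line, in input order.

    Each feature is a dict with the keys "seqid", "type", "id" and,
    optionally, "parent". ASSUMPTION: lines that repeat an ID with the
    same seqid, type and parent form one multi-line feature and must
    keep the shared ID; any other repeat is a clash and gets renamed.
    """
    owner = {}   # emitted ID -> (seqid, type, parent) of the feature using it
    result = []
    for feat in features:
        key = (feat["seqid"], feat["type"], feat.get("parent"))
        candidate, n = feat["id"], 0
        # Only pick a new ID if the candidate belongs to a different feature.
        while candidate in owner and owner[candidate] != key:
            n += 1
            candidate = f"{feat['id']}.{n}"
        owner[candidate] = key
        result.append(candidate)
    return result


# The three lines from testdata/multi_feature_simple.gff3 quoted above.
example = [
    {"seqid": "ctg123", "type": "gene", "id": "gene1"},
    {"seqid": "ctg123", "type": "CDS", "id": "CDS1", "parent": "gene1"},
    {"seqid": "ctg123", "type": "CDS", "id": "CDS1", "parent": "gene1"},
]
print(assign_ids(example))
```

With this rule both CDS lines keep ID=CDS1, while a CDS1 line that hangs
off a different parent would still be renamed to CDS1.1.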
I think after fixing this and adding two additional -retainids tests
(one for "normal" features and one for multi-features), we are ready
for prime time!
Gordon