[gt-users] gff3 parser

Brent Pedersen bpederse at gmail.com
Fri Feb 13 18:12:54 CET 2009


On Fri, Feb 13, 2009 at 5:34 AM, Sascha Steinbiss
<steinbiss at zbh.uni-hamburg.de> wrote:
> Sascha Steinbiss wrote:
>> Gordon Gremme wrote:
>>>> ok. i think i've resolved this:
>>>> http://gist.github.com/62770
>> This patch introduced some memleaks
>
> A few words about the memory leaks:
>
> +static void make_id_unique(GtGFF3Visitor *gff3_visitor, GtStr *id)
> +{
> +
> +  GtCstrTable *used_ids = gff3_visitor->gt_used_ids;
> +  unsigned long i = 0;
> +  GtStr *id_clone;
> +
> +  if (gt_cstr_table_get(used_ids, gt_str_get(id))) {
> +      id_clone = gt_str_clone(id);
> +      while ( gt_cstr_table_get(used_ids, gt_str_get(id_clone) )) {
> +        /* TODO: add warning */
> +        id_clone = gt_str_clone(id);
> +        make_unique_id_string(id_clone, ++i);
> +      }
> +      /* update table with the new id */
> +      gt_str_set(id, gt_str_get(id_clone));
> +  }
> +  gt_cstr_table_add(used_ids, gt_str_get(id));
> +
> +}
>
> Note that gt_str_clone() creates a new copy of the string which needs to
> be properly freed using gt_str_delete(). This is not easy here, as the
> only reference to the new clone is overwritten in each iteration,
> leading to leaking memory. Sometimes it is better to use a single GtStr
> as a buffer for temporary strings:
>
> static bool id_string_is_unique(GtStr *id, GtStr *buf, GtCstrTable *tab,
>                                 unsigned long i)
> {
>   gt_str_reset(buf);
>   gt_str_append_str(buf, id);
>   make_unique_id_string(buf, i);
>   return (gt_cstr_table_get(tab, gt_str_get(buf)) == NULL);
> }
> static void make_id_unique(GtGFF3Visitor *gff3_visitor, GtStr *id)
> {
>   unsigned long i = 1;
>
>   if (gt_cstr_table_get(gff3_visitor->used_ids, gt_str_get(id)))
>   {
>     GtStr *buf = gt_str_new();
>     gt_warning("feature ID '%s' is not unique", gt_str_get(id));
>     while (!id_string_is_unique(id, buf, gff3_visitor->used_ids, i))
>     {
>       i++;
>     }
>     gt_warning("make feature ID unique: %s -> %s", gt_str_get(id),
>                                                    gt_str_get(buf));
>     /* update table with the new id */
>     gt_str_set(id, gt_str_get(buf));
>     gt_str_delete(buf);
>   }
>   gt_cstr_table_add(gff3_visitor->used_ids, gt_str_get(id));
> }
>
> Also, you need to take care of freeing the GtCstrTable created in the
> visitor constructor.
>
> As for the tests, the error messages of the failing tests can be found
> in the corresponding <GT_HOME>/testsuite/stest_testsuite/test* directories.
>
> Have fun coding,
> Sascha
>
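
For readers without a GenomeTools checkout, the single-buffer idea from the
quoted message can be sketched in plain C. Everything here is an assumption
made for illustration: IdSet is a hypothetical stand-in for GtCstrTable, and
the ".<n>" suffix is an invented format, not necessarily what
make_unique_id_string() produces. The point is only that each candidate ID is
written into the same buffer, so no per-iteration copy exists to leak.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_IDS 64
#define MAX_ID_LEN 64

/* Hypothetical stand-in for GtCstrTable: a tiny fixed-size set of strings. */
typedef struct {
  char ids[MAX_IDS][MAX_ID_LEN];
  size_t count;
} IdSet;

/* Return 1 if `id` is already in the set, 0 otherwise. */
static int id_set_contains(const IdSet *set, const char *id)
{
  size_t i;
  for (i = 0; i < set->count; i++) {
    if (strcmp(set->ids[i], id) == 0)
      return 1;
  }
  return 0;
}

/* Record `id` in the set; the set keeps its own copy. */
static void id_set_add(IdSet *set, const char *id)
{
  assert(set->count < MAX_IDS);
  assert(strlen(id) < MAX_ID_LEN);
  strcpy(set->ids[set->count++], id);
}

/* Make `id` unique with respect to `set`, then record it.
   `id` must point to a writable buffer of at least MAX_ID_LEN bytes.
   Each candidate is written into the same stack buffer, so there is
   no allocation inside the loop and nothing to free afterwards. */
static void make_id_unique(IdSet *set, char *id)
{
  char buf[MAX_ID_LEN];
  unsigned long i = 1;

  if (id_set_contains(set, id)) {
    do {
      /* Overwrite the buffer; the ".<n>" suffix is an assumed format. */
      int n = snprintf(buf, sizeof buf, "%s.%lu", id, i++);
      assert(n > 0 && (size_t) n < sizeof buf);
    } while (id_set_contains(set, buf));
    strcpy(id, buf);
  }
  id_set_add(set, id);
}
```

With an empty set, passing "g1" three times in a row leaves the three buffers
holding g1, g1.1 and g1.2. The GtStr version in the quoted message has the same
shape: one gt_str_new() before the loop, gt_str_reset() on each iteration, and
one gt_str_delete() after it.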

thanks, i didn't know what gt_str_reset did. and i found GT_MEM_BOOKKEEPING.
so now the tests fail because previously a line like:
chr8    AUGUSTUS    gene    3922    44514   1   -   .   ID=g1
would have a new id created as:
chr8    AUGUSTUS    gene    3922    44514   1   -   .   ID=gene1
whereas now it still has ID=g1.

should i just change the tests, or do you want to keep the original behavior
when retainids is not used?

-b

