[Debtags-devel] Using Python NLTK for tag generation [was: AI for tag generation].
Gustavo Franco
Gustavo Franco <gustavorfranco@gmail.com>
Fri, 1 Oct 2004 14:02:32 -0300
Hi,
I guess that NLTK is more useful than that, but I would prefer to code
something and see whether the results fit our needs. Maybe you know
more about NLTK than I do? If not, please check this URL:
http://www-106.ibm.com/developerworks/linux/library/l-cpnltk.html.
NLTK can be used to:
- Split the package descriptions (corpora) of already tagged packages into tokens;
- Associate these tokens with their tags;
- Take some package descriptions of non-tagged packages and try to
classify them using the text classification support already in NLTK.
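
The three steps above can be sketched roughly as follows. This is only a minimal illustration using the Python standard library, not NLTK's own classifier API: the sample descriptions, the tag names, and the helper names (tokenize, train, suggest_tags) are all made up for the example, and the scoring is a simple naive-Bayes-style count with add-one smoothing, computed per tag.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical training data: (description, tags) for already tagged packages.
TAGGED = [
    ("console mail reader with threading support", ["mail", "console"]),
    ("graphical mail client for the desktop", ["mail", "x11"]),
    ("text-mode web browser", ["web", "console"]),
    ("graphical web browser for the desktop", ["web", "x11"]),
]


def tokenize(description):
    """Step 1: lowercase a description and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", description.lower())


def train(tagged_packages):
    """Step 2: associate tokens with tags by counting co-occurrences."""
    token_counts = defaultdict(Counter)  # tag -> token -> count
    tag_counts = Counter()               # tag -> number of packages carrying it
    for description, tags in tagged_packages:
        tokens = tokenize(description)
        for tag in tags:
            tag_counts[tag] += 1
            token_counts[tag].update(tokens)
    return token_counts, tag_counts


def suggest_tags(description, model, top=2):
    """Step 3: score every known tag for an untagged description."""
    token_counts, tag_counts = model
    vocabulary = {tok for counts in token_counts.values() for tok in counts}
    total_assignments = sum(tag_counts.values())
    tokens = tokenize(description)
    scores = {}
    for tag, counts in token_counts.items():
        total = sum(counts.values())
        # Log prior of the tag plus smoothed log likelihood of each token.
        score = math.log(tag_counts[tag] / total_assignments)
        for tok in tokens:
            score += math.log((counts[tok] + 1) / (total + len(vocabulary)))
        scores[tag] = score
    return sorted(scores, key=scores.get, reverse=True)[:top]


model = train(TAGGED)
print(suggest_tags("small graphical mail client", model))
```

No grammar rules are involved here: the only input is token counts taken from descriptions that already have tags. A real experiment would need far more training data and some way to decide how many tags to suggest per package.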
Why would the hard work of writing grammar rules be needed here?
I agree with having a special tag too, but couldn't it be 'special:tag-verified'?
Thanks,
Gustavo Franco -- <stratus@acm.org>
On Fri, 1 Oct 2004 17:49:01 +0200, Erich Schubert
<erich.schubert@gmail.com> wrote:
> Hi,
> NLTK and similar natural-language-processing tools are mostly of use
> when you have a limited grammar you want to understand perfectly, i.e.
> parsing commands like "search for all movies directed by stephen
> spielberg".
> This requires a lot of work in writing grammar rules and such, but we
> can't actually use this grammar information. Therefore I don't think
> NLTK will help that much.
>
> The suggestion by Enrico of a "special:completely-tagged" tag is
> sweet. I'd appreciate having this tag added to the vocabulary.
>