RE: Tagging a document with language

Robert Raisch, The Internet Company (raisch@internet.com)
Fri, 7 Jun 1996 10:41:03 -0400


Henk, I'd like to see what you've done. We have a number of
European customers who are tackling heterogeneous language
indexing. Thanks. </rr>

On 7 Jun 96 at 10:01, Henk Alles wrote:

> A more interesting approach is for the indexer to try to determine
> the language of the document, perhaps based on a statistical analysis.
> Problems will probably arise with mixed-language files.
>
> What do you think of that? Has anyone done this?
>
>
> I wrote such a thing a while ago. If anyone wants it, mail me.
>
> It is written in C++ and Delphi as a 16-bit DLL. There is a little
> program to test, import, and run some statistics on the data. It can
> handle up to 65535 different languages.
>
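
For anyone following the thread, here is a rough sketch of what
"statistical analysis" of a document's language could look like. It is
not Henk's DLL and makes no claim about how his code works. It assumes
one common technique, comparing character-trigram frequencies by cosine
similarity. The names trigram_profile, cosine, guess_language, SAMPLES
and PROFILES are made up for illustration, and the three sample
sentences are toy training data, far too small for real use.

```python
from collections import Counter
import math


def trigram_profile(text):
    """Count overlapping character trigrams in lowercased, space-padded text."""
    padded = " " + " ".join(text.lower().split()) + " "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two trigram count profiles."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy reference texts (assumption: a real indexer would train on far more data).
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog and then "
          "the dog runs away with the other animals",
    "nl": "de snelle bruine vos springt over de luie hond en dan "
          "loopt de hond weg met de andere dieren",
    "fr": "le renard brun rapide saute par dessus le chien paresseux "
          "et puis le chien part avec les autres animaux",
}
PROFILES = {lang: trigram_profile(text) for lang, text in SAMPLES.items()}


def guess_language(text):
    """Return (best_language, scores); best_language is None for empty input."""
    profile = trigram_profile(text)
    if not profile:
        return None, {}
    scores = {lang: cosine(profile, ref) for lang, ref in PROFILES.items()}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    for line in ("the dog and the fox",
                 "de hond en de vos",
                 "le chien et les animaux"):
        print(line, "->", guess_language(line)[0])
```

For the mixed-language files mentioned in the quoted text, one option
under the same assumptions would be to call guess_language on each
paragraph separately rather than on the whole file, and treat
paragraphs whose top scores are close together as undecided.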

--
  </rr> Rob Raisch, The Internet Company