
spam corpus

Posted on March 21st, 2007 in Language by jiajun925

Normally, when you get a spam email, you delete it or mark it as spam, so that the program (e.g., Fox) can do machine learning and remember it. I think a lot of people around the world are attacked by spam email. On the other hand, a lot of people are probably sending spam email too, in different countries and in different languages. If you collected the spam email into a corpus and then translated it (by machine translation, of course) into other languages, it could attract a lot of users or be more practical.

Along with the internationalization of spam email, machine translation will develop rapidly, especially AV machine translation, or AV robots.

It should be fun.