Blog Mining, with LSA on top

In my Computing the Mental Lexicon class we were talking about what latent semantic analysis (LSA) and related algorithms could be used for, and I wondered about doing some sort of marketing research by scanning blogs. For example, LiveJournal.com alone has 3 million users, most with specified ages, locations, and interests. If you could correlate that profile data with discussion of products (Coke, Nike, etc.) and find the words associated with each brand, that would be useful stuff.
As a bonus, you could apply some sort of semantic analysis to determine whether the comments are positive or negative. This is the kind of thing marketing departments would love. (And, at least with LSA, you could infer education level from the writing too…)
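
To make the brand-association part a bit more concrete, here's a minimal sketch in Python. It is not LSA: it only counts which words share a post with a brand name, which is the crude version of the idea. The sample posts, the stopword list, and the function names (`tokenize`, `brand_associations`) are all made up for illustration; a real version would pull posts from actual blog feeds and would probably swap the raw counts for an LSA-style similarity.

```python
import re
from collections import Counter

# Hypothetical sample posts, invented for this sketch.
POSTS = [
    "I love my new Nike shoes, so comfortable for running",
    "Running in the rain ruined my Nike shoes",
    "Coke tastes great with pizza",
    "Had pizza and a Coke after running",
    "My cat slept all day",
]

# Tiny hand-picked stopword list (an assumption, not a standard one).
STOPWORDS = {"i", "my", "so", "for", "in", "the", "a", "and",
             "with", "all", "new", "had", "after"}


def tokenize(text):
    """Lowercase a post and return its alphabetic tokens, minus stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower())
            if w not in STOPWORDS]


def brand_associations(posts, brand, top_n=3):
    """Return the words that most often share a post with `brand`."""
    brand = brand.lower()
    counts = Counter()
    for post in posts:
        tokens = set(tokenize(post))  # a set counts each word once per post
        if brand in tokens:
            counts.update(tokens - {brand})
    # Sort by count (descending), then alphabetically so ties are stable.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


if __name__ == "__main__":
    for name in ("Nike", "Coke"):
        print(name, brand_associations(POSTS, name))
```

Splitting the posts by the author's age or location before counting would give the per-demographic breakdown, assuming that profile data is available alongside each post.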

[Like this idea? Already working on it? I’d love to hear about it!]