A genetic example of Latent Dirichlet Allocation


            There are lots of blogs or examples online to explain how a document is generated by Latent Dirichelt Allocation. But during a discussion last week, Prof. Banks gave me a very informative example of Latent Dirichelet Allocation, and it illustrates the way we inherit the genes from our ancestors.

            Suppose Mark is a young data scientist living in the San Francisco Bay Area, his maternal family is from Japan, and his paternal family is from Europe. His favorite food is volcano rolls and Cobb Salad. What kind of genes might Mark have? Let’s try to analyze it in a LDA way:

            How every part of genes is chosen(generated) before he was born?

            This example is impressive and I personally think it’s more reasonable than many topic modeling examples, since usually the way people use words and make sentences is more dynamic and unpredictable.