    • Following from this:

      I'd like to flesh out the concept of topic graphs and how they might be useful to finding interesting content here on Cake in more detail.

      The situation: each day, new conversations are created on Cake. Each time this happens, the user starting the conversation is asked to add it to up to five different topics. This makes the conversation available to those who already follow at least one of the topics - but not to those who might be following a "related" topic but not exactly the one that has been used. Users might want to manage their followed topics every once in a while, adding new ones and getting rid of those that no longer work - but what topics should they pick?

      The algorithm:
      1) At the end of each week, iterate over all conversations that have been started this week (alternatively: that have been active this week).
      2) For each conversation, iterate over all pairs of topics it has been added to, sorting each pair so that (topic_a, topic_b) and (topic_b, topic_a) count as the same pair.
      3) For each such pair of topics, increase its count by 1.

      The intermediate result will be a "weekly list" like this, with a higher count corresponding to more conversations with both topics, and thus a closer relationship between those topics:
      - (topic_a, topic_b) 5
      - (topic_a, topic_c) 2
      - (topic_b, topic_d) 3
      - (topic_c, topic_e) 1
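
      A rough sketch of steps 1 to 3 in Python, in case it helps to see it spelled out. The function name and the sample conversations are made up for illustration; I'm assuming each conversation can be reduced to a plain list of topic names.

```python
from collections import Counter
from itertools import combinations


def count_topic_pairs(conversations):
    """Count how often each pair of topics appears on the same conversation.

    `conversations` is an iterable of topic lists, one list per conversation.
    """
    counts = Counter()
    for topics in conversations:
        # Sort and de-duplicate so (a, b) and (b, a) land on the same key.
        for pair in combinations(sorted(set(topics)), 2):
            counts[pair] += 1
    return counts


# Hypothetical week of conversations, invented for this example.
week = [
    ["topic_a", "topic_b"],
    ["topic_b", "topic_a", "topic_c"],
    ["topic_c", "topic_e"],
]
weekly_list = count_topic_pairs(week)
```

      With at most five topics per conversation, each conversation contributes at most ten pairs, so the weekly pass stays cheap.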

      4) Optionally, add up the last N weekly lists (perhaps weighting them differently) into a single one to use from here on, so that changes in conversation behavior are reflected more gradually.
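
      One possible way to do the weighting in step 4, sketched in Python. The exponential decay and the factor of 0.5 are only my assumptions here; any other weighting scheme would plug in the same way. The function name and sample counts are made up.

```python
from collections import defaultdict


def combine_weekly_lists(weekly_lists, decay=0.5):
    """Merge weekly pair counts into one list, newest week first.

    Week 0 (the newest) has weight 1, week 1 has weight `decay`,
    week 2 has weight `decay ** 2`, and so on.
    """
    combined = defaultdict(float)
    for age, weekly in enumerate(weekly_lists):
        weight = decay ** age
        for pair, count in weekly.items():
            combined[pair] += weight * count
    return dict(combined)


# Hypothetical counts for two weeks, invented for this example.
this_week = {("topic_a", "topic_b"): 5, ("topic_a", "topic_c"): 2}
last_week = {("topic_a", "topic_b"): 2, ("topic_b", "topic_d"): 4}
combined = combine_weekly_lists([this_week, last_week])
```

      Here the pair (topic_a, topic_b) ends up with 5 + 0.5 * 2 = 6, while (topic_b, topic_d) only keeps half of last week's count.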

      The graph: To make this work for content exploration, we want to turn the above list into a graph where related topics are connected and unrelated topics are not, and where it is easier to travel between "more related" than between "less related" topics. In order to do so, we take the reciprocal of the above count and consider this value to be the distance between two topics. In the above example, this would mean that topics A and B are very close (distance 1/5), while C and E are less so (distance 1/1).

      These distances can then be used to calculate the shortest route between topics, even if there is no direct connection. In the above example, there hasn't been any conversation with both topics A and D, but by adding the distance from A to B (1/5) and the distance from B to D (1/3), we get a distance of about 0.53 between them. This means that A and D must be much more closely related than A and E (distance 1/2 + 1/1 = 1.5 via C).
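
      To make the distance idea concrete, here is a minimal Python sketch that turns the example counts into a graph and runs Dijkstra's algorithm from one topic. Dijkstra is my choice for the shortest-route step, not something fixed by the idea itself, and the function names are made up.

```python
import heapq
from collections import defaultdict


def build_graph(pair_counts):
    """Turn pair counts into an undirected graph with distance 1 / count."""
    graph = defaultdict(dict)
    for (a, b), count in pair_counts.items():
        if count > 0:  # pairs that never co-occur get no edge at all
            graph[a][b] = graph[b][a] = 1.0 / count
    return graph


def shortest_distances(graph, start):
    """Dijkstra's algorithm: shortest distance from `start` to each reachable topic."""
    dist = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        d, topic = heapq.heappop(queue)
        if d > dist[topic]:
            continue  # stale queue entry, a shorter route was already found
        for neighbor, edge in graph.get(topic, {}).items():
            candidate = d + edge
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return dist


# The example weekly list from above.
counts = {
    ("topic_a", "topic_b"): 5,
    ("topic_a", "topic_c"): 2,
    ("topic_b", "topic_d"): 3,
    ("topic_c", "topic_e"): 1,
}
dist = shortest_distances(build_graph(counts), "topic_a")
```

      For the example, `dist["topic_d"]` comes out as 1/5 + 1/3 and `dist["topic_e"]` as 1.5, matching the numbers worked out by hand.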

      The UI: In the end, we probably don't want to see these details directly (I do, but the average user probably doesn't!). Instead, this graph should be used to decide what topics to suggest to a user:

      For example, if the user visits the page for topic A, a box at the end of the page (or after every N conversations) could suggest five other topics the user doesn't already follow, chosen by their distance to A.
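
      The selection for such a box could look roughly like this in Python. I'm assuming the distances from the current topic have already been computed; the function name, the hard-coded distances (taken from the example above) and the followed set are only for illustration.

```python
def suggest_topics(distances, current, followed, limit=5):
    """Return up to `limit` unfollowed topics, closest to `current` first."""
    candidates = [
        (distance, topic)
        for topic, distance in distances.items()
        if topic != current and topic not in followed
    ]
    # Sorting the (distance, topic) tuples orders by distance, then by name.
    return [topic for _, topic in sorted(candidates)[:limit]]


# Distances from topic_a in the example graph.
distances_from_a = {
    "topic_a": 0.0,
    "topic_b": 1 / 5,
    "topic_c": 1 / 2,
    "topic_d": 1 / 5 + 1 / 3,
    "topic_e": 1.5,
}
# Hypothetical user who already follows topic_b.
suggestions = suggest_topics(distances_from_a, "topic_a", {"topic_b"}, limit=2)
```

      Topics that cannot be reached from A at all simply never show up in the distances, so they are never suggested.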

    • Hey Factotum, this is fascinating. Thanks for writing this up. Our silence isn't for lack of interest but we wanted to think it over and debate it among ourselves, which we're doing. More to follow.

    • Thank you for writing this up, Factotum!

      This sounds like a major improvement to our topic suggestion system. As you've pointed out, right now we only suggest topics to follow based on the topics the author tagged the conversation with. We don't have a relationship graph mapped out between topics, apart from bucketing them under categories. Each topic can live in one or many categories, but there is no relationship established across categories.

      For v1 we decided on this simple approach in order to ship sooner and learn. For v2, your suggested topic graph idea is what we'd love to do.