Feb 10 2011
 
  • LGFBBV: To say universally “how popular” something is, you really need a fixed scale, such as the raw # of searches or some other absolute scale. But how do you get that from http://Trends.Google.com (Google Trends)?
    1. LGFBUP: For instance, currently compared to the term “Facebook”, many/most terms score at/near 0, but of course they really aren’t 0.
    2. LGFBL7: But the tool doesn’t seem to give fixed values, even with the “CSV with fixed scaling” output:
      1. LGFBLI: in multi-term comparison output
        1. LGFBER: My guess is this is intentional, to make it difficult to get the raw underlying data, even if that data would not reveal specific IPs or compromise privacy.
        2. LGFAA1: Warning: even the “CSV with fixed scaling” output is scaled somehow:
          1. LGFAM5: Among the weekly data, I haven’t yet seen a difference of more than 5x, so still the same order of magnitude (base 10).
          2. LGFAMO: Among the overall totals, I’ve seen, for “WordPress”, 0 for this’s fixed CSV vs. 10.5 for this’s CSV.
      2. LGFCBI: in single-term output
        1. LGFCD7: Example:
          1. http://google.com/trends?q=Meetup%2CBigTent&ctab=0&geo=all&date=all&sort=0 shows Meetup is about 100x more popular than BigTent now, but
          2. http://google.com/trends?q=Meetup&ctab=0&geo=all&date=all&sort=0 “CSV with fixed scaling” says it has an overall rating of “2.3” and q[01/30/11    5.55    5.00%]q (typical of recent)
          3. http://google.com/trends?q=BigTent&ctab=0&geo=all&date=all&sort=0 says it has an overall rating of “1” and q[01/30/11    31.2    >10%]q (typical of recent)
          4. which would suggest BigTent is overall fully 1/2 as popular (wrong) and has recently been 6x more popular (very wrong).
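As a quick sanity check on the numbers in this example, here is a small Python sketch. The variable names are illustrative; the values are only the ones quoted above, and the 100x figure is just the rough reading from the two-term comparison chart.

```python
# Values quoted in the example above (single-term "CSV with fixed scaling").
meetup_overall, bigtent_overall = 2.3, 1.0
meetup_recent, bigtent_recent = 5.55, 31.2   # week of 01/30/11

# Rough reading of the two-term comparison chart: Meetup is ~100x BigTent.
chart_ratio = 100.0

# Meetup/BigTent ratios the single-term exports would imply
# if both exports really shared one fixed scale.
implied_overall = meetup_overall / bigtent_overall
implied_recent = meetup_recent / bigtent_recent

# How far each implied ratio is from the chart, as a multiplicative factor.
overall_error = chart_ratio / implied_overall
recent_error = chart_ratio / implied_recent

print(f"implied overall Meetup/BigTent: {implied_overall:.2f}")
print(f"implied recent  Meetup/BigTent: {implied_recent:.2f}")
print(f"off from the chart by ~{overall_error:.0f}x (overall), ~{recent_error:.0f}x (recent)")
```

If the exports shared one scale, both error factors would be close to 1; here they are nowhere near it, so the single-term outputs cannot be compared with each other directly.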
      3. LGFCSU: To fix this universally:
        1. LGFCTF: Seems hard.
        2. LGFCTF: A most universal solution LGUNIV:
          1. LGFCZ2: Rough idea:
            1. LGFCVD: For every order of magnitude (factor of 10) of search volume, pick a standard search term of that popularity.
            2. LGFCXV: Compare these search terms with each other over time (hopefully you’d need no more than 5 of them, as beyond that it gets tricky, too). Pick, say, the central (medium-popular) term as the standard and compute all the other standards in terms of this one (how many times more or less popular, for each point (week) in the past). Cache this data.
            3. LGFD0G: Now to search a term in question,
              1. LGFD1E: Try it (binary search) against the standard search terms until you find the standard whose scale is closest to it.
              2. LGFD2K: Now, for all points in time, convert the term in question’s popularity into units of the central term picked above, using the translation table for the standard that best matches it.
          2. LGFDCR: If wanting to scale for all time, it seems this method would need to be coded/automated to be practical.
          3. LGFEA5: Strategy LGFEDY: for easy computation (in base 10), try to pick terms which
            1. LGFEBT: don’t change much in popularity and
            2. LGFECA: when sorted by size, are each a factor of 10 apart.
          4. LGG1R5: Implementation LGYUMY: works, and not that complicated!
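The rough idea above can be sketched in Python for a single week. This is not the LGYUMY implementation, only a minimal illustration under stated assumptions: the function names are hypothetical, `ratio_to(i)` stands in for reading "term / standard i" off a two-term comparison, and the demo popularities are invented so that the standards sit exactly a factor of 10 apart.

```python
import math

def factors_to_central(adjacent_ratios, central):
    """adjacent_ratios[i] = popularity(standard[i+1]) / popularity(standard[i]).
    Returns factors[i] = popularity(standard[i]) / popularity(standard[central])."""
    n = len(adjacent_ratios) + 1
    factors = [1.0] * n
    for i in range(central + 1, n):          # walk up from the central term
        factors[i] = factors[i - 1] * adjacent_ratios[i - 1]
    for i in range(central - 1, -1, -1):     # walk down from the central term
        factors[i] = factors[i + 1] / adjacent_ratios[i]
    return factors

def log_distance(ratio):
    """Distance from a ratio of 1 in orders of magnitude; a reading of 0 is 'too far'."""
    return abs(math.log10(ratio)) if ratio > 0 else math.inf

def closest_standard(n_standards, ratio_to):
    """Binary search over standards sorted by ascending popularity."""
    lo, hi = 0, n_standards - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if ratio_to(mid) > 1.0:   # term is more popular than this standard
            lo = mid + 1
        else:
            hi = mid
    # The standard just below may be closer than the one the search stopped on.
    candidates = [i for i in (lo - 1, lo) if i >= 0]
    return min(candidates, key=lambda i: log_distance(ratio_to(i)))

def in_central_units(n_standards, ratio_to, factors):
    best = closest_standard(n_standards, ratio_to)
    return best, ratio_to(best) * factors[best]

# Demo with invented numbers: five standards, each 10x the previous one.
standards = [1.0, 10.0, 100.0, 1000.0, 10000.0]   # hypothetical true popularities
factors = factors_to_central([10.0, 10.0, 10.0, 10.0], central=2)
term = 250.0                                       # hypothetical term to measure
best, value = in_central_units(len(standards), lambda i: term / standards[i], factors)
print(best, value)
```

To scale a term over all time, the same conversion would be repeated for each week, using that week's cached factors.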
        3. LGFDFE: A “poor man’s” solution: only as good as its cost (won’t work very well)
          1. LGFDG9: Rough idea:
            1. LGFDH4: Do as in “A most universal solution” LGFCTF, but try to find a single term against which all terms can be compared directly.
          2. LGFFDQ: Possible? Not really.
            1. LGFFEF: Well, that term must be of medium popularity, such that every other term is within a factor of Max Compare LGFDRH of it. That in turn means all terms are within a factor of Max Compare LGFDRH^2 of each other. Since Max Compare LGFDRH is thought to be about 100, all terms must be within a factor of 100^2 = 10,000 = 10^4 of each other.
            2. LGFFHR: This is probably (no, really) not enough range.
              1. LGG17S: Intuitively this is correct. Facebook claims to have about 0.5 billion = 5*10^8 users, yet there are some websites which have ~5 users. The difference is of order 10^8, which is 10^4x greater than the “100^2 = 10,000 = 10^4” range.
              2. LGG1KA: From real example LGFVXG, the values range from 29.5 kilo to 8 milli (still a big site), which is a range of order 10^6, also greater than 10^4.
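The range arithmetic above can be checked with a few lines of Python. The variable names are illustrative, and the Max Compare value of 100 is the estimate from the text, not a measured constant.

```python
import math

max_compare = 100                      # assumed Max Compare LGFDRH factor (~100)
single_term_range = max_compare ** 2   # widest spread one reference term can cover

# Intuitive check: Facebook (~0.5 billion users) vs. a site with ~5 users.
user_range = 0.5e9 / 5

# Real example LGFVXG: values from 29.5 kilo down to 8 milli.
observed_range = 29.5e3 / 8e-3

for label, spread in [("single reference term", single_term_range),
                      ("user counts", user_range),
                      ("real example", observed_range)]:
    print(f"{label}: ~10^{math.log10(spread):.1f}")
```

Both spreads exceed what a single reference term could cover, which is why the chained standards of the more universal solution are needed.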
    3. end of Getting the raw “# of searches” (or any absolute scale) from http://Trends.Google.com results.