Blogs, News Sites, and Forums
First, we check the top-level domain (TLD). If the website has a country-specific TLD (e.g. 'http://example.fr'), we set the location to that country. If a generic TLD is used (e.g. .com, .net, .org), we detect the location from the geolocation of the IP address that serves the website. This method is valid in more than 90% of cases. For websites, we limit detection to the country level, since city level is not relevant.
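The two-step check for websites can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the `CC_TLDS` table is a tiny hypothetical sample, and `geolocate_host` stands in for whatever IP geolocation service is used (it is passed in by the caller and is an assumption of this example).

```python
from typing import Callable, Optional
from urllib.parse import urlparse

# Hypothetical, deliberately tiny sample of country-code TLDs.
# A real table would cover every country-code TLD.
CC_TLDS = {"fr": "FR", "de": "DE", "be": "BE", "nl": "NL", "uk": "GB"}


def detect_site_country(
    url: str,
    geolocate_host: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Return a country code for a website, or None if it cannot be determined."""
    host = urlparse(url).hostname
    if not host:
        return None  # not a usable URL

    # Step 1: a country-specific TLD decides the country directly.
    tld = host.rsplit(".", 1)[-1].lower()
    if tld in CC_TLDS:
        return CC_TLDS[tld]

    # Step 2: generic TLD (.com, .net, .org, ...): fall back to the
    # geolocation of the IP address serving the site (assumed helper).
    return geolocate_host(host)
```

For example, `detect_site_country("http://example.fr", lookup)` resolves to France from the TLD alone and never calls `lookup`, while a `.com` address is handed to the geolocation fallback.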
Social Networks
Location detection on Twitter is done in multiple steps:
- If the tweet has a location attached (coming from a smartphone, for example), then we use that location.
- Otherwise, we use the location that the user has entered on their profile page.
The location on a Twitter profile page is a free-text field, which means that these locations are not always exact (many people even enter things like "the internet" or "everywhere"). In those cases, we use heuristics to determine the most likely location. If, for example, only a city name was filled in and that city exists in multiple countries, the city with the largest population is chosen. Similar heuristics are applied for people who fill in multiple locations in that field.
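The largest-population rule for ambiguous city names could look something like the sketch below. The `Place` type, the `GAZETTEER` list, and the `resolve_city` function are all hypothetical names introduced for illustration, and the population figures are rough placeholder values rather than authoritative data.

```python
from typing import List, NamedTuple, Optional


class Place(NamedTuple):
    city: str
    country: str
    population: int


# Hypothetical gazetteer; populations are rough, illustrative values only.
GAZETTEER: List[Place] = [
    Place("Paris", "FR", 2_100_000),
    Place("Paris", "US", 25_000),     # Paris, Texas
    Place("London", "GB", 8_800_000),
    Place("London", "CA", 420_000),   # London, Ontario
]


def resolve_city(profile_text: str, gazetteer: List[Place] = GAZETTEER) -> Optional[Place]:
    """Pick the most populous city whose name matches the profile text."""
    name = profile_text.strip().casefold()
    candidates = [p for p in gazetteer if p.city.casefold() == name]
    if not candidates:
        return None  # e.g. "the internet" or "everywhere"
    return max(candidates, key=lambda p: p.population)
```

With this sample data, a profile that only says "Paris" resolves to the French city, and free text such as "the internet" yields no match at all.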
Note that the detected location is not always accurate. Sometimes, words that have an ordinary meaning in a language are interpreted as a geographical location.
We work with a 'confidence threshold' to decide whether or not to assign a tweet a specific location or label it as unknown.
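Putting the Twitter steps and the confidence threshold together, the decision might be sketched as below. How confidence is actually scored is not described here, so the score is simply passed in; the `CONFIDENCE_THRESHOLD` value and the `locate_tweet` function are assumptions made for this example.

```python
from typing import Optional, Tuple

# Hypothetical cut-off; the real threshold value is not stated.
CONFIDENCE_THRESHOLD = 0.6


def locate_tweet(
    tweet_location: Optional[str],
    profile_guess: Optional[Tuple[str, float]],
) -> str:
    """Return a location label for a tweet, or "Unknown"."""
    # Step 1: a location attached to the tweet itself wins.
    if tweet_location:
        return tweet_location

    # Step 2: otherwise use the guess derived from the profile field,
    # but only if its confidence reaches the threshold.
    if profile_guess is not None:
        place, confidence = profile_guess
        if confidence >= CONFIDENCE_THRESHOLD:
            return place

    return "Unknown"
```

A low-confidence guess, such as the word "nice" in a profile being read as the French city of Nice, would fall below the threshold and be labelled as unknown rather than assigned a wrong location.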
Foursquare is straightforward, since this platform always provides us with a very accurate location.
Location detection on Facebook is not possible since Facebook does not allow third parties to retrieve the location. This is a huge limitation, but unfortunately, something that cannot be fixed. This also means that location filters such as 'country' are not applicable for Facebook mentions. Also, when your keyword search contains a 'country' condition, this may limit the incoming mentions from Facebook as well.
Social networks such as Google+, FriendFeed, and others also lack location information. For social mentions, we don't use IP addresses or domains to determine a location. If no location information is available, the location is set to Unknown (this can be changed manually).