Google Patent Reveals Secrets
Patent 20050071741, dated March 31, 2005, reveals some interesting methods Google uses to gather information about websites. Gathering that much information is an enormous task, and indexing it so it can be presented coherently is an even bigger one. What catches my attention in this patent is that the word “spam” appears 20 times.
There are several factors that may affect the quality of the results generated by a search engine. For example, some web site producers use SPAMming techniques to artificially inflate their rank. Also, “stale” documents (i.e., those documents that have not been updated for a period of time and, thus, contain stale data) may be ranked higher than “fresher” documents (i.e., those documents that have been more recently updated and, thus, contain more recent data).
Certain signals may be used to distinguish between illegitimate and legitimate domains. For example, domains can be renewed up to a period of 10 years. Valuable (legitimate) domains are often paid for several years in advance, while doorway (illegitimate) domains rarely are used for more than a year. Therefore, the date when a domain expires in the future can be used as a factor in predicting the legitimacy of a domain and, thus, the documents associated therewith.
What do you think? It would never have occurred to me that registering a domain for just one year could be read as a sign that it is illegitimate. Of course, now that Google is allowed to register domains (even though Google does not plan to do so), the advice has to be to register them for more years... forgive my paranoia.
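To make the domain-expiration idea from the quoted passage concrete, here is a minimal sketch of how such a signal might be scored. The patent excerpt gives no formula, so everything here is my own assumption: the function name `expiry_signal`, the linear scaling, and the use of the 10-year renewal limit as the cap are purely illustrative and are not Google's method.

```python
from datetime import date


def expiry_signal(expires_on: date, today: date, max_years: float = 10.0) -> float:
    """Hypothetical score in [0, 1]: the further away the expiry date, the higher the score.

    Assumption: the score grows linearly with the remaining registration time
    and is capped at max_years (10 years, the renewal limit the patent mentions).
    """
    days_left = (expires_on - today).days
    if days_left <= 0:
        # An already expired domain gets no credit from this signal.
        return 0.0
    years_left = days_left / 365.25
    return min(years_left / max_years, 1.0)


if __name__ == "__main__":
    today = date(2005, 3, 31)
    # A domain paid for one more year scores low; one paid for ten years scores high.
    print(expiry_signal(date(2006, 3, 31), today))
    print(expiry_signal(date(2015, 3, 31), today))
```

Under these assumptions, a domain registered for one year would score roughly a tenth of what a domain paid ten years ahead scores. A real ranking system would presumably treat this as only one factor among many, as the patent text itself suggests.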