searxng/searx
Markus Heiser 9328c66e93 [fix] google news - send CONSENT Cookie to not be redirected
In the EU, the "General Data Protection Regulation" [1] aka GDPR (BTW: very
user friendly!) requires consent to tracking.  To obtain that consent,
google-news requests are redirected to
https://consent.google.de/s?continue=... where the user confirms and receives a
CONSENT Cookie.

This patch adds a CONSENT Cookie to the google-news request to avoid
redirection.
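
A minimal sketch of the idea, not the code of this patch: it assumes the
request is described by a plain ``params`` dict with a ``cookies`` entry, that
``{{date}}`` in the samples below stands for a ``YYYYMMDD`` date, and that the
value observed for news.google.com can be reused as is.  The helper names
``consent_cookie`` and ``add_consent_cookie`` are hypothetical.

```python
from datetime import date


def consent_cookie(day: date, region: str = "de", number: str = "816") -> str:
    """Build a CONSENT value shaped like the news.google.com sample.

    Assumption: the date part is formatted as YYYYMMDD.
    """
    return "YES+cb.{}-14-p0.{}+FX+{}".format(day.strftime("%Y%m%d"), region, number)


def add_consent_cookie(params: dict, day: date) -> dict:
    """Attach the CONSENT cookie to the outgoing request parameters."""
    params.setdefault("cookies", {})["CONSENT"] = consent_cookie(day)
    return params


# Hypothetical request parameters for a google-news query.
params = add_consent_cookie({"url": "https://news.google.com/"}, date(2021, 6, 18))
print(params["cookies"])
```

Sending the cookie with the first request means the consent page is never
involved, so no extra round trip is needed.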

The behavior of the CONSENT cookie across all google engines seems similar, but
the pattern is not yet fully clear to me.  Here are some random samples from my
analysis:

Using common google search from different domains::

    google.com:        CONSENT=YES+cb.{{date}}-14-p0.de+FX+816
    google.de:         CONSENT=YES+cb.{{date}}-14-p0.de+FX+333
    google.fr:         CONSENT=YES+srp.gws-{{date}}-0-RC2.fr+FX+826

When searching for videos (google-videos)::

    google.es:         CONSENT=YES+srp.gws-{{date}}-0-RC2.es+FX+076
    google.de:         CONSENT=YES+srp.gws-{{date}}-0-RC2.de+FX+171

Google news has only one domain for all languages::

    news.google.com:   CONSENT=YES+cb.{{date}}-14-p0.de+FX+816

Using google-scholar search from different domains::

    scholar.google.de: CONSENT=YES+cb.{{date}}-14-p0.de+FX+333
    scholar.google.fr: does not use such a cookie / did not ask the user
    scholar.google.es: does not use such a cookie / did not ask the user
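
To compare the samples above field by field, they can be split on ``+`` and
``.``.  This is only an exploratory sketch: the field names (``state``,
``family``, ``region``, ``flag``, ``number``) are labels made up here, not
anything documented by Google, and the engine name ``google`` for the common
search is likewise an assumption.

```python
from collections import defaultdict

# (engine, domain, observed CONSENT value) taken from the samples above.
SAMPLES = [
    ("google", "google.com", "YES+cb.{{date}}-14-p0.de+FX+816"),
    ("google", "google.de", "YES+cb.{{date}}-14-p0.de+FX+333"),
    ("google", "google.fr", "YES+srp.gws-{{date}}-0-RC2.fr+FX+826"),
    ("google-videos", "google.es", "YES+srp.gws-{{date}}-0-RC2.es+FX+076"),
    ("google-videos", "google.de", "YES+srp.gws-{{date}}-0-RC2.de+FX+171"),
    ("google-news", "news.google.com", "YES+cb.{{date}}-14-p0.de+FX+816"),
    ("google-scholar", "scholar.google.de", "YES+cb.{{date}}-14-p0.de+FX+333"),
]


def split_consent(value: str) -> dict:
    """Split a CONSENT value into its '+'-separated parts (hypothetical labels)."""
    state, token, flag, number = value.split("+")
    return {
        "state": state,
        "family": token.split(".", 1)[0],   # text before the first dot
        "region": token.rsplit(".", 1)[1],  # text after the last dot
        "flag": flag,
        "number": number,
    }


# Collect which token families were seen per engine.
families = defaultdict(set)
for engine, _domain, value in SAMPLES:
    families[engine].add(split_consent(value)["family"])

for engine in sorted(families):
    print(engine, sorted(families[engine]))
```

Grouped this way, the common google search shows both the ``cb`` and the ``srp``
form, which illustrates why no single pattern fits all engines yet.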

Interim summary:

  The pattern is unclear, so I won't apply the CONSENT cookie to all google
  engines.  More experience is needed before we generalize the CONSENT cookie
  across all google engines.

Related:

- e9a6ab401 [fix] youtube - send CONSENT Cookie to not be redirected
- https://github.com/benbusby/whoogle-search/issues/311
- https://github.com/benbusby/whoogle-search/issues/243

[1] https://en.wikipedia.org/wiki/General_Data_Protection_Regulation
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-18 13:21:20 +02:00
answerers [fix] answers: don't crash when the query is an empty string 2021-03-01 10:52:39 +01:00
data [pylint] searx/data/__init__.py 2021-06-09 18:08:23 +02:00
engines [fix] google news - send CONSENT Cookie to not be redirected 2021-06-18 13:21:20 +02:00
metrics [refactor] metrics.get_reliabilities() - make code more readable 2021-05-22 15:17:18 +02:00
network [mod] move all default settings into searx.settings_defaults 2021-06-01 08:10:15 +02:00
plugins [mod] oscar theme: /preferences : HTML detail order match visual tabs 2021-06-17 15:29:07 +02:00
raise_for_httperror [enh] rewrite and enhance metrics 2021-04-21 16:24:46 +02:00
search [fix] typo: online_dictionnary --> online_dictionary 2021-06-04 15:05:58 +02:00
shared [fix] checker: don't run the checker when uwsgi is not properly configured 2021-01-13 14:07:39 +01:00
static [mod] make node.clean: call the "clean" script from the packages.json 2021-06-16 16:04:58 +02:00
templates [mod] oscar theme: /preferences : HTML detail order match visual tabs 2021-06-17 15:29:07 +02:00
translations [enh] update translations from transifex 2021-03-27 19:10:54 +01:00
__init__.py [mod] move hook to set Unix thread name into searx.unixthreadname 2021-06-08 15:54:11 +02:00
autocomplete.py [httpx] replace searx.poolrequests by searx.network 2021-04-12 17:25:56 +02:00
exceptions.py [enh] add raise_for_httperror 2020-12-11 14:37:08 +01:00
external_bang.py [mod] add utils/fetch_external_bangs.py 2021-02-24 18:48:36 +01:00
external_urls.py [enh] openstreetmap / map template: improve results 2021-06-09 18:08:23 +02:00
flaskfix.py [mod] refactor: move Flask proxy fix to searx.flaskfix module 2021-06-08 15:54:11 +02:00
languages.py Update searx.data - update_languages.py 2021-03-05 10:56:46 +00:00
preferences.py [mod] move all default settings into searx.settings_defaults 2021-06-01 08:10:15 +02:00
query.py [enh] autocomplete refactoring, autocomplete on external bangs 2021-03-01 19:12:32 +01:00
results.py [fix] offline engine: don't crash on time recording 2021-05-22 15:17:18 +02:00
settings.yml [fix] typo in a searx/settings.yml 2021-06-16 16:51:28 +02:00
settings_defaults.py [pylint] searx/__init__.py & searx/settings_defaults.py 2021-06-01 16:03:19 +02:00
settings_loader.py [fix] unit test: don't load /etc/searx/settings.yml 2021-05-18 17:23:21 +02:00
settings_robot.yml [yamllint] searx/settings_robot.yml 2021-06-05 17:41:24 +02:00
testing.py Bump pylint from 2.7.4 to 2.8.2 2021-05-03 15:45:30 +02:00
unixthreadname.py [mod] move hook to set Unix thread name into searx.unixthreadname 2021-06-08 15:54:11 +02:00
utils.py [fix] strip spaces from searx user agent 2021-06-09 18:08:23 +02:00
version.py [enh] release v1.0.0 2021-03-27 20:30:08 +01:00
webadapter.py [enh] add ability to send engine data to subsequent requests 2021-03-06 12:12:35 +01:00
webapp.py [coding-style] searx/webapp.py - normalize indentations 2021-06-10 09:35:00 +02:00
webutils.py [mod] move all default settings into searx.settings_defaults 2021-06-01 08:10:15 +02:00