searxng/searx/plugins/ahmia_filter.py
Markus Heiser 82fd0dac60 [mod] lower memory footprint by lazy loading JSON data
This patch implements lazy loading of the JSON data.

Motivation: in most requests not all JSON data is needed, but loaded.  By
example these four JSON files:

- currencies.json ~550KB
- engine_descriptions.json ~1,3MB
- external_bangs.json ~1,3MB
- osm_keys_tags.json ~ 2,2MB

most often not used and consume a lot of memory and BTW they also extend the
time required to instantiate a walker.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-04-29 19:02:00 +02:00

29 lines
854 B
Python

# SPDX-License-Identifier: AGPL-3.0-or-later
# pylint: disable=missing-module-docstring
from hashlib import md5
from searx import data
name = "Ahmia blacklist"
description = "Filter out onion results that appear in Ahmia's blacklist. (See https://ahmia.fi/blacklist)"
default_on = True
preference_section = 'onions'
ahmia_blacklist = None
def on_result(_request, _search, result):
if not result.get('is_onion') or not result.get('parsed_url'):
return True
result_hash = md5(result['parsed_url'].hostname.encode()).hexdigest()
return result_hash not in ahmia_blacklist
def init(_app, settings):
global ahmia_blacklist # pylint: disable=global-statement
if not settings['outgoing']['using_tor_proxy']:
# disable the plugin
return False
ahmia_blacklist = data.ahmia_blacklist_loader()
return True