Commit graph

52 commits

Author SHA1 Message Date
Markus Heiser ba8959ad7c [fix] typos / reported by @kianmeng in searx PR-3366
[PR-3366] https://github.com/searx/searx/pull/3366

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-09-27 18:32:14 +02:00
Alexandre FLAMENT dd0887be18 xpath engine: change raise_for_httperror to no_result_for_http_status
no_result_for_http_status contains a list of HTTP status.
These HTTP status are seen an empty result list.
In other cases an exception is thrown as usual.

Previously raise_for_httperror were ignoring all HTTP error,
which make defective engines invisible in the stats.
2022-09-04 09:07:28 +02:00
Markus Heiser a15dfa5ee1 [fix] engine woxikon.de - don't raise exception on empty result list
Woxikon expects a word in German, so with query "foo" the site finds nothing and
respons a 404:

    httpx.HTTPStatusError: Client error '404 Not Found' \
      for url 'https://synonyme.woxikon.de/synonyme/foo.php'

[1] https://github.com/searxng/searxng/issues/1543#issuecomment-1193317054

Closes: https://github.com/searxng/searxng/issues/1543
Suggested-by: @allendema [1]
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-09-04 09:07:28 +02:00
Allen dae8a08089
[fix[ Update only cookies/headers 2022-04-17 11:29:23 +02:00
Allen 67fb6fba84
[lint] Remove whitespace
From GH GUI
2022-04-17 10:42:25 +02:00
Allen 155333f625
[enh] Allow passing headers/cookies from settings.yml
Example:

   - engine: xpath
   - search_url: example.org
   - headers: {'example_header': 'example_header'}
   - cookies: {'safesearch': 'off'}
2022-04-16 17:42:04 +02:00
Markus Heiser 3d96a9839a [format.python] initial formatting of the python code
This patch was generated by black [1]::

    make format.python

[1] https://github.com/psf/black

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-12-27 09:26:22 +01:00
Alexandre Flament 0e42db9da1 [mod] xpath engine: remove logging of the requested URL 2021-09-11 10:13:16 +02:00
Markus Heiser f0059b80ed [pylint] engines: drop no longer needed 'missing-function-docstring'
Suggested-by: @dalf https://github.com/searxng/searxng/issues/102#issuecomment-914168470
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-07 13:26:59 +02:00
Markus Heiser 82847df300 [fix] add 'categories' to PYLINT_ADDITIONAL_BUILTINS_FOR_ENGINES
androp no longer needed (see line 591 in 7b235a1)::

    # pylint: disable=undefined-variable

Suggested-by: @dalf https://github.com/searxng/searxng/issues/102#issuecomment-914068609
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-07 10:29:38 +02:00
Markus Heiser aecfb2300d [mod] one logger per engine - drop obsolete logger.getChild
Remove the no longer needed `logger = logger.getChild(...)` from engines.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-06 18:05:46 +02:00
Markus Heiser 9ff881f937 [fix] remove minimum length of content for XPath engine
Instead of raising an exception and therefore hiding all results of the engine.

It make sense to remove that requirement in order to allow the implementation of
search engines that do not always have a description.  In fact some search
engines that in 99% of the case have a description like Brave Search or Mojeek
crash completely if they for some reason included a result with no description.

To test this patch try Mojeek:

    !mjk xyz

before and after the patch.

Suggested-by: 0xhtml in https://github.com/searx/searx/discussions/2933
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-04 12:41:23 +02:00
Markus Heiser 84a943f867 [enh] XPath engine - add time safe-search support
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-05-23 22:26:18 +02:00
Markus Heiser 6bfe3fd033 [enh] XPath engine - add time range support
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-05-23 16:49:30 +02:00
Markus Heiser 1933577c8e [enh] XPath engine - add ISO 639-1 {lang} replacement to search-URL
BTW: remove obsolte params['query'] and not needed paging condition.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-05-23 15:05:36 +02:00
Markus Heiser 8cd544b2a6 [doc] add documentation about the XPath engine
- pylint searx/engines/xpath.py
- fix indentation of some long lines
- add logging
- add doc-strings

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-05-23 11:48:21 +02:00
Markus Heiser ffcebf5e12 [enh] xpath engine - add request parameter 'soft_max_redirects'
Make 'soft_max_redirects' configurable per Xpath engine::

    - name : <engine-name>
      engine : xpath
      soft_max_redirects: 1
      ...

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-05-17 15:04:55 +02:00
Alexandre Flament a4dcfa025c [enh] engines: add about variable
move meta information from comment to the about variable
so the preferences, the documentation can show these information
2021-01-14 20:57:17 +01:00
Alexandre Flament d41cafd5f3 [fix] xpath, mojeek: fix commit 58d72f2692
before commit 58d72f2, category was not set in xpath.py,
so searx/engines/__init__py was setting the category to ['general']

the commit 58d72f2 set the category to [] which is not replaced by searx/engines/__init__.py
consequence: the mojeek engine is hidden in the preferences.

this commit revert the xpath.py change.

close #2368
2020-12-10 10:52:06 +01:00
Alexandre Flament ad72803ed9 [mod] xpath, 1337x, acgsou, apkmirror, archlinux, arxiv: use eval_xpath_* functions 2020-12-03 10:22:48 +01:00
Alexandre Flament 58d72f2692 [mod] pylint: minor code change to allow pylint globally
This commit is only a step, it doesn't fix all the issues reported by pylint
2020-11-03 11:35:53 +01:00
a01200356 c3daa08537 [enh] Add onions category with Ahmia, Not Evil and Torch
Xpath engine and results template changed to account for the fact that
archive.org doesn't cache .onions, though some onion engines migth have
their own cache.

Disabled by default. Can be enabled by setting the SOCKS proxies to
wherever Tor is listening and setting using_tor_proxy as True.

Requires Tor and updating packages.

To avoid manually adding the timeout on each engine, you can set
extra_proxy_timeout to account for Tor's (or whatever proxy used) extra
time.
2020-10-25 17:59:05 -07:00
Alexandre Flament 2006eb4680 [mod] move extract_text, extract_url to searx.utils 2020-10-02 18:13:56 +02:00
Dalf 1022228d95 Drop Python 2 (1/n): remove unicode string and url_utils 2020-09-10 10:39:04 +02:00
xywei 1d4657b714
Fix relative urls that do not start with '/' 2020-07-23 11:12:19 -05:00
Dalf 85b3723345 [mod] speed optimization
compile XPath only once
avoid redundant call to urlparse
get_locale(webapp.py): avoid useless call to request.accept_languages.best_match
2019-11-15 09:33:15 +01:00
Alexandre Flament f34b5cedb1
[fix] fixes google play engines (#1651)
update commit 87baa74a86
2019-07-25 09:31:47 +02:00
Venca24 87baa74a86 [fix] fixes google play engines and adds thumbnails to their results (#1612)
fix google play apps, google play apps, google play music engines

xpath engine: thumbnail_xpath can define an optional thumbnail
2019-07-25 07:46:41 +02:00
Marc Abonce Seguin 343e555ee9 [fix] append http if no scheme is provided in xpath's extact_url
This solves a bug with Yahoo where some results don't specify
a protocol.
2018-04-08 20:35:34 -05:00
Adam Tauber 1972a044a3 [fix] produce valid urls if scheme is missing 2017-05-22 15:48:37 +02:00
Adam Tauber 52e615dede [enh] py3 compatibility 2017-05-15 12:02:30 +02:00
David A Roberts 7492997c51 [fix] allow empty content 2017-01-17 21:14:33 +10:00
Alexandre Flament 90e1db3e5c [fix] extract_text: use html.tostring instead html_to_text. Fix #711 2016-12-31 13:56:09 +01:00
David A Roberts 1e9dab08e6 [fix] behaviour for page_size>1 and first_page_num>0
eg. pageno=1,21,41,... instead of 20,40,60,...
2016-08-14 22:10:25 +10:00
Kirill Isakov bacc9a3df1 Add paging support to XPath & Erowid engines 2016-03-28 19:15:03 +06:00
Adam Tauber bd22e9a336 [fix] pep8 compatibilty 2016-01-18 12:47:31 +01:00
Cqoicebordel 44c9216c49 Sanitize extract_text 2015-01-25 20:04:44 +01:00
potato 6f535b6fae [fix] error when xpath_results in extraxt_text is _ElementUnicodeResult instead of _ElementStringResult 2014-03-04 19:43:41 +01:00
asciimoo c1d7d30b8e [mod] len() removed from conditions 2014-02-11 13:13:51 +01:00
asciimoo b647244abf [fix] function parameters 2014-01-30 03:10:20 +01:00
asciimoo 3dcb835910 [fix] function parameters 2014-01-30 02:36:05 +01:00
asciimoo fe82637eac [enh] importable url extractor 2014-01-30 02:32:58 +01:00
asciimoo 59eeeaab87 [fix] html tag removal 2014-01-23 11:08:08 +01:00
asciimoo b2492c94f4 [fix] pep/flake8 compatibility 2014-01-20 02:31:20 +01:00
asciimoo 060ea4d2f5 [fix] whitespaces removed 2014-01-12 18:48:38 +01:00
Dalf 3dc3fc7770 [mod][fix] xpath engine simplified, yahoo engine never returns truncated urls 2014-01-05 14:06:52 +01:00
dalf 664c039b38 xpath engine: bug fix 2013-12-30 22:34:35 +01:00
asciimoo e50a72b0e3 [enh] suggestion support for xpath engine 2013-11-13 19:33:09 +01:00
asciimoo 17bf00ee42 [enh] removing result html tags 2013-11-09 18:39:20 +01:00
asciimoo 7965da55a7 [fix] urlparsing fix 2013-10-27 12:01:03 +01:00