Commit graph

54 commits

Author SHA1 Message Date
Bart Schuurmans 3aefbb548e Allow serving BookWyrm on a non-standard port 2024-04-24 15:30:47 +02:00
Joeri de Ruiter 2920973961 Some small improvements to annotations 2023-07-28 20:54:03 +02:00
Joeri de Ruiter f07d7b02f1 Type annotations and related changes for bookwyrm.connectors 2023-07-28 17:43:32 +02:00
Wesley Aptekar-Cassels 3e78e398c0 Switch from priority queues to function-based queues
Fixes: #2907
2023-07-20 12:25:30 -04:00
Mouse Reeve cbb027c56c Merge pull request #2778 from ranok/upstream_pr
Move the search request logic into the AbstractConnector
2023-04-25 16:20:24 -07:00
Jacob Torrey 84834eb5d3 Run bw-dev black to fix formatting
Signed-off-by: Jacob Torrey <jacob@jacobtorrey.com>
2023-04-17 15:06:41 +00:00
Wesley Aptekar-Cassels 1048638e30 Stop ignoring task results
This is essentially a revert of 9cbff312a. The commit was at the advice
of the Celery docs for optimization, but I've since decided that the
downsides in terms of making things harder to debug (it makes Flower
nearly useless, for instance) are bigger than the upsides in performance
gain (which seem extremely small in practice, given how long our tasks
take, and the number of tasks we have).
2023-04-07 21:51:44 -04:00
Josh Soref 06fa1adc27 spelling: arbitrary
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2023-04-04 20:02:54 -04:00
Jacob Torrey f9c75a43ae Fixing pylint issues
Signed-off-by: Jacob Torrey <jacob@jacobtorrey.com>
2023-04-04 16:46:32 +00:00
Jacob Torrey 797d339132 Move the search request logic into the AbstractConnector to allow for more flexibility
Signed-off-by: Jacob Torrey <jacob@jacobtorrey.com>
2023-04-04 16:03:37 +00:00
Wesley Aptekar-Cassels 9cbff312a5 Ignore Celery task results
Since we don't use the results of our Celery tasks (all of them return
None implicitly), it's prudent to set the ignore_result flag, for a
potential performance improvement. See the Celery docs for details [1].

We could do this with the global CELERY_IGNORE_RESULT setting, but it
offers more flexibility if we want to use task results in the future to
set it on a per-task basis.

[1]: https://docs.celeryq.dev/en/stable/userguide/tasks.html#ignore-results-you-don-t-want
2023-03-08 02:12:13 -05:00
André Jaenisch 530d7de309 Use variable instead of string
Signed-off-by: André Jaenisch <andre.jaenisch@posteo.de>
2022-11-13 16:59:05 +01:00
Mouse Reeve 5706028656 Log failing to connect as info instead of exception
These are normal, expected errors, and while we should probably
re-evaluate the connectors in some way, pending that, there's no need to
log these as unexpected errors, which causes confusion and clutters my
error logging.
2022-07-11 08:47:18 -07:00
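A minimal sketch of the logging change described in 5706028656, assuming the expected failure surfaces as a `ConnectionError`. The `fetch` function and its `getter` argument are hypothetical names used for illustration, not BookWyrm's actual code.

```python
import logging

logger = logging.getLogger(__name__)


def fetch(url, getter):
    """Call getter(url); log expected connection failures at INFO level."""
    try:
        return getter(url)
    except ConnectionError as err:
        # Expected when a remote site is down: logger.info records one line
        # without a traceback, whereas logger.exception would log at ERROR
        # level and attach the full stack trace.
        logger.info("Failed to connect to %s: %s", url, err)
        return None
```

Returning `None` lets the caller treat an unreachable connector as "no data" rather than as a crash.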
Mouse Reeve d149e57494 Split expand book data task into per-edition tasks
Loading every edition in one task takes ages, and produces a large task
that clogs up the queue. This will create more, smaller tasks that will
finish more quickly.
2022-05-31 12:41:57 -07:00
Mouse Reeve c3b35760a2 Updates test mocks for remote search 2022-05-31 09:37:54 -07:00
Mouse Reeve 969db13ff2 Safely return None in remote search return_first 2022-05-31 08:49:23 -07:00
Mouse Reeve a053f20961 Re-implements return first option
Since we get all the results quickly now, this aggregates all the
results that came back and sorts them by confidence, and returns the
highest confidence result. The confidences aren't great on free text
search, but conceptually that's how it should work at least.

It may make sense to aggregate the search results in all contexts, but
I'll propose that in a separate PR.
2022-05-31 08:20:59 -07:00
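The "return first" behaviour described in a053f20961 could look roughly like the sketch below: flatten the results from every connector, sort by confidence, and return the top one. The `SearchResult` class and `first_result` function are hypothetical stand-ins, not the project's real types.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SearchResult:
    """Hypothetical result shape; only `confidence` matters for the sketch."""

    title: str
    confidence: float


def first_result(
    results_by_connector: list[list[SearchResult]],
) -> Optional[SearchResult]:
    """Flatten every connector's results and return the most confident one."""
    all_results = [r for results in results_by_connector for r in results]
    if not all_results:
        return None  # nothing came back from any connector
    # Highest confidence first, then take the head of the list.
    return sorted(all_results, key=lambda r: r.confidence, reverse=True)[0]
```

The empty-list guard mirrors the later "Safely return None" commit: with no results at all, the caller gets `None` instead of an `IndexError`.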
Mouse Reeve 98ed03b6b4 Python formatting and test update 2022-05-30 17:00:34 -07:00
Mouse Reeve 83ee5a756f Filter Inventaire results by confidence 2022-05-30 16:42:37 -07:00
Mouse Reeve 525e2a591d More error handling
Adds logging and error handling for some of the numerous ways a request
could fail (the remote site is down, the url is blocked, etc).

I also have the results boxes open by default, which makes it more
legible imo.
2022-05-30 12:40:13 -07:00
Mouse Reeve 45f2199c71 Gather and wait on async requests
This sends out the request tasks all at once and then aggregates the
results, instead of just running them one after another asynchronously.
2022-05-30 12:05:22 -07:00
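The pattern described in 45f2199c71, starting all requests at once and then collecting the results together, can be sketched with `asyncio.gather`. The `query_connector` coroutine below is a hypothetical stand-in that sleeps instead of making a real HTTP request.

```python
import asyncio


async def query_connector(name: str, delay: float) -> dict:
    """Hypothetical stand-in for one remote search request."""
    await asyncio.sleep(delay)  # simulates network latency
    return {"connector": name, "results": [f"{name}-hit"]}


async def search_all(connectors: dict[str, float]) -> list[dict]:
    """Start every request at once, then wait for all of them together."""
    tasks = [
        asyncio.create_task(query_connector(name, delay))
        for name, delay in connectors.items()
    ]
    # gather returns results in the order the tasks were passed in,
    # regardless of which request finishes first.
    return await asyncio.gather(*tasks)


def run_search(connectors: dict[str, float]) -> list[dict]:
    """Synchronous entry point for callers outside an event loop."""
    return asyncio.run(search_all(connectors))
```

Because the tasks run concurrently, total wall time is roughly that of the slowest connector rather than the sum of all of them.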
Mouse Reeve 5e81ec75fb Set request headers in async search get request
Gotta ask for json
2022-05-30 11:19:16 -07:00
Mouse Reeve 9a9cef7766 Verify url before async search
The database lookup doesn't work during the async process, so this change
loops through the connectors and grabs the formatted urls before sending
them to the async handler.
2022-05-30 11:16:05 -07:00
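One way to picture the ordering described in 9a9cef7766: do all synchronous work (formatting and checking URLs) first, then hand only ready-made URLs to the async handler. The names `build_search_urls` and `handle`, the URL templates, and the scheme check are assumptions for illustration. The real check involves a database lookup, which is not reproduced here.

```python
import asyncio
from urllib.parse import quote, urlparse


def build_search_urls(connectors: dict[str, str], query: str) -> list[tuple[str, str]]:
    """Synchronously format and check each connector's search URL."""
    items = []
    for name, template in connectors.items():
        url = template.format(query=quote(query))
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            continue  # skip anything that is not a plain web URL
        items.append((name, url))
    return items


async def handle(items: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Hypothetical async handler: it only ever sees pre-built URLs."""

    async def one(name: str, url: str) -> tuple[str, str]:
        await asyncio.sleep(0)  # placeholder for the real HTTP request
        return name, url

    return await asyncio.gather(*(one(n, u) for n, u in items))
```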
Mouse Reeve 0adda36da7 Remove search endpoints from Connector
Instead of having individual search functions that make individual
requests, the connectors will always be searched asynchronously
together. The process_search_response combines the parse and format
functions, which could probably be merged into one overridable
function.

The current to-do on this is to remove Inventaire search results that
are below the confidence threshold after search, which used to happen
in the `search` function.
2022-05-30 10:37:24 -07:00
Mouse Reeve 9c03bf782e Make an async request to all search connectors
This is the untested first pass at re-arranging remote search to work in
parallel rather than sequence. It moves a couple functions around
(raise_not_valid_url, for example, needs to be in connector_manager.py
now to avoid circular imports). It adds a function to Connector objects
that generates a search result (either to the isbn endpoint or the free
text endpoint) based on the query, which was previously done as part of
the search.

I also lowered the timeout to 8 seconds by default.
2022-05-30 10:15:22 -07:00
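The 8-second default timeout mentioned in 9c03bf782e can be illustrated with `asyncio.wait_for`. This is a sketch under assumptions: the project may apply its timeout through its HTTP client rather than this way, and `with_timeout` and `DEFAULT_TIMEOUT` are hypothetical names.

```python
import asyncio

DEFAULT_TIMEOUT = 8  # seconds, the default named in the commit message


async def with_timeout(coro, timeout: float = DEFAULT_TIMEOUT):
    """Await coro, giving up and returning None once timeout seconds pass."""
    try:
        return await asyncio.wait_for(coro, timeout=timeout)
    except asyncio.TimeoutError:
        # A slow connector should not hold up the rest of the search.
        return None
```

Wrapping each connector request this way bounds the whole search at roughly the timeout value when the requests run in parallel.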
Mouse Reeve 72d6a4ce52 Log info, not exception, for expected errors 2022-03-11 14:55:54 -08:00
Mouse Reeve b18c69e186 Make search timeouts configurable 2022-01-07 07:42:05 -08:00
Mouse Reeve 5dd2aac600 Merge branch 'main' into search-refactor 2021-09-30 10:41:30 -07:00
Mouse Reeve acfb1bb376 Updating string format syntax part 2 2021-09-18 11:32:00 -07:00
Mouse Reeve fbe05623ff Updates first_search_result functionality 2021-09-16 11:07:36 -07:00
Mouse Reeve 1f06d1a1d8 Removes local connector 2021-09-14 15:26:36 -07:00
Mouse Reeve aa91361fe4 Fixes celery kwarg for queue 2021-09-07 17:09:44 -07:00
Mouse Reeve de3f18655c Set priorities on tasks 2021-09-07 16:33:43 -07:00
Mouse Reeve 9e5c7053e9 More pylint fixes 2021-06-18 14:29:24 -07:00
Mouse Reeve cf3869ad32 Adds timeouts to get requests 2021-06-17 12:34:54 -07:00
Mouse Reeve 9b42bba236 Filter out inactive connectors 2021-05-11 11:34:58 -07:00
Mouse Reeve df2c1f0723 Merge branch 'main' into fixes-search-display 2021-05-10 13:29:39 -07:00
Mouse Reeve 1844dd6b20 Only include result blobs with results in search results 2021-05-10 13:01:11 -07:00
Mouse Reeve 13dc5efe71 More comprehensive tests for connector search 2021-05-10 12:53:36 -07:00
Mouse Reeve 5cd974b78d Python formatting 2021-05-10 10:03:05 -07:00
Mouse Reeve f2d985e583 Uses one set of search logic for all results or just first 2021-05-10 09:57:53 -07:00
Mouse Reeve f2a6cfb4f3 Remove deduplication of external search results 2021-04-30 16:04:27 -07:00
Mouse Reeve 1edd00a0d1 Merge branch 'main' into list-fixes 2021-04-26 09:44:55 -07:00
Mouse Reeve 3ade2d3bb1 New version of black, new whitespace 2021-04-26 09:15:42 -07:00
Mouse Reeve 0f6b5cc6be Filter list search results to hide already added books 2021-04-26 08:02:30 -07:00
Mouse Reeve 954958b6f9 Handle arbitrary errors in isbn search 2021-04-07 10:54:00 -07:00
Mouse Reeve f11d64f984 Handle all connector errors in search 2021-04-07 08:09:47 -07:00
Mouse Reeve 8ea60c66a3 Create connectors to federated bookwyrm servers
This got messed up when I refactored how connectors work! Poor
bookwyrm.social doesn't have a wyrms.de connector, but this will fix
that.
2021-04-01 17:02:45 -07:00
Mouse Reeve 66b7a3d193 Avoids error on empty search query 2021-03-31 12:03:58 -07:00
Mouse Reeve 414dd6bd20 Adds isbn search test to connector manager 2021-03-13 10:01:17 -08:00