Changelog
=========
Important UPDATE:
As of February 2023, the overcomingbias blog has moved to substack.
As a result, this web scraper no longer works.
Apologies for the inconvenience!
Versions follow the `Semantic Versioning 2.0.0 `_
standard.
obscraper 0.8.2 (2022-12-17)
****************************
Improvements
############
- Build with hatchling.
obscraper 0.8.1 (2022-07-07)
****************************
Improvements
############
- Use HTTP/2 client for improved performance.
obscraper 0.8.0 (2022-05-27)
****************************
Improvements
############
- No longer fail for posts where comments appear to be disabled. (I.e. posts without a
``disqus_id``.) This applies to 4 posts:
- ``2007/12/welcome-to-overcoming-bias``
- ``2008/08/about-the-future-of-humanity-institute``
- ``2008/08/yudkowskys-book``
- ``2009/02/the-most-important-thing``
Trivial/Internal Changes
########################
- Use absolute rather than relative imports for clarity.
obscraper 0.7.0 (2022-03-14)
****************************
Breaking changes
################
- Post ``name`` is now formatted without the leading slash. E.g.:
- Before: /2009/02/the-most-important-thing
- After: 2009/02/the-most-important-thing
Bug fixes
#########
- Fixed small bug matching URLs in regular expressions.
obscraper 0.6.0 (2022-03-11)
****************************
Breaking changes
################
- Updated API: the ``internal_links`` and ``external_links`` attributes of
:ref:`Post ` are now lists (possibly containing duplicates) rather than
dictionaries.
obscraper 0.5.0 (2022-02-10)
****************************
Major update.
Improvements
############
- Asynchronous execution: internals now execute requests and postprocessing
asynchronously using `trio `_. This is at least
10% faster than the previous multithreaded version.
- Improved tests: migrated all tests to pytest. Added more systematic testing of random
posts.
- Sessions: internals now use (asynchronous) sessions, reducing the load on the
overcomingbias server and increasing download speed.
Breaking changes
################
- Updated interface for consistency and clarity:
- ``grab_edit_dates`` is now :ref:`get_edit_dates `
- ``get_votes`` and ``get_comments`` are now :ref:`get_vote_counts `
and :ref:`get_comment_counts `
- Updated behaviour of :ref:`get_all_posts ` to return None when the post
could not be retrieved.
- Removed outdated ``max_workers`` argument from public API functions.
Trivial / internal changes
##########################
- Source code now follows the `black `_ format.
obscraper 0.4.0 (2022-02-06)
****************************
Features
########
- Added `logging `_
functionality, and documentation in the
:doc:`Getting Started ` guide.
Bug fixes
#########
- :ref:`AttributeNotFoundError ` exceptions are now
caught when downloading multiple posts. This prevents crashes on "broken"
posts, e.g. 2009/02/the-most-important-thing.
obscraper 0.3.0 (2022-02-03)
****************************
Breaking Changes
################
- :ref:`get_all_posts `,
:ref:`get_posts_by_edit_date ` and
*grab_edit_dates* now return post names rather than
post URLs in their keys.
- "Short" URLs - the form overcomingbias.com/?p=12345 - are no longer accepted.
This might change again in the future.
Features
########
- Add :ref:`get_post_by_name ` and
:ref:`get_posts_by_names ` to the public API.
- Add :ref:`OB_POST_URL_PATTERN ` to the public API.
- Add :ref:`url_to_name ` and :ref:`name_to_url `
to the public API.
Improved Documentation
######################
- Add information on exceptions raised by public API functions.
Trivial / internal changes
##########################
- Most internal interfaces now use post names rather than URLs.
obscraper 0.2.0 (2022-01-19)
****************************
Breaking Changes
################
- :ref:`get_posts_by_urls ` will now fail when a post
attribute can not be extracted from the post HTML, since this situation is
technically a bug. Previously it returned None.
- The :ref:`Post ` name attribute now contains the year and month of
publication, as in URLs. E.g. 'jobs-explain-lots' becomes
'2010/09/jobs-explain-lots'. This ensures the post URL can be reconstructed
from the post name.
Improvements
############
- Let users specify the maximum number of threads used to download posts, via
the ``max_workers`` optional argument.
- Remove repeated whitespace within the text, when getting post text as
plaintext.
Trivial/Internal Changes
########################
- :ref:`Post ` now represents the post URL as a property rather than
an attribute.
obscraper 0.1.3 (2022-01-18)
*****************************
First public release!
For the initial list of features, see :doc:`Getting Started `
and :doc:`Public API Reference `.
.. Entry title format: obscraper 1.2.3 (release date)
.. Entry items:
.. Breaking Changes = backward-incompatible changes
.. Deprecations = functionality marked as deprecated
.. Features = Added new features
.. Improvements = Improvements to existing features
.. Bug Fixes
.. Improved Documentation
.. Trivial/Internal Changes