Changelog
=========

Versions follow the `Semantic Versioning 2.0.0 `_ standard.

obscraper 0.8.2 (2022-12-17)
****************************

Improvements
############

- Build with hatchling.

obscraper 0.8.1 (2022-07-07)
****************************

Improvements
############

- Use HTTP/2 client for improved performance.

obscraper 0.8.0 (2022-05-27)
****************************

Improvements
############

- No longer fail for posts where comments appear to be disabled. (I.e. posts
  without a ``disqus_id``.) This applies to 4 posts:

  - ``2007/12/welcome-to-overcoming-bias``
  - ``2008/08/about-the-future-of-humanity-institute``
  - ``2008/08/yudkowskys-book``
  - ``2009/02/the-most-important-thing``

Trivial/Internal Changes
########################

- Use absolute rather than relative imports for clarity.

obscraper 0.7.0 (2022-03-14)
****************************

Breaking changes
################

- Post ``name`` is now formatted without the leading slash. E.g.:

  - Before: /2009/02/the-most-important-thing
  - After: 2009/02/the-most-important-thing

Bug fixes
#########

- Fixed small bug matching URLs in regular expressions.

obscraper 0.6.0 (2022-03-11)
****************************

Breaking changes
################

- Updated API: the ``internal_links`` and ``external_links`` attributes of
  :ref:`Post ` are now lists (possibly containing duplicates) rather than
  dictionaries.

obscraper 0.5.0 (2022-02-10)
****************************

Major update.

Improvements
############

- Asynchronous execution: internals now execute requests and postprocessing
  asynchronously using `trio `_. This is at least 10% faster than the previous
  multithreaded version.
- Improved tests: migrated all tests to pytest. Added more systematic testing
  of random posts.
- Sessions: internals now use (asynchronous) sessions, reducing the load on the
  overcomingbias server and increasing download speed.


Breaking changes
################

- Updated interface for consistency and clarity:

  - ``grab_edit_dates`` is now :ref:`get_edit_dates `
  - ``get_votes`` and ``get_comments`` are now :ref:`get_vote_counts ` and
    :ref:`get_comment_counts `

- Updated behaviour of :ref:`get_all_posts ` to return None when the post
  could not be retrieved.
- Removed outdated ``max_workers`` argument from public API functions.

Trivial / internal changes
##########################

- Source code now follows the `black `_ format.

obscraper 0.4.0 (2022-02-06)
****************************

Features
########

- Added `logging `_ functionality, and documentation in the
  :doc:`Getting Started ` guide.

Bug fixes
#########

- :ref:`AttributeNotFoundError ` exceptions are now caught when downloading
  multiple posts. This prevents crashes on "broken" posts, e.g.
  2009/02/the-most-important-thing.

obscraper 0.3.0 (2022-02-03)
****************************

Breaking Changes
################

- :ref:`get_all_posts `, :ref:`get_posts_by_edit_date ` and *grab_edit_dates*
  now return post names rather than post URLs in their keys.
- "Short" URLs - the form overcomingbias.com/?p=12345 - are no longer accepted.
  This might change again in the future.

Features
########

- Add :ref:`get_post_by_name ` and :ref:`get_posts_by_names ` to the public
  API.
- Add :ref:`OB_POST_URL_PATTERN ` to the public API.
- Add :ref:`url_to_name ` and :ref:`name_to_url ` to the public API.

Improved Documentation
######################

- Add information on exceptions raised by public API functions.

Trivial / internal changes
##########################

- Most internal interfaces now use post names rather than URLs.

obscraper 0.2.0 (2022-01-19)
****************************

Breaking Changes
################

- :ref:`get_posts_by_urls ` will now fail when a post attribute can not be
  extracted from the post HTML, since this situation is technically a bug.
  Previously it returned None.
- The :ref:`Post ` name attribute now contains the year and month of
  publication, as in URLs. E.g. 'jobs-explain-lots' becomes
  '2010/09/jobs-explain-lots'. This ensures the post URL can be reconstructed
  from the post name.

Improvements
############

- Let users specify the maximum number of threads used to download posts, via
  the ``max_workers`` optional argument.
- Remove repeated whitespace within the text, when getting post text as
  plaintext.

Trivial/Internal Changes
########################

- :ref:`Post ` now represents the post URL as a property rather than an
  attribute.

obscraper 0.1.3 (2022-01-18)
*****************************

First public release! For the initial list of features, see
:doc:`Getting Started ` and :doc:`Public API Reference `.

.. Entry title format: obscraper 1.2.3 (release date)
.. Entry items:
.. Breaking Changes = backward-incompatible changes
.. Deprecations = functionality marked as deprecated
.. Features = Added new features
.. Improvements = Improvements to existing features
.. Bug Fixes
.. Improved Documentation
.. Trivial/Internal Changes